Grok 3 by xAI demonstrates top-tier general intelligence. View detailed benchmark data including scores across coding, math, reasoning, speed, and cost metrics.
Grok 3 — Benchmark Scores Overview
Scores are normalized to a percentage scale for visual comparison. ELO scores are mapped linearly from the 1100-1500 range onto 0-100.
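The ELO mapping described above can be sketched as a simple linear rescaling. This is a minimal illustration, not the site's actual code: the function name `normalize_elo` is hypothetical, and clamping out-of-range ratings to 0 or 100 is an assumption the page does not confirm.

```python
def normalize_elo(elo: float, lo: float = 1100.0, hi: float = 1500.0) -> float:
    """Linearly map an ELO rating from [lo, hi] onto a 0-100 scale."""
    pct = (elo - lo) * 100.0 / (hi - lo)
    # Assumption: ratings outside [lo, hi] are clamped to the ends of the scale.
    return max(0.0, min(100.0, pct))


if __name__ == "__main__":
    # 1402 is the Chatbot Arena rating quoted for Grok 3 in the FAQ.
    print(normalize_elo(1402))
```

Under these assumptions, a rating of 1402 lands about three quarters of the way along the chart's 0-100 axis.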
Grok 3 — Frequently Asked Questions
How intelligent is Grok 3?
Grok 3 scores 1402 on the Chatbot Arena ELO rating, making it a high-performing AI model. This score is based on blind head-to-head human preference voting.
How much does Grok 3 cost?
Grok 3 costs $3.00 per 1M input tokens and $15.00 per 1M output tokens. This is mid-range pricing for its capability level.
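To turn the per-token prices into a per-request figure, the arithmetic can be sketched as follows. The function name `request_cost_usd` and the token counts in the example are hypothetical; the sketch assumes flat per-token pricing with no caching discounts, minimum charges, or other fees.

```python
# Listed prices, in US dollars per one million tokens.
INPUT_USD_PER_M = 3.0
OUTPUT_USD_PER_M = 15.0


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its input and output token counts."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens and 2,000 generated tokens.
    print(f"${request_cost_usd(10_000, 2_000):.4f}")
```

Because output tokens cost five times as much as input tokens, the 2,000 generated tokens in this example contribute the same amount as the 10,000 prompt tokens.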
How fast is Grok 3?
Grok 3 generates output at 65 tokens per second, which is slower than other models and reflects a priority on quality over speed. The time to first token is 400 ms.
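The two speed figures combine into a rough estimate of how long a full response takes. The sketch below assumes a constant generation rate after the first token and ignores network and queueing overhead, so real timings will vary; the function name `estimate_response_seconds` is hypothetical.

```python
TOKENS_PER_SECOND = 65.0   # listed output speed
TTFT_SECONDS = 0.4         # listed time to first token (400 ms)


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate: time to first token plus steady-state generation."""
    # Assumption: throughput stays constant for the whole response.
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Hypothetical response length of 650 tokens.
    print(f"{estimate_response_seconds(650):.1f} s")
```

Under these assumptions, a 650-token answer takes a little over ten seconds, most of it spent on generation and not on the initial 400 ms wait.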
How good is Grok 3 at coding?
Grok 3 achieves 42.0% on SWE-bench Verified, demonstrating moderate real-world software engineering capability. This benchmark tests the model's ability to resolve actual GitHub issues.
How good is Grok 3 at math and reasoning?
Grok 3 scores 82.0% on the MATH benchmark (competition-level mathematics). It also achieves 84.6% on GPQA Diamond, a graduate-level science reasoning benchmark.
What is the context window of Grok 3?
Grok 3 has a context window of 1.0M tokens. This determines how much text, conversation history, and code the model can process in a single request.
Who created Grok 3?
Grok 3 was created by xAI. It is classified as a mid-tier model in the AI Value Index.
Is Grok 3 open source?
No, Grok 3 is a proprietary model. It is available through xAI's API and compatible providers.