Grok 4 by xAI demonstrates top-tier general intelligence, excellent coding ability, and outstanding mathematical reasoning. View detailed benchmark data, including scores across coding, math, reasoning, speed, and cost metrics.
Grok 4 — Benchmark Scores Overview
Scores are normalized to a percentage scale for visual comparison. Elo scores in the 1100-1500 range are mapped to 0-100.
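The page does not state how Elo scores are mapped, so the following is only a minimal sketch assuming a linear (min-max) mapping of the 1100-1500 range onto 0-100. The function name `normalize_elo` and the clamping of out-of-range values are illustrative assumptions, not something the source describes.

```python
def normalize_elo(elo: float, lo: float = 1100.0, hi: float = 1500.0) -> float:
    """Map an Elo rating onto a 0-100 scale.

    Assumption: the mapping is linear between lo and hi, and ratings
    outside that range are clamped to 0 or 100.
    """
    pct = (elo - lo) * 100.0 / (hi - lo)
    return max(0.0, min(100.0, pct))


if __name__ == "__main__":
    # Elo rating of 1430, as quoted in the FAQ on this page.
    print(normalize_elo(1430))  # 82.5
```

Under this assumed mapping, the 1430 rating quoted on this page would be drawn at 82.5 on the chart's percentage scale.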
Grok 4 — Frequently Asked Questions
How intelligent is Grok 4?
Grok 4 has a Chatbot Arena Elo rating of 1430, which places it among high-performing AI models. The rating is based on blind, head-to-head human preference voting.
How much does Grok 4 cost?
Grok 4 costs $3.00 per 1M input tokens and $15.00 per 1M output tokens, which is mid-range pricing for its capability level.
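As a rough illustration of how these per-token prices translate into a per-request cost, here is a small sketch. The function name `request_cost_usd` and the example token counts are hypothetical. It assumes billing is strictly proportional to token counts, with no caching discounts, minimum charges, or other fees.

```python
# Prices quoted above, in USD per 1M tokens.
INPUT_USD_PER_M = 3.0
OUTPUT_USD_PER_M = 15.0


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts.

    Assumption: cost scales linearly with tokens at the listed rates.
    """
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 10,000 input tokens and 2,000 output tokens.
    print(f"${request_cost_usd(10_000, 2_000):.2f}")  # $0.06
```

Because output tokens cost five times as much as input tokens at these rates, the 2,000 output tokens in this example cost as much as the 10,000 input tokens.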
How fast is Grok 4?
Grok 4 generates output at 55 tokens per second, which is slower than other models; it prioritizes quality over speed. The time to first token is 500 ms.
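These two figures can be combined into a back-of-the-envelope estimate of total response time. The sketch below assumes a constant generation rate after the first token and ignores network and queueing overhead; the function name `estimated_response_seconds` and the 550-token example are hypothetical.

```python
# Figures quoted above.
TOKENS_PER_SECOND = 55.0
TIME_TO_FIRST_TOKEN_S = 0.5  # 500 ms


def estimated_response_seconds(output_tokens: int) -> float:
    """Estimate wall-clock time to receive a full response.

    Assumption: total time = time to first token + tokens / throughput,
    with throughput held constant for the whole response.
    """
    return TIME_TO_FIRST_TOKEN_S + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Hypothetical 550-token answer.
    print(estimated_response_seconds(550))  # 10.5
```

Real latency varies with load, prompt length, and provider, so this is best treated as a lower-bound estimate rather than a measured value.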
How good is Grok 4 at coding?
Grok 4 achieves 72.0% on SWE-bench Verified, demonstrating excellent real-world software engineering capability. This benchmark tests the model's ability to resolve actual GitHub issues.
How good is Grok 4 at math and reasoning?
Grok 4 scores 91.0% on the MATH benchmark (competition-level mathematics). It also achieves 87.0% on GPQA Diamond, a graduate-level science reasoning benchmark.
What is the context window of Grok 4?
Grok 4 has a context window of 256K tokens. This determines how much text, conversation history, and code the model can process in a single request.
Who created Grok 4?
Grok 4 was created by xAI. It is classified as a flagship model in the AI Value Index.
Is Grok 4 open source?
No, Grok 4 is a proprietary model. It is available through xAI's API and compatible providers.