o3 — Benchmark Scores, Pricing & Performance Analysis

Tier: Flagship
Provider: OpenAI
Chatbot Arena Elo: 1380
Output Speed: 40 tok/s
Input Cost: $2.0 per 1M tokens
Output Cost: $8.0 per 1M tokens
Context Window: 200K tokens

o3 by OpenAI demonstrates strong general intelligence, solid coding performance, and outstanding mathematical reasoning. The benchmark data below covers coding, math, reasoning, speed, and cost.

o3 — Benchmark Scores Overview

Scores are normalized to a percentage scale for visual comparison. Elo scores are mapped from the 1100-1500 range onto 0-100.
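As a rough illustration of that mapping, the sketch below assumes the normalization is a simple linear rescale with clamping at the range ends. The page does not state the exact formula, and the function name normalize_elo is hypothetical.

```python
def normalize_elo(elo, lo=1100.0, hi=1500.0):
    """Map an Elo rating onto a 0-100 scale.

    Assumption: a linear rescale of the 1100-1500 range, with values
    outside the range clamped to the nearest end.
    """
    clamped = max(lo, min(hi, elo))
    return (clamped - lo) * 100.0 / (hi - lo)


# o3's rating of 1380 sits 280 points above the 1100 floor of a 400-point range.
print(normalize_elo(1380))  # 70.0
```

Under this assumption, o3's rating of 1380 would be plotted at 70 on the percentage scale.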

o3 — Frequently Asked Questions

How intelligent is o3?

o3 has a Chatbot Arena Elo rating of 1380, which places it among high-performing AI models. The rating is based on blind, head-to-head human preference voting.

How much does o3 cost?

o3 costs $2.0 per 1M input tokens and $8.0 per 1M output tokens. This is mid-range pricing for its capability level.
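To see what these rates mean for a single request, here is a minimal sketch that applies the listed per-million-token prices. The function name request_cost_usd and the example token counts are illustrative assumptions, not figures from this page.

```python
# Listed o3 rates, in US dollars per 1M tokens.
INPUT_USD_PER_M = 2.0
OUTPUT_USD_PER_M = 8.0


def request_cost_usd(input_tokens, output_tokens):
    """Estimate the cost of one request from its token counts."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
cost = request_cost_usd(10_000, 2_000)
print(f"${cost:.3f}")  # $0.036
```

Because output tokens are priced at four times the input rate, long responses dominate the bill faster than long prompts do.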

How fast is o3?

o3 generates output at 40 tokens per second, which is slower than other models; it prioritizes quality over speed. The time to first token is 800 ms.
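These two figures can be combined into a back-of-the-envelope response time. The sketch below assumes a constant generation rate after the first token and ignores network overhead; the function name estimated_latency_s and the 500-token example are hypothetical.

```python
TTFT_S = 0.8          # time to first token: 800 ms
TOKENS_PER_S = 40.0   # listed output speed


def estimated_latency_s(output_tokens):
    """Rough end-to-end time: first-token delay plus steady generation.

    Assumption: throughput stays constant for the whole response.
    """
    return TTFT_S + output_tokens / TOKENS_PER_S


# A hypothetical 500-token answer: 0.8 s + 500 / 40 s, roughly 13.3 s.
print(f"{estimated_latency_s(500):.1f} s")
```

For anything beyond a few dozen tokens, generation time rather than the 800 ms first-token delay accounts for most of the wait.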

How good is o3 at coding?

o3 achieves 69.1% on SWE-bench Verified, demonstrating strong real-world software engineering capability. This benchmark tests the model's ability to resolve actual GitHub issues.

How good is o3 at math and reasoning?

o3 scores 96.0% on the MATH benchmark (competition-level mathematics). It also achieves 83.3% on GPQA Diamond, a graduate-level science reasoning benchmark.

What is the context window of o3?

o3 has a context window of 200K tokens. This determines how much text, conversation history, and code the model can process in a single request.

Who created o3?

o3 was created by OpenAI. It is classified as a flagship model in the AI Value Index.

Is o3 open source?

No, o3 is a proprietary model. It is available through OpenAI's API and compatible providers.