Every Frontier Model's API Price, Side by Side
API pricing changes fast. This guide covers every major coding-capable model's pricing as of March 30, 2026, verified directly from official documentation and pricing pages.
The Master Pricing Table
All prices are per 1 million tokens (standard context, no caching or batch discounts):
| Model | Input (per MTok) | Output (per MTok) | Context Window | License |
|---|---|---|---|---|
| DeepSeek V3.2 | $0.28 | $0.42 | 164K | MIT |
| Qwen 3.5 Flash | $0.065 | $0.26 | 1M | Apache 2.0 |
| Qwen 3.5 Plus | ~$0.26 | ~$1.56 | 1M | Apache 2.0 |
| Mistral Large 3 | $0.50 | $1.50 | 262K | Apache 2.0 |
| GLM-5 | $1.00 | $3.20 | 200K | MIT |
| GLM-5-Turbo | $1.20 | $4.00 | 200K | Proprietary (API) |
| GLM-5-Code | $1.20 | $5.00 | 200K | Proprietary (API) |
| Gemini 3.1 Pro | $2.00 | $12.00 | 1M | Proprietary |
| GPT-5.4 | $2.50 | $15.00 | 1M | Proprietary |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 1M | Proprietary |
| Claude Opus 4.6 | $5.00 | $25.00 | 1M | Proprietary |
Sources: docs.z.ai (GLM), platform.claude.com (Claude), developers.openai.com (GPT), ai.google.dev (Gemini), api-docs.deepseek.com (DeepSeek), openrouter.ai (Qwen, Mistral).
What These Numbers Actually Mean
Raw per-token prices don't tell the full story. What matters is cost per task. A typical coding task involves:
- Small task (fix a bug, write a function): ~2K input + ~1K output = 3K tokens
- Medium task (build a component, refactor a file): ~10K input + ~5K output = 15K tokens
- Large task (architect a feature, multi-file changes): ~50K input + ~20K output = 70K tokens
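Per-task cost under these assumptions is simply input tokens × input rate plus output tokens × output rate (rates per million tokens). A minimal sketch, using a few rates from the table above:

```python
# Per-MTok rates (input, output) taken from the pricing table above.
RATES = {
    "DeepSeek V3.2": (0.28, 0.42),
    "GLM-5": (1.00, 3.20),
    "Claude Opus 4.6": (5.00, 25.00),
}

def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one task at standard (uncached, non-batch) rates."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A "large" task: ~50K input + ~20K output tokens.
print(round(task_cost("DeepSeek V3.2", 50_000, 20_000), 3))    # 0.022
print(round(task_cost("Claude Opus 4.6", 50_000, 20_000), 3))  # 0.75
```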
Cost Per Task Comparison
| Model | Small Task (~3K tok) | Medium Task (~15K tok) | Large Task (~70K tok) |
|---|---|---|---|
| DeepSeek V3.2 | $0.001 | $0.005 | $0.022 |
| Qwen 3.5 Flash | $0.0004 | $0.002 | $0.008 |
| Mistral Large 3 | $0.003 | $0.013 | $0.055 |
| GLM-5 | $0.005 | $0.026 | $0.114 |
| Gemini 3.1 Pro | $0.016 | $0.080 | $0.340 |
| GPT-5.4 | $0.020 | $0.100 | $0.425 |
| Claude Opus 4.6 | $0.035 | $0.175 | $0.750 |
Claude Opus 4.6 costs 34× more per large task than DeepSeek V3.2 and about 6.6× more than GLM-5. That doesn't mean DeepSeek is better; it means the premium models charge for quality. The question is how much quality each task actually needs.
Savings With Caching and Batch APIs
Every major provider offers ways to cut costs for high-volume use:
| Provider | Cache Discount | Batch API Discount | Best Combined Price (Input) |
|---|---|---|---|
| Claude Opus 4.6 | 90% on cache reads | 50% | $0.25/MTok (batch + cache) |
| GPT-5.4 | 50% on cached input | 50% | $0.625/MTok |
| Gemini 3.1 Pro | 90% on cache reads | N/A | $0.20/MTok (cache only) |
| DeepSeek V3.2 | 90% on cache hits | N/A | $0.028/MTok |
| GLM-5 | 80% on cached input | N/A | $0.20/MTok |
With aggressive caching, Claude Opus drops from $5.00 to $0.25 per million input tokens — a 95% reduction. Gemini 3.1 Pro drops from $2.00 to $0.20. If you're building a product with repeated context (system prompts, codebase context), caching changes the economics dramatically.
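The blended input price depends on your cache hit rate. A sketch of that arithmetic, assuming cached tokens bill at the discounted rate and misses at the standard rate (actual billing mechanics, such as cache-write fees, vary by provider):

```python
def effective_input_rate(base_rate: float, cache_discount: float,
                         hit_rate: float) -> float:
    """Blended per-MTok input rate for a given cache discount and hit rate.

    cache_discount: fraction off for cached tokens (0.90 = 90% off).
    hit_rate: fraction of input tokens served from cache.
    """
    cached_rate = base_rate * (1 - cache_discount)
    return hit_rate * cached_rate + (1 - hit_rate) * base_rate

# Claude Opus 4.6: $5.00 base input, 90% cache-read discount.
print(round(effective_input_rate(5.00, 0.90, 1.0), 2))  # 0.5 (fully cached)
print(round(effective_input_rate(5.00, 0.90, 0.8), 2))  # 1.4
```

Stacking the 50% batch discount on top of a fully cached read reproduces the $0.25/MTok figure in the table (0.50 × 0.5).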
Quality vs Cost: The SWE-bench Reality Check
Cheaper isn't always better. Here's how these models rank on SWE-bench Verified, the standard benchmark for real-world code editing:
| Model | SWE-bench Verified | Output Cost/MTok | Cost-Efficiency Ratio |
|---|---|---|---|
| Claude Opus 4.6 | ~80.8% | $25.00 | 3.2% per dollar |
| Gemini 3.1 Pro | 78.8% | $12.00 | 6.6% per dollar |
| GPT-5.4 | 78.2% | $15.00 | 5.2% per dollar |
| GLM-5 | 77.8% | $3.20 | 24.3% per dollar |
| Qwen 3.5 | 76.4% | $2.34 | 32.6% per dollar |
| DeepSeek V3.2 | 72–74% | $0.42 | 173% per dollar |
GLM-5 offers the best balance of quality and cost among frontier models: 77.8% on SWE-bench at $3.20/MTok output is about 7.5× more cost-efficient than Claude Opus. DeepSeek V3.2 is the absolute cheapest but gives up roughly 7–9 points of SWE-bench accuracy relative to Opus.
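The cost-efficiency column above is just the SWE-bench score divided by output price. A quick sketch reproducing it (DeepSeek's score is taken as the midpoint of its reported range):

```python
# (SWE-bench Verified %, output $/MTok) from the tables above.
MODELS = {
    "Claude Opus 4.6": (80.8, 25.00),
    "GLM-5": (77.8, 3.20),
    "DeepSeek V3.2": (73.0, 0.42),  # midpoint of the 72-74% range
}

def efficiency(score_pct: float, output_rate: float) -> float:
    """SWE-bench percentage points per output dollar."""
    return score_pct / output_rate

# Rank models by benchmark points per dollar, best value first.
for name, (score, rate) in sorted(MODELS.items(),
                                  key=lambda kv: -efficiency(*kv[1])):
    print(f"{name}: {efficiency(score, rate):.1f}% per dollar")
```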
Subscription Plans vs Pay-as-You-Go
For developers who code daily, subscription plans often beat API pricing:
| Plan | Monthly Cost | Best Model | Break-Even vs API |
|---|---|---|---|
| Claude Pro | $20 | Opus 4.6 | ~$0.67 of API usage/day |
| Claude Max 5× | $100 | Opus 4.6 | ~$3.30 of API usage/day |
| ChatGPT Plus | $20 | GPT-5.4 | ~$0.67 of API usage/day |
| GLM Coding Lite | ~$10 | GLM-5.1 | ~$0.33 of API usage/day |
| GLM Coding Pro | ~$30 | GLM-5.1 + GLM-5 | ~$1.00 of API usage/day |
If you use your coding AI for more than 30 minutes a day, a subscription almost always beats pay-as-you-go pricing.
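The break-even math is just the monthly price divided by days of use, compared against your average daily API spend. A sketch, assuming the 30-day month the table's figures imply:

```python
def breakeven_daily_spend(monthly_price: float, days: int = 30) -> float:
    """API dollars per day at which a flat subscription pays for itself."""
    return monthly_price / days

def subscription_wins(monthly_price: float, daily_api_spend: float,
                      days: int = 30) -> bool:
    """True if your API usage would cost more than the subscription."""
    return daily_api_spend > breakeven_daily_spend(monthly_price, days)

print(round(breakeven_daily_spend(100), 2))  # 3.33 (Claude Max 5x)
print(subscription_wins(20, 1.50))           # True: $1.50/day beats a $20 plan
```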
The BYOAI Strategy
If you're building apps on a platform that supports BYOAI (Bring Your Own AI), you can route different tasks to different models based on cost and complexity:
- Boilerplate and simple edits: Qwen 3.5 Flash ($0.065/$0.26) or DeepSeek V3.2 ($0.28/$0.42)
- Complex logic and architecture: Claude Opus 4.6 ($5/$25) or GPT-5.4 ($2.50/$15)
- High-volume agentic tasks: GLM-5 ($1/$3.20) — best frontier-quality-per-dollar
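The tiered routing above can be sketched as a simple dispatcher. The model names and rates come from this guide's tables; the routing signals and thresholds are purely illustrative, not any platform's real API:

```python
# Illustrative tiers: (model, input $/MTok, output $/MTok) from the tables above.
TIERS = {
    "simple":  ("Qwen 3.5 Flash",  0.065, 0.26),   # boilerplate, small edits
    "agentic": ("GLM-5",           1.00,  3.20),   # high-volume agent loops
    "complex": ("Claude Opus 4.6", 5.00,  25.00),  # architecture, hard logic
}

def route(estimated_files: int, needs_reasoning: bool) -> str:
    """Pick a tier from rough task signals (thresholds are arbitrary)."""
    if needs_reasoning or estimated_files > 5:
        return "complex"
    if estimated_files > 1:
        return "agentic"
    return "simple"

model, in_rate, out_rate = TIERS[route(estimated_files=1, needs_reasoning=False)]
print(model)  # Qwen 3.5 Flash
```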
Platforms like Serenities AI support BYOAI with no AI markup — you connect your own API key from any provider and pay only the provider's rates. Combined with batteries-included features (database, auth, storage, automation at $9–$24/month), you can run a full development stack without accumulating separate service subscriptions.
Hidden Costs to Watch
- Long-context surcharges: GPT-5.4 doubles input pricing beyond 272K tokens and adds 1.5× on output. Gemini 3.1 Pro doubles input beyond 200K tokens. Claude removed its long-context premium on March 13, 2026 — the full 1M window is now at standard rates. Budget accordingly for large codebase ingestion.
- Thinking tokens: Gemini 3.1 Pro's chain-of-thought reasoning generates internal tokens billed at output rates. A simple prompt can consume 3–5× more tokens than expected.
- Rate limits: Cheap models may throttle under heavy load. DeepSeek V3.2 gives 5M free tokens on signup with no hard rate limit, but may slow responses during high-traffic periods.
- Speed costs money: Claude Opus 4.6 Fast Mode costs 6× standard ($30/$150 per MTok). GPT-5.4 Pro costs 12× standard ($30/$180). Factor speed requirements into your budget.
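To make the long-context surcharge concrete, here is a sketch using the GPT-5.4 figures above. It assumes the 2× input / 1.5× output multipliers apply to the whole request once input exceeds the threshold; whether a provider surcharges the whole request or only tokens past the threshold varies, so check the billing docs:

```python
def gpt54_request_cost(input_tokens: int, output_tokens: int,
                       threshold: int = 272_000) -> float:
    """Sketch of GPT-5.4 pricing with the long-context surcharge applied.

    Assumption: past the input threshold, the 2x input and 1.5x output
    multipliers apply to the entire request.
    """
    in_rate, out_rate = 2.50, 15.00  # $/MTok at standard context
    if input_tokens > threshold:
        in_rate *= 2.0
        out_rate *= 1.5
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

print(round(gpt54_request_cost(50_000, 20_000), 3))   # 0.425 (standard rates)
print(round(gpt54_request_cost(300_000, 20_000), 3))  # 1.95  (surcharged)
```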
Bottom Line
In March 2026, the pricing landscape for AI coding models spans a roughly 460× range, from Qwen 3.5 Flash at $0.065 per million input tokens to Claude Opus 4.6 Fast Mode at $30. The right choice depends on your task complexity, volume, and speed requirements.
For most developers, the sweet spot is a tiered approach: use a cheap model for routine tasks and escalate to a frontier model for hard problems. The GLM-5 family at $1/$3.20 occupies a unique position, frontier-level SWE-bench scores at mid-tier pricing, making it the strongest value proposition for developers who need quality without the Claude/GPT price tag.