AI Models Directory — Benchmark Profiles for 118+ Models
Browse all AI models with benchmark scores, pricing, and performance data. Click any model for detailed analysis.
OpenAI (23 models)
GPT-5.2 (Flagship)
GPT-OSS 120B (Open Source)
GPT-OSS 20B (Open Source)
Sora 2 (Flagship)
GPT-5.1 Codex Mini (Mid-Range)
GPT-5.1 Codex (Flagship)
GPT-5.1 (Flagship)
GPT-5 Nano (Budget)
GPT-5 Mini (Mid-Range)
GPT-5 (Flagship)
o3 Pro (Flagship)
o4 Mini (Mid-Range)
o3 (Flagship)
GPT-4.1 Nano (Budget)
GPT-4.1 Mini (Budget)
GPT-4.1 (Mid-Range)
GPT-4.5 (Flagship)
o3 Mini (Mid-Range)
o1 Mini (Mid-Range)
o1 (Flagship)
GPT-4o Mini (Budget)
GPT-4o (Mid-Range)
DALL-E 3 (Flagship)

Google (13 models)
Gemini 3.1 Pro (Flagship)
Veo 3.1 (Flagship)
Gemini 3 Flash (Mid-Range)
Gemini 3 Pro (Flagship)
Veo 3 (Flagship)
Gemini 2.5 Flash Lite (Budget)
Imagen 4 (Flagship)
Gemini 2.5 Flash (Mid-Range)
Gemini 2.5 Pro (Flagship)
Gemma 3 4B (Open Source)
Gemma 3 12B (Open Source)
Gemma 3 27B (Open Source)
Gemini 2.0 Flash (Budget)

Anthropic (11 models)
Claude Sonnet 4.6 (Flagship)
Claude Opus 4.6 (Flagship)
Claude Opus 4.5 (Flagship)
Claude Haiku 4.5 (Budget)
Claude Sonnet 4.5 (Flagship)
Claude Opus 4 (Flagship)
Claude Sonnet 4 (Mid-Range)
Claude 3.7 Sonnet (Mid-Range)
Claude 3.5 Haiku (Budget)
Claude 3.5 Sonnet (Mid-Range)
Claude 3 Opus (Flagship)

Qwen (9 models)
Qwen 3 VL 235B (Open Source)
Qwen 3 Coder 480B (Open Source)
Qwen 3 Next 80B (Open Source)
Qwen 3.5 397B (Open Source)
Qwen 3 Coder (Open Source)
Qwen 3 Max (Open Source)
Qwen 3 32B (Open Source)
Qwen 3 235B (Open Source)
Qwen 2.5 72B (Open Source)

Mistral (9 models)
Ministral 3 8B (Open Source)
Ministral 3 14B (Open Source)
Mistral Large 3 (Open Source)
Magistral Medium 1.2 (Mid-Range)
Mistral Large 25.12 (Flagship)
Mistral Small 3.2 (Budget)
Mistral Medium 3.1 (Mid-Range)
Codestral (Mid-Range)
Pixtral Large (Flagship)

xAI (6 models)
Grok 4.1 Fast (Mid-Range)
Grok 4 Fast (Mid-Range)
Grok 4 (Flagship)
Grok 3 Mini (Budget)
Grok 3 (Mid-Range)
Grok 2 (Mid-Range)

DeepSeek (4 models)
DeepSeek V3.2 (Open Source)
DeepSeek V3.1 (Open Source)
DeepSeek R1 0528 (Open Source)
DeepSeek R1 (Open Source)

Amazon (3 models)
Perplexity (3 models)
Microsoft (3 models)
Meta (3 models)
Cohere (3 models)
ByteDance (2 models)
Zhipu AI (2 models)
Midjourney (2 models)
Baidu (2 models)
Black Forest Labs (2 models)
AI21 Labs (2 models)
InclusionAI (1 model)
Moonshot AI (1 model)
LG AI Research (1 model)
Nous Research (1 model)
Ideogram (1 model)
Stability AI (1 model)

About the AI Models Directory
The AI Value Index tracks 118+ large language models from leading providers including OpenAI, Google, Anthropic, Qwen, Mistral, and more. Each model profile includes benchmark scores across general intelligence, coding, math, reasoning, speed, and cost metrics.
Models are categorized as Flagship, Mid-Range, Budget, or Open Source based on their capability tier and pricing. Click any model to view its full benchmark profile, use the Compare tool for side-by-side comparisons, or check Pricing for detailed cost analysis.
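As a rough illustration of how a directory like this could be filtered by capability tier in code, here is a minimal sketch. The records below are hypothetical examples drawn from the listing above, not the site's actual data model:

```python
# Minimal sketch: each directory entry pairs a model name with its
# provider and capability tier, mirroring the four categories above.
models = [
    {"name": "GPT-5.2", "provider": "OpenAI", "tier": "Flagship"},
    {"name": "GPT-5 Nano", "provider": "OpenAI", "tier": "Budget"},
    {"name": "Claude Haiku 4.5", "provider": "Anthropic", "tier": "Budget"},
    {"name": "DeepSeek R1", "provider": "DeepSeek", "tier": "Open Source"},
]

def by_tier(entries, tier):
    """Return all entries in the given capability tier."""
    return [m for m in entries if m["tier"] == tier]

budget = by_tier(models, "Budget")
print([m["name"] for m in budget])  # ['GPT-5 Nano', 'Claude Haiku 4.5']
```

The same `by_tier` pattern extends naturally to filtering by provider or any other field in the record.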
Frequently Asked Questions
How many AI models does the AI Value Index track?
The AI Value Index currently tracks 118+ large language models from 20+ leading providers including OpenAI, Anthropic, Google, Meta, DeepSeek, xAI, Qwen, and Mistral. New models are added as they launch.
What is the difference between Flagship, Mid-Range, Budget, and Open Source models?
Flagship models (e.g. GPT-5.2, Claude Opus 4.6) offer peak capability at premium prices. Mid-Range models balance quality and cost. Budget models (e.g. GPT-5 Nano) prioritize low cost for high-volume use. Open Source models (e.g. Llama, Qwen) can be self-hosted and fine-tuned freely.
Which AI provider has the most models?
OpenAI and Google currently offer the largest model lineups, with 23 and 13 models respectively, spanning flagship to budget tiers. Anthropic follows with 11 models, Qwen and Mistral each offer 9, while xAI, DeepSeek, and a long tail of smaller providers round out the directory.
How often is the AI models directory updated?
The directory is updated within days of a new model launch or pricing change. Benchmark scores are refreshed as new evaluation results become available from official leaderboards and independent testing platforms.
What data is shown on each model profile?
Each model profile includes Chatbot Arena ELO, SWE-bench Verified, MMLU-Pro, HumanEval, and 20+ other benchmark scores, plus input and output pricing per 1M tokens, output speed, context window size, and provider details.
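The per-profile fields listed above could be represented as a simple record. The sketch below is a hypothetical schema, assuming illustrative field names and a common 3:1 input-to-output token blend for a blended price; it is not the site's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class ModelProfile:
    """Hypothetical record mirroring the fields a model profile displays."""
    name: str
    provider: str
    arena_elo: float           # Chatbot Arena ELO
    swe_bench_verified: float  # % of tasks resolved
    mmlu_pro: float            # % correct
    input_price: float         # USD per 1M input tokens
    output_price: float        # USD per 1M output tokens
    output_speed: float        # tokens per second
    context_window: int        # tokens

    def blended_price(self, output_ratio: float = 0.25) -> float:
        """Blended USD cost per 1M tokens, assuming the given share of
        output tokens (default: a 3:1 input:output mix)."""
        return (1 - output_ratio) * self.input_price + output_ratio * self.output_price

# Example with made-up numbers:
p = ModelProfile("Example-1", "ExampleAI", 1300.0, 55.0, 80.0,
                 1.0, 4.0, 120.0, 200_000)
print(round(p.blended_price(), 2))  # 1.75
```

A blended figure like this is one simple way to rank models on a single cost axis, though the appropriate output ratio varies by workload.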