Xiaomi MiMo-V2 Review: The 1T-Parameter Model That Fooled the AI Community

Xiaomi's MiMo-V2-Pro launched anonymously as "Hunter Alpha" on OpenRouter, and the AI community assumed it was DeepSeek V4. It wasn't. It's a 1T-parameter MoE model with 42B active params that approaches Claude Opus 4.6 on coding at 1/5th the cost. Full review of both Pro and Flash variants.

Nishant Lamichhane · Updated · 9 min read

The "Hunter Alpha" Reveal

In mid-March 2026, a mystery model appeared on OpenRouter under the name "Hunter Alpha." It quickly topped the daily usage charts, surpassing 1 trillion tokens in total usage. The AI community was convinced it was DeepSeek V4 testing in stealth mode.

It wasn't. On March 18, 2026, Xiaomi revealed that Hunter Alpha was actually MiMo-V2-Pro — the smartphone giant's flagship large language model, built by a team led by former DeepSeek researcher Luo Fuli. Reuters subsequently debunked the DeepSeek V4 speculation.

Two Models: Flash and Pro

Xiaomi's MiMo-V2 comes in two variants:

| Spec | MiMo-V2-Flash | MiMo-V2-Pro |
|---|---|---|
| Total Parameters | 309B | 1T |
| Active Parameters | 15B | 42B |
| Architecture | MoE, Hybrid Attention (5:1) | MoE, Hybrid Attention (7:1) |
| Context Window | 256K tokens | 1M tokens |
| Input Price | $0.10/M tokens | $1.00/M tokens |
| Output Price | $0.30/M tokens | $3.00/M tokens |
| License | MIT (open weights on HuggingFace) | Proprietary (API-only) |

MiMo-V2-Pro Benchmarks

The Pro model's performance has been independently evaluated by Artificial Analysis:

| Benchmark | MiMo-V2-Pro | Comparison |
|---|---|---|
| Intelligence Index | 49 | GLM-5: 50, GPT-5.2 Codex: ~49 |
| GDPval-AA (Agentic) | 1426 ELO | GLM-5: 1406, Claude Sonnet 4.6: 1633 |
| ClawEval (Agent Scaffold) | 61.5 | Claude Opus 4.6: 66.3, GPT-5.2: 50.0 |
| Hallucination Rate | 30% | Flash: 48% |

The Pro model is the highest-scoring Chinese-origin model on GDPval-AA (agentic tasks), beating GLM-5 (1406) and Kimi K2.5 (1283). On ClawEval, it scores 61.5 — approaching Claude Opus 4.6's 66.3 and significantly beating GPT-5.2's 50.0.

Token Efficiency: A Key Advantage

MiMo-V2-Pro used 77M output tokens to run the Intelligence Index — significantly fewer than GLM-5 (109M) and Kimi K2.5 (89M). This matters because many Chinese models are notoriously verbose, driving up effective costs even when per-token pricing looks cheap.
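As a back-of-the-envelope illustration (using the output prices listed later in this review and ignoring input tokens entirely), the token gap translates directly into evaluation cost:

```python
# Rough output-token cost of running the Intelligence Index, using the
# token counts quoted above and the output prices from the pricing table.
# Input-token costs and provider discounts are ignored for simplicity.
models = {
    # name: (output tokens in millions, output price in $ per million)
    "MiMo-V2-Pro": (77, 3.00),
    "GLM-5":       (109, 3.20),  # using GLM-5.1's listed output price
}

for name, (tokens_m, price_per_m) in models.items():
    print(f"{name}: ~${tokens_m * price_per_m:.0f} in output tokens")
```

Despite near-identical per-token prices, Pro's run comes out roughly a third cheaper end to end ($231 vs ~$349).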

MiMo-V2-Flash Benchmarks

The Flash variant punches well above its weight for a 15B-active-parameter model:

| Benchmark | MiMo-V2-Flash | Notes |
|---|---|---|
| SWE-Bench Verified | 73.4% | Leading open-source model for SWE |
| SWE-Bench Multilingual | 71.7% | Strong cross-language coding |
| AIME 2025 (Math) | 94.1% | Near-frontier math reasoning |
| Intelligence Index | 41 | Average: 26 |

Flash generates output at 141.9 tokens per second (median across providers, per Artificial Analysis Feb 2026 data; Xiaomi claims up to 150 tok/s) — nearly 2.5x the average for open-weight models of similar size. This speed comes from Xiaomi's Multi-Token Prediction (MTP) architecture, which predicts multiple future tokens in a single forward pass.

Key Technical Innovations

Rollout Routing Replay (R3)

Xiaomi developed R3 to solve a common MoE problem: routing drift between training and inference. R3 enforces a deterministic constraint where experts activated during rollout are strictly reused during backpropagation, eliminating performance degradation that plagues other sparse models in production.
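Xiaomi hasn't published R3's implementation; a minimal sketch of the general idea — cache the expert indices chosen during rollout, then strictly replay them in the training pass instead of re-running top-k routing on drifted router weights — might look like:

```python
import numpy as np

def route_topk(router_logits, k=2):
    """Standard top-k MoE routing: pick the k highest-scoring experts."""
    return np.argsort(router_logits)[-k:]

rng = np.random.default_rng(0)

# --- Rollout: record which experts each token actually used ---
rollout_logits = rng.normal(size=(4, 8))          # 4 tokens, 8 experts
replay_cache = [route_topk(l) for l in rollout_logits]

# --- Training pass: router weights have drifted, so logits differ ---
train_logits = rollout_logits + rng.normal(scale=0.5, size=(4, 8))

for t, logits in enumerate(train_logits):
    drifted  = route_topk(logits)   # what re-routing would now select
    replayed = replay_cache[t]      # R3: strictly reuse the rollout experts
    # Backprop runs only through the replayed experts, so the policy that
    # generated the rollout and the policy being updated stay aligned.
    print(t, sorted(replayed), sorted(drifted))
```

The replay cache is what makes the routing deterministic: the gradient update sees exactly the experts that produced the rollout, regardless of how the router has moved since.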

Multi-Teacher On-Policy Distillation (MOPD)

MiMo-V2-Flash uses a novel training technique where domain-specialized teacher models provide dense, token-level rewards. This lets the smaller model absorb expertise from multiple larger teachers without the quality loss typical of standard distillation.
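Details of MOPD aren't public; as one illustrative reading of "dense, token-level rewards," each teacher could score every student-sampled token, with the per-token reward taken from whichever teacher is most confident. All names and shapes below are invented for the sketch:

```python
import numpy as np

# Toy dense, token-level distillation signal: every teacher assigns a
# log-probability to each token the student sampled; the per-token reward
# routes to whichever teacher knows that token best. This is an
# illustrative reading of MOPD, not Xiaomi's actual recipe.
rng = np.random.default_rng(1)
vocab, seq = 10, 6
student_tokens = rng.integers(vocab, size=seq)

def teacher_logprobs(bias):
    """Stand-in teacher: random logits, normalized to log-probabilities."""
    logits = rng.normal(size=(seq, vocab)) + bias
    return logits - np.log(np.exp(logits).sum(-1, keepdims=True))

teachers = [teacher_logprobs(b) for b in (0.0, 0.5)]  # e.g. code + math teachers
per_token_reward = np.max(
    [t[np.arange(seq), student_tokens] for t in teachers], axis=0
)
print(per_token_reward.shape)  # one reward per token, not one per episode
```

The contrast with ordinary RL is the density: the student gets a learning signal at every position rather than a single scalar at the end of the rollout.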

Speculative Decoding via MTP

The Multi-Token Prediction layers serve double duty: during training they improve learning, and during inference they act as draft models for speculative decoding — achieving an average acceptance length of up to 3.6 tokens and a 2.6× decoding speedup.

Pricing in Context

| Model | Input/M | Output/M | Intelligence Index |
|---|---|---|---|
| MiMo-V2-Flash | $0.10 | $0.30 | 41 |
| MiMo-V2-Pro | $1.00 | $3.00 | 49 |
| GLM-5.1 | $1.00 | $3.20 | 50 |
| Claude Opus 4.6 | $5.00 | $25.00 | 53 |
| GPT-5.4 | $2.50 | $15.00 | 57 |

MiMo-V2-Pro lands within one point of GLM-5.1 (49 vs 50) at nearly identical pricing, and delivers roughly 92% of Claude Opus 4.6's Intelligence Index score (49 vs 53) at a fifth of the input price. Flash delivers 82% of GLM-5.1's score at a tenth of the cost.
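To make those prices concrete, here is the bill for one hypothetical workload — 200M input and 50M output tokens, a made-up volume chosen only for illustration — priced straight from the table above:

```python
# Monthly bill for a hypothetical workload of 200M input tokens and
# 50M output tokens, using the per-million prices from the table above.
prices = {  # model: (input $/M, output $/M)
    "MiMo-V2-Flash":   (0.10, 0.30),
    "MiMo-V2-Pro":     (1.00, 3.00),
    "GLM-5.1":         (1.00, 3.20),
    "Claude Opus 4.6": (5.00, 25.00),
    "GPT-5.4":         (2.50, 15.00),
}

IN_M, OUT_M = 200, 50  # millions of tokens; illustrative volume only
for model, (p_in, p_out) in prices.items():
    print(f"{model}: ${IN_M * p_in + OUT_M * p_out:,.0f}")
```

At this volume the gap is stark: tens of dollars for Flash, a few hundred for Pro or GLM-5.1, and four figures for the Western frontier models.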

The MiMo-V2 Ecosystem

Xiaomi isn't just shipping models — they're building an agent ecosystem. According to A2A Protocol, the MiMo-V2 series includes:

  • MiMo-V2-Pro: Flagship reasoning and coding model

  • MiMo-V2-Flash: Fast, efficient model for high-volume tasks

  • MiMo-V2-Omni: Multimodal variant (text + image + video)

  • MiMo-V2-TTS: Text-to-speech model

MiMo-V2-Pro has launch partnerships with five major agent frameworks — OpenClaw, OpenCode, KiloCode, Blackbox, and Cline — with one week of free API access for developers worldwide.

Considerations

  • Data sovereignty: MiMo-V2 is operated by Xiaomi, a Chinese company. Enterprises with strict data handling requirements should evaluate compliance needs before production deployment.

  • Flash verbosity: While Pro is token-efficient (77M output tokens for the Intelligence Index), Artificial Analysis noted that Flash generated 97M tokens for the same evaluation — so Flash's effective costs run higher than its per-token pricing suggests.

  • Creative writing: Both models excel at reasoning and coding but trail denser models like Claude Opus on creative and nuanced text generation.

Bottom Line

MiMo-V2-Pro is the strongest model from a company nobody expected to compete in frontier AI. Its stealth launch as Hunter Alpha proved the model can compete on merit without brand recognition. At $1/$3 per million tokens, it's a legitimate alternative to GLM-5.1 for agentic and coding workloads — and the Flash variant at $0.10/$0.30 is the cheapest frontier-adjacent model available.

Xiaomi's $8.7 billion AI investment is producing results. With DeepSeek researchers on the team and a smartphone ecosystem of hundreds of millions of devices waiting for on-device AI, MiMo-V2 is just the beginning.
