
GLM-5.1: Zhipu's Open-Source Model Scores 94.6% of Claude Opus 4.6 in Coding

Z.ai (formerly Zhipu AI) releases GLM-5.1, scoring 45.3 on coding benchmarks — just 2.6 points behind Claude Opus 4.6. Trained entirely on Huawei chips, open-source under MIT license, and starting at $3/month. Here's what the benchmarks actually show and what's still unverified.

Serenities AI · 10 min read

What Just Happened

On March 27, 2026, Z.ai (formerly Zhipu AI) released GLM-5.1 — an incremental upgrade to its flagship GLM-5 model that narrows the gap with Claude Opus 4.6 to just 2.6 points on coding benchmarks. If the numbers hold up under independent scrutiny, this is the closest any open-source model has come to matching the top proprietary coding model.

The timing is significant. Z.ai became the world's first publicly traded foundation model company after its Hong Kong IPO in January 2026, raising $558 million at a $6.6 billion IPO valuation (reaching $7.1 billion by first-day close). GLM-5.1 is the company's bid to prove that open-source models trained entirely on non-American hardware can compete at the frontier.

The Benchmark Numbers

Using Claude Code as the evaluation framework, Z.ai reports the following coding scores:

| Model | Coding Score | vs. Opus 4.6 |
|---|---|---|
| Claude Opus 4.6 | 47.9 | Baseline |
| GLM-5.1 | 45.3 | 94.6% |
| GLM-5 | 35.4 | 73.9% |

The jump from GLM-5 (35.4) to GLM-5.1 (45.3) is a 28% relative improvement in a single point release, a leap large enough to suggest significant post-training optimization rather than minor tuning.
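
The headline percentages are straightforward arithmetic and easy to check directly (a quick sanity check of the reported figures, not part of Z.ai's methodology):

```python
# Coding scores reported by Z.ai (Claude Code evaluation harness).
opus_46 = 47.9
glm_51 = 45.3
glm_5 = 35.4

# GLM-5.1 as a fraction of Opus 4.6's score.
ratio = glm_51 / opus_46 * 100                # ~94.6%

# Relative improvement from GLM-5 to GLM-5.1.
improvement = (glm_51 - glm_5) / glm_5 * 100  # ~28%

print(f"{ratio:.1f}% of Opus 4.6, +{improvement:.0f}% over GLM-5")
```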

For broader context, here's how the GLM-5 base model (which GLM-5.1 builds on) performs on established third-party benchmarks:

| Benchmark | GLM-5 | Claude Opus 4.6 | GPT-5.4 |
|---|---|---|---|
| SWE-bench Verified | 77.8% | 81.4% | ~80% |
| AIME 2026 | 92.7% | — | — |
| GPQA-Diamond | 86.0% | — | — |
| Artificial Analysis Index | 50 | 53 | 57 |

On Artificial Analysis, GLM-5 ranks as the highest-scoring open-weight model on the Intelligence Index. The SWE-bench Verified gap between Opus 4.6 (~81.4%) and GLM-5 (77.8%) is about 3.6 percentage points — a gap that would have been unthinkable from an open-source model six months ago.

Critical Caveat: Benchmarks Are Self-Reported

This needs to be stated clearly: the GLM-5.1 coding benchmark (45.3 points, 94.6% of Opus) is entirely self-reported by Z.ai. As of March 29, 2026, no independent third-party evaluation lab has published corroborating results for GLM-5.1 specifically.

There are additional concerns with the methodology:

  • The evaluation uses Claude Code as the test harness, which is an unconventional choice that makes cross-benchmark comparison difficult

  • GLM-5.1 launched just two days ago — there has been no time for the broader research community to replicate results

  • The specific scoring methodology has not been publicly detailed beyond what Z.ai has shared

That said, Z.ai has a track record of backing up internal numbers. The GLM-5 base model's 77.8% on SWE-bench Verified was externally validated — the highest score among all open-source models on that benchmark. So there is reason to take the GLM-5.1 claims seriously while awaiting confirmation.

Bottom line: Treat the 94.6% figure as a promising preliminary claim, not an established fact. Wait for independent evaluations before making workflow decisions based on it.

Architecture and Training

GLM-5.1 inherits the GLM-5 architecture, which is substantial:

  • Total parameters: 744 billion

  • Architecture: Mixture of Experts (MoE) with 256 experts, 8 active per token

  • Active parameters per inference: ~40–44 billion (roughly 5.4–5.9% of total parameters active per token)

  • Context window: 200K tokens

  • Max output tokens: 131,072

  • Attention mechanism: DeepSeek Sparse Attention (DSA) for efficient long-context processing

  • Pre-training data: 28.5 trillion tokens
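
The active-parameter fraction quoted above follows directly from the reported numbers; a quick back-of-the-envelope check (not from Z.ai's documentation):

```python
total_params = 744e9                    # total parameters
active_low, active_high = 40e9, 44e9    # active parameters per token

# Fraction of the network that participates in each forward pass.
frac_low = active_low / total_params * 100    # ~5.4%
frac_high = active_high / total_params * 100  # ~5.9%

print(f"~{frac_low:.1f}%-{frac_high:.1f}% of parameters active per token")
```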

The most notable aspect of GLM-5's training is the hardware. The entire model family was trained on 100,000 Huawei Ascend 910B chips using the MindSpore framework — with zero NVIDIA GPU involvement. This is particularly significant given that Z.ai was placed on the US Entity List in January 2025, restricting its access to American chips.

The fact that a model trained entirely on non-NVIDIA hardware can reach within 3.6 points of Claude Opus 4.6 on SWE-bench Verified is one of the most significant developments in the AI hardware landscape this year.

Pricing: Where GLM-5.1 Changes the Math

This is where GLM-5.1 gets genuinely disruptive. The cost difference compared to proprietary alternatives is not incremental — it is an order of magnitude.

API Pricing (GLM-5 Base)

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| GLM-5 | $1.00 | $3.20 |
| GPT-5.4 | $2.50 | $15.00 |
| Claude Opus 4.6 | $5.00 | $25.00 |

GLM-5 is 5x cheaper on input and nearly 8x cheaper on output compared to Claude Opus 4.6. For high-volume coding workflows, this difference compounds fast.
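
To see how the per-token gap compounds, here is a rough monthly-cost comparison. The prices are the published per-1M-token rates from the table above; the workload volume is an illustrative assumption:

```python
# Published API prices in USD per 1M tokens.
prices = {
    "GLM-5":           {"input": 1.00, "output": 3.20},
    "GPT-5.4":         {"input": 2.50, "output": 15.00},
    "Claude Opus 4.6": {"input": 5.00, "output": 25.00},
}

def monthly_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Cost in USD for a workload measured in millions of tokens."""
    p = prices[model]
    return input_mtok * p["input"] + output_mtok * p["output"]

# Illustrative heavy coding workload: 200M input tokens, 40M output tokens/month.
for model in prices:
    print(f"{model}: ${monthly_cost(model, 200, 40):,.2f}/month")
```

At that volume the spread is roughly $328 versus $2,000 per month, about a 6x difference end to end.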

GLM Coding Plan (Subscription)

Z.ai also offers a subscription model specifically designed for coding workflows:

| Plan | Price | Promo Price | Requests (per 5 hours) |
|---|---|---|---|
| Lite | $10/month | $3 first month | 120 |
| Pro | $30/month | $15 first month | 600 |
| Max | Higher tier | — | Expanded limits |

The Coding Plan includes access to GLM-5.1, GLM-5, GLM-5-Turbo, and GLM-4.7, along with features like vision understanding, web search, and web reader — all compatible with Claude Code, Cline, and other popular coding tools.

Compare this to Claude Max at $100–$200/month or Claude Pro at $20/month with usage limits. For developers doing high-volume daily coding, the economics are hard to ignore.

Open Source Under MIT License

GLM-5 is already available on Hugging Face under the MIT license — the most permissive open-source license available. This means unrestricted commercial use, modification, and redistribution with no strings attached.

Z.ai's global head Zixuan Li has confirmed that GLM-5.1 will also be open-sourced under MIT, following the same precedent. The standalone GLM-5.1 API and open weights are expected within weeks, though no specific date has been announced.

For local deployment, the GLM-5 family is already supported by:

  • vLLM and SGLang for inference

  • KTransformers and xLLM for local deployment

  • NVIDIA NVFP4 quantized version for optimized inference

  • GGUF format for llama.cpp compatibility

  • MLX format for Apple Silicon

At 744 billion parameters (1.51TB on Hugging Face), this is not a model you run on a laptop. But for teams with GPU infrastructure or cloud deployments, the MIT license means zero per-token costs beyond your own compute.
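
The 1.51TB figure is roughly what 744B parameters implies at 16-bit precision; a back-of-the-envelope footprint estimate (bytes-per-parameter values are the standard sizes for each format, and the result ignores non-weight files):

```python
total_params = 744e9

# Approximate bytes per parameter for common weight formats.
formats = {"BF16": 2.0, "FP8": 1.0, "NVFP4 (4-bit)": 0.5}

for name, bytes_per_param in formats.items():
    terabytes = total_params * bytes_per_param / 1e12
    print(f"{name}: ~{terabytes:.2f} TB of weights")
```

The BF16 estimate (~1.49TB) lines up with the 1.51TB Hugging Face listing, and a 4-bit quantized variant would still weigh in at roughly 370GB, well beyond laptop territory.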

Who Is Z.ai?

For readers unfamiliar with the company, here is the quick background:

  • Founded: 2019, spun out of Tsinghua University by professors Tang Jie and Li Juanzi

  • Headquarters: Beijing, China (international brand: Z.ai)

  • IPO: January 8, 2026 on the Hong Kong Stock Exchange — the world's first publicly traded foundation model company

  • Market cap: ~$31 billion (as of March 2026)

  • Total funding raised: $1.4 billion+ across 12 rounds, plus $558 million IPO

  • Investors: Alibaba, Tencent, Ant Group, Meituan, Xiaomi, Saudi Aramco's Prosperity7 Ventures

  • Revenue: ~$53 million trailing twelve months (as of December 2025), with 325% year-over-year growth

  • Entity List: Added to the US export control Entity List in January 2025

Z.ai is considered one of China's "AI Tigers" alongside MiniMax and Moonshot AI. The company's stock surged nearly 30% after the GLM-5 release in February 2026, though it later fell 23% amid compute resource shortages that led to user complaints and temporarily restricted new signups.

The Rapid Release Cadence

Z.ai has been shipping at an aggressive pace:

  • July 2025: GLM-4.5

  • September 2025: GLM-4.6

  • December 2025: GLM-4.7

  • February 11, 2026: GLM-5 (the flagship release)

  • March 15, 2026: GLM-5-Turbo

  • March 27, 2026: GLM-5.1

Six model releases in nine months. The February-to-March window alone saw three releases. This cadence reflects both competitive pressure in the Chinese AI market and Z.ai's ambition to establish GLM as a serious alternative to Claude and GPT for coding workflows.

What GLM-5.1 Means for Developers

The Cost Arbitrage Play

Several early adopters and commentators are converging on a practical strategy: use GLM for daily coding tasks, reserve Claude Opus for complex or high-stakes work. At $3–$30/month for the GLM Coding Plan versus $100–$200/month for Claude Max, the cost savings are substantial if GLM-5.1 delivers 90%+ of Opus-level quality for routine work.
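
In practice this strategy amounts to a routing policy in front of your tooling. A minimal sketch; the keyword rule and model names are illustrative placeholders, not any vendor's actual API:

```python
# Tasks matching these keywords get escalated to the more expensive model.
ESCALATE_KEYWORDS = {"architecture", "security", "migration", "production"}

def pick_model(task: str) -> str:
    """Route routine coding work to the cheap model, escalate high-stakes tasks.

    Model names here are placeholders, not real API identifiers.
    """
    if any(keyword in task.lower() for keyword in ESCALATE_KEYWORDS):
        return "claude-opus"   # reserve the premium model for high-stakes work
    return "glm-coding"        # default: the cheap model handles routine tasks

print(pick_model("fix the typo in README"))          # routine
print(pick_model("security review of auth module"))  # escalated
```

A real version would classify tasks with something smarter than keywords, but the economics are the same: the default path is cheap, and escalation is the exception.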

The Open-Source Milestone

If GLM-5.1's coding performance is independently confirmed at or near the claimed levels, it would be the first time an open-source model has reached roughly 95% of the top proprietary coding model's score. For organizations that need to self-host AI models due to data sovereignty, compliance, or cost requirements, this changes the calculus significantly.

The Hardware Story

For the broader AI industry, the fact that a frontier-competitive model was trained entirely on Huawei Ascend chips — with zero NVIDIA involvement — challenges the assumption that NVIDIA hardware is required for cutting-edge AI training. This has implications for the global chip supply chain, export controls, and the competitive dynamics of the AI industry beyond just model capabilities.

What We Still Don't Know

Several important questions remain unanswered:

  • Independent benchmark verification — Will third-party evaluations confirm the 94.6% claim? The model is two days old. This is the biggest unknown.

  • Performance on non-coding tasks — The 45.3 score is coding-specific. How does GLM-5.1 perform on general reasoning, creative writing, analysis, and other tasks compared to GLM-5?

  • Open-source timeline — Z.ai has confirmed MIT licensing but not a specific release date for GLM-5.1 weights. "Within weeks" is vague.

  • Compute availability — Z.ai experienced capacity issues after the GLM-5 launch, restricting new signups. Can they handle the demand for GLM-5.1?

  • Real-world coding quality — Benchmark scores test specific patterns. How does GLM-5.1 perform on real-world codebases, debugging sessions, and multi-file refactoring?

How to Use GLM-5.1 With Your Workflow

GLM-5.1 is compatible with the tools most developers already use:

  • Claude Code — Supported as a model provider

  • Cline — Compatible via API

  • Standard OpenAI-compatible API format — Works with any tool that supports the OpenAI chat completions API

For developers using platforms like Serenities AI with its BYOAI (Bring Your Own AI) model, GLM-5.1 can be connected via API key alongside Claude, GPT, Gemini, DeepSeek, MiniMax, and other providers. This lets you switch between models freely — using GLM for high-volume daily work and Claude or GPT for tasks that require the absolute best reasoning performance — all within the same project.
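
Because the endpoint follows the OpenAI chat-completions format, any compatible client can target it by swapping the base URL. A minimal standard-library sketch; the base URL and model identifier below are placeholders, so check Z.ai's API documentation for the real values:

```python
import json
import urllib.request

BASE_URL = "https://api.example.com/v1"  # placeholder; use the provider's real URL
API_KEY = "YOUR_API_KEY"

# Standard OpenAI-style chat-completions payload.
payload = {
    "model": "glm-5.1",  # placeholder model id
    "messages": [
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "Write a function that reverses a string."},
    ],
}

request = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)
# With a real endpoint and key:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["message"]["content"])
```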

The Bottom Line

GLM-5.1 is either a landmark moment for open-source AI or an overpromised point release — and we won't know which until independent benchmarks arrive. What we do know:

  • The base GLM-5 model has externally validated performance that puts it within 3.6 points of Claude Opus 4.6 on SWE-bench Verified

  • The pricing is 5–8x cheaper than Claude Opus 4.6 on a per-token basis

  • The MIT license means no vendor lock-in and zero per-token costs for self-hosted deployments

  • The Huawei-only training pipeline is a geopolitically significant demonstration that frontier AI does not require American silicon

  • The coding-specific claims (94.6% of Opus) are unverified by third parties and should be treated as preliminary

For developers: watch for independent evaluations over the coming weeks. If the numbers hold, GLM-5.1 at $3/month could become the most cost-effective coding model available. If they don't, the GLM-5 base model is still a formidable open-source option at a fraction of proprietary pricing.

Either way, the gap between open-source and proprietary AI models is closing faster than anyone predicted.
