In a matter of hours on the night of Feb. 11–12, three of China’s best‑known large‑model companies pushed forward new versions or upgrades that together mark a step change in how domestic AI firms are thinking about production. Zhipu (智谱) formally unveiled and open‑sourced GLM‑5, MiniMax quietly surfaced a new M2.5 model inside its agent product, and DeepSeek upgraded core capabilities — a flurry of activity that crystallises a wider industry pivot from prototype demos to continuous, agentic engineering.
Zhipu positions GLM‑5 not as another chat model but as a foundation for “Agentic Engineering”: systems that can run multi‑step procedures, manage resources and deliver production‑grade results rather than single‑turn code snippets. The company said GLM‑5 is the model that had circulated anonymously as “Pony Alpha”, which developers worldwide had already used during testing to build games, agent worlds and full applications. It argues that this unbranded, real‑world usage is evidence of genuine capability.
Technically, Zhipu reports substantial upgrades. GLM‑5 has 744 billion total parameters, of which 40 billion are activated, up from 355 billion total (32 billion active) in the previous generation, and its pretraining data grew to about 28.5 trillion tokens. Zhipu also says it integrated DeepSeek’s sparse‑attention mechanism to preserve long‑context performance while lowering deployment cost, and introduced an asynchronous reinforcement‑learning framework called “Slime” to sustain learning over long interactions.
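To put those parameter counts in perspective, the short Python sketch below works out what share of each model is active and how much each figure grew. It uses only the numbers Zhipu reported; the function name `active_fraction` and the `MODELS` table are illustrative assumptions, not anything published by the company.

```python
# Illustrative arithmetic only: figures are the vendor-reported counts
# quoted above, in billions of parameters. Names here are hypothetical.

def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters that are activated."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


# (total, active) in billions, as reported by Zhipu
MODELS = {
    "GLM-5": (744, 40),
    "previous generation": (355, 32),
}

for name, (total_b, active_b) in MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {total_b}B total, {active_b}B active ({share:.1%} active)")

# Growth from the previous generation to GLM-5
total_growth = MODELS["GLM-5"][0] / MODELS["previous generation"][0]
active_growth = MODELS["GLM-5"][1] / MODELS["previous generation"][1]
print(f"total parameters grew {total_growth:.2f}x, active parameters {active_growth:.2f}x")
```

On these reported figures, total parameters roughly doubled while active parameters grew by a quarter, so the active share fell from about 9.0% to about 5.4%. That is consistent with the emphasis on lowering deployment cost, though the ratio alone says nothing about actual serving expense.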
Benchmark claims are aggressive. Zhipu cites an Artificial Analysis ranking that places GLM‑5 fourth globally and first among open‑source models, and reports strong results on software‑engineering and terminal benchmarks, including 77.8 on SWE‑bench Verified and 56.2 on Terminal‑Bench 2.0, scores it says surpass Google’s Gemini 3 Pro and bring the model in line with Claude Opus 4.5. In a test called Vending‑Bench 2, GLM‑5 reportedly ran a simulated vending‑machine business for a year and finished with a balance of $4,432, a result Zhipu uses to illustrate the model’s planning and resource‑management competence.
MiniMax’s move was less theatrical but equally consequential: users noticed a new “M2.5” model option in its Agent product before any formal announcement. Early testers describe powerful agent and coding capabilities at much lower compute cost. MiniMax markets M2.5 as a production‑grade agent model with just 10 billion activated parameters, throughput of up to 100 tokens per second (TPS) and efficiency advantages that make it suitable for cross‑platform office productivity tasks such as Excel spreadsheets, PowerPoint presentations and deep research.
DeepSeek, meanwhile, appears to be taking a different tack by quietly extending practical capabilities. Its model’s context window reportedly jumped from 128K to 1 million tokens and its knowledge cutoff was updated to May 2025, signalling a focus on longer workflows and fresher information. Taken together, the three announcements suggest the sector is racing on multiple fronts: model scale, efficiency, context length and real‑world tool use.
The commercial response was immediate. Chinese AI equities and related chip stocks rallied on the news, and analysts and developers framed the night as evidence that Chinese models are now competing for production work previously thought to belong to Western incumbents. For buyers and builders, the shift matters: vendors that can deliver dependable, efficient agentic systems will have an edge in enterprise adoption, while those that remain at the demo stage risk being leapfrogged.
Yet important questions remain. Benchmarks supplied by vendors can be selective; operational robustness, security, safety guardrails and reproducible third‑party verification will determine which models actually flourish in enterprise and consumer deployments. The broader geopolitical context — export controls on advanced chips, international partnerships on data and research, and Western cloud providers’ reactions — will shape how quickly these models influence markets beyond China’s borders.
For now, the takeaway is speed. The simultaneous rollouts underscore how quickly China’s AI ecosystem is moving from proof‑of‑concept to production, and how competition is shifting from pure scale to a blend of efficiency, long‑context reasoning and continuous agentic behaviour. In that race, falling behind can happen in a single night.
