Xiaomi has unveiled three new large models under its MiMo V2 banner — a trillion‑parameter base model tuned for agent workflows (MiMo‑V2‑Pro), a full‑modal agent (MiMo‑V2‑Omni) and a high‑fidelity speech synthesiser (MiMo‑V2‑TTS) — signalling a deliberate move by a major device maker to own both the underlying model stack and the agent layer that sits above it.
The launch clarifies a recent market mystery: two anonymous models that dominated API call charts on OpenRouter under the names Hunter Alpha and Healer Alpha were in fact early test versions of Xiaomi’s new models. Xiaomi is making those early builds available to developers via OpenRouter and is offering limited free access and integration support through several agent frameworks, a strategy designed to jump‑start third‑party experimentation.
MiMo‑V2‑Pro is Xiaomi’s flagship. The model has more than one trillion parameters in total, with an active working set of about 42 billion; it supports a one‑million‑token context window and has been tuned for complex, multi‑step tool use, long‑range planning and automated workflow orchestration. Benchmarks place it among the global top ten on Artificial Analysis and near contemporary high‑end models on agent and programming tasks, while Xiaomi’s published API pricing is a fraction of comparable commercial offerings, a clear competitive lever aimed at developers.
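To put the two parameter figures in proportion, here is a rough back‑of‑the‑envelope sketch in Python. It uses only the numbers quoted above; the constant and function names are illustrative, and because the total is described as exceeding one trillion, the result should be read as an upper bound on the active share rather than an exact figure.

```python
# Figures quoted in the article. The total is a lower bound
# ("exceeds one trillion"), so the computed share is an upper bound.
TOTAL_PARAMS_LOWER_BOUND = 1_000_000_000_000  # one trillion
ACTIVE_PARAMS = 42_000_000_000                # about 42 billion


def active_fraction(active: int, total: int) -> float:
    """Return the share of parameters that sit in the active working set."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS_LOWER_BOUND)
print(f"Active share: at most about {share:.1%}")
```

On these numbers, roughly one parameter in 24 is in the active working set at any one time, which helps explain how a model of this size can be priced well below comparable offerings.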
MiMo‑V2‑Omni targets real‑world, cross‑modal agent tasks. It ingests text, vision and audio, handles long continuous audio, multi‑speaker separation and audio‑visual reasoning, and claims superior performance on several audio and video benchmarks versus leading multimodal models. Xiaomi demonstrates Omni performing shopping research, price comparison and automated interaction with web services and office suites, showing how an agent can carry tasks end‑to‑end from discovery to purchase.
MiMo‑V2‑TTS is marketed as an agent‑ready text‑to‑speech model trained on "over hundreds of millions of hours" of speech data using Xiaomi’s Audio Tokenizer and a multi‑codebook speech‑text architecture. The model emphasises controllable, fine‑grained prosody, dialect support and role‑based voice acting, including singing, and is positioned to give agents a more natural, expressive voice.
Xiaomi has also rolled out MiMo Claw, an on‑site agent experience that lets users "raise a shrimp" (the community metaphor for deploying and running agent workflows), with a 30‑minute free session that auto‑destroys data on exit. The company is integrating MiMo into its browser and office ecosystem and plans to offer a week of free API access through developer frameworks including OpenClaw, OpenCode and others to encourage real‑world agent applications.
The release comes amid intensifying competition in China’s domestic large model market, where Xiaomi’s MiMo‑V2‑Pro ranks behind GLM‑5 and MiniMax in some aggregates but competes closely on agent and programming measures. Xiaomi’s team includes engineers formerly associated with DeepSeek, and the company’s decision to surface early test models under a different name — then claim them — underscores a pragmatic approach to seeding usage and gathering real‑world feedback.
For international audiences the important takeaway is strategic rather than purely technical: Xiaomi is demonstrating that an end‑to‑end device maker can combine large, capable models with product and ecosystem control to deliver lower‑cost, on‑device and cloud‑assisted agent experiences. That combination could reshape where and how advanced AI services are hosted, how they are monetised, and how quickly consumer‑facing agents proliferate beyond niche developer communities.
The rollout also raises familiar questions about governance and safety. Cheap, broadly accessible agent APIs accelerate experimentation but also widen the attack surface for misuse, data leakage and regulatory scrutiny. Xiaomi’s promise that short demo sessions auto‑destroy data is a start, but long‑term deployments that link model outputs to real‑world actions — such as automating purchases, handling accounts or controlling devices — will test current frameworks for model accountability and platform responsibility.
Whether Xiaomi’s pricing and integration strategy forces a recalibration among cloud model providers and rival Chinese teams remains to be seen. For now, the company has laid down a marker: device OEMs can be more than hardware manufacturers — they can be the gatekeepers and enablers of the next generation of practical, task‑oriented AI agents.
