China’s technology giants turned the 2026 Lunar New Year into more than a marketing spectacle. Alongside blockbuster cash promotions from Alibaba, Tencent and Baidu, the holiday period has seen a torrent of new multimodal AI models — video, image and text systems — from ByteDance, Alibaba, Zhipu (智谱), MiniMax and smaller rivals. What looks like a product race is also a large-scale stress test of the compute and commercial plumbing beneath the Chinese AI ecosystem.
The technical backdrop is stark. ByteDance’s Seedance 2.0 video model, one of the headline releases, reportedly consumes roughly 350,000 tokens to generate a single 10‑second 1080p clip. As firms push beyond text chat into image and video generation, the computational burden per user is rising fast. Industry data cited by domestic brokers show daily token calls at major platforms jumping from billions in early 2024 to tens of trillions by the end of 2025, with combined mainstream-model consumption in February 2026 running at roughly 180 trillion tokens per day.
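To put those figures in perspective, a back-of-envelope sketch (the per-clip token count and daily total are the estimates cited above; treating all daily traffic as video generation is purely an illustrative simplification):

```python
# Scale check using the figures cited above.
TOKENS_PER_CLIP = 350_000            # reported Seedance 2.0 cost per 10-second 1080p clip
DAILY_TOKENS = 180_000_000_000_000   # ~180 trillion tokens/day across mainstream models

# If all daily traffic were video generation, how many clips would that be?
clip_equivalents = DAILY_TOKENS / TOKENS_PER_CLIP
print(f"{clip_equivalents:,.0f} ten-second clips per day")  # ≈ 514 million

# Expressed as hours of generated 1080p footage per day.
hours_per_day = clip_equivalents * 10 / 3600
print(f"{hours_per_day:,.0f} hours of video per day")       # ≈ 1.4 million hours
```

Even as a crude upper bound, the arithmetic shows why per-token costs, not user counts, now dominate capacity planning.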
The immediate commercial consequence is pressure on cloud and GPU capacity, and a corresponding rethink of pricing. Cloud providers worldwide, from AWS and Google Cloud to domestic Chinese operators, have already announced price increases on compute capacity. Chinese model vendors are following suit: Zhipu reworked its GLM Coding Plan with price rises starting at around 30%, citing sustained demand and the need for greater investment in stability and optimisation, and the new plan sold out almost immediately.
Analysts and investment banks have begun to treat tokens — the unit of model inference — as a new "metering" currency for AI services. Where digital services were once measured in daily active users or minutes, sellers now have reason to charge for inference tokens because multimodal and long-context models make token consumption a structural, rather than incidental, cost. That shift gives cloud operators and model providers renewed pricing power and creates business models based on subscriptions, tiered access and usage-based billing.
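A minimal sketch of what tiered, usage-based token billing can look like in practice (tier boundaries and per-million prices here are hypothetical, not any vendor’s actual rate card):

```python
# Hypothetical tiered token billing: each tier prices only the tokens
# that fall within its band, like tax brackets. All numbers invented.
TIERS = [
    (10_000_000, 0.50),     # first 10M tokens: $0.50 per million
    (100_000_000, 0.40),    # next 90M tokens:  $0.40 per million
    (float("inf"), 0.30),   # beyond 100M:      $0.30 per million
]

def monthly_bill(tokens_used: int) -> float:
    """Charge tokens band by band against the tier table."""
    bill, lower = 0.0, 0
    for upper, price_per_million in TIERS:
        band = min(tokens_used, upper) - lower
        if band <= 0:
            break
        bill += band / 1_000_000 * price_per_million
        lower = upper
    return bill

print(monthly_bill(50_000_000))  # 10M @ $0.50 + 40M @ $0.40 = 21.0
```

The marginal-rate structure is what gives providers pricing power at the low end while still courting high-volume enterprise workloads with falling unit costs.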
For investors and product teams the implications are tangible. Brokers counsel exposure to cloud infrastructure (GPUs, storage, I/O), to model vendors that can monetise high‑ROI enterprise scenarios (coding, agents, business workflows) and to tools that manage safety and runtime governance. Token inflation benefits suppliers of compute and specialised software, but it also creates friction for consumer-facing services if costs are passed through or if capacity shortages trigger throttling.
The rapid price resets and capacity tightness also expose strategic vulnerabilities. Heavy reliance on high-end accelerators ties the industry to global supply chains and export controls; open-source models and efficiency breakthroughs could quickly compress margins; and proliferating multimodal outputs — especially realistic video — make content moderation and regulation urgent. In short, the commercial upside for vendors is real, but so are the operational, regulatory and geopolitical risks.
China’s AI "Spring Festival" therefore matters beyond a parade of product debuts. It is a stress-test of an industrial transition: from experimental chatbots to ubiquitous, compute‑hungry multimodal services. The winners will be firms that can secure and price scarce compute, translate token consumption into durable enterprise value, and contain the legal and reputational fallout of more convincing synthetic media.
