Alibaba Cloud has announced a significant price reduction for its DeepSeek-V4-Pro model service on the Bailian platform, specifically targeting 'implicit cache' billing. Starting April 29, 2026, the price for cached tokens will drop to 1 RMB per million tokens. The move signals a strategic pivot in the Chinese artificial intelligence market, shifting competition from headline-grabbing model prices to the more nuanced terrain of operational efficiency.
Implicit caching allows the system to store previously processed input data, so subsequent queries that reuse the same content can be served at a fraction of the cost. Under the new pricing structure, input tokens not found in the cache ('misses') are billed at standard rates, while tokens served from the cache ('hits') receive the discounted rate. This approach is particularly advantageous for developers working with large codebases or long-form documents, where the context remains constant across multiple interactions.
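To make the hit/miss arithmetic concrete, here is a minimal sketch in Python. The cached rate of 1 RMB per million tokens comes from the announcement; the standard input rate of 10 RMB per million tokens is a hypothetical placeholder, not a published price, and the function names are illustrative only. The sketch also assumes that every request after the first fully hits the cache on the shared context, which real workloads may not achieve.

```python
# Rate from the announcement: cached ("hit") input tokens.
CACHED_RATE_RMB_PER_M = 1.0
# HYPOTHETICAL placeholder for the standard ("miss") input rate; not a published price.
STANDARD_RATE_RMB_PER_M = 10.0


def input_cost_rmb(miss_tokens, hit_tokens,
                   standard_rate=STANDARD_RATE_RMB_PER_M,
                   cached_rate=CACHED_RATE_RMB_PER_M):
    """Input-side cost in RMB, with rates expressed per million tokens."""
    return (miss_tokens * standard_rate + hit_tokens * cached_rate) / 1_000_000


def session_cost_rmb(shared_tokens, unique_tokens, requests):
    """Cost of `requests` calls that share one long context.

    Assumption: the first call misses on everything; each later call hits
    the cache for the shared context and misses only on its unique part.
    """
    miss_tokens = shared_tokens + unique_tokens * requests
    hit_tokens = shared_tokens * (requests - 1)
    return input_cost_rmb(miss_tokens, hit_tokens)


# Ten questions about the same 100,000-token document, 500 new tokens each.
with_cache = session_cost_rmb(shared_tokens=100_000, unique_tokens=500, requests=10)
without_cache = input_cost_rmb(miss_tokens=10 * 100_500, hit_tokens=0)
print(f"with cache: {with_cache:.2f} RMB, without cache: {without_cache:.2f} RMB")
```

Under these assumed numbers, most of the bill shifts from the standard rate to the cached rate once the shared context has been processed once, which is why the saving grows with the number of repeated interactions.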
The inclusion of DeepSeek-V4-Pro in this price adjustment is noteworthy. DeepSeek, a Chinese lab that has gained international acclaim for its high-performance, cost-efficient models, has become a favorite for enterprise applications. By further lowering the barriers to entry, Alibaba Cloud is positioning its platform as the preferred destination for deploying DeepSeek’s advanced reasoning and multi-modal capabilities at scale.
This pricing shift occurs against the backdrop of a broader, aggressive discounting cycle within the Chinese cloud industry. Major players like Baidu, Tencent, and Alibaba are no longer competing only on model parameters, but also on the total cost of ownership (TCO) for AI integration. As companies transition from experimental AI pilots to full-scale production, the ability to manage recurring token costs through caching becomes a decisive factor in vendor selection.
