Chinese AI start-up DeepSeek has begun a limited gray-scale test of what appears to be a substantially upgraded large language model. Reporters who disabled both the model's "deep thinking" and "online search" features found that the system now accepts a context window of 1 million tokens, up from the previous 128k, and that its internal knowledge cutoff has advanced to May 2025. The company has not formally announced the change, suggesting a cautious, staged rollout to partners or power users.
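The reported limit could in principle be checked programmatically if the upgraded model surfaces through DeepSeek's OpenAI-compatible API. The sketch below is a hypothetical probe rather than a reproduction of the reporters' test: the model name ("deepseek-chat"), the API key placeholder and the word-to-token approximation are all assumptions, and the gray-test model may not be exposed via the public API at all.

```python
# Hypothetical probe of the largest prompt a chat endpoint will accept.
# Assumes DeepSeek's OpenAI-compatible API; model name and key are placeholders,
# and one short filler word is treated as roughly one token.
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_KEY")

def accepts(n_words: int, model: str = "deepseek-chat") -> bool:
    """Return True if a prompt of roughly n_words filler words is accepted."""
    try:
        client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "ping " * n_words + "Reply OK."}],
            max_tokens=1,
        )
        return True
    except Exception:
        # Over-length prompts come back as context-length errors from the server.
        return False

# Binary-search the boundary, assuming the lower bound succeeds and the upper fails.
lo, hi = 100_000, 1_200_000
while hi - lo > 10_000:
    mid = (lo + hi) // 2
    lo, hi = (mid, hi) if accepts(mid) else (lo, mid)
print(f"Largest accepted prompt: roughly {lo:,} filler words (~tokens)")
```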
A context capacity of one million tokens is not a minor tweak: it transforms the kinds of tasks the model can handle. Where 128k already allowed long documents and substantial multi-turn sessions, 1M tokens can hold dozens of long technical papers, entire legal contracts or books, or a large codebase, enabling end-to-end reasoning over far larger information sets without repeated retrieval steps. For enterprises and research teams, this removes much of the chunking, stitching and prompt-engineering friction that long-form workflows otherwise require.
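To give a rough sense of scale, the sketch below estimates whether a directory of documents or source files would fit in a single 1M-token prompt. It uses OpenAI's cl100k_base tokenizer via the tiktoken library purely as a proxy; DeepSeek's own tokenizer will produce somewhat different counts, and the directory path is hypothetical.

```python
# Rough sketch: does this corpus fit in a 1M-token context, or does it need chunking?
# Token counts use tiktoken's cl100k_base encoding as a stand-in for DeepSeek's tokenizer.
from pathlib import Path
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def corpus_tokens(root: str, suffixes=(".md", ".py", ".txt")) -> int:
    """Approximate total tokens across all matching files under root."""
    return sum(
        len(enc.encode(p.read_text(errors="ignore")))
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix in suffixes
    )

BUDGET = 1_000_000
used = corpus_tokens("./my_project")  # hypothetical project directory
verdict = "fits in a single prompt" if used < BUDGET else "still needs chunking"
print(f"{used:,} of {BUDGET:,} tokens: corpus {verdict}")
```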
The updated knowledge cutoff—May 2025—matters because it places the model's training data well beyond the common 2023–2024 horizons of many contemporaries. That makes the system more immediately useful for tasks that rely on up‑to‑date market, scientific or regulatory information. Pairing fresher knowledge with a far longer context window improves the model's practical utility in document analysis, summarisation, compliance review and technical troubleshooting.
Technically, providing a million‑token context at acceptable latency and cost is difficult, and points to architectural changes. The upgrade may reflect more efficient attention mechanisms, hierarchical memory, retrieval‑augmented pipelines that emulate long context, or a fundamentally new base model—hints that the company is moving beyond simple fine‑tuning of earlier architectures. The absence of "deep thinking" and "online search" in the diagnostic run implies the new model was being probed in a conservative, standalone configuration to check base behaviour.
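A back-of-the-envelope comparison makes the scaling problem concrete: materialising dense attention scores grows quadratically with sequence length, while a fixed local window grows linearly. The head count, window size and fp16 assumption below are illustrative defaults, not DeepSeek's actual configuration.

```python
# Back-of-the-envelope: memory to materialise attention scores per layer, in fp16.
# Dense attention is quadratic in sequence length; a fixed local window is linear.
# All parameters are illustrative assumptions, not DeepSeek's real architecture.

def dense_scores_gb(seq_len: int, n_heads: int = 32, bytes_per_elem: int = 2) -> float:
    """Full seq_len x seq_len score matrix per head, summed over heads."""
    return seq_len ** 2 * n_heads * bytes_per_elem / 1e9

def windowed_scores_gb(seq_len: int, window: int = 4096, n_heads: int = 32,
                       bytes_per_elem: int = 2) -> float:
    """Each token attends only to a local window, so cost grows linearly."""
    return seq_len * window * n_heads * bytes_per_elem / 1e9

for n in (128_000, 1_000_000):
    print(f"{n:>9,} tokens: dense ~ {dense_scores_gb(n):,.0f} GB/layer, "
          f"4k-window ~ {windowed_scores_gb(n):,.0f} GB/layer")
```

In practice, kernels in the FlashAttention family avoid materialising the full score matrix, but compute still scales quadratically with length, which is why windowed, sparse or retrieval-based schemes become attractive at million-token scale.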
This development also feeds into the broader race among Chinese AI firms to deliver increasingly capable models while navigating regulatory and export constraints. Domestic providers from established incumbents to fast‑moving startups are competing to offer commercial products that combine up‑to‑date training, large contexts and safety controls. A successful 1M‑token model would be attractive to sectors such as finance, law, healthcare and cloud services inside China, where data residency and control are high priorities.
There are policy and product risks. Larger contexts amplify both potential value and vulnerabilities: the model can aggregate sensitive information and produce plausible‑sounding but incorrect inferences across extensive documents. Operationalising such a model requires investment in robust safety filters, provenance tracking and efficient serving infrastructure. Nonetheless, DeepSeek's quiet test suggests the company is angling for an enterprise‑grade position in the next phase of the LLM market.
