Yin Qi, one of China’s most visible AI entrepreneurs from the first wave of computer-vision startups, has resurfaced at the centre of the country’s second AI era. On 26 January, StepFun (阶跃星辰) announced Yin as its chairman while confirming a B+ financing round that raised more than RMB5 billion, the largest single financing round for a Chinese large-model company in the past year. Yin will split his time between StepFun and Qianli Technology (千里科技), the listed company he has chaired since last year’s acquisition and rebrand of Lifan Group assets.
The personnel move and the cash infusion together mark a tactical shift in China’s large-model landscape. StepFun’s new financing attracted a mix of state-led industrial funds and strategic corporate investors — from Shanghai and provincial investment vehicles to Tencent, Qiming, and other existing backers — signalling both policy and market appetite for heavy, model-centric R&D. The company says the money will be used to push foundation-model research and accelerate “AI+terminal” commercialisation, an explicit strategy to marry large models with consumer and automotive hardware.
Yin’s personal narrative helps explain the strategy. A Tsinghua ‘Yao class’ alumnus who founded Megvii (旷视科技) at 22, Yin learned hard lessons when the AI 1.0 cohort failed to translate massive fundraising into sustained profits or smooth public listings. His public statements now emphasise product-market fit, organisational engineering and talent density alongside an enduring belief in AGI, particularly a variant that is embodied and interacts with the physical world.
Operationally, Yin says StepFun will focus on three technical axes in 2026: advancing its foundation model series (Step 3.5 to Step 4), deep multimodal fusion across text, speech and vision, and VLA work — the integration of visual, language and action capabilities suited to end devices. Commercially, StepFun intends to prioritise terminal scenarios where hardware serves as the carrier for AI — cars, phones and consumer wearables — using Qianli’s automotive roadmap as a first real-world laboratory for model-driven features.
The move highlights a wider phase change in China’s ecosystem. The sector has moved beyond the early scramble among the ‘six little tigers’ of AI to a more differentiated field: some firms are pursuing public listings and scale (e.g., Zhipu AI, MiniMax), others are pivoting to vertical or customised enterprise models (e.g., Baichuan, 01.AI), and a cohort that includes StepFun is pursuing deep R&D plus ecosystem tie-ups. Investors appear to be rewarding mixed strategies that combine heavy model research with clearer hardware or industry pathways to revenue.
Strategically, Yin’s insistence on software-hardware integration is pragmatic. He argues that models alone do not form a closed commercial loop and that hardware, whether cars or consumer devices, is necessary to anchor value capture and repeated user interaction. This three-part stack (model, software and hardware) echoes broader industry thinking that competition after the parameter arms race will be decided by cost, latency and integration at the edge rather than by raw parameter counts alone.
But challenges remain. Building world-class foundation models is capital- and talent-intensive, and the labour market for senior AI researchers and systems engineers is fiercely competitive. Technical risk is compounded by geopolitical and supply-chain constraints that could affect chip access and cross-border partnerships. Moreover, China’s tighter regulatory environment around data, model safety and export controls injects uncertainty into timelines for productisation and overseas scale-up.
For global observers, the episode is significant because it illustrates how China is reconciling two priorities: catching up on frontier model capabilities while ensuring those capabilities are monetised in domestically strategic sectors. The combination of a high-profile industry veteran, a large state-and-capital-backed financing round, and a clear hardware-centred go-to-market strategy makes StepFun a company to watch as China’s AI competition moves from headline model launches to the hard work of embedding intelligence into physical platforms.
