# AI inference
Latest news and articles about AI inference
Total: 3 articles found

Nvidia’s $20bn Bet on ‘Extreme’ Inference Chips Signals a Shift from Training to Cheap, High‑Throughput AI
Nvidia’s roughly $20 billion acquisition of Groq’s technology and team marks a strategic bet that AI’s commercial future lies in low‑cost, high‑throughput inference rather than giant training clusters. Meanwhile, Chinese startups and spin‑outs are racing to produce specialized inference chips, aiming to slash per‑token costs and capture regional markets as AI applications scale rapidly.

Alibaba Unveils Qwen3‑Max‑Thinking, a Trillion‑Parameter Inference Model Aimed at Beating Western Rivals
Alibaba has released Qwen3‑Max‑Thinking, a trillion‑parameter inference model it says surpasses leading Western models on multiple benchmarks, with stronger agent tool‑calling and reduced hallucinations. The company is opening trials on PC and web, positioning the model for broad commercial use, though its benchmark claims have yet to be independently verified.

vLLM Team's Inferact Secures $150m Seed at $800m Valuation, Signalling Fresh Bet on AI Inference Infrastructure
Inferact, founded by the creators of the open‑source vLLM project, raised $150 million in a seed round at an $800 million valuation, led by Andreessen Horowitz and Lightspeed. The deal signals strong investor conviction in companies that can commercialize efficient LLM inference, but Inferact will face competition from cloud providers and specialized rivals as it seeks to translate open‑source credibility into enterprise revenue.