# Groq
Latest news and articles about Groq
Total: 4 articles found

## The Token Wars Begin: Nvidia’s Vera Rubin vs China’s Low‑Cost Inference Push
At GTC 2026, Nvidia declared that the AI era has shifted from training models to continuously generating tokens, and presented Vera Rubin, a full‑stack platform it says can cut token costs dramatically. At the same time, Chinese large‑model providers are already undercutting foreign counterparts on token prices and capturing high API volumes, creating a global contest over who will set token pricing and infrastructure standards.

## Huang’s GTC Playbook: NVIDIA Repackages AI as Token Factories — Hardware, Agents and a $1tn Inference Bet
At GTC, Huang declared a structural shift from training to inference, unveiling a hardware and software roadmap — Vera Rubin systems, Groq LPU integration, Kyber racks, and OpenClaw/NemoClaw agent frameworks — that he says could create at least $1 trillion in revenue by 2027. The announcements reframe AI as a token‑generation business that will reshape data centre design, software stacks and corporate IT strategy.

## Amazon Taps Cerebras for Cloud Inference Push, Taking Aim at Nvidia’s Dominance
AWS will deploy Cerebras inference chips alongside its Trainium3 processors in a new service aimed at faster, cheaper AI inference for chatbots and coding tools. The move reflects a market shift from GPU‑heavy training towards specialised, lower‑latency inference hardware and intensifies competition with Nvidia’s GPU ecosystem.

## Nvidia’s $20bn Bet on ‘Extreme’ Inference Chips Signals a Shift from Training to Cheap, High‑Throughput AI
Nvidia’s roughly $20 billion acquisition of Groq’s technology and team marks a strategic bet that AI’s commercial future lies in low‑cost, high‑throughput inference rather than in giant training clusters. Chinese startups and spin‑outs are racing to produce specialised inference chips, aiming to slash per‑token costs and capture regional markets as AI applications scale rapidly.