# Inference

Latest news and articles about Inference

Total: 6 articles found

The Great Recalibration: Why the GPU’s Hegemony in AI is Finally Cracking

As AI shifts from the training phase to mass deployment, the industry is moving away from GPU-centricity toward system-level efficiency. The resurgence of the CPU, driven by the needs of inference and AI Agents, is fundamentally changing the architecture of data centers and the competitive landscape for hardware giants like Intel, AMD, and Nvidia.

NeTe, May 1, 2026, 00:58

#AI Infrastructure #GPU #CPU

Technology

Shanghai’s Silicon Stealth: Zhongzixing Prepares to Tape Out AI-Native 'NEU' Chip

Shanghai startup Zhongzixing Technology plans a Q4 tape-out for its NEU AI chip, claiming massive performance and energy efficiency gains over traditional GPUs. Led by veterans from Intel and Nvidia, the firm represents China's push for specialized, high-performance AI silicon.

NeTe, April 25, 2026, 01:58

#Zhongzixing Technology #AI Chips #Semiconductors

Technology

The Token Squeeze: Tesla’s New Silicon and Tencent’s Price Hikes Signal AI's Move to the Edge

Tesla's AI5 chip tape-out and Tencent Cloud's price hikes signal a strategic shift in the AI industry from cloud-based training to edge-side inference and high-volume token consumption. This transition is driving a revaluation of the AI supply chain, moving from raw hardware rental to sophisticated 'Agent-as-a-Service' business models.

NeTe, April 16, 2026, 03:29

#Tesla AI5 #Tencent Cloud #Semiconductors

Technology

The H100’s Second Act: Why NVIDIA’s Legacy Silicon is Seeing a 40% Rental Surge

NVIDIA’s four-year-old H100 GPUs are seeing a 40% price surge in the rental market due to a massive spike in demand for video generation and multi-agent AI systems. Despite the launch of newer Blackwell chips, supply remains critically tight, forcing AI giants to lock in long-term contracts for legacy silicon.

NeTe, April 2, 2026, 20:58

#NVIDIA #H100 #GPU Shortage

Technology

The Inference Pivot: China Surpasses U.S. in Weekly AI Token Consumption

China has overtaken the United States in weekly AI token consumption, reaching 4.69 trillion tokens and claiming the top three most-called models globally. Projections from J.P. Morgan suggest a massive 370-fold growth in Chinese AI inference by 2030, signaling a definitive shift toward large-scale industrial application.

NeTe, March 22, 2026, 06:59

#Artificial Intelligence #China Tech #Large Language Models

Technology

Huang’s GTC Playbook: NVIDIA Repackages AI as Token Factories — Hardware, Agents and a $1tn Inference Bet

At GTC, Huang declared a structural shift from training to inference and unveiled a hardware and software roadmap (Vera Rubin systems, Groq LPU integration, Kyber racks, and the OpenClaw/NemoClaw agent frameworks) that he says could create at least $1 trillion in revenue by 2027. The announcements reframe AI as a token-generation business that will reshape data centre design, software stacks and corporate IT strategy.

NeTe, March 17, 2026, 04:02

#NVIDIA #Jensen Huang #Vera Rubin