Flash News

NVIDIA CEO Jensen Huang Emphasizes Future CPUs Must Be Designed for Billions of Agents, Not Humans

At GTC Taipei 2026, NVIDIA CEO Jensen Huang laid out the design philosophy behind the Vera CPU, emphasizing that future CPUs must be designed for billions of agents rather than for humans.

The Vera CPU features extreme single-thread performance (described as the highest IPC in the world, at 10 instructions per clock cycle), ultra-high per-core bandwidth, 3.6 TB/s of total on-chip bandwidth, and exceptional energy efficiency. It supports PCIe Gen 6 and 1.2 TB/s of LPDDR5 memory bandwidth, and is said to significantly outperform the strongest x86 CPUs currently available.

This CPU will play three major roles in the Vera Rubin system: GPU orchestration, agent harnessing, and AI storage.

In terms of market dynamics, AI infrastructure is rapidly transitioning toward agent-optimized hardware, with funding concentrating on high-IPC, low-latency CPU platforms. NVIDIA benefits from tighter full-stack control, while traditional x86 CPU manufacturers face pressure from a generational performance lag.

Source: Public Information

ABAB AI Insight

NVIDIA previously entered the server CPU market with the Grace CPU, and the release of the Vera CPU continues its integrated expansion from GPU dominance to CPU+GPU. Vera targets the agent era's demand for extremely low-latency tool invocation, database access, and model orchestration, breaking with the paradigm of traditional CPUs designed for human interactions measured in seconds.

In terms of capital strategy, NVIDIA is concentrating resources on single-thread performance and high-speed on-chip interconnects, with the goal of building an agent-specific computing platform. By unifying the architecture, it aims to reduce cross-chip overhead and achieve efficient collaboration from inference to tool invocation. It has already sold millions of Grace CPUs, which gives it scale.

Similar cases include Apple's pursuit of extreme single-thread performance and unified memory in its M series, as well as NVIDIA's early positioning in data centers through Grace Blackwell. AI hardware is now at a critical stage in the transition from human-interactive CPUs to agent-native CPUs.

Essentially, this represents a technological substitution: CPU design is shifting from serving human interactions measured in seconds to serving agent responses measured in nanoseconds. The mechanism is that agents need extremely low latency and high-bandwidth collaboration to operate autonomously, which lets NVIDIA gain pricing power in the CPU domain through a new architecture and strengthen its full-stack control of AI infrastructure.

ABAB News · Cognitive Law

Humans think in seconds; agents decide in nanoseconds.
Past CPUs served billions of humans; future CPUs will serve billions of agents.
Excellent CPUs sell low latency; traditional CPUs sell core counts.

Source: ABAB News · 06/01/2026, 05:09 AM