Meta Announces Multi-Year Partnership with AWS to Deploy Tens of Millions of Amazon's Self-Developed Graviton5 CPU Cores
Meta announced a multi-year partnership with Amazon Web Services (AWS) to deploy tens of millions of AWS's self-developed Graviton5 CPU cores, focused on supporting Agentic AI workloads, including real-time inference, complex task orchestration, code generation, and multi-step workflow management. Market sources indicate the agreement is worth several billion dollars, with most chips to be deployed in U.S. data centers. Following the announcement, Amazon's stock price surged to near its all-time high, reflecting how capital markets are repricing the idea of "CPUs returning to AI data centers."
According to AWS, Graviton5 is built on a 3-nanometer process, packs 192 cores per chip, offers five times the cache capacity of the previous generation, and cuts inter-core communication latency by about 35%. Combined with the Nitro virtualization stack, it can run large-scale microservices and agent systems in isolated environments. Meta will prioritize Graviton for the large volume of lightweight inference and scheduling tasks, letting expensive GPUs focus on training large models and high-complexity inference, thereby lowering the overall unit compute cost of its AI agent systems.
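A minimal sketch of that layered routing idea, under stated assumptions: the pool names, task types, and FLOP cutoff below are hypothetical illustrations, not Meta's or AWS's actual scheduling policy. It simply sends control-plane work and small inferences to a CPU pool and compute-heavy calls to a GPU pool.

from dataclasses import dataclass

@dataclass
class Task:
    kind: str          # e.g. "tool_call", "orchestration", "small_infer", "llm_infer"
    est_flops: float   # rough compute estimate for the task

# Assumed cutoff: work below this arithmetic cost stays on CPU instances.
CPU_FLOPS_CUTOFF = 1e11

def route(task: Task) -> str:
    # Latency-sensitive control-plane work goes to the CPU (Graviton-class) pool.
    if task.kind in ("tool_call", "orchestration", "retrieval"):
        return "cpu-pool"
    # Lightweight inference also fits on CPU under the assumed cutoff.
    if task.kind == "small_infer" and task.est_flops < CPU_FLOPS_CUTOFF:
        return "cpu-pool"
    # Training and high-complexity inference stay on GPUs.
    return "gpu-pool"

for t in (Task("tool_call", 1e8), Task("small_infer", 5e10), Task("llm_infer", 1e14)):
    print(t.kind, "->", route(t))

Running the sketch sends the tool call and small inference to the CPU pool and the large-model call to the GPU pool, which is exactly the cost split the partnership is built around.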
Source: Public Information
ABAB AI Insight
This collaboration is not merely a case of "Meta buying a batch of chips"; it signals that in the era of Agentic AI, data-center computing architecture is shifting from "GPU dominance" to "layered GPU + high-density CPU collaboration." Agentic AI relies heavily on large numbers of small-to-medium inferences, tool calls, searches, and task-orchestration steps, which are highly parallel and latency-sensitive but relatively low in arithmetic intensity, and therefore depend far more on CPUs than a single large-model inference does. By systematically migrating this workload to Graviton, Meta effectively offloads the "control plane and lightweight inference" onto cheaper Arm CPUs, reserving the most expensive GPUs for genuinely compute-intensive training and complex inference.
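To make that workload profile concrete, the sketch below walks one agent request through a typical chain of steps; every function name is a hypothetical stand-in. Only the model calls would hit GPU-backed inference, while retrieval, tool calls, and orchestration are ordinary CPU work that scales with the number of interactions.

def retrieve(query: str) -> list[str]:
    # CPU-bound: search / retrieval-augmentation lookup.
    return [f"doc matching '{query}'"]

def call_tool(name: str, arg: str) -> str:
    # CPU-bound: external API call, code execution, database query, etc.
    return f"{name}({arg}) -> ok"

def llm_generate(prompt: str) -> str:
    # The one step that would be routed to a GPU-backed model endpoint.
    return f"plan for: {prompt[:40]}"

def handle_request(user_msg: str) -> str:
    context = retrieve(user_msg)                          # step 1: retrieval on CPU
    plan = llm_generate(user_msg + " " + str(context))    # step 2: one GPU call
    results = [call_tool("search", s) for s in plan.split()]  # step 3: many tool calls on CPU
    return llm_generate("summarize " + str(results))      # step 4: short GPU call

print(handle_request("find flights and book the cheapest option"))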
From an industry-structure perspective, this decision reshapes the competitive landscape of AI chips. The previous narrative revolved almost entirely around NVIDIA GPUs, while the Meta–AWS agreement delineates a new high-value layer within data centers: high-density CPU capacity dedicated to agent scheduling, retrieval augmentation, toolchain calls, and online service operations. This gives Amazon an opportunity to bypass head-on GPU competition and build a dominant chip-plus-cloud position in the "AI control plane": whoever controls the CPU platform for the agent layer effectively controls how massive volumes of online AI interactions and enterprise workflows are run.
For Meta, large-scale use of Graviton also has a crucial financial dimension: Agentic AI is not a one-off large-model call but a continuous, conversational, long-chain interaction embedded in business processes, with a cost structure closer to "ongoing cloud service fees" than to "one-time API calls." If GPUs were still used to handle the large volume of medium- and low-intensity agent calls, the marginal cost would make many AI agent products aimed at individuals and enterprises unprofitable; by migrating scheduling and lightweight inference to AWS's self-developed CPUs, Meta can bring the cost per interaction down to a sustainable level, making the product vision of "everyone having an AI agent" financially viable.
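A back-of-envelope version of that cost argument follows; every price and step count is an assumption chosen for illustration, not a figure from Meta or AWS.

GPU_HOUR_USD = 4.00     # assumed on-demand GPU instance price
CPU_HOUR_USD = 0.15     # assumed Graviton-class instance price
LIGHT_STEPS = 20        # assumed tool calls / retrievals / small inferences per agent session
HEAVY_CALLS = 2         # assumed compute-heavy model calls per session
LIGHT_STEP_SEC = 0.5    # assumed wall-clock time per lightweight step
HEAVY_CALL_SEC = 3.0    # assumed wall-clock time per heavy call

def cost_per_interaction(light_rate: float, heavy_rate: float) -> float:
    light = LIGHT_STEPS * LIGHT_STEP_SEC / 3600 * light_rate
    heavy = HEAVY_CALLS * HEAVY_CALL_SEC / 3600 * heavy_rate
    return light + heavy

gpu_only = cost_per_interaction(GPU_HOUR_USD, GPU_HOUR_USD)  # everything priced at GPU rates
split = cost_per_interaction(CPU_HOUR_USD, GPU_HOUR_USD)     # lightweight steps moved to CPU
print(f"GPU-only: ${gpu_only:.4f} per interaction; CPU+GPU split: ${split:.4f}")

Under these assumed numbers the split cuts the per-interaction cost by more than half; the point is not the exact figures but that the many lightweight steps dominate the bill whenever they are priced at GPU rates.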