Skip to content
FAQ

Amazon's Custom Chip Business Crosses $20B Run Rate, Rivaling Its Standalone Business Units

Amazon's in-house silicon division — spanning Trainium, Inferentia, and Graviton — has surpassed a $20 billion annual revenue run rate and is growing over 100% year-over-year, CEO Andy Jassy has confirmed. With Trainium3 now shipping and major multi-year commitments from OpenAI, Anthropic, Meta, and Uber, Amazon is executing the most credible challenge to Nvidia's AI chip dominance in the hyperscaler space.

4 min read

Andy Jassy’s annual letter to shareholders typically contains one or two lines that the rest of the industry spends the following year parsing for competitive intelligence. This year, one of them was about chips. Amazon’s custom silicon business — encompassing the Trainium AI training accelerators, Inferentia inference chips, and Graviton general-purpose processors — has crossed a $20 billion annual revenue run rate, Jassy confirmed, and is growing at triple-digit percentages year-over-year.

To put that number in context: $20 billion in annual revenue would make Amazon’s chip division a significant standalone business in its own right, larger than many household-name semiconductor companies. The division did not exist in any meaningful commercial sense five years ago.

The Chips Behind the Number

Trainium and Inferentia are purpose-built for AI workloads in ways that general-purpose CPUs cannot replicate — and that even Nvidia GPUs handle less efficiently. The chips are designed around the specific matrix multiplication operations that underlie transformer model training and inference, with memory architectures tuned for the access patterns of large-scale attention computation.

Trainium2, which entered broad availability in 2025, offers approximately 30% better price-performance than comparable Nvidia H100 configurations for transformer training workloads — a meaningful advantage at the scale Amazon operates. The caveat has always been software: Nvidia’s CUDA ecosystem is deeply entrenched, and migrating workloads requires engineering investment. For hyperscalers like Amazon that can fund that migration, the economics justify it; for smaller organizations, CUDA lock-in remains real.

Trainium3, which began shipping to select customers in early 2026, extends that advantage with another 30-40% performance improvement. The chip uses a chiplet-based architecture that allows Amazon to modularly scale compute and memory independently — an approach borrowed from AMD’s Zen processor playbook and applied to AI accelerators. Early benchmark data from customers suggests Trainium3 matches or exceeds H200 performance for standard LLM training configurations while maintaining Trainium2’s cost advantage.

Inferentia, the inference-focused counterpart, has seen its adoption curve steepen sharply as enterprise AI deployments shift from proof-of-concept to production. Running inference at scale is fundamentally a cost problem, and Inferentia’s lower per-token cost compared to A100/H100-based inference infrastructure has driven adoption among high-volume users. Graviton, Amazon’s Arm-based general compute processor, rounds out the portfolio as the backbone of non-AI AWS workloads.

The Customer List Is the Story

The most strategically significant detail in Amazon’s chip disclosure is not the $20 billion number — it is the customer roster. OpenAI, Anthropic, Meta, and Uber have all signed multi-year commitments to Trainium infrastructure, according to Amazon.

OpenAI’s involvement is particularly striking. The company’s $50 billion commitment to AWS, announced earlier this year as part of a broader ecosystem investment, includes substantial Trainium3 allocation. That OpenAI — whose commercial partnership with Microsoft has made Azure its primary cloud provider — would also commit meaningfully to AWS Trainium signals that at the current scale of AI compute demand, no single hardware vendor can be relied upon as the sole supplier.

Anthropic’s commitment to Trainium was formalized as part of its own multi-billion-dollar AWS partnership, and runs parallel to the company’s continued use of Google Cloud and its own on-premises infrastructure. The redundancy is intentional: for a company training models at the frontier, having multiple hardware options reduces the single-point-of-failure risk that plagued AI companies during the GPU shortage of 2023-2024.

Meta’s adoption is perhaps the most technically interesting. Meta’s research team initially built its training infrastructure around Nvidia hardware, and its internal AI frameworks were optimized for CUDA. The migration to Trainium for a meaningful portion of Meta’s training workload required substantial software engineering investment — and Meta’s willingness to make that investment reflects the scale of the cost savings at play.

Jassy’s $50 Billion Bet

In his shareholder letter, Jassy went further than confirming the $20 billion figure. He described Amazon’s chip business as having the potential to reach $50 billion in annual revenue within “a few years,” and hinted — carefully, given regulatory sensitivities — that Amazon might begin selling Trainium and Inferentia access externally, not just as part of AWS services, but potentially as standalone chip products to enterprise data center operators.

That last point is speculative but strategically logical. The alternative to buying CUDA-compatible GPUs for an on-premises AI data center is currently limited to AMD’s MI series. Amazon entering that market with chips that have proven cost advantages in cloud-scale deployments would fundamentally alter the competitive dynamics for enterprise AI infrastructure.

Nvidia’s position is not as secure as the revenue numbers suggest. Its H100 and H200 chips are sold out, its margins are historically high, and its software ecosystem is genuinely defensible. But the $20 billion Amazon chip business, the Groq LPU’s inference speed advantage, and AMD’s aggressive MI350 roadmap all point in the same direction: hyperscalers and specialized vendors are chipping away at the assumption that Nvidia is the only viable path to frontier AI compute.

What This Means for the AI Compute Market

The $20 billion milestone is more than a financial achievement — it is proof that a credible alternative AI silicon ecosystem can be built at scale. The barriers to Nvidia displacement have always been software (CUDA), supply (allocations), and performance (FLOPS per dollar). Amazon has addressed the second and third while building the tooling to reduce the first.

For enterprises evaluating their AI infrastructure strategy, Amazon’s traction validates the multi-cloud, multi-chip approach that has been theoretically recommended but practically difficult to execute. The combination of cost savings, supply predictability, and AWS ecosystem integration makes Trainium worth serious evaluation for any organization running inference at significant scale.

For Nvidia, the more immediate concern may not be losing hyperscaler business — those customers will always hedge. It is the downstream effect on pricing power. If Amazon’s chips hold down the effective price of AI compute in the market, Nvidia’s ability to sustain gross margins north of 70% comes under structural pressure that CUDA lock-in alone cannot resolve indefinitely.

The data center AI chip market is, for the first time, beginning to look genuinely competitive. Amazon’s $20 billion is the clearest evidence yet that the competition is real.

Amazon AWS Trainium Inferentia custom silicon AI chips Nvidia cloud computing
Share

Related Stories