Amazon has officially launched its latest custom-built AI chip, Trainium3, marking a significant push to challenge Nvidia's dominance in the artificial intelligence hardware market. The new chip delivers 4.4 times the performance and 40% greater energy efficiency of its predecessor, while AWS simultaneously introduced Trn3 UltraServers that combine 144 chips in a single system.
Built on 3-nanometer technology, each UltraServer provides 362 FP8 PFLOPs with up to 20.7 TB of HBM3e memory, enabling massive AI models to train in weeks instead of months. Early customers like Anthropic, Karakuri, and Decart are reporting training and inference cost reductions of up to 50% using Trainium3, with some achieving 4x faster inference at half the cost of Nvidia GPUs.
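Dividing the UltraServer totals above by the chip count gives a rough sense of what each Trainium3 chip contributes. The arithmetic below is purely illustrative, derived from the article's system-level figures; AWS has not published these per-chip numbers here.

```python
# Back-of-the-envelope per-chip figures implied by the Trn3 UltraServer
# specs above: 362 FP8 PFLOPs and 20.7 TB of HBM3e spread across 144 chips.
# Illustrative only; derived from the system totals, not official per-chip specs.

ULTRASERVER_PFLOPS_FP8 = 362    # total FP8 compute per UltraServer
ULTRASERVER_HBM_TB = 20.7       # total HBM3e capacity per UltraServer
CHIPS_PER_ULTRASERVER = 144

pflops_per_chip = ULTRASERVER_PFLOPS_FP8 / CHIPS_PER_ULTRASERVER
hbm_gb_per_chip = ULTRASERVER_HBM_TB * 1000 / CHIPS_PER_ULTRASERVER

print(f"~{pflops_per_chip:.2f} FP8 PFLOPs per chip")  # ~2.51 PFLOPs
print(f"~{hbm_gb_per_chip:.0f} GB HBM3e per chip")    # ~144 GB
```

The roughly 144 GB of memory per chip is what lets a full UltraServer hold very large models without the aggressive sharding older hardware would require.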
AWS claims that Trainium and Google's TPUs offer 50-70% lower cost-per-billion-tokens than high-end Nvidia H100 clusters, which could save enterprises hundreds of millions of dollars annually. Amazon holds an $8 billion stake in Anthropic, which has adopted Trainium for production workloads, a signal of the chip's competitiveness at scale. Dave Brown, vice president at AWS, said, "As we get into early next year, we'll start to scale out very, very quickly," emphasizing Amazon's aggressive rollout plans.
However, Nvidia's CUDA software ecosystem remains a formidable barrier, since most AI development tooling is optimized for it. Amazon acknowledges this by planning for Trainium4 to support Nvidia's NVLink interconnect, enabling mixed deployments. The announcement came at Amazon's re:Invent conference in December 2025, where the company also updated its Nova AI model family and launched Nova Forge for custom model training.