AWS Launches Graviton5 CPU, Optimized for Agentic AI and Database Workloads
Key Takeaways
- Graviton5 uses four 48-core chiplets (192 cores total) instead of a monolithic die, improving manufacturing yield and cost
- The shift from a 4nm to a 3nm process delivers better density and per-core efficiency despite higher overall power draw
- L3 cache per core is doubled; 420 GB/sec die-to-die (D2D) interconnects link the chiplets, and the chip offers 96 PCIe 6.0 lanes with CXL 3.0 support
Summary
AWS's Annapurna Labs has officially launched Graviton5, a new 192-core Arm server CPU shipping in M9g and M9gd instances. The chip marks a significant architectural shift from previous generations: it uses a four-chiplet design rather than a monolithic die, with each chiplet containing 48 Neoverse V3 cores. The move to TSMC's 3-nanometer process, combined with DDR5 memory, PCIe 6.0, and CXL 3.0 support, delivers 2.4X more performance per socket than Graviton4, but it also raises power consumption substantially, to approximately 650 watts.
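To put the performance and power figures side by side, here is a minimal arithmetic sketch. It uses only the two numbers quoted above (2.4X per-socket performance and roughly 650 watts). The summary does not state Graviton4's power draw, so `prev_watts` is a hypothetical input, not a reported figure.

```python
# Figures quoted in the summary above.
PERF_RATIO = 2.4         # Graviton5 vs. Graviton4 performance per socket
GRAVITON5_WATTS = 650.0  # approximate Graviton5 power draw


def perf_per_watt_ratio(prev_watts: float) -> float:
    """Graviton5 performance per watt relative to the previous chip.

    prev_watts is an ASSUMED power draw for the previous generation;
    the source does not give this number.
    """
    return PERF_RATIO * prev_watts / GRAVITON5_WATTS


# Previous-generation power draw at which performance per watt is unchanged.
breakeven_watts = GRAVITON5_WATTS / PERF_RATIO

if __name__ == "__main__":
    print(f"Break-even previous-generation power: {breakeven_watts:.1f} W")
    for assumed in (200.0, 300.0, 400.0):  # hypothetical values only
        print(f"assumed {assumed:.0f} W -> perf/W ratio {perf_per_watt_ratio(assumed):.2f}")
```

Under these numbers, performance per watt falls only if the previous chip drew less than about 271 watts per socket; at any higher figure, the 2.4X gain outpaces the rise in power.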
Graviton5 doubles the L3 cache per core compared to its predecessor and features four die-to-die interconnects running at 420 GB/sec that link the chiplets into a single virtual processor. AWS explicitly designed the chip to prioritize low-latency responsiveness over power efficiency, making it suitable for agentic AI systems and database workloads that demand consistent, predictable performance rather than minimal power draw. The chiplet-based approach also improves manufacturing yield and cost compared to a monolithic design that would push against reticle limits.
- 2.4X more performance per socket than Graviton4, prioritizing latency and throughput over power efficiency for agentic AI and databases
- M9g and M9gd instances now available, offering better price-to-performance for responsive workloads
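The yield argument for chiplets can be illustrated with the textbook Poisson yield model, Y = exp(-A * D0), where A is die area and D0 is defect density. This is a rough sketch, not AWS's or TSMC's actual cost model. The die areas and the defect density below are hypothetical placeholders chosen only to show the shape of the effect, and the sketch ignores packaging cost, interconnect area overhead, and test cost.

```python
import math


def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of dies with zero defects under a simple Poisson model."""
    return math.exp(-area_mm2 * defects_per_mm2)


def silicon_per_good_package(die_area_mm2: float, dies_per_package: int,
                             defects_per_mm2: float) -> float:
    """Wafer area consumed per working package, assuming bad dies are
    discarded before packaging (known-good-die testing)."""
    good_fraction = poisson_yield(die_area_mm2, defects_per_mm2)
    return dies_per_package * die_area_mm2 / good_fraction


# HYPOTHETICAL inputs: none of these values come from the article.
D0 = 0.001             # defects per mm^2 (0.1 per cm^2)
MONOLITHIC_MM2 = 800   # one large die, close to the reticle limit
CHIPLET_MM2 = 200      # one of four equal chiplets

mono_yield = poisson_yield(MONOLITHIC_MM2, D0)
chiplet_yield = poisson_yield(CHIPLET_MM2, D0)
mono_cost = silicon_per_good_package(MONOLITHIC_MM2, 1, D0)
chiplet_cost = silicon_per_good_package(CHIPLET_MM2, 4, D0)

if __name__ == "__main__":
    print(f"monolithic yield {mono_yield:.2f}, chiplet yield {chiplet_yield:.2f}")
    print(f"silicon per good package: {mono_cost:.0f} mm^2 vs {chiplet_cost:.0f} mm^2")
```

Because yield falls exponentially with area, four small dies waste less silicon than one large die of the same total area, as long as defective chiplets can be screened out before assembly.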
Editorial Opinion
AWS is making an intelligent engineering bet with Graviton5: sacrificing power efficiency for the low-latency responsiveness that agentic AI and modern databases demand. The shift from a monolithic to a chiplet design shows sophisticated cost engineering, since higher per-unit yield on smaller dies more than offsets the 3nm premium. This positioning directly challenges the prevailing datacenter narrative that power efficiency is paramount, recognizing that responsiveness is now table stakes for competitive AI and database services.



