BotBeat

RESEARCH · Meta · 2026-03-14

Meta Reveals Backend Aggregation Technology Powering Gigawatt-Scale AI Clusters Like Prometheus

Key Takeaways

  • Backend Aggregation (BAG) lets Meta connect tens of thousands of GPUs across multiple data centers with petabit-range bandwidth capacity
  • The Prometheus cluster will deliver 1 gigawatt of AI computing capacity spanning multiple data center buildings in a single region
  • BAG uses modular Jericho3 ASIC line cards, eBGP routing with UCMP, and MACsec encryption to keep the interconnect scalable, performant, and resilient
Source: Hacker News https://engineering.fb.com/2026/02/09/data-center-engineering/building-prometheus-how-backend-aggregation-enables-gigawatt-scale-ai-clusters/

Summary

Meta has disclosed technical details about Backend Aggregation (BAG), the networking layer it uses to connect tens of thousands of GPUs across multiple data centers and regions. BAG is a centralized, Ethernet-based super spine layer that interconnects multiple fabric layers, with inter-BAG capacity reaching the petabit range (16-48 Pbps per region pair). The technology is central to Meta's Prometheus AI cluster, which will deliver 1 gigawatt of capacity across several data center buildings and interconnect tens of thousands of GPUs to power new and existing AI experiences across Meta's products.

Meta's BAG implementation connects two different network fabrics, Disaggregated Scheduled Fabric (DSF) and Non-Scheduled Fabric (NSF), using modular hardware, advanced routing, and resilient topologies to maintain performance and reliability at this scale. The system uses Jericho3 ASIC line cards with up to 432 x 800G ports, eBGP routing with Unequal Cost Multipath (UCMP) for load balancing, and MACsec encryption for security. As its AI clusters grow, Meta expects BAG to play an increasingly important role in meeting future compute demand across its global network infrastructure.
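To make the UCMP idea concrete, here is a minimal sketch of weighted next-hop selection, where each flow is hashed onto a path in proportion to that path's weight. It is an illustration only, not Meta's implementation: the function name `pick_next_hop`, the hop names, and the weights are all hypothetical, and real switches do this hashing in hardware over packet header fields.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate


def pick_next_hop(flow_key: str, weighted_hops: dict[str, int]) -> str:
    """Map a flow to a next hop with probability proportional to its weight.

    flow_key is a hypothetical stand-in for a packet's header fields.
    The same flow_key always maps to the same hop, so a flow stays on one path.
    """
    hops = sorted(weighted_hops)  # fixed order keeps the mapping stable
    cumulative = list(accumulate(weighted_hops[h] for h in hops))
    if not cumulative or cumulative[-1] <= 0:
        raise ValueError("at least one next hop needs a positive weight")
    digest = hashlib.sha256(flow_key.encode()).digest()
    slot = int.from_bytes(digest[:8], "big") % cumulative[-1]
    # The first hop whose cumulative weight exceeds the slot owns that slot.
    return hops[bisect_right(cumulative, slot)]


# Hypothetical example: one path has three times the capacity of the other.
paths = {"bag-a": 3, "bag-b": 1}
flows = [f"10.0.0.{i % 250}:{40000 + i}->10.1.0.1:4791" for i in range(10_000)]
share_a = sum(pick_next_hop(f, paths) == "bag-a" for f in flows) / len(flows)
print(f"share of flows on bag-a: {share_a:.2f}")
```

With equal weights this reduces to ordinary equal-cost multipath; the unequal weights are what let traffic follow uneven path capacity, for example after a partial link failure.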

  • Meta strategically distributes BAG layers regionally with oversubscription ratios around 4.5:1 (L2 to BAG) to balance scale and performance
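As a worked example of what a 4.5:1 oversubscription ratio means, the sketch below divides the bandwidth entering a layer by the bandwidth leaving it toward BAG. The port counts are hypothetical, chosen only because they produce 4.5:1; the article does not state the actual port layout.

```python
def oversubscription_ratio(downlink_gbps: float, uplink_gbps: float) -> float:
    """Return downstream-facing bandwidth divided by upstream-facing bandwidth."""
    if uplink_gbps <= 0:
        raise ValueError("uplink bandwidth must be positive")
    return downlink_gbps / uplink_gbps


# Hypothetical port counts (assumption, not from the article):
# 36 x 800G ports facing the fabric below, 8 x 800G ports facing BAG.
PORT_SPEED_GBPS = 800
downlink = 36 * PORT_SPEED_GBPS  # 28,800 Gbps
uplink = 8 * PORT_SPEED_GBPS     # 6,400 Gbps

ratio = oversubscription_ratio(downlink, uplink)
print(f"oversubscription: {ratio}:1")  # prints "oversubscription: 4.5:1"
```

A ratio above 1:1 means that not all downstream ports can send toward BAG at full rate at the same time, which is the scale-versus-performance trade-off the bullet describes.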

Editorial Opinion

Meta's disclosure of Backend Aggregation shows how much next-generation AI systems depend on networking infrastructure. As AI clusters grow to gigawatt scale, traditional networking approaches become insufficient, and specialized solutions like BAG become essential differentiators. The work highlights that hardware-software co-design and careful engineering of interconnect topologies are as crucial to AI infrastructure as the compute itself, and it may inform how other hyperscalers architect their own future AI clusters.

MLOps & Infrastructure · AI Hardware · Science & Research
