Intel Launches Rack-Scale Reference Designs for Agentic AI Workloads, Targeting 36,864-Core Systems
Key Takeaways
- ▸Intel's new rack-scale designs support up to 36,864 CPU cores in a 100kW power envelope, directly addressing the CPU compute demands of agentic AI workloads
- ▸The disaggregated inference approach with SambaNova separates prefill (GPU) from decode (accelerator) operations, improving per-user token throughput by 2-3x. Similar pairings are emerging across the industry (Nvidia-Groq, AWS-Cerebras)
- ▸Vector Core Compute and Together.AI are early adopters, signaling commercial traction for agentic AI infrastructure
Summary
Intel unveiled new rack-scale reference designs at Computex 2026 aimed at delivering high-density CPU compute for running AI agents at scale. Developed in partnership with Foxconn and other infrastructure providers, the blueprints support up to 128 Intel Xeon processors per rack, either Granite Rapids or Clearwater Forest (Xeon 6+), for a total of 16,384 to 36,864 cores within a 100kW power envelope. The designs target two configurations: one tuned for latency-sensitive agentic workloads and one for maximum core density.
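The quoted core totals can be checked with simple arithmetic. The sketch below is a back-of-envelope calculation, not an Intel specification: the 128-socket count, the 100kW envelope, and the 288-core maximum come from the article, while the 128-core lower bound is an assumption inferred by dividing 16,384 by 128. The per-socket and per-core wattages are shares of the whole rack budget (which also covers memory, networking, and cooling), not CPU power ratings.

```python
# Back-of-envelope check of the rack figures quoted in the article.
SOCKETS_PER_RACK = 128      # article: up to 128 processors per rack
RACK_POWER_W = 100_000      # article: 100kW power envelope

# Cores per socket. 288 is quoted in the article; 128 is an assumption
# inferred from the 16,384-core lower bound (16,384 / 128 sockets).
CORES_PER_SOCKET = {"lower_bound": 128, "upper_bound": 288}


def rack_summary(cores_per_socket: int) -> dict:
    """Return total cores and the rack power budget share per socket and per core."""
    total_cores = SOCKETS_PER_RACK * cores_per_socket
    return {
        "total_cores": total_cores,
        "watts_per_socket": RACK_POWER_W / SOCKETS_PER_RACK,
        "watts_per_core": RACK_POWER_W / total_cores,
    }


if __name__ == "__main__":
    for label, cores in CORES_PER_SOCKET.items():
        print(label, rack_summary(cores))
```

Under these assumptions the rack budget works out to roughly 781W per socket, and about 2.7W per core at the 36,864-core maximum.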
The announcement reflects growing industry recognition that agentic AI systems, which rely on harnesses like OpenClaw to connect models to tools, APIs, and code interpreters, require substantial CPU compute beyond what GPUs and accelerators alone can provide. Intel CEO Lip-Bu Tan emphasized customer demand to "think at the system level" for serving agentic workloads at scale.
Intel also revealed that Vector Core Compute will be among the first to deploy the platform, with Together.AI as its first commercial customer. The move builds on Intel's earlier disaggregated AI blueprint co-developed with SambaNova, which assigns compute-heavy prefill operations to Nvidia GPUs and uses SambaNova accelerators for decode, boosting per-user token throughput by 2-3x.
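The prefill/decode split can be illustrated with a toy sketch. Everything here is hypothetical: the `Request` class and the `prefill`, `decode`, and `serve` functions are illustrative names, not part of any Intel, Nvidia, or SambaNova API, and the "KV cache" is a plain list standing in for the attention state that a real system would transfer between device pools.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    prompt_tokens: list
    max_new_tokens: int
    kv_cache: list = field(default_factory=list)  # toy stand-in for attention state
    output: list = field(default_factory=list)


def prefill(req: Request) -> Request:
    """Prefill phase: process the whole prompt in one pass (GPU side in the article)."""
    req.kv_cache = list(req.prompt_tokens)  # one cache entry per prompt token
    return req


def decode(req: Request) -> Request:
    """Decode phase: emit one token per step (accelerator side in the article)."""
    for step in range(req.max_new_tokens):
        token = f"tok{step}"        # placeholder for a sampled token
        req.output.append(token)
        req.kv_cache.append(token)  # the cache grows by one entry per generated token
    return req


def serve(req: Request) -> Request:
    """Route a request through the prefill stage, then hand it to the decode stage."""
    return decode(prefill(req))


if __name__ == "__main__":
    done = serve(Request(prompt_tokens=["a", "b", "c"], max_new_tokens=2))
    print(done.output, len(done.kv_cache))
```

The point of the split is that the two phases have different profiles: prefill handles the entire prompt at once, while decode produces tokens one at a time, so each can run on hardware suited to it.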
The announcement comes amid intensifying competition: Nvidia recently launched a 256-CPU Vera-based rack system, while Arm introduced its own pair of AGI CPU reference designs ranging from 36kW to 200kW configurations.
With up to 288 cores per processor, Xeon 6+ represents Intel's CPU-centric play in the competitive agentic AI arms race against Nvidia, Arm, and AMD.
Editorial Opinion
Intel's aggressive push into agentic AI infrastructure reflects a strategic bet that the industry underestimated CPU demand in the agent era. Unlike traditional LLM inference, which is dominated by GPU accelerators, the harnesses and orchestration layers that make agents practical require significant CPU capacity, and Intel is positioning Xeon as the backbone. The disaggregated inference model is particularly clever, acknowledging that no single accelerator wins all workloads. However, Intel faces an uphill battle against Nvidia's ecosystem dominance and momentum with cloud providers; execution speed and customer adoption will determine whether these reference designs become an industry standard or remain niche.



