ServeTheHome Successfully Clusters 8 NVIDIA GB10 Units to Run Kimi K2.5 and K2.6 Models
Key Takeaways
- ServeTheHome successfully built an 8-node NVIDIA GB10 cluster exceeding NVIDIA's supported configurations, proving that aggressive scaling is viable
- A single MikroTik CRS804 DDQ 400GbE switch enables multi-node scaling via NCCL over RoCE (RDMA over Converged Ethernet), eliminating the need for power-hungry enterprise switches
- The cluster runs massive models, including Kimi K2.5 and K2.6, locally, demonstrating a practical on-premises alternative to cloud-based inference
Summary
ServeTheHome has demonstrated a fully functional 8-node NVIDIA GB10 cluster featuring 1TB of combined memory, 160 Arm cores, and 400GbE RDMA networking—exceeding NVIDIA's supported configurations at the time of assembly. The feat required months of engineering and careful networking design, ultimately enabling the team to run Kimi K2.5 and K2.6, both massive language models, locally across the distributed cluster.
The project showcases the practical viability of scaling GB10 systems beyond NVIDIA's initially supported two-node topology. When ServeTheHome began the build in February 2026, only two-unit connections were supported; by NVIDIA GTC 2026 in late March, official support had expanded to four nodes, but the team had already reached eight. The cluster uses a MikroTik CRS804 DDQ 400GbE switch for NCCL-based multi-node scaling over RoCE RDMA, complemented by a secondary 10GbE management network.
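A setup like this (NCCL collectives over RoCE on a dedicated 400GbE fabric, with a separate 10GbE management network) is typically steered through NCCL environment variables. A minimal sketch, with hypothetical device and interface names — the article does not publish ServeTheHome's exact settings, so verify names and GID indices on your own nodes:

```shell
# Pin NCCL's bootstrap/out-of-band traffic to the management NIC.
# "eno1" is a hypothetical 10GbE interface name; check `ip link` per node.
export NCCL_SOCKET_IFNAME=eno1

# Use the RDMA-capable 400GbE device for collective traffic.
# "mlx5_0" is a hypothetical RDMA device name; check `ibdev2netdev`.
export NCCL_IB_HCA=mlx5_0

# RoCE v2 usually lives at a nonzero GID index; 3 is common, but verify
# with `show_gids` before relying on it.
export NCCL_IB_GID_INDEX=3

# Print chosen transports at startup to confirm RDMA (not TCP) is in use.
export NCCL_DEBUG=INFO
```

If `NCCL_DEBUG=INFO` shows the collectives falling back to the socket transport, the RDMA path is misconfigured and throughput will collapse to TCP speeds.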
The engineering breakthrough demonstrates that large-scale AI model deployment on GB10 hardware is now significantly more accessible than it was just months ago. ServeTheHome's documentation of the cluster design, networking topology, and performance results provides a practical roadmap for researchers and organizations looking to deploy cutting-edge language models without relying on cloud infrastructure. NVIDIA's expanding official support (from two nodes to four over the project timeline) signals growing recognition of cluster demand and will likely spur further ecosystem tooling.
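To make the deployment pattern concrete: multi-node jobs on a cluster like this are commonly launched with PyTorch's `torchrun`, forming one process group across all eight nodes. A sketch under stated assumptions — the node-0 address, port, and script name are hypothetical, and the article does not disclose ServeTheHome's actual launch commands:

```shell
# Run this on every node; only NODE_RANK differs per node (0..7).
# 10.0.0.1:29500 is a hypothetical rendezvous endpoint on node 0,
# reachable over the 10GbE management network.
torchrun \
  --nnodes=8 \
  --nproc-per-node=1 \
  --node-rank="${NODE_RANK}" \
  --rdzv-backend=c10d \
  --rdzv-endpoint=10.0.0.1:29500 \
  inference.py   # hypothetical distributed inference script
```

Rendezvous traffic stays on the management network, while NCCL moves tensor traffic over the 400GbE RDMA fabric once the process group is up.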
Editorial Opinion
This demonstration is significant for AI infrastructure democratization. By proving that off-the-shelf networking hardware and thoughtful system design can enable stable eight-node GB10 clusters, ServeTheHome has lowered the technical and financial barriers to running state-of-the-art large language models locally. The rapid expansion of NVIDIA's official support during the project timeline suggests market demand is accelerating, likely positioning GB10 clusters as a viable middle ground between single-system deployments and expensive cloud alternatives.