NVIDIA GreenBoost Kernel Modules Released as Open Source
Key Takeaways
- GreenBoost enables transparent GPU VRAM extension using system RAM and NVMe storage, allowing LLM inference beyond native GPU capacity without software modifications
- The solution uses DMA-BUF and PCIe 4.0 direct memory access to achieve high-speed data movement (~32 GB/s) while maintaining CUDA compatibility
- Released as open source under GPL v2, GreenBoost operates as an independent kernel module alongside the official NVIDIA driver, reducing complexity compared to driver-level modifications
Summary
Developer Ferran Duarri has released GreenBoost, an open-source Linux kernel module and CUDA shim designed to extend GPU VRAM by leveraging system DDR4 RAM and NVMe storage. It lets users run large language models that exceed available GPU memory without modifying inference software, addressing a common limitation of consumer-grade GPUs. The module runs independently alongside the official NVIDIA driver and uses DMA-BUF to enable direct PCIe 4.0 communication between the GPU and system memory, achieving transparent memory extension at around 32 GB/s on high-end consumer hardware such as the RTX 5070.
GreenBoost was developed to address practical constraints faced by machine learning practitioners, who otherwise must choose among degraded performance (CPU offloading), reduced model quality (smaller quantizations), and expensive hardware upgrades. The kernel module allocates pinned DDR4 pages and exports them as CUDA external memory, while a userspace CUDA shim intercepts memory allocation calls to route large allocations through the kernel module. Released under the GPL v2 license, the project includes system monitoring tools and has been tested on real hardware, including an i9-14900KF processor and an RTX 5070 GPU.
- The project addresses real constraints in consumer GPU machine learning, offering a practical alternative to buying expensive high-memory GPUs or accepting significant performance or quality degradation
Editorial Opinion
GreenBoost represents a pragmatic engineering solution to a persistent pain point in accessible AI development: the gap between consumer GPU capabilities and model size requirements. By working transparently above the driver layer and leveraging standard PCIe mechanisms, it offers the AI community a viable option that doesn't require architectural changes or expensive hardware investments. The open-source release could accelerate experimentation in memory-constrained environments, though real-world performance metrics across different workloads and hardware configurations will be essential to understanding its practical impact.