GPUBreach: University of Toronto Researchers Expose Critical Privilege Escalation Vulnerability in NVIDIA GPU Drivers
Key Takeaways
- GPUBreach exploits a design flaw in NVIDIA's GPU-driver RPC architecture by combining Rowhammer bit flips with unvalidated message handling to achieve kernel privilege escalation
- The attack does not bypass IOMMU protections but rather abuses legitimate GPU DMA write permissions to corrupt driver state that the kernel implicitly trusts
- The vulnerability stems from missing bounds validation on the elemCount field in GSP firmware messages, allowing heap buffer overflows when attacker-controlled data reaches the message handler
Summary
Researchers from the University of Toronto, led by Chris S. Lin and Prof. Gururaj Saileshwar, have disclosed GPUBreach, a new class of attack on NVIDIA GPU drivers that combines Rowhammer fault injection with weaknesses in GPU memory management to achieve privilege escalation. The attack exploits a design flaw in NVIDIA's kernel driver architecture, specifically in how the driver handles the bidirectional RPC message queues between the GPU System Processor (GSP), the GPU's onboard RISC-V processor, and the host system.
The attack works by using Rowhammer bit flips in GDDR6 GPU memory to redirect framebuffer writes into the IOMMU-permitted shared memory region used for GSP firmware communication. By corrupting the status queue, attackers can inject malicious data that exploits an unvalidated field (elemCount) in the message handler. This triggers a kernel heap buffer overflow that overwrites critical kernel function pointers and leads to root-level code execution. Notably, the attack does not bypass the IOMMU. It uses the IOMMU exactly as intended, which makes it particularly insidious.
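To make the redirection step concrete, here is a minimal sketch of how a single flipped bit can move a write from one page to another. It assumes, purely for illustration, that the flipped bit lands in a frame number used for address translation; the summary does not say which structure GPUBreach corrupts. The 4 KiB page size, the frame number, the bit position, and the names `flip_bit` and `translate` are all hypothetical.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def flip_bit(value: int, bit: int) -> int:
    """Model a Rowhammer-style fault: invert exactly one bit of a stored value."""
    return value ^ (1 << bit)


def translate(frame_number: int, offset: int) -> int:
    """Combine a frame number and an in-page offset into a target address."""
    return (frame_number << PAGE_SHIFT) | offset


# Hypothetical frame number that originally points into the framebuffer.
framebuffer_frame = 0x1A2B3

# One bit flips in the stored frame number (bit 4 is chosen arbitrarily).
corrupted_frame = flip_bit(framebuffer_frame, 4)

# The same write, at the same offset, now lands in a different page.
intended_target = translate(framebuffer_frame, 0x40)
actual_target = translate(corrupted_frame, 0x40)
```

The point of the sketch is that the write itself is unchanged and fully permitted. Only its destination moves, which is consistent with the summary's observation that the IOMMU is not bypassed.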
The root cause is an implicit trust model in the driver code, which assumes that data written to shared memory by the GSP firmware is always legitimate. The driver performs no bounds checking on message element counts. It trusts the firmware to respect a 16-element maximum even though the ring buffer supports up to 63 elements. Normal GSP firmware operation prevents exploitation, but the vulnerability becomes critical once an attacker can write to the status queue pages, which Rowhammer makes possible.
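The missing check described above can be illustrated with a small Python model of a status-queue reader. This is a sketch of the validation logic only, not NVIDIA's driver code. The 16-element and 63-element figures come from the summary, while the message layout (a 4-byte little-endian count followed by fixed-size elements), the 32-byte element size, and the names `read_message` and `MessageRejected` are assumptions made for the example.

```python
import struct

# Figures taken from the summary.
MAX_EXPECTED_ELEMS = 16   # limit the driver trusts the firmware to respect
RING_CAPACITY_ELEMS = 63  # what the ring buffer can actually carry

# Hypothetical layout, for illustration only: a 4-byte little-endian
# element count followed by fixed-size elements.
ELEM_SIZE = 32
HEADER = struct.Struct("<I")


class MessageRejected(Exception):
    """Raised when a status-queue message fails validation."""


def read_message(queue_bytes: bytes) -> bytes:
    """Return the payload of one message, validating the element count first."""
    if len(queue_bytes) < HEADER.size:
        raise MessageRejected("truncated header")
    (elem_count,) = HEADER.unpack_from(queue_bytes, 0)

    # The bounds check the summary describes as missing: the count is
    # validated even though it arrived over the "trusted" firmware channel.
    if elem_count > MAX_EXPECTED_ELEMS:
        raise MessageRejected(f"elem_count {elem_count} exceeds {MAX_EXPECTED_ELEMS}")

    payload_len = elem_count * ELEM_SIZE
    if len(queue_bytes) < HEADER.size + payload_len:
        raise MessageRejected("payload shorter than elem_count claims")
    return queue_bytes[HEADER.size:HEADER.size + payload_len]
```

A message that claims 2 elements is accepted. A message that claims 63 elements fits in the ring buffer but is rejected before any copy, which is the behavior a receiver sized for 16 elements needs in order to avoid the overflow.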
- Exploitation requires either compromising the GSP firmware (considered computationally infeasible because NVIDIA signs the firmware) or the ability to write to the shared memory pages that hold the status queue, which Rowhammer provides
- Security researchers have published detection methods and indicators of exploitation using GPU runtime telemetry, enabling defenders to identify active attacks
Editorial Opinion
GPUBreach represents a sophisticated convergence of hardware and software vulnerabilities that challenges conventional security assumptions about IOMMU protection and firmware trust boundaries. The attack's elegance lies in turning legitimate GPU mechanisms against themselves: rather than bypassing protections, it weaponizes them. This disclosure underscores the need for GPU driver developers to eliminate implicit trust assumptions in inter-processor communication and to validate every message, even on ostensibly 'trusted' firmware channels.


