ExploitGym: Frontier AI Models Successfully Exploit Real-World Vulnerabilities
Key Takeaways
- ▸ The ExploitGym benchmark demonstrates that frontier AI models can successfully exploit real-world vulnerabilities across diverse domains, including userspace programs, JavaScript engines, and the Linux kernel
- ▸ Anthropic's Claude Mythos Preview achieved the strongest performance with 157 successful exploits, followed by OpenAI's GPT-5.5 with 120
- ▸ AI models retain meaningful exploitation success rates even with common security protections such as ASLR and DEP enabled, raising systemic cybersecurity concerns
Summary
Researchers have introduced ExploitGym, a large-scale benchmark designed to evaluate AI agents' ability to turn security vulnerabilities into working exploits. The benchmark comprises 898 real-world vulnerability instances across three domains: userspace programs, Google's V8 JavaScript engine, and the Linux kernel. According to the study, frontier AI models demonstrate non-trivial exploitation capabilities, with Anthropic's Claude Mythos Preview achieving the strongest performance by successfully exploiting 157 vulnerabilities, followed by OpenAI's GPT-5.5 with 120 successful exploits.
The research carries significant cybersecurity implications: even with widely deployed security protections such as ASLR and DEP enabled, the models retain meaningful exploitation success rates. This finding highlights the growing risk posed by increasingly capable AI agents, which can combine low-level program reasoning, runtime adaptation, and sustained progress to turn theoretical vulnerabilities into practical attacks. The study establishes ExploitGym as an effective testbed for evaluating AI exploitation capabilities and underscores the urgency of developing robust defenses against AI-powered attacks.
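The study does not detail how ExploitGym decides that an exploit attempt "worked". As an illustration only, a minimal benchmark harness of this kind typically runs the candidate exploit in a sandbox and checks for a proof-of-compromise marker (for example, the contents of a flag file printed to stdout). The command, marker string, and timeout below are all hypothetical:

```python
import subprocess

def exploit_succeeded(cmd, marker="PWNED", timeout=30):
    """Run a candidate exploit command and report success.

    Hypothetical success criterion: the exploit is expected to print a
    proof-of-compromise marker (e.g. the contents of a flag file) to
    stdout before the timeout expires.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # Hung or stalled exploit attempts count as failures.
        return False
    return marker in result.stdout

# Demo with a stand-in "exploit" that merely prints the marker:
print(exploit_succeeded(["echo", "PWNED"]))    # → True
print(exploit_succeeded(["echo", "nothing"]))  # → False
```

A real harness would additionally isolate each attempt (container or VM), since a working kernel or V8 exploit can corrupt the host it runs on.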
Editorial Opinion
ExploitGym represents a critical step in responsibly evaluating AI capabilities for a dual-use application that could reshape cybersecurity. The finding that frontier models can exploit real vulnerabilities at scale—even with defenses enabled—should trigger serious discussions about AI safety and security governance. While the benchmark serves defensive purposes, it also demonstrates how capable AI agents could lower the barrier for adversarial exploitation, making this research simultaneously valuable for security teams and concerning for the broader industry.