FuzzingBrain V2: Multi-Agent LLM System Discovers 29 Zero-Day Vulnerabilities with 90% Detection Rate
Key Takeaways
- FuzzingBrain V2 achieved a 90% vulnerability detection rate in competition testing (36/40 vulnerabilities)
- Discovered and reproduced 29 zero-day vulnerabilities across 12 open-source projects in real-world deployment
- Introduced the 'Suspicious Point' abstraction, which localizes vulnerabilities at a granularity chosen to balance context and precision
Summary
Researchers have unveiled FuzzingBrain V2, a multi-agent LLM system that automates vulnerability discovery and reproduction in software. The system targets three challenges in LLM-based security analysis: high false-positive rates in vulnerability reports, suboptimal granularity for vulnerability localization, and difficulty reasoning about complex cross-function vulnerabilities.
Its contributions include a control-flow-based abstraction called "Suspicious Point" for precise vulnerability localization, logic-driven hierarchical function analysis with dual-layer fuzzing, and MCP-based static and dynamic analysis tools. The system is built on Google's OSS-Fuzz so that every reported vulnerability is reproducible. FuzzingBrain V2 achieved a 90% detection rate (36 of 40 vulnerabilities) on the C/C++ dataset of the 2025 AIxCC Final Competition.
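To make the reproducibility idea concrete, here is a minimal Python sketch of a "report only what reproduces" filter. It is an illustration under stated assumptions, not FuzzingBrain V2's implementation: the `SuspiciousPoint` fields, the `confirmed_points` helper, and `toy_harness` are all hypothetical names invented for this example, and a raised Python exception stands in for a crash in a real fuzz target.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class SuspiciousPoint:
    """Hypothetical record for one candidate bug location (fields are assumptions)."""
    function: str                       # enclosing function name
    description: str                    # control-flow condition that might trigger the bug
    reproducer: Optional[bytes] = None  # input claimed to trigger it, if one was found


def reproduces(harness: Callable[[bytes], None], data: bytes) -> bool:
    """Return True if the harness raises on `data` (stand-in for a fuzzer crash)."""
    try:
        harness(data)
    except Exception:
        return True
    return False


def confirmed_points(
    points: Iterable[SuspiciousPoint], harness: Callable[[bytes], None]
) -> List[SuspiciousPoint]:
    """Keep only the points whose reproducer actually fails the harness."""
    return [
        p for p in points
        if p.reproducer is not None and reproduces(harness, p.reproducer)
    ]


def toy_harness(data: bytes) -> None:
    """Toy target: trusts a length byte and reads that many payload bytes."""
    if not data:
        return
    length, payload = data[0], data[1:]
    for i in range(length):
        _ = payload[i]  # raises IndexError when length exceeds the payload size


candidates = [
    SuspiciousPoint("parse", "length byte larger than payload", b"\x05ab"),
    SuspiciousPoint("parse", "length byte matches payload", b"\x02ab"),
    SuspiciousPoint("parse", "no reproducer found yet"),
]
reported = confirmed_points(candidates, toy_harness)
```

In this sketch only the first candidate survives the filter. The other two are dropped, not reported as unverified findings, which is the property that keeps false positives out of the final report.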
In real-world deployment, the system discovered 29 zero-day vulnerabilities across 12 open-source projects. Project maintainers confirmed and fixed all of them, and 2 were assigned official CVE IDs. The result is a concrete step forward for automated security analysis and for the practical use of multi-agent AI systems in cybersecurity.
Editorial Opinion
This research demonstrates the potential of multi-agent LLM systems for cybersecurity. The ability to automatically discover, reproduce, and verify zero-day vulnerabilities at scale could substantially improve the security posture of the open-source software that powers critical infrastructure. The key breakthrough, ensuring that all AI-generated findings are fuzzer-reproducible, addresses a major credibility gap in prior LLM-based security work.



