Google DeepMind's AlphaProof Nexus Solves Decades-Old Mathematical Problems
Key Takeaways
- ▸AlphaProof Nexus autonomously solved 9 decades-old Erdős problems and 44 OEIS conjectures using LLM-generated proof steps verified by Lean's compiler, costing only a few hundred dollars in compute per problem
- ▸The system uses compiler feedback as a grounding mechanism to offset LLM reasoning weaknesses, proving superior to pure natural-language approaches for mathematical proof generation
- ▸Surprisingly, the simplest agent variant (an LLM with compiler feedback alone) could prove all nine of the solved Erdős problems, though at higher compute cost per problem, suggesting that simpler agentic loops may match or exceed complex multi-agent systems as language models improve
Summary
Google DeepMind unveiled AlphaProof Nexus, an AI system that combines large language models with formal mathematical verification to autonomously solve open mathematical problems. The framework proved nine of the 353 Erdős problems it attempted (about 2.5%), including two questions that had remained unsolved for 56 years, along with 44 conjectures from the On-Line Encyclopedia of Integer Sequences (OEIS), all at a cost of a few hundred dollars per problem.
The system's architecture differs fundamentally from pure natural-language approaches: it generates proof steps in Lean's formal language, and the Lean compiler verifies each step. Error messages feed directly into the next attempt, creating a symbolic feedback loop that grounds the LLM's reasoning and mitigates its known weaknesses in logical consistency. This hybrid approach, which pairs Gemini 3.1 Pro with formal verification, proved more reliable than language-only methods for mathematical proof.
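The generate, verify, and retry loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not AlphaProof Nexus's actual code: the names `prove`, `CheckResult`, `check_with_lean`, `toy_propose`, and `toy_check` are hypothetical, the proposer stands in for an LLM call, and `check_with_lean` assumes a `lean` binary on the PATH that exits non-zero when a file fails to compile.

```python
import subprocess
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class CheckResult:
    ok: bool
    errors: str  # compiler output; empty when the proof checks


def check_with_lean(source: str, lean_cmd: str = "lean") -> CheckResult:
    """Compile a Lean source string (assumption: `lean` is installed and on PATH)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "Attempt.lean"
        path.write_text(source, encoding="utf-8")
        proc = subprocess.run([lean_cmd, str(path)], capture_output=True, text=True)
    output = proc.stdout + proc.stderr
    # A proof containing `sorry` compiles with only a warning, so reject it explicitly.
    ok = proc.returncode == 0 and "sorry" not in output
    return CheckResult(ok=ok, errors="" if ok else output)


def prove(
    statement: str,
    propose: Callable[[str, str], str],     # (statement, feedback) -> candidate source
    check: Callable[[str], CheckResult],    # candidate source -> verification result
    max_attempts: int = 8,
) -> Optional[str]:
    """Return the first candidate that verifies, or None if the budget runs out."""
    feedback = ""
    for _ in range(max_attempts):
        candidate = propose(statement, feedback)
        result = check(candidate)
        if result.ok:
            return candidate
        feedback = result.errors  # error messages ground the next attempt
    return None


# Toy stand-ins so the loop can run without an LLM or a Lean installation.
def toy_propose(statement: str, feedback: str) -> str:
    proof = "rfl" if feedback else "by skip"  # "repairs" itself once it sees an error
    return f"{statement} := {proof}"


def toy_check(source: str) -> CheckResult:
    if source.strip().endswith(":= rfl"):
        return CheckResult(True, "")
    return CheckResult(False, "error: unsolved goals")


if __name__ == "__main__":
    print(prove("theorem two : 1 + 1 = 2", toy_propose, toy_check))
```

In a real setup, `toy_check` would be swapped for `check_with_lean` (or a project-aware equivalent) and `toy_propose` for a model call that receives the statement plus the previous compiler errors. The key design point is that only the checker decides success, so an unreliable proposer cannot produce a false positive.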
A striking finding emerged from post-hoc analysis: the simplest agent configuration, using only an LLM and compiler feedback, could prove all nine Erdős problems, albeit at higher computational cost per problem. This suggests that as language models improve, simpler agentic loops with formal verification may become increasingly effective. The framework also provided value beyond complete proofs: mathematicians reported that incomplete proof attempts deepened their understanding of the problems, and the system caught errors in existing literature and formalizations.
Editorial Opinion
AlphaProof Nexus represents a meaningful inflection point in AI-assisted mathematical discovery, not through raw language model capability but through a pragmatic pairing of neural and symbolic reasoning. The finding that simple feedback loops rival sophisticated multi-agent systems is particularly revealing: it reframes the future of AI in reasoning-intensive domains as a marriage of symbolic verification and language model flexibility, rather than a pure scaling problem. This work may accelerate broader adoption of neurosymbolic approaches across science, engineering, and formal verification.



