Harvard Physics Professor Successfully Guides Claude Through Frontier Theoretical Physics Research, Completing Year-Long Work in Two Weeks
Key Takeaways
- Claude Opus 4.5 completed a rigorous theoretical physics research paper in two weeks under human supervision, roughly 26x faster than the year such work typically takes
- The work required 110+ draft iterations and the expertise of a domain specialist, suggesting that current AI excels as a powerful research tool rather than as an autonomous researcher
- The result marks an advance in AI's symbolic reasoning and complex mathematical problem-solving, as distinct from the data-driven discoveries that have dominated recent AI science breakthroughs
- End-to-end autonomous AI research remains elusive in theoretical physics, even as mathematical and data-rich domains see increasing successes from systems like FunSearch and AlphaProof
Summary
Harvard physics professor Matthew Schwartz ran an experiment in which he supervised Claude Opus 4.5 through a complete theoretical physics research calculation without touching a file himself. In two weeks, compared with a typical timeline of about a year, Claude completed a technically rigorous high-energy theoretical physics paper, working through 110+ drafts, 36 million tokens, and 40+ hours of local CPU compute. Schwartz found Claude fast, persistent, and eager to assist, but emphasized that human domain expertise remained essential for evaluating accuracy and catching errors.
While the experiment shows that AI can now assist with frontier-level theoretical physics when properly guided, Schwartz stopped short of claiming that AI has achieved end-to-end autonomous research. Instead, he suggested that large language models may need to "go to graduate school" before advancing to full Ph.D.-level independence. The work reflects a broader shift in AI capabilities toward symbolic manipulation and complex reasoning, beyond the numerical pattern-matching that has dominated recent AI breakthroughs in science.
Editorial Opinion
Schwartz's collaboration with Claude is a watershed moment for AI in theoretical physics, not because the system works autonomously, but because it works at all as a research partner at the frontier. The two-week timeline is impressive, yet the need for continuous human oversight and 110+ drafts underscores that "AI scientist" rhetoric has outpaced reality. The real story here is the acceleration of the collaborative cycle: expert researchers can now iterate far faster with AI as a reasoning engine. This may prove more transformative than autonomous discovery, reshaping how theoretical work actually gets done.