AI Systems Could Begin Building Themselves by 2028, Industry Analysis Suggests
Key Takeaways
- AI systems have achieved near-saturation on software engineering benchmarks such as SWE-Bench, improving from roughly 2% to roughly 94% accuracy in less than two years
- Two critical trends enable automated AI R&D: AI systems are becoming better at writing complex real-world code and at chaining together multi-step tasks autonomously
- All of the engineering components needed to automate AI research appear to be in place; the remaining frontier is whether AI systems will become creative enough to generate novel research ideas
Summary
According to analysis by Import AI's gmays, there is a 60%+ probability that fully autonomous AI research and development, in which AI systems build their own successors without human involvement, could arrive by the end of 2028. The analysis draws on publicly available evidence from research papers and product deployments at frontier AI companies, with particular attention to rapid improvements in AI coding capabilities. Claude models are cited as a primary example of this progress: Claude 2 scored roughly 2% on the SWE-Bench software engineering benchmark when it launched in late 2023, while Claude Mythos Preview now scores 93.9%, effectively saturating the benchmark. This dramatic progress in code generation, combined with the ability of AI systems to autonomously chain together complex tasks (writing code, testing it, and refining it without human oversight; a minimal sketch of this loop follows below), suggests that the engineering components necessary for automating AI R&D are falling into place. The author estimates that while full automation may not occur in 2026, a proof of concept in which a non-frontier model trains its successor could emerge within one to two years.
- This represents a potential inflection point for AI development: if scaling trends continue, the industry could cross into an era of fundamentally unpredictable AI advancement
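To make the write-test-refine loop described above concrete, here is a minimal sketch of how such an agentic cycle is typically structured. Everything in it is illustrative: `generate_patch`, `run_tests`, and `MAX_ITERATIONS` are hypothetical names, the model call is left as a stub rather than any specific lab's API, and a real system would point the test runner at a full repository suite rather than a single file.

```python
# Minimal sketch of an autonomous write-test-refine loop.
# All names here (generate_patch, run_tests, MAX_ITERATIONS) are
# hypothetical placeholders, not any lab's actual implementation.

import subprocess
import tempfile
from pathlib import Path

MAX_ITERATIONS = 5  # hypothetical iteration budget before escalating to a human


def generate_patch(task: str, feedback: str) -> str:
    """Placeholder for a model call that returns candidate code.

    A real system would send `task` plus the previous round's test
    `feedback` to a frontier model API; here we only mark the boundary.
    """
    raise NotImplementedError("wire this to a model API of your choice")


def run_tests(code: str) -> tuple[bool, str]:
    """Write the candidate to disk and run it under pytest.

    Returns (passed, captured_output) so that failure output can be
    fed back into the next generation step.
    """
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "candidate.py"
        target.write_text(code)
        result = subprocess.run(
            ["pytest", str(target)],
            capture_output=True, text=True, timeout=120,
        )
    return result.returncode == 0, result.stdout + result.stderr


def solve(task: str) -> str | None:
    """Loop: generate code, test it, feed failures back, repeat."""
    feedback = ""
    for _ in range(MAX_ITERATIONS):
        code = generate_patch(task, feedback)
        passed, feedback = run_tests(code)
        if passed:
            return code  # no human touched the intermediate steps
    return None  # budget exhausted without a passing solution
```

The point of the sketch is the control flow, not the stubs: once generation, execution, and feedback are wired together, the loop runs without human review between iterations, which is exactly the capability the summary identifies as a precondition for automated AI R&D.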
Editorial Opinion
The prospect of AI systems automating their own research and development represents a genuine inflection point that demands serious attention from policymakers, safety researchers, and the AI industry itself. If credible evidence emerges of even prototype systems training their successors, it would fundamentally reshape how we think about AI timelines and governance. The fact that a serious AI researcher assigns 60%+ odds to this happening within three years, based on observable progress metrics rather than speculation, should elevate this from thought experiment to practical concern.

