BotBeat

RESEARCH · Anthropic · 2026-06-18

Study of 112,000 Commits Reveals AI-Written Code Is No Buggier Than Human Code

Key Takeaways

  • Analysis of 112,000+ commits found that AI-written code introduces bugs at rates comparable to or lower than human-written code in the same projects
  • Human-driven AI agents (T2 agents like Claude Code) outperformed both autonomous bots and minimally assisted development when developers actively reviewed and steered the process
  • Rigorous statistical controls, especially accounting for commit size, were essential to avoid measurement bias and provide credible evidence
Source: Hacker News, https://www.repowise.dev/blog/engineering/is-ai-written-code-buggier-than-human-code

Summary

A comprehensive analysis of 112,382 commits across 28 public repositories challenges the widespread assumption that AI-written code is more bug-prone than human-written code. Using the SZZ methodology, a standard technique in defect prediction that starts from bug-fixing commits and traces the lines they changed backward through git history, researchers identified which commits introduced bugs and compared bug-introduction rates between AI-generated and human-written code in the same codebases.
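
The article does not describe the study's SZZ implementation, so the following is only a minimal sketch of the core idea under simplifying assumptions. It replays a small in-memory history instead of reading a real repository, and it blames only the lines that a fix commit removes. The names `Commit` and `szz` and the sample history are hypothetical and exist purely for illustration; real tooling would typically run `git blame` on the parent of each fix commit.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Commit:
    sha: str
    is_fix: bool = False
    # path -> 0-based indices of lines removed from the parent version
    removed: dict[str, list[int]] = field(default_factory=dict)
    # path -> (index in the new version, text) for each added line
    added: dict[str, list[tuple[int, str]]] = field(default_factory=dict)


def szz(commits: list[Commit]) -> dict[str, set[str]]:
    """Map each fix commit to the commits that last touched the lines it removed."""
    blame: dict[str, list[tuple[str, str]]] = {}  # path -> [(origin sha, text)]
    suspects: dict[str, set[str]] = {}
    for c in commits:
        for path in set(c.removed) | set(c.added):
            lines = blame.setdefault(path, [])
            gone = set(c.removed.get(path, []))
            if c.is_fix:
                # Blame step: record who authored the lines this fix deletes.
                suspects.setdefault(c.sha, set()).update(lines[i][0] for i in gone)
            # Replay the change so later commits are blamed against the new version.
            kept = [ln for i, ln in enumerate(lines) if i not in gone]
            for idx, text in sorted(c.added.get(path, [])):
                kept.insert(idx, (c.sha, text))
            blame[path] = kept
    return suspects


# Hypothetical three-commit history: a1 writes mean(), b2 adds total(),
# and c3 fixes the division-by-zero bug in the line that a1 wrote.
history = [
    Commit("a1", added={"calc.py": [(0, "def mean(xs):"),
                                    (1, "    return sum(xs) / len(xs)")]}),
    Commit("b2", added={"calc.py": [(2, "def total(xs):"),
                                    (3, "    return sum(xs)")]}),
    Commit("c3", is_fix=True,
           removed={"calc.py": [1]},
           added={"calc.py": [(1, "    return sum(xs) / len(xs) if xs else 0.0")]}),
]
result = szz(history)
print(result)  # {'c3': {'a1'}}
```

In this toy history the fix `c3` is traced back to `a1`, so `a1` would count as a bug-introducing commit. A bug-introduction rate for a group of commits is then the share of that group's commits that appear as a suspect for at least one fix.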

The findings indicate that AI code is no buggier than human code and, in some cases, appears to be less buggy. Importantly, the study differentiated between three tiers of AI-assisted development: T1 (autonomous bot agents like Devin and Copilot agents), T2 (human-driven agents like Claude Code, where developers actively steer and review), and T3 (minimal AI assistance with co-author trailers). The distinction proved crucial: different tiers exhibited markedly different bug-introduction patterns, with human-supervised AI agents showing particularly strong results.

The research employed rigorous statistical controls. AI-generated commits were detected with 96.2% precision, meaning that 96.2% of the commits flagged as AI-generated actually were, and the analysis accounted for commit size, the strongest predictor of bug introduction. These methodological safeguards distinguish the findings from anecdotal claims and help ensure that results reflect actual code quality rather than differences in commit size or other confounding factors.
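
To illustrate why commit size matters, here is a small sketch using stratification, one common way to control for a confounder. The article does not say which statistical method the study used, so this is a stand-in technique, not the authors' analysis. The function names, the size thresholds, and every number in `toy` are invented for illustration and are not data from the study.

```python
from collections import defaultdict


def size_bucket(lines_changed):
    # Hypothetical thresholds, chosen only for this example.
    if lines_changed < 50:
        return "small"
    if lines_changed < 500:
        return "medium"
    return "large"


def rates(commits, key):
    """Bug-introduction rate per group, where key(commit) names the group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [buggy, total]
    for commit in commits:
        cell = counts[key(commit)]
        cell[0] += int(commit[2])
        cell[1] += 1
    return {group: buggy / total for group, (buggy, total) in counts.items()}


def make(author, lines_changed, n, n_buggy):
    # Build n commits as (author, lines_changed, introduced_bug) tuples.
    return [(author, lines_changed, i < n_buggy) for i in range(n)]


# Made-up data: both groups have identical bug rates at each size,
# but the "ai" group makes proportionally more large commits.
toy = (make("human", 20, 8, 2) + make("human", 800, 2, 1)
       + make("ai", 20, 4, 1) + make("ai", 800, 8, 4))

raw = rates(toy, key=lambda c: c[0])
stratified = rates(toy, key=lambda c: (c[0], size_bucket(c[1])))
```

On this made-up data the raw rates differ (0.30 for `human` versus about 0.42 for `ai`), yet within each size bucket the two groups are identical (0.25 for small commits, 0.50 for large ones). The apparent gap comes entirely from the mix of commit sizes, which is the kind of bias a size control is meant to remove.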

Editorial Opinion

This research brings much-needed empirical rigor to a debate that has largely been driven by anecdote and prior belief. The finding that human-supervised AI collaboration produces particularly high-quality code suggests that the future of AI-assisted development depends less on fully autonomous agents and more on tools that augment and accelerate human expertise. The results favor the collaborative model over fully autonomous systems and raise an important question about where the real value of AI coding tools lies: in pure code quality, or in developer velocity and oversight.

Large Language Models (LLMs) · AI Agents · Science & Research · Jobs & Workforce Impact
