BotBeat

RESEARCH | Google / Alphabet | 2026-03-04

Google Researchers Propose Dual-LLM System to Filter Bad Code Fixes in Automated Program Repair

Key Takeaways

  • Google researchers developed a dual-LLM policy system with bug abstention and patch validation to filter out low-quality automated code repairs
  • Testing on 174 human-reported bugs from Google's codebase showed combined success rate improvements of up to 39 percentage points
  • The system addresses a critical deployment challenge: reducing noise and the developer time wasted reviewing automated patches that are unlikely to be accepted
Source: Hacker News, https://arxiv.org/abs/2510.03217

Summary

Researchers from Google have introduced a dual-LLM policy system designed to reduce noise in agentic automated program repair (APR) systems. The approach, detailed in a paper accepted to ICSE-SEIP 2026, addresses a critical challenge in deploying AI-powered code repair at scale: many automatically generated patches are unlikely to be accepted by human reviewers, which wastes developer time and erodes trust in the technology.

The system employs two complementary LLM-based policies working in tandem. The first, called "bug abstention," screens bugs before any repair is attempted and filters out those the agentic APR system is unlikely to fix. The second, "patch validation," evaluates generated patches and rejects those unlikely to be good fixes for the given bug. This two-stage filtering aims to present only high-quality, actionable patches to human developers for review.
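The two-stage flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `run_pipeline`, `abstain_score`, `generate_patch`, and `validate_score`, the 0.5 thresholds, and the toy scoring functions are all hypothetical stand-ins for what would be LLM calls in a real system.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Bug:
    bug_id: str
    report: str


@dataclass
class Patch:
    bug_id: str
    diff: str


def run_pipeline(
    bug: Bug,
    abstain_score: Callable[[Bug], float],
    generate_patch: Callable[[Bug], Optional[Patch]],
    validate_score: Callable[[Bug, Patch], float],
    abstain_threshold: float = 0.5,   # hypothetical cutoff
    validate_threshold: float = 0.5,  # hypothetical cutoff
) -> Optional[Patch]:
    """Return a patch worth human review, or None if either policy filters it."""
    # Stage 1 (bug abstention): skip bugs the repair agent is unlikely to fix.
    if abstain_score(bug) < abstain_threshold:
        return None

    # Repair attempt by the agentic APR system (represented by a callable here).
    patch = generate_patch(bug)
    if patch is None:
        return None

    # Stage 2 (patch validation): reject patches unlikely to be a good fix.
    if validate_score(bug, patch) < validate_threshold:
        return None

    return patch


# Toy stand-ins so the sketch runs without any model; all purely illustrative.
def toy_abstain_score(bug: Bug) -> float:
    # Pretend that very short reports are too vague to attempt.
    return 1.0 if len(bug.report.split()) >= 3 else 0.0


def toy_generate_patch(bug: Bug) -> Optional[Patch]:
    return Patch(bug_id=bug.bug_id, diff="- return x.y\n+ return x.y if x else None\n")


def toy_validate_score(bug: Bug, patch: Patch) -> float:
    # Pretend that any non-empty diff passes validation.
    return 1.0 if patch.diff.strip() else 0.0
```

In this sketch a bug filtered at stage 1 never reaches the repair agent at all, so abstention saves repair effort as well as reviewer attention, while validation only protects the reviewer.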

Testing on Google's internal codebase showed significant improvements in success rates. On a dataset of 174 human-reported bugs, the bug abstention policy improved success rates by up to 13 percentage points, while patch validation added up to 15 percentage points. When both policies were combined, success rates increased by up to 39 percentage points. The researchers also demonstrated improvements on machine-generated bug reports for null pointer exceptions and sanitizer-detected issues.
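To make the "percentage points" phrasing concrete, here is a small sketch of how a gain of that kind could be computed. It assumes success rate means the fraction of surfaced patches that turn out to be good fixes; the article does not spell out the paper's exact definition, and the outcome lists below are invented toy data, not results from the study.

```python
from typing import List


def success_rate(outcomes: List[bool]) -> float:
    """Fraction of surfaced patches that were good fixes (True = good)."""
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)


def percentage_point_gain(before: List[bool], after: List[bool]) -> float:
    """Absolute difference between two success rates, in percentage points."""
    return (success_rate(after) - success_rate(before)) * 100


# Toy data (hypothetical): without filtering, 1 of 4 surfaced patches is good.
unfiltered = [True, False, False, False]
# With filtering, two of the bad patches are never shown to the reviewer.
filtered = [True, False]

gain = percentage_point_gain(unfiltered, filtered)  # 25.0
```

Percentage points are absolute differences: moving from 25% to 50% in this toy case is a gain of 25 points, even though the rate has doubled in relative terms.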

The research represents a practical step toward industrial-scale deployment of AI-powered code repair systems, addressing the crucial gap between generating patches and ensuring they're worth a human developer's review time. By reducing false positives and low-quality suggestions, such filtering systems could make automated program repair a more trusted and efficient tool in professional software development workflows.

  • The paper was accepted to ICSE-SEIP 2026, the Software Engineering in Practice track of ICSE, a top software engineering conference, which signals strong industry relevance

Editorial Opinion

This research tackles one of the most pragmatic barriers to AI adoption in software engineering: the erosion of trust caused by noisy suggestions. While much attention focuses on making AI agents more capable at fixing bugs, Google's dual-filter approach recognizes that knowing when not to suggest a fix may be equally important. The reported improvement of up to 39 percentage points in success rates suggests that quality filtering could be key to making automated program repair genuinely useful in production environments rather than just technically impressive in research settings.

Large Language Models (LLMs) | AI Agents | Science & Research | Product Launch
