BotBeat

Google / Alphabet
RESEARCH · 2026-02-26

DeepMind Researchers Use LLMs to Autonomously Discover New Multi-Agent Learning Algorithms

Key Takeaways

  • DeepMind's AlphaEvolve uses LLMs to automatically discover new multi-agent learning algorithms, reducing reliance on manual human design
  • Two novel algorithms emerged: VAD-CFR for regret minimization and SHOR-PSRO for population-based training, both outperforming existing state-of-the-art methods
  • The discovered algorithms employ non-intuitive mechanisms that human researchers might not have considered, including volatility-sensitive discounting and dynamic meta-solver blending
Source: Hacker News (https://arxiv.org/abs/2602.16928)

Summary

Researchers from DeepMind have published work demonstrating how large language models can automatically discover novel multi-agent reinforcement learning (MARL) algorithms. The team, led by Zun Li with John Schultz, Daniel Hennes, and Marc Lanctot, introduced AlphaEvolve, an evolutionary coding agent that navigates the complex design space of game-theoretic learning algorithms without human intervention.

The research addresses a longstanding challenge in MARL: while foundational approaches like Counterfactual Regret Minimization (CFR) and Policy Space Response Oracles (PSRO) have strong theoretical foundations, designing their most effective variants has traditionally required extensive manual experimentation and human intuition. AlphaEvolve autonomously evolved two novel algorithms that outperform state-of-the-art baselines in imperfect-information games.
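For readers unfamiliar with the baseline the paper builds on: the regret-matching rule at the heart of CFR can be illustrated in a simple normal-form setting. The sketch below (background material, not code from the paper) runs regret matching in self-play on rock-paper-scissors, where the average strategies provably converge toward the Nash equilibrium.

```python
import numpy as np

def regret_matching(cum_regret):
    """Map cumulative regrets to a strategy: normalize the positive
    regrets, falling back to uniform when none are positive."""
    pos = np.maximum(cum_regret, 0.0)
    total = pos.sum()
    if total > 0.0:
        return pos / total
    return np.full(len(cum_regret), 1.0 / len(cum_regret))

# Self-play on rock-paper-scissors; A holds the row player's payoffs.
A = np.array([[ 0.0, -1.0,  1.0],
              [ 1.0,  0.0, -1.0],
              [-1.0,  1.0,  0.0]])

T = 100_000
r1, r2 = np.zeros(3), np.zeros(3)
r1[0] = 1.0  # asymmetric start so the dynamics actually cycle
avg1, avg2 = np.zeros(3), np.zeros(3)
for _ in range(T):
    s1, s2 = regret_matching(r1), regret_matching(r2)
    avg1 += s1
    avg2 += s2
    u1 = A @ s2          # value of each row action vs the column mix
    u2 = -A.T @ s1       # zero-sum: column payoffs are the negation
    r1 += u1 - s1 @ u1   # instantaneous regret of each action
    r2 += u2 - s2 @ u2
avg1 /= T
avg2 /= T  # both average strategies approach the uniform Nash equilibrium
```

Algorithms like the discounted CFR variants the paper improves on differ mainly in how the cumulative regrets `r1`/`r2` are weighted over time, which is exactly the part of the design space AlphaEvolve searches.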

The first discovery, Volatility-Adaptive Discounted CFR (VAD-CFR), introduces non-intuitive mechanisms including volatility-sensitive discounting and consistency-enforced optimism to improve upon existing regret minimization approaches. The second, Smoothed Hybrid Optimistic Regret PSRO (SHOR-PSRO), employs a hybrid meta-solver that dynamically transitions from encouraging population diversity to rigorous equilibrium finding. Both algorithms demonstrate superior empirical convergence compared to manually designed alternatives, suggesting that LLM-driven algorithm discovery could accelerate progress in complex AI research domains.
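The summary does not give SHOR-PSRO's actual update rule, so the following is purely an illustrative sketch of the general idea of a meta-solver that transitions from encouraging diversity to equilibrium finding: blend a uniform distribution over the agent population with an approximate Nash equilibrium of the empirical payoff matrix, shifting weight toward the latter as training progresses. The function names (`blended_meta_strategy`, `approx_nash_row`) and the linear schedule are invented for this example.

```python
import numpy as np

def approx_nash_row(M, iters=5000):
    """Approximate the row player's Nash strategy for the zero-sum
    matrix M via fictitious play: both players repeatedly best-respond
    to the opponent's empirical action frequencies."""
    rows, cols = M.shape
    rc, cc = np.zeros(rows), np.zeros(cols)
    rc[0] = cc[0] = 1.0
    for _ in range(iters):
        br_row = int(np.argmax(M @ (cc / cc.sum())))
        br_col = int(np.argmin((rc / rc.sum()) @ M))
        rc[br_row] += 1.0
        cc[br_col] += 1.0
    return rc / rc.sum()

def blended_meta_strategy(M, progress):
    """Hypothetical hybrid meta-solver: interpolate between a uniform
    distribution over the population (promotes diversity early) and an
    approximate Nash of the empirical game (rigorous equilibrium
    finding late), with progress in [0, 1]."""
    n = M.shape[0]
    uniform = np.full(n, 1.0 / n)
    nash = approx_nash_row(M)
    return (1.0 - progress) * uniform + progress * nash
```

In a PSRO-style loop, each new best response would be trained against opponents sampled from this meta-strategy: early on (`progress` near 0) it faces a broad mix of the population; later it targets the equilibrium of the empirical game.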

  • This approach could accelerate algorithmic innovation in game theory and reinforcement learning by automating the exploration of vast design spaces

Editorial Opinion

This research represents a fascinating meta-level application of AI: using large language models to discover better AI algorithms themselves. The non-intuitive nature of the discovered mechanisms—like volatility-adaptive discounting—suggests that LLMs may explore algorithmic design spaces differently than human researchers, potentially uncovering solutions that bypass human cognitive biases. If this approach generalizes beyond game-theoretic learning, we could be entering an era where AI systems routinely contribute to their own algorithmic evolution, dramatically accelerating the pace of AI research itself.

Reinforcement Learning · Multimodal AI · AI Agents · Machine Learning · Science & Research
