BotBeat

Meta
RESEARCH
2026-05-12

AutoTTS: Researchers Cut LLM Inference Tokens by 70% with AI-Discovered Reasoning Strategy

Key Takeaways

  • AutoTTS cuts inference tokens by approximately 70% compared to running 64 parallel chains while maintaining equivalent accuracy
  • The Confidence Momentum Controller (CMC) was written and refined by an AI agent, not manually designed by researchers
  • CMC uses real-time confidence signals and trends to dynamically decide when to branch, consolidate, explore, and prune reasoning paths
Source:
Hacker News: https://firethering.com/autotts-ai-inference-test-time-scaling/

Summary

Researchers from Google, Meta, and academic institutions (UMD, UVA, WUSTL, UNC) have unveiled AutoTTS (Automated Test-Time Scaling), a technique that reduces token usage in large language model inference by approximately 70% while maintaining accuracy. Rather than using the standard brute-force approach of running 64 parallel reasoning chains and selecting the majority answer (self-consistency, or SC@64), AutoTTS employs an AI agent to discover optimal reasoning strategies automatically through iterative refinement.

The core innovation is the Confidence Momentum Controller (CMC), an inference policy that was not hand-designed by researchers but automatically discovered by an AI agent. Unlike fixed inference rules, the CMC dynamically watches the model's confidence across reasoning traces and makes real-time decisions about when to branch, when to consolidate, when to explore new paths, and when to prune unpromising reasoning chains. This adaptive approach significantly outperforms traditional parallel sampling while dramatically reducing computational costs.
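In code, such a controller might look like the following minimal sketch. This is entirely illustrative: the `Trace` and `decide` names, the thresholds, and the momentum window are assumptions for exposition, not AutoTTS's published policy.

```python
from dataclasses import dataclass

@dataclass
class Trace:
    conf: list  # per-step confidence scores in [0, 1] for one reasoning chain

def momentum(conf, window=3):
    """Trend of recent confidence: positive = rising, negative = falling."""
    if len(conf) < 2:
        return 0.0
    recent = conf[-window:]
    return (recent[-1] - recent[0]) / (len(recent) - 1)

def decide(trace, branch_thr=0.15, prune_conf=0.3, stop_conf=0.9):
    """Map a trace's current confidence level and trend to a control action."""
    c, m = trace.conf[-1], momentum(trace.conf)
    if c >= stop_conf and m >= 0:
        return "consolidate"  # confident and stable: finalize this path
    if c < prune_conf and m < 0:
        return "prune"        # low and falling: abandon the chain
    if abs(m) > branch_thr:
        return "branch"       # volatile: fork to explore alternatives
    return "continue"         # keep generating on the current path
```

A scheduler would call `decide` after each generation step for every live trace, shifting the token budget toward promising branches and away from pruned chains.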

The discovery methodology demonstrates remarkable efficiency: researchers pre-computed a 'replay store' of cached reasoning traces from thousands of problems, then allowed an AI agent to write controller code, test it against the cached traces, evaluate accuracy and token efficiency, and iteratively refine the policy—all without making new model calls. This offline discovery process cost only $39.90 in API calls and completed in 160 minutes. The discovered controller generalized effectively across different benchmarks (AIME24, AIME25, HMMT25) and model sizes, achieving 69.5% token reduction at β=0.5 (balanced accuracy-speed tradeoff) while matching the accuracy of SC@64.

  • The entire discovery process cost only $39.90 in API calls thanks to offline evaluation against cached reasoning traces
  • The discovered policy transferred across benchmarks and model sizes, indicating robustness and practical applicability
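The offline loop described above can be sketched as a toy illustration. The replay records, the β-weighted score, and the threshold search below stand in for the agent's code-refinement process; none of these names or numbers come from the paper.

```python
# Toy replay store: cached single-chain traces (per-step confidences, token
# cost, whether the chain reached the right answer). Real stores hold
# traces from thousands of problems.
REPLAY_STORE = [
    {"conf": [0.3, 0.6, 0.9],   "tokens": 220, "correct": True},
    {"conf": [0.5, 0.4, 0.2],   "tokens": 310, "correct": False},
    {"conf": [0.8, 0.85, 0.92], "tokens": 150, "correct": True},
]

def evaluate(policy, store, beta=0.5, baseline_chains=64):
    """Score a candidate controller offline: no new model calls, just
    replay cached traces and weigh accuracy against token savings."""
    kept_tokens = correct = 0
    for rec in store:
        if policy(rec["conf"]):          # does the policy keep this trace?
            kept_tokens += rec["tokens"]
            correct += rec["correct"]
    accuracy = correct / len(store)
    # Baseline: running `baseline_chains` parallel chains per problem (SC@64).
    baseline = baseline_chains * sum(r["tokens"] for r in store)
    savings = 1 - kept_tokens / baseline
    # beta trades accuracy against token savings (beta=0.5 balances both).
    return beta * accuracy + (1 - beta) * savings

def make_policy(prune_thr):
    # Simplest candidate family: keep a trace only if its final confidence
    # clears a threshold; the "agent" searches over such candidates.
    return lambda conf: conf[-1] >= prune_thr

best_thr = max([0.1, 0.3, 0.5, 0.7, 0.9],
               key=lambda t: evaluate(make_policy(t), REPLAY_STORE))
```

Because `evaluate` only reads cached records, a discovery agent can propose, score, and refine thousands of candidate controllers without issuing a single new model call, which is why the reported discovery cost stayed under $40.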

Editorial Opinion

AutoTTS represents a paradigm shift in inference optimization: instead of human researchers designing better reasoning strategies, we can build environments where AI agents discover them through systematic exploration. This work demonstrates that the most efficient reasoning strategies may differ fundamentally from those humans would design, and that dramatic cost-efficiency breakthroughs can emerge from algorithmic discovery rather than computational brute force. That the discovery itself was so affordable (about $40) while delivering roughly 70% savings in ongoing inference costs suggests AutoTTS could become a standard tool for making enterprise-scale LLM deployment economically viable. Most importantly, this research validates a meta-principle (using AI to improve AI) that could unlock innovations across the entire field.

Large Language Models (LLMs) · AI Agents · Machine Learning · Deep Learning


© 2026 BotBeat