BotBeat
RESEARCH · Dewey · 2026-04-08

Agentic RAG Outperforms Full-Context Retrieval on FinanceBench by 7.7 Points

Key Takeaways

  • Agentic RAG with Claude Opus achieved 83.7% accuracy on FinanceBench, outperforming full-context retrieval (76.0%) by 7.7 percentage points using the same model
  • Dewey's iterative search approach successfully handled all 150 benchmark documents, while full-context retrieval failed on six large SEC filings that exceeded context limits
  • Document enrichment features (section summaries, table captions, image captions) contribute to improved retrieval quality, enabling more effective financial analysis workflows
Source: Hacker News, https://meetdewey.com/blog/financebench-eval

Summary

Dewey, a document research API, reports that agentic retrieval-augmented generation (RAG) outperforms a traditional full-context approach on FinanceBench, a benchmark of 150 financial analysis questions derived from real SEC filings. Using Claude Opus as the reasoning model, Dewey achieved 83.7% accuracy, compared with 76.0% for the same model using full-context retrieval, a difference of 7.7 percentage points. The agentic approach also handled all 150 documents, whereas full-context retrieval failed on six large PepsiCo 10-K filings that exceeded Claude's 1M-token context limit.
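The headline figure is a gap in percentage points, not a relative percentage, and the two are easy to conflate. The short sketch below works through the arithmetic using only the numbers quoted above. The last calculation rests on an assumption the article does not confirm: that each of the six failed filings corresponds to exactly one scored benchmark item and that a failure counts as a wrong answer.

```python
# Figures quoted in the article.
AGENTIC_ACCURACY = 83.7       # percent
FULL_CONTEXT_ACCURACY = 76.0  # percent
TOTAL_ITEMS = 150
FAILED_FILINGS = 6

# Absolute gap, in percentage points (rounded to absorb float noise).
point_gap = round(AGENTIC_ACCURACY - FULL_CONTEXT_ACCURACY, 1)

# Relative improvement over the full-context baseline, in percent.
relative_gain = round(
    (AGENTIC_ACCURACY - FULL_CONTEXT_ACCURACY) / FULL_CONTEXT_ACCURACY * 100, 1
)

# ASSUMPTION (not stated in the article): one scored item per failed filing,
# and a failed filing is graded as incorrect. Under that assumption this is
# the largest share of the gap that context-limit failures could explain.
failure_ceiling = round(FAILED_FILINGS / TOTAL_ITEMS * 100, 1)

print(f"gap: {point_gap} points")
print(f"relative gain: {relative_gain}%")
print(f"max attributable to failures: {failure_ceiling} points")
```

Under that assumption, the six context-limit failures would account for at most 4.0 of the 7.7 points, so the remainder would have to come from questions that both approaches attempted.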

The result contrasts with the 2023 finding from Patronus AI that traditional vector RAG achieved only 19% accuracy on FinanceBench, and suggests that agentic retrieval with iterative search and document enrichment is a more scalable and effective approach to financial document analysis. Dewey's system can make up to 50 search calls per question at exhaustive depth, which lets it locate specific figures across multiple documents, compute financial ratios, compare across periods, and synthesize an analysis. For financial services and other document-heavy industries, the result indicates that RAG can match or exceed full-context approaches while remaining cost-effective and scalable to large document collections.
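To make the idea of budgeted iterative search concrete, here is a minimal sketch of an agent loop capped at 50 search calls. It is not Dewey's API or implementation: `AgentState`, `keyword_search`, `run_agent`, and the `next_query` policy are hypothetical names, the retriever is a toy keyword match, and the corpus figures are made up for illustration. In a real system the policy would be the reasoning model deciding what to search for next.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

MAX_SEARCH_CALLS = 50  # budget quoted in the article; all else is hypothetical


@dataclass
class AgentState:
    question: str
    evidence: list[str] = field(default_factory=list)
    calls_used: int = 0


def keyword_search(corpus: dict[str, str], query: str) -> list[str]:
    """Toy retriever: passages containing every query term, case-insensitive."""
    terms = query.lower().split()
    return [text for text in corpus.values()
            if all(term in text.lower() for term in terms)]


def run_agent(
    question: str,
    corpus: dict[str, str],
    next_query: Callable[[AgentState], Optional[str]],
    max_calls: int = MAX_SEARCH_CALLS,
) -> AgentState:
    """Search repeatedly until the policy stops or the call budget runs out."""
    state = AgentState(question)
    while state.calls_used < max_calls:
        query = next_query(state)  # stand-in for the model's next decision
        if query is None:          # policy judges the evidence sufficient
            break
        state.calls_used += 1
        for passage in keyword_search(corpus, query):
            if passage not in state.evidence:
                state.evidence.append(passage)
    return state


# Made-up toy corpus; the figures are illustrative, not from any filing.
corpus = {
    "income_statement": "FY2022 revenue was 100 million",
    "balance_sheet": "FY2022 total assets were 250 million",
}

# A scripted policy that issues two queries and then stops.
scripted = iter(["revenue", "total assets"])
state = run_agent("What was FY2022 asset turnover?", corpus,
                  lambda s: next(scripted, None))

# A policy that never stops is cut off by the budget.
capped = run_agent("loop forever", corpus, lambda s: "revenue", max_calls=3)
```

The design point is that the loop gathers only the passages it needs across several targeted searches, so the amount of text the model reads depends on the question rather than on the size of the filing.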

  • Agentic RAG is presented as more scalable and cost-efficient than full-context approaches, which addresses a practical limit on enterprise-scale financial document analysis

Editorial Opinion

This research is a meaningful validation that agentic RAG architectures can solve real-world document analysis problems at scale. The 7.7-point improvement over full-context retrieval using the same model is more than a marginal gain: it indicates that targeted search and iterative reasoning can outperform brute-force context expansion. For financial services and other document-heavy industries, this finding suggests that purpose-built agentic systems may offer a better path forward than a context-window arms race.

Large Language Models (LLMs) · Natural Language Processing (NLP) · Generative AI · AI Agents · Finance & Fintech
