Berry: New MCP Server Aims to Combat AI Hallucinations Through Evidence-Based Verification
Key Takeaways
- Berry is a verification-only MCP server that checks AI claims against user-provided evidence at the tool boundary rather than through prompting
- The system exposes two main tools: detect_hallucination for Q&A outputs and audit_trace_budget for structured reasoning traces
- Berry uses an information-theoretic measure of whether evidence sufficiently supports a claim, catching citation laundering and weak justifications
Summary
A developer has released Berry, an open-source Model Context Protocol (MCP) server designed to reduce AI hallucinations by requiring language models to back their claims with verifiable evidence. Unlike traditional approaches that rely on prompting or post-hoc filtering, Berry operates as a verification-only tool that checks whether AI-generated claims are actually supported by evidence provided by the user.
Berry exposes two primary verification tools: `detect_hallucination`, which analyzes answers with citations to confirm that each claim is supported by the cited evidence, and `audit_trace_budget`, which verifies structured reasoning traces step by step. The system uses an information-theoretic approach to measure whether evidence provides sufficient support for claims, flagging issues such as citation laundering, weak support, and invented details. Importantly, Berry doesn't fetch evidence itself: users must provide code snippets, documentation, logs, or other relevant text spans.
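Since MCP tools are invoked over JSON-RPC 2.0 via the protocol's `tools/call` method, a verification request to Berry would look roughly like the sketch below. The argument names (`answer`, `evidence`) are illustrative assumptions, not Berry's documented schema; only the JSON-RPC envelope and the tool name come from the source and the MCP specification.

```python
import json

# Hypothetical MCP "tools/call" request for Berry's detect_hallucination tool.
# MCP uses JSON-RPC 2.0 for tool invocation; the "arguments" keys here
# (answer, evidence) are assumed for illustration, not Berry's real schema.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "detect_hallucination",
        "arguments": {
            # The claim-bearing answer, with a citation marker.
            "answer": "The cache is invalidated on every write [1].",
            # User-provided evidence spans; Berry never fetches these itself.
            "evidence": [
                {"id": "1", "text": "def write(self, key, value): ..."},
            ],
        },
    },
}
print(json.dumps(request, indent=2))
```

The key design point the payload illustrates is that the evidence travels with the request: the verifier only ever sees spans the user has already vetted, which is what lets it flag claims whose citations don't actually support them.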
The tool is positioned as a pragmatic solution rather than a silver bullet. The creator openly acknowledges Berry's limitations: the verification model is itself an LLM that can make mistakes, it requires quality evidence input, and it doesn't guarantee correctness. However, Berry aims to shift AI assistant failure modes from confidently stating unsupported claims to either finding proper evidence or admitting uncertainty. The MCP server integrates with AI coding assistants like Cursor, Claude Code, and Gemini, operating locally to provide verification at the tool boundary rather than through system prompts.
- The tool doesn't fetch evidence, generate code, or guarantee correctness—it serves as a filter requiring users to provide trusted evidence spans
- The creator positions Berry as a pragmatic improvement that shifts failure modes toward uncertainty rather than confident hallucinations
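Local MCP servers are typically registered with assistants like Claude Code or Cursor through an `mcpServers` configuration entry. The sketch below shows the general shape of such an entry; the launch command and the package name `berry-mcp` are hypothetical placeholders, not Berry's documented install instructions.

```python
import json

# Hypothetical client-side registration for a local MCP server, following the
# "mcpServers" convention used by assistants such as Claude Code and Cursor.
# The command and the package name "berry-mcp" are illustrative assumptions.
config = {
    "mcpServers": {
        "berry": {
            "command": "npx",             # assumed launcher; Berry's actual runtime may differ
            "args": ["-y", "berry-mcp"],  # hypothetical package name
        }
    }
}
print(json.dumps(config, indent=2))
```

Running locally in this way is what puts verification at the tool boundary: the assistant's tool calls pass through the server on the developer's machine instead of relying on instructions in a system prompt.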
Editorial Opinion
Berry represents a thoughtful shift in addressing AI reliability—treating hallucination as an architectural problem rather than a prompting challenge. The evidence-required approach and honest acknowledgment of limitations are refreshing in a space often dominated by overblown claims. However, the tool's effectiveness depends entirely on users providing comprehensive, relevant evidence, which may create friction in fast-paced development workflows. Its real test will be whether developers adopt the discipline of evidence collection consistently enough to make the verification worthwhile.


