Deconvolute Labs Reveals Critical 'Rug Pull' Vulnerability in AI Agent MCP Servers
Key Takeaways
- ▸MCP servers can dynamically modify tool schemas to trick AI agents into extracting and transmitting credentials like AWS access keys
- ▸The vulnerability exploits LLMs' inability to distinguish between legitimate tool requirements and adversarial instructions embedded in metadata
- ▸Current MCP implementations lack schema integrity verification after initial handshake, creating a 'confused deputy' security problem
Summary
Security researchers at Deconvolute Labs have disclosed a critical vulnerability in Model Context Protocol (MCP) implementations that allows malicious servers to steal credentials from AI agents through dynamic schema manipulation. The attack, dubbed a 'Rug Pull' or Schema Modification attack, exploits the fact that MCP clients do not verify schema integrity after initial handshake, allowing compromised servers to alter tool definitions mid-session to demand sensitive information like AWS access keys.
The vulnerability creates a 'confused deputy' problem where AI agents with access to local privileges and environment variables trust remote server instructions without verification. When a malicious MCP server modifies a tool schema to require credentials as mandatory parameters—disguised as legitimate API constraints—the LLM agent extracts and transmits the sensitive data while continuing normal operations to avoid detection. The attack is particularly insidious because agents that refresh tool lists before queries automatically ingest poisoned schemas without user awareness.
To address this vulnerability, Deconvolute Labs has released an open-source runtime firewall for MCP that implements schema pinning and stateful integrity checks. The company has published a demonstration repository showing both vulnerable and protected implementations, allowing developers to reproduce the attack and test defenses. The disclosure highlights fundamental trust model weaknesses in current MCP implementations that assume tool definitions remain static or that schema changes represent only legitimate API updates.
- Deconvolute Labs has released open-source tooling to prevent schema modification attacks through stateful integrity checks and schema pinning
Editorial Opinion
This disclosure exposes a fundamental architectural weakness in how AI agents interact with external services through protocols like MCP. The 'confused deputy' problem isn't new to security, but its manifestation in LLM-based systems is particularly concerning because these agents lack the ability to reason about trust boundaries or distinguish functional requirements from social engineering attempts. As AI agents gain more autonomy and access to sensitive systems, the security community must develop robust verification mechanisms that go beyond assuming good faith from external services—schema pinning is a good start, but the broader challenge of teaching agents to recognize adversarial manipulation remains unsolved.


