Researchers Demonstrate 'Brainworm' — First AI Agent Malware Hidden Entirely in Context Windows
Key Takeaways
- Brainworm is the first demonstrated malware that exists entirely within AI agent context windows, requiring no traditional code, scripts, or binary executables
- The attack exploits cross-session memory files (CLAUDE.md, AGENTS.md) in agents like Claude Code, Codex, and Gemini CLI to persist malicious instructions and force unauthorized tool calls
- Traditional cybersecurity defenses are ineffective against this 'promptware' approach, as computer-use agents now function as natural language scripting interpreters
Summary
Security researchers at Origin have unveiled Brainworm, a novel form of malware that infects computer-use agents like Anthropic's Claude Code using only natural language instructions, with no traditional code or executables required. The promptware operates entirely within the agent's context window by exploiting cross-session memory files (CLAUDE.md, AGENTS.md) that are automatically loaded when users trust a project. Researchers demonstrated that these memory files can force AI agents to perform unauthorized tool calls and communicate with command-and-control infrastructure called Praxis, which Origin previously open-sourced as an adversarial framework for discovering and orchestrating computer-use agents.
The attack represents a fundamental paradigm shift in cybersecurity, as computer-use agents now function as scripting interpreters that accept executable instructions in pure natural language, bypassing traditional security measures that rely on scanning for binary artifacts, file hashes, or structured code patterns. The researchers drew parallels to the 1971 Creeper worm on ARPANET, positioning Brainworm as the modern equivalent for the AI agent era. While the current demonstration focuses primarily on persistence and command-and-control phases, Origin indicated plans to reveal a complete 'promptware kill chain' in subsequent posts.
The vulnerability affects multiple major AI coding assistants including Anthropic's Claude Code, OpenAI's Codex, and Google's Gemini CLI, all of which support project-specific memory features. Anthropic recently introduced an 'auto-memory' feature that automatically creates memory files on users' behalf, potentially expanding the attack surface. The research highlights a critical gap in endpoint security as traditional defenses built around detecting malicious executables, scripts, or behavioral anomalies are fundamentally ineffective against semantic attacks that operate purely through natural language manipulation of AI systems.
- Researchers have developed Praxis, an open-source adversarial C2 framework for orchestrating compromised AI agents across endpoints
- The demonstration represents only the persistence and C2 phases, with a full 'promptware kill chain' promised in future research
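To illustrate why defenders currently have so little to match on, consider even a naive keyword audit of agent memory files. The sketch below scans a project tree for the memory files named in the research and flags lines containing crude red-flag phrases; the file names come from the article, but the pattern list and function names are illustrative assumptions, not a vetted detection ruleset, and a real promptware payload could trivially rephrase around them:

```python
import re
from pathlib import Path

# Memory file names cited in the research; GEMINI.md is an assumed
# analogue for Gemini CLI. The patterns below are illustrative only.
MEMORY_FILES = {"CLAUDE.md", "AGENTS.md", "GEMINI.md"}
SUSPICIOUS = [
    re.compile(r"https?://\S+", re.I),                               # unexpected outbound URLs
    re.compile(r"\b(do not|never) (tell|mention|inform)\b", re.I),   # concealment directives
    re.compile(r"\b(always|silently) (run|execute|curl|fetch)\b", re.I),  # forced tool calls
]

def audit_memory_file(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs matching any suspicious pattern."""
    hits = []
    for i, line in enumerate(text.splitlines(), start=1):
        if any(p.search(line) for p in SUSPICIOUS):
            hits.append((i, line.strip()))
    return hits

def audit_project(root: str) -> dict[str, list[tuple[int, str]]]:
    """Walk a project tree and audit every agent memory file found."""
    findings = {}
    for path in Path(root).rglob("*"):
        if path.name in MEMORY_FILES and path.is_file():
            hits = audit_memory_file(path.read_text(errors="ignore"))
            if hits:
                findings[str(path)] = hits
    return findings
```

The brittleness is the point: unlike hash- or signature-based scanning of binaries, there is no stable artifact here, only natural language whose malicious intent is semantic rather than syntactic.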
Editorial Opinion
This research exposes a genuinely novel attack vector that traditional endpoint security was never designed to address. The elegance and danger of Brainworm lie in its exploitation of features intended to make AI agents more helpful (persistent memory and natural language understanding), turning them into infection vectors that leave no forensic artifacts. As computer-use agents gain broader deployment in enterprise environments with access to sensitive systems, the security community urgently needs to develop new defensive paradigms that can detect and prevent semantic manipulation, not just binary malware.

