Researchers Demonstrate 'Brainworm' — First AI Agent Malware Hidden Entirely in Context Windows
Key Takeaways
- Brainworm is the first demonstrated malware that exists entirely within AI agent context windows, requiring no traditional code, scripts, or binary executables
- The attack exploits cross-session memory files (CLAUDE.md, AGENTS.md) in agents like Claude Code, Codex, and Gemini CLI to persist malicious instructions and force unauthorized tool calls
- Traditional cybersecurity defenses are ineffective against this 'promptware' approach, as computer-use agents now function as natural language scripting interpreters
Summary
Security researchers at Origin have unveiled Brainworm, a novel form of malware that infects computer-use agents like Anthropic's Claude Code using only natural language instructions, with no traditional code or executables required. The promptware operates entirely within the agent's context window by exploiting cross-session memory files (CLAUDE.md, AGENTS.md) that are automatically loaded when users trust a project. Researchers demonstrated that these memory files can force AI agents to perform unauthorized tool calls and communicate with command-and-control infrastructure called Praxis, which Origin previously open-sourced as an adversarial framework for discovering and orchestrating computer-use agents.
The attack represents a fundamental paradigm shift in cybersecurity, as computer-use agents now function as scripting interpreters that accept executable instructions in pure natural language, bypassing traditional security measures that rely on scanning for binary artifacts, file hashes, or structured code patterns. The researchers drew parallels to the 1971 Creeper worm on ARPANET, positioning Brainworm as the modern equivalent for the AI agent era. While the current demonstration focuses primarily on persistence and command-and-control phases, Origin indicated plans to reveal a complete 'promptware kill chain' in subsequent posts.
The vulnerability affects multiple major AI coding assistants including Anthropic's Claude Code, OpenAI's Codex, and Google's Gemini CLI, all of which support project-specific memory features. Anthropic recently introduced an 'auto-memory' feature that automatically creates memory files on users' behalf, potentially expanding the attack surface. The research highlights a critical gap in endpoint security as traditional defenses built around detecting malicious executables, scripts, or behavioral anomalies are fundamentally ineffective against semantic attacks that operate purely through natural language manipulation of AI systems.
- Researchers have developed Praxis, an open-source adversarial C2 framework for orchestrating compromised AI agents across endpoints
- The demonstration represents only the persistence and C2 phases, with a full 'promptware kill chain' promised in future research
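To illustrate why defenders currently have so little to match on, consider even a naive keyword audit of agent memory files. The sketch below scans a project tree for the memory files named in the research and flags lines containing crude red-flag phrases; the file names come from the article, but the pattern list and function names are illustrative assumptions, not a vetted detection ruleset, and a real promptware payload could trivially rephrase around them:

```python
import re
from pathlib import Path

# Memory file names cited in the research; GEMINI.md is an assumed
# analogue for Gemini CLI. The patterns below are illustrative only.
MEMORY_FILES = {"CLAUDE.md", "AGENTS.md", "GEMINI.md"}
SUSPICIOUS = [
    re.compile(r"https?://\S+", re.I),                               # unexpected outbound URLs
    re.compile(r"\b(do not|never) (tell|mention|inform)\b", re.I),   # concealment directives
    re.compile(r"\b(always|silently) (run|execute|curl|fetch)\b", re.I),  # forced tool calls
]

def audit_memory_file(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs matching any suspicious pattern."""
    hits = []
    for i, line in enumerate(text.splitlines(), start=1):
        if any(p.search(line) for p in SUSPICIOUS):
            hits.append((i, line.strip()))
    return hits

def audit_project(root: str) -> dict[str, list[tuple[int, str]]]:
    """Walk a project tree and audit every agent memory file found."""
    findings = {}
    for path in Path(root).rglob("*"):
        if path.name in MEMORY_FILES and path.is_file():
            hits = audit_memory_file(path.read_text(errors="ignore"))
            if hits:
                findings[str(path)] = hits
    return findings
```

The brittleness is the point: unlike hash- or signature-based scanning of binaries, there is no stable artifact here, only natural language whose malicious intent is semantic rather than syntactic.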
Editorial Opinion
This research exposes a genuinely novel attack vector that traditional endpoint security was never designed to address. The elegance and danger of Brainworm lie in its exploitation of features intended to make AI agents more helpful (persistent memory and natural language understanding), turning them into infection vectors that leave no forensic artifacts. As computer-use agents gain broader deployment in enterprise environments with access to sensitive systems, the security community urgently needs to develop new defensive paradigms that can detect and prevent semantic manipulation, not just binary malware.

