Security Researchers Expose Critical Prompt Injection Vulnerabilities in Claude, Gemini, and Copilot—But No Public Disclosure

Key Takeaways

▸Prompt injection attacks on AI coding agents can steal credentials by embedding malicious instructions in code comments, pull requests, and issue descriptions that the agent cannot distinguish from legitimate developer requests
▸The vulnerability is a fundamental property of how LLMs work, not a fixable bug in individual products, as models lack hardware-enforced boundaries between trusted and untrusted input
▸All three major AI coding agent providers (Anthropic, Google, Microsoft) addressed the findings with bug bounties but declined to issue CVEs, raising transparency and disclosure concerns

Source:

Hacker Newshttps://grith.ai/blog/we-hacked-claude-gemini-copilot↗

Summary

Security researchers have demonstrated that AI coding agents from Anthropic (Claude), Google (Gemini Code Assist), and Microsoft (GitHub Copilot) are vulnerable to prompt injection attacks that can steal credentials and secrets. The attack is remarkably simple: malicious instructions embedded in pull requests, code comments, README files, or issue descriptions are processed by the agents as legitimate commands, allowing attackers to exfiltrate API keys, environment variables, SSH keys, and other sensitive data. All three companies acknowledged the findings and paid bug bounties, but notably issued no CVE disclosures—a stark contrast to standard security practice for critical vulnerabilities.

The vulnerability reflects a fundamental architectural property of large language models: they process all input as a single stream of tokens without hardware-enforced boundaries between trusted developer instructions and untrusted environmental content. The agents make decisions about what to follow based on statistical patterns rather than formal access controls. This issue has been recognized as the #1 vulnerability in the OWASP Top 10 for LLM Applications since 2023, and research has shown instruction-hierarchy defenses can be bypassed with near-100% success rates. Notably, the attack is not theoretical—a government agency disclosed in March 2025 that AI coding agents were used as an attack vector in a real breach of internal systems, with agents granted access to repositories and CI/CD pipelines.

The attack has already been exploited in the wild—a government agency disclosed a 2025 breach where attackers used prompt injection via code review comments to compromise internal systems with access to repositories and CI/CD pipelines

Editorial Opinion

The absence of CVE disclosures from Anthropic, Google, and Microsoft for a demonstrably exploitable vulnerability affecting production systems is troubling and sets a dangerous precedent for AI security accountability. While prompt injection is indeed an architectural challenge inherent to LLMs rather than a traditional software bug, the real-world breach described by the government agency proves this is not an academic concern—it is an active threat. The industry's apparent reluctance to formally disclose and transparently communicate these risks to end users mirrors historical patterns of security theater over substance. Until AI coding agents operate with formal capability boundaries and explicit user consent mechanisms, treating prompt injection as a fundamental design limitation rather than an unfixable flaw is both intellectually honest and practically necessary.

Security Researchers Expose Critical Prompt Injection Vulnerabilities in Claude, Gemini, and Copilot—But No Public Disclosure

Key Takeaways

▸Prompt injection attacks on AI coding agents can steal credentials by embedding malicious instructions in code comments, pull requests, and issue descriptions that the agent cannot distinguish from legitimate developer requests
▸The vulnerability is a fundamental property of how LLMs work, not a fixable bug in individual products, as models lack hardware-enforced boundaries between trusted and untrusted input
▸All three major AI coding agent providers (Anthropic, Google, Microsoft) addressed the findings with bug bounties but declined to issue CVEs, raising transparency and disclosure concerns

Summary

The attack has already been exploited in the wild—a government agency disclosed a 2025 breach where attackers used prompt injection via code review comments to compromise internal systems with access to repositories and CI/CD pipelines

Editorial Opinion

The absence of CVE disclosures from Anthropic, Google, and Microsoft for a demonstrably exploitable vulnerability affecting production systems is troubling and sets a dangerous precedent for AI security accountability. While prompt injection is indeed an architectural challenge inherent to LLMs rather than a traditional software bug, the real-world breach described by the government agency proves this is not an academic concern—it is an active threat. The industry's apparent reluctance to formally disclose and transparently communicate these risks to end users mirrors historical patterns of security theater over substance. Until AI coding agents operate with formal capability boundaries and explicit user consent mechanisms, treating prompt injection as a fundamental design limitation rather than an unfixable flaw is both intellectually honest and practically necessary.

Security Researchers Expose Critical Prompt Injection Vulnerabilities in Claude, Gemini, and Copilot—But No Public Disclosure

Key Takeaways

Summary

Editorial Opinion

More from Anthropic

Claude Integrates with 1Password for Secure Password Management

European Rare Book Dealers Warn That AI Companies Are Systematically Destroying Obscure Editions for Training Data

PostgreSQL Rewritten in Rust Using Claude: From Four Failed Attempts to 1.8M Lines of Code

Comments

Suggested

Security Research Reveals How AI Code Reviewers Can Be Tricked Into Deploying Secret-Stealing Code

FBI Explores AI-Powered Signature Verification for Georgia Election Investigation

StepFun Unveils StepX Neo, Claiming World's First Agentic AI Smartphone

Security Researchers Expose Critical Prompt Injection Vulnerabilities in Claude, Gemini, and Copilot—But No Public Disclosure

Key Takeaways

Summary

Editorial Opinion

More from Anthropic

Claude Integrates with 1Password for Secure Password Management

European Rare Book Dealers Warn That AI Companies Are Systematically Destroying Obscure Editions for Training Data

PostgreSQL Rewritten in Rust Using Claude: From Four Failed Attempts to 1.8M Lines of Code

Comments

Suggested

Security Research Reveals How AI Code Reviewers Can Be Tricked Into Deploying Secret-Stealing Code

FBI Explores AI-Powered Signature Verification for Georgia Election Investigation

StepFun Unveils StepX Neo, Claiming World's First Agentic AI Smartphone