BotBeat

Anthropic
RESEARCH · 2026-04-16

Security Researchers Demonstrate Claude Can Be Tricked Into Approving Malicious Code Through Git Identity Spoofing

Key Takeaways

  • Claude and similar AI code reviewers can be fooled into approving malicious code by spoofing Git commit author metadata with just two commands
  • AI systems may over-weight trust signals like author identity rather than independently assessing code quality and safety
  • Automated review workflows that grant auto-approval to trusted developers are vulnerable when trust rests on unsigned, easily forged metadata
Source: Hacker News (https://www.theregister.com/2026/04/16/git_identity_spoof_claude/)

Summary

Security researchers at Manifold Security have revealed a critical vulnerability in AI-powered code review workflows using Anthropic's Claude. By simply forging Git commit metadata to impersonate a trusted developer, attackers can trick Claude into approving malicious code changes. The attack exploits a common practice where automated review systems grant auto-approval privileges to commits from recognized maintainers, treating author identity as a trust signal without independently validating the code itself.
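The "two commands" at the heart of the attack are ordinary Git configuration settings: Git records whatever author identity the committer supplies, and nothing in an unsigned commit proves who actually made it. A minimal illustrative sketch (the repository path and identity strings below are invented, not taken from the Manifold research):

```shell
# Illustrative only: Git trusts whatever author identity is configured.
git init -q /tmp/spoof-demo && cd /tmp/spoof-demo

# Two commands are enough to impersonate any developer:
git config user.name  "Trusted Maintainer"
git config user.email "maintainer@example.com"

# Any commit made now carries the forged identity in its metadata:
git commit -q --allow-empty -m "routine dependency bump"
git log -1 --format='%an <%ae>'
# prints: Trusted Maintainer <maintainer@example.com>
```

Any review system that keys auto-approval off this author field alone is reading an attacker-controlled value.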

The vulnerability highlights a fundamental mismatch between how humans and AI systems evaluate trust. While a human reviewer might question unexpected changes from a maintainer or scrutinize the actual code diff, Claude appears to rely heavily on authorship metadata as a decision factor. This creates a security gap where spoofed identity becomes an effective bypass for security controls, particularly in open-source projects already struggling with review bottlenecks.

Manifold warns that as open-source communities increasingly adopt AI-powered workflow tools for automated code review and approval, these systems become attractive targets for poisoning attacks. The researchers emphasize that guardrails cannot reside solely in the AI model—proper controls must exist at the infrastructure level to verify actual commit authenticity and enforce code signing requirements.

  • Security controls must be enforced at the infrastructure and signing level, not solely within AI model decision-making
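One shape such an infrastructure-level control could take is a CI gate that refuses auto-approval for unsigned commits, using `git verify-commit` (which exits non-zero when a commit lacks a valid signature). This is a hedged sketch of the general idea, not Manifold's or Anthropic's actual tooling; the repository path and messages are hypothetical:

```shell
# Sketch of an infrastructure-level gate: route any commit without a
# valid cryptographic signature to human review before an AI reviewer
# is allowed to auto-approve it.
git init -q /tmp/gate-demo && cd /tmp/gate-demo
git -c user.name=ci -c user.email=ci@example.com \
    commit -q --allow-empty -m "unsigned change"

# `git verify-commit` exits non-zero for unsigned or badly-signed commits.
if git verify-commit HEAD 2>/dev/null; then
    echo "signed: eligible for automated review"
else
    echo "unsigned: route to human review"   # this branch fires here
fi
```

Because the check relies on cryptographic verification rather than the self-reported author field, the metadata spoof above no longer buys the attacker anything.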

Editorial Opinion

This vulnerability underscores a critical lesson for AI adoption in security-critical workflows: AI systems should enhance human judgment, not replace it or automate away verification steps. While using Claude to accelerate code review is valuable, treating metadata-based trust signals as sufficient for auto-approval creates dangerous attack surface. The open-source community should view this as a wake-up call to implement commit signing, multi-factor verification, and human oversight for high-risk changes, regardless of claimed authorship.

Cybersecurity · Ethics & Bias · AI Safety & Alignment

© 2026 BotBeat