
Anthropic · POLICY & REGULATION · 2026-02-23

Anthropic Uncovers Large-Scale Distillation Attacks by Chinese AI Labs Targeting Claude

Key Takeaways

  • Three Chinese AI labs (DeepSeek, Moonshot, and MiniMax) used approximately 24,000 fraudulent accounts to generate over 16 million exchanges with Claude in order to illicitly extract its capabilities
  • The distillation attacks targeted Claude's most advanced features, including agentic reasoning, tool use, and coding, with DeepSeek using sophisticated techniques to generate chain-of-thought training data
  • Illicitly distilled models lack safety safeguards, creating national security risks: they can be deployed for offensive cyber operations, surveillance, and disinformation without protections
Sources:
  • Anthropic: https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks
  • X (Twitter): https://x.com/AnthropicAI/status/2025997928242811253

Summary

Anthropic has revealed that three Chinese AI laboratories—DeepSeek, Moonshot, and MiniMax—conducted industrial-scale campaigns to illicitly extract Claude's capabilities through a technique called distillation. The campaigns generated over 16 million exchanges with Claude using approximately 24,000 fraudulent accounts, violating Anthropic's terms of service and regional access restrictions. The attacks specifically targeted Claude's most advanced capabilities including agentic reasoning, tool use, and coding abilities.

Distillation involves training a less capable model on the outputs of a stronger one, and while it's a legitimate technique when used properly, these campaigns used it to acquire powerful capabilities in a fraction of the time and cost required for independent development. Anthropic warns that illicitly distilled models lack necessary safeguards, creating significant national security risks as they can be deployed for offensive cyber operations, disinformation campaigns, and mass surveillance without the safety protections built into American models.
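To make the mechanics concrete, the following is a minimal sketch of how a distillation data-collection pipeline typically works. It is an illustration of the general technique only, not Anthropic's account of the attackers' tooling; the `query_teacher` helper is a hypothetical stand-in for API calls to a stronger model.

```python
import json

def query_teacher(prompt: str) -> str:
    """Hypothetical stand-in for an API call to a stronger 'teacher' model.

    In a real pipeline this would be a network request to the teacher's
    API; it is stubbed here so the sketch is self-contained and runnable.
    """
    return f"<teacher response to: {prompt}>"

def collect_distillation_pairs(prompts, out_path="distill_data.jsonl"):
    """Collect (prompt, response) pairs from the teacher model.

    The resulting JSONL file is the common format used to fine-tune a
    smaller 'student' model so that it imitates the teacher's outputs;
    that fine-tuning step is what transfers the teacher's capabilities.
    """
    with open(out_path, "w") as f:
        for prompt in prompts:
            response = query_teacher(prompt)
            f.write(json.dumps({"prompt": prompt, "response": response}) + "\n")

if __name__ == "__main__":
    # Legitimate distillation runs prompts like these against a model the
    # operator is authorized to query; the campaigns described above did
    # the same at industrial scale, in violation of the terms of service.
    collect_distillation_pairs([
        "Explain how a hash map resolves collisions.",
        "Write a function that reverses a linked list.",
    ])
```

The scale matters more than the code: the loop above is trivial, which is why the article stresses account volume (24,000 fraudulent accounts) and exchange volume (16 million) rather than any technical sophistication in the extraction itself.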

The company identified DeepSeek's campaign as involving over 150,000 exchanges with sophisticated techniques including generating chain-of-thought training data and creating censorship-safe alternatives to politically sensitive queries. Anthropic traced these operations through IP addresses, request metadata, and infrastructure indicators, and received corroboration from industry partners who observed similar attacks on their platforms. The company argues these attacks undermine export controls by allowing foreign labs to circumvent chip restrictions and close competitive advantages that such controls are designed to preserve.
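The article does not disclose Anthropic's actual detection logic, but a simplified sketch of infrastructure-based clustering shows why thousands of nominally independent accounts sharing request metadata are identifiable. The field names (`ip_prefix`, `user_agent`) and the sample records below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical per-account request metadata; a real system would aggregate
# these indicators from logs (IP prefixes, client fingerprints, timing).
accounts = [
    {"id": "acct-001", "ip_prefix": "203.0.113", "user_agent": "client/1.2"},
    {"id": "acct-002", "ip_prefix": "203.0.113", "user_agent": "client/1.2"},
    {"id": "acct-003", "ip_prefix": "198.51.100", "user_agent": "client/9.9"},
]

def cluster_by_infrastructure(accounts):
    """Group accounts that share infrastructure indicators.

    Many 'independent' accounts collapsing into one cluster behind
    identical indicators is a strong signal of coordinated, fraudulent
    account creation rather than organic use.
    """
    clusters = defaultdict(list)
    for acct in accounts:
        key = (acct["ip_prefix"], acct["user_agent"])
        clusters[key].append(acct["id"])
    return clusters

if __name__ == "__main__":
    for key, members in cluster_by_infrastructure(accounts).items():
        if len(members) > 1:  # flag clusters above a size threshold
            print(f"Suspicious cluster {key}: {members}")
```

Cross-platform corroboration works the same way at a higher level: when industry partners observe clusters with matching indicators on their own platforms, the attribution strengthens.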

Anthropic emphasizes that these campaigns are growing in intensity and sophistication, warning that the window to act is narrow and the threat extends beyond any single company or region. The company calls for rapid, coordinated action among industry players, policymakers, and the global AI community to address what it characterizes as an urgent threat to AI safety and national security.

  • These attacks undermine U.S. export controls by allowing foreign labs to acquire advanced AI capabilities without access to restricted chips, reinforcing the need for such controls
  • Anthropic calls for urgent, coordinated action from industry, policymakers, and the AI community to address the growing threat of distillation attacks
Large Language Models (LLMs) · Cybersecurity · Government & Defense · Market Trends · Regulation & Policy · Ethics & Bias · AI Safety & Alignment
