White House Demands Anthropic Block All Jailbreaks as Impasse Over Claude Fable 5 Intensifies

Key Takeaways

▸The White House is demanding Anthropic prove it can block jailbreaks on Claude Fable 5 before the model can be rereleased following its export control suspension
▸The NSA has confirmed that guardrails on the model can be disabled through prompt engineering, contradicting Anthropic's claim that effects are minimal
▸The administration expects Anthropic to be proactive about testing all frontier models for vulnerabilities and reporting findings to the government

Source:

Hacker News: https://www.wired.com/story/the-white-house-wants-anthropic-to-block-all-jailbreaks-that-may-not-be-possible/

Summary

The Trump administration has escalated pressure on Anthropic following the export control suspension of Claude Fable 5, the company's most advanced model, over concerns that hackers could use prompt-based jailbreaks to circumvent the model's safety guardrails. Administration officials have made clear that Anthropic cannot rerelease the model without demonstrably addressing what the National Security Agency has confirmed are exploitable vulnerabilities in those guardrails, particularly the ones restricting access to dangerous capabilities in cybersecurity, chemistry, and biology.

While Anthropic has downplayed the severity of jailbreaking risks, the administration has moved beyond debating their significance and now views the problem as Anthropic's responsibility to solve. Officials are pushing the company to adopt more proactive security testing for all frontier models and to self-report vulnerabilities before public release. However, the feasibility of what the White House is asking remains deeply contested among independent cybersecurity experts.

Security researchers increasingly argue that AI guardrails are merely a stopgap, suggesting that skilled users and advanced AI systems will inevitably find ways to bypass any safeguards. This raises a fundamental question: whether completely blocking jailbreaks is a realistic goal, or whether jailbreaks are a risk that must be managed alongside other security measures.

▸Cybersecurity experts question whether preventing jailbreaks entirely is even technically possible, suggesting guardrails are inherently bypassable
▸The dispute reflects growing tension between government AI safety demands and industry views on the technical feasibility and scope of those demands


POLICY & REGULATION | Anthropic | 2026-06-17
