BotBeat

PRODUCT LAUNCH | Anthropic | 2026-06-10

Anthropic's Fable Model Launches with Guardrails Critics Say Are Too Broad

Key Takeaways

  • Critics say Fable's guardrails are overly broad, blocking legitimate cybersecurity work, educational requests, and code reviews on the basis of what appears to be keyword matching rather than actual malicious intent
  • When the guardrails trigger, the model falls back to a less capable Claude version (Opus 4.8), reducing its usefulness for defensive security research and engineering best practices
  • Anthropic's Cyber Verification Program gives approved professionals access to less restricted versions, but everyone else is limited by the public model's aggressive safety restrictions
Source: Hacker News (https://techcrunch.com/2026/06/10/cybersecurity-researchers-arent-happy-about-the-guardrails-on-anthropics-fable/)

Summary

Anthropic released Fable on Tuesday, a limited public version of its cybersecurity-focused model Mythos, aimed at making advanced AI capabilities available to the broader security community. The model includes aggressive guardrails designed to prevent misuse for malware development and biological weapons research. However, prominent cybersecurity researchers have criticized the restrictions as overly broad and counterproductive, reporting that Fable rejects even innocuous requests that are only tangentially related to cybersecurity, such as reading blog posts or reviewing code.

Security researchers suggest Fable's guardrails are keyword-based and indiscriminate, triggering on any mention of cybersecurity-related terminology regardless of context. Valentina "Chompie" Palmiotti from IBM X-Force and other experts expressed frustration that the model frequently falls back to Claude Opus 4.8 due to overzealous safety filtering. Matt Suiche, a cybersecurity veteran at AI startup Tolmo, acknowledged the conservative approach is understandable for an initial public release but expects Anthropic to evolve the guardrails based on community feedback.

Anthropic launched Mythos in April with restricted access through Project Glasswing, later expanding access to hundreds of organizations in 15 countries. The company offers a Cyber Verification Program for approved professionals seeking fewer limitations on model usage, similar to OpenAI's Trusted Access for Cyber program.

Editorial Opinion

Anthropic faces a genuine safety-versus-usability tradeoff with Fable. The guardrails reflect legitimate concerns about model misuse, but if the filtering is keyword-based, as researchers suspect, it is too blunt an instrument for nuanced security work. The feedback from the security community, a constituency Anthropic should want to support, suggests the current restrictions do more harm than good. Anthropic should use this moment to develop context-aware safeguards that distinguish defensive security practice from potential malicious use.

Generative AI | Cybersecurity | Ethics & Bias | AI Safety & Alignment
