Claude Fable 5 Delisted; Anthropic Introduces OrcaRouter Multi-Model Routing System
Key Takeaways
- ▸Claude Fable 5 has been delisted; OrcaRouter provides a multi-model panel approach achieving Fable 5–equivalent capability across three performance tiers
- ▸Intelligent routing gates parallel execution only to real work, avoiding panel overhead for simple queries and protecting SLA and cost
- ▸best_of_n selection strategy returns a real model's answer verbatim; synthesize fusion merges responses at N+1 billing cost
Summary
Anthropic has introduced OrcaRouter, a multi-model routing system designed to replace the delisting of Claude Fable 5 by running multiple frontier models in parallel and selecting or synthesizing the strongest response. Rather than building a larger single model, the solution orchestrates a panel of models—including Claude Opus 4.8, GPT-5.5, and Gemini 3.1 Pro—behind a unified API that intelligently determines when to activate the parallel execution.
The system ships with three pre-built "Fusion" tiers targeting different performance-cost tradeoffs: fusion for max intelligence (1M token context), fusion-mini for balanced inference, and fusion-flash for speed and cost (200k token context). OrcaRouter includes a Routing DSL that gates expensive panel fan-out to genuine complex work—code, agent tasks, tool-using requests, and high-difficulty prompts—while routing simple queries to single-model inference to protect latency and cost. An internal LLM judge decides routing based on task classification and difficulty scoring.
The system supports two arbitration strategies: best_of_n, which runs a judge to pick the single strongest candidate from the panel and serve it verbatim, and synthesize, an aggregator-based fusion that merges all responses into a new synthesized answer. Both strategies present OpenAI-compatible endpoints, enabling drop-in replacement from Fable 5 without application refactoring.
- Routing DSL allows custom task classification rules; built-in routers use code/agent/tool-use/difficulty heuristics to decide when to fan out
- OpenAI-compatible API enables zero-code migration from Fable 5 and other single-model deployments
Editorial Opinion
OrcaRouter's panel-based approach sidesteps the risk of a single point of failure while leveraging existing frontier models—pragmatic engineering over model scaling. The best_of_n selection model is particularly elegant: returning a real model's answer eliminates the quality degradation that often plagues merged outputs. The tradeoff is complexity: teams must now reason about routing rules, judge latency (120s max), and N+1 billing for synthesis, making it less suitable for latency-critical workloads or applications already optimized for single-model inference.


