Cohere Launches Transcribe: Open-Source Speech Recognition Model Tops HuggingFace Leaderboard
Key Takeaways
- Cohere Transcribe ranks #1 on the HuggingFace Open ASR Leaderboard with a 5.42% word error rate (WER), ahead of Whisper Large v3, ElevenLabs Scribe v2, and other dedicated ASR models
- The open-source model supports 14 languages, gives teams full control over their infrastructure, and is optimized for production deployment with best-in-class inference efficiency
- Performance was validated on both standardized benchmarks and human evaluations, supporting its use in enterprise settings such as meeting transcription, analytics, and real-time customer support
Summary
Cohere has announced Transcribe, an open-source automatic speech recognition (ASR) model that achieves state-of-the-art accuracy while maintaining production-ready efficiency. The model currently ranks #1 on HuggingFace's Open ASR Leaderboard with a word error rate (WER) of 5.42%, outperforming alternatives including OpenAI's open-weight Whisper Large v3 and the proprietary ElevenLabs Scribe v2. Cohere Transcribe supports 14 languages across European and Asia-Pacific regions and is designed specifically for enterprise AI workflows, from meeting transcription to real-time customer support agents.
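For readers unfamiliar with the metric, word error rate counts the word-level substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of reference words. The following is a minimal Python sketch of that calculation, not the leaderboard's evaluation code: the function name `word_error_rate` is illustrative, and the lowercase-and-split normalization is a simplifying assumption (real evaluations typically apply more careful text normalization).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: text is normalized only by lowercasing and splitting on
    whitespace, which is simpler than what real ASR benchmarks do.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from transcript
                curr[j - 1] + 1,     # insertion: extra word in transcript
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Under this definition, a WER of 5.42% corresponds to roughly 5.4 word-level errors per 100 reference words.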
The model represents a deliberate push to balance accuracy with practical production constraints. While delivering best-in-class transcription fidelity, Cohere Transcribe keeps an inference footprint manageable enough for GPU and local deployment, and achieves high throughput as measured by the inverse real-time factor (RTFx), the amount of audio transcribed per unit of processing time. The model is available for download as open-source weights and is also accessible through Cohere's Model Vault, a secure managed inference platform. Independent human evaluations indicate that the benchmark performance carries over to real-world enterprise settings, including multi-speaker environments, diverse accents, and challenging acoustics.
Transcribe is available immediately as an open-source download and through Cohere's managed Model Vault platform, marking Cohere's entry into enterprise speech recognition workflows.
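The throughput figure cited in the summary, RTFx, is commonly computed as seconds of audio divided by seconds of processing time, so higher is better. The sketch below shows that arithmetic under that assumed definition; `rtfx` and `measure_rtfx` are hypothetical helper names, and the `transcribe` argument stands in for whatever inference call a deployment actually uses rather than any real Cohere API.

```python
import time
from typing import Any, Callable


def rtfx(audio_seconds: float, processing_seconds: float) -> float:
    """Inverse real-time factor: audio duration divided by processing time."""
    if audio_seconds < 0 or processing_seconds <= 0:
        raise ValueError("audio_seconds must be >= 0 and processing_seconds > 0")
    return audio_seconds / processing_seconds


def measure_rtfx(transcribe: Callable[[Any], Any], audio: Any, audio_seconds: float) -> float:
    """Time one call to a caller-supplied transcribe function and return its RTFx.

    `transcribe` is a placeholder for the caller's own inference function.
    """
    start = time.perf_counter()
    transcribe(audio)
    elapsed = time.perf_counter() - start
    return rtfx(audio_seconds, elapsed)


if __name__ == "__main__":
    # Ten minutes of audio processed in three seconds.
    print(rtfx(600.0, 3.0))
```

An RTFx of 200, for example, would mean ten minutes of audio is transcribed in three seconds. A single timed call is noisy, so a real measurement would average over many files after a warm-up run.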
Editorial Opinion
Cohere's Transcribe represents a meaningful contribution to democratizing high-performance speech recognition for enterprise use. By open-sourcing a model that outperforms proprietary alternatives while staying within practical efficiency constraints, Cohere addresses a real gap in the market for production-ready ASR systems. The focus on real-world performance validation, not just benchmark scores, is particularly commendable and suggests this model could unlock new applications in customer support, compliance, and accessibility workflows across organizations.



