BotBeat

OPEN SOURCE · Cohere · 2026-05-02

Cohere Releases Cohere-Transcribe: Open-Source 2B Speech Recognition Model Achieving #1 Performance on ASR Leaderboard

Key Takeaways

  • Cohere releases an open-source 2B-parameter speech recognition model on Hugging Face under the Apache 2.0 license
  • Achieves the #1 ranking on the Hugging Face Open ASR Leaderboard for English, outperforming proprietary competitors
  • Supports 14 enterprise-critical languages, with accuracy reported to match or exceed existing open-source models in each
Source: Hacker News, https://huggingface.co/blog/CohereLabs/cohere-transcribe-03-2026-release

Summary

Cohere has open-sourced cohere-transcribe-03-2026, a 2B-parameter speech recognition model available on Hugging Face under the Apache 2.0 license. Trained from scratch on 0.5M hours of curated audio-transcript pairs, the model is reported to reach state-of-the-art accuracy while delivering offline throughput three times higher than similarly sized competitor models.

The model holds the #1 position on the Hugging Face Open ASR Leaderboard for English, ahead of both proprietary and open-source alternatives. Across its 14 supported languages (English, German, French, Italian, Spanish, Portuguese, Greek, Dutch, Polish, Arabic, Vietnamese, Mandarin Chinese, Japanese, and Korean), cohere-transcribe is reported to match or exceed all existing open-source models.
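For readers unfamiliar with how ASR leaderboards score models: the Open ASR Leaderboard ranks systems by word error rate (WER), the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a minimal illustration of that metric, not the leaderboard's own evaluation code; real evaluations also normalize text (casing, punctuation, numerals) before scoring, which this example skips. The function name and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative sentences only: one substituted word out of six.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"WER: {wer:.3f}")
```

Lower is better: a WER of 0 means the transcript matches the reference word for word, and the value can exceed 1 when the hypothesis contains many inserted words.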

Architecturally, the model is a 2B-parameter encoder-decoder transformer with a Fast-Conformer encoder. Over 90% of the parameters sit in the encoder, leaving a lightweight decoder. Because the encoder runs once per audio input while the decoder runs once per generated token, this design minimizes autoregressive inference compute and enables faster serving than models built on pre-trained text LLMs. Cohere partnered with vLLM to enable production-grade serving through an open-source stack, emphasizing deployment readiness alongside benchmark performance.
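A rough back-of-envelope calculation shows why the parameter split matters. The sketch below assumes the common rule of thumb of about 2 FLOPs per decoder parameter per generated token, and it compares the article's "over 90% in the encoder" split against a hypothetical 2B model with the opposite split. The 90% figure is treated as a lower bound, the comparison model and the 100-token transcript length are invented for illustration, and the estimate ignores cross-attention over encoder states, KV caching, and memory bandwidth.

```python
TOTAL_PARAMS = 2_000_000_000  # 2B parameters, per the article


def decoder_params(total_params: int, encoder_percent: int) -> int:
    """Parameters left for the decoder given the encoder's share in percent."""
    return total_params * (100 - encoder_percent) // 100


def decode_flops(decoder_param_count: int, output_tokens: int) -> int:
    """Approximate autoregressive decoding cost.

    Assumes ~2 FLOPs per decoder parameter per generated token,
    a rule of thumb rather than a measured figure.
    """
    return 2 * decoder_param_count * output_tokens


if __name__ == "__main__":
    tokens = 100  # hypothetical transcript length

    # Article's split: at least 90% of parameters in the encoder.
    light = decode_flops(decoder_params(TOTAL_PARAMS, 90), tokens)

    # Hypothetical opposite split: 10% encoder, 90% decoder.
    heavy = decode_flops(decoder_params(TOTAL_PARAMS, 10), tokens)

    print(f"encoder-heavy decode cost: {light:.2e} FLOPs")
    print(f"decoder-heavy decode cost: {heavy:.2e} FLOPs")
    print(f"ratio: {heavy / light:.1f}x")
```

Under these assumptions the sequential, token-by-token part of inference is several times cheaper for the encoder-heavy split, while the encoder's larger cost is paid once per audio input and parallelizes well.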

The release is Cohere's first venture into audio AI and signals the company's diversification beyond large language models. A Hugging Face Space is available for testing the model, and the open-source release gives developers and researchers access to high-quality speech recognition.

  • Delivers 3x higher offline throughput than similarly sized models through an encoder-heavy architecture
  • Production-ready, with vLLM integration for efficient enterprise deployment and inference

Editorial Opinion

Cohere's open-source speech recognition release marks a strategic expansion beyond language models with an efficient, competitive offering. By pairing top-tier benchmark performance with production efficiency and multilingual support, Cohere demonstrates that specialized, well-engineered models can outcompete larger alternatives, a significant statement in an industry dominated by proprietary ASR services. The focus on practical deployment (vLLM integration) rather than on academic metrics alone shows Cohere's commitment to building AI tools that can be deployed in the real world. This release will likely accelerate adoption of open-source transcription alternatives and give enterprises a high-quality, cost-effective option.

Speech & Audio · Machine Learning · Deep Learning · Open Source
