Introducing Cohere-transcribe: state-of-the-art speech recognition

Post Details

Company

Hugging Face

Date Published

March 26, 2026

Author

Julian Mack, Ekagra Ranjan, Walter Beller-Morales, Bharat Venkitesh, and Pierre Richemond

Word Count

1,485

Company Posts That Month

63

Language

-

Hacker News Points

-

Post removed?

No

Source URL

huggingface.co/blog/CohereLabs/cohere-transcribe-03-2026-release

Summary

Cohere-transcribe-03-2026 is a newly launched 2-billion-parameter speech recognition model from CohereLabs. It targets state-of-the-art accuracy across 14 enterprise-critical languages and is open-sourced on Hugging Face under an Apache 2.0 license. In English, the model outperforms existing proprietary and open-source competitors and takes the top spot on the Hugging Face Open ASR Leaderboard; in the other 13 languages it performs comparably or better. The architecture is an encoder-decoder transformer with cross-attention that dedicates over 90% of its parameters to the encoder, which keeps autoregressive inference compute to a minimum. Cohere-transcribe was trained on 0.5 million hours of curated audio and transcripts, supplemented with synthetic data, and uses a multilingual tokenizer with byte fallback to handle varied language inputs. For production use, a collaboration with vLLM enables efficient, scalable deployment, with up to twice the throughput of similar models. The model has two stated limitations: it is not specifically trained for code-switched audio, and it may need a noise gate or voice activity detection in front of it to avoid errors on non-speech sounds. Cohere-transcribe is part of Cohere's effort to improve audio experiences on its North enterprise platform, and it is available for experimentation through Hugging Face and Cohere's API.
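
The summary notes that a noise gate or voice activity detection may be needed to keep non-speech sounds away from the model. The following is a minimal sketch of one common approach, an energy-threshold gate, and is not taken from the Cohere post. The function names, the 30 ms frame size, the 0.02 RMS threshold, and the minimum run length are all illustrative assumptions; a production system would more likely use a trained voice activity detector.

```python
import math

def frame_rms(samples, frame_len):
    """Return the RMS energy of each non-overlapping frame of `samples`."""
    energies = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return energies

def speech_segments(samples, sample_rate=16000, frame_ms=30,
                    threshold=0.02, min_frames=3):
    """Return (start_sec, end_sec) spans that are likely to contain speech.

    Hypothetical helper: a span is kept when its frame RMS stays at or above
    `threshold` for at least `min_frames` consecutive frames. All default
    values are assumptions chosen for illustration.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    energies = frame_rms(samples, frame_len)
    segments = []
    run_start = None
    # A trailing 0.0 sentinel closes any run that reaches the end of the audio.
    for i, energy in enumerate(energies + [0.0]):
        if energy >= threshold and run_start is None:
            run_start = i
        elif energy < threshold and run_start is not None:
            if i - run_start >= min_frames:
                start_sample = run_start * frame_len
                end_sample = min(i * frame_len, len(samples))
                segments.append((start_sample / sample_rate,
                                 end_sample / sample_rate))
            run_start = None
    return segments
```

Only the returned spans would then be passed on for transcription. Short bursts such as clicks fall below `min_frames` and are dropped, at the cost of possibly clipping very short utterances.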

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Real-time	3	6,457	1,307	242	+28%
LLM	2	6,078	960	218	+18%
Secrets Management	1	1,488	268	99	+7%
