Company
Date Published
Author
Eric Bezzam, Steven Zheng, Eustache Le Bihan, and Vaibhav Srivastav
Word count
936
Language
-
Hacker News points
None

Summary

As of November 21, 2025, the Open ASR Leaderboard compares more than 60 automatic speech recognition (ASR) models from 18 organizations across 11 datasets. Recently added multilingual and long-form transcription tracks reflect growing complexity and specialization in the field. For English transcription, models that pair a Conformer encoder with a large language model (LLM) decoder achieve the best accuracy but run more slowly, while simpler CTC and TDT decoders deliver higher throughput at the cost of slightly higher error rates. Multilingual models such as OpenAI's Whisper Large v3 provide strong baselines but often give up some single-language performance, which illustrates the tradeoff between specialization and generalization. On long-form audio, closed-source systems currently outperform open ones, leaving significant room for innovation in the open-source community. The leaderboard supports transparent model comparison, encourages contributions to multilingual ASR, and serves as a community-driven benchmark and a reference point for other language-specific leaderboards.
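
The speed-accuracy tradeoff described above can be made concrete with a minimal sketch. It assumes that "error rate" means word error rate (WER) and that throughput is expressed as an inverse real-time factor (seconds of audio processed per second of compute). The function names are hypothetical, and the sketch splits on whitespace without the text normalization a real benchmark would likely apply first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


def inverse_real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio transcribed per second of compute (higher is faster)."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds
```

Under these assumptions, a model that drops one word from a six-word reference scores a WER of 1/6, and a model that transcribes an hour of audio in 36 seconds has an inverse real-time factor of 100. Comparing models on both numbers together is what exposes the tradeoff between LLM-based decoders and CTC or TDT decoders.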