Company
Date Published
Author
Speechmatics Team
Word count
1408
Language
English
Hacker News points
30

Summary

Speechmatics describes Ursa as the world's most accurate speech-to-text system: it reaches human-level transcription accuracy on the Kincaid46 dataset and leads Microsoft by 22% in relative accuracy. The system is trained with self-supervised learning and runs inference on GPUs, with models scaled up to 2 billion parameters, which significantly improves accuracy and reduces training time. Ursa's enhanced model outperforms other vendors, including Google, Amazon, and Whisper, and delivers a 35% relative improvement over the previous release. The system also offers translation between English and 34 languages, achieving higher BLEU scores than Google in downstream tasks. The post positions Ursa as a major step forward for speech technology and as a new standard for speech recognition and translation.