Company
Date Published
Author
Benedetta Cevoli
Word count
758
Language
English
Hacker News points
None

Summary

As speech-to-text innovators, we are building systems and models that adapt to a broader range of voices, reducing the need for expensive, error-prone, and bias-inducing human intervention and labeled data. Speech recognition software cannot keep pace unless it evolves like the languages it processes. To test this, we compared our Autonomous Speech Recognition (ASR) engine with YouTube's automated captioning system across 24 videos. The results were conclusive: our ASR achieved 90% or higher accuracy in every case. As the world becomes more globalized and online content grows, speech-to-text software must be updated continually to keep up with evolving language and new words, which makes accurate transcription a vital tool for accessibility and education. Our ASR uses self-supervised learning, which lets it learn from unlabeled data without human intervention and build rich representations of speech from a wide range of voices, extending its accuracy across them. Because accurate transcription in emergency situations could help save lives, our software is poised to play a critical role in shaping the future of speech-to-text technology.
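
The summary does not say how accuracy was measured in the comparison. A common convention for speech-to-text is to report accuracy as one minus the word error rate (WER), where WER is the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The sketch below illustrates that convention only; the function names `word_error_rate` and `accuracy` and the sample transcripts are hypothetical and are not taken from the comparison itself.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: simple lowercasing and whitespace tokenization. Real
    evaluations usually apply more careful text normalization first.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def accuracy(reference: str, hypothesis: str) -> float:
    """Accuracy as 1 - WER, floored at 0 because WER can exceed 1."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))


# Illustrative transcripts only: one substituted word out of ten.
reference = "please send an ambulance to ten main street right now"
hypothesis = "please send an ambulance to tin main street right now"
print(f"accuracy: {accuracy(reference, hypothesis):.0%}")
```

Under this convention, a single wrong word in a ten-word utterance already lowers accuracy to 90%, which is why small differences in accuracy matter for short, critical messages.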