Company:
Date Published:
Author: Bethan Thomas
Word count: 1259
Language: English
Hacker News points: None

Summary

We have demonstrated that scaling self-supervised learning significantly improves the sample efficiency of automatic speech recognition (ASR) models. By pretraining on large amounts of unlabeled audio, these models learn rich representations of the input and then reach strong performance with far fewer labeled examples. This is particularly valuable in low-resource settings, where labeled speech is scarce and sample efficiency is essential. Our experiments show that scaling self-supervised learning yields greater sample efficiency and generally better accuracy, even when the amount of labeled training data is reduced by several orders of magnitude. These results have significant implications for ASR systems, which can reach strong accuracy with a fraction of the labeled hours typically required.
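As a rough illustration of the recipe the summary describes, the sketch below fine-tunes a publicly available self-supervised speech encoder on a tiny labeled example using Hugging Face transformers. The wav2vec 2.0 checkpoint, the random waveform, and the placeholder transcript are all assumptions made for illustration, not details from the post.

```python
# A minimal sketch of the pretrain-then-fine-tune recipe summarized above.
# Checkpoint name, dummy audio, and dummy transcript are illustrative
# assumptions, not the post's own models or data.
import torch
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

# Load an encoder whose representations were learned with self-supervised
# pretraining on unlabeled audio. A true low-resource experiment would start
# from a self-supervised-only checkpoint and add a fresh CTC head.
checkpoint = "facebook/wav2vec2-base-960h"
processor = Wav2Vec2Processor.from_pretrained(checkpoint)
model = Wav2Vec2ForCTC.from_pretrained(checkpoint)

# Freeze the convolutional feature extractor so that fine-tuning on a small
# labeled set only updates the transformer layers and the CTC head.
model.freeze_feature_encoder()

# One fine-tuning step on a tiny labeled example: random audio stands in
# for real speech, and the transcript is a placeholder.
waveform = torch.randn(16000)  # 1 second of 16 kHz audio
inputs = processor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
labels = processor.tokenizer("HELLO WORLD", return_tensors="pt").input_ids

loss = model(input_values=inputs.input_values, labels=labels).loss
loss.backward()  # gradients for a single supervised update
```

The point of the sketch is the division of labor: the expensive representation learning happens once on unlabeled audio, while the supervised stage only has to adapt those representations, which is why it can get by with so little labeled data.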