The Future of Word Error Rate (WER)

Company

Speechmatics

Date Published

Oct. 25, 2022

Author

John Hughes

Word count

1631

Language

English

Hacker News points

None

URL

www.speechmatics.com/company/articles-and-news/the-future-of-word-error-rate

Summary

The NER model is an alternative to Word Error Rate (WER) for evaluating speech-to-text systems. It assigns each error a penalty based on its severity: minor errors are easy to read through, standard errors disrupt the flow, and serious errors change the meaning of the text. Because the NER model relies on human judgment to label errors, it is subjective and difficult to automate, whereas WER follows a fixed set of rules that can be fully automated. To close this gap, researchers have proposed approaches based on large language models (LLMs), using prompting techniques such as few-shot learning and chain-of-thought reasoning. LLMs have driven significant advances in natural language processing tasks, including reading comprehension and question answering. A new method uses a 2-shot prompt to automate the calculation of NER without human judgment, and it may eventually replace WER for evaluating quality in Automatic Speech Recognition (ASR) systems.
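
To make the contrast in the summary concrete, the sketch below shows why WER is easy to automate and where a severity-based score needs labels. It is an illustration under stated assumptions, not the article's implementation: `word_error_rate` is the standard word-level edit distance divided by the reference length, while `severity_weighted_score`, the `SEVERITY_PENALTY` table, and its 0.25 / 0.5 / 1.0 weights are hypothetical names and values chosen here for illustration (the summary does not state the weights). The severity labels are taken as input, since producing them is the step that needs a human or an LLM.

```python
from typing import Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Assumed penalty weights per severity level (illustrative only).
SEVERITY_PENALTY = {"minor": 0.25, "standard": 0.5, "serious": 1.0}


def severity_weighted_score(num_words: int, error_labels: Sequence[str]) -> float:
    """Percentage score after subtracting severity-weighted penalties.

    error_labels holds one severity label per error; assigning those labels
    is the judgment step that WER does not need.
    """
    if num_words <= 0:
        raise ValueError("num_words must be positive")
    total_penalty = sum(SEVERITY_PENALTY[label] for label in error_labels)
    return (num_words - total_penalty) / num_words * 100


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
    print(severity_weighted_score(100, ["minor", "standard", "serious"]))
```

WER treats every word error the same, so "a" for "the" costs as much as an error that reverses the meaning of a sentence. The weighted score separates those cases, but only once each error has been labeled.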