
Large Vocabulary Speech Recognition Demystified

Blog post from Deepgram

Post Details
Company: Deepgram
Author: Jose Nicholas Francisco
Word Count: 2,693
Language: English
Summary

In production, the hard part of large vocabulary speech recognition (LVSR) is not the size of a fixed dictionary but the density of out-of-vocabulary (OOV) terms: specialized words such as drug names, product codes, and legal jargon are the ones most often mistranscribed. Keyterm Prompting addresses this for small, stable term sets by adjusting the model's decoding to favor specific terms, which yields immediate gains without retraining. It has limits, though: when the list becomes too large or too ambiguous, the risk of force-fitting errors rises. Once those limits are reached, custom model training is recommended. It integrates the domain vocabulary into the model's learned representations, which is more robust and can improve accuracy significantly, but it requires audio data and a longer timeline. The choice between Keyterm Prompting and custom training should follow the size and specificity of the domain vocabulary and the operational constraints of each deployment.
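
The two quantities the summary leans on, OOV density and term-list size, can be made concrete with a small sketch. Everything here is an illustrative assumption rather than anything taken from the post or from Deepgram's API: the function names `oov_density` and `choose_approach` are hypothetical, the tokenizer is a crude regular expression, and the `max_keyterms` default of 100 is a placeholder, not a vendor limit.

```python
import re
from typing import Iterable, Set


def oov_density(transcript: str, vocabulary: Set[str]) -> float:
    """Return the fraction of word tokens that are missing from `vocabulary`.

    Tokenization is a simple lowercase regex split (an assumption made for
    this sketch); a real pipeline would reuse the recognizer's own tokenizer.
    """
    tokens = re.findall(r"[a-z0-9']+", transcript.lower())
    if not tokens:
        return 0.0
    oov_count = sum(1 for token in tokens if token not in vocabulary)
    return oov_count / len(tokens)


def choose_approach(domain_terms: Iterable[str], max_keyterms: int = 100) -> str:
    """Pick an approach from the size of the de-duplicated term list.

    `max_keyterms` is a hypothetical cutoff standing in for "the list has
    become too large"; tune it against real error rates, not this default.
    """
    unique_terms = {term.strip().lower() for term in domain_terms if term.strip()}
    if len(unique_terms) <= max_keyterms:
        return "keyterm_prompting"
    return "custom_training"
```

For example, `oov_density("the patient takes metoprolol daily", {"the", "patient", "takes", "daily"})` counts one unknown token out of five. The size check alone does not capture ambiguity between terms or whether training audio is available, both of which the summary names as factors in the decision.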