Understanding Speech-to-Noise Ratio and Its Impact on Your App

Post Details

Company

Symbl.ai

Date Published

Jan. 6, 2021

Author

Team Symbl

Word Count

1,483

Language

English

Hacker News Points

-

Source URL

symbl.ai/developers/blog/understanding-speech-to-noise-ratio-and-its-impact-on-your-app

Summary

The speech-to-noise ratio (SNR) measures the level of recognizable speech in an audio stream relative to unwanted noise; a low SNR can degrade system performance. Noise is inconvenient because it is random and unpredictable, with no pattern and no constant frequency or amplitude, but there are measures to reduce its impact. Calculating SNR means comparing the level of recognizable speech with the level of unwanted noise, using a formula such as `SNR_dB = 20 * log10(S_rms / N_rms)`, or estimating it from a single stream of audio. Both external and internal sources of noise affect SNR: external noise is harder to eliminate but manageable, while internal noise can be quantified and reduced through proper receiver design. What counts as an acceptable SNR for speech recognition depends on the type of noise and the application, with certain SNR values taken to indicate clean or noisy conditions. A low SNR decreases accuracy in speech recognition systems, limits their operating range, and affects receiver sensitivity. Signal Compensation and Noise Injection Theory are two methods for dealing with SNR early on: the first removes or reduces noise effects in preprocessing stages, and the second intentionally injects moderate noise into training data so that deep neural network models generalize better. When looking for an API platform provider, it is essential to prioritize a robust speech recognition system with a higher SNR.
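
The RMS-based formula in the summary, and the idea of injecting noise into training data, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the article's or Symbl.ai's implementation: the function names `rms`, `snr_db`, and `inject_noise` are hypothetical, the "speech" is a synthetic sine tone standing in for a real recording, and the speech and noise are assumed to be available as separate sample lists (which is rarely true outside of a test setup).

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a list of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(speech, noise):
    """SNR in decibels: 20 * log10(S_rms / N_rms)."""
    s_rms, n_rms = rms(speech), rms(noise)
    if s_rms == 0 or n_rms == 0:
        raise ValueError("SNR is undefined when speech or noise is silent")
    return 20.0 * math.log10(s_rms / n_rms)


def inject_noise(speech, noise, target_snr_db):
    """Scale `noise` so the mix has the requested SNR, then add it to `speech`.

    Returns the noisy signal and the scaled noise that was added.
    """
    # Solve target = 20 * log10(S_rms / (scale * N_rms)) for scale.
    scale = rms(speech) / (rms(noise) * 10 ** (target_snr_db / 20.0))
    scaled_noise = [n * scale for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled_noise)]
    return noisy, scaled_noise


# Synthetic stand-ins (assumptions): a 220 Hz tone as "speech", Gaussian noise.
random.seed(0)
sample_rate = 16000
speech = [0.5 * math.sin(2 * math.pi * 220 * i / sample_rate)
          for i in range(sample_rate)]
noise = [random.gauss(0.0, 1.0) for _ in range(sample_rate)]

noisy, scaled_noise = inject_noise(speech, noise, target_snr_db=10.0)
print(round(snr_db(speech, scaled_noise), 2))  # SNR of the mix, close to 10 dB
```

Scaling the noise rather than the speech keeps the speech level unchanged, so the same clean sample can be reused at several target SNR values when building a training set.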