LiveKit noise cancellation: what it is and how it works
Blog post from LiveKit
LiveKit's noise cancellation system improves audio quality for voice AI, livestreaming, and telephony by reducing background noise and competing speech, both of which can degrade transcription accuracy and interfere with turn detection. The platform applies noise cancellation at a different point for each application: server-side processing for voice AI agents, client-side processing for livestreaming, and SIP trunk-level processing for telephony.
LiveKit offers two main types of enhanced noise cancellation model. Background noise suppression removes non-speech noise while preserving all speech. Voice isolation emphasizes a primary speaker by suppressing both noise and other voices. These models are provided by Krisp and ai-coustics, and they should not be stacked on the same audio pathway, because doing so can produce unexpected results. Echo cancellation is a separate feature: it prevents speaker output from looping back into the microphone, and it can be used alongside noise cancellation.
For most voice AI applications, LiveKit recommends applying noise cancellation inside the agent and choosing the model according to whether the scenario involves a single speaker or multiple speakers with environmental noise. Pricing depends on the model: voice isolation is metered, while background noise suppression is included with LiveKit Cloud.
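The guidance above can be read as a small set of rules: pick the processing point from the application, pick exactly one model per audio pathway, and treat echo cancellation as independent. The following is a minimal sketch of those rules in plain Python. It does not use the LiveKit SDK, and every name in it (`UseCase`, `Placement`, `ModelType`, `AudioPathPlan`, `plan_audio_path`, `keep_all_voices`) is hypothetical and chosen only for illustration. The model choice is driven by the two definitions given above (keep every voice, or emphasize one speaker), which is an assumption about how the single-speaker versus multi-speaker recommendation plays out.

```python
from dataclasses import dataclass
from enum import Enum


class UseCase(Enum):
    VOICE_AI = "voice AI"
    LIVESTREAM = "livestreaming"
    TELEPHONY = "telephony"


class Placement(Enum):
    SERVER_SIDE_AGENT = "server-side, in the agent"
    CLIENT_SIDE = "client-side"
    SIP_TRUNK = "SIP trunk level"


class ModelType(Enum):
    BACKGROUND_NOISE_SUPPRESSION = "background noise suppression"
    VOICE_ISOLATION = "voice isolation"


# Where noise cancellation runs for each application, as described in the post.
PLACEMENT_BY_USE_CASE = {
    UseCase.VOICE_AI: Placement.SERVER_SIDE_AGENT,
    UseCase.LIVESTREAM: Placement.CLIENT_SIDE,
    UseCase.TELEPHONY: Placement.SIP_TRUNK,
}

# Pricing as summarized in the post: voice isolation is metered,
# background noise suppression is included with LiveKit Cloud.
METERED = {
    ModelType.VOICE_ISOLATION: True,
    ModelType.BACKGROUND_NOISE_SUPPRESSION: False,
}


@dataclass(frozen=True)
class AudioPathPlan:
    """Hypothetical description of one audio pathway (not a LiveKit type)."""

    placement: Placement
    models: tuple
    # Echo cancellation is a separate feature and may run alongside a model.
    echo_cancellation: bool = True

    def __post_init__(self):
        # The post warns against stacking models on the same pathway.
        if len(self.models) > 1:
            raise ValueError("use at most one noise cancellation model per pathway")


def plan_audio_path(use_case: UseCase, keep_all_voices: bool) -> AudioPathPlan:
    """Pick a placement and a single model for one pathway.

    Assumption: if every voice must be preserved, use background noise
    suppression; if one primary speaker should stand out, use voice isolation.
    """
    model = (
        ModelType.BACKGROUND_NOISE_SUPPRESSION
        if keep_all_voices
        else ModelType.VOICE_ISOLATION
    )
    return AudioPathPlan(PLACEMENT_BY_USE_CASE[use_case], (model,))
```

For example, `plan_audio_path(UseCase.VOICE_AI, keep_all_voices=False)` yields a plan that runs voice isolation inside the agent, and `METERED[ModelType.VOICE_ISOLATION]` flags that choice as metered. Constructing an `AudioPathPlan` with two models raises `ValueError`, mirroring the advice not to stack models.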