
Build a voice agent with LiveKit and AssemblyAI’s Voice Agent API

Blog post from AssemblyAI

Post Details
Company: AssemblyAI
Date Published:
Author: Kelsey Foster
Word Count: 1,970
Language: English
Hacker News Points: -
Summary

This tutorial shows how to build a multi-user, browser-ready voice agent in Python with LiveKit and AssemblyAI's Voice Agent API. LiveKit handles the WebRTC transport, while the Voice Agent API runs the AI pipeline (speech-to-text, language model, and text-to-speech) over a single server-side WebSocket. A worker sits between the two systems, so there is no need to write a separate orchestration layer for STT, LLM, and TTS or to build a WebRTC stack. Setup consists of cloning a repository, setting environment variables, and running the worker. The tutorial also covers handling barge-in, tuning turn detection, and scaling to multiple participants, and it offers troubleshooting tips for common issues such as audio quality and tool calling within a LiveKit room.
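To make the worker's intermediary role and the barge-in behavior more concrete, here is a minimal sketch that uses only the Python standard library. It is not the tutorial's code and calls neither the LiveKit SDK nor the AssemblyAI API: in-memory queues stand in for the room audio and the WebSocket, and every name in it (AgentPlayback, uplink, downlink, and the event names agent_audio and user_speech_started) is a hypothetical placeholder. The real Voice Agent API message schema may differ.

```python
from __future__ import annotations

import asyncio
from collections import deque

# Hypothetical event names; the real Voice Agent API schema is not given in
# the summary above and may differ.
AGENT_AUDIO = "agent_audio"
USER_SPEECH_STARTED = "user_speech_started"


class AgentPlayback:
    """Agent audio frames queued for publishing into the room (illustrative only)."""

    def __init__(self) -> None:
        self._frames: deque[bytes] = deque()

    def enqueue(self, frame: bytes) -> None:
        self._frames.append(frame)

    def pending(self) -> list[bytes]:
        return list(self._frames)

    def interrupt(self) -> int:
        """Barge-in: discard queued agent audio and return how many frames were dropped."""
        dropped = len(self._frames)
        self._frames.clear()
        return dropped


async def uplink(mic: asyncio.Queue, socket_out: list[bytes]) -> None:
    """Forward participant audio toward the AI pipeline until a None sentinel arrives."""
    while (frame := await mic.get()) is not None:
        socket_out.append(frame)


async def downlink(events: asyncio.Queue, playback: AgentPlayback) -> None:
    """Apply events coming back from the AI pipeline until a None sentinel arrives."""
    while (event := await events.get()) is not None:
        kind, payload = event
        if kind == AGENT_AUDIO:
            playback.enqueue(payload)
        elif kind == USER_SPEECH_STARTED:
            playback.interrupt()  # the user spoke over the agent


async def run_demo() -> tuple[list[bytes], list[bytes]]:
    """Drive both directions with canned data and return (audio sent, audio still queued)."""
    mic: asyncio.Queue = asyncio.Queue()
    events: asyncio.Queue = asyncio.Queue()
    for item in (b"mic-1", b"mic-2", None):
        mic.put_nowait(item)
    for item in (
        (AGENT_AUDIO, b"tts-1"),
        (AGENT_AUDIO, b"tts-2"),
        (USER_SPEECH_STARTED, None),  # barge-in drops tts-1 and tts-2
        (AGENT_AUDIO, b"tts-3"),
        None,
    ):
        events.put_nowait(item)

    socket_out: list[bytes] = []
    playback = AgentPlayback()
    await asyncio.gather(uplink(mic, socket_out), downlink(events, playback))
    return socket_out, playback.pending()


if __name__ == "__main__":
    print(asyncio.run(run_demo()))
```

In a real worker, the mic queue would presumably be fed by the participant's audio track and socket_out replaced by writes to the WebSocket. The sketch only shows the shape of the two relay loops and why clearing queued agent audio at the start of user speech produces barge-in.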