Build an Agora voice agent with AssemblyAI's Voice Agent API

Blog post from AssemblyAI

Post Details
Company: AssemblyAI
Date Published:
Author: Kelsey Foster
Word Count: 1,588
Language: English
Hacker News Points: -
Summary

This guide walks through building a server-side voice agent that connects AssemblyAI's Voice Agent API to Agora's RTC platform, so that an AI voice agent can be deployed without separate STT, LLM, or TTS services. Agora provides low-latency WebRTC transport, while the Voice Agent API handles speech recognition, reasoning, and text-to-speech over a single WebSocket connection. The tutorial covers the technical setup: configuring API keys, running the bot, and resampling audio between the different PCM sample rates used by Agora and AssemblyAI. It highlights the Voice Agent API's neural turn detection, barge-in, and tool calling, and addresses common troubleshooting scenarios involving the Agora SDK and API connectivity. It also notes known limitations of the agora-python-server-sdk, including its beta status and lack of Windows support, and recommends running the bot on Linux or macOS.
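The resampling step mentioned in the summary can be illustrated with a minimal sketch. This is not the tutorial's code: the function name `resample_pcm16` is hypothetical, the audio is assumed to be mono 16-bit little-endian PCM, and the 16 kHz and 24 kHz rates in the usage lines are placeholders, since the summary does not state the actual rates on either side. It uses plain linear interpolation from the standard library, which is enough to show the idea but applies no anti-aliasing filter when downsampling.

```python
import sys
from array import array


def resample_pcm16(pcm: bytes, src_rate: int, dst_rate: int) -> bytes:
    """Resample mono 16-bit little-endian PCM with linear interpolation.

    Hypothetical helper for illustration; a production bot would more
    likely use a filtered resampler from an audio library.
    """
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")

    samples = array("h")
    samples.frombytes(pcm)  # raises ValueError if len(pcm) is odd
    if sys.byteorder == "big":
        samples.byteswap()  # input is little-endian by assumption

    if src_rate == dst_rate or len(samples) == 0:
        return pcm

    n_out = len(samples) * dst_rate // src_rate
    out = array("h", bytes(2 * n_out))  # zero-filled output buffer
    last = len(samples) - 1

    for i in range(n_out):
        # Position of output sample i on the input time axis.
        pos = i * src_rate / dst_rate
        left = int(pos)
        right = min(left + 1, last)
        frac = pos - left
        # Blend the two neighbouring input samples.
        out[i] = round(samples[left] + (samples[right] - samples[left]) * frac)

    if sys.byteorder == "big":
        out.byteswap()
    return out.tobytes()


# Usage with assumed rates: 10 ms of 16 kHz audio becomes 10 ms at 24 kHz.
frame_16k = bytes(2 * 160)  # 160 silent samples
frame_24k = resample_pcm16(frame_16k, 16000, 24000)
```

In a bridge like the one described, a function of this shape would sit on both audio paths: once on frames received from the RTC channel before they are sent to the agent, and once on the agent's audio before it is published back to the channel.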