
Raw WebSocket Voice Agent with AssemblyAI's Voice Agent API

Blog post from AssemblyAI

Post Details
Company: AssemblyAI
Date Published:
Author: Kelsey Foster
Word Count: 1,772
Language: English
Hacker News Points: -
Summary

This tutorial shows how to build a voice agent directly on the raw WebSocket protocol of AssemblyAI's Voice Agent API, for developers who want complete control over voice agent events without an intermediary SDK. It covers handling every server event, processing partial and final transcripts, managing tool calls, and keeping a session going across brief disconnections. The guide lays out the protocol's structure, including the client-to-server and server-to-client event exchanges, and gives practical advice on starting an agent, handling errors, and troubleshooting common issues such as session timeouts or audio quality problems. It also explains the tool-calling pattern, why accumulating results matters, and how to resume a session, making it a useful reference for embedding voice agents into custom applications.
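
The event handling the summary describes can be sketched as a small client-side dispatcher. This is a minimal illustration, not AssemblyAI's actual protocol: every event name (`session.started`, `transcript.partial`, `transcript.final`, `tool.call`, `tool.results`, `session.resume`) and every field name below is a hypothetical placeholder, and the real names should be taken from the Voice Agent API documentation. The sketch has no network code; it only shows how partial transcripts might be overwritten, final transcripts appended, tool results accumulated before being sent back, and a session ID retained for resuming.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class AgentSession:
    """Client-side state for one session (hypothetical shape, not the real API)."""
    session_id: Optional[str] = None  # kept so a dropped connection can resume
    partial: str = ""                 # latest partial transcript, overwritten on update
    finals: List[str] = field(default_factory=list)  # final transcripts, in order
    tool_results: List[Dict[str, Any]] = field(default_factory=list)  # pending results
    errors: List[str] = field(default_factory=list)


def handle_event(session: AgentSession, raw: str,
                 tools: Dict[str, Callable[..., Any]]) -> None:
    """Dispatch one server-to-client JSON message. Event names are assumptions."""
    event = json.loads(raw)
    kind = event.get("type")
    if kind == "session.started":
        session.session_id = event["session_id"]
    elif kind == "transcript.partial":
        session.partial = event["text"]      # partials replace each other
    elif kind == "transcript.final":
        session.finals.append(event["text"])
        session.partial = ""                 # the final supersedes the partial
    elif kind == "tool.call":
        fn = tools.get(event["name"])
        if fn is None:
            result: Any = {"error": "unknown tool: " + event["name"]}
        else:
            result = fn(**event.get("arguments", {}))
        # Accumulate instead of replying immediately, so several calls
        # can be answered in a single client message.
        session.tool_results.append({"call_id": event["call_id"], "result": result})
    elif kind == "error":
        session.errors.append(event.get("message", "unknown error"))
    # Unrecognized event types are ignored so newer server events do not crash the client.


def flush_tool_results(session: AgentSession) -> Optional[str]:
    """Build one client-to-server message with all pending tool results, then clear them."""
    if not session.tool_results:
        return None
    message = json.dumps({"type": "tool.results", "results": session.tool_results})
    session.tool_results = []
    return message


def resume_message(session: AgentSession) -> Optional[str]:
    """Build a resume request for a reconnect, or None if no session was ever started."""
    if session.session_id is None:
        return None
    return json.dumps({"type": "session.resume", "session_id": session.session_id})
```

In a real client, `handle_event` would be called for each text frame received on the WebSocket, and the strings returned by `flush_tool_results` and `resume_message` would be sent over the same connection. Keeping the state in a plain dataclass, separate from the socket, means it survives a reconnect unchanged.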