
Add voice to your agent

Blog post from Cloudflare

Post Details
Company: Cloudflare
Date Published:
Author: Sunil Pai and Korinne Alpers
Word Count: 2,438
Language: English
Hacker News Points: -
Summary

Cloudflare has introduced an experimental voice pipeline for its Agents SDK that lets developers add real-time voice to existing agents without adopting a separate voice framework. With the @cloudflare/voice package, users talk to an agent over a single WebSocket connection, and voice turns use the same Durable Object infrastructure and SQLite-backed conversation history as text interactions. The package supports both full conversational voice agents and speech-to-text-only use cases, with built-in voice input and output through Workers AI providers such as Deepgram. The pipeline is provider-agnostic, so developers can mix and match components for their needs, and it works with several transport and telephony options, including WebRTC and Twilio. Keeping processing on Cloudflare's network is intended to reduce latency, and the pipeline also supports dynamic model switching and hooks for intercepting data. Because voice and text share one agent and one conversation history, users can switch between the two within a single multimodal conversation.
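
The provider-agnostic design and the shared history described in the summary can be illustrated with a minimal TypeScript sketch. Every name here (`SpeechToText`, `TextToSpeech`, `Responder`, `MultimodalSession`, and the fake providers) is hypothetical and chosen for illustration; none of it is the @cloudflare/voice API, and the in-memory array only stands in for the SQLite-backed history.

```typescript
// Hypothetical interfaces for illustration only; NOT the @cloudflare/voice API.
interface SpeechToText {
  transcribe(audio: Uint8Array): Promise<string>;
}

interface TextToSpeech {
  synthesize(text: string): Promise<Uint8Array>;
}

interface Message {
  role: "user" | "assistant";
  content: string;
  via: "voice" | "text";
}

// The agent logic: takes the conversation so far and returns a reply.
type Responder = (history: readonly Message[]) => Promise<string>;

class MultimodalSession {
  // In-memory stand-in for the shared conversation history.
  readonly history: Message[] = [];
  private readonly stt: SpeechToText;
  private readonly tts: TextToSpeech;
  private readonly respond: Responder;

  constructor(stt: SpeechToText, tts: TextToSpeech, respond: Responder) {
    this.stt = stt;
    this.tts = tts;
    this.respond = respond;
  }

  // Text turn: goes straight to the responder.
  async handleText(text: string): Promise<string> {
    return this.turn(text, "text");
  }

  // Voice turn: speech-to-text, same responder, then text-to-speech.
  async handleVoice(audio: Uint8Array): Promise<Uint8Array> {
    const transcript = await this.stt.transcribe(audio);
    const reply = await this.turn(transcript, "voice");
    return this.tts.synthesize(reply);
  }

  // Both modalities append to the same history.
  private async turn(content: string, via: "voice" | "text"): Promise<string> {
    this.history.push({ role: "user", content, via });
    const reply = await this.respond(this.history);
    this.history.push({ role: "assistant", content: reply, via });
    return reply;
  }
}

// Fake providers: "audio" is just the text's character codes.
const encode = (text: string): Uint8Array =>
  Uint8Array.from(text, (c) => c.charCodeAt(0));
const decode = (audio: Uint8Array): string =>
  Array.from(audio, (b) => String.fromCharCode(b)).join("");

const fakeStt: SpeechToText = { transcribe: async (audio) => decode(audio) };
const fakeTts: TextToSpeech = { synthesize: async (text) => encode(text) };

// Fake agent: echoes the latest user message with a turn counter.
const echoResponder: Responder = async (history) => {
  const userTurns = history.filter((m) => m.role === "user").length;
  const last = history[history.length - 1];
  return `turn ${userTurns}: ${last ? last.content : ""}`;
};
```

In this sketch, swapping a provider means passing a different `SpeechToText` or `TextToSpeech` object to the constructor, and a text turn followed by a voice turn lands in the same `history` array. A real pipeline would stream audio rather than pass whole buffers.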