Build realtime voice agents on AI Gateway
Blog post from Vercel
AI Gateway now supports audio and voice: realtime voice interactions, text-to-speech, and speech-to-text, alongside its existing text, image, and video modalities. The update is available in AI SDK 7 and lets developers add voice features to their applications using models from OpenAI and xAI, with the same routing, observability, and budget controls that apply to other models. Realtime voice lets an application hold a natural conversation by responding to user input immediately, which suits voice assistants and customer support tools. Text-to-speech generates audio from text for tasks such as voiceovers, and speech-to-text converts audio into text for tasks such as transcription. Audio models can be tested directly in the browser without writing code, and audio calls use the same API key and provider controls as other AI Gateway services, which simplifies adding speech to existing applications.