Company
Date Published
Author
Word count
649
Language
English
Hacker News points
None

Summary

ElevenLabs has added true text and voice multimodality to its Conversational AI platform, allowing AI agents to process both spoken and typed inputs within the same session. The change addresses the limitations of voice-only interaction, such as transcription inaccuracies and the difficulty users face when speaking complex inputs aloud.

By letting users switch seamlessly between voice and text, the multimodal approach improves interaction accuracy, user experience, and task-completion rates while preserving a natural conversational flow.

The platform offers straightforward configuration and deployment through a widget, SDK, and WebSocket support, and builds on ElevenLabs' existing strengths: high-quality voices, advanced speech models, and global infrastructure. The company expects the feature to significantly expand the capabilities and user experience of Conversational AI.
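To make the idea concrete, here is a minimal sketch of how a client might interleave typed text and audio on a single conversational WebSocket connection. The event names and field layout (`user_text_input`, `user_audio_chunk`) are illustrative assumptions for this sketch, not the actual ElevenLabs wire protocol.

```python
import json

# Hypothetical event schema for a multimodal conversational session.
# Field names here are assumptions for illustration, not ElevenLabs' API.

def make_text_input_event(text: str) -> str:
    """Serialize a typed user turn as a JSON event for the socket."""
    return json.dumps({"type": "user_text_input", "text": text})

def make_audio_chunk_event(pcm_base64: str) -> str:
    """Serialize a chunk of captured microphone audio (base64 PCM)."""
    return json.dumps({"type": "user_audio_chunk", "audio": pcm_base64})

# In a multimodal session, the client can send either event kind at any
# moment over the same connection, so the user may speak or type freely:
events = [
    make_audio_chunk_event("UklGRg=="),          # user speaks...
    make_text_input_event("my order number is 12345"),  # ...then types
]
```

The key design point the announcement describes is that both input modes share one session, so the agent keeps a single conversational context regardless of which modality each turn arrives in.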