Introducing EVI 3: the world’s most realistic and instructible speech-to-speech foundation model

Post Details

Company

Hume

Date Published

May 29, 2025

Author

Alan Cowen

Word Count

1,112

Language

English

Source URL

www.hume.ai/blog/introducing-evi-3

Summary

Hume has introduced EVI 3, its third-generation speech-language model, which brings greater expressiveness, realism, and emotional understanding to voice AI. EVI 3 can speak with any voice and personality described in a user prompt, so new voices and personalities can be generated instantly. This capability comes from Hume's latest research on speech-language models, which produced methods for capturing the full range of human voices and speaking styles in a single model. In evaluations against other leading voice-to-voice AI models, including GPT-4o, Gemini, and Sesame, EVI 3 compared favorably on emotional tone modulation, expressiveness, and naturalness. The model can deliver voice responses in under 300 ms on state-of-the-art hardware. It is currently available through a live demo and an iOS app, with API access planned for release in the coming weeks.