Company
Date Published
Author
-
Word count
821
Language
English
Hacker News points
None

Summary

Octave 2, the latest iteration of the voice AI model, improves on text-to-speech by understanding emotional tone and extends support to 11 languages, including Japanese and Korean. It generates audio in under 200 ms and costs half as much as its predecessor, Octave 1. New features include voice conversion, which allows realistic voice swapping, and direct phoneme editing, which allows precise pronunciation adjustments. Octave 2's efficiency is bolstered by deployment on advanced chips and a novel inference stack, enabling large-scale use in industries such as entertainment and customer service. The recently launched EVI 4 mini extends these capabilities to a speech-to-speech API for building interactive experiences, though it does not yet generate language natively and currently must be paired with an external language model. Octave 2 and EVI 4 mini are both available on the platform, with more languages and features planned.