Company
Date Published
Author
Ryan Morrison
Word count
858
Language
English
Hacker News points
None

Summary

ElevenLabs has released Eleven v3, an alpha research preview of a new AI voice model that introduces Audio Tags for controlling emotion, pacing, and sound effects in text-to-speech output. Audio Tags are words enclosed in square brackets that direct the voice to express an emotion, adopt a delivery style, or produce nonverbal cues such as pauses and changes in tone, which makes the generated speech more expressive. The feature is particularly useful for immersive audiobooks, interactive characters, and dialogue-driven media, where precise control over delivery matters. Professional Voice Clones (PVCs) are not yet fully optimized for Eleven v3, so Instant Voice Clones (IVCs) or designed voices can be used instead to explore its features. Eleven v3 is available in the ElevenLabs UI and through a public API, and it is currently offered at a discounted rate to encourage experimentation.
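
To make the bracket convention concrete, here is a minimal Python sketch of how a script containing Audio Tags might be inspected before it is sent for synthesis. It relies only on the summary's description of tags as words enclosed in square brackets. The tag names in the sample script (`whispers`, `pause`, `laughs`) and the helper functions are illustrative assumptions, not a list of supported tags or part of any ElevenLabs SDK, and no API call is made.

```python
import re

# Assumption: an Audio Tag is any non-empty text between one pair of square brackets.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def extract_audio_tags(script: str) -> list[str]:
    """Return the bracketed tags in order of appearance, without the brackets."""
    return [tag.strip() for tag in TAG_PATTERN.findall(script)]


def strip_audio_tags(script: str) -> str:
    """Return only the spoken text, with tags removed and whitespace collapsed."""
    without_tags = TAG_PATTERN.sub(" ", script)
    return " ".join(without_tags.split())


if __name__ == "__main__":
    # Hypothetical script; the tag names are examples only.
    script = "[whispers] I think someone is here. [pause] Did you hear that? [laughs]"
    print(extract_audio_tags(script))
    print(strip_audio_tags(script))
```

Separating the tags from the spoken text like this could help when reviewing a script, for example to count the spoken words or to list which delivery directions were used.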