Company
Date Published
Author
Lauren Rothwell
Word count
399
Language
English
Hacker News points
None

Summary

Scribe v2 Realtime is a Speech to Text model from ElevenLabs, built for ultra-low latency and optimized for agent-driven applications that need speed, accuracy, and conversational precision. It transcribes speech in under 150 milliseconds and maintains high accuracy in real-world conditions such as background noise, diverse accents, and challenging identifiers. In internal benchmarks, it captured user intent better than other real-time ASR models. It also achieved the lowest Word Error Rate on the FLEURS multilingual benchmark and supports a wide range of languages, including Spanish, Portuguese, and Hindi, so enterprises can deploy multilingual agents without compromising on speed or accuracy. The model is now available in ElevenLabs Agents and can be activated in the Advanced configuration section.