Universal-3.5 Pro Realtime: the first streaming STT model that takes the agent's question as input

Blog post from AssemblyAI

Post Details
Company: AssemblyAI
Date Published:
Author: -
Word Count: 1,663
Company Posts That Month: 28
Language: English
Hacker News Points: -
Summary

Universal-3.5 Pro Realtime is AssemblyAI's latest flagship real-time speech-to-text model, focused on context retention and multilingual handling to improve transcription accuracy. A voice agent can pass its own question to the model as context, and the model keeps a rolling memory of the conversation; according to the post, this significantly reduces word error rate. The model supports 18 languages with mid-sentence code-switching, and a voice focus feature isolates the primary speaker in noisy environments. The post reports that it outperforms competitors on metrics such as word error rate and entity error rate, and positions it as cost-effective, with diarization and voice isolation available as add-ons. It is described as integrating into existing systems, with automatic upgrades for most users and the capacity to handle large-scale workloads without rate limits.

Trends Found in this Post
Trend | Post Mentions | Total Month Mentions | Posts | Companies | MoM
Real-time | 20 | 5,457 | 1,338 | 238 | -5%
Voice AI | 10 | 2,232 | 214 | 48 | -36%
AI Agents | 1 | 4,874 | 1,103 | 240 | -1%