
How to get the most out of Universal-3 Pro with prompt engineering

Blog post from AssemblyAI

Post Details
Company: AssemblyAI
Date Published:
Author: Ryan Seams
Word Count: 2,872
Language: English
Summary

Universal-3 Pro is AssemblyAI's promptable speech-to-text model, built specifically for speech tasks and offering improved transcription over its predecessor, Universal-2. Users can send a natural-language prompt along with a transcription request to control how the model handles speech patterns, ambiguous audio, multilingual conversations, and personally identifiable information. The model draws on context to stay accurate in difficult cases such as mixed-language audio and low-quality recordings. Precise, directive prompts work best: they can tell the model to capture specific linguistic nuances or to mark uncertain segments for review. Although the model responds well to detailed instructions, some tasks are not yet fully supported, including speaker labeling on long audio and tagging of non-speech audio. AssemblyAI plans to add support for more languages and for streaming so that the model covers more production use cases.
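
To make the idea of sending a prompt with a transcription request concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not AssemblyAI's documented usage: the endpoint URL, the `speech_models` and `prompt` field names, the model identifier `universal-3-pro`, and the `[unclear]` marker are all assumptions that should be checked against the current API reference. The helper names `build_request` and `submit` are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the current AssemblyAI API reference.
API_URL = "https://api.assemblyai.com/v2/transcript"


def build_request(audio_url: str, prompt: str) -> dict:
    """Build the JSON body for a prompted transcription request (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "audio_url": audio_url,
        # Assumed field name and model identifier.
        "speech_models": ["universal-3-pro"],
        # Assumed field name for the natural-language instructions.
        "prompt": prompt,
    }


def submit(payload: dict, api_key: str) -> dict:
    """POST the payload and return the parsed JSON response (makes a network call)."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Directive wording, as the summary recommends. The "[unclear]" tag is an
# illustrative convention chosen here, not one taken from the post.
PROMPT = (
    "Transcribe verbatim, keeping filler words and false starts. "
    "When speakers switch languages, keep each language as spoken. "
    "Mark any segment you cannot hear clearly as [unclear]."
)

payload = build_request("https://example.com/call.mp3", PROMPT)
print(json.dumps(payload, indent=2))
```

Only `build_request` runs here, so the sketch can be tried without an API key. Calling `submit(payload, api_key)` would send the request, assuming the field names above match the live API.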