
Introducing Nova-2: The Fastest, Most Accurate Speech-to-Text API

Blog post from Deepgram

Post Details
Company: Deepgram
Date Published:
Author: Josh Fox
Word Count: 2,281
Language: English
Hacker News Points: 2
Summary

Deepgram introduces Nova-2, a next-generation speech-to-text model that it says outperforms alternatives on accuracy, speed, and cost. Nova-2 is 18% more accurate than its predecessor, Nova-1, and offers a 36% relative word error rate (WER) improvement over OpenAI Whisper (large). Across both pre-recorded and real-time transcription, it delivers an average 30% reduction in WER relative to competitors, with 5-40x faster pre-recorded inference. Nova-2 is priced at $0.0043/min for pre-recorded audio, which makes it more affordable than other full-functionality providers. The model was trained on a diverse dataset and improves on Nova-1 in entity accuracy, punctuation accuracy, and capitalization error rate. Deepgram's benchmarking methodology uses over 50 hours of human-annotated audio across various domains and compares Nova-2 with other prominent models on the market.
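
To make the figures above concrete, here is a minimal Python sketch of how WER and a relative WER improvement are commonly computed, plus the per-minute pricing arithmetic. It is an illustration under stated assumptions, not Deepgram's benchmarking code: the function names are hypothetical, text normalization is reduced to whitespace splitting, and the WER values in the usage note are made up to show the arithmetic. Only the $0.0043/min price comes from the summary.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are split on whitespace with no further normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


def transcription_cost(minutes: float, price_per_minute: float = 0.0043) -> float:
    """Cost in dollars at a flat per-minute rate (default: the quoted pre-recorded price)."""
    return minutes * price_per_minute
```

For example, with illustrative (not measured) values, a baseline WER of 0.10 and a new WER of 0.064 give `relative_wer_reduction(0.10, 0.064)` of about 0.36, which is what a "36% relative WER improvement" means: the error rate falls by 36% of the baseline, not by 36 percentage points. At the quoted price, `transcription_cost(60)` puts one hour of pre-recorded audio at roughly $0.26.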