Introducing Nova-2: The Fastest, Most Accurate Speech-to-Text API

Post Details

Company: Deepgram
Date Published: Sept. 19, 2023
Author: Josh Fox
Word Count: 2,281
Company Posts That Month: 14
Language: English
Hacker News Points: 2
Post Removed: No
Source URL: deepgram.com/learn/nova-2-speech-to-text-api

Summary

Deepgram introduces Nova-2, a next-generation speech-to-text model that outperforms alternatives on accuracy, speed, and cost. Nova-2 is 18% more accurate than its predecessor and offers a 36% relative word error rate (WER) improvement over OpenAI Whisper (large). It delivers an average 30% reduction in WER relative to competitors for both pre-recorded and real-time transcription, and its pre-recorded inference is 5-40x faster. Nova-2 is priced at $0.0043/min for pre-recorded audio, making it more affordable than other full-functionality providers. The model was trained on a diverse dataset and, compared to Nova-1, offers better entity accuracy, better punctuation accuracy, and a lower capitalization error rate. Deepgram's benchmarking methodology uses over 50 hours of human-annotated audio across various domains and compares Nova-2 with other prominent models on the market.
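
The summary quotes WER figures without defining the metric. As a rough illustration, the sketch below uses the standard definition of WER (word-level substitutions, deletions, and insertions divided by the number of reference words) and reads "relative improvement" as (baseline - new) / baseline. That reading, the function names, and the sample numbers are assumptions for illustration, not Deepgram's benchmarking code or data; only the $0.0043/min price comes from the summary.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Assumed reading of 'relative WER improvement': (baseline - new) / baseline."""
    return (baseline_wer - new_wer) / baseline_wer


def prerecorded_cost(minutes: float, price_per_min: float = 0.0043) -> float:
    """Cost in USD at the per-minute pre-recorded price quoted in the summary."""
    return minutes * price_per_min


if __name__ == "__main__":
    # Hypothetical transcripts: one dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    # Hypothetical WER values chosen so the reduction works out to about 36%.
    print(relative_wer_reduction(0.10, 0.064))
    # One hour of pre-recorded audio at the quoted price.
    print(prerecorded_cost(60))
```

Under this reading, a relative figure such as 36% describes the share of the baseline's errors that were eliminated, not an absolute drop of 36 percentage points in WER.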

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Real-time	12	2,216	526	161	-9%
LLM	1	2,134	271	94	-26%
