
Flux Just Got A Little Smarter

Blog post from Deepgram

Post Details
Company: Deepgram
Date Published:
Author: Jack Kearney
Word Count: 1,950
Company Posts That Month: 18
Language: English
Hacker News Points: -
Summary

Flux has been updated with a new training paradigm that improves transcription accuracy and reduces false positives, particularly in start-of-turn detection. Many speech-to-text systems finalize transcripts based on wall-clock or pause time; Flux instead finalizes on conversation time, which gives low-latency end-of-turn detection and a high-quality transcript as soon as a conversational turn ends. The new version, Flux V0.1, transcribes more conservatively and optimizes accuracy specifically at the end of a turn, which yields a 70% reduction in false positives and faster end-of-turn detection. The model still revises its transcript throughout a turn, but it is less likely than its predecessor to emit incorrect words prematurely. This conservative behavior also improves transcription quality, with accuracy gains across several datasets, including a 10% improvement on a Common Voice test set. The update has already been applied, so developers get the improved performance without changing their existing implementations.
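The contrast between pause-based and turn-based finalization can be illustrated with a toy sketch. This is not Flux's implementation or Deepgram's API: the `Frame` type, the `eot_prob` field, the 700 ms pause and the 0.8 threshold are all hypothetical values chosen for illustration, under the assumption that a turn-aware model exposes some end-of-turn confidence per audio frame.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    t_ms: int         # offset of this frame in the audio stream
    is_speech: bool   # hypothetical voice-activity flag
    eot_prob: float   # hypothetical end-of-turn confidence from a model


def finalize_on_pause(frames: List[Frame], pause_ms: int) -> Optional[int]:
    """Finalize once trailing silence has lasted at least pause_ms."""
    silence_start: Optional[int] = None
    for f in frames:
        if f.is_speech:
            silence_start = None          # speech resets the silence timer
            continue
        if silence_start is None:
            silence_start = f.t_ms        # first silent frame of this gap
        if f.t_ms - silence_start >= pause_ms:
            return f.t_ms
    return None


def finalize_on_turn(frames: List[Frame], threshold: float) -> Optional[int]:
    """Finalize at the first frame whose end-of-turn confidence
    reaches the threshold, independent of how long the pause has been."""
    for f in frames:
        if f.eot_prob >= threshold:
            return f.t_ms
    return None


def make_frames() -> List[Frame]:
    """Synthetic stream: speech until 1000 ms, then silence.
    The made-up end-of-turn confidence rises shortly after speech stops."""
    frames = []
    for t in range(0, 2100, 100):
        speaking = t < 1000
        if speaking:
            prob = 0.05
        elif t == 1000:
            prob = 0.4
        elif t == 1100:
            prob = 0.6
        else:
            prob = 0.9
        frames.append(Frame(t_ms=t, is_speech=speaking, eot_prob=prob))
    return frames


if __name__ == "__main__":
    frames = make_frames()
    print("pause-based finalization at (ms):", finalize_on_pause(frames, pause_ms=700))
    print("turn-based finalization at (ms):", finalize_on_turn(frames, threshold=0.8))
```

On this synthetic stream the pause-based rule cannot finalize until the silence timer expires, while the turn-based rule finalizes as soon as the model is confident the turn is over. The real system's signals and thresholds may differ substantially from this sketch.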

Trends Found in this Post
Trend                 Post Mentions  Total Month Mentions  Posts  Companies  MoM
Real-time             4              4,546                 943    215        -38%
Voice AI              3              1,325                 172    39         +140%
AI Model Fine-tuning  1              532                   129    59         -12%
LLM                   1              3,836                 662    193        +2%