Flux Just Got A Little Smarter

Blog post from Deepgram

Post Details
Company: Deepgram
Author: Jack Kearney
Word Count: 1,950
Language: English
Summary

Flux has been updated with a new training paradigm that improves transcription accuracy and reduces false positives, particularly in start-of-turn detection. Many speech-to-text systems finalize transcripts based on wall-clock or pause time; Flux instead finalizes on conversation time, which gives it low-latency end-of-turn detection and an immediate, high-quality transcript as soon as a conversational turn ends. The new version, Flux V0.1, transcribes more conservatively and optimizes accuracy specifically at the end of a turn, which yields a 70% reduction in false positives and faster end-of-turn detection. The model still revises its transcript throughout a turn, but it is less likely than its predecessor to output incorrect words prematurely. The same conservatism improves transcription quality, with significant accuracy gains on various datasets, including a notable 10% improvement on a Common Voice test set. Because the update has already been applied, developers get the improved performance without changing their existing implementations.
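
To make the finalization contrast concrete, here is a minimal toy sketch of the two strategies the summary describes. It is not Flux's implementation or Deepgram's API: the `Word` type, the `ends_turn` flag, the `pause_s` threshold, and both function names are hypothetical, and "finalizing on conversation time" is read here simply as "emit a final transcript when a turn ends".

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float             # seconds from stream start
    end: float
    ends_turn: bool = False  # hypothetical signal from a turn detector


def _join(words):
    return " ".join(w.text for w in words)


def finalize_on_pause(words, pause_s=0.5):
    """Cut a final segment whenever the silence between two words exceeds pause_s."""
    segments, current = [], []
    for prev, word in zip([None] + words[:-1], words):
        if prev is not None and word.start - prev.end > pause_s:
            segments.append(_join(current))
            current = []
        current.append(word)
    if current:
        segments.append(_join(current))
    return segments


def finalize_on_turn(words):
    """Cut a final segment only when a word is marked as ending the speaker's turn."""
    segments, current = [], []
    for word in words:
        current.append(word)
        if word.ends_turn:
            segments.append(_join(current))
            current = []
    if current:
        segments.append(_join(current))
    return segments


# One spoken turn with a mid-sentence hesitation of 0.8 s after "book".
WORDS = [
    Word("I'd", 0.0, 0.2),
    Word("like", 0.2, 0.4),
    Word("to", 0.4, 0.5),
    Word("book", 0.5, 0.8),
    Word("a", 1.6, 1.7),
    Word("flight", 1.7, 2.1),
    Word("to", 2.1, 2.2),
    Word("Boston", 2.2, 2.7, ends_turn=True),
]

print(finalize_on_pause(WORDS))
print(finalize_on_turn(WORDS))
```

In this toy input the pause-based strategy splits the utterance at the hesitation, while the turn-based strategy emits it as a single final transcript. The sketch only illustrates where the cut happens; it says nothing about how a real model detects the end of a turn.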