easytranscriber: Speech Recognition with Accurate Timestamps in the HF Ecosystem

Post Details

Company

Hugging Face

Date Published

March 3, 2026

Author

Faton Rekathati

Word Count

1,169

Company Posts That Month

63

Language

-

Hacker News Points

-

Post removed?

No

Source URL

huggingface.co/blog/KBLab/easytranscriber

Summary

Developed by KBLab at the National Library of Sweden, easytranscriber is an automatic speech recognition (ASR) library focused on efficient transcription with precise word-level timestamps. It draws inspiration from the WhisperX library and reports speed improvements of 35% to 102%, attributed to GPU-accelerated forced alignment, parallel audio file loading, and batch processing for wav2vec2 models. The library supports both ctranslate2 and Hugging Face transformers as backends, bringing WhisperX functionality into the Hugging Face ecosystem. Its pipeline has four stages: voice activity detection, transcription, emission extraction, and forced alignment. The stages can be run in sequence or independently. A companion search interface, easysearch, lets users browse and query transcription outputs with synchronized audio playback. The library is particularly suited to large-scale projects such as the mass transcription of archival radio recordings, where it improves on traditional ASR libraries by reducing inefficiencies in data loading and alignment.
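
The forced alignment stage named in the summary can be illustrated with a small, generic sketch. The code below is not easytranscriber's implementation or API. It is a plain-Python CTC Viterbi alignment under stated assumptions: the emissions are per-frame log-probabilities over a toy vocabulary, index 0 is the CTC blank, and the function name forced_align is hypothetical.

```python
import math

def forced_align(log_probs, tokens, blank=0):
    """Return (token, start_frame, end_frame) spans along the best CTC path.

    log_probs: list of T frames, each a list of log-probabilities per vocab id.
    tokens:    the known transcript as vocab ids (assumed to exclude blank).
    """
    # Expand the transcript with blanks: [blank, t1, blank, t2, ..., blank].
    states = [blank]
    for tok in tokens:
        states += [tok, blank]
    T, S = len(log_probs), len(states)
    NEG = -math.inf
    score = [[NEG] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]

    # A path may start in the leading blank or in the first token.
    score[0][0] = log_probs[0][blank]
    if S > 1:
        score[0][1] = log_probs[0][states[1]]

    for t in range(1, T):
        for s in range(S):
            # Allowed predecessors: stay, advance by one, or skip a blank
            # between two different tokens.
            cands = [s]
            if s >= 1:
                cands.append(s - 1)
            if s >= 2 and states[s] != blank and states[s] != states[s - 2]:
                cands.append(s - 2)
            best = max(cands, key=lambda p: score[t - 1][p])
            if score[t - 1][best] == NEG:
                continue  # state not reachable at this frame
            score[t][s] = score[t - 1][best] + log_probs[t][states[s]]
            back[t][s] = best

    # A valid path ends in the trailing blank or in the last token.
    s = S - 1
    if S > 1 and score[T - 1][S - 2] > score[T - 1][S - 1]:
        s = S - 2
    if score[T - 1][s] == NEG:
        raise ValueError("too few frames to align the transcript")

    # Backtrack to recover the state visited at each frame.
    path = [s]
    for t in range(T - 1, 0, -1):
        s = back[t][s]
        path.append(s)
    path.reverse()

    # Odd states are tokens; merge consecutive frames of the same token.
    spans = []
    for t, st in enumerate(path):
        if st % 2 == 1:
            idx = st // 2
            if spans and spans[-1][0] == idx:
                spans[-1][2] = t
            else:
                spans.append([idx, t, t])
    return [(tokens[i], start, end) for i, start, end in spans]

# Toy emissions: vocab ids 0=blank, 1="a", 2="b"; five frames.
hi, lo = math.log(0.8), math.log(0.1)
emissions = [
    [hi, lo, lo],  # frame 0: blank
    [lo, hi, lo],  # frame 1: "a"
    [lo, hi, lo],  # frame 2: "a"
    [hi, lo, lo],  # frame 3: blank
    [lo, lo, hi],  # frame 4: "b"
]
print(forced_align(emissions, [1, 2]))  # [(1, 1, 2), (2, 4, 4)]
```

Frame indices become timestamps once multiplied by the model's frame duration, which is roughly 20 ms for common wav2vec2 checkpoints operating on 16 kHz audio. A production pipeline would run the same recurrence on batched tensors on the GPU rather than in Python loops.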

