Building a song transcription system with profanity filter using Whisper, GPT 3.5 and Spleeter

Post Details

Company

Gladia

Date Published

March 7, 2024

Author

-

Word Count

2,513

Company Posts That Month

3

Language

English

Hacker News Points

-

Post removed?

No

Source URL

www.gladia.io/blog/building-a-song-transcription-system-using-whisper-gpt-3-5-and-spleeter

Summary

The tutorial walks through building a song transcription system with a profanity filter using Spleeter, the Gladia API, and GPT 3.5. It opens with a brief history of music streaming, from Napster in 1999 to modern platforms such as Spotify and Amazon Music. The system separates vocals from instrumentals with Spleeter, transcribes the isolated vocals with Gladia's Whisper API, and passes the transcription to GPT 3.5 to detect both explicit and implicit profanities. To cope with poor audio quality and background noise, it uses Gladia's noise reduction feature to improve transcription accuracy. The tutorial also covers prompt engineering so that GPT 3.5 identifies profanities in lyrics accurately, and it concludes with a pipeline that automates the whole workflow and demonstrates the system detecting inappropriate content in songs.
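
The three stages named in the summary (vocal separation, transcription, profanity detection) can be pictured as a simple chain of functions. The sketch below is only an illustration of that shape and is not the tutorial's code. Every name in it (`run_pipeline`, `SongReport`, and the `fake_*` stand-ins) is hypothetical, and the stand-ins deliberately do not call Spleeter, Gladia, or GPT 3.5. In a real system each stand-in would be replaced by a wrapper around the corresponding tool.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SongReport:
    """Result of one pipeline run (hypothetical structure)."""
    transcript: str
    flagged: List[str]


def run_pipeline(
    audio_path: str,
    separate_vocals: Callable[[str], str],
    transcribe: Callable[[str], str],
    find_profanity: Callable[[str], List[str]],
) -> SongReport:
    """Chain the three stages; each stage is injected as a plain callable."""
    vocals_path = separate_vocals(audio_path)   # stage 1: isolate the vocals
    transcript = transcribe(vocals_path)        # stage 2: speech to text
    flagged = find_profanity(transcript)        # stage 3: flag offending words
    return SongReport(transcript=transcript, flagged=flagged)


# Stand-ins so the sketch runs without any external service (assumptions only).
def fake_separate(path: str) -> str:
    return path + ".vocals.wav"


def fake_transcribe(path: str) -> str:
    return "darn this heck of a song"


def fake_filter(text: str) -> List[str]:
    blocklist = {"darn", "heck"}
    return [word for word in text.split() if word in blocklist]


report = run_pipeline("song.mp3", fake_separate, fake_transcribe, fake_filter)
print(report.flagged)
```

Passing the stages in as callables keeps the orchestration independent of any particular vendor API, so each stage can be swapped or tested in isolation.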

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Real-time	5	2,527	623	172	+6%
LLM	3	2,357	311	115	-2%
