Company:
Date Published:
Author: Gaurav Vij
Word count: 1229
Language: English
Hacker News points: None

Summary

Q Blocks has introduced a decentralized GPU computing approach, coupled with optimized model deployment, that reduces execution cost and increases throughput for large language models such as OpenAI Whisper. Optimizing AI models is crucial for reducing cost, increasing speed, and managing scaling, which makes them more practical and sustainable to run. A Q Blocks GPU instance costs 50% less than AWS out of the box, and running an optimized model on their decentralized Tesla V100 GPU instance yields a 12x cost reduction compared to an AWS P3.2xlarge (Tesla V100) GPU instance. Applications such as Zoom calls and video subtitles, customer service chatbots, language translation, and transcription services stand to see even greater savings and performance upgrades at scale.
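The headline figures can be combined in a back-of-the-envelope way: a 2x instance-price advantage (the "50% lower cost") compounding with roughly a 6x throughput gain from model optimization would produce the quoted 12x total. A minimal sketch under those assumptions; the hourly price and the 6x optimization factor are hypothetical placeholders inferred from the article's 2x and 12x claims, not figures stated in the source:

```python
# Back-of-the-envelope cost comparison (illustrative figures only).
# AWS_HOURLY and OPTIMIZATION_SPEEDUP are assumed values, inferred from
# the article's "50% lower cost" and "12x cost reduction" claims.

AWS_HOURLY = 3.06                    # assumed on-demand price for P3.2xlarge (USD/hr)
QBLOCKS_HOURLY = AWS_HOURLY * 0.5    # "50% lower cost than AWS out of the box"
OPTIMIZATION_SPEEDUP = 6.0           # assumed throughput gain from the optimized model


def cost_per_unit_work(hourly_price, throughput_multiplier=1.0):
    """Cost to process one unit of work, scaled by relative throughput."""
    return hourly_price / throughput_multiplier


baseline = cost_per_unit_work(AWS_HOURLY)                          # unoptimized on AWS
optimized = cost_per_unit_work(QBLOCKS_HOURLY, OPTIMIZATION_SPEEDUP)  # optimized on Q Blocks

print(f"Cost reduction: {baseline / optimized:.0f}x")
```

Under these assumptions the two factors multiply: 2x cheaper hardware times a 6x throughput gain gives the 12x overall reduction the article cites.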