
Achieving 90% Cost-Effective Transcription and Translation with Optimised OpenAI Whisper

Blog post from Monster API

Post Details

Company: Monster API
Date Published: -
Author: Gaurav Vij
Word Count: 1,229
Language: English
Hacker News Points: -
Summary

Q Blocks has introduced a decentralized GPU computing approach coupled with optimized model deployment, reducing execution cost and increasing throughput for large AI models such as OpenAI Whisper. This enables significant cost savings and performance gains at scale. Optimizing AI models is crucial for reducing cost, increasing speed, and managing scaling, which makes them more practical and sustainable to run. Out of the box, a Q Blocks GPU instance costs 50% less than AWS; combined with model optimization, running the optimized model on their decentralized Tesla V100 GPU instance yields a 12x cost reduction compared to an AWS p3.2xlarge (Tesla V100) instance. These savings and performance upgrades compound for applications such as Zoom call and video subtitling, customer service chatbots, language translation, and transcription services.
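To see how a 50% cheaper instance can produce a 12x overall cost reduction, it helps to separate price per GPU-hour from throughput (hours of audio transcribed per GPU-hour). The sketch below illustrates the arithmetic; the specific prices and real-time factors are assumptions chosen for illustration, not figures from the post.

```python
# Illustrative cost-per-audio-hour comparison. The instance prices and
# throughput (real-time factor) numbers below are assumptions for this
# sketch, not figures reported in the post.

def cost_per_audio_hour(price_per_gpu_hour: float, realtime_factor: float) -> float:
    """Cost to transcribe one hour of audio.

    realtime_factor: hours of audio processed per GPU-hour.
    """
    return price_per_gpu_hour / realtime_factor

# Baseline: stock Whisper on an AWS p3.2xlarge (assumed ~$3.06/GPU-hour,
# processing audio at an assumed ~6x real time).
aws_cost = cost_per_audio_hour(price_per_gpu_hour=3.06, realtime_factor=6.0)

# Q Blocks V100 at ~50% of the AWS price, with an assumed further ~6x
# throughput gain from the optimized deployment.
qblocks_cost = cost_per_audio_hour(price_per_gpu_hour=1.53, realtime_factor=36.0)

print(f"AWS:      ${aws_cost:.4f} per audio hour")
print(f"Q Blocks: ${qblocks_cost:.4f} per audio hour")
print(f"Reduction: {aws_cost / qblocks_cost:.0f}x")  # 2x price x 6x throughput = 12x
```

The point of the decomposition is that the two factors multiply: halving the instance price alone gives 2x, and the remaining gain has to come from optimization-driven throughput.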