
How to Run OpenAI’s Whisper Speech Recognition Model

What's this blog post about?

The Micro Machines example was transcribed with Whisper on both CPU and GPU at each model size, and the inference times are reported below. First, the results on CPU (i5-11300H):

```
Tiny:   0.02 sec
Base:   0.06 sec
Small:  0.14 sec
Medium: 0.39 sec
Large:  1.47 sec
```

Next, the results on GPU (high-RAM GPU Colab environment):

```
Tiny:   0.01 sec
Base:   0.02 sec
Small:  0.05 sec
Medium: 0.13 sec
Large:  0.46 sec
```

Here are the same results side by side:

```
         CPU       GPU
Tiny:    0.02 sec  0.01 sec
Base:    0.06 sec  0.02 sec
Small:   0.14 sec  0.05 sec
Medium:  0.39 sec  0.13 sec
Large:   1.47 sec  0.46 sec
```

The cost to run Whisper for each model size, measured at various batch sizes (the specific batch-size values appear in a figure legend in the original post), is as follows:

```
Tiny:   0.03 USD/h
Base:   0.09 USD/h
Small:  0.21 USD/h
Medium: 0.57 USD/h
Large:  2.28 USD/h
```
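To put these measurements in perspective, the GPU speedup for each model size can be computed directly from the reported times. A minimal sketch (all values are taken from the tables above; no Whisper installation is required to run it):

```python
# Inference times in seconds for the Micro Machines example, as reported above.
cpu_times = {"tiny": 0.02, "base": 0.06, "small": 0.14, "medium": 0.39, "large": 1.47}
gpu_times = {"tiny": 0.01, "base": 0.02, "small": 0.05, "medium": 0.13, "large": 0.46}

# GPU speedup factor per model size (CPU time / GPU time), rounded to one decimal.
speedup = {model: round(cpu_times[model] / gpu_times[model], 1) for model in cpu_times}

print(speedup)
# {'tiny': 2.0, 'base': 3.0, 'small': 2.8, 'medium': 3.0, 'large': 3.2}
```

The speedup stays roughly constant (about 2-3x) across model sizes, so the GPU's main practical benefit here is keeping the larger models' absolute latency manageable rather than changing the relative cost of scaling up.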

Company
AssemblyAI

Date published
Sept. 22, 2022

Author(s)
Ryan O'Connor

Word count
3409

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.