faster-whisper
4× faster than reference Whisper using CTranslate2 — production sweet spot.
Best for production batch transcription on GPU, where self-hosting it is typically the cheapest $/hour way to run Whisper. Pricing: free (MIT-licensed open source).
What it is
faster-whisper wraps Whisper in CTranslate2 — a tuned inference engine for transformer models. On a single consumer GPU it's ~4× faster than reference Whisper and uses ~2× less VRAM, with essentially identical accuracy. This is what most production Whisper stacks actually run, including Whipscribe. MIT-licensed and stable.
Watch out for: no built-in diarization (pair with pyannote or whisperX), and the one-time model conversion step can trip people up; see the example below.
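Standard model sizes download pre-converted from Hugging Face, so conversion only comes up for custom or fine-tuned checkpoints. The converter invocation looks roughly like this (the model name and output directory are illustrative):

```bash
pip install transformers[torch]  # the converter needs transformers installed
ct2-transformers-converter --model openai/whisper-large-v3 \
  --output_dir whisper-large-v3-ct2 \
  --copy_files tokenizer.json preprocessor_config.json \
  --quantization float16
```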
Install / use
pip install faster-whisper
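A minimal usage sketch, assuming a CUDA GPU and a local audio file (the model name and path are placeholders):

```python
from faster_whisper import WhisperModel

# "large-v3" downloads a pre-converted CTranslate2 model from Hugging Face
model = WhisperModel("large-v3", device="cuda", compute_type="float16")

# transcribe() returns a lazy generator of segments plus language-detection info
segments, info = model.transcribe("audio.mp3", beam_size=5, word_timestamps=True)
print(f"Detected language: {info.language} (p={info.language_probability:.2f})")

for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s]{segment.text}")
```

Note that segments is a generator: decoding only runs as you iterate, so wrap it in list() if you need the results more than once.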
Features
| Feature | faster-whisper |
|---|---|
| Speaker diarization | No |
| Word-level timestamps | Yes |
| Streaming / real-time | No |
| Languages supported | 99 |
| HIPAA eligible | No |
faster-whisper vs Whipscribe
| Feature | faster-whisper | Whipscribe |
|---|---|---|
| Category | Open source | Transcription APIs |
| Pricing | free | free beta |
| Speaker diarization | No | Yes |
| Word timestamps | Yes | Yes |
| Streaming | No | No |
| Languages | 99 | 99 |
| Platforms | Linux, macOS, Windows (CPU and GPU) | Web, API, MCP |
Frequently asked about faster-whisper
Is faster-whisper more accurate than OpenAI Whisper?
Accuracy is essentially identical — same model weights, same training. The difference is runtime speed and VRAM usage, both of which faster-whisper improves materially. Any measurable WER gap is in the noise.
How much faster is faster-whisper?
Published benchmarks show roughly 4× faster inference on a single GPU versus the reference openai-whisper package, with about 2× lower VRAM use. Your mileage varies with batch size, model size, and hardware.
Does faster-whisper support diarization?
No. It handles transcription only. Combine it with pyannote for speaker labels, or use whisperX, which integrates both.
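A rough sketch of the pyannote pairing, assuming the speaker-diarization-3.1 pipeline (gated on Hugging Face, so YOUR_HF_TOKEN is a placeholder for a real access token) and simple overlap-based speaker assignment:

```python
from faster_whisper import WhisperModel
from pyannote.audio import Pipeline

# Transcribe with faster-whisper
model = WhisperModel("small", device="cuda", compute_type="float16")
segments = list(model.transcribe("audio.wav")[0])

# Diarize with pyannote (requires accepting the model's terms on Hugging Face)
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token="YOUR_HF_TOKEN"
)
diarization = pipeline("audio.wav")

def dominant_speaker(start, end):
    """Pick the speaker whose turns overlap this segment the most."""
    totals = {}
    for turn, _, speaker in diarization.itertracks(yield_label=True):
        overlap = min(end, turn.end) - max(start, turn.start)
        if overlap > 0:
            totals[speaker] = totals.get(speaker, 0.0) + overlap
    return max(totals, key=totals.get) if totals else "UNKNOWN"

for seg in segments:
    print(f"[{dominant_speaker(seg.start, seg.end)}]{seg.text}")
```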
What license is faster-whisper?
MIT — permissive for commercial and non-commercial use. CTranslate2 (the underlying engine) is also MIT-licensed.
Can faster-whisper run on CPU?
Yes. Use compute_type='int8' and a smaller model (base or small) for acceptable speed. For production CPU workloads, whisper.cpp is usually faster.
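A minimal CPU configuration sketch (model size and thread count are illustrative):

```python
from faster_whisper import WhisperModel

# int8 quantization keeps CPU inference tolerable; cpu_threads tunes parallelism
model = WhisperModel("base", device="cpu", compute_type="int8", cpu_threads=4)
segments, _ = model.transcribe("audio.mp3")
print("".join(segment.text for segment in segments))
```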
Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, we're one click away.
Try Whipscribe →