insanely-fast-whisper
by Vaibhav Srivastav
CLI that transcribes 150 minutes of audio in ~98 seconds on an A100.
Category: Open source
License: Apache-2.0
Stars: ★ 12.4k
Last push: 2025-10-25
Pricing: free
Platforms: Linux, GPU
What it is
An opinionated CLI wrapper around Hugging Face Transformers, Flash Attention, and BetterTransformer. Accepts a more involved install in exchange for throughput: ~150 min of audio in ~98s on an A100. The reference point for "how fast can Whisper go on current hardware." Apache-2.0.
Best for: Batch-processing huge backlogs on rented H100/A100 time.
Watch out for: Requires an NVIDIA GPU with enough VRAM for Whisper-large-v3 + Flash Attention; CPU path is not practical.
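To size a backlog against the quoted benchmark, the arithmetic can be sketched as below. This is a rough estimate only: it assumes throughput scales linearly from the ~150 min in ~98s figure, ignores model load time and I/O, and the $2.00 per GPU-hour price is a hypothetical placeholder, not a quoted rate. The function names are illustrative.

```python
# Benchmark figure quoted in this listing: ~150 minutes of audio in ~98 seconds on an A100.
BENCH_AUDIO_MIN = 150.0
BENCH_WALL_SEC = 98.0


def realtime_factor(audio_min=BENCH_AUDIO_MIN, wall_sec=BENCH_WALL_SEC):
    """Seconds of audio transcribed per second of wall-clock time."""
    return audio_min * 60.0 / wall_sec


def estimate_backlog(audio_hours, usd_per_gpu_hour):
    """Return (gpu_hours, cost_usd), assuming throughput scales linearly."""
    gpu_hours = audio_hours / realtime_factor()
    return gpu_hours, gpu_hours * usd_per_gpu_hour


if __name__ == "__main__":
    # Hypothetical inputs: a 1000-hour backlog at an assumed $2.00 per GPU-hour.
    gpu_hours, cost = estimate_backlog(audio_hours=1000, usd_per_gpu_hour=2.0)
    print(f"real-time factor: ~{realtime_factor():.0f}x")
    print(f"1000 h backlog: ~{gpu_hours:.1f} GPU-hours, ~${cost:.2f}")
```

Under these assumptions the benchmark works out to roughly 92x real time, so treat the result as a lower bound on GPU time rather than a quote.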
Install / use
pipx install insanely-fast-whisper
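Once installed, a run can be scripted. The sketch below builds the command line and shells out to the CLI. The flags (`--file-name`, `--transcript-path`, `--timestamp`, `--flash`) are given as recalled from the project README and should be confirmed with `insanely-fast-whisper --help`; the helper names `build_command` and `transcribe` are hypothetical.

```python
import shutil
import subprocess


def build_command(audio, transcript_path="output.json", word_timestamps=False, flash=False):
    """Build the argv list for one insanely-fast-whisper run (flags assumed from the README)."""
    cmd = [
        "insanely-fast-whisper",
        "--file-name", audio,
        "--transcript-path", transcript_path,
    ]
    if word_timestamps:
        cmd += ["--timestamp", "word"]  # word-level instead of chunk-level timestamps
    if flash:
        cmd += ["--flash", "True"]  # assumes flash-attn is installed on the machine
    return cmd


def transcribe(audio, **options):
    """Run the CLI on one file; raises if the CLI is missing or exits non-zero."""
    if shutil.which("insanely-fast-whisper") is None:
        raise RuntimeError("insanely-fast-whisper is not on PATH; install it first")
    subprocess.run(build_command(audio, **options), check=True)
```

For example, `transcribe("meeting.mp3", word_timestamps=True)` would write the transcript to `output.json` in the working directory, assuming a GPU is available.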
Features
| Feature | Value |
| --- | --- |
| Speaker diarization | Yes |
| Word-level timestamps | Yes |
| Streaming / real-time | No |
| Languages supported | 99 |
| HIPAA eligible | No |
Alternatives
Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, we're one click away.
Try Whipscribe →