Transcription tools directory

Every transcription service we track — open-source engines, desktop apps, APIs, and products — with live GitHub stats, a features matrix, and honest current pricing. Curated by Whipscribe; updated 2026-04-20.

Updated 2026-04-20 · 26 tools tracked
OpenAI Whisper
OpenAI

The reference open-source multilingual ASR model from OpenAI.

OSS · MIT ★ 98.1k
whisper.cpp
Georgi Gerganov

C/C++ port of Whisper — runs on anything, from a Raspberry Pi to Apple Silicon.

OSS · MIT ★ 48.8k
faster-whisper
SYSTRAN

Up to 4× faster than reference Whisper at the same accuracy, using CTranslate2. The production sweet spot.

OSS · MIT ★ 22.3k
whisperX
Max Bain

Faster-whisper + forced alignment + speaker diarization in one pipeline.

OSS · BSD‑2‑Clause ★ 21.4k
insanely-fast-whisper
Vaibhav Srivastav

CLI that transcribes 150 minutes of audio in ~98 seconds on an A100.

OSS · Apache‑2.0 ★ 12.4k
stable-ts
jianfch

Whisper with stabilised timestamps — more accurate word-level timing.

OSS · MIT ★ 2.2k
WhisperKit
Argmax

Swift Whisper for Apple Silicon — CoreML, Metal, zero dependencies.

OSS · MIT ★ 6.0k
distil-whisper
Hugging Face

Distilled Whisper: 6× faster, 49% smaller, within 1% WER of the teacher.

OSS · MIT ★ 4.1k
SeamlessM4T
Meta AI

Meta's multilingual transcription and translation model: speech-to-text, speech-to-speech, and text-to-speech translation across nearly 100 languages.

OSS · license not detected (see repo) ★ 11.8k
Vosk
Alpha Cephei

Lightweight offline speech recognition for 20+ languages, runs on a Raspberry Pi.

OSS · Apache‑2.0 ★ 14.6k
Buzz
Chidi Williams

Cross-platform desktop app for Whisper — open-source MacWhisper alternative.

OSS · MIT ★ 18.8k
MacWhisper
Jordi Bruin

Polished Mac app for Whisper — the default pick if you're on macOS.

freemium
SuperWhisper
Neil Chudleigh

Always-on system-wide dictation for macOS and iOS, powered by local Whisper.

freemium
Aiko
Sindre Sorhus

Free Mac App Store Whisper app — drag, drop, done.

free
OpenAI Whisper API
OpenAI

Hosted Whisper from OpenAI (the whisper-1 model, based on large-v2), $0.006 per minute.

$0.006/min
AssemblyAI
AssemblyAI

Universal-2 model + diarization, PII redaction, topic detection, summarization.

from $0.37/hr
Deepgram
Deepgram

Nova-2 model, excellent streaming, strong at conversational audio.

from $0.0043/min
Rev.ai
Rev.ai

The API spin-off of Rev — strong English accuracy, topic detection, custom vocab.

from $0.02/min
Gladia
Gladia

Whisper-based API with diarization, 99-language coverage, pay-per-minute.

from $0.0102/min
Speechmatics
Speechmatics

Enterprise ASR with strong accents and on-prem deployment options.

contact sales
Otter.ai
Otter.ai

Meeting-bot transcription product for Zoom/Meet/Teams.

free / from $10/mo
Rev
Rev

Human + AI transcription, highest accuracy tier on the market.

AI from $0.25/min · human from $1.50/min
Descript
Descript

Audio/video editor that treats the transcript as the timeline — different product category.

free / from $12/mo
Trint
Trint

Enterprise-focused transcription + collaborative editor for newsrooms.

from $60/mo
Fireflies.ai
Fireflies.ai

Meeting-bot transcription + CRM integrations, competitor to Otter.

free / from $10/mo
Whipscribe
Neugence

Hosted faster-whisper + whisperX with paste-a-URL, batch, and MCP access.

This is us

Frequently asked

faster-whisper vs whisperX — which should I use?

faster-whisper is the speed-optimised runtime. whisperX adds speaker diarization (pyannote) and forced-alignment word timestamps on top. Use faster-whisper if your audio is single-speaker and you only need the transcript. Use whisperX if the content has multiple speakers and you need "who said what."
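
For the single-speaker case, a minimal sketch of the faster-whisper path might look like the following. The function names, the "small" model size, and the CPU/int8 settings are illustrative assumptions, not recommendations from this directory; the faster-whisper import is kept inside the function so the formatting helper can be used on its own.

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Render one segment as '[mm:ss.mmm -> mm:ss.mmm] text'."""
    def stamp(seconds: float) -> str:
        minutes, secs = divmod(seconds, 60)
        return f"{int(minutes):02d}:{secs:06.3f}"

    return f"[{stamp(start)} -> {stamp(end)}] {text.strip()}"


def transcribe_single_speaker(path: str, model_size: str = "small") -> list[str]:
    """Transcribe one audio file with faster-whisper and return formatted lines.

    Hypothetical helper: model size, device, and compute type are assumptions.
    """
    # Imported lazily so format_segment works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(path, beam_size=5)
    # `segments` is a generator; decoding happens as it is consumed.
    return [format_segment(s.start, s.end, s.text) for s in segments]
```

Calling transcribe_single_speaker("interview.mp3") (a hypothetical file name) would return one timestamped line per segment, with no speaker labels. That missing piece is what whisperX adds.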

What's the cheapest transcription API in 2026?

Per-minute pricing (as of 2026-04-20): Deepgram Nova-2 at $0.0043/min is the cheapest streaming API. OpenAI Whisper API is $0.006/min, and AssemblyAI's $0.37/hr works out to roughly $0.0062/min. Self-hosting faster-whisper on a rented GPU can be cheaper at scale but requires operational work. Prices shift, so confirm on each provider's own pricing page.
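
Because the listings mix per-minute and per-hour units, it can help to normalise them before comparing. The sketch below uses only the "from" prices shown in this directory on 2026-04-20; the function and variable names are illustrative, and real bills depend on tier, volume, and add-on features.

```python
# Starting prices as listed in this directory: (amount in USD, billing unit).
LISTED_PRICES = {
    "Deepgram": (0.0043, "min"),
    "OpenAI Whisper API": (0.006, "min"),
    "AssemblyAI": (0.37, "hr"),
    "Gladia": (0.0102, "min"),
    "Rev.ai": (0.02, "min"),
}

MINUTES_PER_UNIT = {"min": 1, "hr": 60}


def usd_per_minute(amount: float, unit: str) -> float:
    """Convert a listed price to USD per minute of audio."""
    return amount / MINUTES_PER_UNIT[unit]


def rank_by_cost(prices: dict, audio_minutes: float) -> list:
    """Return (name, total_cost) pairs sorted from cheapest to most expensive."""
    rows = [
        (name, usd_per_minute(amount, unit) * audio_minutes)
        for name, (amount, unit) in prices.items()
    ]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, cost in rank_by_cost(LISTED_PRICES, audio_minutes=100 * 60):
        print(f"{name:<20} ${cost:8.2f} per 100 hours")
```

At these list prices, 100 hours of audio comes to about $25.80 on Deepgram, $36 on the OpenAI Whisper API, and $37 on AssemblyAI.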

What's the best open-source Otter.ai alternative?

For file transcription, whisperX (or faster-whisper with pyannote) gives you the same transcript + speaker-label output Otter produces. For the meeting-bot workflow itself, there's no one-click OSS replacement: you'd need to combine Whisper with a meeting-bot framework yourself.
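
The "transcript + speaker label" output comes down to matching transcript segments against diarization turns. The sketch below shows one simple way to do that, by picking the speaker whose turn overlaps each segment the most. It is a simplified stand-in, not the code whisperX or pyannote actually run, and the tuple layouts and sample data are hypothetical.

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns):
    """Attach a speaker to each transcript segment.

    segments: list of (start, end, text) from the ASR step (assumed shape).
    turns:    list of (start, end, speaker) from diarization (assumed shape).
    """
    labelled = []
    for seg_start, seg_end, text in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn_start, turn_end, speaker in turns:
            shared = overlap(seg_start, seg_end, turn_start, turn_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        labelled.append((best_speaker, text))
    return labelled


# Made-up example data, for illustration only.
segments = [
    (0.0, 2.0, "Hi, thanks for joining."),
    (2.1, 4.0, "Happy to be here."),
    (4.0, 6.5, "Let's get started."),
]
turns = [
    (0.0, 2.05, "SPEAKER_00"),
    (2.05, 4.1, "SPEAKER_01"),
    (4.1, 7.0, "SPEAKER_00"),
]

if __name__ == "__main__":
    for speaker, text in label_segments(segments, turns):
        print(f"{speaker}: {text}")
```

Segments that overlap no diarization turn are labelled UNKNOWN rather than guessed, which makes gaps in the diarization easy to spot.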

Which is best on Apple Silicon (M-series Macs)?

whisper.cpp with the Metal backend is the fastest pure-CLI option. WhisperKit is the Swift-native choice for in-app integration. MacWhisper is the polished desktop app for non-technical users.

I need HIPAA compliance. Which options qualify?

For commercial APIs with HIPAA/BAA paths: Deepgram, AssemblyAI, Rev.ai, and Speechmatics all offer them on appropriate tiers. For self-hosted, HIPAA is your responsibility — the license doesn't grant compliance; your deployment architecture does.

Whisper says it supports 99 languages. Is that real?

The model weights cover 99 languages, but quality varies widely. English, Spanish, German, French, Japanese, and Chinese are excellent. Low-resource languages (e.g. many African and Southeast-Asian languages) are significantly weaker, often with a word error rate (WER) too high to be usable. SeamlessM4T is worth checking for those.
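
WER is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words, so lower is better. A minimal sketch for checking a language on your own audio is shown below. It only lowercases and splits on whitespace, which is an assumption: published evaluations usually apply heavier text normalisation, and languages written without spaces are typically scored by character error rate instead.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous DP row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    print(f"WER: {word_error_rate(ref, hyp):.2%}")
```

Because insertions count as errors, WER can exceed 100% when the output is much longer than the reference.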

Prefer a hosted service over running your own GPU? Whipscribe runs faster-whisper + whisperX behind a web UI, REST API, and MCP server for Claude Desktop.
