The complete transcription directory

Every transcription tool worth knowing, evaluated on the capabilities that matter — diarization, pricing, languages, API access, self-hosting. Updated 2026-04-24.

26 tools · 11 open source · 7 API · 3 desktop · 5 product
| Name | Vendor | Category | Pricing | Languages |
|---|---|---|---|---|
| OpenAI Whisper | OpenAI | Open source | free | 99 |
| whisper.cpp | Georgi Gerganov | Open source | free | 99 |
| faster-whisper | SYSTRAN | Open source | free | 99 |
| whisperX | Max Bain | Open source | free | 99 |
| insanely-fast-whisper | Vaibhav Srivastav | Open source | free | 99 |
| stable-ts | jianfch | Open source | free | 99 |
| WhisperKit | Argmax | Open source | free | 99 |
| distil-whisper | Hugging Face | Open source | free | 1 |
| SeamlessM4T | Meta AI | Open source | free | 100 |
| Vosk | Alpha Cephei | Open source | free | 20 |
| Buzz | Chidi Williams | Open source | free | 99 |
| MacWhisper | Jordi Bruin | Desktop | freemium | 99 |
| SuperWhisper | Neil Chudleigh | Desktop | freemium | 99 |
| Aiko | Sindre Sorhus | Desktop | free | 99 |
| OpenAI Whisper API | OpenAI | API | $0.006/min | 99 |
| AssemblyAI | AssemblyAI | API | from $0.37/hr | 99 |
| Deepgram | Deepgram | API | from $0.0043/min | 36 |
| Rev.ai | Rev.ai | API | from $0.02/min | 36 |
| Gladia | Gladia | API | from $0.0102/min | 99 |
| Speechmatics | Speechmatics | API | contact sales | 50 |
| Otter.ai | Otter.ai | Product | free / from $10/mo | 3 |
| Rev | Rev | Product | AI from $0.25/min · human from $1.50/min | 36 |
| Descript | Descript | Product | free / from $12/mo | 22 |
| Trint | Trint | Product | from $60/mo | 30 |
| Fireflies.ai | Fireflies.ai | Product | free / from $10/mo | 30 |
| Whipscribe (that's us) | Neugence | API | free beta | 99 |
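
The API prices above mix per-minute and per-hour units, which makes them hard to compare at a glance. A minimal sketch that converts the listed figures to dollars per hour of audio follows. The prices are copied from the table; the `LISTED` dictionary and the `per_hour` helper are illustrative names, not part of any vendor SDK, and "from" prices are entry-level rates that may not match a given plan.

```python
# Listed entry prices for the API rows, copied from the table above.
# Each value is (amount in USD, billing unit).
LISTED = {
    "OpenAI Whisper API": (0.006, "min"),
    "AssemblyAI": (0.37, "hr"),
    "Deepgram": (0.0043, "min"),
    "Rev.ai": (0.02, "min"),
    "Gladia": (0.0102, "min"),
}


def per_hour(amount, unit):
    """Convert a listed price to US dollars per hour of audio."""
    if unit == "hr":
        return amount
    if unit == "min":
        return amount * 60
    raise ValueError(f"unknown unit: {unit}")


# Round to 4 decimals to hide floating-point noise from the multiplication.
hourly = {name: round(per_hour(*price), 4) for name, price in LISTED.items()}

# Print cheapest first.
for name, cost in sorted(hourly.items(), key=lambda kv: kv[1]):
    print(f"{name:<20} ${cost:.3f}/hr")
```

On these listed figures, $0.006/min works out to $0.36/hr, so the OpenAI Whisper API and AssemblyAI ($0.37/hr) are nearly identical on entry price, while Rev.ai at $0.02/min comes to $1.20/hr.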
OpenAI Whisper
OpenAI · Open source

The reference open-source multilingual ASR model from OpenAI.

Pricing: free
Languages: 99

whisper.cpp
Georgi Gerganov · Open source

C/C++ port of Whisper that runs on anything from a Raspberry Pi to Apple Silicon.

Pricing: free
Languages: 99

faster-whisper
SYSTRAN · Open source

Up to 4× faster than reference Whisper, using CTranslate2; the production sweet spot.

Pricing: free
Languages: 99

whisperX
Max Bain · Open source

Faster-whisper + forced alignment + speaker diarization in one pipeline.

Pricing: free
Languages: 99

insanely-fast-whisper
Vaibhav Srivastav · Open source

CLI that transcribes 150 minutes of audio in ~98 seconds on an A100.

Pricing: free
Languages: 99

stable-ts
jianfch · Open source

Whisper with stabilised timestamps for more accurate word-level timing.

Pricing: free
Languages: 99

WhisperKit
Argmax · Open source

Swift Whisper for Apple Silicon: CoreML, Metal, zero dependencies.

Pricing: free
Languages: 99

distil-whisper
Hugging Face · Open source

Distilled Whisper: 6× faster, 49% smaller, within 1% WER of the teacher.

Pricing: free
Languages: 1

SeamlessM4T
Meta AI · Open source

Meta's speech-to-text + speech-to-speech + text-to-speech model, 100 languages.

Pricing: free
Languages: 100

Vosk
Alpha Cephei · Open source

Lightweight offline speech recognition for 20+ languages, runs on a Raspberry Pi.

Pricing: free
Languages: 20

Buzz
Chidi Williams · Open source

Cross-platform desktop app for Whisper; an open-source MacWhisper alternative.

Pricing: free
Languages: 99

MacWhisper
Jordi Bruin · Desktop

Polished Mac app for Whisper; the default pick if you're on macOS.

Pricing: freemium
Languages: 99

SuperWhisper
Neil Chudleigh · Desktop

Always-on system-wide dictation for macOS and iOS, powered by local Whisper.

Pricing: freemium
Languages: 99

Aiko
Sindre Sorhus · Desktop

Free Mac App Store Whisper app: drag, drop, done.

Pricing: free
Languages: 99

OpenAI Whisper API
OpenAI · API

Hosted Whisper large-v3 from OpenAI at $0.006 per minute.

Pricing: $0.006/min
Languages: 99

AssemblyAI
AssemblyAI · API

Universal-2 model + diarization, PII redaction, topic detection, summarization.

Pricing: from $0.37/hr
Languages: 99

Deepgram
Deepgram · API

Nova-2 model, excellent streaming, strong at conversational audio.

Pricing: from $0.0043/min
Languages: 36

Rev.ai
Rev.ai · API

The API spin-off of Rev: strong English accuracy, topic detection, custom vocab.

Pricing: from $0.02/min
Languages: 36

Gladia
Gladia · API

Whisper-based API with diarization, 99-language coverage, pay-per-minute.

Pricing: from $0.0102/min
Languages: 99

Speechmatics
Speechmatics · API

Enterprise ASR with strong accuracy on accented speech and on-prem deployment options.

Pricing: contact sales
Languages: 50

Otter.ai
Otter.ai · Product

Meeting-bot transcription product for Zoom/Meet/Teams.

Pricing: free / from $10/mo
Languages: 3

Rev
Rev · Product

Human + AI transcription; the human-reviewed tier is positioned as the highest-accuracy option on the market.

Pricing: AI from $0.25/min · human from $1.50/min
Languages: 36

Descript
Descript · Product

Audio/video editor that treats the transcript as the timeline; a different product category.

Pricing: free / from $12/mo
Languages: 22

Trint
Trint · Product

Enterprise-focused transcription + collaborative editor for newsrooms.

Pricing: from $60/mo
Languages: 30

Fireflies.ai
Fireflies.ai · Product

Meeting-bot transcription + CRM integrations; a competitor to Otter.

Pricing: free / from $10/mo
Languages: 30

Whipscribe (that's us)
Neugence · API

Hosted faster-whisper + whisperX with paste-a-URL, batch, and MCP access.

Pricing: free beta
Languages: 99

Pricing distribution across the directory

Buckets: Free or free tier · Paid / per-minute · Contact sales

Quick glossary

Legend: Supported · Not supported · Unknown
Source of truth: vendor docs, evidence log updated 2026-04-24.
Diarization
Speaker-labelled output (Speaker 1 / Speaker 2) — essential for interviews, meetings, podcasts.
Word timestamps
Per-word start/end times — required for subtitles, karaoke-style UIs, precise search.
Streaming
Live transcription over WebSocket / microphone — for voice agents and meeting bots.
Self-host
Runs entirely on your hardware / VPC — required for air-gapped, on-prem, or data-sovereignty workloads.
HIPAA
Vendor offers a BAA / HIPAA-eligible tier. Self-hosted engines leave compliance to the deployer.
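
To make the diarization and word-timestamp entries concrete, here is a minimal sketch of how per-word times plus speaker labels can be turned into SRT subtitle cues. It assumes the transcription engine has already produced words with start/end times and a speaker label; the `Word` class, the `to_srt` function, and the 7-word cue limit are hypothetical choices for illustration, not the output format of any tool listed here.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float   # seconds from the start of the audio
    end: float     # seconds from the start of the audio
    speaker: str   # diarization label, e.g. "Speaker 1"


def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"


def to_srt(words, max_words=7):
    """Group words into cues; a new cue starts on a speaker change
    or once the current cue already holds max_words words."""
    cues, current = [], []
    for w in words:
        if current and (w.speaker != current[0].speaker or len(current) >= max_words):
            cues.append(current)
            current = []
        current.append(w)
    if current:
        cues.append(current)

    blocks = []
    for i, cue in enumerate(cues, start=1):
        text = " ".join(w.text for w in cue)
        timing = f"{srt_time(cue[0].start)} --> {srt_time(cue[-1].end)}"
        blocks.append(f"{i}\n{timing}\n{cue[0].speaker}: {text}")
    return "\n\n".join(blocks) + "\n"


words = [
    Word("Hello", 0.0, 0.4, "Speaker 1"),
    Word("there", 0.45, 0.8, "Speaker 1"),
    Word("Hi", 1.2, 1.5, "Speaker 2"),
]
print(to_srt(words))
```

Without word timestamps, cue boundaries can only follow whole segments; without diarization, the speaker prefix is unavailable. That is why both capabilities matter for subtitle and interview workflows.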