Mozilla Common Voice
Mozilla Common Voice — public-domain multilingual speech corpus that powers many regional STT models.
Best for researchers and teams training low-resource language ASR from a public-domain corpus. Pricing: free.
What it is
Common Voice is Mozilla's crowdsourced, CC0-licensed multilingual speech corpus, now covering 100+ languages, including many African, Indic, and minority European languages that commercial STT vendors skip. It is the dataset behind a long tail of regional ASR models published on Hugging Face: not a recognition product itself, but a key input for many teams building one. It fits best for researchers and teams training low-resource language ASR from a public-domain corpus. The honest caveat: it is a dataset, not a recognition product, and consent and quality vary across language splits. As with any open data release, the integrator owns training, hosting, scaling, and SLA for whatever model they build on it, but the licensing cost is zero and the corpus can be combined with in-house audio for fine-tuning.
Watch out for: It is a dataset, not a recognition product; consent and quality vary across language splits.
Install / use
commonvoice.mozilla.org — download CC0 dataset by language
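Once a language archive is downloaded and unpacked, a first step is usually to pair each clip with its transcript and drop low-confidence recordings. The sketch below shows one way to do that with the Python standard library. It assumes the archive layout commonly seen in Common Voice releases (a tab-separated `validated.tsv` next to a `clips/` directory, with `path`, `sentence`, `up_votes`, and `down_votes` columns). Verify those names against the release you download. The function name `load_validated_clips` and the vote threshold are illustrative choices, not part of any official tooling.

```python
import csv
from pathlib import Path


def load_validated_clips(tsv_path, min_up_votes=2):
    """Return (audio_path, sentence) pairs from a Common Voice style TSV.

    Assumed layout (check against your download): the TSV sits next to a
    'clips' directory and has 'path', 'sentence', 'up_votes' and
    'down_votes' columns.
    """
    tsv_path = Path(tsv_path)
    clips_dir = tsv_path.parent / "clips"  # assumed location of the audio
    pairs = []
    with tsv_path.open(newline="", encoding="utf-8") as handle:
        # QUOTE_NONE keeps quotation marks inside sentences untouched.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Treat missing or empty vote fields as zero votes.
            up = int(row.get("up_votes") or 0)
            down = int(row.get("down_votes") or 0)
            # Keep clips with enough approvals and more approvals than rejections.
            if up >= min_up_votes and up > down:
                pairs.append((clips_dir / row["path"], row["sentence"]))
    return pairs
```

Calling `load_validated_clips("cv-corpus/sw/validated.tsv")` (a hypothetical path) would return a list of clip paths and sentences ready to feed into a training pipeline. Because quality varies across language splits, raising `min_up_votes` is a cheap way to trade corpus size for cleaner labels.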
Features
| Feature | Mozilla Common Voice |
|---|---|
| Speaker diarization | No |
| Word-level timestamps | No |
| Streaming / real-time | No |
| Languages supported | 100+ |
| HIPAA eligible | No |
Mozilla Common Voice vs Whipscribe
| Feature | Mozilla Common Voice | Whipscribe |
|---|---|---|
| Category | Open source | Transcription APIs |
| Pricing | free | free beta |
| Speaker diarization | No | Yes |
| Word timestamps | No | Yes |
| Streaming | No | No |
| Languages | 100+ | 99 |
| Platforms | Web | Web, API, MCP |
Alternatives to Mozilla Common Voice
Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, you can submit a URL or upload a file and get a transcript back, with no model training or hosting on your side.