HuggingFace Datasets

by Hugging Face

Streaming loader for Common Voice, LibriSpeech, GigaSpeech, FLEURS.

TL;DR

Best for training pipelines that need to stream multi-TB speech corpora without caching the whole set. Pricing: free.

Category: Open source
License: Apache-2.0
Pricing: free
Platforms: Linux, macOS, Windows

What it is

The de facto loader for public speech datasets on the Hugging Face Hub. Licensed under Apache-2.0.

Best for: Training pipelines that need to stream multi-TB speech corpora without caching the whole set.
Watch out for: Schemas vary per corpus, so column names differ between datasets; iterable (streaming) mode lacks some indexing features, such as random access by row number.
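Both caveats can be worked around with a thin adapter layer. The sketch below is illustrative only: `TEXT_FIELD`, `normalize`, and `nth` are hypothetical names, and the transcript column listed for each corpus is an assumption that should be checked against the relevant dataset card.

```python
from itertools import islice

# Assumed transcript column per corpus; confirm on each dataset card.
TEXT_FIELD = {
    "common_voice": "sentence",
    "librispeech": "text",
    "gigaspeech": "text",
    "fleurs": "transcription",
}


def normalize(example, corpus):
    """Map a corpus-specific example onto a shared {audio, text} schema."""
    return {"audio": example["audio"], "text": example[TEXT_FIELD[corpus]]}


def nth(stream, index):
    """Emulate stream[index] on an iterable by skipping ahead.

    Costs O(index) reads and consumes the stream up to that point.
    Returns None if the stream ends before reaching index.
    """
    return next(islice(stream, index, None), None)
```

On a streamed dataset, a function like `normalize` can be applied lazily with `IterableDataset.map`, and the built-in `skip` and `take` methods cover most cases where row indexing would otherwise be used.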

Install / use

pip install datasets
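Once installed, a streaming load might look like the sketch below. `open_stream` and `peek` are hypothetical helper names, and the repo id, config, and split shown in the comments are assumptions about how LibriSpeech is published on the Hub; check the dataset card before relying on them.

```python
from itertools import islice


def open_stream(repo_id, config=None, split="train"):
    """Open a Hub dataset in streaming mode, without downloading it up front."""
    # Imported lazily so that `peek` stays usable without the library installed.
    from datasets import load_dataset

    # streaming=True returns an IterableDataset that fetches examples on demand.
    return load_dataset(repo_id, name=config, split=split, streaming=True)


def peek(stream, n=3):
    """Return the first n examples of any iterable, reading no further."""
    return list(islice(stream, n))


# Example (needs network access; repo id, config, and split are assumed):
#   stream = open_stream("openslr/librispeech_asr", "clean", split="train.100")
#   for example in peek(stream, 2):
#       print(sorted(example.keys()))
```

Decoding the audio column may need extra dependencies; installing with `pip install "datasets[audio]"` is the usual route, though the exact requirements depend on the library version.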

Features

Speaker diarization: No
Word-level timestamps: Yes
Streaming / real-time: No
Languages supported: 100
HIPAA eligible: No

HuggingFace Datasets vs Whipscribe

Feature               HuggingFace Datasets      Whipscribe
Category              Open source               Transcription APIs
Pricing               free                      free beta
Speaker diarization   No                        Yes
Word timestamps       Yes                       Yes
Streaming             No                        No
Languages             100                       99
Platforms             Linux, macOS, Windows     Web, API, MCP

Alternatives to HuggingFace Datasets

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, you can submit a URL or upload a file and get a transcript back in seconds.