VoxCeleb 1

by Oxford VGG

100k utterances of celebrity speech from YouTube — speaker recognition benchmark.

TL;DR

Best for speaker verification + identification baselines. Pricing: research-only.

Category: Open source
License: CC BY 4.0 (metadata)
Pricing: research-only
Platforms: Web, HuggingFace

What it is

VoxCeleb 1 contains more than 100k utterances from 1,251 celebrity speakers, harvested from YouTube. It is the canonical speaker-recognition benchmark. The metadata is licensed under CC BY 4.0.

Best for: Speaker verification + identification baselines.
Watch out for: the CC BY 4.0 license applies to the metadata; the YouTube audio is subject to the source's terms of service; coverage is limited to 1,251 celebrity speakers. Cite: Nagrani et al., Interspeech 2017.
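
Speaker verification baselines on benchmarks like this are commonly reported as an equal error rate (EER). The following is a minimal sketch of that metric, not an official scoring tool: the function name `equal_error_rate`, the score convention (higher means "same speaker"), and the toy trial scores are all assumptions made for illustration.

```python
from typing import Sequence


def equal_error_rate(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Approximate the EER by sweeping thresholds over the observed scores.

    labels: 1 for a same-speaker (target) trial, 0 for a different-speaker trial.
    A trial is accepted when its score is >= the threshold.
    """
    n_target = sum(1 for y in labels if y == 1)
    n_nontarget = len(labels) - n_target
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need at least one target and one non-target trial")

    best_gap, eer = float("inf"), 1.0
    for threshold in sorted(set(scores)):
        false_accepts = sum(
            1 for s, y in zip(scores, labels) if y == 0 and s >= threshold
        )
        false_rejects = sum(
            1 for s, y in zip(scores, labels) if y == 1 and s < threshold
        )
        far = false_accepts / n_nontarget  # false-accept rate
        frr = false_rejects / n_target     # false-reject rate
        # Keep the threshold where the two error rates are closest.
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Hypothetical trial scores, not results from any real system.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
print(equal_error_rate(toy_scores, toy_labels))
```

The sweep is quadratic in the number of trials, which is fine for a sanity check; a full trial list would usually call for a sorted, single-pass implementation.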

Install / use

https://www.robots.ox.ac.uk/~vgg/data/voxceleb/  # registration required
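
Once the audio is downloaded, a first step is often to index utterances by speaker. The sketch below assumes the files are unpacked as `<root>/<speaker_id>/<video_id>/<utterance>.wav`; that layout, the helper name `index_by_speaker`, and the example path are assumptions, so check them against your local copy.

```python
from collections import defaultdict
from pathlib import Path


def index_by_speaker(root):
    """Map each speaker ID to a sorted list of its utterance paths.

    Assumes the layout <root>/<speaker_id>/<video_id>/<utterance>.wav.
    """
    root = Path(root)
    utterances = defaultdict(list)
    for wav in sorted(root.glob("*/*/*.wav")):
        # The first path component under root is taken to be the speaker ID.
        speaker_id = wav.relative_to(root).parts[0]
        utterances[speaker_id].append(wav)
    return dict(utterances)


if __name__ == "__main__":
    # Hypothetical local path; replace with wherever the archive was unpacked.
    index = index_by_speaker("voxceleb1/wav")
    print(f"{len(index)} speakers, "
          f"{sum(len(v) for v in index.values())} utterances")
```

Counting the speakers and utterances this way gives a quick check that the download is complete before training anything.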

Features

Speaker diarization: No
Word-level timestamps: No
Streaming / real-time: No
Languages supported: 1
HIPAA eligible: No

VoxCeleb 1 vs Whipscribe

Feature               VoxCeleb 1         Whipscribe
Category              Open source        Transcription APIs
Pricing               research-only      free beta
Speaker diarization   No                 Yes
Word timestamps       No                 Yes
Streaming             No                 No
Languages             1                  99
Platforms             Web, HuggingFace   Web, API, MCP

Alternatives to VoxCeleb 1

Whipscribe is a managed faster-whisper + whisperX service. It produces a transcript from a pasted URL or an uploaded file, so you can get transcripts without running infrastructure.