VoxCeleb 2

by Oxford VGG

1M utterances of celebrity speech: a scaled-up speaker recognition corpus.

TL;DR

Best for large-scale speaker-verification training; cross-language coverage (61% non-English). Pricing: research-only.

Category: Open source
License: CC BY 4.0 (metadata)
Pricing: research-only
Platforms: Web

What it is

VoxCeleb 2 scales up VoxCeleb 1 to 6112 speakers and more than 1M utterances, with broader language coverage. It is a standard corpus for pretraining speaker models and for speaker verification.

Best for: Large-scale speaker-verification training; cross-language coverage (61% non-English).
Watch out for: the CC BY 4.0 license applies to the metadata; the audio comes from YouTube, so the YouTube TOS applies. All 6112 speakers are celebrities. Cite: Chung et al., Interspeech 2018.

Install / use

Registration required: https://www.robots.ox.ac.uk/~vgg/data/voxceleb/vox2.html
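Once the data is downloaded and extracted, a first step is usually to index utterances by speaker. The sketch below is a minimal illustration, not an official loader. It assumes the audio is laid out as <root>/<speaker_id>/<video_id>/<utterance>.m4a, which this listing does not confirm; the function names `index_utterances` and `summarize` are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def index_utterances(root):
    """Map speaker ID -> sorted list of utterance paths under `root`.

    ASSUMPTION: files are laid out as
    <root>/<speaker_id>/<video_id>/<utterance>.m4a.
    Adjust the glob pattern if your extracted copy differs.
    """
    index = defaultdict(list)
    for path in Path(root).glob("*/*/*.m4a"):
        speaker_id = path.parts[-3]  # directory two levels above the file
        index[speaker_id].append(path)
    return {spk: sorted(paths) for spk, paths in index.items()}


def summarize(index):
    """Return (number of speakers, total number of utterances)."""
    return len(index), sum(len(paths) for paths in index.values())
```

Calling `summarize(index_utterances(root))` on your local copy gives a quick sanity check against the published speaker and utterance counts before training.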

Features

Speaker diarization: No
Word-level timestamps: No
Streaming / real-time: No
Languages supported: 1
HIPAA eligible: No

VoxCeleb 2 vs Whipscribe

Feature | VoxCeleb 2 | Whipscribe
Category | Open source | Transcription APIs
Pricing | research-only | free beta
Speaker diarization | No | Yes
Word timestamps | No | Yes
Streaming | No | No
Languages | 1 | 99
Platforms | Web | Web, API, MCP

Alternatives to VoxCeleb 2

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or drop a file and you'll have a transcript in seconds.