Europarl-ST

by MLLP / UPV

Speech-translation corpus from European Parliament across 9 languages.

TL;DR

Best for many-to-many European speech translation (72 source-target pairs). Pricing: free.

Category: Open source
License: CC BY 4.0
Pricing: free
Platforms: Web

What it is

Europarl-ST is a many-to-many speech translation corpus covering 9 European languages (en, fr, de, es, it, pt, pl, nl, ro). Each language can be either source or target, which gives 9 × 8 = 72 directed translation pairs. License: CC BY 4.0.
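The pair count follows directly from the language list: each of the 9 languages is paired with each of the other 8 as a target. A minimal sketch of that arithmetic is shown here; the names `LANGS` and `directed_pairs` are illustrative only and are not part of the corpus or any official tooling.

```python
from itertools import permutations

# Language codes as listed in the corpus description.
LANGS = ["en", "fr", "de", "es", "it", "pt", "pl", "nl", "ro"]


def directed_pairs(langs):
    """Return every ordered (source, target) pair with source != target."""
    return list(permutations(langs, 2))


pairs = directed_pairs(LANGS)
print(len(pairs))  # 9 * 8 = 72
```

Because the pairs are ordered, ("en", "fr") and ("fr", "en") count as two separate translation directions.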

Best for: Many-to-many European speech translation (72 source-target pairs).
Watch out for: licensed under CC BY 4.0, so attribution is required; derived from European Parliament plenary recordings; the amount of data varies by language pair. Cite: Iranzo-Sánchez et al., ICASSP 2020.

Install / use

https://www.mllp.upv.es/europarl-st/

Features

Speaker diarization: No
Word-level timestamps: No
Streaming / real-time: No
Languages supported: 9
HIPAA eligible: No

Europarl-ST vs Whipscribe

Feature | Europarl-ST | Whipscribe
Category | Open source | Transcription APIs
Pricing | free | free beta
Speaker diarization | No | Yes
Word timestamps | No | Yes
Streaming | No | No
Languages | 9 | 99
Platforms | Web | Web, API, MCP

Alternatives to Europarl-ST

Whipscribe is a managed faster-whisper + whisperX service. It suits users who want transcripts from a URL or an uploaded file without running their own infrastructure.