LJ Speech

Name: LJ Speech
Author: Keith Ito

24h single-speaker English audiobook corpus — the canonical TTS baseline.

TL;DR

Best for single-speaker neural TTS baselines (Tacotron, FastSpeech, VITS, etc.). Pricing: free.

What it is

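LJ Speech is about 24 hours of read English speech from a single speaker, recorded for LibriVox from public-domain books and split into 13,100 short clips. It is one of the most widely used datasets in English TTS research, and both the texts and the audio are in the public domain.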

Best for: Single-speaker neural TTS baselines (Tacotron, FastSpeech, VITS, etc.).
Watch out for: a single female speaker and only about 24 hours of audio.
License: public domain (LibriVox recordings of books published between 1884 and 1964).
Cite: Ito & Johnson, 2017.

Install / use

from datasets import load_dataset
ds = load_dataset('keithito/lj_speech')
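As a rough sketch of what to do once the dataset is loaded, the helpers below compute a clip's duration and pair it with its transcript. The column names `audio` and `normalized_text`, the single `train` split, and the assumption that the audio column decodes to a mapping with `array` and `sampling_rate` keys are not stated on this page and may differ between `datasets` versions, so check `ds.features` first. The helper names are hypothetical.

```python
from typing import Any, Mapping

# Assumed column names (not confirmed by this page); verify with ds.features.
TEXT_COLUMN = "normalized_text"
AUDIO_COLUMN = "audio"


def clip_duration_seconds(example: Mapping[str, Any]) -> float:
    """Return the clip length in seconds from one decoded example."""
    audio = example[AUDIO_COLUMN]
    # Number of samples divided by samples per second gives seconds.
    return len(audio["array"]) / audio["sampling_rate"]


def summarize(example: Mapping[str, Any]) -> str:
    """Format one example as '<duration>s: <transcript>'."""
    return f"{clip_duration_seconds(example):.2f}s: {example[TEXT_COLUMN]}"


def load_train_split():
    """Download (on first call) and return the assumed 'train' split."""
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import load_dataset
    return load_dataset("keithito/lj_speech", split="train")
```

Calling `summarize(load_train_split()[0])` would then describe the first clip, once the download has finished.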

Features

Speaker diarization	No
Word-level timestamps	No
Streaming / real-time	No
Languages supported	1
HIPAA eligible	No

LJ Speech vs Whipscribe

Feature	LJ Speech	Whipscribe
Category	Open source	Transcription APIs
Pricing	free	free beta
Speaker diarization	No	Yes
Word timestamps	No	Yes
Streaming	No	No
Languages	1	99
Platforms	Web, HuggingFace	Web, API, MCP

Alternatives to LJ Speech

OpenAI Whisper

OpenAI

The reference open-source multilingual ASR model from OpenAI.

OSS · MIT ★ 98.1k

whisper.cpp

Georgi Gerganov

C/C++ port of Whisper — runs on anything, from a Raspberry Pi to Apple Silicon.

OSS · MIT ★ 48.8k

faster-whisper

SYSTRAN

Up to 4× faster than reference Whisper, using CTranslate2. A production sweet spot.

OSS · MIT ★ 22.3k

Whipscribe is a managed faster-whisper + whisperX service for getting transcripts from a URL or an audio file without running your own infrastructure.