Tortoise TTS

by neonbjb

Open-source neural TTS with strong prosody and voice cloning.

TL;DR

Best for researchers and self-hosters who care more about prosody quality than inference speed. Pricing: free (Apache-2.0).

Category: Open source
License: Apache-2.0
Pricing: free (Apache-2.0)
Platforms: Linux, macOS, Windows

What it is

Tortoise TTS is a single-author, Apache-2.0 licensed voice model known for high-quality prosody and voice cloning. Generation is significantly slower than VITS or XTTS: expect minutes per sample on CPU. Consent posture: the weights are open, so the operator is responsible for enforcing speaker consent for any cloned voice.

Best for: Researchers and self-hosters who care more about prosody quality than inference speed.
Watch out for: Slow inference even on GPU; English-only; cloning requires 3–5 reference clips.
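
The reference-clip requirement can be checked before any model is loaded. The following is a minimal sketch, assuming the clips sit as WAV files in one directory per voice; the helper name `pick_reference_clips`, the directory layout, and the choice to trim extra clips down to five are illustrative assumptions, not part of Tortoise itself.

```python
from pathlib import Path

# The 3-5 range comes from the listing above; treating it as a hard
# lower bound and a soft upper bound is an assumption of this sketch.
MIN_CLIPS = 3
MAX_CLIPS = 5


def pick_reference_clips(voice_dir, pattern="*.wav"):
    """Return between MIN_CLIPS and MAX_CLIPS clip paths from voice_dir.

    Raises ValueError when fewer than MIN_CLIPS files match, so the
    problem surfaces before a slow synthesis run starts.
    """
    clips = sorted(Path(voice_dir).glob(pattern))
    if len(clips) < MIN_CLIPS:
        raise ValueError(
            f"found {len(clips)} clip(s) in {voice_dir}; need at least {MIN_CLIPS}"
        )
    # Keep the first MAX_CLIPS files in sorted order and ignore the rest.
    return clips[:MAX_CLIPS]
```

Failing early matters here because a single generation can take minutes, so a missing clip is cheaper to catch at this stage than after synthesis has begun.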

Install / use

pip install tortoise-tts
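
Once the package is installed, a cloning run might look like the sketch below. It follows the usage pattern shown in the project's README (`TextToSpeech`, `tts_with_preset`, `load_audio`), but the wrapper function `synthesize`, its arguments, and the output file name are hypothetical, and the 22,050 Hz load rate and 24,000 Hz save rate are assumptions that should be checked against the installed version.

```python
from pathlib import Path

# Quality/speed presets as named in the Tortoise README; verify them
# against the version you installed.
PRESETS = ("ultra_fast", "fast", "standard", "high_quality")


def synthesize(text, clip_paths, out_path="generated.wav", preset="fast"):
    """Clone a voice from reference clips and write the speech to out_path."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset {preset!r}; expected one of {PRESETS}")

    # Deferred imports: these pull in PyTorch and the model code, which
    # is slow to load, so the preset check above runs first.
    import torchaudio
    from tortoise.api import TextToSpeech
    from tortoise.utils.audio import load_audio

    # Assumption: reference clips are loaded at 22,050 Hz, as in the README.
    clips = [load_audio(str(p), 22050) for p in clip_paths]

    tts = TextToSpeech()  # fetches model weights on first use
    audio = tts.tts_with_preset(text, voice_samples=clips, preset=preset)

    # Assumption: output audio is 24 kHz.
    torchaudio.save(str(out_path), audio.squeeze(0).cpu(), 24000)
    return Path(out_path)
```

A call such as `synthesize("Hello there.", ["a.wav", "b.wav", "c.wav"])` would then write `generated.wav`. Given the inference speed noted above, a faster preset is a reasonable starting point while experimenting.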

Features

Speaker diarization: No
Word-level timestamps: No
Streaming / real-time: No
Languages supported: 1
HIPAA eligible: No

Tortoise TTS vs Whipscribe

Feature | Tortoise TTS | Whipscribe
Category | Open source | Transcription APIs
Pricing | free (Apache-2.0) | free beta
Speaker diarization | No | Yes
Word timestamps | No | Yes
Streaming | No | No
Languages | 1 | 99
Platforms | Linux, macOS, Windows | Web, API, MCP

Alternatives to Tortoise TTS

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, you can submit a URL or upload a file and receive a transcript in seconds.