NVIDIA Mellotron

by NVIDIA

Multispeaker TTS with prosody control. A historical NVIDIA release.

TL;DR

Best for: a reference implementation of prosody-controlled TTS. Pricing: free.

Category: Open source
License: BSD-3-Clause
Pricing: free
Platforms: Linux

What it is

The reference code accompanying the Mellotron paper, released under the BSD-3-Clause license.

Best for: Reference for prosody-controlled TTS.
Watch out for: Superseded by Mixer-TTS, FastPitch++.

Install / use

Features

Speaker diarization: No
Word-level timestamps: Yes
Streaming / real-time: No
Languages supported: 1
HIPAA eligible: No

NVIDIA Mellotron vs Whipscribe

Feature              NVIDIA Mellotron   Whipscribe
Category             Open source        Transcription APIs
Pricing              free               free beta
Speaker diarization  No                 Yes
Word timestamps      Yes                Yes
Streaming            No                 No
Languages            1                  99
Platforms            Linux              Web, API, MCP

Alternatives to NVIDIA Mellotron

Whipscribe is a managed faster-whisper + whisperX service. It produces transcripts from a URL or an uploaded file without requiring you to run any infrastructure.