GPT-SoVITS

by RVC Boss

Few-shot voice cloning — companion to Whisper-cloned datasets.

TL;DR

Best for cloning a speaker from 30s of clean audio for downstream voice agents. Pricing: free.

Category: Open source
License: MIT
Pricing: free
Platforms: Linux, macOS, Windows

What it is

A few-shot voice cloning tool that combines a GPT-style content predictor with SoVITS. Released under the MIT license.

Best for: Cloning a speaker from 30s of clean audio for downstream voice agents.
Watch out for: Voice cloning is prone to misuse; make sure you have the speaker's consent.

Features

Speaker diarization: No
Word-level timestamps: Yes
Streaming / real-time: No
Languages supported: 8
HIPAA eligible: No

GPT-SoVITS vs Whipscribe

Feature              GPT-SoVITS             Whipscribe
Category             Open source            Transcription APIs
Pricing              free                   free beta
Speaker diarization  No                     Yes
Word timestamps      Yes                    Yes
Streaming            No                     No
Languages            8                      99
Platforms            Linux, macOS, Windows  Web, API, MCP

Alternatives to GPT-SoVITS

Whipscribe is a managed faster-whisper + whisperX service for getting transcripts without running your own infrastructure.