NVIDIA Speech AI

by NVIDIA

NVIDIA Speech Research: the group behind NeMo, Canary, and Parakeet, and the origin of Riva.

TL;DR

Best for production NeMo models (Canary 1B, Parakeet RNNT, FastConformer) on GPU. Pricing: free.

Category: Open source
License: Apache-2.0 (toolkit); model weights licensed per checkpoint
Pricing: free
Platforms: GitHub, NGC

What it is

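NVIDIA Speech AI (the NeMo team) releases Canary, Parakeet-TDT, FastConformer, and Riva: production-grade ASR, TTS, and speaker models. License: the toolkit is Apache-2.0; model checkpoints carry their own licenses (CC-BY-4.0 or the NVIDIA Open Model License, depending on the model).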

Best for: Production NeMo models (Canary 1B, Parakeet RNNT, FastConformer) on GPU.
Watch out for: The toolkit is Apache-2.0, but model checkpoints are licensed separately: CC-BY-4.0 (Canary) or NVIDIA Open Model License (Parakeet). Check the license of each model before deploying it.

Install / use

pip install "nemo-toolkit[asr]"
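
Once the toolkit is installed, transcription follows a load-then-transcribe pattern. The sketch below is illustrative only: it assumes a recent NeMo release, a placeholder audio file named sample.wav, and the checkpoint name nvidia/parakeet-tdt-0.6b-v2 (an assumption; substitute whichever ASR checkpoint you actually use). Depending on the NeMo version, transcribe() may return plain strings or result objects with a .text attribute, so the hypothetical helper to_text() accepts both.

```python
from typing import Any, List


def to_text(results: List[Any]) -> List[str]:
    """Normalize transcribe() output to plain strings.

    Accepts plain strings as well as result objects exposing `.text`.
    """
    return [r if isinstance(r, str) else r.text for r in results]


def transcribe_files(
    paths: List[str],
    model_name: str = "nvidia/parakeet-tdt-0.6b-v2",  # assumed checkpoint name
) -> List[str]:
    """Load a pretrained NeMo ASR model and transcribe the given audio files."""
    # Imported lazily so to_text() stays usable without NeMo installed.
    import nemo.collections.asr as nemo_asr

    model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)
    return to_text(model.transcribe(paths))


# Usage (downloads the checkpoint on first call; "sample.wav" is a placeholder):
#   for line in transcribe_files(["sample.wav"]):
#       print(line)
```

Loading the model is the expensive step, so a long-running service would typically load it once and reuse it across requests instead of calling from_pretrained() per file as this sketch does.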

Features

Speaker diarization: Yes
Word-level timestamps: Yes
Streaming / real-time: Yes
Languages supported: 25
HIPAA eligible: No
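
As a rough illustration of the word-level timestamps feature, the sketch below requests timestamps at transcription time. It assumes a recent NeMo release in which transcribe() accepts timestamps=True and each result exposes a timestamp dict whose "word" list holds entries with "word", "start", and "end" keys (in seconds); verify these names against the version you install. The checkpoint name and the format_words() helper are hypothetical choices for this example.

```python
from typing import Dict, List


def format_words(words: List[Dict]) -> List[str]:
    """Render word-level timestamp entries as 'start-end  word' lines."""
    return [f"{w['start']:.2f}-{w['end']:.2f}  {w['word']}" for w in words]


def word_timestamps(
    path: str,
    model_name: str = "nvidia/parakeet-tdt-0.6b-v2",  # assumed checkpoint name
) -> List[Dict]:
    """Transcribe one file and return its word-level timestamp entries."""
    # Imported lazily so format_words() stays usable without NeMo installed.
    import nemo.collections.asr as nemo_asr

    model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)
    # Assumption: timestamps=True is supported by the chosen model and release.
    result = model.transcribe([path], timestamps=True)
    return result[0].timestamp["word"]


# Usage ("sample.wav" is a placeholder path):
#   print("\n".join(format_words(word_timestamps("sample.wav"))))
```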

NVIDIA Speech AI vs Whipscribe

Feature | NVIDIA Speech AI | Whipscribe
Category | Open source | Transcription APIs
Pricing | free | free beta
Speaker diarization | Yes | Yes
Word timestamps | Yes | Yes
Streaming | Yes | No
Languages | 25 | 99
Platforms | GitHub, NGC | Web, API, MCP

Alternatives to NVIDIA Speech AI

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or upload a file and you'll have a transcript in seconds.