NVIDIA Speech AI
by NVIDIA
NVIDIA Speech Research — NeMo + Canary + Parakeet + Riva origins.
TL;DR
Best for production NeMo models (Canary 1B, Parakeet RNNT, FastConformer) on GPU. Pricing: free.
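| Field | Value |
|---|---|
| Category | Open source |
| License | Apache-2.0 (toolkit); model weights vary by checkpoint |
| Stars | — |
| Last push | — |
| Pricing | free |
| Platforms | GitHub, NGC |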
What it is
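NVIDIA Speech AI (the NeMo team) releases production-grade ASR, TTS, and speaker models, including Canary, Parakeet-TDT, and FastConformer, and is where the Riva speech models originate. License: the NeMo toolkit is Apache-2.0; model checkpoints carry their own licenses, CC-BY-4.0 (Canary) or NVIDIA Open Model License (Parakeet).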
Best for: Production NeMo models (Canary 1B, Parakeet RNNT, FastConformer) on GPU.
Watch out for: Apache-2.0 toolkit · model checkpoints CC-BY-4.0 (Canary) or NVIDIA Open Model License (Parakeet) — check per-model.
Install / use
pip install "nemo-toolkit[asr]"
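Once the toolkit is installed, a first transcription might look like the sketch below. It is a minimal example under stated assumptions, not an official recipe: the checkpoint name `nvidia/parakeet-tdt-0.6b-v2` and the helper names `collect_audio` and `transcribe` are illustrative choices, and the shape of the returned objects depends on the NeMo version (recent releases return hypothesis objects with a `.text` field, older ones return plain strings).

```python
from pathlib import Path

# Assumption: we only pass WAV/FLAC files to the model.
AUDIO_SUFFIXES = {".wav", ".flac"}


def collect_audio(paths):
    """Keep only paths with a .wav or .flac suffix, returned as strings."""
    return [str(p) for p in map(Path, paths) if p.suffix.lower() in AUDIO_SUFFIXES]


def transcribe(paths, model_name="nvidia/parakeet-tdt-0.6b-v2"):
    """Load a pretrained NeMo ASR checkpoint and transcribe the given files.

    The model name is an assumption; substitute whichever checkpoint
    (and license) fits your use case.
    """
    # Imported lazily so collect_audio works without NeMo installed.
    import nemo.collections.asr as nemo_asr

    model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)
    return model.transcribe(collect_audio(paths))
```

Calling `transcribe(["meeting.wav"])` downloads the checkpoint on first use and is intended to run on a GPU machine. Check the license on the specific checkpoint before shipping it.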
Features
| Feature | NVIDIA Speech AI |
|---|---|
| Speaker diarization | Yes |
| Word-level timestamps | Yes |
| Streaming / real-time | Yes |
| Languages supported | 25 |
| HIPAA eligible | No |
NVIDIA Speech AI vs Whipscribe
| Feature | NVIDIA Speech AI | Whipscribe |
|---|---|---|
| Category | Open source | Transcription APIs |
| Pricing | free | free beta |
| Speaker diarization | Yes | Yes |
| Word timestamps | Yes | Yes |
| Streaming | Yes | No |
| Languages | 25 | 99 |
| Platforms | GitHub, NGC | Web, API, MCP |
Alternatives to NVIDIA Speech AI
Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, you can submit a URL or upload a file and get a transcript back in seconds.