Deepgram
Deepgram is a real-time speech API built for voice agents and call analytics — Nova-3 ships low-latency streaming plus pre-recorded transcription, with diarization, PII redaction, and summarization wired into a single REST + WebSocket surface.
Nova-3 is Deepgram's current flagship model, available in nova-3-general (multilingual) and nova-3-medical. The platform exposes one REST endpoint for pre-recorded audio (/v1/listen) and a WebSocket for live streaming, with diarization, word-level timestamps, PII redaction, and summarization toggled by query parameters.
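Because every feature is a query parameter, a request is fully described by its URL. The sketch below builds such a URL with the standard library only. The helper name `build_listen_url` is hypothetical, and the parameter names are the ones that appear elsewhere on this page (`diarize`, `redact`, `summarize`).

```python
from urllib.parse import urlencode

BASE_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(model, **features):
    """Return the /v1/listen URL with each feature expressed as a query parameter."""
    params = {"model": model}
    for name, value in features.items():
        # Booleans are written as lowercase true/false, as in the curl examples on this page.
        params[name] = str(value).lower() if isinstance(value, bool) else value
    return f"{BASE_URL}?{urlencode(params)}"


url = build_listen_url("nova-3", diarize=True, redact="pii", summarize="v2")
print(url)
# https://api.deepgram.com/v1/listen?model=nova-3&diarize=true&redact=pii&summarize=v2
```

Keeping the feature set in one place like this makes it easy to turn a feature on or off without touching the code that sends the audio.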
Best for real-time voice agents, call-center analytics, live captioning, and meeting tools where p50 streaming latency is the product. New accounts get $200 in free credit with no card; pay-as-you-go pricing runs from $0.0048/min streaming and $0.0077/min pre-recorded on Nova-3 monolingual, with a separate per-minute rate for the Voice Agent API.
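To get a feel for what the free credit covers, here is a small arithmetic sketch using only the rates quoted above ($0.0048/min streaming, $0.0077/min pre-recorded, $200 credit). It ignores add-on features and the Voice Agent rate, which this page does not quantify, so treat it as a rough estimate rather than a bill.

```python
FREE_CREDIT_USD = 200.0
RATES_USD_PER_MIN = {"streaming": 0.0048, "pre-recorded": 0.0077}


def minutes_covered(credit_usd, rate_per_min):
    """Whole minutes of audio a credit balance pays for at a flat per-minute rate."""
    return int(credit_usd / rate_per_min)


def cost_usd(minutes, rate_per_min):
    """Flat per-minute cost, rounded to a hundredth of a cent."""
    return round(minutes * rate_per_min, 4)


for mode, rate in RATES_USD_PER_MIN.items():
    print(mode, minutes_covered(FREE_CREDIT_USD, rate), "minutes")
# streaming 41666 minutes
# pre-recorded 25974 minutes
```

At these rates the credit covers roughly 694 hours of streaming or 432 hours of pre-recorded audio.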
What it is
Deepgram's Nova family (Nova-3 is the current flagship) is among the strongest streaming ASR options on the market, with very low latency and good accuracy on conversational audio. It is HIPAA-eligible, and its per-minute pricing is competitive with self-hosting at modest volume. Last price check: 2026-04-20.
Watch out for: Lower language coverage than Whisper variants; proprietary.
Install / use
Where Deepgram fits · 6 use-cases
Deepgram's strengths cluster around streaming latency, conversational accuracy, and an API surface that bundles agents, ASR, and redaction. The six use cases below show where that combination fits best.
The Agent API wraps Nova-3 STT, an LLM step, and Deepgram TTS behind one WebSocket so you ship a conversational agent without stitching three providers. Drop-in for phone bots, IVR replacements, and in-product voice copilots.
Built-in turn-taking + barge-in
Batch call recordings through /v1/listen with diarize=true and redact=pii to get speaker-labeled, PII-scrubbed transcripts ready for QA scoring and topic mining. HIPAA-eligible on paid plans.
diarize + redact + summarize=v2
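For QA scoring you usually want speaker turns rather than a flat word list. The sketch below merges consecutive words from the same speaker into turns. It assumes each word object carries `word`, `start`, `end`, and a `speaker` index when diarization is on, and optionally `punctuated_word`; the sample data is hand-written for illustration, not real API output.

```python
def speaker_turns(words):
    """Merge consecutive words from the same speaker into turn dictionaries."""
    turns = []
    for w in words:
        # Prefer the punctuated form when present; fall back to the raw word.
        token = w.get("punctuated_word", w["word"])
        if turns and turns[-1]["speaker"] == w["speaker"]:
            turns[-1]["end"] = w["end"]
            turns[-1]["text"] += " " + token
        else:
            turns.append(
                {"speaker": w["speaker"], "start": w["start"], "end": w["end"], "text": token}
            )
    return turns


# Hand-written sample in the assumed word-object shape.
sample_words = [
    {"word": "hello", "punctuated_word": "Hello.", "start": 0.0, "end": 0.4, "speaker": 0},
    {"word": "hi", "punctuated_word": "Hi,", "start": 0.6, "end": 0.8, "speaker": 1},
    {"word": "thanks", "punctuated_word": "thanks.", "start": 0.8, "end": 1.2, "speaker": 1},
    {"word": "sure", "punctuated_word": "Sure.", "start": 1.5, "end": 1.9, "speaker": 0},
]

for turn in speaker_turns(sample_words):
    print(f"[{turn['start']:.1f}-{turn['end']:.1f}] speaker {turn['speaker']}: {turn['text']}")
```

Each turn keeps its start and end time, so talk-time per speaker is a simple sum over the result.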
Open a WebSocket to wss://api.deepgram.com/v1/listen and stream PCM; interim transcripts arrive word-by-word with timestamps. Common stack for webinar captions, live-event accessibility, and broadcast workflows.
Word-level timestamps inline
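A captioning client has to reconcile interim and final results: interim text gets overwritten, final text is kept. The sketch below shows one way to do that without any network code. It assumes each message has an `is_final` flag and the text at `channel.alternatives[0].transcript`; `CaptionBuffer` is a hypothetical helper, and the messages are hand-written.

```python
class CaptionBuffer:
    """Accumulates final segments and shows the latest interim text after them."""

    def __init__(self):
        self.final_segments = []
        self.pending = ""

    def feed(self, message):
        """Apply one transcript message and return the caption text to display."""
        text = message["channel"]["alternatives"][0]["transcript"]
        if message.get("is_final"):
            # A final result replaces whatever interim text was pending.
            if text:
                self.final_segments.append(text)
            self.pending = ""
        else:
            self.pending = text
        return self.text()

    def text(self):
        parts = self.final_segments + ([self.pending] if self.pending else [])
        return " ".join(parts)


def msg(transcript, is_final):
    """Build a hand-written message in the assumed shape."""
    return {"is_final": is_final, "channel": {"alternatives": [{"transcript": transcript}]}}


buf = CaptionBuffer()
for m in [msg("hello", False), msg("hello every", False), msg("Hello everyone.", True), msg("welcome", False)]:
    print(buf.feed(m))
```

In a real client, `feed` would be called from the WebSocket message handler with the parsed JSON payload.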
Send the episode URL or upload bytes; ask for diarize=true, punctuate=true, paragraphs=true, and summarize=v2 in one call to get a publishable transcript plus a model-generated recap.
Single request returns all artifacts
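Since one response carries every artifact, publishing is mostly a matter of picking fields out of the JSON. The sketch below does that defensively. The field locations (`results.summary.short` for the recap, `paragraphs.transcript` inside the first alternative) are assumptions about the response shape, so verify them against a real response before relying on them. The sample dictionary is hand-written.

```python
def extract_artifacts(response):
    """Pull the transcript, paragraph-formatted text, and summary from one response."""
    results = response["results"]
    alternative = results["channels"][0]["alternatives"][0]
    return {
        "transcript": alternative["transcript"],
        # Present only when paragraphs=true was requested (assumed location).
        "paragraphs": alternative.get("paragraphs", {}).get("transcript"),
        # Present only when summarize=v2 was requested (assumed location).
        "summary": results.get("summary", {}).get("short"),
    }


# Hand-written sample in the assumed response shape.
sample_response = {
    "results": {
        "channels": [
            {
                "alternatives": [
                    {
                        "transcript": "Welcome back. Today we talk about audio.",
                        "paragraphs": {"transcript": "\nWelcome back.\n\nToday we talk about audio."},
                    }
                ]
            }
        ],
        "summary": {"short": "The hosts introduce an episode about audio."},
    }
}

artifacts = extract_artifacts(sample_response)
print(artifacts["summary"])
```

Missing optional fields come back as `None` instead of raising, so the same function works whichever features were requested.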
nova-3-general handles 10 base languages plus regional variants under one model id — useful when you can't predict the input language, or when you need code-switching inside a single utterance.
Detect + transcribe in one pass
Set redact=pii (or fine-grained tags like numbers, ssn) and the transcript ships with sensitive spans replaced by typed placeholders like [PHONE_NUMBER_1] — raw audio is not retained when the no-store option is enabled on enterprise plans.
Healthcare model for clinical audio
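For the redaction use case, it can help to audit what was scrubbed. The sketch below counts distinct redacted entities per type by matching typed placeholders of the form `[PHONE_NUMBER_1]`. Only that one placeholder appears on this page; the `[SSN_1]` tag in the sample and the general `[TYPE_N]` pattern are assumptions made for illustration.

```python
import re
from collections import Counter

# Matches placeholders such as [PHONE_NUMBER_1]: an upper-case type, an underscore, an index.
PLACEHOLDER = re.compile(r"\[([A-Z][A-Z_]*)_(\d+)\]")


def redaction_counts(transcript):
    """Count distinct redacted entities per type; repeated placeholders count once."""
    distinct = set(PLACEHOLDER.findall(transcript))
    return Counter(kind for kind, _index in distinct)


# Hand-written sample; [SSN_1] is a hypothetical placeholder name.
sample = (
    "Call me at [PHONE_NUMBER_1] or [PHONE_NUMBER_2]. "
    "My social is [SSN_1]. Again, that is [PHONE_NUMBER_1]."
)
print(dict(redaction_counts(sample)))
```

A count of zero on a call that should contain sensitive data is a cheap signal that the redaction parameter was not applied.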
Quickstart · pick a language
Three working ways to transcribe a remote audio URL with Nova-3. Export your key as DEEPGRAM_API_KEY first — grab one from the Deepgram console (free $200 credit, no card).
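Official deepgram-sdk · transcribe any HTTPS audio URL with Nova-3. The snippet uses the PrerecordedOptions and listen.rest interface; if your installed SDK version does not export PrerecordedOptions, check the SDK's migration notes for the equivalent call.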
# pip install deepgram-sdk
import os
from deepgram import DeepgramClient, PrerecordedOptions
dg = DeepgramClient(os.environ["DEEPGRAM_API_KEY"])
source = {"url": "https://dpgr.am/spacewalk.wav"}
options = PrerecordedOptions(
    model="nova-3",
    smart_format=True,
    diarize=True,
    punctuate=True,
    summarize="v2",
)
resp = dg.listen.rest.v("1").transcribe_url(source, options)
print(resp.results.channels[0].alternatives[0].transcript)
Official @deepgram/sdk · same pre-recorded call from Node 18+.
// npm install @deepgram/sdk
import { createClient } from "@deepgram/sdk";
const dg = createClient(process.env.DEEPGRAM_API_KEY);
const { result, error } = await dg.listen.prerecorded.transcribeUrl(
  { url: "https://dpgr.am/spacewalk.wav" },
  {
    model: "nova-3",
    smart_format: true,
    diarize: true,
    punctuate: true,
    summarize: "v2",
  }
);
if (error) throw error;
console.log(result.results.channels[0].alternatives[0].transcript);
Plain HTTPS POST to /v1/listen · useful for shell pipelines and edge runtimes.
# pre-recorded URL
curl --request POST \
  --url 'https://api.deepgram.com/v1/listen?model=nova-3&smart_format=true&diarize=true&punctuate=true&summarize=v2' \
  --header "Authorization: Token $DEEPGRAM_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{"url":"https://dpgr.am/spacewalk.wav"}'
# or a local file
curl --request POST \
  --url 'https://api.deepgram.com/v1/listen?model=nova-3&smart_format=true' \
  --header "Authorization: Token $DEEPGRAM_API_KEY" \
  --header 'Content-Type: audio/wav' \
  --data-binary @call.wav
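The two curl calls differ only in body and Content-Type: a JSON object with a `url` field for remote audio, or raw bytes for a local file. The sketch below mirrors both with Python's standard library, for environments where installing an SDK is not an option. It only constructs the requests and sends nothing; the helper names are hypothetical, and the `test-key` fallback is a placeholder.

```python
import json
import os
from urllib.request import Request

LISTEN_URL = "https://api.deepgram.com/v1/listen?model=nova-3&smart_format=true"


def request_for_url(audio_url, api_key):
    """Build a POST whose JSON body points the API at a remote audio file."""
    return Request(
        LISTEN_URL,
        data=json.dumps({"url": audio_url}).encode("utf-8"),
        headers={"Authorization": f"Token {api_key}", "Content-Type": "application/json"},
        method="POST",
    )


def request_for_bytes(audio_bytes, api_key, content_type="audio/wav"):
    """Build a POST that uploads raw audio bytes, like curl --data-binary."""
    return Request(
        LISTEN_URL,
        data=audio_bytes,
        headers={"Authorization": f"Token {api_key}", "Content-Type": content_type},
        method="POST",
    )


api_key = os.environ.get("DEEPGRAM_API_KEY", "test-key")  # placeholder fallback
req = request_for_url("https://dpgr.am/spacewalk.wav", api_key)
print(req.get_method(), req.full_url)
# To send: urllib.request.urlopen(req), then json.load() the response body.
```

Separating request construction from sending keeps the authentication and body logic easy to unit-test without spending credit.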
Features
| Feature | Deepgram |
|---|---|
| Speaker diarization | Yes |
| Word-level timestamps | Yes |
| Streaming / real-time | Yes |
| Languages supported | 36 |
| HIPAA eligible | Yes |
Links
- developers.deepgram.com/docs: Documentation root with quickstarts, feature guides, and the API surface for /v1/listen, agent, and TTS.
- Models & languages overview: Nova-3 variants (general / medical), supported languages, and which features each model exposes.
- Voice Agent getting-started: WebSocket Agent API combining ASR, LLM, and TTS in one stream, with code in Python, JS, C#, and Go.
- Live streaming audio guide: Real-time transcription over WebSocket with interim partial results and word timestamps.
- deepgram/deepgram-python-sdk: Official Python SDK, v7.x line, with async and sync clients covering pre-recorded, streaming, and agent.
- deepgram/deepgram-js-sdk: Official JavaScript / TypeScript SDK for browser, Node 18+, and edge runtimes.
- deepgram/deepgram-go-sdk: Official Go SDK with the same feature surface as Python / JS.
- deepgram.com/pricing: Current per-minute rates for Nova-3 streaming and pre-recorded, plus the Voice Agent SKU and Growth plan discounts.
- status.deepgram.com: Live status for the Public, Batch, Streaming, TTS, and Voice Agent APIs; subscribe via email, Slack, or webhook.
- console.deepgram.com/signup: Free $200 credit, no card required; keys appear in the console immediately after signup.
Deepgram vs Whipscribe
| Feature | Deepgram | Whipscribe |
|---|---|---|
| Category | Transcription APIs | Transcription APIs |
| Pricing | from $0.0043/min (Nova pre-recorded, pay-as-you-go) | free beta |
| Speaker diarization | Yes | Yes |
| Word timestamps | Yes | Yes |
| Streaming | Yes | No |
| Languages | 36 | 99 |
| Platforms | API | Web, API, MCP |
Sources & dates for the comparison above
- diarization: “Diarization recognizes speaker changes and attributes speech to speakers.” — source (checked 2026-04-23)
- word timestamps: “Each word returned includes start and end times in seconds.” — source (checked 2026-04-23)
- streaming: “Deepgram's streaming API transcribes live audio in real time over WebSockets.” — source (checked 2026-04-23)
- pricing: “Nova model pre-recorded transcription from $0.0043 per minute (pay-as-you-go).” — source (checked 2026-04-23)
Alternatives to Deepgram
Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or drop a file and you'll have a transcript in seconds.