Rev.ai

by Rev.ai

Rev.ai is Rev's developer API arm — Reverb ASR plus real-time streaming, with the unique option to drop the same job down to human transcribers when machine-grade accuracy isn't enough.

TL;DR

Rev.ai exposes async transcription via POST /speechtotext/v1/jobs and real-time via a WebSocket streaming endpoint, both powered by the Reverb model family — Rev's in-house English ASR that was open-sourced in October 2024 (Apache-2.0 code, models on HuggingFace) and benchmarks competitively against Whisper Large-v3. Word-level timestamps, diarization, custom vocabulary, language identification, sentiment, topic extraction, and summarization are all wired into the same job submission.

Best for long-form English media, podcasts, and call recordings where accuracy matters more than headline price, plus any workflow that occasionally needs to escalate the same audio to human transcribers under one vendor. Current pricing: Reverb at $0.20/hr (~$0.0033/min), Reverb Turbo at $0.10/hr (~$0.0017/min), Reverb foreign language at $0.30/hr, Whisper Fusion / Whisper Large at $0.005/min, and human transcription at $1.99/min. New accounts get free credit equivalent to 5 hours of Reverb ASR.
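The per-hour and per-minute figures above are easier to compare once they share a unit. The sketch below is only arithmetic over the rates quoted in this section; the tier labels are illustrative dictionary keys, not API values.

```python
# Rates quoted in the paragraph above, normalized to USD per hour.
# The keys are hypothetical labels for this example, not Rev.ai API values.
RATES_PER_HOUR = {
    "reverb": 0.20,
    "reverb_turbo": 0.10,
    "reverb_foreign_language": 0.30,
    "whisper_fusion": 0.005 * 60,  # quoted as $0.005/min
    "human": 1.99 * 60,            # quoted as $1.99/min
}


def estimate_cost(minutes: float, tier: str) -> float:
    """Return the estimated USD cost for `minutes` of audio on `tier`."""
    return round(RATES_PER_HOUR[tier] / 60 * minutes, 4)


if __name__ == "__main__":
    for tier in RATES_PER_HOUR:
        print(f"{tier}: ${estimate_cost(90, tier):.4f} for a 90-minute episode")
```

At these rates a 90-minute episode costs about $0.30 on Reverb and about $179.10 with human transcription, which is why escalating only the jobs that need it matters.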

Category: Transcription APIs
Pricing: from $0.02/min
Platforms: API

What it is

Rev.ai is the developer API from Rev, historically a human-transcription service. The "Machine" endpoint is competitive on English and supports custom vocabulary that handles jargon better than generic Whisper. HIPAA-eligible on appropriate plans. Last price check: 2026-04-20.

Best for: English-heavy workloads where vocabulary customization matters (medical, legal, technical).
Watch out for: Pricier than Whisper-based APIs; non-English coverage narrower.

Install / use

View Rev.ai API docs ↗

Where Rev.ai fits · 6 use-cases

Rev.ai's pitch clusters around English ASR accuracy, the Reverb open model, and the optional human-grade escalation lane. Pick the card closest to your build — each links to the matching docs.rev.ai section.

Podcasts · long-form English
Async · Reverb model

Submit an episode URL to the async jobs endpoint and ask for diarization plus word timestamps. Reverb is tuned on Rev's 7M+ hours of human-verified speech, an advantage that shows most on long-form conversational English when compared with generic Whisper.

POST /speechtotext/v1/jobs
transcriber=reverb · diarize=true
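Once a diarized job finishes, the word-level elements usually need collapsing into speaker turns. The sketch below assumes the JSON transcript is a list of `monologues`, each with a `speaker` id and `elements` carrying `type`, `value`, `ts`, and `end_ts`. The element fields match the description quoted in the sources list on this page; the `monologues` and `speaker` names are assumptions to confirm against the API reference. `SAMPLE` is made-up data.

```python
from typing import Dict, List


def speaker_turns(transcript: Dict) -> List[Dict]:
    """Collapse each monologue into one turn with start/end times and text."""
    turns = []
    for mono in transcript.get("monologues", []):
        elements = mono.get("elements", [])
        # Only spoken-word elements are assumed to carry timestamps.
        timed = [el for el in elements if el.get("type") == "text" and "ts" in el]
        if not timed:
            continue
        turns.append({
            "speaker": mono.get("speaker"),
            "start": timed[0]["ts"],
            "end": timed[-1]["end_ts"],
            # Punctuation elements are assumed to include the spaces between words.
            "text": "".join(el["value"] for el in elements).strip(),
        })
    return turns


# Made-up sample in the assumed shape.
SAMPLE = {
    "monologues": [
        {"speaker": 0, "elements": [
            {"type": "text", "value": "Hello", "ts": 0.5, "end_ts": 0.9},
            {"type": "punct", "value": " "},
            {"type": "text", "value": "there", "ts": 1.0, "end_ts": 1.4},
            {"type": "punct", "value": "."},
        ]},
        {"speaker": 1, "elements": [
            {"type": "text", "value": "Hi", "ts": 2.0, "end_ts": 2.2},
            {"type": "punct", "value": "."},
        ]},
    ]
}

if __name__ == "__main__":
    for turn in speaker_turns(SAMPLE):
        print(f"[{turn['start']:.1f}-{turn['end']:.1f}] speaker {turn['speaker']}: {turn['text']}")
```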

Media · subtitles + chapters
Async · forced alignment

Use forced alignment to lock an existing transcript to word-level timestamps for SRT/VTT export, or run a fresh job with diarization for caption tracks plus speaker labels. Topic extraction can seed chapter markers.

Forced alignment endpoint
Word-accurate timing for caption files
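Turning word timings into a caption file is mostly formatting. The sketch below assumes you already have a flat list of word dictionaries with `value`, `ts`, and `end_ts` (seconds), and groups them into fixed-size SRT cues. The grouping rule (`max_words`) is an illustrative choice, not something Rev.ai prescribes.

```python
from typing import Dict, List


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm form used in SRT files."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Dict], max_words: int = 7) -> str:
    """Group timed words into numbered SRT cues of at most `max_words` words."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start, end = chunk[0]["ts"], chunk[-1]["end_ts"]
        text = " ".join(w["value"] for w in chunk)
        cues.append(
            f"{len(cues) + 1}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)
```

A production exporter would also break cues on sentence boundaries, long pauses, and speaker changes; fixed-size chunks keep the example short.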

Voice agents · streaming
WebSocket · real-time

Open a WebSocket to the streaming endpoint and pipe PCM frames; partial hypotheses arrive in real time for live captioning, IVR turn-taking, and in-product voice copilots. Custom vocabulary biases brand and jargon words.

Streaming Speech-to-Text
Real-time partials + final results
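On the client side, the core of live captioning is deciding what to show as partial and final results arrive. The sketch below covers only that step and does not open a WebSocket. It assumes each text frame is JSON with a `type` of `partial` or `final` and an `elements` list of `{type, value}` items; confirm the exact message schema in the streaming docs before relying on it. `LiveCaption` is a hypothetical helper name.

```python
import json
from typing import List, Optional


class LiveCaption:
    """Accumulates final segments and tracks the latest partial hypothesis."""

    def __init__(self) -> None:
        self.finals: List[str] = []
        self.partial: str = ""

    def feed(self, raw_message: str) -> Optional[str]:
        """Consume one JSON text frame; return the caption to display, or None."""
        msg = json.loads(raw_message)
        kind = msg.get("type")
        if kind not in ("partial", "final"):
            return None  # e.g. a connection acknowledgement
        elements = msg.get("elements", [])
        # Assumption: when punctuation elements are present they already carry
        # the spacing; otherwise words are joined with single spaces.
        has_punct = any(el.get("type") == "punct" for el in elements)
        text = ("" if has_punct else " ").join(el["value"] for el in elements).strip()
        if kind == "final":
            self.finals.append(text)
            self.partial = ""
        else:
            self.partial = text  # a newer partial replaces the previous one
        return " ".join(self.finals + ([self.partial] if self.partial else []))
```

Each message read from the socket would be passed to `feed()`, and any non-None return value rendered as the current caption line.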

LLM pre-processing
Async + Insights stack

Bundle the transcript with summarization, topic extraction, and sentiment in a single workflow so the LLM downstream sees structured input instead of raw text. Useful for meeting bots, RAG over call audio, and analytics rollups.

Summary + Topics + Sentiment APIs
Insights stack runs on the same job id

Compliance · legal records
Human option · HIPAA / SOC 2

When machine accuracy is not enough, Rev.ai can route the same job to its human transcription network for verbatim, certified output. Same API, same dashboard, same audit trail — useful for depositions, regulatory filings, and accessibility compliance.

Human transcription job type
$1.99/min · turnaround in hours

Multilingual fallback
Reverb FL + Whisper Fusion

For non-English audio, Rev.ai exposes a Reverb Foreign Language model plus Whisper Fusion / Whisper Large as transcriber options, with language identification as a pre-step when the input language is unknown. 57+ languages on the broader stack.

transcriber=reverb-fl | whisper-fusion
Language ID at $0.003/min seeds the choice
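One way to let a language-identification result seed the transcriber choice is a small routing function. Everything in this sketch is an assumption for illustration: the routing policy, the `foreign_model_languages` set you would have to fill in from the docs, and the transcriber strings, which are copied from the card above and not verified against the API's accepted values.

```python
from typing import Iterable


def pick_transcriber(language: str, foreign_model_languages: Iterable[str]) -> str:
    """Map a detected language code (e.g. "en-US", "es") to a transcriber label."""
    base = language.strip().lower().split("-")[0]
    if base == "en":
        return "reverb"  # English goes to the English model
    if base in {code.lower() for code in foreign_model_languages}:
        return "reverb-fl"  # languages you have confirmed the foreign-language model covers
    return "whisper-fusion"  # everything else falls back to the Whisper-based option
```

For example, `pick_transcriber("es", ["es", "fr"])` returns `"reverb-fl"`, while an uncovered language such as `"ja"` falls through to `"whisper-fusion"`.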

Pattern: Rev.ai is the right call when English accuracy and the option to escalate to humans under one vendor matter more than the lowest per-minute rate. If you want lower streaming latency, Deepgram is the closest comparison; if you want to skip the integration step entirely, drop a URL into Whipscribe below.

Quickstart · pick a runtime

Three working ways to submit a pre-recorded URL to the async jobs endpoint. Export your access token as REV_AI_API_KEY first — grab one from your Rev.ai dashboard (free credit equivalent to 5 hours of Reverb ASR on new accounts).

1. Python SDK · async URL

Official rev_ai SDK · submit a URL job and poll for the transcript.

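# pip install rev_ai
import os
import time

from rev_ai import apiclient

# Access token exported as REV_AI_API_KEY (see above)
client = apiclient.RevAiAPIClient(os.environ["REV_AI_API_KEY"])

# Submit a public media URL as an async job
job = client.submit_job_url(
    "https://www.rev.ai/FTC_Sample_1.mp3",
    metadata="podcast-001",
    # Value as used in this guide; confirm accepted values in the API reference
    transcriber="reverb",
)

# Poll every 5 seconds until the job reaches a terminal state
while True:
    details = client.get_job_details(job.id)
    if details.status.name in ("TRANSCRIBED", "FAILED"):
        break
    time.sleep(5)

# A failed job has no transcript to fetch, so stop here
if details.status.name == "FAILED":
    raise RuntimeError(f"Job {job.id} failed")

transcript = client.get_transcript_text(job.id)
print(transcript)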

2. Node / JavaScript SDK

Official revai-node-sdk · same async URL job from Node 18+.

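// npm install revai-node-sdk
// Run as an ES module so that top-level await works
import { RevAiApiClient } from "revai-node-sdk";

const client = new RevAiApiClient(process.env.REV_AI_API_KEY);

// Submit a public media URL as an async job
const job = await client.submitJobUrl(
  "https://www.rev.ai/FTC_Sample_1.mp3",
  { metadata: "podcast-001", transcriber: "reverb" }
);

// Poll every 5 seconds while the job is still running
let details;
do {
  await new Promise(r => setTimeout(r, 5000));
  details = await client.getJobDetails(job.id);
} while (details.status === "in_progress");

// Any terminal status other than "transcribed" means there is no transcript
if (details.status !== "transcribed") {
  throw new Error(`Job ${job.id} ended with status ${details.status}`);
}

const transcript = await client.getTranscriptText(job.id);
console.log(transcript);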

3. cURL · no SDK

Plain HTTPS POST to /speechtotext/v1/jobs · useful for shell pipelines and edge runtimes.

# submit a URL job
curl --request POST \
  --url 'https://api.rev.ai/speechtotext/v1/jobs' \
  --header "Authorization: Bearer $REV_AI_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "source_config": {"url": "https://www.rev.ai/FTC_Sample_1.mp3"},
    "metadata": "podcast-001",
    "transcriber": "reverb"
  }'

# poll job status
curl --request GET \
  --url "https://api.rev.ai/speechtotext/v1/jobs/<JOB_ID>" \
  --header "Authorization: Bearer $REV_AI_API_KEY"

# fetch the transcript when status=transcribed
curl --request GET \
  --url "https://api.rev.ai/speechtotext/v1/jobs/<JOB_ID>/transcript" \
  --header "Authorization: Bearer $REV_AI_API_KEY" \
  --header 'Accept: text/plain'
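The quickstarts above poll at a fixed interval with no upper bound on how long they wait. A hedged alternative is a small helper with exponential backoff and a timeout. In this sketch `get_status` is a hypothetical callable you supply (for example, a wrapper around the job-status request) that returns the status string; the status names are the ones used in the snippets above.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"transcribed", "failed"}


def wait_for_job(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,
    first_delay_s: float = 2.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll `get_status` with exponential backoff until a terminal status or timeout."""
    delay = first_delay_s
    waited = 0.0
    while True:
        status = get_status().lower()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"job still '{status}' after {waited:.0f}s")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay_s)  # back off, capped at max_delay_s
```

The `sleep` parameter exists so the helper can be exercised without real waiting; in normal use the default `time.sleep` applies.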

Features

Speaker diarization: Yes
Word-level timestamps: Yes
Streaming / real-time: Yes
Languages supported: 36
HIPAA eligible: Yes

Rev.ai vs Whipscribe

Feature               Rev.ai               Whipscribe
Category              Transcription APIs   Transcription APIs
Pricing               from $0.02/min       free beta
Speaker diarization   Yes                  Yes
Word timestamps       Yes                  Yes
Streaming             Yes                  No
Languages             36                   99
Platforms             API                  Web, API, MCP

Sources & dates for the comparison above
  1. diarization: “Diarization groups words by speaker in the response.” source (checked 2026-04-23)
  2. word timestamps: “Each element has a type, value, timestamp ts and end_ts in seconds.” source (checked 2026-04-23)
  3. streaming: “Rev AI's Streaming API accepts live audio over WebSockets.” source (checked 2026-04-23)
  4. pricing: “Asynchronous speech-to-text starts at $0.02 per minute.” source (checked 2026-04-23)

Alternatives to Rev.ai

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or drop a file in the form below — you'll have a transcript in seconds.