Rev.ai

by Rev.ai

Rev.ai is Rev's developer API arm: Reverb ASR plus real-time streaming, with the unique option to escalate the same job to human transcribers when machine-grade accuracy isn't enough.

TL;DR

Rev.ai exposes async transcription via POST /speechtotext/v1/jobs and real-time via a WebSocket streaming endpoint, both powered by the Reverb model family — Rev's in-house English ASR that was open-sourced in October 2024 (Apache-2.0 code, models on HuggingFace) and benchmarks competitively against Whisper Large-v3. Word-level timestamps, diarization, custom vocabulary, language identification, sentiment, topic extraction, and summarization are all wired into the same job submission.

Best for long-form English media, podcasts, and call recordings where accuracy matters more than headline price, plus any workflow that occasionally needs to escalate the same audio to human transcribers under one vendor. Current pricing: Reverb at $0.20/hr (~$0.0033/min), Reverb Turbo at $0.10/hr (~$0.0017/min), Reverb foreign language at $0.30/hr, Whisper Fusion / Whisper Large at $0.005/min, and human transcription at $1.99/min. New accounts get free credit equivalent to 5 hours of Reverb ASR.
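
The mix of per-hour and per-minute rates above is easier to compare on one scale. Here is a minimal sketch, assuming the prices quoted in this section are still current. The dictionary keys and the estimate_cost helper are illustrative names, not Rev.ai identifiers, and applying the free credit only to the Reverb tier is a simplifying assumption.

```python
# Prices copied from the paragraph above, normalized to USD per minute.
# The keys are illustrative labels, not official API identifiers.
PRICE_PER_MIN = {
    "reverb": 0.20 / 60,
    "reverb_turbo": 0.10 / 60,
    "reverb_foreign": 0.30 / 60,
    "whisper_fusion": 0.005,
    "human": 1.99,
}

# New-account credit, expressed as Reverb minutes (5 hours).
FREE_REVERB_MINUTES = 5 * 60


def estimate_cost(minutes: float, tier: str, use_free_credit: bool = False) -> float:
    """Return the estimated USD cost of transcribing `minutes` of audio."""
    billable = minutes
    # Assumption: the free credit is only subtracted for the Reverb tier.
    if use_free_credit and tier == "reverb":
        billable = max(0.0, minutes - FREE_REVERB_MINUTES)
    return round(billable * PRICE_PER_MIN[tier], 4)


if __name__ == "__main__":
    for tier in PRICE_PER_MIN:
        print(f"{tier:>15}: ${estimate_cost(600, tier):.2f} for 10 hours")
```

At these rates the gap between machine and human transcription is roughly three orders of magnitude per minute, which is why escalation is usually reserved for a small share of the audio.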

What it is

Rev.ai is the developer API from Rev, historically a human-transcription service. The "Machine" endpoint is competitive on English and supports custom vocabulary, which handles jargon better than generic Whisper. HIPAA eligibility is available on appropriate plans. Last price check: 2026-04-20.

Best for: English-heavy workloads where vocabulary customization matters (medical, legal, technical).
Watch out for: Pricier than Whisper-based APIs; non-English coverage narrower.

Install / use

POST https://api.rev.ai/speechtotext/v1/jobs

View Rev.ai API docs ↗

Where Rev.ai fits · 6 use-cases

Rev.ai's pitch clusters around English ASR accuracy, the Reverb open model, and the optional human-grade escalation lane. Pick the card closest to your build — each links to the matching docs.rev.ai section.

Podcasts · long-form English

Async · Reverb model

Submit an episode URL to the async jobs endpoint and ask for diarization plus word timestamps. Reverb is tuned on Rev's 7M+ hours of human-verified speech, and that training shows up as better accuracy on long-form conversational English than generic Whisper.

POST /speechtotext/v1/jobs
transcriber=reverb · diarize=true
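
Once a diarized job finishes, the JSON transcript has to be flattened into readable speaker turns. The sketch below assumes the transcript is a dict with a monologues list, each holding a speaker index and elements with type, value, and ts fields. speaker_turns is a hypothetical helper and the sample data is made up.

```python
from typing import Dict, List, Tuple


def speaker_turns(transcript: Dict) -> List[Tuple[int, float, str]]:
    """Flatten a diarized transcript into (speaker, start_seconds, text) turns."""
    turns = []
    for monologue in transcript.get("monologues", []):
        elements = monologue.get("elements", [])
        # Text and punctuation elements concatenate back into readable prose.
        text = "".join(el["value"] for el in elements).strip()
        # The first element carrying a timestamp marks where the turn starts.
        start = next((el["ts"] for el in elements if "ts" in el), 0.0)
        if text:
            turns.append((monologue.get("speaker", 0), start, text))
    return turns


if __name__ == "__main__":
    # Made-up sample in the assumed transcript shape.
    sample = {
        "monologues": [
            {"speaker": 0, "elements": [
                {"type": "text", "value": "Welcome", "ts": 0.4, "end_ts": 0.9},
                {"type": "punct", "value": " "},
                {"type": "text", "value": "back", "ts": 1.0, "end_ts": 1.3},
                {"type": "punct", "value": "."},
            ]},
            {"speaker": 1, "elements": [
                {"type": "text", "value": "Thanks", "ts": 2.1, "end_ts": 2.6},
                {"type": "punct", "value": "."},
            ]},
        ]
    }
    for speaker, start, text in speaker_turns(sample):
        print(f"[{start:7.2f}s] Speaker {speaker}: {text}")
```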

Media · subtitles + chapters

Async · forced alignment

Use forced alignment to lock an existing transcript to word-level timestamps for SRT/VTT export, or run a fresh job with diarization for caption tracks plus speaker labels. Topic extraction can seed chapter markers.

Forced alignment endpoint
Word-accurate timing for caption files
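
Turning word-level timestamps into a caption file is mostly bookkeeping. The sketch below assumes each word is a dict with type, value, ts, and end_ts in seconds, and groups words into fixed-size SRT cues. words_to_srt and the seven-word cue size are illustrative choices; production captioning would also split on pauses and line length.

```python
from typing import Dict, Iterable, List


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words: Iterable[Dict], max_words: int = 7) -> str:
    """Group timed words into numbered SRT cues of at most `max_words` words."""
    # Punctuation elements carry no timing, so only text elements are kept.
    timed = [w for w in words if w.get("type") == "text"]
    cues: List[str] = []
    for index, start in enumerate(range(0, len(timed), max_words), 1):
        chunk = timed[start:start + max_words]
        line = " ".join(w["value"] for w in chunk)
        span = f"{srt_timestamp(chunk[0]['ts'])} --> {srt_timestamp(chunk[-1]['end_ts'])}"
        cues.append(f"{index}\n{span}\n{line}\n")
    # A blank line separates consecutive cues.
    return "\n".join(cues)
```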

Voice agents · streaming

WebSocket · real-time

Open a WebSocket to the streaming endpoint and pipe PCM frames; partial hypotheses arrive in real time for live captioning, IVR turn-taking, and in-product voice copilots. Custom vocabulary biases brand and jargon words.

Streaming Speech-to-Text
Real-time partials + final results
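
On the client side, the main work in a streaming integration is reconciling revisable partials with committed finals. The sketch below leaves out the WebSocket itself and only handles messages. It assumes each message is JSON with a type of "partial" or "final" and an elements list whose entries carry a value, so check the message shape against the streaming docs. CaptionBuffer is a hypothetical class.

```python
import json
from typing import List


class CaptionBuffer:
    """Keeps committed text plus the latest, still-revisable partial hypothesis."""

    def __init__(self) -> None:
        self.final_segments: List[str] = []
        self.partial = ""

    @staticmethod
    def _text(elements: List[dict]) -> str:
        values = [el.get("value", "") for el in elements]
        # Assumed: finals carry punctuation/space elements, partials carry bare words.
        has_punct = any(el.get("type") == "punct" for el in elements)
        return ("" if has_punct else " ").join(values).strip()

    def feed(self, raw_message: str) -> str:
        """Apply one JSON message and return the caption line to display."""
        message = json.loads(raw_message)
        text = self._text(message.get("elements", []))
        if message.get("type") == "final":
            self.final_segments.append(text)
            self.partial = ""  # the final supersedes any earlier partial
        elif message.get("type") == "partial":
            self.partial = text  # each partial replaces the previous one
        # Any other message type leaves the caption unchanged.
        return " ".join(self.final_segments + ([self.partial] if self.partial else []))
```

Call feed() on every text frame received from the socket and render its return value. Because partials overwrite each other, the display never accumulates stale guesses.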

LLM pre-processing

Async + Insights stack

Bundle the transcript with summarization, topic extraction, and sentiment in a single workflow so the LLM downstream sees structured input instead of raw text. Useful for meeting bots, RAG over call audio, and analytics rollups.

Summary + Topics + Sentiment APIs
Insights stack runs on the same job id
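
"Structured input instead of raw text" can be as simple as one JSON document per job. The sketch below assumes the transcript text, summary, topics, and sentiment have already been fetched. build_llm_context and every field name in the payload are hypothetical, chosen for illustration rather than taken from a Rev.ai schema.

```python
import json
from typing import Dict, List


def build_llm_context(job_id: str, transcript: str, summary: str,
                      topics: List[str], sentiment: Dict[str, float],
                      max_chars: int = 4000) -> str:
    """Pack transcript plus insights into one JSON document for a prompt."""
    payload = {
        "job_id": job_id,
        "summary": summary,
        "topics": sorted(set(topics)),  # de-duplicate and order deterministically
        "sentiment": sentiment,
        # Truncate the raw text so the structured fields always fit the budget.
        "transcript_excerpt": transcript[:max_chars],
        "transcript_truncated": len(transcript) > max_chars,
    }
    return json.dumps(payload, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(build_llm_context(
        job_id="example-job",
        transcript="Thanks for calling. How can I help today?",
        summary="Customer support greeting.",
        topics=["support", "greeting", "support"],
        sentiment={"positive": 0.7, "neutral": 0.3, "negative": 0.0},
    ))
```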

Compliance · legal records

Human option · HIPAA / SOC 2

When machine accuracy is not enough, Rev.ai can route the same job to its human transcription network for verbatim, certified output. Same API, same dashboard, same audit trail — useful for depositions, regulatory filings, and accessibility compliance.

Human transcription job type
$1.99/min · turnaround in hours
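
One way to decide when machine accuracy "is not enough" is to look at per-word confidence in the machine transcript before paying the human rate. The sketch below assumes text elements expose a confidence score between 0 and 1. The thresholds and the should_escalate helper are hypothetical and should be tuned against audio you have already had human-checked.

```python
from typing import Dict


def low_confidence_fraction(transcript: Dict, threshold: float = 0.8) -> float:
    """Share of scored words whose confidence falls below `threshold`."""
    scores = [
        el["confidence"]
        for monologue in transcript.get("monologues", [])
        for el in monologue.get("elements", [])
        if el.get("type") == "text" and "confidence" in el
    ]
    if not scores:
        return 0.0
    return sum(1 for score in scores if score < threshold) / len(scores)


def should_escalate(transcript: Dict, threshold: float = 0.8,
                    max_low_fraction: float = 0.15) -> bool:
    """True when too many words look uncertain to trust the machine pass."""
    return low_confidence_fraction(transcript, threshold) > max_low_fraction
```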

Multilingual fallback

Reverb FL + Whisper Fusion

For non-English audio, Rev.ai exposes a Reverb Foreign Language model plus Whisper Fusion / Whisper Large as transcriber options, with language identification as a pre-step when the input language is unknown. 57+ languages on the broader stack.

transcriber=reverb-fl | whisper-fusion
Language ID at $0.003/min seeds the choice
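
The "language ID seeds the choice" step amounts to a small routing function. The sketch below reuses the transcriber values shown on this page (reverb, reverb-fl, whisper-fusion). pick_transcriber is a hypothetical helper, the set of languages handled by the foreign-language model must be supplied by the caller from the official language list, and treating Whisper Fusion as the broadest fallback is an assumption.

```python
from typing import Optional, Set


def pick_transcriber(language: Optional[str], confidence: float,
                     reverb_fl_languages: Set[str],
                     min_confidence: float = 0.6) -> str:
    """Map a language-ID result to one of the transcriber values named above."""
    if language is None or confidence < min_confidence:
        # Unknown or shaky detection: fall back to the assumed broadest model.
        return "whisper-fusion"
    code = language.lower().split("-")[0]  # "en-US" -> "en"
    if code == "en":
        return "reverb"
    if code in reverb_fl_languages:
        return "reverb-fl"
    return "whisper-fusion"


if __name__ == "__main__":
    # Placeholder set for the demo only; not the real supported-language list.
    demo_fl = {"es"}
    for lang, conf in [("en-US", 0.95), ("es", 0.9), ("ja", 0.9), (None, 0.0)]:
        print(lang, "->", pick_transcriber(lang, conf, demo_fl))
```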

Pattern: Rev.ai is the right call when English accuracy and the option to escalate to humans under one vendor matter more than the lowest per-minute rate. If you want lower streaming latency, Deepgram is the closest comparison; if you want to skip the integration step entirely, drop a URL into Whipscribe below.

Quickstart · pick a runtime

Three working ways to submit a pre-recorded URL to the async jobs endpoint. Export your access token as REV_AI_API_KEY first — grab one from your Rev.ai dashboard (free credit equivalent to 5 hours of Reverb ASR on new accounts).

1. Python SDK · async URL

Official rev_ai SDK · submit a URL job and poll for the transcript.

# pip install rev_ai
import os, time
from rev_ai import apiclient

client = apiclient.RevAiAPIClient(os.environ["REV_AI_API_KEY"])

job = client.submit_job_url(
    "https://www.rev.ai/FTC_Sample_1.mp3",
    metadata="podcast-001",
    transcriber="reverb",
)

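# Poll until the job reaches a terminal state.
while True:
    details = client.get_job_details(job.id)
    if details.status.name in ("TRANSCRIBED", "FAILED"):
        break
    time.sleep(5)

# Only fetch the transcript if the job actually succeeded.
if details.status.name == "FAILED":
    reason = getattr(details, "failure_detail", None)
    raise RuntimeError(f"Job {job.id} failed: {reason}")

transcript = client.get_transcript_text(job.id)
print(transcript)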

Source: revdotcom/revai-python-sdk ↗ · full options at docs.rev.ai/api/asynchronous ↗
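
A fixed five-second poll with no upper bound works for a demo, but long files benefit from backoff and a deadline. The sketch below is independent of the SDK: it takes any callable that returns the current status string. wait_for_job and its parameters are hypothetical names, and the terminal statuses are the ones used in the examples on this page.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"transcribed", "failed"}


def wait_for_job(get_status: Callable[[], str], interval: float = 5.0,
                 timeout: float = 3600.0, backoff: float = 1.5,
                 max_interval: float = 60.0,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> str:
    """Poll `get_status` with growing delays until a terminal status or timeout."""
    deadline = clock() + timeout
    while True:
        status = get_status().lower()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout} seconds")
        sleep(interval)
        # Grow the delay so long jobs do not hammer the status endpoint.
        interval = min(interval * backoff, max_interval)
```

With the SDK client from the example above, the callable could be lambda: client.get_job_details(job.id).status.name. The sleep and clock parameters exist so the loop can be tested without real waiting.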

2. Node / JavaScript SDK

Official revai-node-sdk · same async URL job from Node 18+.

// npm install revai-node-sdk
import { RevAiApiClient } from "revai-node-sdk";

const client = new RevAiApiClient(process.env.REV_AI_API_KEY);

const job = await client.submitJobUrl(
  "https://www.rev.ai/FTC_Sample_1.mp3",
  { metadata: "podcast-001", transcriber: "reverb" }
);

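// Poll until the job leaves the in_progress state.
let details;
do {
  await new Promise(r => setTimeout(r, 5000));
  details = await client.getJobDetails(job.id);
} while (details.status === "in_progress");

// Only fetch the transcript if the job actually succeeded.
if (details.status !== "transcribed") {
  throw new Error(`Job ${job.id} ended with status ${details.status}`);
}

const transcript = await client.getTranscriptText(job.id);
console.log(transcript);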

Source: revdotcom/revai-node-sdk ↗

3. cURL · no SDK

Plain HTTPS POST to /speechtotext/v1/jobs · useful for shell pipelines and edge runtimes.

# submit a URL job
curl --request POST \
  --url 'https://api.rev.ai/speechtotext/v1/jobs' \
  --header "Authorization: Bearer $REV_AI_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "source_config": {"url": "https://www.rev.ai/FTC_Sample_1.mp3"},
    "metadata": "podcast-001",
    "transcriber": "reverb"
  }'

# poll job status
curl --request GET \
  --url "https://api.rev.ai/speechtotext/v1/jobs/<JOB_ID>" \
  --header "Authorization: Bearer $REV_AI_API_KEY"

# fetch the transcript when status=transcribed
curl --request GET \
  --url "https://api.rev.ai/speechtotext/v1/jobs/<JOB_ID>/transcript" \
  --header "Authorization: Bearer $REV_AI_API_KEY" \
  --header 'Accept: text/plain'

Source: docs.rev.ai/api/asynchronous ↗

Features

Speaker diarization	Yes
Word-level timestamps	Yes
Streaming / real-time	Yes
Languages supported	36
HIPAA eligible	Yes

Rev.ai vs Whipscribe

Feature	Rev.ai	Whipscribe
Category	Transcription APIs	Transcription APIs
Pricing	from $0.02/min	free beta
Speaker diarization	Yes	Yes
Word timestamps	Yes	Yes
Streaming	Yes	No
Languages	36	99
Platforms	API	Web, API, MCP

Sources & dates for the comparison above

diarization: “Diarization groups words by speaker in the response.” — source (checked 2026-04-23)
word timestamps: “Each element has a type, value, timestamp ts and end_ts in seconds.” — source (checked 2026-04-23)
streaming: “Rev AI's Streaming API accepts live audio over WebSockets.” — source (checked 2026-04-23)
pricing: “Asynchronous speech-to-text starts at $0.02 per minute.” — source (checked 2026-04-23)

Alternatives to Rev.ai

OpenAI Whisper API

OpenAI

Hosted Whisper from OpenAI (the whisper-1 model, based on Whisper large-v2), $0.006 per minute.

$0.006/min

AssemblyAI

Universal-2 model + diarization, PII redaction, topic detection, summarization.

from $0.37/hr

Deepgram

Nova-2 model, excellent streaming, strong at conversational audio.

from $0.0043/min

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or drop a file in the form below — you'll have a transcript in seconds.