Google Chirp / Chirp 2

by Google Cloud

Google's universal speech foundation model exposed via Speech-to-Text v2.

TL;DR

Best for teams that need Google's strongest universal multilingual model and can tolerate region constraints. Pricing follows Speech-to-Text v2 rates, which are tiered by region.

What it is

Chirp is Google's universal speech model, exposed inside Google Cloud Speech-to-Text v2. Chirp 2 is the second generation, trained on additional multilingual data, with improved accuracy, especially for code-switched and accented speech. Chirp models support around 100 languages, but feature parity with the legacy 'phone_call', 'video', and 'long' models varies: certain diarization and adaptation features are tied to specific model selections. To use a Chirp model, call the v2 API with a recognizer configured with the chirp or chirp_2 model identifier.

Best for: Teams that need Google's strongest universal multilingual model and can tolerate region constraints.
Watch out for: Region availability is limited, and some features (diarization, model adaptation) are not available on all Chirp variants.

Install / use

REST: v2 recognizer with model=chirp_2
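
As a rough illustration of the REST line above, the sketch below builds (but does not send) a v2 recognize request that selects chirp_2. Treat the details as assumptions to check against the current Speech-to-Text v2 reference: the regional host pattern, the `_` placeholder recognizer, and the `autoDecodingConfig` / `languageCodes` field names reflect one reading of the v2 REST API, and `build_chirp_request`, the project ID, the region, and the token are hypothetical names for this example.

```python
import base64
import json
import urllib.request


def build_chirp_request(project_id, region, audio_bytes, access_token,
                        model="chirp_2", language_code="en-US"):
    """Build (without sending) a Speech-to-Text v2 recognize request."""
    # Assumption: Chirp is served from a regional endpoint, and "_" selects
    # an implicit recognizer configured entirely by the request body.
    url = (
        f"https://{region}-speech.googleapis.com/v2/"
        f"projects/{project_id}/locations/{region}/recognizers/_:recognize"
    )
    body = {
        "config": {
            # Let the service detect the audio encoding (assumed field name).
            "autoDecodingConfig": {},
            "languageCodes": [language_code],
            # "chirp" or "chirp_2", per the model identifiers named above.
            "model": model,
        },
        # Inline audio is sent as base64 text inside the JSON body.
        "content": base64.b64encode(audio_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_chirp_request("my-project", "us-central1", b"fake-audio", "TOKEN")
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately makes it easy to inspect the URL and body first, which helps when a region does not serve the chosen model.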

Features

Speaker diarization	No
Word-level timestamps	Yes
Streaming / real-time	Yes
Languages supported	100
HIPAA eligible	Yes

Google Chirp / Chirp 2 vs Whipscribe

Feature	Google Chirp / Chirp 2	Whipscribe
Category	Transcription APIs	Transcription APIs
Pricing	per Speech-to-Text v2 pricing (region-tiered)	free beta
Speaker diarization	No	Yes
Word timestamps	Yes	Yes
Streaming	Yes	No
Languages	100	99
Platforms	API	Web, API, MCP

Alternatives to Google Chirp / Chirp 2

OpenAI Whisper API

OpenAI

Hosted Whisper large-v2 (the whisper-1 model) from OpenAI, billed at $0.006 per minute.

$0.006/min

AssemblyAI

Universal-2 model + diarization, PII redaction, topic detection, summarization.

from $0.37/hr

Deepgram

Nova-2 model, excellent streaming, strong at conversational audio.

from $0.0043/min
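
The alternatives above quote prices in different units (per minute and per hour), which makes them awkward to compare at a glance. The sketch below normalizes the listed figures to a common unit. It uses only the prices quoted on this page, treats the "from" prices as flat rates (an assumption, since real plans may have tiers or minimums), and the helper names are hypothetical.

```python
# List prices as quoted above: (amount in USD, billing unit).
LISTED_PRICES = {
    "OpenAI Whisper API": (0.006, "min"),
    "AssemblyAI": (0.37, "hr"),
    "Deepgram": (0.0043, "min"),
}

MINUTES_PER_UNIT = {"min": 1, "hr": 60}


def usd_per_hour(amount, unit):
    """Convert a listed price to USD per hour of audio."""
    return amount * 60 / MINUTES_PER_UNIT[unit]


def cost_for_minutes(amount, unit, minutes):
    """Estimate the cost of transcribing `minutes` of audio at a flat rate."""
    return amount / MINUTES_PER_UNIT[unit] * minutes


hourly = {name: usd_per_hour(*price) for name, price in LISTED_PRICES.items()}
```

On these figures the three services come out at roughly $0.36, $0.37, and $0.26 per hour of audio respectively. Chirp is left out because this page gives no flat rate for it.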

Whipscribe is a managed faster-whisper + whisperX service for teams that want transcripts without running their own infrastructure.