Google Chirp / Chirp 2

by Google Cloud

Google's universal speech foundation model exposed via Speech-to-Text v2.

TL;DR

Best for teams that need Google's strongest universal multilingual model and can tolerate region constraints. Pricing: billed at Speech-to-Text v2 rates (region-tiered).

Category: Transcription APIs
Pricing: per Speech-to-Text v2 pricing (region-tiered)
Platforms: API

What it is

Chirp is Google's universal speech model, exposed inside Google Cloud Speech-to-Text v2. Chirp 2 is the second generation, trained on additional multilingual data and offering improved accuracy, especially for code-switched and accented speech. Chirp models support around 100 languages, but feature parity with the legacy 'phone_call', 'video', and 'long' models varies: for example, certain diarization and adaptation features are tied to specific model selections. Chirp models are accessed through the v2 API by configuring a recognizer with the chirp or chirp_2 model identifier.

Best for: Teams that need Google's strongest universal multilingual model and can tolerate region constraints.
Watch out for: Region availability is limited; some features (diarization, model adaptation) are not available on all Chirp variants.

Install / use

REST: v2 recognizer with model=chirp_2
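
As a rough illustration of the REST call above, the sketch below only builds the URL and JSON body for a v2 `recognize` request and sends nothing over the network. It rests on several assumptions that should be checked against the current Speech-to-Text v2 reference: the regional endpoint pattern `{location}-speech.googleapis.com`, the implicit recognizer `_`, and the camelCase field names (`autoDecodingConfig`, `languageCodes`, `enableWordTimeOffsets`). The function name `build_recognize_request`, the project ID, and the `us-central1` region are hypothetical placeholders; Chirp availability in any given region is not guaranteed.

```python
import base64
import json


def build_recognize_request(project_id, location, audio_bytes,
                            model="chirp_2", language_codes=("en-US",),
                            word_timestamps=False):
    """Return (url, body) for a Speech-to-Text v2 recognize call.

    Hypothetical helper: it performs no network I/O. Send the result with
    any HTTP client and an OAuth 2.0 bearer token.
    """
    # Assumption: regional endpoint plus the implicit recognizer "_".
    url = (
        f"https://{location}-speech.googleapis.com/v2/"
        f"projects/{project_id}/locations/{location}/recognizers/_:recognize"
    )
    config = {
        "autoDecodingConfig": {},          # let the service detect the audio encoding
        "languageCodes": list(language_codes),
        "model": model,                    # "chirp" or "chirp_2"
    }
    if word_timestamps:
        # Assumption: field name as in the v2 RecognitionFeatures message.
        config["features"] = {"enableWordTimeOffsets": True}
    body = {
        "config": config,
        # Inline audio is sent base64-encoded in the JSON body.
        "content": base64.b64encode(audio_bytes).decode("ascii"),
    }
    return url, body


if __name__ == "__main__":
    url, body = build_recognize_request(
        "my-project", "us-central1", b"\x00\x01", word_timestamps=True
    )
    print(url)
    print(json.dumps(body, indent=2))
```

Keeping request construction separate from the HTTP call makes it easy to switch `model` between `chirp` and `chirp_2`, or to change `location`, without touching authentication code.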

Features

Speaker diarization: No
Word-level timestamps: Yes
Streaming / real-time: Yes
Languages supported: 100
HIPAA eligible: Yes

Google Chirp / Chirp 2 vs Whipscribe

Feature | Google Chirp / Chirp 2 | Whipscribe
Category | Transcription APIs | Transcription APIs
Pricing | per Speech-to-Text v2 pricing (region-tiered) | free beta
Speaker diarization | No | Yes
Word timestamps | Yes | Yes
Streaming | Yes | No
Languages | 100 | 99
Platforms | API | Web, API, MCP

Alternatives to Google Chirp / Chirp 2

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, you can submit a URL or a file and get a transcript back in seconds.