Google Speech Commands

by Google / TensorFlow

1s keyword-spotting corpus — 35 single-word commands, ~100k utterances.

TL;DR

Best for keyword spotting (KWS), on-device wake-word baselines, embedded-ML benchmarks. Pricing: free.

Category: Open source
License: CC BY 4.0
Pricing: free
Platforms: TensorFlow, Hugging Face

What it is

Speech Commands v0.02 is a corpus of roughly 100k utterances, each one second long or shorter, covering 35 single-word commands. It is the canonical KWS benchmark for embedded ML. License: CC BY 4.0.

Best for: Keyword spotting (KWS), on-device wake-word baselines, embedded-ML benchmarks.
Watch out for: CC BY 4.0 (attribution required) · clips are isolated 1-second utterances, not continuous speech · vocabulary is limited to 35 English single-word commands. Cite: Warden, 2018.
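
Because the clips are nominally one second long but a given clip may be shorter, KWS models with a fixed input size usually need every clip padded or trimmed to the same length. The following is a minimal sketch of that step, not part of any official tooling. The name pad_or_trim is hypothetical, and the 16 kHz sample rate is an assumption about how the audio is stored.

```python
SAMPLE_RATE = 16_000  # assumption: clips are 16 kHz mono audio


def pad_or_trim(samples, target_len=SAMPLE_RATE):
    """Return a list of exactly target_len samples.

    Clips shorter than target_len are zero-padded at the end;
    longer clips are truncated.
    """
    samples = list(samples)
    if len(samples) >= target_len:
        return samples[:target_len]
    return samples + [0.0] * (target_len - len(samples))
```

Padding at the end keeps the spoken word at the start of the window; centring the clip instead is an equally reasonable choice.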

Install / use

from datasets import load_dataset
ds = load_dataset('google/speech_commands', 'v0.02')  # v0.02 config of the corpus
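
Once the dataset is loaded, a quick sanity check is to turn one row into a (word, duration) pair. The sketch below assumes each row follows the usual Hugging Face audio layout, with an "audio" dict holding "array" and "sampling_rate" and an integer "label"; the function name describe_example is hypothetical.

```python
def describe_example(example, label_names):
    """Summarise one dataset row as (word, duration in seconds).

    Assumed row layout (not confirmed by this page):
      example["audio"] = {"array": <samples>, "sampling_rate": <int>}
      example["label"] = <int index into label_names>
    """
    audio = example["audio"]
    duration = len(audio["array"]) / audio["sampling_rate"]
    return label_names[example["label"]], duration
```

If the label column is a class-label feature, calling describe_example(ds["train"][0], ds["train"].features["label"].names) should return the spoken word and a duration of about one second.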

Features

Speaker diarization: No
Word-level timestamps: No
Streaming / real-time: No
Languages supported: 1
HIPAA eligible: No

Google Speech Commands vs Whipscribe

Feature | Google Speech Commands | Whipscribe
Category | Open source | Transcription APIs
Pricing | free | free beta
Speaker diarization | No | Yes
Word timestamps | No | Yes
Streaming | No | No
Languages | 1 | 99
Platforms | TensorFlow, Hugging Face | Web, API, MCP

Alternatives to Google Speech Commands

Whipscribe is a managed faster-whisper + whisperX service. It suits cases where you want transcripts without running infrastructure: submit a URL or a file and get a transcript back in seconds.