Looking at Silero VAD? Try this first.

Drop your audio. Transcript in seconds. 30 free min, then $2 = 200 min

Silero VAD

Name: Silero VAD
Author: snakers4 / Silero

by snakers4 / Silero

Tiny, accurate voice-activity-detection model — runs on CPU.

TL;DR

Tiny, accurate voice-activity-detection model — runs on CPU.

Best for front-end VAD for any streaming ASR pipeline. Pricing: free.

What it is

A few-MB pre-trained VAD that runs in milliseconds per chunk. MIT.

Best for: Front-end VAD for any streaming ASR pipeline.
Watch out for: Detects speech only, not phonemes; some false positives on music.

Install / use

pip install silero-vad

Features

Speaker diarization	No
Word-level timestamps	No
Streaming / real-time	Yes
Languages supported	99
HIPAA eligible	No

Silero VAD vs Whipscribe

Feature	Silero VAD	Whipscribe
Category	Open source	Transcription APIs
Pricing	free	free beta
Speaker diarization	No	Yes
Word timestamps	No	Yes
Streaming	Yes	No
Languages	99	99
Platforms	Linux, macOS, Windows, Android, iOS, Edge	Web, API, MCP

Alternatives to Silero VAD

OpenAI Whisper

OpenAI

The reference open-source multilingual ASR model from OpenAI.

OSS · MIT ★ 98.1k

whisper.cpp

Georgi Gerganov

C/C++ port of Whisper — runs on anything, from a Raspberry Pi to Apple Silicon.

OSS · MIT ★ 48.8k

faster-whisper

SYSTRAN

4× faster than reference Whisper using CTranslate2 — production sweet spot.

OSS · MIT ★ 22.3k

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or drop a file in the form below — you'll have a transcript in seconds.