Modal (ASR endpoints)

by Modal Labs

Modal's serverless GPU platform, commonly used to host Whisper / faster-whisper as an API.

TL;DR

Best for teams self-hosting Whisper / faster-whisper / WhisperX who want serverless GPU without managing K8s. Pricing: per-second GPU compute (see Modal pricing).

Category: Transcription APIs
License: not listed
Stars: not listed
Last push: not listed
Pricing: per-second GPU compute (see Modal pricing)
Platforms: API

What it is

Modal Labs offers serverless GPU infrastructure that is widely used to host open-source ASR models (Whisper, faster-whisper, WhisperX, pyannote diarization) behind a custom HTTP endpoint. You write Python code that loads a model, and Modal handles scaling, cold starts, and per-second billing. Modal is not an ASR API itself, but it is a popular choice for teams that want the open-source stack with managed scaling. Pricing is per-second compute, by GPU class (see Modal pricing).

Directory tags: commercial-api, model-host. Last vendor-page check: 2026-05-12.

Best for: Teams self-hosting Whisper / faster-whisper / WhisperX who want serverless GPU without managing K8s.
Watch out for: Not a turnkey ASR API — you bring your own model code and pay for compute, not transcription minutes.
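
Because billing is per second of GPU time rather than per minute of audio, the effective transcription cost depends on how fast your model runs. The sketch below shows that arithmetic. The function name, the rate, and the speed figures are illustrative assumptions, not Modal prices or measured benchmarks.

```python
def cost_per_audio_hour(gpu_usd_per_second, realtime_factor, overhead_seconds=0.0):
    """Estimate the USD cost of transcribing one hour of audio.

    gpu_usd_per_second: billed GPU rate (hypothetical; check Modal pricing).
    realtime_factor:    seconds of audio processed per second of GPU time.
    overhead_seconds:   extra billed time, e.g. cold start and model load.
    """
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    gpu_seconds = 3600.0 / realtime_factor + overhead_seconds
    return gpu_usd_per_second * gpu_seconds


if __name__ == "__main__":
    rate = 0.0003  # hypothetical USD per GPU-second, not a quoted price
    warm = cost_per_audio_hour(rate, realtime_factor=30.0)
    cold = cost_per_audio_hour(rate, realtime_factor=30.0, overhead_seconds=20.0)
    print(f"warm container: ${warm:.4f} per audio hour")
    print(f"with cold start: ${cold:.4f} per audio hour")
```

A faster model or a larger batch raises `realtime_factor` and lowers the cost per audio hour, while frequent cold starts on short files push it up.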

Install / use

pip install modal && modal deploy whisper_app.py

Features

Speaker diarization: No
Word-level timestamps: No
Streaming / real-time: No
Languages supported: None
HIPAA eligible: No

Modal (ASR endpoints) vs Whipscribe

Feature | Modal (ASR endpoints) | Whipscribe
Category | Transcription APIs | Transcription APIs
Pricing | per-second GPU compute (see Modal pricing) | free beta
Speaker diarization | (not listed) | Yes
Word timestamps | (not listed) | Yes
Streaming | (not listed) | No
Languages | (not listed) | 99
Platforms | API | Web, API, MCP

Alternatives to Modal (ASR endpoints)

Whipscribe is a managed faster-whisper + WhisperX service. If you want transcripts without running infrastructure, you paste a URL or upload a file and get a transcript back in seconds.