Vapi
Voice-AI infrastructure: turnkey assistants composed of STT + LLM + TTS providers.
Best for developers building voice agents who want a stack of pluggable STT/LLM/TTS providers behind one API. Pricing: $0.05/min (Vapi orchestration) + provider pass-through.
What it is
Vapi is voice-AI infrastructure: a single API that orchestrates STT, LLM, and TTS providers into a real-time phone or web voice agent. Developers pick the underlying providers (Deepgram, AssemblyAI, OpenAI, ElevenLabs, etc.) per agent. Pricing as listed is $0.05/min for Vapi orchestration, plus pass-through provider cost. Feature flags from vendor docs: streaming. Directory tags: voice-intel, voice-agent. Last vendor-page check: 2026-05-12.
Watch out for: You pay for Vapi orchestration plus each provider; latency depends on chosen pipeline.
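Because the bill is the orchestration fee plus whatever each provider charges, a rough per-call estimate is easy to sketch. The example below is only an illustration: the $0.05/min figure is the one listed on this page, but the provider rates are hypothetical placeholders, and the function name `estimate_call_cost` is made up for this sketch. It also assumes every provider can be approximated by a flat per-minute rate, which is a simplification, since some providers bill per token or per character instead.

```python
# Vapi orchestration fee in USD per minute, as listed on this page.
VAPI_ORCHESTRATION_PER_MIN = 0.05


def estimate_call_cost(minutes, provider_rates_per_min):
    """Estimate the total USD cost of one call.

    minutes: call length in minutes.
    provider_rates_per_min: dict mapping a provider role (e.g. "stt")
        to an assumed flat USD-per-minute rate. These rates are
        placeholders, not published prices.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    # Total per-minute rate = orchestration fee + sum of pass-through rates.
    per_minute = VAPI_ORCHESTRATION_PER_MIN + sum(provider_rates_per_min.values())
    return round(minutes * per_minute, 4)


if __name__ == "__main__":
    # Hypothetical pass-through rates, chosen only to show the arithmetic.
    example_rates = {"stt": 0.01, "llm": 0.02, "tts": 0.03}
    print(estimate_call_cost(10, example_rates))
```

Substitute the actual rates of the providers you select; the orchestration fee is the only fixed term in the sum.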
Install / use
Features
| Feature | Vapi |
|---|---|
| Speaker diarization | No |
| Word-level timestamps | No |
| Streaming / real-time | Yes |
| Languages supported | Not listed |
| HIPAA eligible | No |
Vapi vs Whipscribe
| Feature | Vapi | Whipscribe |
|---|---|---|
| Category | Products | Transcription APIs |
| Pricing | $0.05/min (Vapi orchestration) + provider pass-through | Free beta |
| Speaker diarization | — | Yes |
| Word timestamps | — | Yes |
| Streaming | Yes | No |
| Languages | — | 99 |
| Platforms | API | Web, API, MCP |
Alternatives to Vapi
Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or drop a file and you'll have a transcript in seconds.