Anthropic Voice Agent Patterns

by Anthropic

Reference patterns for building voice agents with Anthropic Claude models.

TL;DR

Reference patterns for building voice agents with Anthropic Claude models.

Best for teams that want Claude as the brain of a voice agent over external ASR/TTS. Pricing: see vendor pricing.

Category
Transcription APIs
License
Stars
Last push
Pricing
see vendor pricing
Platforms
Cloud, API

What it is

Anthropic does not offer a first-party realtime speech API, so Claude-based voice agents are typically built with Deepgram or AssemblyAI streaming ASR, Claude for reasoning and tool use, and ElevenLabs or Cartesia TTS. The pattern is well documented in LiveKit Agents and Pipecat templates and remains a strong choice for teams that prefer Claude's tool-use behavior over GPT-4.

Best for: Teams that want Claude as the brain of a voice agent over external ASR/TTS.
Watch out for: No first-party realtime audio API as of writing; rely on ASR + LLM + TTS plumbing.

Features

Speaker diarizationNo
Word-level timestampsNo
Streaming / real-timeNo
Languages supportedNone
HIPAA eligibleNo

Anthropic Voice Agent Patterns vs Whipscribe

FeatureAnthropic Voice Agent PatternsWhipscribe
CategoryTranscription APIsTranscription APIs
Pricingsee vendor pricingfree beta
Speaker diarizationYes
Word timestampsYes
StreamingNo
Languages99
PlatformsCloud, APIWeb, API, MCP

Alternatives to Anthropic Voice Agent Patterns

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or drop a file in the form below — you'll have a transcript in seconds.