Azure Conversation Transcription
Azure Speech's multi-speaker meeting transcription with channel and speaker ID.
Best for meeting-recorder vendors who want Microsoft's diarization and speaker-ID models. Pricing: per the Azure Speech price list.
What it is
Azure Conversation Transcription is the multi-speaker meeting variant of Azure Speech. It produces transcripts with diarized speakers and, optionally, speaker recognition when speakers have enrolled voice profiles. It is used in Microsoft Teams for transcripts and recall-style features. Pricing follows the Azure Speech price list.
Feature flags from vendor docs: speaker diarization, word-level timestamps, streaming, HIPAA-eligible under a BAA.
Directory tags: voice-intel, meeting-ai, hyperscaler.
Last vendor-page check: 2026-05-12.
Watch out for: the speaker recognition enrollment workflow has separate constraints; verify them before relying on it.
Install / use
Azure Speech SDK: ConversationTranscriber
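A minimal sketch of driving `ConversationTranscriber` from the Python Speech SDK (`pip install azure-cognitiveservices-speech`) might look like the following. It assumes a WAV file on disk and credentials in the `SPEECH_KEY` and `SPEECH_REGION` environment variables; the helper names `format_turn` and `transcribe_file` are illustrative and not part of the SDK. Check the class and event names against the SDK version you install.

```python
import os
import time


def format_turn(speaker_id, text):
    """Render one diarized utterance as a '[speaker] text' transcript line."""
    label = speaker_id or "Unknown"
    return f"[{label}] {text.strip()}"


def transcribe_file(wav_path, language="en-US"):
    """Transcribe a WAV file and return a list of '[speaker] text' lines."""
    # Imported lazily so format_turn stays usable without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(
        subscription=os.environ["SPEECH_KEY"],   # assumed env var name
        region=os.environ["SPEECH_REGION"],      # assumed env var name
    )
    speech_config.speech_recognition_language = language
    audio_config = speechsdk.audio.AudioConfig(filename=wav_path)
    transcriber = speechsdk.transcription.ConversationTranscriber(
        speech_config=speech_config, audio_config=audio_config
    )

    lines = []
    done = False

    def on_transcribed(evt):
        # Final results carry the recognized text and a diarized speaker label.
        if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
            lines.append(format_turn(evt.result.speaker_id, evt.result.text))

    def on_stop(evt):
        nonlocal done
        done = True

    transcriber.transcribed.connect(on_transcribed)
    transcriber.session_stopped.connect(on_stop)
    transcriber.canceled.connect(on_stop)

    transcriber.start_transcribing_async().get()
    while not done:
        time.sleep(0.5)
    transcriber.stop_transcribing_async().get()
    return lines
```

Calling `transcribe_file("meeting.wav")` (a hypothetical file name) would block until the session ends and then return the collected lines. Without enrolled voice profiles, the speaker labels are generic diarization labels rather than names.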
Features
| Feature | Supported |
|---|---|
| Speaker diarization | Yes |
| Word-level timestamps | Yes |
| Streaming / real-time | Yes |
| Languages supported | Not listed |
| HIPAA eligible | Yes (under a BAA) |
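To illustrate what word-level timestamps look like in practice, here is a small sketch that converts word offsets into seconds. It assumes the payload follows the detailed-output JSON shape used by Azure Speech recognition (`NBest[0].Words` with `Word`, `Offset`, and `Duration`, expressed in 100-nanosecond ticks); whether Conversation Transcription returns exactly this shape should be verified against the vendor docs. The `word_timings` helper and the `SAMPLE` payload are hypothetical.

```python
import json

TICKS_PER_SECOND = 10_000_000  # one tick is 100 nanoseconds


def word_timings(detailed_json):
    """Return (word, start_seconds, end_seconds) tuples from a detailed result."""
    payload = json.loads(detailed_json)
    candidates = payload.get("NBest", [])
    if not candidates:
        return []
    timings = []
    for word in candidates[0].get("Words", []):
        start = word["Offset"] / TICKS_PER_SECOND
        end = (word["Offset"] + word["Duration"]) / TICKS_PER_SECOND
        timings.append((word["Word"], start, end))
    return timings


# Hypothetical payload for illustration only; not captured from the service.
SAMPLE = json.dumps({
    "NBest": [{
        "Words": [
            {"Word": "hello", "Offset": 5_000_000, "Duration": 2_500_000},
            {"Word": "team", "Offset": 10_000_000, "Duration": 5_000_000},
        ]
    }]
})

for word, start, end in word_timings(SAMPLE):
    print(f"{start:6.2f}s to {end:6.2f}s  {word}")
```

Keeping the conversion in one helper means downstream code (captions, search indexes) can work in seconds and never touch raw tick values.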
Azure Conversation Transcription vs Whipscribe
| Feature | Azure Conversation Transcription | Whipscribe |
|---|---|---|
| Category | Products | Transcription APIs |
| Pricing | per-Azure-Speech pricing | free beta |
| Speaker diarization | Yes | Yes |
| Word timestamps | Yes | Yes |
| Streaming | Yes | No |
| Languages | — | 99 |
| Platforms | API | Web, API, MCP |
Alternatives to Azure Conversation Transcription
Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or drop a file in the form below — you'll have a transcript in seconds.