Volcengine Speech

by ByteDance Volcano Engine

ByteDance's Volcano Engine speech-to-text: short-form, long-form, and streaming Mandarin ASR.

TL;DR

Best for short-form video, livestream, and TikTok-style content workflows that need Mandarin captions at scale. Pricing: tiered · pay-as-you-go in CNY.

Category: Transcription APIs
Pricing: tiered · pay-as-you-go in CNY
Platforms: Web, Android, iOS, Linux

What it is

Volcano Engine is ByteDance's enterprise cloud, and its speech product line carries the same engines that auto-caption Douyin and other ByteDance properties. Recognition latency is tuned for video and livestream use cases; the platform also exposes voice cloning and TTS alongside ASR. It is a pragmatic pick when serving Chinese creators and short-form video producers, and fits best when the buyer runs short-form video, livestream, and TikTok-style content workflows that need Mandarin captions at scale. The honest caveat: coverage is primarily Mandarin and Cantonese, and non-Chinese language coverage is narrow. The API is developer-grade with self-serve signup; validate accuracy on representative audio for the target dialect before committing volume.

Best for: Short-form video, livestream, and TikTok-style content workflows that need Mandarin captions at scale.
Watch out for: Primarily Mandarin and Cantonese; non-Chinese language coverage is narrow.
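
The advice to validate accuracy on representative audio can be made concrete with a character error rate (CER) check, the usual metric for Mandarin and Cantonese because the text is not whitespace-delimited. The sketch below is a minimal illustration, not part of any Volcengine SDK: it assumes you have already collected ASR output by whatever means and paired it with human reference transcripts. The names normalize, edit_distance, cer, and samples are hypothetical, and the sample sentences are made up for illustration.

```python
import unicodedata


def normalize(text: str) -> str:
    """Apply NFKC, then keep only letters and digits, lowercased."""
    folded = unicodedata.normalize("NFKC", text)
    return "".join(ch.lower() for ch in folded if ch.isalnum())


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, using one rolling row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(pairs):
    """Corpus-level CER: total edits divided by total reference characters."""
    edits = 0
    ref_chars = 0
    for ref, hyp in pairs:
        ref_n, hyp_n = normalize(ref), normalize(hyp)
        edits += edit_distance(ref_n, hyp_n)
        ref_chars += len(ref_n)
    if ref_chars == 0:
        raise ValueError("reference transcripts are empty after normalization")
    return edits / ref_chars


# Hypothetical sample: (human reference, ASR output) pairs.
samples = [
    ("今天天气很好", "今天天气很好"),
    ("我们去公园散步。", "我们去公园三步"),
]
print(f"CER: {cer(samples):.3f}")
```

Running the same script separately on each dialect or content type you plan to serve (for example Mandarin livestream audio versus Cantonese short-form clips) gives per-segment numbers to compare before committing volume.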

Install / use

volcengine.com/product/speech-tech

Features

Speaker diarization: No
Word-level timestamps: No
Streaming / real-time: Yes
Languages supported: Not listed (primarily Mandarin and Cantonese)
HIPAA eligible: No

Volcengine Speech vs Whipscribe

Feature | Volcengine Speech | Whipscribe
Category | Transcription APIs | Transcription APIs
Pricing | tiered · pay-as-you-go in CNY | free beta
Speaker diarization | No | Yes
Word timestamps | No | Yes
Streaming | Yes | No
Languages | Not listed | 99
Platforms | Web, Android, iOS, Linux | Web, API, MCP

Alternatives to Volcengine Speech

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or upload a file and you'll have a transcript in seconds.