Baidu ERNIE Speech

by Baidu

Baidu's ERNIE-aligned speech models inside ERNIE Bot Cloud.

TL;DR

Best for Mandarin-first workflows wanting Baidu's ERNIE-aligned speech features. Pricing: tiered RMB-per-call.

Category: Transcription APIs
License: not listed
Stars: not listed
Last push: not listed
Pricing: tiered RMB-per-call
Platforms: API

What it is

Baidu ERNIE Speech ties Baidu's ERNIE foundation models to its speech recognition stack, exposing speech-to-text plus higher-level analysis (intent, semantics) tuned for Mandarin. It is sold inside ERNIE Bot Cloud (Wenxin Qianfan) and is China-first, so international customers should verify access. Pricing as listed is tiered RMB-per-call. Vendor docs list streaming as a feature flag. Directory tags: commercial-api, regional-asia, foundation-model. Last vendor-page check: 2026-05-12.

Evaluate it head-to-head with adjacent transcription APIs, and verify current language coverage, region availability, and compliance terms on the vendor's public docs at time of build.

Best for: Mandarin-first workflows wanting Baidu's ERNIE-aligned speech features.
Watch out for: China-first; international docs limited.

Install / use

Access is through the speech endpoints in Baidu ERNIE Bot Cloud.

Features

Speaker diarization: No
Word-level timestamps: No
Streaming / real-time: Yes
Languages supported: not listed
HIPAA eligible: No

Baidu ERNIE Speech vs Whipscribe

Feature | Baidu ERNIE Speech | Whipscribe
Category | Transcription APIs | Transcription APIs
Pricing | tiered RMB-per-call | free beta
Speaker diarization | No | Yes
Word timestamps | No | Yes
Streaming | Yes | No
Languages | not listed | 99
Platforms | API | Web, API, MCP

Alternatives to Baidu ERNIE Speech

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or upload a file and you'll have a transcript in seconds.