Alibaba Cloud Intelligent Speech Interaction
Alibaba's managed Chinese-first ASR with batch + real-time and customizable hotwords.
Best for China-region products that need first-class Mandarin and major Chinese-dialect support. Pricing: tiered, RMB-denominated per-second rates.
What it is
Alibaba Cloud Intelligent Speech Interaction (ISI) is Alibaba's family of speech services, covering real-time streaming ASR, batch file recognition, real-time captioning, hotword customization, and a separate model-customization workflow. The service is the strongest mainstream pick for Mandarin and major Chinese dialects (Cantonese, Shanghainese, Sichuanese), and is most commonly used inside Alibaba Cloud's China regions. International account access exists, but feature parity is region-dependent. Pricing is RMB-denominated with tiered per-second rates.
Feature flags from vendor docs: word-level timestamps, streaming. Directory tags: commercial-api, hyperscaler, regional-asia. Last vendor-page check: 2026-05-12.
Watch out for: Console + docs are Chinese-first; international account access varies by region; export-control considerations.
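To make "tiered per-second pricing" concrete, here is a minimal sketch of a cost estimator. It assumes graduated tiers, where each rate applies only to the seconds that fall inside its band; the vendor may instead apply one rate to the whole volume, so check the current price list. The tier boundaries and rates are hypothetical placeholders, not Alibaba's published prices, and `Tier`, `EXAMPLE_TIERS`, and `estimate_cost_rmb` are names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Tier:
    # Cumulative usage (in seconds) at which this tier ends; None means no upper bound.
    up_to_seconds: Optional[float]
    rmb_per_second: float


# HYPOTHETICAL placeholder tiers for illustration only.
# Replace with the rates from the vendor's current price list.
EXAMPLE_TIERS = (
    Tier(up_to_seconds=1_000, rmb_per_second=0.003),
    Tier(up_to_seconds=10_000, rmb_per_second=0.002),
    Tier(up_to_seconds=None, rmb_per_second=0.001),
)


def estimate_cost_rmb(total_seconds: float, tiers: Sequence[Tier] = EXAMPLE_TIERS) -> float:
    """Estimate cost in RMB, assuming graduated (marginal) per-second tiers."""
    if total_seconds < 0:
        raise ValueError("total_seconds must be non-negative")
    cost = 0.0
    lower = 0.0  # start of the current tier's band
    for tier in tiers:
        if total_seconds <= lower:
            break  # all usage has already been billed in earlier tiers
        upper = float("inf") if tier.up_to_seconds is None else tier.up_to_seconds
        billable = min(total_seconds, upper) - lower
        cost += billable * tier.rmb_per_second
        lower = upper
    return cost
```

With the placeholder tiers, 2,000 seconds of audio would be billed as 1,000 seconds at the first rate plus 1,000 seconds at the second. Swapping in real rates only requires editing the tier list.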
Install / use
Alibaba SDK: NlsClient with appkey + AccessKey
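The line above names the credentials the SDK needs: a project appkey plus an Alibaba Cloud AccessKey, which is a pair (AccessKey ID and AccessKey secret). The sketch below shows one way to load and validate them before handing them to the SDK. It does not call the SDK itself, because the source does not show the `NlsClient` constructor. The environment-variable names and the `IsiCredentials` and `load_credentials` helpers are hypothetical, chosen only for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class IsiCredentials:
    appkey: str             # project appkey
    access_key_id: str      # Alibaba Cloud AccessKey ID
    access_key_secret: str  # Alibaba Cloud AccessKey secret


# HYPOTHETICAL environment-variable names; use whatever your deployment defines.
ENV_NAMES = {
    "appkey": "ISI_APPKEY",
    "access_key_id": "ISI_ACCESS_KEY_ID",
    "access_key_secret": "ISI_ACCESS_KEY_SECRET",
}


def load_credentials(env: Optional[Mapping[str, str]] = None) -> IsiCredentials:
    """Read the three credentials from the environment, failing fast if any is missing."""
    env = os.environ if env is None else env
    missing = sorted(name for name in ENV_NAMES.values() if not env.get(name))
    if missing:
        raise RuntimeError("Missing credential variables: " + ", ".join(missing))
    return IsiCredentials(**{field: env[name] for field, name in ENV_NAMES.items()})
```

Keeping the secret in the environment rather than in source code, and failing before any network call, makes a misconfigured deployment easier to diagnose.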
Features
| Feature | Supported |
|---|---|
| Speaker diarization | No |
| Word-level timestamps | Yes |
| Streaming / real-time | Yes |
| Languages supported | Not listed |
| HIPAA eligible | No |
Alibaba Cloud Intelligent Speech Interaction vs Whipscribe
| Feature | Alibaba Cloud Intelligent Speech Interaction | Whipscribe |
|---|---|---|
| Category | Transcription APIs | Transcription APIs |
| Pricing | tiered RMB-per-second pricing | free beta |
| Speaker diarization | — | Yes |
| Word timestamps | Yes | Yes |
| Streaming | Yes | No |
| Languages | — | 99 |
| Platforms | API | Web, API, MCP |
Alternatives to Alibaba Cloud Intelligent Speech Interaction
Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, you can submit a URL or upload a file and get a transcript back in seconds.