AI4Bharat
by IIT Madras
IIT Madras Indic AI lab — IndicVoices + Kathbath + IndicSUPERB + IndicWav2Vec.
TL;DR
Best for Indic-language speech and NLP research, training data, and models. Pricing: free.
Category
Open source
License
CC BY 4.0 (most releases)
Stars
—
Last push
—
Pricing
free
Platforms
Web, GitHub
What it is
AI4Bharat, a lab at IIT Madras, maintains a widely used Indic speech stack: IndicVoices (16kh), Kathbath, Shrutilipi, IndicWav2Vec, and IndicSUPERB. Most releases are licensed under CC BY 4.0.
Best for: Indic-language speech + NLP research, training data, and models.
Watch out for: Academic project; most, but not all, models and corpora are released under CC BY 4.0.
Install / use
https://ai4bharat.iitm.ac.in/
Features
| Feature | AI4Bharat |
|---|---|
| Speaker diarization | No |
| Word-level timestamps | No |
| Streaming / real-time | No |
| Languages supported | 22 |
| HIPAA eligible | No |
AI4Bharat vs Whipscribe
| Feature | AI4Bharat | Whipscribe |
|---|---|---|
| Category | Open source | Transcription APIs |
| Pricing | free | free beta |
| Speaker diarization | No | Yes |
| Word timestamps | No | Yes |
| Streaming | No | No |
| Languages | 22 | 99 |
| Platforms | Web, GitHub | Web, API, MCP |
Alternatives to AI4Bharat
Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, you can paste a URL or upload a file and get a transcript back.