AI4Bharat

by IIT Madras

IIT Madras Indic AI lab behind IndicVoices, Kathbath, IndicSUPERB, and IndicWav2Vec.

TL;DR

Best for Indic-language speech and NLP research, training data, and models. Pricing: free.

Category: Open source
License: CC BY 4.0 (most releases)
Pricing: free
Platforms: Web, GitHub

What it is

AI4Bharat at IIT Madras built the canonical Indic speech stack: IndicVoices (16kh), Kathbath, Shrutilipi, IndicWav2Vec, IndicSUPERB. License: CC BY 4.0 across most releases.

Best for: Indic-language speech + NLP research, training data, and models.
Watch out for: Academic project; most models and corpora are released under CC BY 4.0, but not all, so check the license on each release.

Install / use

https://ai4bharat.iitm.ac.in/

Features

Speaker diarization: No
Word-level timestamps: No
Streaming / real-time: No
Languages supported: 22
HIPAA eligible: No

AI4Bharat vs Whipscribe

Feature              AI4Bharat     Whipscribe
Category             Open source   Transcription APIs
Pricing              free          free beta
Speaker diarization  No            Yes
Word timestamps      No            Yes
Streaming            No            No
Languages            22            99
Platforms            Web, GitHub   Web, API, MCP

Alternatives to AI4Bharat

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, you can paste a URL or upload a file and get a transcript in seconds.