HF Text Generation Inference (TGI)

by Hugging Face

Production inference server — runs audio-multimodal LLMs.

TL;DR

Best for self-hosting Qwen2-Audio / SeamlessM4T / Phi-Audio over HTTP. Pricing: free.

Category: Open source
Pricing: free
Platforms: Linux, Docker

What it is

Hugging Face's production-grade inference server. It is listed here under the HFOIL license, so check the license terms before commercial deployment.

Best for: Self-hosting Qwen2-Audio / SeamlessM4T / Phi-Audio over HTTP.
Watch out for: The ASR-specific code paths are newer than the text-only ones, so they may be less mature.

Install / use
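
The listing gives no install steps, so the following is only a minimal sketch of talking to a TGI server over HTTP once one is running. It assumes the server is reachable at http://localhost:8080 (a hypothetical address; use whatever host and port your deployment exposes) and uses TGI's /generate route with a text-only prompt. How audio input is passed to the audio models named above is not covered by this page and is deliberately left out. The helper names build_generate_request and generate are illustrative, not part of TGI.

```python
import json
import urllib.request


def build_generate_request(base_url, prompt, max_new_tokens=64):
    """Build a POST request for a TGI server's /generate route.

    base_url is an assumption (e.g. "http://localhost:8080"); adjust it
    to match your own deployment.
    """
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url, prompt, max_new_tokens=64, timeout=60):
    """Send the prompt to the server and return the generated text."""
    request = build_generate_request(base_url, prompt, max_new_tokens)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["generated_text"]


# Example call (requires a running server, so it is left commented out):
# print(generate("http://localhost:8080", "Summarize: TGI is an inference server."))
```

Building the request separately from sending it keeps the payload easy to inspect or log before anything goes over the network.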

Features

Speaker diarization: No
Word-level timestamps: Yes
Streaming / real-time: No
Languages supported: 99
HIPAA eligible: No

HF Text Generation Inference (TGI) vs Whipscribe

Feature | HF Text Generation Inference (TGI) | Whipscribe
Category | Open source | Transcription APIs
Pricing | free | free beta
Speaker diarization | No | Yes
Word timestamps | Yes | Yes
Streaming | No | No
Languages | 99 | 99
Platforms | Linux, Docker | Web, API, MCP

Alternatives to HF Text Generation Inference (TGI)

Whipscribe is a managed faster-whisper + whisperX service. It is an option if you want transcripts without running your own infrastructure: you submit a URL or upload a file and get a transcript back.