TEN Framework
Open-source framework by Agora for building realtime multimodal voice AI agents.
Best for builders on Agora RTC who want an OSS voice/video agent runtime. Pricing: free.
What it is
TEN (Transformative Extensions Network) is Agora's open-source framework for building realtime multimodal agents on top of its global RTC SDK. It abstracts ASR, LLM, TTS, and video components into reusable extensions and is positioned alongside Pipecat and LiveKit Agents as an OSS voice-agent runtime. Apache-2.0 licensed.
Watch out for: Newer project; ecosystem narrower than Pipecat or LiveKit Agents.
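To make the "reusable extensions" idea concrete, here is a minimal sketch of an ASR, LLM, and TTS chain built from interchangeable stages. Every name in it (`Extension`, `Pipeline`, and the stub handlers) is hypothetical and chosen for illustration; it is not the TEN Framework API, and the stubs pass strings around instead of calling real speech or language services.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical names throughout: this illustrates the extension pattern only
# and is NOT the TEN Framework API.


@dataclass
class Extension:
    """A named processing stage that transforms one payload into another."""
    name: str
    handler: Callable[[str], str]


@dataclass
class Pipeline:
    """Runs extensions in order, feeding each stage's output to the next."""
    stages: List[Extension] = field(default_factory=list)

    def run(self, payload: str) -> str:
        for stage in self.stages:
            payload = stage.handler(payload)
        return payload


# Stub stages standing in for real ASR, LLM, and TTS services.
asr = Extension("asr", lambda audio: audio.removeprefix("audio:"))
llm = Extension("llm", lambda text: f"You said: {text}")
tts = Extension("tts", lambda text: f"speech:{text}")

agent = Pipeline([asr, llm, tts])
result = agent.run("audio:hello")
print(result)
```

Because each stage only agrees on its input and output type, swapping one provider for another means replacing a single `Extension` and leaving the rest of the pipeline unchanged.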
Features
| Feature | TEN Framework |
|---|---|
| Speaker diarization | No |
| Word-level timestamps | No |
| Streaming / real-time | No |
| Languages supported | None |
| HIPAA eligible | No |
TEN Framework vs Whipscribe
| Feature | TEN Framework | Whipscribe |
|---|---|---|
| Category | Open source | Transcription APIs |
| Pricing | free | free beta |
| Speaker diarization | — | Yes |
| Word timestamps | — | Yes |
| Streaming | — | No |
| Languages | — | 99 |
| Platforms | Linux, macOS, Cloud | Web, API, MCP |
Alternatives to TEN Framework
Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or drop a file and you'll have a transcript in seconds.