Apple Vision Pro Live Captions
Real-time captions in visionOS: overlay speech as floating text in your field of view.
Best for deaf and hard-of-hearing Vision Pro users in supported English / Spanish / French / German / Korean / Mandarin locales. Pricing: Free with Vision Pro ($3,499 headset).
What it is
Live Captions on visionOS uses on-device speech recognition to surface captions over any audio source: phone calls, FaceTime, ambient room conversation, or app audio. It belongs to the same recognizer family as Live Captions on macOS and iOS.
Watch out for: On-device but locale-restricted; FaceTime captioning still tagged "beta"; not a transcription product — captions are ephemeral, not saved.
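To make the "ephemeral, not saved" caveat concrete, here is a minimal Python sketch of a rolling caption window. It is an illustration only and not Apple's implementation: the class `EphemeralCaptions`, its method names, and the three-line default are all hypothetical assumptions.

```python
from collections import deque


class EphemeralCaptions:
    """Hypothetical rolling caption window: shows recent lines, saves nothing."""

    def __init__(self, max_lines=3):
        # A deque with maxlen discards the oldest line once it is full,
        # so no full transcript ever accumulates.
        self._lines = deque(maxlen=max_lines)
        self._partial = ""  # in-progress text for the current utterance

    def update_partial(self, text):
        """Replace the in-progress text (streaming recognizers revise it)."""
        self._partial = text

    def finalize(self):
        """Commit the current utterance as a finished caption line."""
        if self._partial:
            self._lines.append(self._partial)
            self._partial = ""

    def visible(self):
        """Finished lines plus the in-progress line, if there is one."""
        shown = list(self._lines)
        if self._partial:
            shown.append(self._partial)
        return shown


captions = EphemeralCaptions(max_lines=2)
captions.update_partial("hello")
captions.update_partial("hello there")  # the partial result is revised in place
captions.finalize()
print(captions.visible())
```

Once a line scrolls out of the window it is gone, which is why a tool built this way cannot stand in for a transcription product.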
Install / use
Settings > Accessibility > Live Captions on visionOS
Features
| Feature | Value |
|---|---|
| Speaker diarization | No |
| Word-level timestamps | No |
| Streaming / real-time | Yes |
| Languages supported | 6 |
| HIPAA eligible | No |
Apple Vision Pro Live Captions vs Whipscribe
| Feature | Apple Vision Pro Live Captions | Whipscribe |
|---|---|---|
| Category | Desktop apps | Transcription APIs |
| Pricing | Free with Vision Pro ($3,499 headset) | free beta |
| Speaker diarization | No | Yes |
| Word timestamps | No | Yes |
| Streaming | Yes | No |
| Languages | 6 | 99 |
| Platforms | visionOS, Hardware | Web, API, MCP |
Alternatives to Apple Vision Pro Live Captions
Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or upload a file to get a transcript in seconds.