Free · 30 minutes a day · no signup

Drop a video, get the subtitle file.

SRT and VTT exports with word-level timestamps. Drop them straight into YouTube Studio, Premiere, Final Cut, DaVinci Resolve, or any HTML5 player. Most one-hour videos finish in two to four minutes.

30 min / day free · no signup · $1/hr PAYG after · Never used to train AI

✓ Word-level timing (not 3-second blocks) · ✓ Works with every editor · ✓ 100+ languages auto-detected · ✓ Never used to train AI

What you get

What a real subtitle file actually needs.

Word-level timing

Most free generators round to 3-second segments. Whipscribe times every word individually, so karaoke-style word-by-word captions render correctly and click-to-seek lands on the exact word.
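As a rough illustration of why per-word timing makes click-to-seek and word-by-word highlighting possible, here is a minimal Python sketch that finds which word is being spoken at a given playback time. The function name and the parallel lists of start and end times are assumptions made for this example, not Whipscribe's actual data format or API.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def active_word_index(
    starts: Sequence[float], ends: Sequence[float], t: float
) -> Optional[int]:
    """Return the index of the word spoken at time t (seconds), or None.

    Assumes `starts` is sorted ascending and `ends[i]` is the end time
    of the word that begins at `starts[i]` (hypothetical layout).
    """
    # Last word whose start time is <= t.
    i = bisect_right(starts, t) - 1
    # t may fall in the silence between two words, or before the first word.
    if i >= 0 and t < ends[i]:
        return i
    return None
```

With 3-second blocks the same lookup could only say which block is active, which is why block-level captions cannot drive a word-by-word reveal.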

Editor compatibility

SRT works in YouTube Studio, Premiere, Final Cut, DaVinci Resolve, CapCut, Descript, and every HTML5 video player. VTT is the web-standard format. Drop the file in, the captions appear — no format conversion, no plugin.

Edit before export

Open the transcript, fix a misheard name or jargon term, and the SRT updates with it. Export the corrected file — no round-trip through a desktop captioning app, no manual SRT timestamp arithmetic.

100+ languages

Whisper-large-v3 covers Spanish, French, German, Hindi, Mandarin, Arabic, Portuguese, and 90+ more. Auto-detect picks the language from the audio — you don't have to set anything.

Why word-level timing matters

Free auto-captions vs a real SRT file.

✗ Auto-captions (YouTube, free tools)

3-second blocks, no punctuation, and often missing for non-English audio. Useful as an accessibility floor, useless for word-by-word reveal or precise editing.

3-second blocks, not word-level
No punctuation, no paragraphs
Missing for many languages
Can't be edited inside the source
Doesn't export to all editors

✓ A real Whipscribe SRT/VTT

Word-level timestamps, full punctuation, and a file format every editor accepts. Drop it into YouTube Studio or your NLE timeline and you're done.

Word-level timestamps
Properly punctuated lines
100+ languages, auto-detected
SRT, VTT, TXT — three formats
Compatible with every editor

Sample output

Speaker-labelled. Word-timed. SRT-ready.

The same transcript that drives the editor also exports as SRT and VTT — every word carries its own timestamp.

transcript · whipscribe.com/view/subtitle-generator

NARRATOR 00:00:04 The first thing you'll notice when you open the file is that every line is punctuated and capitalized.

NARRATOR 00:00:11 That's not standard for auto-generated subtitles. Most free tools give you a wall of lowercase text.

NARRATOR 00:00:18 Word-level timing means each word's start and end are stored separately.

NARRATOR 00:00:23 If you're rendering animated captions, this is the difference between karaoke-style reveal and chunks-of-three.
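To make the link between word timings and an SRT file concrete, here is a minimal Python sketch that groups timed words into numbered SRT cues. The `Word` class, the `words_to_srt` helper, and the fixed words-per-cue rule are all assumptions made for illustration. How Whipscribe actually segments cues is not described on this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_srt(words: List[Word], max_words: int = 7) -> str:
    """Group words into cues of at most `max_words` and render SRT text.

    Each cue starts at its first word's start and ends at its last
    word's end, so cue timing comes directly from the word timings.
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        index = len(cues) + 1
        timing = f"{srt_time(chunk[0].start)} --> {srt_time(chunk[-1].end)}"
        text = " ".join(w.text for w in chunk)
        cues.append(f"{index}\n{timing}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(cues)
```

A real segmenter would more likely break on punctuation and pauses than on a fixed word count. The fixed count just keeps the sketch short.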

Export

One transcript. Three clean formats.

Every paid tier exports all three. The free tier exports TXT and SRT.

.srt

SRT captions

Word-level. Every video editor reads this.

.vtt

WebVTT

HTML5 player + YouTube uploads.

.txt

Plain text

De-ummed paragraphs. Ready to paste.

Pricing

Honest pricing, no surprises.

Credits never expire. Upgrade or downgrade any month. Free tier resets daily — no signup, no card.

Free

$0/forever

Try it free for 30 minutes a day. No card.

30 min / day
Speaker labels included
TXT + SRT export
No history retention

Pay-as-you-go

$1/hour

Best for one-off projects. Credits never expire.

$10 minimum top-up
Every export format
365-day history
API access

Pro

$8/month

Indie creators. 100 hours / month, all features.

100 hours / month
Clips + every aspect ratio
Branded captions
Priority queue

Team

$29/month

Teams. 500 hours / month, shared workspace.

500 hours / month
Shared library
API + MCP for Claude
Workspace billing

FAQ

Subtitle generator questions, answered.

What's the difference between SRT and VTT?

SRT is the older, simpler format — works in every video editor and most players. VTT is the web standard — supports styling, positioning, and metadata, used by HTML5 video and modern streaming players. We export both from the same transcript so you have whichever you need.
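For a basic caption file the two formats are close: a WebVTT file starts with a `WEBVTT` header line and writes milliseconds after a period, while SRT uses a comma. The Python sketch below converts simple SRT text to WebVTT under those two rules. The `srt_to_vtt` name is made up for this example, and the sketch ignores styling, positioning, and any SRT markup.

```python
import re

# Matches an SRT timestamp such as 00:01:02,345 and captures both parts.
_SRT_TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert plain SRT text to WebVTT (header plus period separators)."""
    lines = []
    for line in srt_text.strip().splitlines():
        # Only rewrite timing lines, so commas in caption text are untouched.
        if "-->" in line:
            line = _SRT_TIMESTAMP.sub(r"\1.\2", line)
        lines.append(line)
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"
```

The numeric cue indexes from the SRT are left in place, since WebVTT allows an optional identifier line before each cue's timing line.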

Will these subtitles work in DaVinci Resolve / Premiere / Final Cut?

Yes. Standard SRT files import directly into DaVinci Resolve (drag onto the subtitle track), Premiere Pro (File → Import Captions), and Final Cut Pro (File → Import → Captions). The same file works in CapCut, Descript, and Adobe Express without conversion.

Can I edit a misheard word and re-export?

Yes. Open the transcript in our editor, click the word, fix it. The SRT and VTT regenerate with the correction in place. No timestamp recalculation needed — the timing is anchored to the audio, not to the previous text.
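One way to picture "anchored to the audio": if each word is stored with its own start and end time, a correction only replaces the text and leaves the times alone. The Python sketch below shows that idea with a hypothetical `Word` record and `correct_word` helper. It is an illustration of the principle, not Whipscribe's editor code.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds, taken from the audio
    end: float    # seconds, taken from the audio


def correct_word(words: List[Word], index: int, new_text: str) -> List[Word]:
    """Return a new word list with one word's text fixed, timing unchanged."""
    fixed = list(words)
    # replace() copies the record and swaps only the text field.
    fixed[index] = replace(words[index], text=new_text)
    return fixed
```

Because no timestamp changes, any SRT or VTT rendered from the corrected list lines up with the audio exactly as before.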

Does it support multiple languages in one file?

It auto-detects one primary language per file. For mixed-language interviews (English-Spanish code-switching, for example), it does a best-effort transcription in both — accuracy on the secondary language is lower. Translation to a different target language is not in scope here.

What about burned-in captions for social clips?

For social-ready burned-in captions (TikTok, Reels, Shorts), see our clipping tool. It renders the captions into the video with brand colors and reveal-by-word animation. This page generates the SRT/VTT file; the clipping page generates the video with captions baked in.

How accurate is the timing?

Word-level timestamps are accurate to within ~50 milliseconds on clean audio, and slightly looser on noisy field recordings or thick accents. For precise editing this is well under the threshold of human perception: a 50 ms drift on a captioned word is invisible.