WhisperKit

Name: WhisperKit
Author: Argmax


Argmax's MIT-licensed Swift SDK that runs Whisper natively on Apple Silicon. Quantized CoreML weights are scheduled across the Apple Neural Engine, GPU, and CPU automatically, with no PyTorch, no CUDA, and no server hop.

TL;DR

WhisperKit is the speech-to-text core of the Argmax Open-Source SDK (v1.0.0, May 2026) — a Swift Package that ships pre-converted CoreML Whisper weights and an async API designed for Apple platforms. The OS schedules layers across the Apple Neural Engine, GPU (Metal), and CPU per call, so you get hardware acceleration that whisper.cpp only reaches with extra build flags and a separate encoder-conversion step. The same SPM dependency also exposes sibling kits SpeakerKit (pyannote diarization) and TTSKit (Qwen3-TTS).

Best for Apple-app developers shipping on-device voice features — dictation, accessibility captions, in-game voice, offline note-takers, HIPAA-sensitive transcription — where a Python runtime or a network round-trip is a non-starter. Supported targets: macOS 14+ and iOS 17+ per the v1.0.0 README; sister kits TTSKit (macOS 15 / iOS 18) and SpeakerKit (macOS 13 / iOS 16) have their own floors. License: MIT. Repo: argmaxinc/argmax-oss-swift (the legacy argmaxinc/WhisperKit URL redirects).

What it is

WhisperKit is Argmax's Swift-native Whisper runtime for Apple Silicon. CoreML-compiled encoder + decoder run on the Neural Engine, GPU, and CPU automatically — no PyTorch, no Python, no CUDA, and no manual conversion step. As of v1.0.0 (2026-05-01) the repo was renamed from argmaxinc/WhisperKit to argmaxinc/argmax-oss-swift and now ships WhisperKit + SpeakerKit (pyannote diarization) + TTSKit (Qwen3-TTS) in a single MIT-licensed Swift package. The whisperkit-cli (Homebrew + `swift run`), an OpenAI-compatible local server, and 27+ pre-converted model variants on HuggingFace make it the default choice for any developer who wants Whisper-grade transcription on Apple platforms without running a backend.

Best for: Shipping Whisper inside iOS/macOS/visionOS apps with Apple Neural Engine acceleration and no server round-trip.
Watch out for: Apple ecosystem only; diarization now sits in a sibling kit (SpeakerKit) rather than in WhisperKit itself; word-level forced alignment is not in the open-source surface.

Install / use


Add via Swift Package Manager in Xcode: File → Add Package Dependencies…

Pick your Apple target · 5 surfaces

WhisperKit targets the full Apple stack from one Swift package. The integration story differs by surface — what changes is the example app you start from, the practical model size, and which compute units the OS picks. Each card links the canonical project folder or doc page in the upstream repo.

iOS apps

iOS 17+ · iPhone 15 Pro tier runs large-v3

Add the Swift Package, import WhisperKit, and call transcribe from an actor or task. The CoreML scheduler routes the heavy attention layers to the Apple Neural Engine, so a SwiftUI dictation app stays responsive on an A17 Pro or M-series iPad. Practical model ceiling tracks RAM — tiny and base run on older iPhones, quantized large-v3 variants (547-626MB) target iPhone 15 Pro and newer with 8GB.

Pin the model with WhisperKitConfig(model: "large-v3-v20240930_626MB") to ship a deterministic asset bundle instead of relying on lazy first-run download.

macOS apps

macOS 14+ · the dictation tool sweet spot

WhisperKit is the SDK behind the modern crop of Mac dictation apps (Superwhisper, Wispr-class tools). Async transcribe + system-wide hotkey + a small CoreML model gives type-as-you-speak latency on any M-series Mac. Argmax does not ship a flagship end-user app — third parties do — so building a competitive Mac voice utility is a tractable weekend project on top of WhisperKit.

M2 Ultra hits roughly 42× realtime on large-v3-turbo (ANE only) per the upstream benchmarks; M-series laptops still clear realtime by a wide margin.

visionOS

Apple Vision Pro · spatial captioning

Vision Pro has the RAM and the M-series silicon to run the full large-v3 comfortably, which makes WhisperKit the natural choice for on-device live captioning, immersive meeting tools, and accessibility overlays inside visionOS apps. Build target: same Swift package, no separate framework. v1.0.0 ships TTSKit alongside, so a captions-plus-readback pipeline lives in one dependency.

visionOS is a first-class target in the umbrella ArgmaxOSS product but verify your minimum deployment against the active v1.x README before shipping to TestFlight.

watchOS

Apple Watch · tiny/base only

Treat watchOS as a feasibility target, not a primary one. The Watch's tighter memory budget means tiny and base are the only realistic Whisper variants, and on-device transcription is usually a fallback path while audio also offloads to the paired iPhone. The legacy WhisperKit README listed watchOS 10+ in its platform matrix; the v1.0.0 README leads with macOS + iOS, so test against the current Package.swift before committing.

Open-source surface tracks the package's stated minimums — watch the Examples folder for a watchOS sample app before promising watch-native transcription in a launch.

CLI · whisperkit-cli

macOS terminal · brew install whisperkit-cli

Two flavors. Run whisperkit-cli transcribe from the shell for batch jobs, or run whisperkit-cli serve to spin up an OpenAI-compatible local server (POST /v1/audio/transcriptions, /v1/audio/translations, SSE streaming). The serve mode is the easiest way to drop WhisperKit into an existing Python / Node app on a Mac without writing any Swift — point the openai SDK's base_url at localhost.

Streaming microphone is exposed via the --stream flag on transcribe; partial results emit as the audio comes in.

If you also need diarization or on-device TTS, the same Swift package exposes SpeakerKit and TTSKit — import them as separate products from the umbrella ArgmaxOSS target. For non-Apple targets see whisper.cpp (Linux / Windows / Android / WASM) and faster-whisper (CUDA server-class GPUs).

Setup recipes · pick your platform

Three recipes covering the most common WhisperKit integrations. Verified against v1.0.0 of the Argmax Open-Source SDK and the current README.

1. Add to a Swift project · SPM

Xcode File → Add Package Dependencies, paste the URL, depend on the WhisperKit product. Five-line transcribe.

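// swift-tools-version: 5.10
// Package.swift (tools version is an assumption; match it to your toolchain)
import PackageDescription

let package = Package(
    name: "YourApp",
    platforms: [.macOS(.v14), .iOS(.v17)],
    dependencies: [
        .package(
            url: "https://github.com/argmaxinc/argmax-oss-swift.git",
            from: "1.0.0"
        ),
    ],
    targets: [
        .target(
            name: "YourApp",
            dependencies: [
                // Just the STT kit:
                .product(name: "WhisperKit", package: "argmax-oss-swift"),
                // Or import the full SDK (WhisperKit + SpeakerKit + TTSKit):
                // .product(name: "ArgmaxOSS", package: "argmax-oss-swift"),
            ]
        ),
    ]
)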

// Anywhere in your code:
import WhisperKit

Task {
    let pipe = try await WhisperKit()
    let result = try await pipe.transcribe(
        audioPath: "path/to/audio.m4a"
    )
    print(result?.text ?? "")
}

Models download lazily on first call. Pin one for production: WhisperKit(WhisperKitConfig(model: "large-v3-v20240930_626MB")). Full model list: huggingface.co/argmaxinc/whisperkit-coreml ↗.

2. iOS app · streaming mic

Start from the in-repo WhisperAX sample app — it wires AVAudioEngine into WhisperKit with partial-result emission.

// Sketch · clone of WhisperAX's streaming loop
import WhisperKit
import AVFoundation

@MainActor
final class Dictation: ObservableObject {
    @Published var partial: String = ""
    private var pipe: WhisperKit?
    private let engine = AVAudioEngine()

    func start() async throws {
        pipe = try await WhisperKit(
            WhisperKitConfig(model: "base.en")
        )

        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)
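        input.installTap(onBus: 0, bufferSize: 4096, format: format) { buf, _ in
            // Copy samples out first so the Task captures a Sendable [Float].
            // NOTE: toFloatArray() is not an AVFoundation API; treat it as a
            // placeholder helper you supply. It must also resample to 16 kHz mono,
            // because the input node delivers audio at the hardware sample rate.
            // WhisperAX uses an audio buffer + VAD ring; mirror that here.
            let chunk = buf.toFloatArray()
            Task { @MainActor [weak self] in
                guard let pipe = self?.pipe else { return }
                if let r = try? await pipe.transcribe(audioArray: chunk) {
                    self?.partial = r.text
                }
            }
        }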
        try engine.start()
    }
}

The WhisperAX sample handles VAD chunking and ring-buffer plumbing — copy that pattern; the open-source surface does not currently ship a higher-level AudioStreamTranscriber class. Production streaming with sub-200ms guarantees lives in Argmax Pro ↗.

3. CLI on macOS · whisperkit-cli

Homebrew is the fast path; swift run is the build-from-source path.

# 1. Install via Homebrew (macOS)
brew install whisperkit-cli

whisperkit-cli transcribe \
  --model "large-v3" \
  --audio-path input.wav

# 2. Or build from source
git clone https://github.com/argmaxinc/argmax-oss-swift.git
cd argmax-oss-swift
make setup
make download-model MODEL=large-v3-v20240930_626MB
swift run whisperkit-cli transcribe \
  --model-path "Models/whisperkit-coreml/openai_whisper-large-v3-v20240930_626MB" \
  --audio-path input.wav

# 3. OpenAI-compatible local server
swift run whisperkit-cli serve --model tiny --port 50060
# then call it from any OpenAI SDK with base_url=http://localhost:50060/v1

Microphone streaming: add --stream to whisperkit-cli transcribe. For air-gapped builds run make download-model ahead of time and ship the .mlmodelc bundles. Models live at huggingface.co/argmaxinc/whisperkit-coreml ↗.

What it really is

WhisperKit is an open-source Swift package from Argmax Inc. that runs OpenAI Whisper speech-to-text models entirely on Apple Silicon devices using CoreML. It exists because the reference Whisper code from OpenAI is Python plus PyTorch, and whisper.cpp, the most popular C++ port, treats the Apple Neural Engine as an opt-in extra rather than the primary execution path. WhisperKit compiles each Whisper variant into .mlmodelc bundles that the OS schedules across the Apple Neural Engine (ANE), GPU (Metal), and CPU automatically. A single import therefore gives you idiomatic async Swift transcription with hardware acceleration that whisper.cpp can match only with extra build flags and converted models.

The project was open-sourced under the MIT license in January 2024. On 2026-05-01 it graduated to v1.0.0 and was renamed the Argmax Open-Source SDK (repo argmaxinc/argmax-oss-swift), bundling three turn-key kits in one Swift package: WhisperKit (speech-to-text, Whisper), SpeakerKit (diarization, pyannote), and TTSKit (text-to-speech, Qwen3-TTS). The release adopts Swift 6 strict concurrency and vendors swift-transformers internally so consumer projects no longer pull HuggingFace's Hub library transitively.

Argmax distributes pre-converted CoreML weights for the entire Whisper family on HuggingFace at argmaxinc/whisperkit-coreml — tiny, base, small, medium, large-v2, large-v3, the September 2024 large-v3-v20240930 (better Spanish/Hindi/Korean), Distil-Whisper, plus quantized 'turbo' variants in the 547-955MB range that cut model size in half with minimal WER regression. Models download lazily on first use; whisperkit-cli ships via Homebrew (`brew install whisperkit-cli`) for command-line transcription, and a built-in OpenAI-compatible local server (POST /v1/audio/transcriptions) lets non-Swift apps call WhisperKit through the standard OpenAI SDK. Argmax also publishes a closed-source Pro SDK that adds real-time speaker-attributed transcription, custom vocabulary up to 3,000 terms, an Android Kotlin port, and a WebSocket streaming server compatible with Deepgram. The open-source package targets macOS 14+ and Xcode 16+; Apple Silicon (M1 or later, A14+ on iOS) is required for ANE acceleration. License: MIT.

Key specs

Repository

argmaxinc/argmax-oss-swift (renamed from argmaxinc/WhisperKit, 2026-05-01)

Latest release

v1.0.0 — 2026-05-01

License

MIT

Min platforms

macOS 14+, iOS 17+ · watchOS 10+, visionOS 1+ (confirm against the current Package.swift) · Xcode 16+

Swift toolchain

Swift 5.10 + Swift 6 strict concurrency

Whisper models

tiny / base / small / medium / large-v2 / large-v3 / large-v3-v20240930 / distil-large-v3 + quantized 'turbo' variants (547MB → 955MB)

Sister kits

SpeakerKit (pyannote diarization) · TTSKit (Qwen3-TTS)

Inference path

CoreML — auto-selects Apple Neural Engine + GPU (Metal) + CPU per layer

HuggingFace pull rate

~10.5M downloads/month for whisperkit-coreml

CLI

whisperkit-cli — `brew install whisperkit-cli` or `swift run`

Local server

Built-in OpenAI-compatible Audio API (POST /v1/audio/transcriptions, /v1/audio/translations, SSE streaming)

Performance (cited)

Device	Model	Speed	Source
M2 Ultra · ANE only	Whisper Large v3 Turbo	~42× realtime	source ↗
M2 Ultra · GPU + ANE	Whisper Large v3 Turbo	~72× realtime	source ↗
M3 Max · ANE	Large v3 Turbo decoder forward pass	4.6 ms / token (45% reduction vs non-CoreML baseline 8.4 ms)	source ↗
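
To make the figures in the table concrete, here is a small Python sketch of the arithmetic. It assumes "N× realtime" means audio duration divided by wall-clock processing time; the function names are illustrative and not part of WhisperKit.

```python
def processing_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Wall-clock time to transcribe, assuming factor = audio duration / processing time."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


def percent_reduction(baseline_ms: float, optimized_ms: float) -> float:
    """Relative latency reduction, in percent of the baseline."""
    return (baseline_ms - optimized_ms) / baseline_ms * 100.0


if __name__ == "__main__":
    one_hour = 3600.0
    # Inputs are the figures quoted in the performance table.
    print(f"42x realtime: {processing_seconds(one_hour, 42):.1f} s per hour of audio")
    print(f"72x realtime: {processing_seconds(one_hour, 72):.1f} s per hour of audio")
    print(f"8.4 ms -> 4.6 ms: {percent_reduction(8.4, 4.6):.1f}% lower latency")
```

Under that definition, an hour of audio takes roughly 86 seconds at 42× and 50 seconds at 72×, and the 8.4 ms to 4.6 ms change works out to the quoted 45%.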

Get started — code

Swift Package install · swift

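// swift-tools-version: 5.10
// Package.swift (tools version is an assumption; match it to your toolchain)
import PackageDescription

let package = Package(
    name: "YourApp",
    platforms: [.macOS(.v14), .iOS(.v17)],
    dependencies: [
        .package(url: "https://github.com/argmaxinc/argmax-oss-swift.git", from: "1.0.0"),
    ],
    targets: [
        .target(
            name: "YourApp",
            dependencies: [
                .product(name: "WhisperKit", package: "argmax-oss-swift"),
                // Or .product(name: "ArgmaxOSS", ...) for WhisperKit + SpeakerKit + TTSKit
            ]
        ),
    ]
)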

Minimal transcription · swift

import WhisperKit

Task {
    let pipe = try await WhisperKit()
    let result = try await pipe.transcribe(audioPath: "path/to/audio.m4a")
    print(result?.text ?? "")
}

// Pin a specific model:
let pipe = try await WhisperKit(WhisperKitConfig(
    model: "large-v3-v20240930_626MB"
))

CLI · whisperkit-cli · bash

# Install via Homebrew
brew install whisperkit-cli

# Or build from source
git clone https://github.com/argmaxinc/argmax-oss-swift.git
cd argmax-oss-swift
make setup
make download-model MODEL=large-v3-v20240930_626MB
swift run whisperkit-cli transcribe \
  --model-path "Models/whisperkit-coreml/openai_whisper-large-v3-v20240930_626MB" \
  --audio-path audio.m4a

# Mic streaming
swift run whisperkit-cli transcribe --model-path ... --stream

OpenAI-compatible local server (any language) · bash

# Start the WhisperKit server
swift run whisperkit-cli serve --model tiny --port 50060

# Call it with the standard OpenAI SDK:
python - <<'PY'
from openai import OpenAI
client = OpenAI(base_url="http://localhost:50060/v1", api_key="unused")
resp = client.audio.transcriptions.create(
    file=open("audio.wav", "rb"),
    model="tiny",
)
print(resp.text)
PY
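
If adding the openai package is not an option, the same endpoint can probably be reached with the Python standard library alone. The sketch below builds the multipart/form-data request by hand. It assumes the local server accepts the usual OpenAI-style `file` and `model` form fields and returns JSON with a top-level `text` field; the helper names are hypothetical.

```python
import json
import urllib.request
import uuid


def build_transcription_request(base_url: str, audio: bytes, filename: str, model: str) -> urllib.request.Request:
    """Build (but do not send) a multipart POST for {base_url}/audio/transcriptions."""
    boundary = uuid.uuid4().hex
    crlf = "\r\n"
    head = (
        f"--{boundary}{crlf}"
        f'Content-Disposition: form-data; name="model"{crlf}{crlf}'
        f"{model}{crlf}"
        f"--{boundary}{crlf}"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"{crlf}'
        f"Content-Type: application/octet-stream{crlf}{crlf}"
    ).encode("utf-8")
    tail = f"{crlf}--{boundary}--{crlf}".encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/audio/transcriptions",
        data=head + audio + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def transcribe_file(path: str, base_url: str = "http://localhost:50060/v1", model: str = "tiny") -> str:
    """Send a file to a running local server and return the transcript text."""
    with open(path, "rb") as handle:
        request = build_transcription_request(base_url, handle.read(), path.rsplit("/", 1)[-1], model)
    with urllib.request.urlopen(request) as response:
        # Assumption: OpenAI-style JSON body with a top-level "text" field.
        return json.load(response)["text"]
```

With the server from the previous step running, `transcribe_file("audio.wav")` would return the transcript string. Keeping request construction separate from sending makes the payload easy to inspect without a live server.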

How it compares

vs whisper.cpp

whisper.cpp ships as portable C/C++ with a `WHISPER_COREML=1` build flag plus a separate `generate-coreml-model.py` step that converts the encoder only; the decoder still runs in ggml on CPU/Metal. WhisperKit ships pre-converted .mlmodelc bundles for both encoder and decoder, so the Apple Neural Engine handles the heavy attention layers without bridging headers, callback APIs, or manual memory management. On an M3 Max using the ANE, Argmax measured a 45% latency reduction (8.4 ms → 4.6 ms per Large v3 Turbo decoder forward pass) versus a non-CoreML baseline. Bottom line: whisper.cpp is the right answer for Linux servers and Intel Macs; WhisperKit is the right answer the moment you target Apple Silicon and want native Swift idioms.

vs whisperX

whisperX is a Python project that combines faster-whisper, wav2vec2 forced alignment, and pyannote diarization to produce word-timestamped, speaker-labeled transcripts on CUDA. WhisperKit's open-source surface is transcription only; diarization is now its sibling kit SpeakerKit (also pyannote, in the same Swift package as of v1.0.0); word-level forced alignment is not in the OSS package. To approximate whisperX behavior on a Mac, compose WhisperKit + SpeakerKit and use Whisper segment-level timestamps; for word-level alignment plus real-time speaker labels, Argmax Pro is the supported path.
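
A rough sketch of the composition step described above: given segment-level timestamps from transcription and speaker turns from diarization, label each segment with the speaker whose turn overlaps it most. It is written in Python for brevity. The tuple shapes are assumptions for illustration, not the WhisperKit or SpeakerKit result types.

```python
from typing import List, Optional, Tuple

Segment = Tuple[float, float, str]  # (start_s, end_s, text), hypothetical shape
Turn = Tuple[float, float, str]     # (start_s, end_s, speaker), hypothetical shape


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments: List[Segment], turns: List[Turn]) -> List[Tuple[Optional[str], str]]:
    """Attach to each segment the speaker with the largest temporal overlap."""
    labeled = []
    for seg_start, seg_end, text in segments:
        best_speaker, best_overlap = None, 0.0
        for turn_start, turn_end, speaker in turns:
            shared = overlap(seg_start, seg_end, turn_start, turn_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        labeled.append((best_speaker, text))  # None when no turn overlaps
    return labeled


if __name__ == "__main__":
    segments = [(0.0, 2.5, "Hi, thanks for joining."), (2.5, 6.0, "Happy to be here.")]
    turns = [(0.0, 2.4, "SPEAKER_00"), (2.4, 6.2, "SPEAKER_01")]
    for speaker, text in label_segments(segments, turns):
        print(f"{speaker}: {text}")
```

Because the labels are assigned per segment, a segment that spans a speaker change receives only the dominant speaker. That is the practical cost of working without word-level alignment.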

vs faster-whisper

faster-whisper is a CTranslate2-based Whisper runtime — outstanding on NVIDIA GPUs and very strong on x86 CPU, but on Apple Silicon it cannot use the Neural Engine and lands on CPU. A Swift or Mac developer picking faster-whisper has to bundle Python (or use the C++ ctranslate2 lib through a custom binding), download non-CoreML weights, and lose ANE acceleration. WhisperKit's CoreML stack uses ANE + GPU + CPU automatically, integrates with Swift async/await, and ships through SPM. faster-whisper remains the right pick for Linux/CUDA servers; WhisperKit is the right pick on every Apple platform.

vs MacWhisper

MacWhisper is an end-user Mac transcription app built by Jordi Bruin on top of whisper.cpp; WhisperKit is the SDK other apps embed. Argmax does not publish a flagship end-user app — third-party apps like Superwhisper (App Store ID 6471464415) are the most prominent products in the WhisperKit ecosystem. If you want an app, use MacWhisper or Superwhisper; if you want to build the next one, use WhisperKit.

Who picks this

iOS app developer

Embed Whisper directly in a SwiftUI app for offline voice notes, in-game voice commands, or accessibility captions. Async/await API, single binary, no network calls.

Mac dictation tool builder

Build Superwhisper / Wispr-style global dictation by pairing WhisperKit's mic streaming with a system-wide hotkey. CoreML keeps latency low enough for real-time typing on M1+.

macOS background transcription daemon

Run whisperkit-cli serve as a launchd service exposing OpenAI-compatible HTTP on localhost:50060, then point any OpenAI SDK in any language at it for batch transcription jobs. Avoids per-minute cloud Whisper costs entirely.

Existing Python/Node app on OpenAI Audio API

Drop in WhisperKit's local server as a base_url override; existing client code (openai-python, openai-node) keeps working. Useful for HIPAA / on-device-required deployments.

Multi-kit speech app

Use the ArgmaxOSS umbrella to import WhisperKit + SpeakerKit + TTSKit together for an end-to-end pipeline (transcribe → diarize → respond with synthesized speech) with one Swift package dependency.

Every link in one place

GitHub · argmaxinc/argmax-oss-swift (current repo)
GitHub · argmaxinc/WhisperKit (legacy URL, redirects)
Latest release · v1.0.0 (Argmax Open-Source SDK)
Swift Package Index
Argmax homepage
Argmax blog
Argmax Pro SDK 2 announcement (April 2026)
Argmax Pro SDK for Android
WhisperKit CoreML model weights · HuggingFace
WhisperKit Benchmarks (HuggingFace Space)
whisperkittools (model conversion tooling)
WhisperKit research paper · arXiv 2507.10860
Sample app · WhisperAX (in-repo Examples/WhisperAX)
Superwhisper (third-party dictation app)
Argmax Discord community

Features

Speaker diarization	No
Word-level timestamps	Yes
Streaming / real-time	Yes
Languages supported	99
HIPAA eligible	No

WhisperKit vs Whipscribe

Feature	WhisperKit	Whipscribe
Category	Open source	Transcription APIs
Pricing	free	free beta
Speaker diarization	No	Yes
Word timestamps	Yes	Yes
Streaming	Yes	No
Languages	99	99
Platforms	macOS, iOS, iPadOS, watchOS, visionOS	Web, API, MCP

Alternatives to WhisperKit

OpenAI Whisper

OpenAI

The reference open-source multilingual ASR model from OpenAI.

OSS · MIT ★ 98.1k

whisper.cpp

Georgi Gerganov

C/C++ port of Whisper — runs on anything, from a Raspberry Pi to Apple Silicon.

OSS · MIT ★ 48.8k

faster-whisper

SYSTRAN

4× faster than reference Whisper using CTranslate2 — production sweet spot.

OSS · MIT ★ 22.3k

Frequently asked about WhisperKit

Is WhisperKit the same as whisper.cpp on Mac?

No. whisper.cpp is a portable C/C++ Whisper port that runs Whisper on CPU with optional Metal GPU and an opt-in CoreML encoder; WhisperKit is a Swift-native package that compiles the full encoder and decoder to CoreML and lets the OS schedule layers across the Apple Neural Engine, GPU, and CPU automatically. WhisperKit is the right pick if you are shipping a Swift/SwiftUI app on Apple Silicon; whisper.cpp is the right pick when you need to run Whisper on Linux servers, Windows, Intel Macs, or embedded targets with no Apple framework available.

Does WhisperKit use the Apple Neural Engine (ANE)?

Yes. The CoreML model bundles published at huggingface.co/argmaxinc/whisperkit-coreml are compiled to run on ANE plus GPU plus CPU, and WhisperKit picks the compute units automatically. You can also pin them — e.g. `cpuAndNeuralEngine` to force ANE, `cpuAndGPU` to force Metal — via WhisperKitConfig.

How does WhisperKit compare to faster-whisper on Apple Silicon?

faster-whisper is a Python wrapper around CTranslate2 that targets CUDA and CPU; on Apple Silicon it falls back to CPU, so it does not use the Neural Engine and trails WhisperKit on Mac and iPhone. If you control the box and have an NVIDIA GPU, faster-whisper is excellent; if you ship a Mac or iOS app and want hardware acceleration without bundling Python or a CUDA runtime, WhisperKit wins by construction.

What's the difference between WhisperKit (open source) and Argmax Pro?

WhisperKit and the rest of the Argmax Open-Source SDK are MIT-licensed and ship the OpenAI Whisper, pyannote, and Qwen3-TTS models. Argmax Pro SDK is a closed-source extension with: real-time streaming transcription with live speaker attribution, custom-vocabulary support up to 3,000 keywords for domain accuracy, an Android/Kotlin port, a Deepgram-compatible WebSocket Local Server, and the Pro model variants (whisperkit-pro, parakeetkit-pro, speakerkit-pro). Pricing is on Argmax's site behind a 14-day trial.

Does WhisperKit support real-time / streaming transcription?

The open-source SDK supports microphone streaming via the CLI's `--stream` flag and partial-result streaming over Server-Sent Events from the local server, so you can build dictation-style apps. True real-time streaming with diarization and word-level latency guarantees is a Pro SDK feature; the open-source path streams transcripts as they're generated but does not promise sub-200ms first-token guarantees.
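
For clients consuming the Server-Sent Events stream, the wire format is line-oriented: `data:` lines accumulate until a blank line ends the event. The Python sketch below parses that framing and joins partial text. The JSON event shape (a `delta` field) and the `[DONE]` sentinel are assumptions modeled on OpenAI-style streams; check the payloads your server version actually emits.

```python
import json
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event from text lines."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                  # a blank line terminates the event
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):      # comment or keep-alive line
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An unterminated trailing event is dropped, as the SSE format specifies.


def join_deltas(lines: Iterable[str]) -> str:
    """Concatenate partial transcript text from a stream of SSE lines."""
    parts = []
    for payload in iter_sse_data(lines):
        if payload == "[DONE]":         # assumed end-of-stream sentinel
            break
        parts.append(json.loads(payload).get("delta", ""))  # assumed field name
    return "".join(parts)


if __name__ == "__main__":
    sample = [
        'data: {"delta": "Hello"}\n', "\n",
        ": keep-alive\n",
        'data: {"delta": " world"}\n', "\n",
        "data: [DONE]\n", "\n",
    ]
    print(join_deltas(sample))
```

In a real client the `lines` iterable would come from the HTTP response body, decoded line by line.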

Where do I get the CoreML model files?

All variants are hosted at huggingface.co/argmaxinc/whisperkit-coreml. WhisperKit downloads the recommended model on first run; you can override with WhisperKitConfig(model:) using a glob like `large-v3-v20240930_626MB`. For air-gapped builds, run `make download-model MODEL=...` (or `make download-models` for the full set) and ship the resulting .mlmodelc bundles inside your app.
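
To see why pinning the full variant name matters, here is a Python sketch of glob-style matching over model folder names. It does not reproduce WhisperKit's own resolver. The folder list is illustrative (only the `_626MB` entry appears elsewhere on this page) and the wrap-in-wildcards rule is an assumption.

```python
from fnmatch import fnmatchcase
from typing import List, Sequence

# Illustrative folder names; check the HuggingFace repo for the real list.
EXAMPLE_FOLDERS = [
    "openai_whisper-tiny",
    "openai_whisper-base",
    "openai_whisper-large-v3",
    "openai_whisper-large-v3-v20240930_626MB",
]


def matching_folders(pattern: str, folders: Sequence[str] = EXAMPLE_FOLDERS) -> List[str]:
    """Return the folders whose name contains the glob pattern (case-sensitive)."""
    return [name for name in folders if fnmatchcase(name, f"*{pattern}*")]


if __name__ == "__main__":
    # A loose pattern is ambiguous; the full variant name is not.
    print(matching_folders("large-v3"))
    print(matching_folders("large-v3-v20240930_626MB"))
```

A short pattern such as `large-v3` matches more than one folder in this example, while the full `large-v3-v20240930_626MB` name selects exactly one.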

Is WhisperKit the same as whisperX?

No. whisperX is a Python project layering forced alignment (wav2vec2) and pyannote diarization on top of faster-whisper, primarily on CUDA. WhisperKit is a Swift CoreML inference framework; in the Argmax SDK 1.0.0 release, diarization is now a sibling kit (SpeakerKit, also pyannote-based) you can compose with WhisperKit, but word-level alignment is not part of the open-source surface. Visitors looking for whisperX behavior on Mac usually combine WhisperKit + SpeakerKit, or use Argmax Pro.

Does WhisperKit work on iPhone, iPad, Apple Watch, Vision Pro?

Yes. The package targets iOS, iPadOS, watchOS, and visionOS. Practical model size is the constraint: tiny and base run on Apple Watch and older iPhones; large-v3 quantized variants (547-626MB) target iPhone 15 Pro and newer with 8GB RAM. Vision Pro and M-series iPads run the full large-v3 comfortably.

What models should I use for production?

Argmax recommends `large-v3-v20240930_626MB` for maximum multilingual accuracy and `tiny` for fast iteration. The September 2024 v3 checkpoint is OpenAI's last Whisper update and noticeably better than 2023 large-v3 on Spanish, Hindi, and Korean. The `_turbo` suffix variants drop the heavy decoder for a lighter one with negligible WER regression on English; pick `_turbo_600MB` if real-time is the priority and `_626MB` non-turbo if WER is.

I searched 'whisperx on mac' — what should I use?

On Mac, WhisperKit + SpeakerKit covers the diarization half of whisperX with hardware acceleration the Python whisperX stack can't reach. You lose word-level forced alignment in the open-source path; if you need it, either run whisperX in a Linux Docker container or move to Argmax Pro.

Does WhisperKit support faster-whisper-style Apple Silicon Metal acceleration?

WhisperKit goes further than Metal: it uses CoreML, which schedules across ANE + GPU + CPU based on layer cost. faster-whisper has no Metal backend at all on Apple Silicon — it is CPU-only there. If your search was 'faster-whisper apple silicon metal support', WhisperKit is the answer for that intent.

Is WhisperKit free to use?

Yes — WhisperKit and the rest of the Argmax Open-Source SDK are MIT-licensed and free for commercial use. Argmax also publishes a closed-source Pro SDK with custom-vocabulary, real-time speaker-attributed streaming, and an Android port; pricing is on argmaxinc.com.

Does WhisperKit run on iOS?

Yes. WhisperKit ships on macOS 14+, iOS 17+, watchOS 10+, and visionOS — all CoreML-accelerated on Apple Silicon. Inference happens fully on-device; no network round-trip is required.

Does it work on Intel Macs?

It installs (Swift package, no architecture lock) but the CoreML weights are tuned for Apple Silicon. Intel Macs have no Neural Engine, so compute falls back to CPU + GPU and performance is similar to whisper.cpp's CPU mode.

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or drop a file in the form below — you'll have a transcript in seconds.