jjscholtes/FreeWhispr
FreeWhispr

FreeWhispr is a local macOS transcription app for recording or importing audio, generating transcripts, assigning speakers, and exporting results. It keeps the workflow simple with fast recording, searchable sessions and folders, and built-in editing tools for speaker names and transcript cleanup.

Current State & Technical Stack

  • macOS app (SwiftUI + AVFoundation) for recording/importing audio, session folders, transcript editing, speaker naming/reassignment, and exports
  • ASR (speech-to-text) runs locally with whisper.cpp, including Apple Silicon acceleration (Metal/GPU and Core ML encoder support when available)
  • Speaker separation (diarization) runs locally with pyannote.audio in a Python worker runtime (Hugging Face token required for gated pyannote model access)
  • Worker communication uses JSONL IPC between the Swift app and the local Python worker (setup validation, progress events, jobs, errors)
  • Storage is local in macOS Application Support (audio, transcript JSON/TXT/SRT, logs, metrics)
  • Security: the Hugging Face token is stored in the macOS Keychain (not in session/transcript JSON files)
  • Packaging scripts build a macOS .app, .zip, and .dmg, with optional on-demand speaker runtime install to reduce download size
  • Quality: includes versioned JSON contract examples and basic tests for session store, exports, reconciliation, and worker IPC
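The JSONL IPC between the Swift app and the Python worker can be sketched as follows. This is an illustrative example only: the event names and fields ("type", "job_id", "progress") are assumptions, not FreeWhispr's actual message contract.

```python
import json

# Hypothetical JSONL stream a worker might emit: one JSON object per line.
raw = "\n".join([
    '{"type": "setup", "whisper": "available", "pyannote": "available"}',
    '{"type": "progress", "job_id": "job-1", "progress": 0.5}',
    '{"type": "done", "job_id": "job-1"}',
])

def parse_events(stream):
    """Parse newline-delimited JSON into a list of event dicts,
    skipping blank lines."""
    return [json.loads(line) for line in stream.splitlines() if line.strip()]

for event in parse_events(raw):
    print(event["type"])
```

The appeal of JSONL for this kind of IPC is that each line is an independent, self-describing message, so the app can stream progress events as they arrive instead of waiting for one large response.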

Speaker Recognition (Speaker Separation) Setup - Step by Step

FreeWhispr can transcribe audio locally without extra account setup, but speaker separation (who said what) uses pyannote models hosted on Hugging Face and requires one-time access setup.

For people using the downloaded app (recommended)

If you installed FreeWhispr by downloading FreeWhispr.zip and dragging the app into Applications, follow these steps (no terminal needed):

1. Install and open FreeWhispr

  • Download FreeWhispr.zip
  • Unzip it
  • Drag FreeWhispr.app into Applications
  • Open FreeWhispr

2. Open Settings in FreeWhispr

  • Open Settings
  • Go to Diarization Setup / Model Access
  • Leave "Enable diarization by default" turned on (or toggle it on later per recording)

3. Create a Hugging Face token (one-time)

Speaker separation uses gated pyannote models, so you need a Hugging Face account + token.

  • Create or sign in to a Hugging Face account
  • Generate an access token with Read access
  • Copy the token

4. Request/accept access to the gated pyannote model (one-time)

Open the model page and request/accept access:

  • pyannote/speaker-diarization-community-1

If Hugging Face prompts for terms/approval, complete that with the same account that created your token.

5. Paste the token into FreeWhispr

  • In Settings, paste the token into Hugging Face token (pyannote)
  • Click Save settings
  • Click Validate setup

Validation should report something like:

  • whisper.cpp: available
  • pyannote: available
  • HF token: present
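Conceptually, Validate setup performs checks like the following. This is a hedged sketch under assumed inputs (the parameter names, paths, and labels are illustrative), not the app's actual implementation:

```python
import os

def validate_setup(whisper_path, pyannote_importable, hf_token):
    """Return a status map similar to the validation summary in Settings.
    The parameters and labels here are illustrative assumptions."""
    return {
        "whisper.cpp": "available" if os.path.exists(whisper_path) else "missing",
        "pyannote": "available" if pyannote_importable else "missing",
        "HF token": "present" if hf_token else "missing",
    }

# Example: a missing whisper.cpp binary shows up as "missing".
print(validate_setup("/path/to/whisper-cli", True, "hf_example_token"))
```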

6. Test speaker separation

  • Record or import an audio file
  • Make sure speaker separation/diarization is enabled
  • Process the recording

The first run may take longer while models are downloaded/cached locally.
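Under the hood, speaker separation boils down to labeling each transcript segment with the diarization turn it overlaps most. A minimal sketch of that reconciliation step (the data shapes are assumptions, not FreeWhispr's actual transcript format):

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the overlap between two time intervals, in seconds."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def assign_speakers(segments, turns):
    """Label each ASR segment with the diarization speaker whose turn
    overlaps it the most."""
    labeled = []
    for seg in segments:
        best = max(turns, key=lambda t: overlap(seg["start"], seg["end"],
                                                t["start"], t["end"]))
        labeled.append({**seg, "speaker": best["speaker"]})
    return labeled

# Illustrative ASR segments and diarization turns:
segments = [{"start": 0.0, "end": 2.0, "text": "Hello."},
            {"start": 2.5, "end": 5.0, "text": "Hi there."}]
turns = [{"start": 0.0, "end": 2.2, "speaker": "SPEAKER_00"},
         {"start": 2.2, "end": 5.0, "speaker": "SPEAKER_01"}]

for seg in assign_speakers(segments, turns):
    print(seg["speaker"], seg["text"])
# → SPEAKER_00 Hello.
# → SPEAKER_01 Hi there.
```

This also illustrates why very short clips produce weak splits: with little audio per speaker, the turns are short and the overlaps ambiguous.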

7. If it fails (common fixes)

Error: 401 Cannot access gated repo ... pyannote/speaker-diarization-community-1

  • Your token is present, but your account does not yet have access to the gated model
  • Go back to the model page and make sure access was approved/accepted
  • Confirm you used the same Hugging Face account for the token
  • Re-run Validate setup and try again

Error: whisper.cpp: missing

  • If you are using the packaged app, reinstall/update FreeWhispr and try Validate setup again
  • If you are running from source, run:
./scripts/install_whispercpp.sh --download-model large-v3-turbo
  • Restart the app and click Validate setup again

Transcription works, but no speaker labels

  • Make sure diarization/speaker separation is enabled
  • Very short clips or low-quality audio may produce weak speaker splits

Microphone records silence

  • Check macOS permissions: System Settings -> Privacy & Security -> Microphone
  • Allow access for FreeWhispr

8. Notes

  • The Hugging Face token is stored in the macOS Keychain (not in your transcript files).
  • You can still use FreeWhispr without speaker separation by disabling diarization.
  • Transcription and diarization run locally after the required models are installed and cached.
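The Keychain note above can be illustrated with the macOS `security` CLI. This sketch only builds the command; the service and account names are hypothetical, and FreeWhispr's actual Keychain item names may differ:

```python
def keychain_store_cmd(service, account, secret):
    """Build (not run) the `security add-generic-password` argument list.
    The -U flag updates the item in place if it already exists."""
    return ["security", "add-generic-password",
            "-s", service, "-a", account, "-w", secret, "-U"]

# Hypothetical item names for illustration only:
cmd = keychain_store_cmd("FreeWhispr", "huggingface-token", "<token>")
print(" ".join(cmd[:2]))
# → security add-generic-password
```

Storing the token this way keeps it out of plain-text files, which is why it never appears in session or transcript JSON.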

Running FreeWhispr from source (developer setup)

If you are running FreeWhispr from the source repo instead of using the packaged app, install the local worker dependencies first:

./scripts/install_worker_deps.sh

This creates a local Python environment (.venv313) and installs the worker dependencies (including pyannote.audio).

Then install whisper.cpp (default ASR backend) and a local model:

./scripts/install_whispercpp.sh --download-model large-v3-turbo

Then start the app from source:

cd app
swift run

Developer Notes

  • The packaged app is the easiest way to get started.
  • The source workflow is mainly for development and debugging.

Tests

./scripts/swift_test.sh
python3 -m unittest discover -s worker/tests -p 'test_*.py'

If ./scripts/swift_test.sh reports an SDK/compiler mismatch, update/select a matching Xcode/Command Line Tools installation. The script already works around the local module-cache permission issue.
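A worker test in that suite might take a shape like the following hypothetical unittest. The test case and event fields are assumptions for illustration; the real tests live under worker/tests and may differ:

```python
import json
import unittest

class TestWorkerEvents(unittest.TestCase):
    """Hypothetical shape of a worker IPC test: a progress event
    must survive a JSONL serialize/parse round trip."""

    def test_progress_event_roundtrip(self):
        event = {"type": "progress", "job_id": "job-1", "progress": 0.25}
        line = json.dumps(event)  # one JSONL line
        self.assertEqual(json.loads(line), event)

# Run the case programmatically instead of via unittest.main().
result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestWorkerEvents))
```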
