phónos: voice, sound, speech... but also murder, slaughter, homicide (yes, really)
Speak freely. Your words, where you need them.
A Whisper Flow-style dictation tool that runs entirely on hardware you control. Press a hotkey, talk, and watch your words appear in whatever app you're in. No cloud. No subscriptions. No mystery box between your microphone and your text.
Beta: usable for local/Tailscale dictation, but still early and ad-hoc signed.
Phonos slays subscription dictation; your wallet gets to stay alive. ☠️

---

The name means "murder" in Greek. We are keeping the bit, just making the software around it more serious.
| ☁️ Cloud dictation | 🔪 Phonos |
|---|---|
| Audio is processed by a remote service | Audio goes only to your configured server |
| Subscription required | Free and open source |
| Latency depends on internet and vendor load | Bounded by your hardware and model choice |
| One model fits all | Pick the model for your hardware |
| Vendor behavior is opaque | Open source and auditable |
| Surprise product decisions | You run the thing |

---

- Press a hotkey: hold it down or toggle, your call.
- Talk: your Mac captures the audio.
- Whisper transcribes: your server runs `faster-whisper` in a dedicated subprocess.
- Text appears: pasted directly into the app you were using, with clipboard fallback.

---

The server runs the active Whisper model in a dedicated subprocess. The main
FastAPI process communicates with that worker through local queues. When you
switch models via `PUT /models/active`, the old worker is stopped and a new
worker is started with the requested model, allowing the operating system to
reclaim model memory cleanly.
Same battle-tested idea as one-process-per-model serving, minus the orchestration ceremony for a local dictation box.
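
The worker lifecycle described above can be sketched with the standard library alone. This is a minimal illustration and not the Phonos implementation: `ModelWorker`, `worker_main`, and the string "transcription" are hypothetical stand-ins, and the `fork` start method is assumed only so the sketch runs as a flat script on Linux or macOS.

```python
import multiprocessing as mp

# ASSUMPTION: "fork" keeps this sketch runnable as a flat script on
# Linux/macOS. It is not available on Windows.
_CTX = mp.get_context("fork")


def worker_main(model_name, requests, responses):
    """Child process: owns the (stand-in) model for its whole lifetime."""
    # A real worker would load the Whisper model here, exactly once.
    responses.put(("ready", model_name))
    while True:
        job = requests.get()
        if job is None:  # sentinel from the parent: shut down
            break
        # Stand-in for transcription so the sketch has no dependencies.
        responses.put(("result", f"[{model_name}] {job}"))


class ModelWorker:
    """Hypothetical wrapper: one child process per active model."""

    def __init__(self, model_name):
        self._start(model_name)

    def _start(self, model_name):
        self._requests = _CTX.Queue()
        self._responses = _CTX.Queue()
        self._proc = _CTX.Process(
            target=worker_main,
            args=(model_name, self._requests, self._responses),
            daemon=True,
        )
        self._proc.start()
        kind, name = self._responses.get(timeout=30)  # wait for "ready"
        if kind != "ready":
            raise RuntimeError(f"unexpected worker message: {kind}")
        self.model_name = name

    def transcribe(self, audio_ref, timeout=30):
        self._requests.put(audio_ref)
        _kind, text = self._responses.get(timeout=timeout)
        return text

    def stop(self):
        self._requests.put(None)
        self._proc.join(timeout=5)
        if self._proc.is_alive():  # worker ignored the sentinel
            self._proc.terminate()
            self._proc.join()
        # Shut down the parent-side feeder thread before any later fork.
        self._requests.close()
        self._requests.join_thread()

    def switch(self, model_name):
        # Stopping the process lets the OS reclaim all model memory.
        self.stop()
        self._start(model_name)


if __name__ == "__main__":
    worker = ModelWorker("base.en")
    print(worker.transcribe("clip.wav"))  # [base.en] clip.wav
    worker.switch("small.en")
    print(worker.transcribe("clip.wav"))  # [small.en] clip.wav
    worker.stop()
```

The point of the design is that `switch` never unloads a model in-process: it ends the process that held it, so nothing depends on the runtime releasing memory.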
The server is intended for localhost, LAN, or private-network use such as
Tailscale. Set `PHONOS_AUTH_TOKEN` before exposing it beyond localhost.

---

The "server" can be another machine on your LAN/Tailscale network, or just a
Docker container running locally on the same Mac. For local-only use, set the
Mac app Server URL to `http://localhost:8765`.

```bash
git clone https://github.com/jb381/phonos && cd phonos/apps/server
cp .env.example .env   # optional: set PHONOS_AUTH_TOKEN
docker compose up -d   # boom, transcription server on :8765
```

Or run the published server image directly:

```bash
docker run -d \
  --name phonos-server \
  -p 8765:8765 \
  -e PHONOS_AUTH_TOKEN=your-secret-token \
  -v phonos_models:/root/.cache/huggingface \
  ghcr.io/jb381/phonos-server:latest
```

Without Docker:

```bash
uv sync
uv run uvicorn phonos_server.main:app --host 0.0.0.0 --port 8765
```

From a release: download the latest `Phonos-*.dmg` from the Releases page, open it, and drag `Phonos.app` to Applications.

From source:

```bash
cd apps/macos
./dev-run.sh      # fast dev loop: build, quit old app, launch new app
./build.sh        # creates Phonos.dmg and Phonos.app
open Phonos.dmg   # then drag to Applications
```

For day-to-day macOS development, prefer `./dev-run.sh`. It builds a debug app
bundle at `.build/dev/Phonos.app`, ejects any mounted Phonos DMG volumes,
quits the currently running app, and opens the freshly built one. It avoids the
installer DMG loop entirely.

Releases are triggered by `git tag vX.Y.Z && git push --tags`. CI builds,
ad-hoc signs, and publishes a DMG automatically.

No Apple Developer account yet = ad-hoc signing. Gatekeeper may complain on first launch: right-click the app and choose Open, or go to System Settings → Privacy & Security and click Open Anyway. Accessibility permission may need to be re-granted after ad-hoc rebuilds.

To silence Gatekeeper from the terminal, `sudo` may be needed because the app lives in `/Applications`:

```bash
sudo xattr -dr com.apple.quarantine /Applications/Phonos.app
```

The grown-up version of this is Developer ID signing + notarization. It is on the roadmap.

macOS does not allow scripts to grant Microphone or Accessibility permissions. The practical development fix is stable signing: set `PHONOS_CODESIGN_IDENTITY` or install/use an Apple Development identity so the first manual grant sticks across rebuilds. Ad-hoc signing changes the app's code requirement often enough that macOS may ask for permissions again.

---

- Menu-bar status item with recording/transcribing/paste state
- Global hotkey (Control-Space by default, customizable)
- Hold-to-record and toggle recording modes
- Direct auto-paste into the previously active application
- Clipboard fallback when Accessibility permission is not granted
- First-run setup for permissions and server connection
- Model selector with live switching from the server
- Persistent transcript history in the menu bar (SQLite, local to your Mac, searchable by text/model/language)
- Auth token stored in macOS Keychain

---

| Component | What you need |
|---|---|
| Server | Docker, a CPU (or GPU if you're fancy) |
| Client | macOS 14+, Xcode 15+ |
| Network | Tailscale or same LAN |

---

| Method | Path | Purpose |
|---|---|---|
| GET | `/health` | Server health + model info |
| GET | `/models` | List configured models |
| GET | `/models/active` | Get currently loaded model |
| PUT | `/models/active` | Switch active model |
| POST | `/transcribe` | Transcribe audio |

`PUT /models/active` and `POST /transcribe` require auth when `PHONOS_AUTH_TOKEN` is set.
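
As a rough illustration of calling these endpoints from a script, the sketch below builds requests with Python's standard library. Only the methods and paths come from the table above. The `Authorization: Bearer ...` header format and the `{"model": ...}` JSON body are assumptions made for the example, so check the server source for the actual contract. `build_request` and `call` are hypothetical helper names.

```python
import json
import urllib.request


def build_request(base_url, method, path, token=None, payload=None):
    """Build (but do not send) a request for a Phonos-style endpoint."""
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token:
        # ASSUMPTION: the server accepts a Bearer token; not confirmed
        # by the README.
        headers["Authorization"] = f"Bearer {token}"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(request, timeout=10):
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running locally, `call(build_request("http://localhost:8765", "GET", "/health"))` would fetch the health document, and `build_request("http://localhost:8765", "PUT", "/models/active", token="your-secret-token", payload={"model": "small.en"})` would prepare a model switch, provided the assumed body shape matches the server.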

---

```env
PHONOS_AUTH_TOKEN= # leave empty to skip auth
PHONOS_MODEL=base.en
PHONOS_MODELS=tiny.en,base.en,small.en,medium.en,large-v3,turbo,distil-large-v3
PHONOS_DEVICE=cpu
PHONOS_COMPUTE_TYPE=int8
PHONOS_VAD_FILTER=true
PHONOS_TRANSCRIBE_TIMEOUT_SECONDS=600
PHONOS_MAX_UPLOAD_MB=100
```

Docker Compose binds to `127.0.0.1` by default. For remote access, set `PHONOS_BIND=0.0.0.0` and `PHONOS_AUTH_TOKEN`.
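
To make the meaning of these variables concrete, here is a minimal sketch of how a server could read them into typed settings. The variable names and example values come from the configuration above. The `Settings` class, the `load_settings` helper, the use of those values as defaults, and the boolean parsing rules are assumptions for illustration, not the actual Phonos code.

```python
import os
from dataclasses import dataclass

# ASSUMPTION: the example values shown in the README double as defaults.
_DEFAULT_MODELS = "tiny.en,base.en,small.en,medium.en,large-v3,turbo,distil-large-v3"


def _as_bool(value):
    """Interpret common truthy spellings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    auth_token: str
    model: str
    models: tuple
    device: str
    compute_type: str
    vad_filter: bool
    transcribe_timeout_seconds: int
    max_upload_mb: int

    @property
    def auth_enabled(self):
        # An empty token means auth is skipped.
        return bool(self.auth_token)


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    raw_models = env.get("PHONOS_MODELS", _DEFAULT_MODELS)
    models = tuple(m.strip() for m in raw_models.split(",") if m.strip())
    settings = Settings(
        auth_token=env.get("PHONOS_AUTH_TOKEN", "").strip(),
        model=env.get("PHONOS_MODEL", "base.en").strip(),
        models=models,
        device=env.get("PHONOS_DEVICE", "cpu").strip(),
        compute_type=env.get("PHONOS_COMPUTE_TYPE", "int8").strip(),
        vad_filter=_as_bool(env.get("PHONOS_VAD_FILTER", "true")),
        transcribe_timeout_seconds=int(
            env.get("PHONOS_TRANSCRIBE_TIMEOUT_SECONDS", "600")
        ),
        max_upload_mb=int(env.get("PHONOS_MAX_UPLOAD_MB", "100")),
    )
    if settings.model not in settings.models:
        raise ValueError(
            f"PHONOS_MODEL={settings.model!r} is not listed in PHONOS_MODELS"
        )
    return settings
```

Failing fast when `PHONOS_MODEL` is missing from `PHONOS_MODELS` is a design choice of this sketch; the real server may handle that case differently.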

---

- Audio is recorded by the macOS app and uploaded only to the configured Phonos server.
- The server does not require internet access after model files are downloaded and cached.
- Set `PHONOS_AUTH_TOKEN` before binding the server to a LAN or private network interface.
- The macOS auth token is stored in Keychain.
- Temporary client recording files are removed after each transcription flow completes.
- Transcript history is stored locally in a SQLite database (`~/Library/Application Support/Phonos/history.sqlite`) and can be cleared or disabled in Settings.
- Server logs include request metadata and transcript text for debugging; run the server only where those logs are acceptable.
- Official macOS builds are ad-hoc signed and not notarized yet, so Gatekeeper may require manual approval on first launch.
- Phonos is intended for localhost, LAN, or private networks such as Tailscale. Do not expose the server directly to the public internet.
- Clipboard restoration is best-effort for complex clipboard contents, though normal text clipboard restore is supported.
- Keychain integration tests are manual because unsigned test binaries can trigger macOS permission prompts.

---

The `.en` models are English-only; `turbo` and `large-v3` are multilingual. Larger models are more accurate but slower and need more memory.

| Model | Params | Notes |
|---|---|---|
| `tiny.en` | 39M | Fastest, lowest memory |
| `base.en` | 74M | Fast, decent English quality |
| `small.en` | 244M | Good quality/speed; recommended CPU daily driver |
| `medium.en` | 769M | Better accuracy, handles harder speech |
| `turbo` | 809M | Speed-optimized, multilingual |
| `distil-large-v3` | 756M | Distilled large, strong English |
| `large-v3` | 1550M | Highest quality, very slow on CPU |

Start with `small.en` for CPU usage. Try `turbo` or `distil-large-v3` if you
need higher quality or multilingual transcription. Use `large-v3` only when the
server has enough CPU/GPU capacity and memory.

---

MIT: do whatever, just keep the Greek in the README. Preferably the murder one.

---

made with a mild obsession with terminal aesthetics and a name that apparently means murder