A cross-platform speech-to-text dictation app. Tyrant records your microphone, transcribes the audio, and types the result into your active window. Its pluggable module system makes it easy to support new platforms — each OS-specific concern (typing, notifications, transcription) is isolated behind an interface, so adding support for a new platform means implementing a small module rather than rewriting the app.
Tyrant uses three module types, each with a base class and one or more implementations. The first available implementation is selected automatically at startup.
| Module | What it does | Implementations (by priority) |
|---|---|---|
| Transcription | Converts recorded audio to text | whisper — local inference via faster-whisper · mistral — Mistral AI API · noop — placeholder |
| Output | Types the transcribed text into the active window | xdotool — Linux/X11 · noop — logs only |
| Notification | Shows status notifications to the user | notify-send — Linux (libnotify) · noop — logs only |
You can force a specific implementation via environment variable (TRANSCRIPTION, OUTPUT, NOTIFICATION). See Forcing a Specific Module below.
- Push-To-Talk (PTT): Record only when a specific key is held.
- System Tray Icon: Easy access to Mute/Unmute and Quit.
- Python 3.x
- PortAudio (required for `sounddevice`; e.g., `sudo apt install libportaudio2`)
- `xdotool` (optional, for automatic typing on Linux/X11)
- `notify-send` (optional, for system notifications on Linux)
- Mistral API Key (optional, if using Mistral transcription)
- Clone the repository.
- Create and activate a virtual environment:

      python3 -m venv .venv
      source .venv/bin/activate

- Install the required Python packages:

      pip install -r requirements.txt

Create a .env file in the root directory and add your Mistral AI API key and optionally the model name. You can also provide a comma-separated list of context bias terms to help the transcriber prefer specific words (e.g., product names, acronyms):

    MISTRAL_API_KEY=your_api_key_here
    MISTRAL_MODEL=voxtral-mini-2602
    MISTRAL_CONTEXT_BIAS=Kubernetes,K8s,PostgreSQL

`MISTRAL_CONTEXT_BIAS` is optional. When set, Tyrant passes these terms to Mistral as a context bias so the transcript is more likely to include them as spoken. Use a short, focused list; terms are case-sensitive and separated by commas.
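As a rough illustration of the comma-separated format, the sketch below splits a `MISTRAL_CONTEXT_BIAS` value into a list of terms. The helper name `parse_context_bias` is hypothetical, and trimming whitespace and skipping empty entries are assumptions; Tyrant's own parsing may differ.

```python
import os


def parse_context_bias(raw):
    """Split a comma-separated string into trimmed, non-empty terms.

    Hypothetical helper: case is preserved because the terms are
    described as case-sensitive.
    """
    if not raw:
        return []
    return [term.strip() for term in raw.split(",") if term.strip()]


# Read the variable from the environment; an unset variable yields [].
terms = parse_context_bias(os.environ.get("MISTRAL_CONTEXT_BIAS", ""))
```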
When faster-whisper is installed, local transcription is used by default (no API key required). Configure it with optional .env variables:

    WHISPER_MODEL=base
    WHISPER_DEVICE=auto
    WHISPER_COMPUTE_TYPE=auto

- `WHISPER_MODEL`: Model size: `tiny`, `base`, `small`, `medium`, `large-v3` (default: `base`). Larger models are more accurate but slower and use more memory.
- `WHISPER_DEVICE`: `auto`, `cpu`, or `cuda` (default: `auto`).
- `WHISPER_COMPUTE_TYPE`: `auto`, `float16`, `int8`, `int8_float16` (default: `auto`).
The model is downloaded automatically on first use.
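To show how these three variables could map onto faster-whisper, here is a minimal sketch. The function names `whisper_settings` and `transcribe_file` are hypothetical and not taken from Tyrant's source. It assumes the values are passed straight through to `WhisperModel`.

```python
import os


def whisper_settings(env=os.environ):
    """Read the Whisper variables, falling back to the documented defaults."""
    return {
        "model": env.get("WHISPER_MODEL", "base"),
        "device": env.get("WHISPER_DEVICE", "auto"),
        "compute_type": env.get("WHISPER_COMPUTE_TYPE", "auto"),
    }


def transcribe_file(file_path, env=os.environ):
    """Transcribe one audio file and return the joined segment text."""
    # Imported lazily so this sketch loads even without faster-whisper installed.
    from faster_whisper import WhisperModel

    cfg = whisper_settings(env)
    # Constructing the model triggers the download on first use.
    model = WhisperModel(
        cfg["model"], device=cfg["device"], compute_type=cfg["compute_type"]
    )
    # transcribe() returns a lazy iterator of segments plus metadata.
    segments, _info = model.transcribe(file_path)
    return " ".join(segment.text.strip() for segment in segments)
```

Calling `transcribe_file("recording.wav")` (a placeholder path) would load the configured model and return the text of the whole file.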
Run the application:

    python src/main.py

Options:

- `-v`, `--verbose`: Enable verbose logging.
- `--ptt KEY`: Use push-to-talk with the specified key (e.g., `ctrl`, `shift`, `caps_lock`).
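The options above could be declared with `argparse` roughly as follows. This is a sketch that assumes `--ptt` defaults to no key (manual mode); the function name `build_parser` and the description string are illustrative, not taken from `src/main.py`.

```python
import argparse


def build_parser():
    """Build a parser for the two documented options."""
    parser = argparse.ArgumentParser(description="Speech-to-text dictation")
    parser.add_argument(
        "-v", "--verbose", action="store_true", help="Enable verbose logging."
    )
    parser.add_argument(
        "--ptt",
        metavar="KEY",
        default=None,  # assumption: no key means manual mode
        help="Use push-to-talk with the specified key (e.g., ctrl, shift, caps_lock).",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for illustration.
args = build_parser().parse_args(["--ptt", "caps_lock"])
```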
When running, a tray icon appears showing the current status (Idle, Recording, Transcribing, Muted).
- Right-click the icon to Mute/Unmute or Quit the application.
- Manual (Default):
  - Run `python src/main.py`.
  - Recording starts immediately.
  - Press `Ctrl+C` or use the tray menu to stop.
- Push-To-Talk (PTT):
  - Run `python src/main.py --ptt caps_lock`.
  - The script waits for you to hold the specified key.
  - Recording starts when you press the key and stops when you release it.
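The push-to-talk behavior can be pictured as a two-state machine driven by key events. The sketch below is library-independent and purely illustrative: the class `PushToTalk` and its method names are hypothetical, and how Tyrant actually listens for keys is not shown here.

```python
class PushToTalk:
    """Start recording on key press and stop on key release."""

    def __init__(self, ptt_key, on_start, on_stop):
        self.ptt_key = ptt_key
        self.on_start = on_start
        self.on_stop = on_stop
        self.recording = False

    def key_pressed(self, key):
        # A held key often auto-repeats; only the first press starts recording.
        if key == self.ptt_key and not self.recording:
            self.recording = True
            self.on_start()

    def key_released(self, key):
        # Releases of other keys, or a release while idle, are ignored.
        if key == self.ptt_key and self.recording:
            self.recording = False
            self.on_stop()
```

A keyboard listener would forward its press and release events to `key_pressed` and `key_released`, with `on_start` and `on_stop` wired to the recorder.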
The application uses a flexible system for output, transcription, and notifications, defined in src/output.py, src/transcription.py, and src/notification.py. It automatically selects the first available method for each.
You can force a specific module via environment variables. If the forced module is not available (missing dependencies, missing API key, etc.), Tyrant will exit with an error on startup.

    TRANSCRIPTION=whisper     # or: mistral, noop
    OUTPUT=xdotool            # or: noop
    NOTIFICATION=notify-send  # or: noop

When these variables are not set, the first available module is used automatically (see priority order below).
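The selection rule described here (use the forced module or exit with an error, otherwise take the first available one) might look like the sketch below. The names `select_module`, `EXAMPLE_MODULES`, and the two stand-in classes are hypothetical. Treating `is_available()` as an instance method and relying on dict order for priority are also assumptions.

```python
import os
import sys


class MissingTool:
    """Stand-in for a module whose dependency is not installed."""

    def is_available(self):
        return False


class LogOnly:
    """Stand-in for a noop module, which is always available."""

    def is_available(self):
        return True


# Hypothetical registry; dict order doubles as priority order (Python 3.7+).
EXAMPLE_MODULES = {"missing": MissingTool, "noop": LogOnly}


def select_module(env_var, modules, env=os.environ):
    """Return the forced module if env_var is set, else the first available one."""
    forced = env.get(env_var)
    if forced:
        cls = modules.get(forced)
        if cls is None:
            sys.exit(f"{env_var}={forced}: unknown module")
        module = cls()
        if not module.is_available():
            # A forced but unavailable module is a startup error, not a fallback.
            sys.exit(f"{env_var}={forced}: module is not available")
        return module
    for cls in modules.values():
        module = cls()
        if module.is_available():
            return module
    sys.exit(f"no module available for {env_var}")
```

With no variable set, `select_module("OUTPUT", EXAMPLE_MODULES, env={})` skips the unavailable entry and returns the `LogOnly` instance.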
Output:
- xdotool: Uses `xdotool` to type text. (Requires `xdotool` installed.)
- noop: A fallback that only logs the transcription if no typing tool is found.

Transcription:
- whisper: Local transcription using faster-whisper. (Requires `faster-whisper` installed; no API key needed.)
- mistral: Uses Mistral AI's API. (Requires `MISTRAL_API_KEY` in `.env`.)
- noop: A fallback that returns a placeholder string if no transcription service is configured.

Notification:
- notify-send: Uses `notify-send` to show system notifications. (Requires `libnotify-bin` or equivalent.)
- noop: A fallback that only logs notifications if no notification tool is found.
You can add new methods by inheriting from the Output, Transcription, or Notification base class and implementing its interface: `is_available()` plus `type(text)`, `transcribe(file_path)`, or `notify(title, message)`, respectively. Register the new class in the corresponding `*_MODULES` dict to make it selectable via the environment variable.
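As an illustration, a custom output module might look like the sketch below. To stay self-contained it defines a stand-in `Output` base class rather than importing the real one from `src/output.py`. The `FileOutput` class, its default path, and the registry name `OUTPUT_MODULES` (one guess at what `*_MODULES` expands to) are all assumptions.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class Output(ABC):
    """Stand-in for the real base class; the actual one may differ."""

    @abstractmethod
    def is_available(self):
        """Return True if this output method can be used."""

    @abstractmethod
    def type(self, text):
        """Deliver the transcribed text."""


class FileOutput(Output):
    """Hypothetical module that appends transcripts to a text file."""

    def __init__(self, path=None):
        default = os.path.join(tempfile.gettempdir(), "tyrant-transcript.txt")
        self.path = path or default

    def is_available(self):
        # Usable when the target directory is writable.
        return os.access(os.path.dirname(self.path) or ".", os.W_OK)

    def type(self, text):
        with open(self.path, "a", encoding="utf-8") as handle:
            handle.write(text + "\n")


# In the real project this entry would be added to the existing registry.
OUTPUT_MODULES = {"file": FileOutput}
```

Under these assumptions, setting `OUTPUT=file` would then select the new module.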
- If using `xdotool`, make sure the window where you want the text to appear is focused before the transcription finishes.
This project is licensed under the MIT License - see the LICENSE file for details.