Note: This application is for macOS only.
A macOS application that converts speech to text using OpenAI's Whisper model running locally. Press the Globe/Function key to start recording, press it again to stop recording, transcribe, and paste text at your current cursor position.
- System tray (menu bar) application that runs in the background
- Global hotkey (Globe/Function key) to trigger dictation
- Transcribes speech to text using OpenAI's Whisper model locally
- Automatically pastes transcribed text at your cursor position
- NEW: AI-powered text enhancement - when you have text selected, your voice becomes an instruction to modify that text using AWS Bedrock (Claude)
- Visual feedback with menu bar icon status
- Install Python dependencies:
pip install -r requirements.txt
- Install PortAudio (required for PyAudio):
brew install portaudio
- Set up AWS Bedrock (optional - for AI text enhancement):
cp .env.example .env
# Edit .env and add your AWS credentials
- Run the application in development mode:
python src/main.py
To run the script in the background:
- Install all dependencies:
pip install -r requirements.txt
- Run the script in the background:
nohup ./run.sh >/dev/null 2>&1 & disown
- The script will continue running in the background. You can then use the app as described in the Usage section.
- Launch the Whisper Dictation app. You'll see a microphone icon (🎙️) in your menu bar.
- Press the Globe key or Function key on your keyboard to start recording.
- Speak clearly into your microphone.
- Press the Globe/Function key again to stop recording.
- The app will transcribe your speech and automatically paste it at your current cursor position.
When you have AWS Bedrock configured, you can use voice commands to modify selected text:
- Select text in any application (highlight the text you want to modify)
- Press the Globe/Function key to start recording
- Give a voice instruction like:
- "Make this more professional"
- "Translate this to Spanish"
- "Summarize this paragraph"
- "Fix the grammar"
- "Make this sound friendlier"
- Press the Globe/Function key again to stop recording
- The selected text will be automatically replaced with the AI-enhanced version
Note: If no text is selected, the app behaves normally and just inserts the transcribed text.
You can also interact with the app through the menu bar icon:
- Click "Start/Stop Listening" to toggle recording
- Access Settings for configuration options
- Click "Quit" to exit the application
The app requires the following permissions:
- Microphone access (to record your speech).
Go to System Preferences → Security & Privacy → Privacy → Microphone and add your Terminal or the app. - Accessibility access (to simulate keyboard presses for pasting).
Go to System Preferences → Security & Privacy → Privacy → Accessibility and add your Terminal or the app.
- macOS 10.14 or later
- Microphone
- AWS Account with Bedrock access (optional - for AI text enhancement feature)
If something goes wrong or you need to stop the background process, you can kill it by running one of the following commands in your Terminal:
- List the running process(es):
ps aux | grep 'src/main.py'
- Kill the process by its PID:
kill -9 <PID>