Koe (声, Japanese for "voice") is a free, open-source alternative to subscription-based voice dictation tools. Press a hotkey (Desktop) or a button (Mobile), speak naturally, and get AI-polished text typed at your cursor or copied to your clipboard.
Unlike cloud-based solutions that charge monthly fees, Koe uses your own Groq API key and stays free for up to 8 hours of transcription a day on Groq's free tier.
| Feature | WhisperFlow ($8+/mo) | Built-in OS Dictation | Koe (Free) |
|---|---|---|---|
| Cost | Subscription | Free | Free (BYOK) |
| Accuracy | High | Poor | High (Whisper) |
| AI Enhancement | Yes | No | Yes |
| Privacy | Cloud audio | Local | Local VAD + BYOK |
| Global Hotkey | Yes | Limited | Yes |
| Auto-Paste | Yes | No | Yes |
- Cross-Platform — Native performance on Windows (Desktop) and iOS/Android (Mobile)
- Global Hotkey (Desktop) — Press `Ctrl + Shift + Space` anywhere to start or stop dictation
- Clipboard-First (Mobile) — High-fidelity audio capture with instant polished results copied to your clipboard
- Pause Naturally — Koe keeps listening through short pauses instead of treating every breath like the end of a recording
- Rolling Segments — Long recordings are processed in the background as ordered chunks, so performance stays fast even on longer sessions
- Instant Transcription — Groq Whisper handles speech-to-text at high speed
- AI Text Enhancement — Each segment is refined before it is committed, so only polished text is returned
- Auto-Type (Desktop) — Refined text is typed progressively into the focused text field while you are still talking
- Minimalist UI — A premium, high-contrast interface designed for focus and speed
- Transcription History — One-click copy and retry for saved transcripts
- Usage Dashboard — Track daily audio seconds, request pressure, and queue activity
- Download the latest `.exe` from Releases.
- Install and launch. Koe will live in your system tray.
- Clone the repo and navigate to `apps/mobile`.
- Install Expo Go on your device.
- Run `pnpm dev:mobile` and scan the QR code.

Note: Native builds (.ipa/.apk) can be generated via EAS.
```bash
# Clone the repository
git clone https://github.com/JStaRFilms/Koe.git
cd Koe
# Install all dependencies (Monorepo)
pnpm install
# Run Desktop
pnpm dev
# Build for production
pnpm build
# Run Mobile
pnpm dev:mobile
```

If `pnpm dev` fails with "Electron failed to install correctly", pnpm likely skipped Electron's install script during dependency setup. This repo now allowlists the required build/install scripts for pnpm 10+, and an existing checkout can be repaired with:

```bash
pnpm rebuild electron esbuild protobufjs electron-winstaller
```

- Real release artifacts should be built on GitHub Actions, not locally
- Push a matching version tag such as `v1.1.3` after updating `package.json`
- The release workflow will build Windows and macOS and attach artifacts to that GitHub Release
- See docs/release-process.md
- The marketing website is the Next.js app in `koe-website/`
- In Vercel Project Settings -> Build and Deployment -> Root Directory, set the root to `koe-website`
- Leave the framework as Next.js for that project
- A root-level `vercel.json` is also included as a fallback so root builds target `koe-website`
- Windows 10/11 (for Desktop) or iOS/Android (for Mobile)
- Groq API Key (free tier available)
- Microphone access
- Launch Koe — it minimizes to your system tray
- Configure — Right-click the tray icon → Settings → Enter your Groq API key
- Dictate — Click any text field and press `Ctrl + Shift + Space`
- Speak — The pill UI appears. Talk naturally, including pauses
- Done — Press the hotkey again when you're finished. Koe finalizes the session, copies the full refined transcript, and keeps it in history
| Action | Shortcut |
|---|---|
| Start / Stop Recording | Ctrl + Shift + Space |
| Retry Last Failed / Latest Transcript | Ctrl + Shift + , |
| Open Settings | Tray menu |
- Koe records one continuous session until you stop it
- Internally, it breaks longer recordings into ordered segments
- Segments are transcribed and refined in the background
- Refined text is typed in order as it becomes ready
- When the session ends, Koe keeps one full final transcript in clipboard and history
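
The ordering guarantee described above can be sketched as a small buffer that holds finished segments until every earlier segment has been released. This is an illustrative sketch only, not Koe's actual implementation; `createOrderedCommitter`, `complete`, and `pending` are hypothetical names.

```javascript
// Hypothetical sketch: release refined segments strictly in recording order,
// even when background transcription/refinement finishes out of order.
function createOrderedCommitter(onCommit) {
  const ready = new Map(); // segment index -> refined text waiting to be released
  let nextIndex = 0;       // index of the next segment allowed to be typed

  return {
    // Call when a segment has finished transcription and refinement.
    complete(index, text) {
      ready.set(index, text);
      // Flush every consecutive segment that is now available.
      while (ready.has(nextIndex)) {
        onCommit(ready.get(nextIndex));
        ready.delete(nextIndex);
        nextIndex += 1;
      }
    },
    // Number of finished segments still blocked behind an earlier one.
    pending() {
      return ready.size;
    },
  };
}

// Usage: segment 1 finishes first but is held until segment 0 arrives.
const typed = [];
const committer = createOrderedCommitter((text) => typed.push(text));
committer.complete(1, "world.");
committer.complete(0, "Hello,");
committer.complete(2, "Done.");
console.log(typed.join(" ")); // "Hello, world. Done."
```

Keeping the buffer keyed by segment index means slow segments delay only the text behind them, while everything already in order is typed immediately.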
The floating pill is designed to stay out of the way while still telling you what matters:
- Idle — Waiting for the next dictation
- Listening — Live voice levels and active recording state
- Warning — Mic fallback or chunk failure without immediately killing the session
- Processing — Finalizing remaining work after you stop
- Complete — Brief success state before hiding
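
One way to reason about these states is as a small transition table. The sketch below is an assumption about how the states might connect; the event names (`start`, `warn`, `recover`, `stop`, `done`, `hide`) are hypothetical and do not come from the Koe codebase.

```javascript
// Hypothetical transition table for the pill UI states listed above.
const PILL_TRANSITIONS = {
  idle: { start: "listening" },
  listening: { warn: "warning", stop: "processing" },
  warning: { recover: "listening", stop: "processing" }, // a warning does not end the session
  processing: { done: "complete" },
  complete: { hide: "idle" },
};

// Returns the next state, or the current state if the event does not apply.
function nextPillState(state, event) {
  const table = Object.hasOwn(PILL_TRANSITIONS, state) ? PILL_TRANSITIONS[state] : {};
  return Object.hasOwn(table, event) ? table[event] : state;
}

// Usage: walk through one session that hits a mic warning and recovers.
const pillEvents = ["start", "warn", "recover", "stop", "done", "hide"];
let pillState = "idle";
const visited = [pillState];
for (const event of pillEvents) {
  pillState = nextPillState(pillState, event);
  visited.push(pillState);
}
console.log(visited.join(" -> "));
// idle -> listening -> warning -> listening -> processing -> complete -> idle
```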
Configure via the settings window (right-click the tray icon):
| Setting | Description | Default |
|---|---|---|
| Groq API Key | Your API key from console.groq.com | — |
| Language | Transcription language (`auto` for detection) | `auto` |
| Prompt Style | How Koe refines the transcript | Clean |
| Auto-Paste | Automatically type into the focused window | enabled |
| Theme | Dark / Light mode | dark |
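
As an illustration of how the defaults in this table could be applied, the sketch below merges saved values over a defaults object and ignores anything with the wrong type. The key names (`apiKey`, `language`, `promptStyle`, `autoPaste`, `theme`) and the `resolveSettings` helper are assumptions made for this example; they are not taken from Koe's settings schema.

```javascript
// Hypothetical defaults mirroring the table above.
const DEFAULT_SETTINGS = Object.freeze({
  apiKey: "",           // no default; the user must supply a Groq API key
  language: "auto",
  promptStyle: "Clean",
  autoPaste: true,
  theme: "dark",
});

// Merge saved values over the defaults, keeping only known keys
// whose type matches the type of the default value.
function resolveSettings(saved = {}) {
  const resolved = { ...DEFAULT_SETTINGS };
  for (const key of Object.keys(DEFAULT_SETTINGS)) {
    if (typeof saved[key] === typeof DEFAULT_SETTINGS[key]) {
      resolved[key] = saved[key];
    }
  }
  return resolved;
}

// Usage: a valid override, an invalid type, and an unknown key.
const settings = resolveSettings({ theme: "light", autoPaste: "yes", unknown: 1 });
console.log(settings.theme, settings.autoPaste, settings.language); // light true auto
```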
Koe uses a shared core architecture to ensure consistency across Desktop and Mobile. Business logic lives in @koe/core, while platform-specific drivers handle audio and output.
See the Detailed Architecture Guide for more info.
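
The split between shared logic and platform drivers can be pictured roughly as follows. This is a minimal sketch under assumed names (`createSession`, `createDesktopDriver`, `createMobileDriver`, `onSegment`, `onFinish`); the real `@koe/core` API may look quite different.

```javascript
// Hypothetical core: platform-agnostic session logic that only talks to a driver.
function createSession(outputDriver) {
  const segments = [];
  return {
    // Called for each refined segment, in order.
    commitSegment(text) {
      segments.push(text);
      outputDriver.onSegment(text);
    },
    // Called when the user stops dictation.
    finish() {
      const transcript = segments.join(" ");
      outputDriver.onFinish(transcript);
      return transcript;
    },
  };
}

// Desktop-style driver: types each segment as it arrives, then copies the full text.
function createDesktopDriver(log) {
  return {
    onSegment: (text) => log.push(`type:${text}`),
    onFinish: (full) => log.push(`clipboard:${full}`),
  };
}

// Mobile-style driver: clipboard-first, so nothing happens until the session ends.
function createMobileDriver(log) {
  return {
    onSegment: () => {},
    onFinish: (full) => log.push(`clipboard:${full}`),
  };
}

// Usage: the same core logic drives both platforms.
const desktopLog = [];
const mobileLog = [];
for (const driver of [createDesktopDriver(desktopLog), createMobileDriver(mobileLog)]) {
  const session = createSession(driver);
  session.commitSegment("Hello,");
  session.commitSegment("world.");
  session.finish();
}
console.log(desktopLog); // [ 'type:Hello,', 'type:world.', 'clipboard:Hello, world.' ]
console.log(mobileLog);  // [ 'clipboard:Hello, world.' ]
```

In this sketch the logs stand in for real side effects (simulated keystrokes, clipboard writes), which keeps the core testable without any platform APIs.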
| Feature | Desktop (Windows) | Mobile (iOS/Android) |
|---|---|---|
| Trigger | Global Hotkey | Capture Button |
| Output | Auto-Paste / Type | Clipboard-First |
| Storage | `electron-store` | `SecureStore` |
| Capture Logic | Local VAD + ordered segments | Metering-driven chunk rotation + ordered segments |
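
To make the idea of pause-tolerant capture concrete, the sketch below groups per-frame speech decisions into segments and only closes a segment once silence has lasted longer than a threshold. It is a simplified illustration, not Koe's capture code; `segmentSpeech` and `maxPauseFrames` are hypothetical, and the input is assumed to be one boolean per audio frame from a voice activity detector.

```javascript
// Hypothetical sketch: turn per-frame VAD decisions into [start, end) frame ranges.
// A segment stays open through pauses of up to `maxPauseFrames` silent frames.
function segmentSpeech(frames, maxPauseFrames) {
  const segments = [];
  let start = -1;      // frame where the current segment began, -1 when idle
  let lastSpeech = -1; // most recent frame that contained speech

  frames.forEach((isSpeech, i) => {
    if (isSpeech) {
      if (start === -1) start = i;
      lastSpeech = i;
    } else if (start !== -1 && i - lastSpeech > maxPauseFrames) {
      // Silence has outlasted the tolerated pause: close the segment.
      segments.push([start, lastSpeech + 1]);
      start = -1;
    }
  });

  // Close a segment that is still open when the recording stops.
  if (start !== -1) segments.push([start, lastSpeech + 1]);
  return segments;
}

// Usage: a one-frame pause is bridged, a three-frame pause splits the audio.
const T = true;
const F = false;
const ranges = segmentSpeech([T, T, F, T, F, F, F, T], 2);
console.log(JSON.stringify(ranges)); // [[0,4],[7,8]]
```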
- Desktop speech detection runs locally using ONNX WebAssembly
- Mobile recording control stays on-device until a chunk is ready to transcribe
- Retry audio is stored only for failed or unresolved segments
- Your API key is stored locally on each platform and only used for transcription/refinement requests
| Layer | Technology |
|---|---|
| Framework | Electron + Vite |
| Frontend | Vanilla JavaScript, Custom CSS |
| Audio Capture | Web Audio API |
| Voice Detection | @ricky0123/vad-web (Silero VAD) |
| Transcription | Groq Whisper API (whisper-large-v3-turbo) |
| Text Enhancement | Groq chat refinement pipeline |
| Storage | electron-store + temp retry files |
| Packaging | electron-builder |
Koe is designed to stay inside Groq's free-tier limits:
| Metric | Limit | Approximate Usage |
|---|---|---|
| Requests per minute | 20 | ~6 transcribed segments/minute with paced refinement |
| Requests per day | 2,000 | ~8 hours of normal dictation |
| Audio per day | 28,800 sec | 8 hours |
The built-in scheduler tracks request pressure and keeps the app responsive while staying inside the cap.
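
A rough sketch of how such pacing could work is shown below, using the limits from the table above. The tracker design, the names `createUsageTracker`, `msUntilAllowed`, and `record`, and the omission of a daily reset are all assumptions made for brevity; this is not Koe's actual scheduler.

```javascript
// Limits taken from the free-tier table above.
const LIMITS = { requestsPerMinute: 20, requestsPerDay: 2000, audioSecondsPerDay: 28800 };

// Hypothetical usage tracker. Timestamps are passed in (milliseconds) so the
// logic stays deterministic; calls are assumed to arrive in time order.
function createUsageTracker(limits = LIMITS) {
  let recent = [];         // timestamps of requests inside the last 60 s
  let dayRequests = 0;     // daily reset is omitted in this sketch
  let dayAudioSeconds = 0;

  return {
    // 0 means "send now", a positive number is the wait in ms,
    // Infinity means a daily budget is exhausted.
    msUntilAllowed(nowMs, audioSeconds = 0) {
      recent = recent.filter((t) => nowMs - t < 60000);
      if (dayRequests >= limits.requestsPerDay) return Infinity;
      if (dayAudioSeconds + audioSeconds > limits.audioSecondsPerDay) return Infinity;
      if (recent.length < limits.requestsPerMinute) return 0;
      return recent[0] + 60000 - nowMs; // wait for the oldest request to leave the window
    },
    // Record a request that was actually sent.
    record(nowMs, audioSeconds = 0) {
      recent.push(nowMs);
      dayRequests += 1;
      dayAudioSeconds += audioSeconds;
    },
  };
}

// Usage: 20 requests in the first 20 seconds fill the per-minute window.
const tracker = createUsageTracker();
for (let i = 0; i < 20; i += 1) tracker.record(i * 1000, 10);
console.log(tracker.msUntilAllowed(30000)); // 30000
console.log(tracker.msUntilAllowed(60000)); // 0
```

The daily audio cap is also where the 8-hour figure comes from: 28,800 seconds divided by 3,600 seconds per hour is 8 hours.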
- Global hotkey toggle (Desktop)
- Local VAD speech detection
- Groq Whisper transcription
- AI transcript refinement
- Auto-paste to focused window (Desktop)
- Transcription history & Usage dashboard
- Mobile App (iOS/Android V1)
- Shared Core Extraction
- Custom AI prompts
- Keyboard shortcut customization
- Export history as `.txt`/`.md`
- Native macOS support (Electron)
- Android IME (Custom Keyboard) implementation
- Snippet library with voice shortcuts
- App-specific tone profiles
- Cloud sync across devices
- Team collaboration features
See Feature Requests for the full backlog.
Contributions are welcome. Please see the repo docs and existing code patterns before opening a PR.
```bash
# Fork and clone
git clone https://github.com/your-username/Koe.git
cd Koe
# Install dependencies
pnpm install
# Start development
pnpm dev
```

Koe is transitioning to a monorepo to support multiple platforms:
- Root: Legacy Electron Desktop app and shared workspace configuration
- `apps/mobile`: Expo-based mobile client (iOS/Android)
- `packages/koe-core`: Shared business logic, types, and API services
| Target | Command | Description |
|---|---|---|
| Desktop | `pnpm dev` | Start the Electron app in dev mode |
| Mobile | `pnpm dev:mobile` | Start the Expo development server |
| Core | `pnpm build:core` | Build the shared logic package |
| All | `pnpm type-check` | Run type-checking across all packages |
```
Koe/
├── apps/                 # Application projects
│   └── mobile/           # Expo mobile app
├── packages/             # Shared logic
│   └── koe-core/         # Core services (Whisper, Sessions)
├── src/                  # Legacy Desktop source
│   ├── main/             # Electron main process
│   └── renderer/         # UI code
├── docs/                 # Documentation & Tasks
├── pnpm-workspace.yaml   # Workspace config
└── package.json          # Root manifest & scripts
```
- Ensure microphone permissions are granted in Windows Settings
- Check that your default recording device is selected
- If another app is holding the mic, Koe will try another available input and warn you in the pill UI
- Wait for the per-minute window to clear
- Check the usage dashboard for queue pressure
- Very long continuous dictation can still pile up requests on the free tier
- Some applications block simulated keystrokes
- Disable auto-paste in Settings and use `Ctrl + V` manually
- Run Koe as administrator if the issue persists
- Ensure you're on Windows 10/11 64-bit
- Check that Visual C++ Redistributables are installed
- Check Windows Event Viewer for crash details
- Groq — For the fast Whisper and chat APIs
- Silero — For the VAD model
- @ricky0123 — For the `vad-web` library
- WhisperFlow — For helping prove the category exists
Koe is licensed under the ISC License. See the LICENSE file for details.
Built with ❤️ by J StaR Films Studios
Star us on GitHub if you find Koe useful.