A cross-platform Tauri desktop utility that extracts text from any region of your screen — including text inside images, videos, and other non-copyable interfaces — using local Tesseract OCR.
Press the global hotkey, drag a rectangle, and the recognised text lands on your clipboard.
Status: Tauri 2.x · macOS / Linux / Windows · 100% local OCR (no cloud calls)
- How it works
- Quick start (users) | Download
- Quick start (developers)
- Building distributable installers
- Platform notes
- Project layout
- Architecture notes
- Roadmap
- App runs in the system tray (no main window).
- Global hotkey `Ctrl+Shift+T` (Windows/Linux) / `Cmd+Shift+T` (macOS) triggers a screen capture.
- A transparent fullscreen overlay opens for selection.
- The selected region is cropped, preprocessed (upscaled + grayscaled), and run through Tesseract.
- Result is written to the clipboard with a notification toast.
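The crop and preprocessing step above can be pictured with the sketch below. It is not the app's code (that step lives in Rust, in `src-tauri/src/ocr.rs`); the function name `preprocess`, the 2x factor, nearest-neighbour sampling, and the Rec. 601 luminance weights are all assumptions chosen for illustration.

```typescript
// Illustrative sketch only. The real preprocessing runs in Rust (ocr.rs).
// Assumed: 2x upscale, nearest-neighbour sampling, Rec. 601 grayscale weights.

interface Region {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface GrayImage {
  width: number;
  height: number;
  data: Uint8Array; // one luminance byte per pixel
}

/** Crop an RGBA buffer to `region`, convert it to grayscale, and upscale it by `factor`. */
function preprocess(
  rgba: Uint8Array,
  srcWidth: number,
  region: Region,
  factor: number = 2,
): GrayImage {
  const outWidth = region.width * factor;
  const outHeight = region.height * factor;
  const data = new Uint8Array(outWidth * outHeight);

  for (let oy = 0; oy < outHeight; oy++) {
    // Map each output row back to its source row inside the cropped region.
    const sy = region.y + Math.floor(oy / factor);
    for (let ox = 0; ox < outWidth; ox++) {
      const sx = region.x + Math.floor(ox / factor);
      const i = (sy * srcWidth + sx) * 4; // 4 bytes per RGBA pixel
      data[oy * outWidth + ox] = Math.round(
        0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2],
      );
    }
  }
  return { width: outWidth, height: outHeight, data };
}
```

Upscaling before OCR gives Tesseract larger glyphs to work with, which tends to help on small UI text; the exact factor used by the app may differ.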
Just want to use the app? Grab the installer for your OS, install, launch — done.
Latest installers: Releases page · current v0.1.2.
| OS | Installer | Direct download |
|---|---|---|
| macOS (Apple silicon: M1/M2/M3) | Text Extractor_0.1.0_aarch64.dmg | ⬇ download |
| macOS (Intel) | Text Extractor_0.1.0_x64.dmg | coming in next release (CI runner backlog) |
| Windows (installer) | Text Extractor_0.1.0_x64-setup.exe | ⬇ download |
| Windows (MSI) | Text Extractor_0.1.0_x64_en-US.msi | ⬇ download |
| Linux (.deb) | Text Extractor_0.1.0_amd64.deb | ⬇ download |
| Linux (.AppImage) | Text Extractor_0.1.0_amd64.AppImage | ⬇ download |
Direct-download links hit GitHub's `/releases/latest/download/<filename>` redirect, which auto-tracks the newest release. GitHub stores asset filenames with `.` in place of spaces, so the URL has `Text.Extractor_...` while the file you save is named `Text Extractor_...`.

If a link 404s after a version bump, the version segment in the filename changed (e.g. `0.1.0` → `0.2.0`). Open the Releases page and grab the file by hand.
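The filename-to-URL mapping described above can be sketched as a small helper. The function name `latestDownloadUrl` is hypothetical, and the repository slug is taken from the Releases URL quoted later in this README; treat both as assumptions rather than part of the project.

```typescript
// Assumption: repository slug as it appears in this README's Releases link.
const REPO = "is-harshul/text-extractor";

/** Build the "latest release" download URL for an installer filename. */
function latestDownloadUrl(filename: string, repo: string = REPO): string {
  // GitHub stores spaces in asset names as dots.
  const assetName = filename.replace(/ /g, ".");
  return `https://github.com/${repo}/releases/latest/download/${encodeURIComponent(assetName)}`;
}
```

For example, `latestDownloadUrl("Text Extractor_0.1.0_aarch64.dmg")` yields a URL ending in `/releases/latest/download/Text.Extractor_0.1.0_aarch64.dmg`.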
- Download `Text Extractor_<version>_universal.dmg` (or the arch-specific build) from your distributor.
- Open the DMG, drag Text Extractor to Applications.
- First launch: right-click the app → Open → Open. (The app is unsigned; this bypasses Gatekeeper one time.)
  Alternative one-liner: `xattr -dr com.apple.quarantine "/Applications/Text Extractor.app"`
- Grant Screen Recording permission when macOS asks (System Settings → Privacy & Security → Screen Recording). Without this, captures are black.
- Press `Cmd+Shift+T` anywhere → drag a rectangle → text is on your clipboard.
- Install the `.deb` (Debian/Ubuntu): `sudo dpkg -i text-extractor_<version>_amd64.deb`
  Or run the `.AppImage`: `chmod +x Text\ Extractor_*.AppImage && ./Text\ Extractor_*.AppImage`
- Press `Ctrl+Shift+T` → drag → paste.
- Run the `.msi` or `.exe` installer.
- Press `Ctrl+Shift+T` → drag → paste.
- Rust (1.77+)
- Node.js (18+)
- Platform-specific Tauri prerequisites: https://v2.tauri.app/start/prerequisites/
The Rust tesseract crate links to libtesseract.
| Platform | Install |
|---|---|
| macOS | brew install tesseract leptonica |
| Debian/Ubuntu | sudo apt install libtesseract-dev libleptonica-dev clang |
| Fedora | sudo dnf install tesseract-devel leptonica-devel clang |
| Arch | sudo pacman -S tesseract leptonica clang |
| Windows | Install UB Mannheim Tesseract, add to PATH. May need TESSDATA_PREFIX pointing at the tessdata directory. |
When shipping to end users, vendor the Tesseract binaries inside your installer rather than asking users to install them.
npm install
If src-tauri/icons/ is empty, generate icons from a 1024×1024 PNG:
npm run tauri icon path/to/source.png
npm run tauri dev
App launches into the system tray. Press the hotkey to trigger capture.
A helper script wraps tauri build and collects every installer into release/ so you can zip and send to friends.
npm run dist # macOS: universal DMG (arm64 + Intel) — slowest, widest reach
npm run dist:host # current arch only — fastest, recommended for testing
npm run dist:arm64 # Apple silicon only
npm run dist:x86_64 # Intel only
On Linux the script auto-builds .deb + .AppImage. On Windows it auto-builds .msi + .exe (NSIS).
release/
└── Text Extractor_0.1.0_universal.dmg # macOS
Text Extractor_0.1.0_amd64.deb # Linux
text-extractor_0.1.0_amd64.AppImage # Linux
Text Extractor_0.1.0_x64_en-US.msi # Windows
Text Extractor_0.1.0_x64-setup.exe # Windows
Send the file → friend installs → done.
- Verifies `node`, `npm`, and `cargo` are installed.
- Runs `npm install` if `node_modules/` is missing.
- On macOS, runs `rustup target add` for the required arches automatically.
- Calls `npx tauri build` with the right `--bundles` for the host OS.
- Copies every produced installer into `release/`.
npm run tauri build
Bundles land in src-tauri/target/release/bundle/.
Two paths — pick whichever fits.
Build on your machine, attach DMG/installer to a GitHub Release in one shot:
# bumps no version — uses current package.json version as tag
npm run dist:publish
What it does:
- Runs the normal `npm run dist` build (universal macOS DMG by default).
- Reads `version` from package.json, tags `v<version>`, pushes tag.
- `gh release create` (or `gh release upload --clobber` if tag already exists) attaches everything in `release/`.
Override target arch:
PUBLISH=1 bash scripts/build-installer.sh host # current arch only
PUBLISH=1 bash scripts/build-installer.sh arm64
Requires: `gh auth login` once.
.github/workflows/release.yml builds macOS arm64 + macOS Intel + Linux (.deb/.AppImage) + Windows (.msi/.exe) in parallel, then publishes a single GitHub Release with all artifacts attached.
Trigger by pushing a tag:
# bump version in package.json + src-tauri/tauri.conf.json + src-tauri/Cargo.toml first
git commit -am "release v0.2.0"
git tag v0.2.0
git push origin main v0.2.0
Or manually from the Actions tab → Release → Run workflow → enter tag.
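The tagging steps above expect the version to be bumped in three files first. A pre-tag check along these lines could catch a missed file. This is a sketch, not part of the repository: `releaseTag` and `readVersions` are hypothetical names, the Cargo.toml parsing is a simple regex rather than a real TOML parser, and it assumes `tauri.conf.json` carries a top-level `version` field.

```typescript
interface VersionSources {
  packageJson: string; // contents of package.json
  tauriConf: string;   // contents of src-tauri/tauri.conf.json
  cargoToml: string;   // contents of src-tauri/Cargo.toml
}

/** Extract the version string from each file (undefined when absent). */
function readVersions(src: VersionSources) {
  const pkg = JSON.parse(src.packageJson).version as string | undefined;
  const tauri = JSON.parse(src.tauriConf).version as string | undefined;
  // Simplification: take the first line that starts with `version = "..."`.
  const cargo = /^version\s*=\s*"([^"]+)"/m.exec(src.cargoToml)?.[1];
  return { pkg, tauri, cargo };
}

/** Return the tag `v<version>` if all three files agree, otherwise throw. */
function releaseTag(src: VersionSources): string {
  const { pkg, tauri, cargo } = readVersions(src);
  if (!pkg || pkg !== tauri || pkg !== cargo) {
    throw new Error(
      `version mismatch: package.json=${pkg} tauri.conf.json=${tauri} Cargo.toml=${cargo}`,
    );
  }
  return `v${pkg}`;
}
```

In a Node script the three strings would come from reading the files off disk; the function is kept pure here so it can be checked without touching the filesystem.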
Friends download from https://github.com/is-harshul/text-extractor/releases/latest.
First run takes ~15–25 min (cold Rust + Tesseract build across the 4 build targets). Subsequent runs use the cache and finish in ~5–10 min.
Windows job uses `vcpkg` for Tesseract. If the build fails on Windows, the most common fix is regenerating the `vcpkg` cache; see tauri-apps/tauri#windows-tesseract docs.
Without signing, friends will hit OS warnings ("unidentified developer" / SmartScreen). To skip the warnings:
- macOS: Apple Developer ID ($99/yr). Configure `bundle.macOS.signingIdentity` in src-tauri/tauri.conf.json and set `APPLE_ID` / `APPLE_PASSWORD` for notarization.
- Windows: EV or OV code-signing certificate. Configure `bundle.windows.certificateThumbprint`.
- Linux: no signing required, but consider hosting the `.deb` in a PPA / providing a `.repo`.
See https://v2.tauri.app/distribute/ for full guides.
- First launch requires Screen Recording permission (System Settings → Privacy & Security → Screen Recording). Without it `xcap` returns a black image.
- Unsigned builds trigger Gatekeeper. Right-click → Open works, or strip quarantine: `xattr -dr com.apple.quarantine "/Applications/Text Extractor.app"`.
- X11: works out of the box.
- Wayland: `xcap` uses the `xdg-desktop-portal` protocol, which prompts the user to grant screen access every capture session. This is a Wayland design decision with no workaround. Recommend X11 to users for a smoother experience.
- Smoothest of the three. Make sure your installer bundles `tesseract.dll`, `leptonica.dll`, and the `tessdata` folder (English language data ≈ 15 MB).
text-extractor/
├── src/ # React overlay UI
│ ├── App.tsx # Selection rectangle, masks, hint UI
│ ├── main.tsx # React entry
│ └── styles.css # Overlay styles
├── src-tauri/
│ ├── src/
│ │ ├── main.rs # Desktop entry point
│ │ ├── lib.rs # Tauri setup, tray, hotkey, commands
│ │ ├── capture.rs # xcap-based screen capture
│ │ └── ocr.rs # Tesseract wrapper + preprocessing
│ ├── capabilities/
│ │ └── default.json # Tauri 2 permissions
│ ├── Cargo.toml
│ └── tauri.conf.json
├── scripts/
│ └── build-installer.sh # Distributable-installer builder
├── index.html
├── package.json
├── tsconfig.json
└── vite.config.ts
- Capture is taken before the overlay shows so the user selects against a frozen, pre-captured image. Capturing after selection causes flicker and includes the overlay in the screenshot.
- OCR runs on a blocking task (`tokio::task::spawn_blocking`) so it doesn't stall the Tauri event loop.
- The overlay window is hidden, not closed, between captures. Recreating windows is slow on macOS and would add noticeable latency.
- Scale factors are computed in the renderer (`physicalWidth / window.innerWidth`) to handle HiDPI/Retina displays. On mixed-DPI multi-monitor setups this can drift; the planned v2 fix is to capture the cursor's monitor specifically.
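The scale-factor arithmetic in the last note can be sketched as follows. The names `Rect` and `toPhysicalRect` are hypothetical and the rounding strategy is an assumption; the overlay code in `src/App.tsx` may handle this differently.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

/**
 * Convert a selection measured in CSS pixels into physical pixels of the
 * captured image, using the ratio physicalWidth / cssWidth as the scale.
 */
function toPhysicalRect(selection: Rect, physicalWidth: number, cssWidth: number): Rect {
  const scale = physicalWidth / cssWidth; // 2 on a typical Retina display
  return {
    x: Math.round(selection.x * scale),
    y: Math.round(selection.y * scale),
    width: Math.round(selection.width * scale),
    height: Math.round(selection.height * scale),
  };
}
```

In the renderer this would be called with the screenshot's pixel width and `window.innerWidth`. Because a single ratio is applied to the whole overlay, a selection on a monitor with a different DPI than the one used for the ratio would be mapped incorrectly, which is the drift mentioned above.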
- Multi-monitor support (capture the monitor under the cursor)
- Language picker (Tesseract supports 100+ languages)
- Selection refinement before commit (drag handles to adjust the rectangle)
- Optional cloud OCR (Google Vision) for higher accuracy
- Capture history (last N extractions, accessible from tray)
- Settings window (custom hotkey, OCR language, preprocessing toggles)
TBD — add a LICENSE file before public distribution.