Add ai-translate extension by xwzhangSZU · Pull Request #27675 · raycast/extensions

xwzhangSZU · 2026-05-05T23:35:43Z

Description

Adds AI Translate, a Raycast extension focused on fast screenshot OCR translation.

Version 1.0.0 is intentionally scoped: take a screenshot quickly, extract visible text reliably, and translate it with as little friction as possible. It is not trying to become a full translation suite. The main use case is the common moment where users can see text on screen but cannot select it.

Why not just use Raycast's built-in translation? Raycast's built-in flow is useful when text can be selected or passed as plain text, but it does not cover OCR translation. In many real workflows, text is locked inside app UI, images, PDFs, slides, videos, remote desktops, protected documents, or web pages with broken selection. In those cases, screenshot capture is the most reliable input surface.

Raycast's built-in AI features also do not provide this level of model routing for translation. AI Translate lets users choose the provider, model ID, base URL, and API key themselves, so translation can run through the exact Token Plan, Coding Plan, or provider-specific subscription they already pay for.

AI Translate layers OCR and AI translation on top of that screenshot-first workflow. It prioritizes cost-effective, high-quality providers such as DeepSeek, Xiaomi MiMo, MiniMax, and Kimi through Anthropic-compatible /v1/messages endpoints, while still supporting OpenAI / ChatGPT and Gemini. Kimi now defaults to https://api.kimi.com/coding/ as its Anthropic-compatible coding base URL.
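As a rough illustration of what routing through an Anthropic-compatible /v1/messages endpoint involves, the sketch below builds (but does not send) such a request. The `ProviderConfig` shape, the helper names, the `max_tokens` value, and the assumption that every provider accepts the `x-api-key` header are all hypothetical and not taken from the extension's source. Individual providers may expect different authentication headers or paths.

```typescript
// Hypothetical shapes for illustration; not taken from the extension's source.
interface ProviderConfig {
  baseUrl: string; // e.g. "https://api.kimi.com/coding/"
  apiKey: string;
  model: string;
}

interface PreparedRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Join the base URL and the Messages path without doubling or dropping slashes.
function messagesUrl(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "") + "/v1/messages";
}

// Build (but do not send) a Messages-style translation request.
function buildMessagesRequest(config: ProviderConfig, systemPrompt: string, text: string): PreparedRequest {
  return {
    url: messagesUrl(config.baseUrl),
    headers: {
      "content-type": "application/json",
      "x-api-key": config.apiKey, // assumption: some providers may expect a different auth header
      "anthropic-version": "2023-06-01",
    },
    body: JSON.stringify({
      model: config.model,
      max_tokens: 1024, // arbitrary value chosen for the sketch
      system: systemPrompt,
      messages: [{ role: "user", content: text }],
    }),
  };
}
```

Keeping request construction separate from the network call makes the routing logic easy to check per provider before any API key is spent.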

The default system prompt is sense-for-sense rather than literal: it asks the model to write as a native speaker of the target language would naturally express the same idea, while preserving the source meaning, tone, facts, and level of formality. For Chinese targets, it explicitly asks for natural Chinese expression instead of English syntax rewritten with Chinese words.

The extension also supports configurable translation prompts. Users can choose a built-in Prompt Profile such as Screenshot OCR, Technical / Developer, Academic Writing, Legal / Policy, Subtitle / Conversation, or Custom Only, then add reusable Custom Prompt Instructions for terminology, audience, tone, and formatting preferences. These instructions are included with every translation request while the source text stays in a separate Text: block.
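A minimal sketch of how profile hints and custom instructions might be combined while the source text stays in its own Text: block. The profile keys, the hint wording, and the `buildPrompt` helper are assumptions made for illustration; the extension's actual prompts are not shown in this description.

```typescript
// Hypothetical profile keys and wording; the real prompts are not part of this PR description.
const PROFILE_HINTS: Record<string, string> = {
  "screenshot-ocr": "The text comes from OCR and may contain line-break or character errors.",
  technical: "Keep code identifiers, commands, and API names unchanged.",
  "custom-only": "",
};

// Compose instructions first, then append the source text in a separate "Text:" block.
function buildPrompt(
  profile: string,
  customInstructions: string,
  targetLanguage: string,
  sourceText: string,
): string {
  const instructions = [
    `Translate the text into ${targetLanguage}.`,
    PROFILE_HINTS[profile] ?? "",
    customInstructions.trim(),
  ].filter((part) => part.length > 0);
  return `${instructions.join("\n")}\n\nText:\n${sourceText}`;
}
```

Separating instructions from the source text reduces the chance that text inside the screenshot is read as an instruction, although it does not rule that out.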

The extension provides four commands:

Translate Screenshot: captures a screen region, runs OCR, and opens the recognized text in the translation view.
Extract Text from Screenshot: captures a screen region, runs OCR, and opens an editable result view with copy, compact-copy, translate, and retake actions.
Copy Text from Screenshot: captures a screen region, runs OCR, and copies the recognized text to the clipboard without opening a result view.
Translate Selected Text: translates selected text, typed text, or a fallback argument when no selection is available.

OCR options include macOS Vision, Tesseract, Baidu OCR, and PaddleOCR HTTP services.

Official API documentation links for each provider are included in the README.

Screencast

Store screenshots are included in:

metadata/ai-translate-1.png
metadata/ai-translate-2.png
metadata/ai-translate-3.png

Checklist

I read the extension guidelines
I read the documentation about publishing
I ran npm run build and tested this distribution build in Raycast
I checked that files in the assets folder are used by the extension itself
I checked that assets used by the README are placed outside of the metadata folder

raycastbot · 2026-05-05T23:36:05Z

Congratulations on your new Raycast extension! 🚀

We're currently experiencing a high volume of incoming requests. As a result, the initial review may take up to 10-15 business days.

Once the PR is approved and merged, the extension will be available on our Store.

greptile-apps · 2026-05-05T23:43:36Z

Greptile Summary

This PR introduces a new AI Translate extension with four commands: screenshot OCR translation, text extraction from screenshots, clipboard copy of OCR text, and direct text translation — backed by configurable BYOK providers (DeepSeek, MiMo, MiniMax, Kimi, Gemini, OpenAI) with Anthropic-compatible and OpenAI-compatible routing.

The translate command's getSelectedText call can silently clear text the user has already typed: if nothing is selected, the rejection handler calls setInputText(""), overwriting whatever input was typed during the async wait.
Multiple files manually define command argument interfaces (TranslateArguments, ExtractArguments, ExtractLaunchContext) that Raycast auto-generates in raycast-env.d.ts, continuing the same drift risk as the manually defined ExtensionPreferences flagged in a previous review thread.

Confidence Score: 3/5

The extension has a user-visible bug in its core translate command where a slow or absent text selection can silently wipe input the user has already typed.

The getSelectedText catch block unconditionally calls setInputText(""), meaning any user who opens the Translate command and starts typing before the selection promise settles will have their input cleared with no error message or recovery path. This is the extension's primary command and the bug is on its main input path.

src/translate.tsx — the getSelectedText setup effect needs a guard to avoid overwriting user input on rejection.

Important Files Changed

| Filename | Overview |
| --- | --- |
| extensions/ai-translate/src/translate.tsx | Main translation view; contains a race condition where `getSelectedText` rejection can clear user-typed input, and a manually defined argument interface that should be auto-generated. |
| extensions/ai-translate/src/providers.ts | Provider routing and HTTP logic; uses `safeParseJson` to guard against non-JSON error bodies, handles Gemini, OpenAI-compatible, and Anthropic-compatible paths correctly. |
| extensions/ai-translate/src/ocr-engines.ts | OCR engine dispatch with Baidu token caching, Tesseract, and PaddleOCR support; `safeParseJson` + `isRawResponse` guard catches non-JSON responses correctly. |
| extensions/ai-translate/src/types.ts | Manually defines `ExtensionPreferences` (flagged in earlier review thread) instead of relying on the auto-generated `raycast-env.d.ts` type. |
| extensions/ai-translate/src/languages.ts | Auto-detect language logic correctly excludes Japanese (Hiragana/Katakana check) and Korean (Hangul check) before falling back to the CJK ideograph range. |
| extensions/ai-translate/src/preferences.ts | Preference reading and provider config resolution; provider ordering and fallback logic is correct but still uses the manually defined `ExtensionPreferences` generic. |
| extensions/ai-translate/src/extract-text-from-screenshot.tsx | OCR extraction view; manually defines `ExtractArguments` and `ExtractLaunchContext` interfaces that should be auto-generated by Raycast tooling. |
| extensions/ai-translate/package.json | Extension manifest includes `$schema`, metadata screenshots, and macOS-only platform; category list was noted as broader than needed in a prior review thread. |
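The ordering described for languages.ts (kana first, then Hangul, then the CJK ideograph range) can be sketched as below. The function name, return labels, and exact Unicode ranges are assumptions for illustration and may differ from what the extension actually checks.

```typescript
type DetectedScript = "ja" | "ko" | "zh" | "other";

// Check the most specific scripts first so shared ideographs are not
// misread as Chinese when kana or Hangul are also present.
function detectCjkLanguage(text: string): DetectedScript {
  if (/[\u3040-\u309F\u30A0-\u30FF]/.test(text)) return "ja"; // Hiragana, Katakana
  if (/[\uAC00-\uD7AF\u1100-\u11FF]/.test(text)) return "ko"; // Hangul syllables, Hangul Jamo
  if (/[\u4E00-\u9FFF]/.test(text)) return "zh"; // CJK Unified Ideographs
  return "other";
}
```

A known limit of this kind of heuristic is that Japanese text written only in kanji still falls through to the ideograph branch.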

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
extensions/ai-translate/src/translate.tsx:60-65
**`getSelectedText` result can silently clear user-typed input**

If the user opens the command and immediately starts typing (before the `getSelectedText` promise settles), the catch block calls `setInputText("")` which overwrites whatever they have typed. Even a ~100 ms resolution time — normal when nothing is selected — is enough to observe this: the user types a few characters, the catch fires, and the search bar is cleared. This is a silent destructive state mutation with no feedback.

### Issue 2 of 2
extensions/ai-translate/src/translate.tsx:28-30
**Manually defined command argument interface**

`TranslateArguments` is a hand-written type for the `text` argument. Raycast auto-generates command argument types in `raycast-env.d.ts` at build time, so this manual definition can silently drift out of sync if the argument name or type changes in `package.json`. The same pattern appears in `extract-text-from-screenshot.tsx` with `ExtractArguments` and `ExtractLaunchContext`. Both files should use the generated types instead.

Reviews (6): Last reviewed commit: "Address review feedback"

xwzhangSZU · 2026-05-06T01:25:06Z

Addressed the Greptile feedback in the latest update.

Changes made:

Fixed Auto language detection so Japanese kana and Korean Hangul are excluded before treating CJK ideographs as Chinese.
Added safe JSON parsing for OCR API responses so non-JSON error bodies surface readable HTTP / invalid JSON errors instead of raw SyntaxError crashes.
Removed unreachable Kimi / MiMo provider workarounds from the OpenAI-compatible path now that those providers use Anthropic-compatible requests.
Removed the redundant Raycast List throttle and kept the existing internal debounce to reduce translation startup latency.
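The safe JSON parsing change listed above could look roughly like the following. The name `safeParseJson` is borrowed from the review summary, but the body, the `ParseResult` type, and the `describeFailure` helper are assumptions and not the extension's code.

```typescript
type ParseResult = { ok: true; value: unknown } | { ok: false; raw: string };

// Never throws: a non-JSON body is returned as raw text instead of a SyntaxError.
function safeParseJson(body: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(body) };
  } catch {
    return { ok: false, raw: body };
  }
}

// Turn a failed OCR or provider response into a readable message.
function describeFailure(status: number, body: string): string {
  const parsed = safeParseJson(body);
  if (!parsed.ok) {
    const preview = parsed.raw.trim().slice(0, 120);
    return `HTTP ${status}: response was not valid JSON (${preview || "empty body"})`;
  }
  return `HTTP ${status}: ${JSON.stringify(parsed.value)}`;
}
```

Returning a tagged result rather than throwing lets the caller decide whether an HTML error page from a gateway should be shown to the user or only logged.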

Latest extension commit in this PR: 70d1ae0.

greptile-apps · 2026-05-06T01:25:12Z

+      try {
+        const selectedText = normalizeInputText(await getSelectedText());
+        if (isMounted) setInputText(selectedText);
+      } catch {
+        if (isMounted) setInputText("");
+      }


getSelectedText result can silently clear user-typed input

If the user opens the command and immediately starts typing (before the getSelectedText promise settles), the catch block calls setInputText("") which overwrites whatever they have typed. Even a ~100 ms resolution time — normal when nothing is selected — is enough to observe this: the user types a few characters, the catch fires, and the search bar is cleared. This is a silent destructive state mutation with no feedback.
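One possible guard, sketched without React or the Raycast API so that it runs on its own: treat a rejected selection lookup as "no selection" and only seed the field while it is still empty. The helper names are hypothetical; `getSelected` stands in for `getSelectedText` and `update` for a functional state update such as `setInputText((current) => ...)`.

```typescript
// Pure decision: what should the input be once the selection lookup settles?
// `selected` is null when the lookup rejected (nothing was selected).
function nextInputAfterSelection(current: string, selected: string | null): string {
  if (current !== "") return current; // the user already typed; never overwrite
  return selected ?? current; // seed from the selection, or leave the field empty
}

// Framework-free wrapper around the decision above.
async function seedInput(
  getSelected: () => Promise<string>,
  update: (fn: (current: string) => string) => void,
  isMounted: () => boolean,
): Promise<void> {
  let selected: string | null = null;
  try {
    selected = (await getSelected()).trim();
  } catch {
    selected = null; // rejection means "no selection", not "clear the field"
  }
  if (!isMounted()) return;
  update((current) => nextInputAfterSelection(current, selected));
}
```

Using a functional update matters here: the decision reads the input value at the moment the promise settles, not the value captured when the effect started. Whether typed text should also win over a successful selection is a product choice; this sketch assumes it should.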

Initial Raycast AI translator

d9c5b4f

raycastbot added the new extension label May 5, 2026

raycastbot added the platform: macOS label May 5, 2026

Prepare Raycast Store submission

07565b3

xwzhangSZU marked this pull request as ready for review May 5, 2026 23:37

greptile-apps Bot reviewed May 5, 2026

Comment thread extensions/ai-translate/src/types.ts

Comment thread extensions/ai-translate/src/ocr-engines.ts Outdated

Comment thread extensions/ai-translate/package.json Outdated

Add screenshot text extraction command

f5a9704

greptile-apps Bot reviewed May 6, 2026

Comment thread extensions/ai-translate/src/providers.ts Outdated

xwzhangSZU added 5 commits May 6, 2026 08:09

Remove MinerU OCR integration

9b757f2

Split screenshot OCR commands

bafe58f

Update ai-translate extension

4ac5016

- Add store screenshot
- Fix extracted text result launch

Update ai-translate extension

99e988f

- Add second store screenshot
- Keep OCR result in extract view

Add third store screenshot

2cbbc48

greptile-apps Bot reviewed May 6, 2026

Comment thread extensions/ai-translate/src/ocr-engines.ts

xwzhangSZU added 7 commits May 6, 2026 08:43

Add official API documentation links

8e3841e

Rewrite store copy in English

61f188c

Refocus copy on screenshot translation

7e8f009

Clarify model routing advantage

936b712

Update Kimi coding base URL

bc0dde5

Add configurable translation prompts

bf03139

Refine default translation prompt

add90a5

greptile-apps Bot reviewed May 6, 2026

Comment thread extensions/ai-translate/src/languages.ts Outdated

Address review feedback

70d1ae0

greptile-apps Bot reviewed May 6, 2026

Add ai-translate extension #27675
xwzhangSZU wants to merge 16 commits into raycast:main from xwzhangSZU:ext/ai-translate

2 participants