Feat/english thinking when hidden #1843

Open
cmyyy wants to merge 2 commits into Hmbown:main from cmyyy:feat/english-thinking-when-hidden

cmyyy commented May 20, 2026

Problem

When show_thinking is disabled, thinking blocks are hidden from the
UI (HistoryCell::Thinking is filtered out in history.rs), but the
API still generates reasoning_content (controlled separately by
reasoning_effort). The ## Language rule in base.md forces the
thinking chain to match the user's input language — so a Chinese user
pays for Chinese thinking they never see.

The token-count difference between Chinese and English reasoning is
modest (DeepSeek's tokenizer handles CJK efficiently), but there is
still waste: the surrounding system prompt is English, mixed-language
reasoning fragments the token stream, and invisible content has no
reason to be localized at all.
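
The display-side half of this problem can be sketched as follows. This is a minimal illustration, not the code in history.rs: the `HistoryCell` variants other than `Thinking`, and the `visible_cells` helper, are assumptions made up for the example. It shows why hiding thinking in the UI does nothing to the request itself: the filter only runs over cells that have already been generated and paid for.

```rust
// Hypothetical, simplified stand-in for the TUI's history cell type.
// Only the `Thinking` variant name comes from the PR description.
#[derive(Debug, Clone, PartialEq)]
pub enum HistoryCell {
    User(String),
    Thinking(String),
    Assistant(String),
}

/// Returns the cells that would be rendered. When `show_thinking` is false,
/// `Thinking` cells are dropped from the view, but they were still produced
/// by the API before reaching this point.
pub fn visible_cells(cells: &[HistoryCell], show_thinking: bool) -> Vec<&HistoryCell> {
    cells
        .iter()
        .filter(|cell| show_thinking || !matches!(cell, HistoryCell::Thinking(_)))
        .collect()
}
```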

Solution

Inject a ## Thinking Language override into the system prompt when
show_thinking is false. The model is told:

The user has disabled thinking display in settings — they will never
see your reasoning_content. Therefore, your internal thinking MUST
be in English regardless of the user's language. Your final reply
must STILL match the user's language.

Benefits:

  • Prompt-cache locality — English reasoning sits better inside an
    English system prompt, improving prefix-cache hit rates.
  • Mixed-language overhead — reasoning often interleaves code
    identifiers, file paths, and API names with natural language.
    Switching between CJK and ASCII mid-stream creates unnecessary
    token-boundary breaks.
  • Aligns cost with intent — when the user hides thinking, there is
    no user-facing reason to localize it.

Changes (6 files, +53 lines)

| File | Change |
| --- | --- |
| crates/tui/src/prompts.rs | Add show_thinking to PromptSessionContext; inject ## Thinking Language override |
| crates/tui/src/core/ops.rs | Add show_thinking to Op::SendMessage |
| crates/tui/src/core/engine.rs | Add to EngineConfig; thread through handle_send_message and refresh_system_prompt |
| crates/tui/src/tui/ui.rs | Wire app.show_thinking through build_engine_config and prompt construction |
| crates/tui/src/main.rs | Pass show_thinking in CLI exec path |
| crates/tui/src/runtime_threads.rs | show_thinking: false for background agent threads |

Data flow: Settings::show_thinking → App::show_thinking →
Op::SendMessage → EngineConfig::show_thinking →
PromptSessionContext::show_thinking → override injection.
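
The data flow above can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the PR's code: the struct and enum names and the `show_thinking` field come from the description, while every other field, the `BASE_PROMPT` text, the abbreviated override text, and the `build_system_prompt` and `prompt_for` helpers are hypothetical.

```rust
// Trimmed-down stand-ins for the types named in the data flow.
// Only the type names and `show_thinking` come from the PR description.
#[derive(Clone, Copy)]
pub struct Settings {
    pub show_thinking: bool,
}

pub struct App {
    pub show_thinking: bool,
}

pub enum Op {
    SendMessage { text: String, show_thinking: bool },
}

pub struct EngineConfig {
    pub show_thinking: bool,
}

pub struct PromptSessionContext {
    pub show_thinking: bool,
}

// Placeholder for the real base prompt (assumption).
const BASE_PROMPT: &str = "## Language\n\nReply in the user's language.";

// Abbreviated version of the override text, for illustration only.
const THINKING_OVERRIDE: &str = "\n\n## Thinking Language\n\n\
    The user has disabled thinking display in settings. \
    Your internal thinking MUST be in English. \
    Your final reply must STILL match the user's language.";

/// Appends the override only when thinking is hidden.
pub fn build_system_prompt(ctx: &PromptSessionContext) -> String {
    let mut full_prompt = String::from(BASE_PROMPT);
    if !ctx.show_thinking {
        full_prompt.push_str(THINKING_OVERRIDE);
    }
    full_prompt
}

/// Threads the flag through each layer in the order given by the data flow.
pub fn prompt_for(settings: Settings, text: &str) -> String {
    let app = App { show_thinking: settings.show_thinking };
    let op = Op::SendMessage { text: text.to_owned(), show_thinking: app.show_thinking };
    let Op::SendMessage { show_thinking, .. } = op;
    let config = EngineConfig { show_thinking };
    let ctx = PromptSessionContext { show_thinking: config.show_thinking };
    build_system_prompt(&ctx)
}
```

With `Settings { show_thinking: false }` the returned prompt ends with the `## Thinking Language` section; with `true` it is the base prompt unchanged.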

Behavior

| show_thinking | Reasoning language | Reply language |
| --- | --- | --- |
| true (default) | Matches user input | Matches user input |
| false | English (forced) | Matches user input |

Testing

  • cargo test --all-features — 3120 passed; 6 pre-existing
    failures unrelated to this change (API key / temp directory /
    AGENTS.md path)
  • cargo fmt --all -- --check — clean
  • cargo clippy --all-targets --all-features — clean (2
    pre-existing needless_return warnings unrelated to this change)

Checklist

  • Updated docs or comments as needed
  • Added or updated tests where relevant (existing prompt tests
    updated for the new PromptSessionContext field)
  • Verified TUI behavior manually if UI changes (no UI logic
    change; the entire path is a backend-only data flow)

cmyyy added 2 commits May 21, 2026 00:03
When `show_thinking` is disabled in settings, thinking blocks are
hidden from the UI but the API still generates `reasoning_content`.
Because of the `## Language` rule in the system prompt, the thinking
chain follows the user's input language — if the user writes in
Chinese, the model thinks in Chinese for content they never see.

The tokenizer-level savings are modest (DeepSeek's vocab handles
Chinese efficiently), but the real benefit is keeping invisible
reasoning in English for better prompt-cache locality and fewer
token-boundary breaks in mixed-language (code + natural language)
contexts. When thinking is hidden, there is no reason not to use the
most cache-friendly language.

Changes:
- Add `show_thinking: bool` to `PromptSessionContext`, `EngineConfig`,
  and `Op::SendMessage`
- Inject a `## Thinking Language` override when `show_thinking` is
  false, redirecting `reasoning_content` to English while the final
  reply still matches the user's language
- Wire the field through the engine and TUI layers

Replace "To save tokens" with a rationale based on the user's intent:
when thinking is hidden, there is no reason to localize it.

gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request introduces a show_thinking configuration option across the TUI engine, prompts, and UI components. When disabled, the system prompt is updated to instruct the model to perform its internal reasoning in English to save tokens, while still responding in the user's preferred language. Feedback was provided regarding the formatting of the multi-line system prompt string in crates/tui/src/prompts.rs, where the use of line continuation characters might introduce unintended whitespace.

Comment thread crates/tui/src/prompts.rs
Comment on lines +656 to +664

        full_prompt.push_str(
            "\n\n## Thinking Language\n\n\
             The user has disabled thinking display in settings — they will \
             never see your `reasoning_content`. Therefore, your internal \
             thinking MUST be in English regardless of the user's language. \
             This directive overrides the `## Language` section above for \
             reasoning_content only. Your final reply must STILL match the \
             user's language.",
        );

medium

The use of \ for line continuation in this string literal will include all the leading whitespace from the subsequent lines. This results in a single long line with multiple spaces between words, which is likely not the intended formatting for the prompt. Using string literal concatenation is a cleaner way to format this multi-line string while maintaining readability in the code.

        full_prompt.push_str(
            "\n\n## Thinking Language\n\n"
            "The user has disabled thinking display in settings — they will "
            "never see your `reasoning_content`. Therefore, your internal "
            "thinking MUST be in English regardless of the user's language. "
            "This directive overrides the `## Language` section above for "
            "reasoning_content only. Your final reply must STILL match the "
            "user's language."
        );
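
For anyone weighing this suggestion, the behavior can be checked directly. Per the Rust Reference, a `\` immediately before a line break in a string literal skips the line break and all leading whitespace on the next line, so the original form should not produce runs of spaces. Rust also does not join adjacent string literals the way C or Python do; the `concat!` macro is the standard-library way to get that layout. The sketch below uses a shortened prompt text and two hypothetical helper functions to compare the two forms.

```rust
/// Built with `\` line continuations: the newline and the indentation that
/// follows each `\` are skipped, so only the spaces typed before the `\`
/// end up in the string.
pub fn with_continuation() -> String {
    let mut s = String::new();
    s.push_str(
        "\n\n## Thinking Language\n\n\
         Your internal thinking MUST be in English. \
         Your final reply must STILL match the user's language.",
    );
    s
}

/// Built with `concat!`, which joins the literals at compile time.
pub fn with_concat() -> String {
    let mut s = String::new();
    s.push_str(concat!(
        "\n\n## Thinking Language\n\n",
        "Your internal thinking MUST be in English. ",
        "Your final reply must STILL match the user's language.",
    ));
    s
}
```

Both helpers return the same string, so the choice between them is about source readability rather than prompt content.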
