bug: thinking collapse - multiple root causes causing thinking blocks to freeze, truncate silently, or drop reasoning_content #861

@ZhouChaunge

Problem

Users experience thinking collapse — a family of related defects where the model's reasoning (thinking) blocks either (a) freeze with a live spinner that never resolves, (b) get silently truncated to ≤4 lines with no clear expand affordance during streaming, or (c) disappear entirely from the API message history, causing HTTP 400 errors on the next turn.

Four distinct root causes have been identified through code analysis.


Root Cause 1: Thinking block stuck in streaming: true when ThinkingComplete is missed

Location: crates/tui/src/tui/ui.rs, apply_engine_error_to_app

When a network interruption or mid-stream disconnect occurs, the ThinkingComplete event is never delivered to the UI event loop. The HistoryCell::Thinking { streaming: true, duration_secs: None } stays in the active_cell forever. The spinner never stops. The block never gets a duration stamp and stays ThinkingVisualState::Live with status: "live".

The test flush_active_cell_finalizes_unclosed_thinking_block guards against this case, but it only covers the flush_active_cell path. The error-recovery path through apply_engine_error_to_app sets streaming_thinking_active_entry = None without finalizing the cell:

// ui.rs — apply_engine_error_to_app
app.streaming_state.reset();
app.streaming_message_index = None;
app.streaming_thinking_active_entry = None;  // clears pointer but never finalizes the cell

Impact: After any API error or stream drop during a thinking phase, the thinking cell renders with a permanent live spinner.
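
One way to close this gap is sketched below, assuming a simplified app model. The struct shapes, the ThinkingStatus enum, and the body of finalize_streaming_thinking_active_entry are hypothetical stand-ins for illustration and are not the actual types in ui.rs; only the ordering (finalize first, then clear the pointer) is the point.

```rust
use std::time::Instant;

/// Hypothetical status enum; the real code uses ThinkingVisualState.
#[derive(Debug, Clone, PartialEq)]
pub enum ThinkingStatus {
    Live,
    Done,
    Interrupted,
}

/// Simplified stand-in for HistoryCell::Thinking.
#[derive(Debug)]
pub struct ThinkingCell {
    pub streaming: bool,
    pub duration_secs: Option<f32>,
    pub status: ThinkingStatus,
}

/// Simplified stand-in for the TUI app state (assumed layout).
#[derive(Debug, Default)]
pub struct App {
    pub cells: Vec<ThinkingCell>,
    pub streaming_thinking_active_entry: Option<usize>,
    pub thinking_started_at: Option<Instant>,
}

impl App {
    /// Stamp the active thinking cell (if any) with a duration and a final
    /// status, then clear the pointer. Safe to call when nothing is active.
    pub fn finalize_streaming_thinking_active_entry(&mut self, status: ThinkingStatus) {
        let Some(idx) = self.streaming_thinking_active_entry.take() else {
            return;
        };
        let elapsed = self
            .thinking_started_at
            .take()
            .map(|t| t.elapsed().as_secs_f32())
            .unwrap_or(0.0);
        if let Some(cell) = self.cells.get_mut(idx) {
            cell.streaming = false;
            cell.duration_secs = Some(elapsed);
            cell.status = status;
        }
    }
}

/// Error path: finalize the cell BEFORE the pointer is dropped.
pub fn apply_engine_error_to_app(app: &mut App) {
    app.finalize_streaming_thinking_active_entry(ThinkingStatus::Interrupted);
    // ... the existing resets (streaming_state, streaming_message_index) would follow here
}
```

Because the finalizer takes the pointer itself, a later ThinkingComplete for the same block would find nothing active and do nothing, which keeps the error path and the normal path from double-stamping the cell.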


Root Cause 2: streaming_state.reset() on ThinkingStarted drops buffered tail content

Location: crates/tui/src/tui/ui.rs, EngineEvent::ThinkingStarted handler

EngineEvent::ThinkingStarted { .. } => {
    app.reasoning_buffer.clear();
    app.reasoning_header = None;
    app.thinking_started_at = Some(Instant::now());
    app.streaming_state.reset();          // unconditional reset
    app.streaming_state.start_thinking(0, None);
    let _ = ensure_streaming_thinking_active_entry(app);
}

streaming_state.reset() is called unconditionally. In a multi-block turn (V4 interleaved thinking), if a second ThinkingStarted arrives while the first block is still streaming, the uncommitted tail text in streaming_state for the first block is silently discarded.

Impact: In multi-round tool-call turns with interleaved thinking, the last few tokens of each thinking block may be silently truncated.
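
A minimal sketch of a flush-before-reset handler follows. It assumes streaming_state holds an uncommitted text tail and that thinking cells can be modeled as plain strings; StreamingState, push_delta, take_pending, and on_thinking_started are hypothetical names for illustration, not the real API.

```rust
/// Hypothetical model of the streaming buffer: text received but not yet
/// committed to a history cell.
#[derive(Debug, Default)]
pub struct StreamingState {
    pending_tail: String,
}

impl StreamingState {
    pub fn push_delta(&mut self, delta: &str) {
        self.pending_tail.push_str(delta);
    }

    /// Hand back the uncommitted text, leaving the buffer empty.
    pub fn take_pending(&mut self) -> String {
        std::mem::take(&mut self.pending_tail)
    }

    pub fn reset(&mut self) {
        self.pending_tail.clear();
    }
}

/// Simplified app state; each thinking cell is modeled as a String.
#[derive(Debug, Default)]
pub struct App {
    pub streaming_state: StreamingState,
    pub thinking_cells: Vec<String>,
    pub streaming_thinking_active_entry: Option<usize>,
}

/// ThinkingStarted handler that commits the previous block's tail before
/// resetting, so a second ThinkingStarted cannot discard it.
pub fn on_thinking_started(app: &mut App) {
    if let Some(idx) = app.streaming_thinking_active_entry {
        let tail = app.streaming_state.take_pending();
        if let Some(cell) = app.thinking_cells.get_mut(idx) {
            cell.push_str(&tail);
        }
    }
    app.streaming_state.reset(); // nothing left to lose at this point
    app.thinking_cells.push(String::new());
    app.streaming_thinking_active_entry = Some(app.thinking_cells.len() - 1);
}
```

The reset stays unconditional in this sketch; the flush in front of it is what makes it safe when a block is still active.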


Root Cause 3: last_reasoning race between ThinkingComplete and MessageComplete

Location: crates/tui/src/tui/ui.rs — event dispatch loop

ThinkingComplete sets app.last_reasoning. MessageComplete reads and clears app.last_reasoning to build the ContentBlock::Thinking for api_messages.

If MessageComplete is processed before ThinkingComplete (possible when the engine channel drains events in burst order), last_reasoning is None at MessageComplete time, and the thinking block is not included in the API message history:

// EngineEvent::MessageComplete
let thinking = app.last_reasoning.take();  // could be None if ThinkingComplete hasn't arrived yet
if let Some(thinking) = thinking {
    blocks.push(ContentBlock::Thinking { thinking });
}

Impact: Thinking content is stripped from api_messages. On the next turn, DeepSeek V4 returns HTTP 400 (missing reasoning_content), or the model loses continuity of its own reasoning chain.
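
The sketch below shows one order-tolerant approach: MessageComplete finalizes any still-active thinking inline before it reads last_reasoning. It assumes the reasoning text accumulates in reasoning_buffer until ThinkingComplete; the handler functions, the ContentBlock::Text variant, and the struct layout are hypothetical simplifications.

```rust
#[derive(Debug, PartialEq)]
pub enum ContentBlock {
    Thinking { thinking: String },
    /// Assumed variant, included only so the example has a message body.
    Text { text: String },
}

/// Simplified app state (assumed layout).
#[derive(Debug, Default)]
pub struct App {
    pub reasoning_buffer: String,
    pub last_reasoning: Option<String>,
    pub streaming_thinking_active_entry: Option<usize>,
}

/// Moves buffered reasoning into last_reasoning. Idempotent: a second call
/// finds an empty buffer and changes nothing.
pub fn on_thinking_complete(app: &mut App) {
    app.streaming_thinking_active_entry = None;
    if !app.reasoning_buffer.is_empty() {
        app.last_reasoning = Some(std::mem::take(&mut app.reasoning_buffer));
    }
}

/// Builds the blocks for api_messages. If a thinking entry is still active,
/// ThinkingComplete has not been processed yet, so drain it inline first.
pub fn on_message_complete(app: &mut App, text: &str) -> Vec<ContentBlock> {
    if app.streaming_thinking_active_entry.is_some() {
        on_thinking_complete(app);
    }
    let mut blocks = Vec::new();
    if let Some(thinking) = app.last_reasoning.take() {
        blocks.push(ContentBlock::Thinking { thinking });
    }
    blocks.push(ContentBlock::Text { text: text.to_string() });
    blocks
}
```

With this shape, a ThinkingComplete that arrives after MessageComplete becomes a no-op instead of leaving a stale last_reasoning behind for the next turn.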


Root Cause 4 (UX): Hard truncation to 4 rendered lines with no streaming expand affordance

Location: crates/tui/src/tui/history.rs, THINKING_SUMMARY_LINE_LIMIT = 4

Completed thinking blocks are unconditionally collapsed to 4 rendered lines:

const THINKING_SUMMARY_LINE_LIMIT: usize = 4;

if collapsed && rendered.len() > THINKING_SUMMARY_LINE_LIMIT {
    rendered.truncate(THINKING_SUMMARY_LINE_LIMIT);
    truncated = true;
}

The affordance "thinking collapsed; press Ctrl+O for full text" is only shown when !streaming && (truncated || ...). During live streaming this line is suppressed entirely — users watching a long thinking block receive no indication that content is being hidden.

Impact: Users see partial thinking output during streaming with no signal that more exists or how to see it.
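
A possible shape for a streaming-time affordance is sketched here. render_thinking_preview is a hypothetical helper, lines are modeled as plain strings rather than styled TUI lines, and keeping the first four lines (rather than the most recent four) is an assumption carried over from the truncate call above.

```rust
pub const THINKING_SUMMARY_LINE_LIMIT: usize = 4;

/// Truncates a collapsed thinking block and always appends a hint line when
/// content is hidden: a live counter while streaming, the existing Ctrl+O
/// message once the block is complete.
pub fn render_thinking_preview(
    mut rendered: Vec<String>,
    collapsed: bool,
    streaming: bool,
) -> Vec<String> {
    let total = rendered.len();
    if collapsed && total > THINKING_SUMMARY_LINE_LIMIT {
        rendered.truncate(THINKING_SUMMARY_LINE_LIMIT);
        let hidden = total - THINKING_SUMMARY_LINE_LIMIT;
        if streaming {
            let noun = if hidden == 1 { "line" } else { "lines" };
            rendered.push(format!("… ({hidden} more {noun})"));
        } else {
            rendered.push("thinking collapsed; press Ctrl+O for full text".to_string());
        }
    }
    rendered
}
```

Since the hidden-line count is recomputed on every render, the hint grows as the block streams and needs no extra state.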


Acceptance Criteria

  • RC1: When a stream error fires while a thinking block is active (streaming_thinking_active_entry.is_some()), apply_engine_error_to_app must call finalize_streaming_thinking_active_entry (or equivalent) before clearing the pointer, so the block is stamped as interrupted rather than left frozen with a live spinner.

  • RC2: The ThinkingStarted handler must flush any uncommitted tail text from streaming_state into the current active thinking cell before calling streaming_state.reset(), or skip the reset entirely when a thinking entry is already active.

  • RC3: The event dispatch loop must guarantee that ThinkingComplete is always processed before MessageComplete for the same turn. Acceptable approaches: (a) enforce event emission order in the engine, (b) have MessageComplete inline-drain a pending thinking finalization when streaming_thinking_active_entry is still set, or (c) unify into a single ThinkingAndMessageComplete event.

  • RC4 (UX): During streaming of a thinking block that exceeds THINKING_SUMMARY_LINE_LIMIT, display a live affordance (e.g. "… (N more lines)") so the user knows the block is being truncated and can navigate to the full text.


Reproduction Steps (RC1 — easiest to reproduce)

  1. Start a session with deepseek-v4-pro (thinking enabled).
  2. Submit a complex multi-step prompt that triggers a long reasoning block.
  3. While the thinking spinner is active (… thinking · live), press Esc to cancel.
  4. Observe: the thinking cell retains the live spinner and status: "live" permanently instead of transitioning to interrupted/partial.

Affected Files

  • crates/tui/src/tui/ui.rs — RC1, RC2, RC3
  • crates/tui/src/tui/history.rs — RC4 (THINKING_SUMMARY_LINE_LIMIT)

Models Affected

Any V4 model running in thinking mode: deepseek-v4-pro and deepseek-v4-flash, each with thinking enabled.

Labels: bug, question
