bug: thinking collapse - multiple root causes causing thinking blocks to freeze, truncate silently, or drop reasoning_content #861

## Problem

Users experience **thinking collapse** — a family of related defects where the model's reasoning (thinking) blocks either (a) freeze with a live spinner that never resolves, (b) get silently truncated to ≤4 lines with no clear expand affordance during streaming, or (c) disappear entirely from the API message history, causing HTTP 400 errors on the next turn.

Four distinct root causes have been identified through code analysis.

---

## Root Cause 1: Thinking block stuck in `streaming: true` when `ThinkingComplete` is missed

**Location:** `crates/tui/src/tui/ui.rs` — `apply_engine_error_to_app`

When a network interruption or mid-stream disconnect occurs, the `ThinkingComplete` event is never delivered to the UI event loop. The `HistoryCell::Thinking { streaming: true, duration_secs: None }` cell then stays in `active_cell` indefinitely: the spinner never stops, the block never receives a duration stamp, and it remains `ThinkingVisualState::Live` with `status: "live"`.

The test `flush_active_cell_finalizes_unclosed_thinking_block` guards against this defensively, but it only covers the `flush_active_cell` path. The error-recovery path through `apply_engine_error_to_app` sets `streaming_thinking_active_entry` to `None` without finalizing the cell it pointed to:

```rust
// ui.rs — apply_engine_error_to_app
app.streaming_state.reset();
app.streaming_message_index = None;
app.streaming_thinking_active_entry = None;  // clears pointer but never finalizes the cell
```

**Impact:** After any API error or stream drop during a thinking phase, the thinking cell renders with a permanent live spinner.
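
One possible shape for the fix, as a minimal sketch rather than the actual patch. The types below are simplified stand-ins for the real TUI types; the `interrupted` field, the `cells` vector, and the `elapsed_secs` parameter are assumptions made for illustration. Only the function names come from this issue.

```rust
// Hypothetical, simplified stand-ins for the real TUI types.
#[derive(Debug, PartialEq)]
enum HistoryCell {
    Thinking {
        text: String,
        streaming: bool,
        duration_secs: Option<f32>,
        interrupted: bool, // assumed field, not confirmed by the codebase
    },
}

#[derive(Default)]
struct App {
    cells: Vec<HistoryCell>,
    streaming_thinking_active_entry: Option<usize>,
}

/// Finalize the active thinking cell (if any) and clear the pointer.
/// Returns true when a cell was actually finalized.
fn finalize_streaming_thinking_active_entry(
    app: &mut App,
    elapsed_secs: f32,
    interrupted: bool,
) -> bool {
    // `take()` clears the pointer and hands back the old index in one step.
    let Some(idx) = app.streaming_thinking_active_entry.take() else {
        return false;
    };
    match app.cells.get_mut(idx) {
        Some(HistoryCell::Thinking {
            streaming,
            duration_secs,
            interrupted: flag,
            ..
        }) => {
            *streaming = false;
            *duration_secs = Some(elapsed_secs);
            *flag = interrupted;
            true
        }
        None => false,
    }
}

fn apply_engine_error_to_app(app: &mut App, elapsed_secs: f32) {
    // Finalize before anything else, so the cell cannot be left live
    // once its pointer has been cleared.
    finalize_streaming_thinking_active_entry(app, elapsed_secs, true);
    // The existing resets (streaming_state, streaming_message_index)
    // would follow here.
}
```

Because the helper clears the pointer itself, calling it a second time is a no-op, which keeps it safe to invoke from both the error path and the normal completion path.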

---

## Root Cause 2: `streaming_state.reset()` on `ThinkingStarted` drops buffered tail content

**Location:** `crates/tui/src/tui/ui.rs` — `EngineEvent::ThinkingStarted` handler

```rust
EngineEvent::ThinkingStarted { .. } => {
    app.reasoning_buffer.clear();
    app.reasoning_header = None;
    app.thinking_started_at = Some(Instant::now());
    app.streaming_state.reset();          // unconditional reset
    app.streaming_state.start_thinking(0, None);
    let _ = ensure_streaming_thinking_active_entry(app);
}
```

`streaming_state.reset()` is called unconditionally. In a multi-block turn (V4 interleaved thinking), if a second `ThinkingStarted` arrives while the first block is still streaming, the uncommitted tail text in `streaming_state` for the first block is silently discarded.

**Impact:** In multi-round tool-call turns with interleaved thinking, the last few tokens of each thinking block may be silently truncated.
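
A minimal sketch of the flush-before-reset approach. The real `streaming_state` API is not shown in this issue, so `StreamingState`, `take_uncommitted`, `thinking_cells`, and `active_thinking` are all hypothetical names used only to illustrate the ordering.

```rust
// Hypothetical stand-in for the real streaming state.
#[derive(Default)]
struct StreamingState {
    pending_tail: String, // text received but not yet committed to a cell
}

impl StreamingState {
    fn push(&mut self, delta: &str) {
        self.pending_tail.push_str(delta);
    }

    /// Take whatever has not been committed yet, leaving the buffer empty.
    fn take_uncommitted(&mut self) -> String {
        std::mem::take(&mut self.pending_tail)
    }

    fn reset(&mut self) {
        self.pending_tail.clear();
    }
}

#[derive(Default)]
struct App {
    streaming_state: StreamingState,
    thinking_cells: Vec<String>,    // one entry per thinking block
    active_thinking: Option<usize>, // index of the block still streaming
}

fn on_thinking_started(app: &mut App) {
    // Flush the previous block's tail before the reset discards it.
    if let Some(idx) = app.active_thinking {
        let tail = app.streaming_state.take_uncommitted();
        app.thinking_cells[idx].push_str(&tail);
    }
    app.streaming_state.reset();
    app.thinking_cells.push(String::new());
    app.active_thinking = Some(app.thinking_cells.len() - 1);
}
```

The reset stays unconditional in this sketch; only the order changes, so single-block turns behave exactly as before.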

---

## Root Cause 3: `last_reasoning` race between `ThinkingComplete` and `MessageComplete`

**Location:** `crates/tui/src/tui/ui.rs` — event dispatch loop

`ThinkingComplete` sets `app.last_reasoning`. `MessageComplete` reads and clears `app.last_reasoning` to build the `ContentBlock::Thinking` for `api_messages`.

If `MessageComplete` is processed **before** `ThinkingComplete` (possible when the engine channel drains events in burst order), `last_reasoning` is `None` at `MessageComplete` time, and the thinking block is **not included** in the API message history:

```rust
// EngineEvent::MessageComplete
let thinking = app.last_reasoning.take();  // could be None if ThinkingComplete hasn't arrived yet
if let Some(thinking) = thinking {
    blocks.push(ContentBlock::Thinking { thinking });
}
```

**Impact:** Thinking content is stripped from `api_messages`. On the next turn, DeepSeek V4 returns HTTP 400 (missing `reasoning_content`), or the model loses continuity of its own reasoning chain.
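
A sketch of one way to make `MessageComplete` tolerant of the ordering, by finalizing any pending reasoning inline. It assumes that `reasoning_buffer` accumulates the thinking deltas and that `ThinkingComplete` moves it into `last_reasoning`; the handler function names and the `ContentBlock::Text` variant are hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum ContentBlock {
    Thinking { thinking: String },
    Text { text: String }, // assumed variant, for illustration only
}

#[derive(Default)]
struct App {
    reasoning_buffer: String,       // accumulates thinking deltas
    last_reasoning: Option<String>, // set when thinking is finalized
}

fn on_thinking_complete(app: &mut App) {
    // Moving the buffer out makes a late, duplicate call a no-op.
    if !app.reasoning_buffer.is_empty() {
        app.last_reasoning = Some(std::mem::take(&mut app.reasoning_buffer));
    }
}

fn on_message_complete(app: &mut App, text: &str) -> Vec<ContentBlock> {
    // Inline-drain: if ThinkingComplete has not been processed yet,
    // finalize the pending reasoning here instead of dropping it.
    if app.last_reasoning.is_none() {
        on_thinking_complete(app);
    }
    let mut blocks = Vec::new();
    if let Some(thinking) = app.last_reasoning.take() {
        blocks.push(ContentBlock::Thinking { thinking });
    }
    blocks.push(ContentBlock::Text {
        text: text.to_string(),
    });
    blocks
}
```

With this shape, a `ThinkingComplete` that arrives after `MessageComplete` finds an empty buffer and leaves `last_reasoning` unset, so stale reasoning cannot leak into the next turn.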

---

## Root Cause 4 (UX): Hard truncation to 4 rendered lines with no streaming expand affordance

**Location:** `crates/tui/src/tui/history.rs` — `THINKING_SUMMARY_LINE_LIMIT = 4`

Thinking blocks in the collapsed state are truncated to 4 rendered lines, regardless of how much content they hold:

```rust
const THINKING_SUMMARY_LINE_LIMIT: usize = 4;

if collapsed && rendered.len() > THINKING_SUMMARY_LINE_LIMIT {
    rendered.truncate(THINKING_SUMMARY_LINE_LIMIT);
    truncated = true;
}
```

The affordance `"thinking collapsed; press Ctrl+O for full text"` is only shown when `!streaming && (truncated || ...)`. During **live streaming** this line is suppressed entirely — users watching a long thinking block receive no indication that content is being hidden.

**Impact:** Users see partial thinking output during streaming with no signal that more exists or how to see it.
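
A sketch of what a streaming-aware affordance could look like. `render_collapsed_thinking` is a hypothetical helper, not the actual function in `history.rs`, and the live affordance wording is only a suggestion.

```rust
const THINKING_SUMMARY_LINE_LIMIT: usize = 4;

/// Returns the lines to display for a collapsed thinking block,
/// appending an affordance line whenever content is hidden.
fn render_collapsed_thinking(mut rendered: Vec<String>, streaming: bool) -> Vec<String> {
    let hidden = rendered.len().saturating_sub(THINKING_SUMMARY_LINE_LIMIT);
    if hidden == 0 {
        return rendered; // nothing hidden, nothing to signal
    }
    rendered.truncate(THINKING_SUMMARY_LINE_LIMIT);
    if streaming {
        // Live affordance: shown while the block is still growing.
        rendered.push(format!("… ({hidden} more lines)"));
    } else {
        rendered.push("thinking collapsed; press Ctrl+O for full text".to_string());
    }
    rendered
}
```

The hidden-line count is recomputed on every render, so the live affordance updates as new lines stream in.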

---

## Acceptance Criteria

- [ ] **RC1**: When a stream error fires while a thinking block is active (`streaming_thinking_active_entry.is_some()`), `apply_engine_error_to_app` must call `finalize_streaming_thinking_active_entry` (or equivalent) before clearing the pointer, so the block is stamped as interrupted rather than left frozen with a live spinner.

- [ ] **RC2**: The `ThinkingStarted` handler must flush any uncommitted tail text from `streaming_state` into the current active thinking cell before calling `streaming_state.reset()`, or skip the reset entirely when a thinking entry is already active.

- [ ] **RC3**: The event dispatch loop must guarantee that `ThinkingComplete` is always processed before `MessageComplete` for the same turn. Acceptable approaches: (a) enforce event emission order in the engine, (b) have `MessageComplete` inline-drain a pending thinking finalization when `streaming_thinking_active_entry` is still set, or (c) unify into a single `ThinkingAndMessageComplete` event.

- [ ] **RC4 (UX)**: During streaming of a thinking block that exceeds `THINKING_SUMMARY_LINE_LIMIT`, display a live affordance (e.g. `"… (N more lines)"`) so the user knows the block is being truncated and can navigate to the full text.

---

## Reproduction Steps (RC1 — easiest to reproduce)

1. Start a session with `deepseek-v4-pro` (thinking enabled).
2. Submit a complex multi-step prompt that triggers a long reasoning block.
3. While the thinking spinner is active (`… thinking · live`), press **Esc** to cancel.
4. Observe: the thinking cell retains the live spinner and `status: "live"` permanently instead of transitioning to interrupted/partial.

## Affected Files

- `crates/tui/src/tui/ui.rs` — RC1, RC2, RC3
- `crates/tui/src/tui/history.rs` — RC4 (`THINKING_SUMMARY_LINE_LIMIT`)

## Models Affected

Any V4 model with thinking enabled: `deepseek-v4-pro` and `deepseek-v4-flash`.

