fix(#2861): release per-message render caches when streaming completes#2866
fix(#2861): release per-message render caches when streaming completes#2866aheritier wants to merge 1 commit into
Conversation
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
This PR correctly addresses the memory leak in long TUI sessions by introducing Finalize() on messageModel to drop renderCache and mdRenderer once streaming completes. The removeSpinner() targeted-invalidation improvement is architecturally sound. Two issues were found in the new code — one performance edge case and one dead code placement.
ff46eb0 to
306af6a
Compare
|
/review |
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟢 APPROVE
The memory-leak fix is well-reasoned and correctly targets the dominant sources of retained heap in long TUI sessions: the per-message renderCache.result ANSI string and the IncrementalRenderer's prefix cache. The Finalize()/finalizePreviousMessageView() lifecycle is sound, the SetMessage() un-finalize path is correct, and the regression test methodology (heap-inuse slope + structural invariant via reflection) is solid.
Two minor issues found in the added code (see inline comments).
messageModel held its rendered ANSI output (renderCache.result) and a per-message IncrementalRenderer (inputPrefix + outputPrefix + codeBlocksPrefix) for the lifetime of the session. With ~80-120 KB of retained strings per assistant turn and no eviction, a long TUI session grew its heap linearly until macOS's memorystatus subsystem killed docker-agent as the largest compressed process. Add Finalize() on messageModel that drops both caches when a message is no longer the active streaming target, and call it on the previous view whenever a new top-level entry is appended (and on every loaded view except the last in LoadFromSession). Render() and renderAssistantMarkdown() gracefully fall back to a transient renderer for finalized messages, so output stays correct; the chat list's bounded LRU still memoizes their rendered output. Finalize() is gated on MessageTypeAssistant: only assistant views allocate an IncrementalRenderer and accumulate large rendered ANSI strings during streaming, so user messages, tool calls, and other types have nothing to release. Setting finalized=true on those views would only have the side effect of permanently disabling renderCache for selected/hovered states (which bypass the parent's bounded LRU), forcing a fresh re-render every animation tick. addReasoningBlock() does not finalize the previous view: reasoning blocks routinely interleave with an actively streaming assistant turn, and finalizing the in-progress view would force every subsequent chunk through the transient-renderer path. The next non-reasoning entry will finalize via addMessage() when the streaming turn actually ends. removeSpinner() no longer calls invalidateAllItems(): the spinner is always at the tail, so wiping the LRU on every assistant turn was both wasteful and immediately re-armed the leak by forcing a full re-render of every previous message into a fresh renderCache. Spinner-driven messages are never put in the LRU (shouldCacheMessage returns false), so only the joined renderedLines slab needs to be reset. Adds a regression test that streams 400 large assistant messages and asserts both heap growth (<30 KB/message vs. the 161 KB/message pre-fix slope) and the structural invariant that at most one view retains render-cache state. The assertion uses a new exported HasLiveRenderState() method on the message.Model interface instead of cross-package reflection on unexported fields. Fixes #2861 Assisted-By: docker-agent
306af6a to
243c31f
Compare
|
/review |
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟢 APPROVE
This PR introduces a well-designed memory leak fix for the TUI's messageModel. The Finalize() / HasLiveRenderState() / finalizePreviousMessageView() mechanics are coherent, the gating on MessageTypeAssistant is correct, and the reasoning about why addReasoningBlock deliberately skips finalization is sound.
What was reviewed:
pkg/tui/components/message/message.go— newFinalize(),HasLiveRenderState(),renderAssistantMarkdowntransient-renderer path,SetMessageun-finalize logic,shouldCacheMessageguardpkg/tui/components/messages/messages.go—finalizePreviousMessageView()call sites,removeSpinnernarrowing frominvalidateAllItemsto targeted reset,LoadFromSessionfinalization looppkg/tui/components/messages/leak_test.go— heap-slope and live-render-state assertions
No confirmed or likely bugs were found in the added code. The implementation correctly addresses the linear heap growth described in #2861.
Summary
messageModelheld its rendered ANSI output (renderCache.result) and a per-messageIncrementalRenderer(inputPrefix+outputPrefix+codeBlocksPrefix) for the full lifetime of the session. With ~80–120 KB of retained strings per assistant turn and no eviction, a long TUI session grew its heap linearly until macOS'smemorystatus(jetsam) subsystem killeddocker-agentas the largest compressed process.Reported in #2861 via:
The compressed-pages number is dramatic because ANSI output (escape sequences + line padding to terminal width) compresses extremely well, but the underlying retained heap is real and grows at +161 KB / assistant message in a synthetic harness.
Fix
Finalize()andHasLiveRenderState()on themessage.Modelinterface and*messageModel.Finalize()dropsrenderCacheandReset()s + nilsmdRenderer. Sets afinalizedflag soRender()skips repopulatingrenderCache, andrenderAssistantMarkdown()falls back to a transient renderer rather than persisting one on the struct.Finalize()is gated onMessageTypeAssistant. Only assistant views allocate anIncrementalRendererand accumulate large rendered ANSI strings during streaming. User messages, tool calls and other types have nothing to release; settingfinalized = trueon them would only have the side effect of permanently disabling theirrenderCachefor selected/hovered states.SetMessage()un-finalizes the view if streaming ever resumes into it (Finalize is purely a cache hint).messages.model.finalizePreviousMessageView()runs whenever a new top-level entry is appended (addMessage,AddCancelledMessage,AddOrUpdateToolCall).addReasoningBlock()deliberately does not finalize. Reasoning blocks interleave with an actively streaming assistant turn, and finalizing mid-stream would force every subsequent chunk through the slow transient-renderer path. The next non-reasoning entry finalizes viaaddMessage()when the streaming turn actually ends.LoadFromSessionfinalizes every loaded view except the last.removeSpinner()no longer callsinvalidateAllItems(). The spinner is always at the tail, so wiping the entire 500-entry rendered-item LRU on every assistant turn was both wasteful and immediately re-armed the leak by forcing a full re-render of every previous message into a freshrenderCache. It now only resets the joinedrenderedLinesslab; spinner-driven messages are never put in the LRU (shouldCacheMessagereturnsfalse), so no LRUDeleteis needed.The chat-list's bounded LRU continues to memoize finalized messages' rendered output, so subsequent re-renders are still O(1) when a viewed message is in cache.
Validation
pkg/tui/components/messages/leak_test.gostreams 400 8-KiB assistant messages (200 warmup + 200 measured) and asserts both a per-message heap budget and a structural invariant. The structural assertion uses the new exportedHasLiveRenderState()method rather than cross-package reflection on unexported fields.task testpasses everywhere except the twopkg/teamloader/TestLoadExamples/.../dmr.yamlcases that require Docker Model Runner to be running locally — environmental, unrelated to this change.Fixes #2861
Assisted-By: docker-agent