Feature: DeepSeek cache-aware prompt diagnostics and wire payload optimization #1253


## Problem

DeepSeek context caching depends on stable, reusable prompt prefixes, but the TUI previously made it hard to see why cache hits were low.

In long sessions, cache reuse can be reduced by:

- changes to the static prompt prefix that are hard to detect
- growing conversation history
- large tool result messages
- repeated identical tool outputs
- repeated `<turn_meta>` blocks

## Proposal

Add minimal DeepSeek cache-aware diagnostics, plus payload optimization that applies only to the rendered wire messages sent to the API.
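
To illustrate the diagnostics half, here is a minimal Python sketch of reading the cache fields from a response `usage` object and turning them into a hit ratio. The `CacheUsage`, `parse_usage`, and `format_usage` names are hypothetical, and the TUI's actual language and types may differ. The sketch assumes the fields arrive as non-negative integers in a JSON object; anything missing or malformed is treated as unknown rather than zero.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class CacheUsage:
    """Token counts from a response `usage` object; None means 'not reported'."""
    prompt_tokens: Optional[int]
    completion_tokens: Optional[int]
    total_tokens: Optional[int]
    cache_hit_tokens: Optional[int]
    cache_miss_tokens: Optional[int]

    @property
    def hit_ratio(self) -> Optional[float]:
        """Share of prompt tokens served from cache, or None if unknown."""
        if self.cache_hit_tokens is None or self.cache_miss_tokens is None:
            return None
        denom = self.cache_hit_tokens + self.cache_miss_tokens
        return self.cache_hit_tokens / denom if denom > 0 else None


def _opt_int(usage: Mapping, key: str) -> Optional[int]:
    """Return a non-negative int for `key`, or None if absent or malformed."""
    value = usage.get(key)
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return value
    return None


def parse_usage(usage: Any) -> CacheUsage:
    """Parse a `usage` payload, tolerating None, non-objects, and missing fields."""
    if not isinstance(usage, Mapping):
        usage = {}
    return CacheUsage(
        prompt_tokens=_opt_int(usage, "prompt_tokens"),
        completion_tokens=_opt_int(usage, "completion_tokens"),
        total_tokens=_opt_int(usage, "total_tokens"),
        cache_hit_tokens=_opt_int(usage, "prompt_cache_hit_tokens"),
        cache_miss_tokens=_opt_int(usage, "prompt_cache_miss_tokens"),
    )


def format_usage(u: CacheUsage) -> str:
    """One-line status text (hypothetical display format)."""
    if u.hit_ratio is None:
        return "cache: n/a"
    return (
        f"cache: {u.cache_hit_tokens} hit / "
        f"{u.cache_miss_tokens} miss ({u.hit_ratio:.0%})"
    )
```

Treating absent fields as `None` rather than `0` keeps a response without cache data from being displayed as a 0% hit rate.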
﻿
﻿
## Scope

- Parse and display DeepSeek cache usage fields:
  - `prompt_cache_hit_tokens`
  - `prompt_cache_miss_tokens`
  - `prompt_tokens`
  - `completion_tokens`
  - `total_tokens`

- Add `/cache inspect` to show rendered prompt structure without printing full prompt text:
  - base static prefix hash
  - full request prefix hash
  - static/history/dynamic layer classification
  - first divergence from previous request
  - SHA-256 hash and character length per layer

- Keep stable prompt content before dynamic user input where possible.

- Add stable Project Context Pack support before user input.

- Add `/cache warmup` using the same stable prefix construction as normal requests.

- Optimize rendered wire messages only:
  - truncate oversized tool results before sending to DeepSeek
  - deduplicate repeated identical tool results with stable refs
  - deduplicate repeated `<turn_meta>` blocks with stable refs
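
The wire-only optimizations above could look roughly like the following Python sketch. It is an illustration under assumptions, not the TUI's implementation: the message shape (dicts with `role` and `content`), the `render_wire_messages` name, the 4000-character budget, and the `[... ref sha256:...]` placeholder format are all hypothetical. The first occurrence of a tool result or `<turn_meta>` block is kept (tool results are truncated to the budget), and later identical occurrences are replaced by a content-derived reference.

```python
import hashlib
import re
from typing import Any, Dict, List, Set

MAX_TOOL_CHARS = 4000  # hypothetical per-message budget for tool results
TURN_META_RE = re.compile(r"<turn_meta>.*?</turn_meta>", re.DOTALL)


def _digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def _ref(kind: str, text: str) -> str:
    """Stable placeholder derived only from the content it replaces."""
    return f"[{kind} ref sha256:{_digest(text)[:12]}]"


def _truncate(text: str, limit: int) -> str:
    if len(text) <= limit:
        return text
    return f"{text[:limit]}\n[truncated {len(text) - limit} chars]"


def _dedup_turn_meta(text: str, seen: Set[str]) -> str:
    """Keep the first copy of each <turn_meta> block, reference later copies."""
    def repl(match: "re.Match[str]") -> str:
        block = match.group(0)
        key = _digest(block)
        if key in seen:
            return _ref("turn_meta", block)
        seen.add(key)
        return block
    return TURN_META_RE.sub(repl, text)


def render_wire_messages(
    messages: List[Dict[str, Any]], max_tool_chars: int = MAX_TOOL_CHARS
) -> List[Dict[str, Any]]:
    """Build the API payload as new dicts; `messages` itself is never mutated."""
    seen_tools: Set[str] = set()
    seen_meta: Set[str] = set()
    wire: List[Dict[str, Any]] = []
    for msg in messages:
        out = dict(msg)  # shallow copy; only `content` is replaced below
        content = msg.get("content")
        if isinstance(content, str):
            if msg.get("role") == "tool":
                key = _digest(content)
                if key in seen_tools:
                    out["content"] = _ref("tool_result", content)
                else:
                    seen_tools.add(key)
                    out["content"] = _truncate(content, max_tool_chars)
            else:
                out["content"] = _dedup_turn_meta(content, seen_meta)
        wire.append(out)
    return wire
```

Because each placeholder depends only on the content it replaces, and earlier messages never change when later ones are appended, re-rendering a longer history reproduces the same leading messages. That determinism matters more for cache reuse than the size reduction itself.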
﻿
## Non-goals

- Do not rewrite TUI architecture
- Do not change existing config format
- Do not remove full UI transcript output
- Do not modify original session messages
- Do not print full prompts, API keys, or sensitive environment values
- Do not guarantee 100% cache hits

## Acceptance Criteria

- `/cache inspect` can be used to verify whether the base static prefix is stable between requests.
- Cache hit/miss metrics are shown when DeepSeek returns them.
- Missing cache fields are handled gracefully.
- Large or repeated tool outputs are reduced only in the rendered API messages.
- Repeated `<turn_meta>` blocks are reduced only in the rendered API messages.
- The UI transcript and saved session history remain unchanged.
- Tests cover prompt layer hashes, tool result budget/dedup, and `<turn_meta>` dedup.
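
A test for prompt layer hashes might be built on something like this Python sketch of what `/cache inspect` could compute. All names here (`LayerInfo`, `inspect_layers`, `chain_hash`, `base_static_hash`, `first_divergence`) are hypothetical, and chaining per-layer SHA-256 digests is just one possible way to derive the prefix hashes. Only hashes and character lengths are kept, so nothing from the prompt text needs to be printed. DeepSeek matches cache prefixes on its side, so equal hashes indicate byte-stable input rather than a guaranteed cache hit.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class LayerInfo:
    name: str
    kind: str    # "static", "history", or "dynamic"
    sha256: str  # hex digest of the layer text
    chars: int   # character length of the layer text


def inspect_layers(layers: Sequence[Tuple[str, str, str]]) -> List[LayerInfo]:
    """Summarize (name, kind, text) layers, given in wire order."""
    return [
        LayerInfo(name, kind, hashlib.sha256(text.encode("utf-8")).hexdigest(), len(text))
        for name, kind, text in layers
    ]


def chain_hash(infos: Sequence[LayerInfo]) -> str:
    """Hash over the ordered per-layer digests (full request prefix hash)."""
    h = hashlib.sha256()
    for info in infos:
        h.update(info.sha256.encode("ascii"))
    return h.hexdigest()


def base_static_hash(infos: Sequence[LayerInfo]) -> str:
    """Hash over the leading run of static layers only."""
    leading = []
    for info in infos:
        if info.kind != "static":
            break
        leading.append(info)
    return chain_hash(leading)


def first_divergence(
    prev: Sequence[LayerInfo], curr: Sequence[LayerInfo]
) -> Optional[int]:
    """Index of the first layer that differs from the previous request, or None."""
    for i, (a, b) in enumerate(zip(prev, curr)):
        if a.sha256 != b.sha256:
            return i
    if len(prev) != len(curr):
        return min(len(prev), len(curr))
    return None
```

In a healthy session the base static hash stays the same across requests and the first divergence falls in the history or dynamic layers; a divergence at index 0 points at an unstable static prefix.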
﻿
## Related

- #1196
