Skip to content

v0.8.43 tracker: truth surface #1877

@Hmbown

Description

@Hmbown

Release theme

v0.8.43 should make agent state and evidence legible. The user should never be left staring at Working... without knowing what is running, what is blocked, what last happened, and what basis the agent has for its claims.

This is the cockpit release: expose the machine without making the UI noisy.

Product thesis

The terminal becomes trustworthy when it can answer:

  • What is active right now?
  • Which tool, command, subagent, request, or background process owns the wait?
  • What was the last meaningful event?
  • Is this slow, blocked, retrying, cancelled, or waiting for user input?
  • What evidence supports the agent's current claim or next action?
  • What is the one safest next action or inspection path?

In scope

  • A unified runtime state model for active turns, tool calls, shell commands, subagents, RLM sessions, and app-server/serve tasks.
  • A compact user-facing state surface in the TUI plus a command or detail pager for deeper inspection.
  • Clear stuck-state classification: API stall, process wait, blocked pipe, background job, cancellation pending, approval wait, or UI/input dead zone.
  • Evidence records for important claims: file path, command, issue/PR, test, log, memory, or user-provided context.
  • Trust labels for verified current, memory-derived, user-reported, unverified, and needs maintainer decision.
  • Open log/detail, retry, cancel, and resume affordances wired to the state model where they already exist. Deep interruption semantics can wait for v0.8.45.

Candidate issue clusters to promote from v0.8.42 triage

Review lenses

  • Terminal power user: can they tell what owns the wait without losing keyboard flow?
  • Expert agent operator: can they distinguish a slow model, blocked tool, stalled stream, running subagent, and false completion?
  • Security/sandbox reviewer: does the state surface avoid overstating trust or hiding tool execution?
  • Brave beginner: does the UI say what happened and what to do next without requiring internal vocabulary?

Acceptance criteria

  • The TUI can show the current active unit of work and last event without reading raw logs.
  • A stalled turn has a classified reason or an explicit unknown state with diagnostic pointers.
  • Failed, cancelled, timed-out, and completed states are visually and semantically distinct.
  • At least three historical stuck reports have either a verified fix, a canonical duplicate, or a documented cannot-reproduce disposition.
  • Final task summaries distinguish verified results from unverified assumptions.
  • Focused tests cover the state/evidence transitions touched by the release.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingdocumentationImprovements or additions to documentationenhancementNew feature or requestquestionFurther information is requested

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions