Skip to content

feat(web-ui): Run Gates button + live gate progress for PROOF9 (#566)#574

Merged
frankbria merged 2 commits intomainfrom
feat/566-run-gates-button
Apr 7, 2026
Merged

feat(web-ui): Run Gates button + live gate progress for PROOF9 (#566)#574
frankbria merged 2 commits intomainfrom
feat/566-run-gates-button

Conversation

@frankbria
Copy link
Copy Markdown
Owner

@frankbria frankbria commented Apr 7, 2026

Summary

  • [Run Gates] button in the PROOF9 page header, disabled while a run is in progress
  • Clicking triggers POST /api/v2/proof/run, then polls GET /api/v2/proof/runs/{run_id} every 2s showing per-gate pending → running → passed/failed transitions via GateRunPanel
  • GateRunBanner shows overall pass/fail result on completion with a Retry option; requirements table refreshes automatically via mutate()
  • Error state surfaces with a Retry button if POST or polling fails

Changes

Backend

  • proof_v2.py: module-level _run_cache, RunStatusResponse model, GET /api/v2/proof/runs/{run_id} endpoint
  • test_proof_v2.py: 4 new tests for the polling endpoint

Frontend

  • types/index.ts: RunProofResponse, RunStatusResponse, GateRunStatus, GateRunEntry
  • lib/api.ts: proofApi.startRun() and proofApi.getRun()
  • hooks/useProofRun.ts (new): idle→starting→polling→complete/error state machine with interval cleanup
  • components/proof/GateRunPanel.tsx (new): per-gate status badges with animate-pulse for running state
  • components/proof/GateRunBanner.tsx (new): overall pass/fail banner
  • app/proof/page.tsx: Run Gates button + conditional GateRunPanel/GateRunBanner/error rendering

Test plan

  • uv run pytest tests/ui/test_proof_v2.py — 39 passed
  • npm test — 687 passed (7 new tests for useProofRun)
  • npm run build — clean build, no type errors
  • ruff check . — no issues

Closes #566

Summary by CodeRabbit

  • New Features

    • Run Gates flow: start a run, poll status, view per-gate progress, and see an overall pass/fail banner with retry.
    • New API to fetch cached run status and results for polling.
  • Tests

    • Added end-to-end tests for run endpoints, hook state transitions, and new UI components.

…page (#566)

- Backend: cache proof run results and expose GET /api/v2/proof/runs/{run_id}
  polling endpoint (RunStatusResponse model); 4 new tests
- Types: add RunProofResponse, RunStatusResponse, GateRunStatus, GateRunEntry
- API client: add proofApi.startRun() and proofApi.getRun()
- Hook: useProofRun manages idle→starting→polling→complete/error lifecycle
  with 2s interval poll and cleanup on unmount; 7 new tests
- Components: GateRunPanel (per-gate status badges) and GateRunBanner
  (pass/fail result with retry) added to proof component barrel
- Page: Run Gates button in header (disabled during active run),
  GateRunPanel + GateRunBanner + error state wired below filter controls,
  mutate() called on run completion to refresh requirements table

Closes #566
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 7, 2026

Walkthrough

Adds a module-level run cache and a GET run-status endpoint to the backend; implements a frontend run lifecycle (start → poll → complete/error) via a new hook, two UI components, API client methods, and tests to support live gate-run progress and final result display.

Changes

Cohort / File(s) Summary
Backend Cache & Endpoint
codeframe/ui/routers/proof_v2.py
Added _run_cache dict, compute aggregated passed on POST /run, cache {results, passed, message}, added RunStatusResponse model and GET /runs/{run_id} endpoint returning cached status or 404.
Backend Tests
tests/ui/test_proof_v2.py
Added TestGetRunStatus suite validating GET returns 200 with expected fields for known run IDs, 404 for unknown IDs, and that GET results match POST results.
Frontend Run Hook
web-ui/src/hooks/useProofRun.ts, web-ui/src/hooks/index.ts
New useProofRun hook: run state machine (idle → starting → polling → complete/error), optimistic gateEntries, 2s polling via proofApi.getRun, error handling and retry(), exported types.
Frontend Hook Tests
web-ui/src/__tests__/hooks/useProofRun.test.ts
Added tests mocking proofApi.startRun/getRun to verify state transitions, polling behavior, success/failure cases, error handling, and retry behavior.
Frontend Page Integration
web-ui/src/app/proof/page.tsx
Integrated useProofRun, added [Run Gates] button (disabled while running), shows spinner/label while running, triggers SWR mutate on completion, conditionally renders GateRunPanel, GateRunBanner, and error UI.
Frontend UI Components
web-ui/src/components/proof/GateRunPanel.tsx, web-ui/src/components/proof/GateRunBanner.tsx, web-ui/src/components/proof/index.ts
Added GateRunPanel (live per-gate badges) and GateRunBanner (overall pass/fail banner with retry), exported from proof component barrel.
Frontend API & Types
web-ui/src/lib/api.ts, web-ui/src/types/index.ts
Added proofApi.startRun() (POST /api/v2/proof/run) and proofApi.getRun() (GET /api/v2/proof/runs/{run_id}); new types RunProofRequest, RunProofResponse, RunStatusResponse, GateRunStatus, and GateRunEntry.

Sequence Diagram

sequenceDiagram
    actor User
    participant Page as PROOF9 Page
    participant Hook as useProofRun Hook
    participant API as Frontend API
    participant Backend as Backend Router
    participant Cache as Run Cache

    User->>Page: Click [Run Gates]
    Page->>Hook: startRun(workspacePath)
    Hook->>API: POST /api/v2/proof/run (full=true)
    API->>Backend: POST request
    Backend->>Cache: Store results + computed passed
    Backend-->>API: RunProofResponse {run_id, results, message}
    API-->>Hook: Return RunProofResponse
    Hook->>Hook: build optimistic gateEntries, set runState=polling
    loop every 2s
        Hook->>API: GET /api/v2/proof/runs/{run_id}
        API->>Backend: GET request
        Backend->>Cache: Lookup run_id
        Backend-->>API: RunStatusResponse {status, results, passed, message}
        API-->>Hook: Return RunStatusResponse
        Hook->>Hook: update gateEntries & possibly set passed
        alt status == complete
            Hook->>Hook: clear interval, set runState=complete
        end
    end
    Hook-->>Page: expose final state (passed, gateEntries, message)
    Page->>Page: render GateRunBanner / GateRunPanel / refresh table
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I thumped a key and watched the gates take flight,
Polling every heartbeat through the day and night.
Cache holds the answer, banners sing the fate,
Retry if needed — I’ll hop and wait! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: adding a Run Gates button and live gate progress visualization for the PROOF9 feature.
Linked Issues check ✅ Passed The PR implements all core objectives from issue #566: Run Gates button, POST/GET endpoint integration, polling with gate status transitions, overall result banner, error handling with retry, and button disable during active runs.
Out of Scope Changes check ✅ Passed All changes align with issue #566 scope; no unrelated modifications detected outside the Run Gates feature and its supporting backend/frontend infrastructure.
Docstring Coverage ✅ Passed Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/566-run-gates-button

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude
Copy link
Copy Markdown

claude bot commented Apr 7, 2026

PR Review: feat(web-ui): Run Gates button + live gate progress for PROOF9 (#566)

This is a clean implementation of Phase 3.5B. The architecture is sound — the state machine in useProofRun, the component split (panel vs. banner), and the backend polling endpoint are all well-structured. Tests cover the main paths. A few issues worth addressing:


Backend — proof_v2.py

_run_cache has no eviction

_run_cache: dict[str, dict] = {}

This grows unbounded for the lifetime of the server process. Every proof run adds an entry that's never removed. For a long-running server with frequent use this will be a memory leak. Consider a simple bounded cache (e.g., collections.OrderedDict with a max size of ~100 entries, popping the oldest on insert), or at minimum a comment noting this is an in-memory-only store that doesn't survive restarts.

all() on empty sequence returns True

passed = all(
    satisfied
    for gate_results in results.values()
    for _, satisfied in gate_results
)

If results is empty (no requirements evaluated), this will report passed=True. That's a misleading success signal. Consider passed = bool(results) and all(...).

status field should be narrower

status: str  # "running" | "complete"

The comment documents what it should be. Use Literal["running", "complete"] from typing to enforce it at the model level and get better OpenAPI docs.


Frontend — useProofRun.ts

Deduplication logic is duplicated verbatim (lines ~76–81 and ~95–101). Extract a small helper:

function dedupeByGate(entries: GateRunEntry[]): GateRunEntry[] {
  const seen = new Set<string>();
  return entries.filter(({ gate }) => seen.has(gate) ? false : (seen.add(gate), true));
}

Polling error swallows the actual error

} catch {
  setErrorMessage('Failed to retrieve run status. Please retry.');

The POST error handler surfaces the actual err.message, but the poll error handler uses a hardcoded string. Inconsistent — the poll failure message should include the underlying error for debuggability.

setGateEntries(uniqueEntries.length > 0 ? uniqueEntries : []) is redundantuniqueEntries is already [] when empty. Just setGateEntries(uniqueEntries).


Frontend — GateRunBanner.tsx

message prop is always ""
In page.tsx:

<GateRunBanner passed={passed} message="" onRetry={retry} />

The message prop is always an empty string so the <span> renders nothing. Either remove the prop and hardcode the message text, or pass the actual message from useProofRun's returned state (which already has response.message available if you surface it).

Hardcoded color classes
GateRunBanner uses raw Tailwind color classes (border-green-300, bg-green-50, text-green-900, bg-red-50, etc.) instead of the design system tokens. The error div in page.tsx uses border-destructive bg-destructive/10 which is consistent with Shadcn. The banner should follow the same pattern — at minimum the failure state should use destructive tokens for visual consistency.


GateRunPanel.tsx

STATUS_LABEL maps each key to itself — it's unused indirection. STATUS_LABEL[status] equals status. Either remove the map and render {status} directly, or add meaningful label text (e.g., running: 'Running…').


Tests

The 4 new TestGetRunStatus tests are well-scoped and cover the cases that matter. One note: the module-level _run_cache is shared across the entire test session. If any test leaves a stale entry, it persists. Currently fine since the tests don't depend on cache isolation, but worth keeping in mind as the test suite grows.


Summary

Severity Item
Medium _run_cache unbounded memory growth
Medium all() on empty results returns True
Low Literal typing for status field
Low Deduplication logic duplicated in useProofRun
Low Poll error swallows actual error message
Low message="" dead prop in GateRunBanner
Low Hardcoded color classes in GateRunBanner
Nit STATUS_LABEL is identity map — remove or add value

Core architecture, state machine, component decomposition, and test coverage are solid. The memory leak in _run_cache is the most important fix before merge. The rest can be addressed in a follow-up or in this PR.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
codeframe/ui/routers/proof_v2.py (1)

323-342: ⚠️ Potential issue | 🔴 Critical

This still blocks until the entire proof run finishes, and the run cache has no workspace isolation.

run_proof() is synchronous, so the client does not receive run_id until every gate has already completed. That means the new poller can never report real pending/running transitions, and the blocking gate execution also ties up the FastAPI event loop for the duration of the run. To satisfy the "live gate progress" behavior, this needs to return the run_id immediately and move execution/progress updates into background work or a threadpool-backed job store.

Additionally, _run_cache is a process-local, module-level dictionary keyed only by run_id with no workspace scoping. In a multi-workspace deployment, different workspaces can generate the same 8-character run_id, causing cache collisions. The GET endpoint retrieves cached results without verifying workspace ownership, creating isolation and reliability risks.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@codeframe/ui/routers/proof_v2.py` around lines 323 - 342, run_proof_endpoint
currently calls the synchronous run_proof and blocks until completion and uses a
process-global _run_cache keyed only by run_id; change run_proof_endpoint to
generate and immediately return run_id (keep run_id generation logic), then
dispatch execution of run_proof to a background task or threadpool (e.g.,
FastAPI BackgroundTasks or asyncio.create_task/wrapping run_proof in
run_in_executor) so progress updates and evidence are recorded asynchronously;
update the cache implementation (_run_cache) and any lookup code (including the
GET poller) to incorporate workspace scoping (e.g., include workspace.id or
workspace.name as part of the cache key or nest runs per workspace) and ensure
GET handlers verify the workspace from get_v2_workspace before returning cached
results to avoid cross-workspace collisions and leakage.
🧹 Nitpick comments (3)
web-ui/src/lib/api.ts (1)

635-638: Move the run request shape into @/types.

{ full: boolean } is the only new proof API contract still defined inline in the client. Please export a RunProofRequest type from web-ui/src/types/index.ts and use it here so the proof API surface stays centralized. As per coding guidelines "TypeScript types must be defined in web-ui/src/types/index.ts and API client functionality in web-ui/src/lib/api.ts".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web-ui/src/lib/api.ts` around lines 635 - 638, Add a RunProofRequest type to
the centralized types file and update the API client to use it: export an
interface/type named RunProofRequest (shape: { full: boolean }) from
web-ui/src/types/index.ts, then import RunProofRequest into
web-ui/src/lib/api.ts and replace the inline parameter type on startRun (the
function returning Promise<RunProofResponse>) from `{ full: boolean }` to
`RunProofRequest`, ensuring imports/exports are updated accordingly.
web-ui/src/components/proof/GateRunPanel.tsx (1)

17-21: Keep the status badges on the Nova gray palette.

The status pills are hard-coded to gray/blue/green/red utilities instead of shared theme variants. Please route these through the existing Badge styling/tokens so this panel stays consistent with the rest of web-ui/. As per coding guidelines "Web UI must use Shadcn/UI (Nova preset) with gray color scheme and Hugeicons (@hugeicons/react); never use lucide-react".

Also applies to: 38-39

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web-ui/src/components/proof/GateRunPanel.tsx` around lines 17 - 21,
STATUS_CLASSES in GateRunPanel.tsx currently hardcodes Tailwind color utilities;
replace this with the shared Badge styling/variant tokens from the web-ui design
system so badges use the Nova gray palette (e.g., map GateRunStatus keys to the
Badge component's preset variant names or class tokens instead of 'bg-...'
strings), update the places that render the status pills (including the
occurrences referenced around lines 38-39) to render <Badge variant="..."> (or
apply the badge token) rather than raw utility classes, and ensure you
import/use the Badge from the shared shadcn/ui Nova preset and use Hugeicons for
any icons instead of lucide-react.
web-ui/src/components/proof/GateRunBanner.tsx (1)

17-21: Use theme tokens instead of hard-coded success/error colors.

These green-* / red-* utilities bypass the Nova/Shadcn palette and will drift from the rest of the gray UI. Please map this banner to shared semantic variants/tokens instead of bespoke color classes. As per coding guidelines "Web UI must use Shadcn/UI (Nova preset) with gray color scheme and Hugeicons (@hugeicons/react); never use lucide-react".

Also applies to: 30-35

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web-ui/src/components/proof/GateRunBanner.tsx` around lines 17 - 21,
GateRunBanner.tsx uses hard-coded tailwind color utilities (green-*/red-*) for
the banner, text and dot which bypass the Nova/Shadcn theme; change these to the
shared semantic UI tokens/variants used across the app (e.g., the Shadcn/Nova
"success"/"error" semantic classes or the project's theme tokens) instead of
bespoke green/red classes, update the container, dot, title and message elements
in the GateRunBanner component and the similar block at lines ~30-35 to
reference those shared tokens/variants, and replace any non-approved icon libs
with `@hugeicons/react` per guidelines.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@codeframe/ui/routers/proof_v2.py`:
- Around line 48-49: The module-level cache _run_cache storing RunProofResponse
dicts is unsafe because it is process-local and doesn't record or enforce the
originating workspace, causing cross-worker 404s and unauthorized GET
/runs/{run_id} access; change the implementation so runs are persisted in shared
storage (or at minimum include workspace ownership in the cached value and
validate it on retrieval): replace or augment _run_cache entries to include
workspace_path (or write through to the existing shared store used elsewhere),
update functions that write/read runs (the POST/poller and the GET handler
around RunProofResponse usage) to key by workspace+run_id and to verify
workspace_path matches before returning a run, and remove reliance on pure
in-memory process-local storage to ensure multi-worker correctness.

In `@web-ui/src/app/proof/page.tsx`:
- Around line 258-275: The run-state UI (GateRunPanel, GateRunBanner, and the
error alert) is currently rendered only inside the "populated table" branch
and/or gated by gateEntries.length, so when requirements are empty or still
loading the progress/error surfaces never appear; move the block that
conditionally renders GateRunPanel (using runState and gateEntries), the
GateRunBanner (runState === 'complete', passed, retry) and the error alert
(runState === 'error', errorMessage, retry) out of the data.total > 0 branch and
place it above the table-rendering branch so the run flow UI is displayed
independently of data.total and before the table content is conditionally
rendered. Ensure you keep the same conditional checks (runState,
gateEntries.length, passed, errorMessage) and reuse the existing retry handler.

In `@web-ui/src/hooks/useProofRun.ts`:
- Around line 122-128: The retry() callback currently only clears local UI state
(calls clearPollInterval, setRunState('idle'), setGateEntries([]),
setPassed(null), setErrorMessage(null)) which makes "Retry" merely dismiss
instead of resuming the saved run; update retry to resume the existing runIdRef
by clearing the poll interval, resetting transient state, then re-start the
poll/flow for runIdRef.current (e.g., invoke the existing polling/start function
used elsewhere such as startPolling or resumeRun with runIdRef.current) and
setRunState to the appropriate running state; alternatively, if intended to be a
pure reset, rename retry to reset and update the UI copy accordingly to avoid
misleading "Retry" behavior.
- Around line 77-106: The polling using setInterval with an async callback can
cause overlapping requests and late responses to overwrite terminal state;
replace the setInterval-based loop (the block setting intervalRef.current) with
a serialized recursive poll function (e.g., an async pollRun that awaits
proofApi.getRun(workspaceRef.current, runIdRef.current), processes status into
gate entries via the current mapping logic, calls clearPollInterval when status
is 'complete', and schedules the next poll with setTimeout only after the
current request finishes), or alternatively add an explicit in-flight guard
boolean to prevent concurrent getRun calls; ensure uses of clearPollInterval,
setGateEntries, setPassed, setRunState, and error handling remain but are only
invoked from the serialized flow so late responses cannot overwrite terminal
states.

---

Outside diff comments:
In `@codeframe/ui/routers/proof_v2.py`:
- Around line 323-342: run_proof_endpoint currently calls the synchronous
run_proof and blocks until completion and uses a process-global _run_cache keyed
only by run_id; change run_proof_endpoint to generate and immediately return
run_id (keep run_id generation logic), then dispatch execution of run_proof to a
background task or threadpool (e.g., FastAPI BackgroundTasks or
asyncio.create_task/wrapping run_proof in run_in_executor) so progress updates
and evidence are recorded asynchronously; update the cache implementation
(_run_cache) and any lookup code (including the GET poller) to incorporate
workspace scoping (e.g., include workspace.id or workspace.name as part of the
cache key or nest runs per workspace) and ensure GET handlers verify the
workspace from get_v2_workspace before returning cached results to avoid
cross-workspace collisions and leakage.

---

Nitpick comments:
In `@web-ui/src/components/proof/GateRunBanner.tsx`:
- Around line 17-21: GateRunBanner.tsx uses hard-coded tailwind color utilities
(green-*/red-*) for the banner, text and dot which bypass the Nova/Shadcn theme;
change these to the shared semantic UI tokens/variants used across the app
(e.g., the Shadcn/Nova "success"/"error" semantic classes or the project's theme
tokens) instead of bespoke green/red classes, update the container, dot, title
and message elements in the GateRunBanner component and the similar block at
lines ~30-35 to reference those shared tokens/variants, and replace any
non-approved icon libs with `@hugeicons/react` per guidelines.

In `@web-ui/src/components/proof/GateRunPanel.tsx`:
- Around line 17-21: STATUS_CLASSES in GateRunPanel.tsx currently hardcodes
Tailwind color utilities; replace this with the shared Badge styling/variant
tokens from the web-ui design system so badges use the Nova gray palette (e.g.,
map GateRunStatus keys to the Badge component's preset variant names or class
tokens instead of 'bg-...' strings), update the places that render the status
pills (including the occurrences referenced around lines 38-39) to render <Badge
variant="..."> (or apply the badge token) rather than raw utility classes, and
ensure you import/use the Badge from the shared shadcn/ui Nova preset and use
Hugeicons for any icons instead of lucide-react.

In `@web-ui/src/lib/api.ts`:
- Around line 635-638: Add a RunProofRequest type to the centralized types file
and update the API client to use it: export an interface/type named
RunProofRequest (shape: { full: boolean }) from web-ui/src/types/index.ts, then
import RunProofRequest into web-ui/src/lib/api.ts and replace the inline
parameter type on startRun (the function returning Promise<RunProofResponse>)
from `{ full: boolean }` to `RunProofRequest`, ensuring imports/exports are
updated accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 957bddb9-a25f-4efe-839b-12016ae16362

📥 Commits

Reviewing files that changed from the base of the PR and between c29d021 and d834dfb.

📒 Files selected for processing (11)
  • codeframe/ui/routers/proof_v2.py
  • tests/ui/test_proof_v2.py
  • web-ui/src/__tests__/hooks/useProofRun.test.ts
  • web-ui/src/app/proof/page.tsx
  • web-ui/src/components/proof/GateRunBanner.tsx
  • web-ui/src/components/proof/GateRunPanel.tsx
  • web-ui/src/components/proof/index.ts
  • web-ui/src/hooks/index.ts
  • web-ui/src/hooks/useProofRun.ts
  • web-ui/src/lib/api.ts
  • web-ui/src/types/index.ts

Comment on lines +48 to +49
# Module-level cache: run_id → serialized RunProofResponse dict
_run_cache: dict[str, dict] = {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

A module-local cache is not a safe run registry.

This state is process-local and the cached payload does not record which workspace created it. In multi-worker deployments the poller can 404 when POST and GET land on different workers, and even in a single process GET /runs/{run_id} cannot verify that the current workspace_path owns the run. Persist runs in shared storage keyed by workspace + run_id, or at least store and enforce the originating workspace here.

Also applies to: 360-364, 386-401

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@codeframe/ui/routers/proof_v2.py` around lines 48 - 49, The module-level
cache _run_cache storing RunProofResponse dicts is unsafe because it is
process-local and doesn't record or enforce the originating workspace, causing
cross-worker 404s and unauthorized GET /runs/{run_id} access; change the
implementation so runs are persisted in shared storage (or at minimum include
workspace ownership in the cached value and validate it on retrieval): replace
or augment _run_cache entries to include workspace_path (or write through to the
existing shared store used elsewhere), update functions that write/read runs
(the POST/poller and the GET handler around RunProofResponse usage) to key by
workspace+run_id and to verify workspace_path matches before returning a run,
and remove reliance on pure in-memory process-local storage to ensure
multi-worker correctness.

Comment on lines +258 to +275
{/* Gate run progress and result */}
{(runState === 'starting' || runState === 'polling' || runState === 'complete' || runState === 'error') && gateEntries.length > 0 && (
<GateRunPanel gateEntries={gateEntries} />
)}
{runState === 'complete' && passed !== null && (
<GateRunBanner passed={passed} message="" onRetry={retry} />
)}
{runState === 'error' && errorMessage && (
<div
role="alert"
className="mb-4 flex items-center gap-3 rounded-lg border border-destructive bg-destructive/10 px-4 py-3"
>
<p className="text-sm text-destructive flex-1">{errorMessage}</p>
<Button variant="ghost" size="sm" onClick={retry} className="text-destructive hover:text-destructive">
Retry
</Button>
</div>
)}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Render the run-state UI outside the populated-table branch.

GateRunPanel, the completion banner, and the error alert only exist inside the data.total > 0 path, and the panel is additionally gated on gateEntries.length > 0. If a user starts a run before the requirements query resolves, or the workspace currently has zero requirements, the dedicated progress/error UI has nowhere to render even though the header button is enabled. Move these surfaces above this branch so the run flow is visible independently of table data.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web-ui/src/app/proof/page.tsx` around lines 258 - 275, The run-state UI
(GateRunPanel, GateRunBanner, and the error alert) is currently rendered only
inside the "populated table" branch and/or gated by gateEntries.length, so when
requirements are empty or still loading the progress/error surfaces never
appear; move the block that conditionally renders GateRunPanel (using runState
and gateEntries), the GateRunBanner (runState === 'complete', passed, retry) and
the error alert (runState === 'error', errorMessage, retry) out of the
data.total > 0 branch and place it above the table-rendering branch so the run
flow UI is displayed independently of data.total and before the table content is
conditionally rendered. Ensure you keep the same conditional checks (runState,
gateEntries.length, passed, errorMessage) and reuse the existing retry handler.

Comment on lines +77 to +106
intervalRef.current = setInterval(async () => {
try {
const status = await proofApi.getRun(workspaceRef.current, runIdRef.current);
if (status.status === 'complete') {
clearPollInterval();

// Build final gate entries with pass/fail status
const finalEntries: GateRunEntry[] = Object.values(status.results)
.flat()
.map((item) => ({
gate: item.gate,
status: item.satisfied ? ('passed' as GateRunStatus) : ('failed' as GateRunStatus),
}));
const seenFinal = new Set<string>();
const uniqueFinal = finalEntries.filter(({ gate }) => {
if (seenFinal.has(gate)) return false;
seenFinal.add(gate);
return true;
});

setGateEntries(uniqueFinal);
setPassed(status.passed);
setRunState('complete');
}
} catch {
clearPollInterval();
setErrorMessage('Failed to retrieve run status. Please retry.');
setRunState('error');
}
}, 2000);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect the polling loop and check whether a reentrancy guard exists nearby.
sed -n '76,106p' web-ui/src/hooks/useProofRun.ts
rg -n 'pollInFlight|isPolling|AbortController' web-ui/src/hooks/useProofRun.ts

Repository: frankbria/codeframe

Length of output: 1300


🏁 Script executed:

cat -n web-ui/src/hooks/useProofRun.ts

Repository: frankbria/codeframe

Length of output: 5458


Serialize polling; setInterval(async …) can race terminal state.

setInterval does not wait for the previous getRun() promise to settle. If a poll takes longer than 2s, the next tick starts another request, and a late completion or error can overwrite complete or error state after the interval has been cleared. Switch to a recursive setTimeout loop or add an explicit in-flight guard.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web-ui/src/hooks/useProofRun.ts` around lines 77 - 106, The polling using
setInterval with an async callback can cause overlapping requests and late
responses to overwrite terminal state; replace the setInterval-based loop (the
block setting intervalRef.current) with a serialized recursive poll function
(e.g., an async pollRun that awaits proofApi.getRun(workspaceRef.current,
runIdRef.current), processes status into gate entries via the current mapping
logic, calls clearPollInterval when status is 'complete', and schedules the next
poll with setTimeout only after the current request finishes), or alternatively
add an explicit in-flight guard boolean to prevent concurrent getRun calls;
ensure uses of clearPollInterval, setGateEntries, setPassed, setRunState, and
error handling remain but are only invoked from the serialized flow so late
responses cannot overwrite terminal states.

Comment on lines +122 to +128
const retry = useCallback(() => {
clearPollInterval();
setRunState('idle');
setGateEntries([]);
setPassed(null);
setErrorMessage(null);
}, [clearPollInterval]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Make retry() actually retry the run, or rename it to reset().

This callback is wired to the UI's Retry affordances, but the implementation only clears local state and returns to idle. That turns Retry into a dismiss action, and a poll failure gives up on the saved runIdRef instead of using it to resume the current flow. Either restart/resume here, or rename the action and update the button copy so the behavior is accurate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web-ui/src/hooks/useProofRun.ts` around lines 122 - 128, The retry() callback
currently only clears local UI state (calls clearPollInterval,
setRunState('idle'), setGateEntries([]), setPassed(null), setErrorMessage(null))
which makes "Retry" merely dismiss instead of resuming the saved run; update
retry to resume the existing runIdRef by clearing the poll interval, resetting
transient state, then re-start the poll/flow for runIdRef.current (e.g., invoke
the existing polling/start function used elsewhere such as startPolling or
resumeRun with runIdRef.current) and setRunState to the appropriate running
state; alternatively, if intended to be a pure reset, rename retry to reset and
update the UI copy accordingly to avoid misleading "Retry" behavior.

- Backend: scope _run_cache by (workspace.repo_path, run_id) to prevent
  cross-workspace cache collisions; GET /runs/{run_id} verifies workspace
  before cache lookup
- Types: export RunProofRequest interface from @/types (was inline)
- GateRunPanel: use Badge design system variants (backlog/in-progress/done/failed)
  instead of hardcoded Tailwind color classes
- GateRunBanner: use semantic design system tokens (bg-muted/30, border-destructive,
  text-destructive) consistent with app patterns
@claude
Copy link
Copy Markdown

claude bot commented Apr 7, 2026

Review posted via gh CLI - see full review below.

@claude
Copy link
Copy Markdown

claude bot commented Apr 7, 2026

Review: feat(web-ui) Run Gates button + live gate progress

This is a clean Phase 3.5B implementation. A few things worth addressing:

  1. _run_cache is unbounded and not multi-worker safe. Memory leak risk plus silent 404s when multiple uvicorn/gunicorn workers each have their own dict. Fix: add max size + TTL (e.g., evict entries older than 5 minutes, cap at 100 entries) and document the single-worker limitation.

  2. message is hardcoded to empty string in page.tsx. RunStatusResponse.message is stored in useProofRun but not exposed in UseProofRunReturn and is discarded. One-liner fix: add runMessage to hook return and thread through to GateRunBanner.

  3. Duplicate dedup logic in useProofRun.ts. The Set-based gate dedup pattern appears twice. Extract a dedupeGates helper inside the hook.

Minor: RunStatusResponse.status includes 'running' but GET can only ever return 'complete' (POST is sync) - worth a comment. TestGetRunStatus depends on _run_cache not being cleared between tests - consider resetting in setup_method.

What is working well: The idle-to-starting-to-polling-to-complete/error state machine is clean and well-tested (7 hook tests covering all branches including retry). Interval cleanup via useEffect return and explicit clearPollInterval on every exit path is correct - no timer leaks. Backend test coverage is solid. aria-label, role=alert, aria-live attributes are present throughout. Component scope is minimal with no scope creep.

The memory/multi-worker cache issue is the only thing worth resolving before merge. Everything else is polish.

@claude
Copy link
Copy Markdown

claude bot commented Apr 7, 2026

test2

@claude
Copy link
Copy Markdown

claude bot commented Apr 7, 2026

Review: feat(web-ui) Run Gates button + live gate progress (issue 566)

This is a clean Phase 3.5B implementation. The state machine design in useProofRun is well-structured, tests are thorough, and the UI integration is minimal/appropriate. A few things worth addressing:


1. _run_cache is unbounded and not multi-worker safe (Backend)

Two problems:

  • Memory leak: the dict grows forever. Every proof run adds an entry that is never evicted. Under normal usage this is low-risk but it will grow monotonically for the life of the process.
  • Multi-worker unsafe: with multiple uvicorn/gunicorn workers, a POST landing on worker A stores the result in A's in-memory dict. A GET poll from the frontend hitting worker B returns 404. This silently breaks polling in any production-like deployment.

Since POST /run is already synchronous, the GET endpoint exists purely for consistency with the client's polling model. A simple fix is to add a max cache size + TTL (e.g., evict entries older than 5 minutes, cap at 100 entries) to bound the leak and document the single-worker limitation clearly.


2. message is hardcoded to empty string in page.tsx

The run message from the API (e.g. 'Proof run complete: 3 requirement(s) evaluated.') is returned in RunStatusResponse.message and stored in useProofRun, but it's not exposed in UseProofRunReturn and is discarded. GateRunBanner already renders message when non-empty. This is a one-liner fix: add runMessage to the hook's return and thread it through.


3. Duplicate deduplication logic in useProofRun.ts

The gate dedup pattern (build a Set, filter with seen.has / seen.add) appears twice: once for the optimistic 'running' entries and once for the final pass/fail entries. Extract a small helper inside the hook. Minor DRY issue but makes both call sites cleaner.


Minor notes

  • RunStatusResponse.status type includes 'running' but the GET endpoint can only ever return 'complete' (since POST is synchronous). Fine to keep for forward compatibility, but worth a comment acknowledging the current POST-is-sync contract.
  • Test isolation: TestGetRunStatus tests depend on the module-level _run_cache not being cleared between tests. If test order ever changes or tests run in isolation, these could fail. Consider resetting _run_cache in a setup_method or fixture.

What's working well

  • The idle -> starting -> polling -> complete/error state machine is clean and well-tested (7 hook tests covering all branches including retry).
  • Interval cleanup via useEffect return and explicit clearPollInterval on every exit path is correct - no timer leaks.
  • Backend test coverage for the new endpoint is solid (200, shape, 404, result match).
  • aria-label, role='alert', aria-live attributes are present throughout - good accessibility hygiene.
  • Component scope is appropriately minimal; no scope creep.

The memory/multi-worker cache issue is the only thing I'd want resolved before merge. Everything else is polish.

@claude
Copy link
Copy Markdown

claude bot commented Apr 7, 2026

.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
codeframe/ui/routers/proof_v2.py (1)

48-49: ⚠️ Potential issue | 🟠 Major

Move proof run state out of the module-local cache.

This registry is still process-local. If POST /run and GET /runs/{run_id} land on different workers, a valid run_id will 404 even though the run succeeded. Persist the run state in shared storage instead of _run_cache.

Also applies to: 360-364, 386-401

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@codeframe/ui/routers/proof_v2.py` around lines 48 - 49, The module-level
cache _run_cache storing serialized RunProofResponse dicts must be replaced with
shared persistent storage so POST /run and GET /runs/{run_id} work across
workers: remove usage of _run_cache and instead persist the serialized
RunProofResponse under a key composed of (workspace_path, run_id) in your shared
store (e.g., DB/Redis/object store) when creating a run, and fetch/deserialize
that same key when serving GET /runs/{run_id}; update every place that reads or
writes _run_cache (including the blocks around the original declaration and the
regions noted at 360-364 and 386-401) to call the shared-store client, and
ensure you handle serialization, deserialization, errors, and optional
TTL/cleanup for completed runs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@codeframe/ui/routers/proof_v2.py`:
- Around line 152-159: The RunStatusResponse/polling flow currently always
returns a completed result because run_proof() finishes before the client
receives run_id and the poll handler hardcodes status="complete"; fix by
persisting a Run record (with run_id, status, partial results, message, passed
flag) before executing the proof, return that run_id immediately from
run_proof(), then execute the proof work asynchronously (background thread/task
or job queue) and update the persisted Run record status to "pending" →
"running" → "complete"/"failed" as gates progress; update the poll endpoint that
uses RunStatusResponse to read the current record and return live status and
partial results, and ensure functions/methods named run_proof(), the poll
handler that constructs RunStatusResponse, and any Run model/state storage are
modified accordingly.

---

Duplicate comments:
In `@codeframe/ui/routers/proof_v2.py`:
- Around line 48-49: The module-level cache _run_cache storing serialized
RunProofResponse dicts must be replaced with shared persistent storage so POST
/run and GET /runs/{run_id} work across workers: remove usage of _run_cache and
instead persist the serialized RunProofResponse under a key composed of
(workspace_path, run_id) in your shared store (e.g., DB/Redis/object store) when
creating a run, and fetch/deserialize that same key when serving GET
/runs/{run_id}; update every place that reads or writes _run_cache (including
the blocks around the original declaration and the regions noted at 360-364 and
386-401) to call the shared-store client, and ensure you handle serialization,
deserialization, errors, and optional TTL/cleanup for completed runs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 356e1484-a04f-479f-b4d4-a125da55775c

📥 Commits

Reviewing files that changed from the base of the PR and between d834dfb and 6311373.

📒 Files selected for processing (6)
  • codeframe/ui/routers/proof_v2.py
  • web-ui/src/components/proof/GateRunBanner.tsx
  • web-ui/src/components/proof/GateRunPanel.tsx
  • web-ui/src/hooks/useProofRun.ts
  • web-ui/src/lib/api.ts
  • web-ui/src/types/index.ts
✅ Files skipped from review due to trivial changes (1)
  • web-ui/src/types/index.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • web-ui/src/components/proof/GateRunBanner.tsx
  • web-ui/src/components/proof/GateRunPanel.tsx
  • web-ui/src/hooks/useProofRun.ts

Comment on lines +152 to +159
class RunStatusResponse(BaseModel):
"""Response for GET /runs/{run_id} — poll a completed run."""

run_id: str
status: str # "running" | "complete"
results: dict[str, list[dict[str, Any]]]
passed: bool
message: str
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

This backend cannot produce live gate progress yet.

run_proof() finishes before the client gets run_id, and the poll endpoint hardcodes status="complete" with final results only. That means the UI cannot receive real pending → running → passed/failed transitions from the backend, so the live-progress acceptance criteria is not actually met. Seed a run record before execution and update it during the run, or move the work to an async job.

Also applies to: 323-365, 374-402

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@codeframe/ui/routers/proof_v2.py` around lines 152 - 159, The
RunStatusResponse/polling flow currently always returns a completed result
because run_proof() finishes before the client receives run_id and the poll
handler hardcodes status="complete"; fix by persisting a Run record (with
run_id, status, partial results, message, passed flag) before executing the
proof, return that run_id immediately from run_proof(), then execute the proof
work asynchronously (background thread/task or job queue) and update the
persisted Run record status to "pending" → "running" → "complete"/"failed" as
gates progress; update the poll endpoint that uses RunStatusResponse to read the
current record and return live status and partial results, and ensure
functions/methods named run_proof(), the poll handler that constructs
RunStatusResponse, and any Run model/state storage are modified accordingly.

@frankbria frankbria merged commit ea8f4d2 into main Apr 7, 2026
14 checks passed
@frankbria frankbria deleted the feat/566-run-gates-button branch April 7, 2026 18:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Phase 3.5B] Run Gates: button, trigger, and live gate progress

1 participant