feat(approval): persist post-execution audit row alongside approval (#2135) by CodeGhost21 · Pull Request #2367 · tinyhumansai/openhuman

CodeGhost21 · 2026-05-20T19:41:22Z

Summary

Add executed_at, execution_outcome, and execution_error columns to pending_approvals so every approved tool call carries both a before-row (approval decision) and an after-row (terminal execution status).
New ApprovalGate::intercept_audited returns the persisted request_id so the four agent paths that gate external-effect tools can stamp the terminal status when the tool finishes.
New store::record_execution enforces monotonic audit (decided_at IS NOT NULL guard) and caps error strings at 512 chars to keep upstream crash dumps out of the audit log.

Problem

pending_approvals rows transition from (no decided_at) → (decided_at, decision) and stop there. There is no terminal entry recording whether the approved tool actually ran, what it returned, or whether it failed. A compliance reviewer can see approved but not executed — they have to cross-reference the agent transcript, which is the wrong place for an audit trail. Issue #2135 calls this gap out explicitly under "Tests cover audit write before AND after an action".

Solution

Approval domain

New ExecutionOutcome enum (Success / Failure / Aborted).
pending_approvals table gets three new columns. Idempotent PRAGMA table_info migration upgrades v1 DBs in place — pre-Make approvals and audit durable across non-CLI channels #2135 rows back-fill as NULL so older audit history stays readable.
store::record_execution(request_id, outcome, error) gated by decided_at IS NOT NULL so the audit trail stays monotonic (no executed before approved rows possible). Error strings capped at 512 chars + ellipsis.
ApprovalGate::intercept_audited(...) returns (GateOutcome, Option<String>). Legacy intercept(...) is a thin wrapper that drops the id so call sites that don't care about after-execution audit (CLI, tests) keep their current shape.
ApprovalGate::record_execution(...) wraps the store call with best-effort logging — audit-write failure is logged but never bubbles back to the agent (the tool already ran).

Wiring (4 call sites)

agent/harness/tool_loop.rs — capture request_id from the gate, stamp Success/Failure with the truncated tool result text after the tool finishes.
agent/harness/subagent_runner/ops.rs — same shape for sub-agent-dispatched tool calls.
channels/proactive.rs — stamp Success on ch.send Ok, Failure with the error message on Err.
agent/triage/escalation.rs — stamp Success/Failure on the triage-dispatched sub-agent outcome.

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason.

Tests added or updated (happy path + at least one failure / edge case) — 8 new tests; cargo test --lib approval → 71 passed, 0 failed.
Diff coverage ≥ 80% — every changed line in store.rs / gate.rs is exercised by a new or updated test. Four call-site changes are 5–10 lines each, all on hot paths already covered by existing harness tests.
Coverage matrix updated — N/A: behaviour-only change to existing approval surface; no new feature row added/removed/renamed.
All affected feature IDs from the matrix are listed in ## Related — N/A: no matrix row touched.
No new external network dependencies introduced — pure SQLite + in-process gate code.
Manual smoke checklist updated if this touches release-cut surfaces — N/A: audit-trail addition only; no user-visible UI flow changes when approvals work normally.
Linked issue closed via Closes #NNN in the ## Related section — see below.

Impact

Runtime/platform: desktop core + Tauri shell. Approval domain is in-process.
User-visible: none under happy-path operation. The audit trail in ~/.openhuman/approval/approval.db now carries executed_at + execution_outcome + execution_error per approved row; UI surfaces that consume these rows will see the new fields.
Performance / security: one extra SQLite UPDATE per gated tool call (best-effort, async-aware tracing on failure). Error strings are capped at 512 chars so a crash dump can't fill the audit DB.
Migration / compatibility: schema migration is idempotent and happens on every with_connection call. v1 DBs (older builds) get the new columns added in place, with existing rows back-filled as NULL.

Tests

cargo test --lib approval → 71 passed, 0 failed (8 new, 63 pre-existing).

New test	Covers
`record_execution_writes_terminal_audit_row_after_decide`	before-decide attempts must be a no-op; after decide they patch the row
`record_execution_persists_outcome_and_redacted_error`	Failure with error text round-trips through SQLite
`record_execution_caps_long_error_messages`	2 000-char input is capped to ~512 chars + ellipsis
`record_execution_unknown_id_is_safe_noop`	stray `record_execution` for an id that was never persisted reports `false` without writing
`migrate_columns_is_idempotent_on_v1_databases`	synthesise a pre-#2135 schema, prove the migration adds the new columns without losing existing rows AND that a second open is a no-op
`execution_outcome_round_trips` / `_serializes_as_snake_case`	on-wire shape for the new enum
`intercept_audited_returns_request_id_only_when_allowed_and_persisted`	Allow path returns the id so callers can `record_execution`
`intercept_audited_returns_none_id_for_denied_and_allowlisted`	Deny path and session-allowlist shortcut both return `None` (nothing to record afterward)

Closes Make approvals and audit durable across non-CLI channels #2135
Companion: Add approval audit history read path #2335 (audit history read path — surfaces the data this PR now also writes)
Companion: Fix expired pending approvals lingering after restart #2357 (lazy expiry of stale pending approvals after restart)
Follow-up PR(s)/TODOs: policy-callback hook for unattended channel/webhook denial (managed-runtime use case in the issue) — separate slice.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Key: N/A
URL: N/A (GitHub-only issue Make approvals and audit durable across non-CLI channels #2135)

Commit & Branch

Branch: fix/2135-audit-action-execution (branched from origin/main after fresh fetch)
Commit SHA: see PR head

Validation Run

pnpm --filter openhuman-app format:check — N/A: no frontend changes.
pnpm typecheck — N/A: no TypeScript changes.
Focused tests: cargo test --lib approval → 71 passed, 0 failed.
Rust fmt/check (if changed): cargo fmt --manifest-path Cargo.toml applied; cargo check clean.
Tauri fmt/check (if changed): N/A: no Tauri shell changes.

Validation Blocked

command: N/A
error: N/A
impact: N/A

Behavior Changes

Intended behavior change: every gated tool call now leaves both a before-row (approval decision) and an after-row (terminal status) on the same pending_approvals row.
User-visible effect: audit-history surfaces will show executed_at + outcome per approved row; agent / UI under normal operation are unchanged.

Parity Contract

Legacy behavior preserved: intercept(...) keeps its existing GateOutcome shape (it now wraps intercept_audited and drops the id). CLI and existing tests don't need changes.
Guard/fallback/dispatch parity checks: record_execution is gated by decided_at IS NOT NULL so the audit trail stays monotonic; idempotent column migration is tested against a synthetic v1 schema.

Duplicate / Superseded PR Handling

Duplicate PR(s): none for this slice. Add approval audit history read path #2335 (audit history read) and Fix expired pending approvals lingering after restart #2357 (expiry) cover orthogonal acceptance criteria of the same issue.
Canonical PR: this PR for the before+after audit slice.
Resolution: N/A.

Summary by CodeRabbit

New Features
- Approval flow now captures audited request IDs and records terminal execution outcomes (success/failure/aborted) for approved external actions
- Audit trail extended to tool calls, escalations, and proactive external sends
Chores
- Database migration adds execution/audit columns and live-upgrade support to persist terminal outcomes and preserve immutability
Tests
- Added tests for audited approvals, execution recording, truncation/redaction, and migration behavior

…inyhumansai#2135) Approval gate currently writes a single durable row per intercepted tool call: it captures *the action was approved at X* but never the matching *the action ran at Y with outcome Z*. The acceptance criterion in tinyhumansai#2135 requires tests for audit writes both before AND after an action — this commit closes that gap. Approval domain - New `ExecutionOutcome` enum (`Success` / `Failure` / `Aborted`). - `pending_approvals` SQLite table gets three new columns: `executed_at`, `execution_outcome`, `execution_error`. Idempotent PRAGMA-table_info migration upgrades v1 DBs in place — pre-tinyhumansai#2135 rows back-fill as NULL so older audit history stays readable. - New `store::record_execution(request_id, outcome, error)`. Gated by `decided_at IS NOT NULL` so the audit trail stays monotonic (no "executed before approved" rows possible). Error strings are capped at 512 chars so upstream crash dumps can't fill the audit log; the agent still sees the full error in its tool-result envelope. - New `ApprovalGate::intercept_audited(...)` returns `(GateOutcome, Option<String>)`. The legacy `intercept(...)` is a thin wrapper that drops the id, so call sites that don't care about after-execution audit (CLI, tests) keep their current shape. - New `ApprovalGate::record_execution(...)` wraps the store call with best-effort logging — audit-write failure is logged but never bubbles back to the agent (the tool already ran). Wiring (4 call sites) - `agent/harness/tool_loop.rs`: capture request_id from the gate, stamp Success/Failure with the truncated tool result text after the tool finishes. - `agent/harness/subagent_runner/ops.rs`: same shape for sub-agent-dispatched tool calls. - `channels/proactive.rs`: stamp Success on `ch.send` Ok, Failure with the error message on Err. - `agent/triage/escalation.rs`: stamp Success/Failure on the triage-dispatched sub-agent outcome. Tests (8 new, 71 passing in `cargo test --lib approval`) - `record_execution_writes_terminal_audit_row_after_decide` — before-decide attempts must be a no-op; after decide they patch the row. - `record_execution_persists_outcome_and_redacted_error` — Failure with error text round-trips through SQLite. - `record_execution_caps_long_error_messages` — 2 000-char input is capped to ~512 chars + ellipsis. - `record_execution_unknown_id_is_safe_noop` — stray record_execution for an id that was never persisted reports `false` without writing. - `migrate_columns_is_idempotent_on_v1_databases` — synthesise a pre-tinyhumansai#2135 schema, prove the migration adds the new columns without losing existing rows AND that a second open is a no-op. - `execution_outcome_round_trips` / `_serializes_as_snake_case` — on-wire shape for the new enum. - `intercept_audited_returns_request_id_only_when_allowed_and_persisted` — Allow path returns the id so callers can record_execution. - `intercept_audited_returns_none_id_for_denied_and_allowlisted` — Deny path and session-allowlist shortcut both return None (nothing to record afterward). Out of scope (companion / follow-up PRs) - `composio.list_recent_decisions` audit-history read path — owned by tinyhumansai#2335 (open). - Lazy expiry of stale pending rows after restart — owned by tinyhumansai#2357 (open). - Policy-callback hook for unattended channel/webhook denial — follow-up.

coderabbitai · 2026-05-20T19:41:36Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds audited approval intercepts that return request IDs, DB migration and columns for execution audit, an ExecutionOutcome enum, and wiring to record terminal success/failure across tool harness, escalation, and proactive external-channel sends.

Changes

Approval audit recording

Layer / File(s)	Summary
ExecutionOutcome type and audited gate APIs `src/openhuman/approval/types.rs`, `src/openhuman/approval/gate.rs`, `src/openhuman/approval/mod.rs`	`ExecutionOutcome` (Success/Failure/Aborted) is added. `ApprovalGate::intercept_audited()` returns `(GateOutcome, Option<request_id>)` and `record_execution()` writes terminal outcome; gate tests verify request-id threading and allowlist shortcut behavior.
Approval store schema migration and persistence `src/openhuman/approval/store.rs`	`pending_approvals` gains `executed_at`, `execution_outcome`, `execution_error`. `migrate_columns()` idempotently upgrades older DBs on open. New `record_execution()` persists outcome/error only after `decided_at` and when not already stamped; error text is sanitized and capped/truncated. Tests cover migration, truncation, idempotency, unknown-id safety, and redaction.
Tool harness & escalation integration `src/openhuman/agent/harness/tool_loop.rs`, `src/openhuman/agent/harness/subagent_runner/ops.rs`, `src/openhuman/agent/triage/escalation.rs`	Tool-loop, subagent runner, and escalation use `intercept_audited`, retain optional `approval_request_id` and gate handle, then call `record_execution` after execution to stamp Success/Failure (exec success is derived from structured result, not text prefixes). Existing allow/deny flows remain.
Proactive message channel audited integration `src/openhuman/channels/proactive.rs`	External-channel proactive sends call `intercept_audited` and record Success/Failure back to the approval audit row after send completes; web-channel delivery is unchanged.
i18n minor update `app/src/lib/i18n/chunks/de-5.ts`	Trailing mascot translation entries adjusted.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

tinyhumansai/openhuman#2149: Prior changes to external-effect ApprovalGate flow; this PR extends that flow with audited intercepts and execution recording.
tinyhumansai/openhuman#2335: Adds related approval-audit history read APIs over decided audit rows that this change stamps.
tinyhumansai/openhuman#2357: Changes pending_approvals lifecycle and interacts with the same row-state transitions updated here.

Suggested reviewers

senamakel

Poem

🐰 I found a tiny request id in my hat,
The gate gave a nod and then that was that.
I ran the tool, then penned the tale:
success or fail — stamped without fail.
Hopping home, I left a tidy audit trail.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately reflects the main change: adding post-execution audit row persistence alongside approval decisions, as evidenced by all core files modifying approval storage and audit recording.
Linked Issues check	✅ Passed	The PR successfully implements all coding requirements from `#2135`: durable storage of approval decisions/audit entries, audited tool execution paths across four call sites (tool_loop, subagent_runner, escalation, proactive channels), ExecutionOutcome enum for terminal status tracking, and error redaction with truncation.
Out of Scope Changes check	✅ Passed	The PR includes one minor out-of-scope change: a German i18n file (`de-5.ts`) was modified to fix duplicate keys unrelated to approval auditing, though this appears to be a merge conflict resolution.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/openhuman/approval/gate.rs (1)

200-255: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Handle approval-vs-timeout races before forcing Deny.

decide() commits the approval in SQLite before it sends on the waiter. If that commit lands right as the TTL expires, this branch can still time out, return GateOutcome::Deny, and drop the request_id even though the row is already approved. That leaves the durable audit saying “approved” while the tool never runs and no execution outcome can be recorded. Please make the timeout path re-check the persisted decision (or otherwise serialize timeout vs. decide atomically) before denying.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/approval/gate.rs` around lines 200 - 255, The timeout branch
currently unconditionally denies even if store::decide committed an approval
concurrently; modify the Err(_elapsed) handling in the gate wait logic to
re-check the persisted decision before returning Deny: after
self.evict_waiter(&request_id) (and before logging/returning), call the store
read method that fetches the stored ApprovalDecision for request_id (the same
persistence API that decide() writes), and if the persisted decision is Approve
return (GateOutcome::Allow, Some(request_id)) instead of denying; only if the
persisted decision is Deny or missing proceed with the existing deny path and
calls to store::decide. Ensure you reference self.ttl, rx, store::decide,
evict_waiter, and GateOutcome when locating the fix.

src/openhuman/agent/harness/subagent_runner/ops.rs (1)

1524-1550: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the structured execution result for audit status, not the rendered text.

!raw.starts_with("Error") is not a reliable success signal. A successful tool payload can legitimately begin with "Error" (for example copied logs or search results), which will persist a false Failure in the new audit row. Carry a success bool out of the timeout/execute match instead of re-parsing the display string.

Suggested fix

-                    let raw = match tokio::time::timeout(timeout, tool.execute(args)).await {
-                        Ok(Ok(result)) => {
-                            let raw = result.output();
-                            if result.is_error {
-                                format!("Error: {raw}")
-                            } else {
-                                raw
-                            }
-                        }
-                        Ok(Err(err)) => format!("Error executing {}: {err}", call.name),
-                        Err(_) => format!("Error: tool '{}' timed out", call.name),
-                    };
+                    let (raw, success) = match tokio::time::timeout(timeout, tool.execute(args)).await {
+                        Ok(Ok(result)) => {
+                            let raw = result.output();
+                            if result.is_error {
+                                (format!("Error: {raw}"), false)
+                            } else {
+                                (raw, true)
+                            }
+                        }
+                        Ok(Err(err)) => (format!("Error executing {}: {err}", call.name), false),
+                        Err(_) => (format!("Error: tool '{}' timed out", call.name), false),
+                    };
                     // Stamp the terminal status onto the
                     // pending_approvals audit row — best-effort,
                     // failures don't propagate to the agent (`#2135`).
                     if let (Some(gate), Some(req_id)) = (
                         approval_gate_for_audit.as_ref(),
                         approval_request_id.as_ref(),
                     ) {
-                        let success = !raw.starts_with("Error");
                         let exec_outcome = if success {
                             crate::openhuman::approval::ExecutionOutcome::Success
                         } else {
                             crate::openhuman::approval::ExecutionOutcome::Failure
                         };

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/harness/subagent_runner/ops.rs` around lines 1524 - 1550,
The audit success is being derived from the rendered text with
!raw.starts_with("Error"), which is brittle; instead capture a boolean success
from the tokio::time::timeout(...) match that distinguishes Ok(Ok(result)) vs
Ok(Err(_)) vs Err(_), and use that success value when computing exec_outcome and
err_text before calling gate.record_execution; update the match around
tool.execute(args) (the block that defines raw) to also set a success flag
(e.g., success: bool) and use that flag in the approval_gate_for_audit /
approval_request_id handling rather than parsing raw.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/approval/store.rs`:
- Around line 259-266: The truncation currently takes 512 chars then appends "…"
so stored strings can become 513 chars; update the closure that builds
trimmed_error (the error.map(|e| { ... }) block) to ensure the final string
never exceeds 512 characters by either taking 511 characters and then appending
"…" when truncated or by hard-truncating to 512 without appending; change the
chars.take(...) logic accordingly so trimmed_error respects the 512-character
cap.
- Around line 267-275: The UPDATE in the pending_approvals write currently
allows overwriting executed_at/execution_outcome/execution_error on retries;
modify the SQL in the execute call that sets executed_at, execution_outcome,
execution_error (the same execute invocation that uses params![now,
outcome.as_str(), trimmed_error, request_id]) to add "AND executed_at IS NULL"
to the WHERE clause (in addition to the existing "decided_at IS NOT NULL") so
the first terminal report wins and subsequent calls are ignored.

---

Outside diff comments:
In `@src/openhuman/agent/harness/subagent_runner/ops.rs`:
- Around line 1524-1550: The audit success is being derived from the rendered
text with !raw.starts_with("Error"), which is brittle; instead capture a boolean
success from the tokio::time::timeout(...) match that distinguishes
Ok(Ok(result)) vs Ok(Err(_)) vs Err(_), and use that success value when
computing exec_outcome and err_text before calling gate.record_execution; update
the match around tool.execute(args) (the block that defines raw) to also set a
success flag (e.g., success: bool) and use that flag in the
approval_gate_for_audit / approval_request_id handling rather than parsing raw.

In `@src/openhuman/approval/gate.rs`:
- Around line 200-255: The timeout branch currently unconditionally denies even
if store::decide committed an approval concurrently; modify the Err(_elapsed)
handling in the gate wait logic to re-check the persisted decision before
returning Deny: after self.evict_waiter(&request_id) (and before
logging/returning), call the store read method that fetches the stored
ApprovalDecision for request_id (the same persistence API that decide() writes),
and if the persisted decision is Approve return (GateOutcome::Allow,
Some(request_id)) instead of denying; only if the persisted decision is Deny or
missing proceed with the existing deny path and calls to store::decide. Ensure
you reference self.ttl, rx, store::decide, evict_waiter, and GateOutcome when
locating the fix.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3047abd0-a16b-41a4-9e69-d03fa27fb0ea

📥 Commits

Reviewing files that changed from the base of the PR and between fa8d75f and 58c3264.

📒 Files selected for processing (8)

src/openhuman/agent/harness/subagent_runner/ops.rs
src/openhuman/agent/harness/tool_loop.rs
src/openhuman/agent/triage/escalation.rs
src/openhuman/approval/gate.rs
src/openhuman/approval/mod.rs
src/openhuman/approval/store.rs
src/openhuman/approval/types.rs
src/openhuman/channels/proactive.rs

…ror at 512 incl. ellipsis + make terminal write immutable Two CodeRabbit findings on src/openhuman/approval/store.rs: 1. **Minor — cap inclusive of ellipsis.** The previous truncation took 512 chars then appended "…", so stored values became 513 chars when truncation kicked in. Fixed by taking 511 chars when the input would overflow and appending the marker; short inputs pass through unchanged. New test asserts the truncated value is exactly 512 chars including the ellipsis. 2. **Major — terminal write must be immutable.** The UPDATE used `WHERE request_id = ?4 AND decided_at IS NOT NULL`, which let any later `record_execution` call silently overwrite executed_at / execution_outcome / execution_error. Added `AND executed_at IS NULL` so the first terminal report wins; a late retry / cleanup path now reports `false` and leaves the original audit row intact. New test `record_execution_is_idempotent_after_first_terminal_report_wins` pins the contract. `cargo test --lib approval::store` → 13 passed, 0 failed (13 = 11 pre-existing + 1 new idempotency test; the cap test was already there but its assertion now uses the exact 512-char count).

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/approval/store.rs`:
- Around line 263-269: The current code only truncates the execution_error
string via trimmed_error and still persists raw backend/tool text; update the
persistence step to run execution_error through the same audit-redaction routine
used for request payloads (use the project's audit redaction function that
sanitizes tokens, emails, and PII) instead of simple truncation, or
alternatively replace the stored value with a sanitized summary/error code;
ensure you update handling around trimmed_error/execution_error in store.rs so
the redacted result (not the raw or just-truncated string) is what gets written
to the durable audit table.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5efd6e5c-d848-43f2-91bd-c2e8fd108125

📥 Commits

Reviewing files that changed from the base of the PR and between 58c3264 and c439549.

📒 Files selected for processing (1)

src/openhuman/approval/store.rs

…humansai#2367 review) Run the upstream error blurb through memory::safety::sanitize_text *before* truncating to 512 chars so the durable audit row cannot leak bearer tokens, API keys, private-key blocks, OAuth params, or government-ID PII the failing tool may have echoed back. Truncating first would split a secret mid-string and dodge the regex matchers. Adds record_execution_redacts_secrets_in_error_message asserting a bearer token and provider key are scrubbed from the stored row and that the [REDACTED] marker remains visible to audit readers.

# Conflicts: # src/openhuman/approval/store.rs

…ess (tinyhumansai#2367 review) - gate.rs: when the TTL elapses concurrently with a committed approval, `store::decide(Deny)` silently no-ops (its `WHERE decided_at IS NULL` guard preserves the existing row), but the gate previously returned Deny anyway — leaving the durable audit row saying Approved while the tool never ran. The timeout path now re-reads the persisted decision via the new `store::get_decision` and honors an existing Approve. - subagent_runner/ops.rs: replace `!raw.starts_with(\"Error\")` audit signal with a structured `exec_success` bool captured from the `tokio::time::timeout` match. Legitimate success payloads can start with the literal word \"Error\" (search hits, copied logs) and would otherwise persist a false Failure on the audit row. - store.rs: add `get_decision(config, request_id)` read helper used by the gate race resolution, plus a unit test covering missing/undecided/ decided round-trip. Addresses CodeRabbit review on PR tinyhumansai#2367: - src/openhuman/approval/gate.rs lines 200-255 (timeout vs decide race) - src/openhuman/agent/harness/subagent_runner/ops.rs lines 1524-1550 (brittle string-prefix success detection)

The merge from main brought back PR tinyhumansai#2367's older de-5.ts that still had duplicate keys for settings.developerMenu.mcpServer.* and settings.mcpServer.* — the canonical fix on main lives at lines 211-235. TS1117 was failing CI on this branch (Type Check TypeScript). Drop the duplicates so this file matches main.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/openhuman/approval/store.rs (1)

319-328: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Only stamp execution on approved rows.

decided_at IS NOT NULL also matches Deny/auto-expired rows, so a stray caller can create an impossible audit trail with decision = deny and executed_at populated. Please restrict this update to approval decisions only at the store boundary.

Suggested fix

+        let approve_once = ApprovalDecision::ApproveOnce.as_str();
+        let approve_always = ApprovalDecision::ApproveAlwaysForTool.as_str();
         let updated = conn
             .execute(
                 "UPDATE pending_approvals
                  SET executed_at = ?1,
                      execution_outcome = ?2,
                      execution_error = ?3
                  WHERE request_id = ?4
-                   AND decided_at IS NOT NULL
+                   AND decided_at IS NOT NULL
+                   AND decision IN (?5, ?6)
                    AND executed_at IS NULL",
-                params![now, outcome.as_str(), trimmed_error, request_id],
+                params![
+                    now,
+                    outcome.as_str(),
+                    trimmed_error,
+                    request_id,
+                    approve_once,
+                    approve_always,
+                ],
             )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/approval/store.rs` around lines 319 - 328, The UPDATE currently
matches any decided row (including Deny/expired); modify the SQL in the
conn.execute call that updates pending_approvals so it only updates rows whose
decision is the approval value (e.g., add "AND decision = 'approve'" to the
WHERE clause), keeping the rest of the parameters (now, outcome.as_str(),
trimmed_error, request_id) and semantics the same so executed_at can only be
stamped for approved requests.

src/openhuman/agent/harness/subagent_runner/ops.rs (1)

1583-1619: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Thread exec_success through to the downstream status path.

This fixes the audit write, but the later call_success / text-mode status logic still falls back to !result_text.starts_with("Error"). A successful payload that begins with "Error" will now persist ExecutionOutcome::Success while the progress event and <tool_result status="..."> still report failure.

Suggested fix

-            let result_text = if !allowed_names.contains(&call.name) {
+            let (result_text, call_success) = if !allowed_names.contains(&call.name) {
                 tracing::warn!(
                     task_id = %task_id,
                     agent_id = %agent_id,
                     tool = %call.name,
                     "[subagent_runner] tool not in allowlist for this sub-agent"
                 );
                 let mut available: Vec<&str> = allowed_names.iter().map(|s| s.as_str()).collect();
                 if let Some(resolver) = lazy_resolver.as_ref() {
                     available.extend(resolver.known_slugs());
                 }
                 available.sort_unstable();
                 available.dedup();
-                format!(
+                (
+                    format!(
                         "Error: tool '{}' is not available to the {} sub-agent. Available tools: {}",
                         call.name,
                         agent_id,
                         available.join(", ")
-                )
+                    ),
+                    false,
+                )
             } else if let Some(tool) = extra_tools
                 .iter()
                 .find(|t| t.name() == call.name)
                 .or_else(|| parent_tools.iter().find(|t| t.name() == call.name))
             {
@@
                 if let Some(reason) = gate_denial {
-                    format!("Error: {reason}")
+                    (format!("Error: {reason}"), false)
                 } else {
                     let (raw, exec_success) =
                         match tokio::time::timeout(timeout, tool.execute(args)).await {
@@
                     if let (Some(gate), Some(req_id)) = (
                         approval_gate_for_audit.as_ref(),
                         approval_request_id.as_ref(),
                     ) {
-                        let success = exec_success;
-                        let exec_outcome = if success {
+                        let exec_outcome = if exec_success {
                             crate::openhuman::approval::ExecutionOutcome::Success
                         } else {
                             crate::openhuman::approval::ExecutionOutcome::Failure
                         };
-                        let err_text = if success { None } else { Some(raw.as_str()) };
+                        let err_text = if exec_success { None } else { Some(raw.as_str()) };
                         gate.record_execution(req_id, exec_outcome, err_text);
                     }
-                    raw
+                    (raw, exec_success)
                 }
             } else {
-                format!("Unknown tool: {}", call.name)
+                (format!("Unknown tool: {}", call.name), false)
             };
@@
-            let call_success = !result_text.starts_with("Error");
             let call_output_chars = result_text.chars().count();

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/harness/subagent_runner/ops.rs` around lines 1583 - 1619,
The code currently computes exec_success from the structured execute result but
later still derives call_success / text-mode status by checking
result_text.starts_with("Error"), causing mismatches when a successful payload
begins with "Error"; update the downstream status and progress-event logic (the
places that compute call_success and status and emit the <tool_result
status="..."> or progress events) to use the previously computed exec_success
boolean instead of parsing raw/result_text, and pass exec_success through to any
variables or function calls that currently accept call_success or status so the
audit record, progress event, and tool_result all use the same ExecutionOutcome.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/openhuman/agent/harness/subagent_runner/ops.rs`:
- Around line 1583-1619: The code currently computes exec_success from the
structured execute result but later still derives call_success / text-mode
status by checking result_text.starts_with("Error"), causing mismatches when a
successful payload begins with "Error"; update the downstream status and
progress-event logic (the places that compute call_success and status and emit
the <tool_result status="..."> or progress events) to use the previously
computed exec_success boolean instead of parsing raw/result_text, and pass
exec_success through to any variables or function calls that currently accept
call_success or status so the audit record, progress event, and tool_result all
use the same ExecutionOutcome.

In `@src/openhuman/approval/store.rs`:
- Around line 319-328: The UPDATE currently matches any decided row (including
Deny/expired); modify the SQL in the conn.execute call that updates
pending_approvals so it only updates rows whose decision is the approval value
(e.g., add "AND decision = 'approve'" to the WHERE clause), keeping the rest of
the parameters (now, outcome.as_str(), trimmed_error, request_id) and semantics
the same so executed_at can only be stamped for approved requests.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e56c4e17-8529-4f93-96b6-4ba62d4193ec

📥 Commits

Reviewing files that changed from the base of the PR and between 5524460 and 8299906.

📒 Files selected for processing (5)

app/src/lib/i18n/chunks/de-5.ts
src/openhuman/agent/harness/subagent_runner/ops.rs
src/openhuman/agent/harness/tool_loop.rs
src/openhuman/approval/gate.rs
src/openhuman/approval/store.rs

✅ Files skipped from review due to trivial changes (1)

app/src/lib/i18n/chunks/de-5.ts

The 500ms TTL was racing the 50×10ms poll loop on slow CI runners — list_pending's lazy-expire would deny the row before decide() could fire, surfacing as "pending row never appeared" in approve_once_returns_allow on Linux CI. 2s gives the polling tests headroom while keeping timeout_returns_deny fast.

CodeGhost21 requested a review from a team May 20, 2026 19:41

coderabbitai Bot added agent Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/. working A PR that is being worked on by the team. labels May 20, 2026

coderabbitai Bot requested changes May 20, 2026

View reviewed changes

Comment thread src/openhuman/approval/store.rs Outdated

Comment thread src/openhuman/approval/store.rs

coderabbitai Bot requested changes May 20, 2026

View reviewed changes

Comment thread src/openhuman/approval/store.rs Outdated

Merge remote-tracking branch 'origin/main' into pr/2367-merge

f864634

coderabbitai Bot added feature Net-new user-facing capability or product behavior. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. labels May 22, 2026

coderabbitai Bot previously approved these changes May 22, 2026

View reviewed changes

senamakel self-assigned this May 23, 2026

senamakel added 2 commits May 22, 2026 17:47

Merge branch 'main' into pr/2367

565f31f

# Conflicts: # src/openhuman/approval/store.rs

senamakel dismissed coderabbitai[bot]’s stale review via 41091c9 May 23, 2026 00:57

coderabbitai Bot reviewed May 23, 2026

View reviewed changes

coderabbitai Bot previously approved these changes May 23, 2026

View reviewed changes

senamakel dismissed coderabbitai[bot]’s stale review via 6f35240 May 23, 2026 02:05

senamakel merged commit 2f5db77 into tinyhumansai:main May 23, 2026
26 checks passed

coderabbitai Bot mentioned this pull request May 23, 2026

feat: add ToolPolicy middleware for tool-loop allow/deny gate #2303

Open

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(approval): persist post-execution audit row alongside approval (#2135)#2367

feat(approval): persist post-execution audit row alongside approval (#2135)#2367
senamakel merged 8 commits into
tinyhumansai:mainfrom
CodeGhost21:fix/2135-audit-action-execution

CodeGhost21 commented May 20, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 20, 2026 •

edited

Loading

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

CodeGhost21 commented May 20, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Solution

Approval domain

Wiring (4 call sites)

Submission Checklist

Impact

Tests

Related

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Commit & Branch

Validation Run

Validation Blocked

Behavior Changes

Parity Contract

Duplicate / Superseded PR Handling

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

CodeGhost21 commented May 20, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 20, 2026 •

edited

Loading