
Fix #1044: test failures 2026-04-28 #853

Open
AlexMikhalev wants to merge 2256 commits into main from task/1044-test-failures-fix

Conversation

@AlexMikhalev
Contributor

Summary

Fixes a test assertion in exit_codes_integration_test.rs that expected a custom error message when listen mode is invoked with the --server flag. Clap generates its own error message for unexpected arguments, so the test has to match that output instead.

Changes

  • Updated listen_mode_with_server_flag_exits_error_usage test to assert on clap's actual error output

Verification

  • cargo test -p terraphim_agent --test exit_codes_integration_test passes (6/6 tests)
  • cargo check -p terraphim_agent clean
  • cargo clippy -p terraphim_agent -- -D warnings clean
  • cargo fmt --all -- --check clean

Refs #1044

AlexMikhalev and others added 30 commits April 21, 2026 00:17
…y guard (Refs adf-fleet#38)

Addresses compound-review findings on PR #633:

C1 (security): replace "--dangerously-skip-permissions" with
"--allowedTools """ on both claude invocations (scope gate + role
dispatch). The scope gate is pure text classification -- no tools
are needed. --allowedTools "" removes the tool surface entirely,
so prompt-injection attempts via untrusted Gitea issue bodies cannot
call Bash/Read/Write. If the model ever hallucinates a tool call,
the CLI rejects it rather than silently executing.

I1 (idempotency): before posting the clarification comment, fetch
issue comments and skip re-post if a "Scope clarity check" comment
already exists in the last 24h. Prevents spam on every 30-min cron
tick while an issue stays underspecified.
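
A minimal sketch of the I1 guard described above, assuming comments have already been fetched and reduced to a body plus a creation timestamp. The `IssueComment` type and the function name are hypothetical, not the actual code from this commit:

```rust
/// A fetched issue comment, reduced to the two fields the guard needs.
/// (Hypothetical type; the real Gitea payload carries more fields.)
pub struct IssueComment {
    pub body: String,
    /// Creation time as seconds since the Unix epoch.
    pub created_at_secs: u64,
}

/// Marker text and window taken from the commit message.
const MARKER: &str = "Scope clarity check";
const WINDOW_SECS: u64 = 24 * 60 * 60;

/// Returns true when no marker comment exists within the last 24 hours,
/// i.e. when posting a new clarification comment would not be a repeat.
pub fn should_post_clarification(comments: &[IssueComment], now_secs: u64) -> bool {
    !comments.iter().any(|c| {
        c.body.contains(MARKER)
            && now_secs.saturating_sub(c.created_at_secs) <= WINDOW_SECS
    })
}
```

With a 30-minute cron tick, the first tick posts and later ticks inside the window see the marker and skip.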

Pre-tool-use DCG guard continues to fire for any tool call (C1
path), and the default deny behaviour without the dangerous flag
makes that defence effective.
…ier 3)' (#633) from task/roc-v1-step-j-scope-check into main
…ispatch::ReviewPr — Refs adf-fleet#31

Add GiteaPullRequestPayload with serde-derived fields (action, number,
pull_request, repository). Add WebhookDispatch::ReviewPr variant alongside
existing SpawnAgent/SpawnPersona/CompoundReview. Update comment_id() impl
and events.rs normalize_webhook_dispatch to cover the new variant.

Co-Authored-By: Terraphim AI <noreply@terraphim.cloud>
Implement AgentOrchestrator::handle_review_pr so the ReviewPr arm of the
dispatcher actually spawns the pr-reviewer instead of logging-and-skipping.
The method builds a DispatchContext from a "review" task string, calls the
RoutingDecisionEngine (KG + keyword + static config), enforces the C1/C3
subscription allow-list on both the static model (pre-routing) and the
routed model (post-routing), and lays per-PR ADF_PR_* env overrides on the
project-scoped SpawnContext before handing off to spawn_with_fallback.

The spawn path intentionally skips persona composition, skill-chain injection,
and worktree creation: pr-reviewer is a read-only review tier that posts a
verdict comment, so the heavyweight scaffolding from spawn_agent would be
dead weight. Budget verdicts, resource limits, fallback providers, and the
active_agents registry all still flow through the standard primitives.

Pure helpers live in the new pr_dispatch module so routing/env construction
can be unit-tested without an AgentOrchestrator:
  - ReviewPrRequest (per-dispatch metadata)
  - find_pr_reviewer (project-scoped agent lookup)
  - build_review_task (routing-keyword-embedded task string)
  - pr_env_overrides / layer_pr_env (ADF_PR_* env injection)

AutoMerge and PostMergeTestGate remain log-only stubs; they land in Steps G
and H.

Tests (lib + inline):
  - reviewpr_dispatch_routes_via_routing_engine: drives the full pipeline
    end-to-end with cli_tool="echo" and asserts the spawned agent ends up
    in active_agents with a session id.
  - reviewpr_dispatch_rejects_banned_provider: mutates a post-construction
    banned model onto the agent and asserts the allow-list gate short-
    circuits the spawn.
  - reviewpr_dispatch_sets_env_vars: uses a shell-script cli_tool that
    dumps ADF_PR_* into a tempfile, then asserts every per-PR env key
    reached the child process.
  - pr_dispatch inline tests cover find_pr_reviewer, build_review_task,
    pr_env_overrides, and layer_pr_env against real OrchestratorConfig
    TOML fixtures.

All 624 orchestrator tests pass with and without --features quickwit;
cargo clippy --all-targets -D warnings and cargo fmt --check clean.

Refs adf-fleet#32
…vents (ROC v1 Step C)' (#642) from task/roc-v1-step-c-webhook-pr-handler into main
… handle_review_pr' (#643) from task/roc-v1-step-d-review-dispatch into main
…eview skill (Refs adf-fleet#33)

New scripts/adf-setup/agents/pr-reviewer.toml following prompt-agent-spec
conventions from meta-coordinator.toml. Mention-dispatched (no cron schedule).
Reads ADF_PR_* env overrides set by handle_review_pr (ROC v1 Step D), fetches
PR diff, invokes Terraphim AI with structural-pr-review skill chain, posts verdict
via gtr comment. Idempotency guard: skips re-review within 2h for same head_sha.

Co-Authored-By: Terraphim AI <noreply@terraphim.ai>
…E)' (#711) from task/roc-v1-step-e-pr-reviewer-toml into main
… + AutoMerge enqueue' (#717) from task/roc-v1-step-f-verdict-polling into main
…ostMergeTestGate enqueue + [ADF] issue on failure) — Refs adf-fleet#35
…ath — Refs adf-fleet#35

Add 5 integration tests driving handle_auto_merge_for_project with an
in-memory RecordingExecutor (no mock frameworks, async-trait impl):

* auto_merge_success_enqueues_post_merge_gate
* auto_merge_skipped_when_head_sha_changed
* auto_merge_failure_opens_adf_issue
* auto_merge_marks_dedupe_set_on_success
* auto_merge_skipped_when_pr_already_closed

Expose pub auto_merge_enqueued() accessor on AgentOrchestrator so
tests can verify the (project, pr_number, head_sha) dedupe set is
populated after a successful merge.
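
The dedupe set can be pictured roughly as follows. This is an illustrative sketch only: the `AutoMergeDedupe` type and its methods are hypothetical, and only the `(project, pr_number, head_sha)` key shape comes from the commit message:

```rust
use std::collections::HashSet;

/// Key shape named in the commit message: (project, pr_number, head_sha).
type MergeKey = (String, u64, String);

/// Hypothetical stand-in for the orchestrator's dedupe state.
#[derive(Default)]
pub struct AutoMergeDedupe {
    seen: HashSet<MergeKey>,
}

impl AutoMergeDedupe {
    /// Records the key; returns true only the first time it is seen.
    pub fn mark(&mut self, project: &str, pr_number: u64, head_sha: &str) -> bool {
        self.seen
            .insert((project.to_string(), pr_number, head_sha.to_string()))
    }

    /// Read-only check, similar in spirit to what a test accessor would expose.
    pub fn contains(&self, project: &str, pr_number: u64, head_sha: &str) -> bool {
        self.seen
            .contains(&(project.to_string(), pr_number, head_sha.to_string()))
    }
}
```

Including head_sha in the key means a new push to the same PR is treated as a fresh merge candidate.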
… from task/roc-v1-step-g-auto-merge-handler into main
…ait + config — Refs adf-fleet#36

Adds crates/terraphim_orchestrator/src/post_merge_gate.rs with:
- CommandRunner async trait + TokioCommandRunner real impl (streams
  stdout/stderr tails via bounded ring buffer so long test runs do not
  OOM; kills child on wall-time timeout).
- run_workspace_tests / classify_failure / revert_merge helpers.
- ScriptedRunner test double (no mock library — trait impl only).
- 9 inline unit tests covering runner behaviour, failure parsing, and
  revert paths (green, timeout, io-error, test-failure parse, harness
  detection, timeout classification, revert no-push, revert with push,
  revert all-paths-failed).
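
As an illustration of the bounded tail idea, here is a std-only sketch that keeps the last N lines of a stream. `TailBuffer` is a hypothetical name; the actual TokioCommandRunner may buffer bytes rather than lines:

```rust
use std::collections::VecDeque;

/// Keeps only the most recent `capacity` lines pushed into it,
/// so memory use stays bounded however long the process runs.
pub struct TailBuffer {
    capacity: usize,
    lines: VecDeque<String>,
}

impl TailBuffer {
    pub fn new(capacity: usize) -> Self {
        Self {
            capacity,
            lines: VecDeque::with_capacity(capacity),
        }
    }

    /// Appends a line, evicting the oldest one when the buffer is full.
    pub fn push(&mut self, line: impl Into<String>) {
        if self.capacity == 0 {
            return;
        }
        if self.lines.len() == self.capacity {
            self.lines.pop_front();
        }
        self.lines.push_back(line.into());
    }

    /// Returns the retained lines, oldest first.
    pub fn tail(&self) -> Vec<&str> {
        self.lines.iter().map(String::as_str).collect()
    }
}
```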

Adds PostMergeGateConfig to OrchestratorConfig (optional; defaults to
10 minute budget, push revert to origin main). Back-fills the new
field on every existing OrchestratorConfig initializer across lib.rs
test fixtures and integration tests.
…k - Refs adf-fleet#36

Replaces the Step B log-only stub for DispatchTask::PostMergeTestGate with
a real handler that:

- Resolves project working_dir as repo_root
- Runs post_merge_gate::run_workspace_tests with GateConfig built from
  orchestrator.toml [post_merge_gate] overrides (default 10 min budget)
- On green: logs post_merge_gate_verified at info
- On red: classifies failure, calls post_merge_gate::revert_merge to
  create revert commit + push to remote, opens [ADF] post-merge test gate
  reverted issue on the project's Gitea repo with merge_sha, revert_sha,
  top failing tests, stderr tail, wall_time
- Logs post_merge_gate_reverted at warn

handle_post_merge_test_gate_with_runner<R: CommandRunner + ?Sized> is
parametrised so integration tests can inject ScriptedRunner without
spawning cargo.

Adds truncate_for_issue helper for char-boundary-safe trimming of stderr
tails when opening the [ADF] issue (avoids corrupting UTF-8 mid-sequence).
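
A char-boundary-safe trim can be sketched as below. The function name and signature are assumptions for illustration; the real truncate_for_issue helper may differ (for example by appending an ellipsis or keeping the end of the string):

```rust
/// Truncates `s` to at most `max_bytes` bytes without splitting a
/// multi-byte UTF-8 sequence, backing up to the nearest char boundary.
pub fn truncate_at_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

Slicing at `max_bytes` directly would panic whenever the cut lands inside a multi-byte character, which is the failure this kind of helper avoids.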

All orchestrator tests pass; clippy clean; fmt clean.
) from task/roc-v1-step-h-post-merge-gate into main
…/merge — Refs adf-fleet#37

Add OrchestratorEvent enum (PrReviewed, PrAutoMerged, PrAutoMergedVerified,
PrAutoReverted) to quickwit.rs with emit_event helper on QuickwitFleetSink.
All emits are gated behind the quickwit feature and tolerate Quickwit down
(warn log, business logic unblocked).

Wire emit sites:
- PrReviewed in poll_pending_reviews_for_project after verdict parse
- PrAutoMerged in handle_auto_merge_for_project after merge_pr succeeds
- PrAutoMergedVerified in handle_post_merge_test_gate_with_runner on green
- PrAutoReverted in handle_post_merge_test_gate_with_runner on revert

Replace the two quickwit_event_placeholder log lines from Step H with
the typed emits above.

Add tests/quickwit_events_tests.rs: 4 field-mapping tests + 1 tolerance test
(all gated behind quickwit feature). 501 tests pass; clippy clean; fmt clean.

Co-Authored-By: Terraphim AI <noreply@terraphim.cloud>
…e flow (ROC v1 Step I)' (#726) from task/roc-v1-step-i-quickwit-events into main
…reviewers event-driven) — Refs adf-fleet#39

No template change needed: implementation-swarm is not yet in scripts/adf-setup/agents/.
Developer cron bump to */20 applies to live conf.d/*.toml during Step L rollout.
…ump (ROC v1 Step K)' (#727) from task/roc-v1-step-k-cron-extension into main
…lo -> terraphim -- Refs adf-fleet#40

Co-Authored-By: Terraphim AI <noreply@terraphim.cloud>
…728) from task/roc-v1-step-l-rollout-runbook into main
Updates the adf.rs provider registration from Kimi K2.5 to K2.6 so the
ProviderHealthMap probe + KeywordRouter fallback target the current
model. KG routing tables (planning_tier.md) and conf.d/digital-twins.toml
already reference kimi-for-coding/k2p6; this aligns register_providers.

Tests unchanged (test fixtures use synthetic k2p5 strings that remain
valid — provider_key_for_model is generic and C1 allow-list checks the
kimi-for-coding prefix, not the specific model string).
…tion to k2p6' (#734) from task/kimi-k2p6-provider into main
…rand

- Patch rustls-webpki from 0.103.10 to 0.103.12 (fixes name constraints bypass)
- Add RUSTSEC-2026-0098/0099 to deny.toml (serenity/discord-feature-only)
- Replace rand with WASM-compatible fastrand in terraphim_multi_agent and terraphim_kg_agents
- Remove two direct rand dependencies from workspace crates

Refs #630
AlexMikhalev and others added 29 commits April 27, 2026 18:01
…#1024) from sync/reconcile-origin-to-gitea-20260427 into main
…n with path filter' (#952) from task/950-pr-spec-validator-phase-2b into main
…+ reconciliation docs' (#1025) from sync/step10-agents-md-sync-protocol into main
Adds the operator-installable agent template for pr-test-guardian, the
ADF Phase 2e PR-open fan-out entry. It mirrors the pr-reviewer skeleton
but applies a path filter ("**/tests/**" or "crates/**/src/**") and
posts adf/test=success with description "n/a -- no test-relevant
changes" when the diff carries no test-relevant paths (skip-with-success
so branch protection on adf/test does not wedge docs/config-only PRs).

When test-relevant changes are present, invokes claude with the testing
skill chain on the diff and posts a structured PR comment plus an
adf/test commit status reflecting the verdict (pass / concerns / fail).

Idempotent: skips when a comment carrying "Last test-reviewed commit:
<short-sha>" already exists within the last 2 hours.

The Rust dispatch loop (handle_review_pr) is already generic over
agents_on_pr_open entries -- no orchestrator code change required.
Operators wire the new entry by appending to [pr_dispatch] and the
project's [[agents]] list.

Refs #954
… #954

Rebased onto main (post #952, #1021 merges). Helper and tests added
after spec-validator equivalents without conflict.
… with path filter' (#956) from worktree-agent-a2799aaa into main
…eports' (#1027) from sync/vv-reports into main
…efs #955

Rebased onto main after Phase 2e merge.
…R open with path filter' (#958) from task/955-adf-phase-2d-compliance-watchdog into main
…eFragment' (#999) from worktree-agent-ad4db636 into main
Adds the agent template that the Phase 2 fan-out loop spawns when
`pr-security-sentinel` appears in `[pr_dispatch.agents_on_pr_open]`.

Mirrors the pr-spec-validator skeleton:
- sonnet (subscription-only) — security-audit needs deeper reasoning
  than a structural diff scan to spot data-flow vulnerabilities and
  unsafe-code subtleties; haiku is the wrong tier
- skill_chain = ["security-audit"]
- 2-hour idempotency check via the existing comment-marker pattern
  (uses agent-scoped marker "Last security-audited commit: <sha>")
- best-effort POST_STATUS helper mirroring build-runner.toml

Path filter (early-exit happy path) — implemented in this script,
NOT in the orchestrator:
- skips the LLM call when no `Cargo.toml`, `Cargo.lock`,
  `crates/**/src/**`, or `**/secrets/**` paths changed
- posts `success` (not no-post) with description "n/a — no
  security-relevant changes" because `adf/security` becomes a required
  check on `main` post-deploy; a never-resolved `pending` would block
  docs-only PRs forever
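
The early-exit check above lives in the agent script, not in Rust. Purely as an illustration of the matching rules, a Rust sketch might look like this, assuming Cargo.toml and Cargo.lock match at any depth and treating the globs as simple path-segment tests (the function name is hypothetical):

```rust
/// Returns true when a changed path falls under one of the four patterns:
/// Cargo.toml, Cargo.lock, crates/**/src/**, or **/secrets/**.
pub fn is_security_relevant(path: &str) -> bool {
    let segments: Vec<&str> = path.split('/').collect();
    let file_name = segments.last().copied().unwrap_or("");
    // Assumption: manifest and lock files count at any depth.
    if file_name == "Cargo.toml" || file_name == "Cargo.lock" {
        return true;
    }
    // Directory components only (everything except the file name).
    let dirs = &segments[..segments.len().saturating_sub(1)];
    // `**/secrets/**`: a `secrets` directory anywhere above the file.
    if dirs.contains(&"secrets") {
        return true;
    }
    // `crates/**/src/**`: starts at `crates`, with a `src` directory below it.
    segments.first() == Some(&"crates") && dirs.iter().skip(1).any(|d| *d == "src")
}

/// True when at least one changed path needs the LLM audit.
pub fn diff_needs_audit(changed_paths: &[&str]) -> bool {
    changed_paths.iter().any(|p| is_security_relevant(p))
}
```

When `diff_needs_audit` is false, the script's behaviour corresponds to posting `success` with the "n/a" description rather than leaving the status unset.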

Verdict parsing prefers an explicit `Verdict:` line; falls back to a
`Risk:` line (critical/high → fail, medium → concerns, low → pass);
finally falls back to `state=success` "audit posted; manual gate" when
neither parses — so the commit status is never left in `pending` even
on transient agent failures.
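
The fallback order can be sketched in Rust as follows. This is an illustration of the precedence only, not the script's actual parser; the `Verdict` enum and `parse_verdict` are hypothetical names, and matching is assumed to be case-insensitive on the value:

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Verdict {
    Pass,
    Concerns,
    Fail,
}

/// Prefers an explicit `Verdict:` line, then falls back to a `Risk:` line.
/// Returns None when neither parses; the caller would then post the
/// "audit posted; manual gate" success status described above.
pub fn parse_verdict(output: &str) -> Option<Verdict> {
    // Finds the first line starting with `key` and returns its lowercased value.
    let value_of = |key: &str| {
        output.lines().find_map(|line| {
            line.trim()
                .strip_prefix(key)
                .map(|rest| rest.trim().to_ascii_lowercase())
        })
    };

    if let Some(v) = value_of("Verdict:") {
        match v.as_str() {
            "pass" => return Some(Verdict::Pass),
            "concerns" => return Some(Verdict::Concerns),
            "fail" => return Some(Verdict::Fail),
            _ => {} // unparseable verdict: fall through to the Risk line
        }
    }

    match value_of("Risk:")?.as_str() {
        "critical" | "high" => Some(Verdict::Fail),
        "medium" => Some(Verdict::Concerns),
        "low" => Some(Verdict::Pass),
        _ => None,
    }
}
```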

The cron-scheduled `security-sentinel` agent stays in tree for
full-repo audits; this PR-event variant is additive and diff-scoped.

Operator follow-up after merge:
1. Rebuild + restart the orchestrator on bigbox so the new dispatch
   arm goes live.
2. PATCH branch_protections/main to add `adf/security` to required
   contexts.

Refs #953
…2c) Refs #953

Rebased onto main after Phase 2b/d/e merges.
…open with path filter' (#1028) from task/953-adf-phase-2c-security-sentinel into main
…eMeta Refs #1026' (#1029) from sync/1026-query-role-meta into main
…eports for #1026' (#1030) from sync/vv2-1026 into main
… #1012

- Intercept args before clap parsing to apply typo correction,
  alias expansion, and case-insensitive matching
- Add build_cli_forgiving_parser() with actual CLI subcommands
- Add apply_forgiving_parsing() heuristic for subcommand detection
- Print correction notifications to stderr
- Add 3 integration tests: auto-correction, alias expansion,
  case-insensitive matching
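
The commit does not say which heuristic the typo correction uses. As one plausible sketch, assuming a Levenshtein distance with a threshold of 2 and a hypothetical list of subcommand names, correction before clap parsing could look like:

```rust
/// Classic Levenshtein edit distance over chars (two-row formulation).
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let best = (prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1);
            cur.push(best);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Lowercases the input and returns the closest known subcommand,
/// or None when nothing is within two edits. The threshold is an assumption.
pub fn correct_subcommand<'a>(input: &str, known: &[&'a str]) -> Option<&'a str> {
    let lowered = input.to_ascii_lowercase();
    known
        .iter()
        .copied()
        .map(|k| (edit_distance(&lowered, k), k))
        .filter(|(d, _)| *d <= 2)
        .min_by_key(|(d, _)| *d)
        .map(|(_, k)| k)
}
```

A caller would rewrite the first positional argument with the corrected name, print a notice to stderr, and only then hand the argument list to clap.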
…PI Refs #1011

- Add Robot variant to Command enum with subcommands:
  capabilities, schemas, examples
- Add RobotFormat enum (json, table, minimal) for --format flag
- Add handle_robot_command() and format_robot_output() helpers
- Wire Robot command in main() before catch-all server/offline dispatch
- Add unreachable! arms in run_offline_command and run_server_command
  to satisfy exhaustive match (Robot is handled in main)
- Add 4 integration tests: capabilities JSON, schemas JSON,
  examples text, all formats honoured
- Fix pre-existing compilation errors in terraphim_orchestrator
  (config clone + concurrency_permit field)
…memory exhaustion Refs #664

- Add concurrency_controller to AgentOrchestrator using existing ConcurrencyController
- Initialize controller from config with sensible defaults (global_max=10)
- Acquire concurrency permit in spawn_agent before spawning; store in ManagedAgent
- Permit auto-released on drop when agent exits or is stopped
- Fix test helpers: update pr_dispatch_per_project instead of pr_dispatch
  so per-project lookup returns all 6 fan-out entries (build-runner,
  pr-reviewer, plus spec/test/security/compliance sentinel)

Diff: 552 tests pass, 0 failures.
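
The "permit auto-released on drop" pattern can be illustrated with a std-only stand-in. `Limiter` and `Permit` are hypothetical names; the existing ConcurrencyController is likely async and is not reproduced here:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Caps the number of simultaneously held permits at `max`.
pub struct Limiter {
    in_use: Arc<AtomicUsize>,
    max: usize,
}

/// Held for the lifetime of a spawned agent; releases its slot on drop.
pub struct Permit {
    in_use: Arc<AtomicUsize>,
}

impl Limiter {
    pub fn new(max: usize) -> Self {
        Self {
            in_use: Arc::new(AtomicUsize::new(0)),
            max,
        }
    }

    /// Returns a permit if a slot is free, otherwise None.
    pub fn try_acquire(&self) -> Option<Permit> {
        let mut cur = self.in_use.load(Ordering::Acquire);
        loop {
            if cur >= self.max {
                return None;
            }
            match self.in_use.compare_exchange(
                cur,
                cur + 1,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => {
                    return Some(Permit {
                        in_use: Arc::clone(&self.in_use),
                    })
                }
                Err(actual) => cur = actual, // lost a race; retry with fresh count
            }
        }
    }

    pub fn in_use(&self) -> usize {
        self.in_use.load(Ordering::Acquire)
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        // Frees the slot when the owning agent record is dropped.
        self.in_use.fetch_sub(1, Ordering::AcqRel);
    }
}
```

Storing the permit inside the agent record ties the slot's lifetime to the agent's, so no explicit release call is needed on exit or stop.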