Fix #1044: test failures 2026-04-28 #853
Open
AlexMikhalev wants to merge 2256 commits into main from
…y guard (Refs adf-fleet#38)
Addresses compound-review findings on PR #633:
C1 (security): replace "--dangerously-skip-permissions" with "--allowedTools """ on both claude invocations (scope gate + role dispatch). The scope gate is pure text classification -- no tools are needed. --allowedTools "" removes the tool surface entirely, so prompt-injection attempts via untrusted Gitea issue bodies cannot call Bash/Read/Write. If the model ever hallucinates a tool call, the CLI rejects it rather than silently executing.
I1 (idempotency): before posting the clarification comment, fetch issue comments and skip re-post if a "Scope clarity check" comment already exists in the last 24h. Prevents spam on every 30-min cron tick while an issue stays underspecified.
Pre-tool-use DCG guard continues to fire for any tool call (C1 path), and the default deny behaviour without the dangerous flag makes that defence effective.
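The I1 idempotency rule can be illustrated with a minimal Rust sketch. The real guard presumably lives in the agent script, so the `Comment` struct, the constant names, and `should_post_clarification` are all hypothetical; only the "Scope clarity check" marker and the 24h window come from the commit message.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical minimal view of an issue comment (not the real Gitea type).
struct Comment {
    body: String,
    created_at: SystemTime,
}

/// Marker text and look-back window taken from the commit message.
const MARKER: &str = "Scope clarity check";
const WINDOW: Duration = Duration::from_secs(24 * 60 * 60);

/// Returns true when no marker comment exists within the last 24h,
/// i.e. when posting a new clarification comment would not be a repeat.
fn should_post_clarification(comments: &[Comment], now: SystemTime) -> bool {
    !comments.iter().any(|c| {
        c.body.contains(MARKER)
            && now
                .duration_since(c.created_at)
                .map(|age| age <= WINDOW)
                // A timestamp in the future is treated as recent.
                .unwrap_or(true)
    })
}
```

With a check like this, a cron tick that finds a one-hour-old marker comment skips the post, while a marker older than 24 hours allows a fresh reminder.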
…ier 3)' (#633) from task/roc-v1-step-j-scope-check into main
…ispatch::ReviewPr — Refs adf-fleet#31 Add GiteaPullRequestPayload with serde-derived fields (action, number, pull_request, repository). Add WebhookDispatch::ReviewPr variant alongside existing SpawnAgent/SpawnPersona/CompoundReview. Update comment_id() impl and events.rs normalize_webhook_dispatch to cover the new variant. Co-Authored-By: Terraphim AI <noreply@terraphim.cloud>
Implement AgentOrchestrator::handle_review_pr so the ReviewPr arm of the
dispatcher actually spawns the pr-reviewer instead of logging-and-skipping.
The method builds a DispatchContext from a "review" task string, calls the
RoutingDecisionEngine (KG + keyword + static config), enforces the C1/C3
subscription allow-list on both the static model (pre-routing) and the
routed model (post-routing), and lays per-PR ADF_PR_* env overrides on the
project-scoped SpawnContext before handing off to spawn_with_fallback.
The spawn path intentionally skips persona composition, skill-chain injection,
and worktree creation: pr-reviewer is a read-only review tier that posts a
verdict comment, so the heavyweight scaffolding from spawn_agent would be
dead weight. Budget verdicts, resource limits, fallback providers, and the
active_agents registry all still flow through the standard primitives.
Pure helpers live in the new pr_dispatch module so routing/env construction
can be unit-tested without an AgentOrchestrator:
- ReviewPrRequest (per-dispatch metadata)
- find_pr_reviewer (project-scoped agent lookup)
- build_review_task (routing-keyword-embedded task string)
- pr_env_overrides / layer_pr_env (ADF_PR_* env injection)
AutoMerge and PostMergeTestGate remain log-only stubs; they land in Steps G
and H.
Tests (lib + inline):
- reviewpr_dispatch_routes_via_routing_engine: drives the full pipeline
end-to-end with cli_tool="echo" and asserts the spawned agent ends up
in active_agents with a session id.
- reviewpr_dispatch_rejects_banned_provider: mutates a post-construction
banned model onto the agent and asserts the allow-list gate short-
circuits the spawn.
- reviewpr_dispatch_sets_env_vars: uses a shell-script cli_tool that
dumps ADF_PR_* into a tempfile, then asserts every per-PR env key
reached the child process.
- pr_dispatch inline tests cover find_pr_reviewer, build_review_task,
pr_env_overrides, and layer_pr_env against real OrchestratorConfig
TOML fixtures.
All 624 orchestrator tests pass with and without --features quickwit;
cargo clippy --all-targets -D warnings and cargo fmt --check clean.
Refs adf-fleet#32
…vents (ROC v1 Step C)' (#642) from task/roc-v1-step-c-webhook-pr-handler into main
… handle_review_pr' (#643) from task/roc-v1-step-d-review-dispatch into main
…eview skill (Refs adf-fleet#33) New scripts/adf-setup/agents/pr-reviewer.toml following prompt-agent-spec conventions from meta-coordinator.toml. Mention-dispatched (no cron schedule). Reads ADF_PR_* env overrides set by handle_review_pr (ROC v1 Step D), fetches PR diff, invokes Terraphim AI with structural-pr-review skill chain, posts verdict via gtr comment. Idempotency guard: skips re-review within 2h for same head_sha. Co-Authored-By: Terraphim AI <noreply@terraphim.ai>
…E)' (#711) from task/roc-v1-step-e-pr-reviewer-toml into main
… extension — Refs adf-fleet#34
…atcher accessor — Refs adf-fleet#34
… + AutoMerge enqueue' (#717) from task/roc-v1-step-f-verdict-polling into main
…Refs adf-fleet#35
…impl — Refs adf-fleet#35
…ostMergeTestGate enqueue + [ADF] issue on failure) — Refs adf-fleet#35
…andle_auto_merge — Refs adf-fleet#35
…ath — Refs adf-fleet#35
Add 5 integration tests driving handle_auto_merge_for_project with an in-memory RecordingExecutor (no mock frameworks, async-trait impl):
* auto_merge_success_enqueues_post_merge_gate
* auto_merge_skipped_when_head_sha_changed
* auto_merge_failure_opens_adf_issue
* auto_merge_marks_dedupe_set_on_success
* auto_merge_skipped_when_pr_already_closed
Expose pub auto_merge_enqueued() accessor on AgentOrchestrator so tests can verify the (project, pr_number, head_sha) dedupe set is populated after a successful merge.
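The (project, pr_number, head_sha) dedupe set mentioned above could look roughly like the following. This is a sketch under assumptions: `AutoMergeDedupe`, `MergeKey`, and the method names are hypothetical and not taken from the orchestrator source.

```rust
use std::collections::HashSet;

/// Hypothetical key type: (project, pr_number, head_sha).
type MergeKey = (String, u64, String);

/// Hypothetical stand-in for the orchestrator's dedupe set.
#[derive(Default)]
struct AutoMergeDedupe {
    enqueued: HashSet<MergeKey>,
}

impl AutoMergeDedupe {
    /// Records a merge; returns true only the first time a key is seen.
    fn mark(&mut self, project: &str, pr_number: u64, head_sha: &str) -> bool {
        self.enqueued
            .insert((project.to_string(), pr_number, head_sha.to_string()))
    }

    /// Read-only check, similar in spirit to a test accessor.
    fn contains(&self, project: &str, pr_number: u64, head_sha: &str) -> bool {
        self.enqueued
            .contains(&(project.to_string(), pr_number, head_sha.to_string()))
    }
}
```

Keying on head_sha as well as the PR number means a force-push to the same PR produces a new key, so a changed head is not mistaken for an already-merged one.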
… from task/roc-v1-step-g-auto-merge-handler into main
…ait + config — Refs adf-fleet#36
Adds crates/terraphim_orchestrator/src/post_merge_gate.rs with:
- CommandRunner async trait + TokioCommandRunner real impl (streams stdout/stderr tails via bounded ring buffer so long test runs do not OOM; kills child on wall-time timeout).
- run_workspace_tests / classify_failure / revert_merge helpers.
- ScriptedRunner test double (no mock library — trait impl only).
- 9 inline unit tests covering runner behaviour, failure parsing, and revert paths (green, timeout, io-error, test-failure parse, harness detection, timeout classification, revert no-push, revert with push, revert all-paths-failed).
Adds PostMergeGateConfig to OrchestratorConfig (optional; defaults to 10 minute budget, push revert to origin main). Back-fills the new field on every existing OrchestratorConfig initializer across lib.rs test fixtures and integration tests.
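The bounded ring buffer idea (keep only the last N lines of output so memory stays constant) can be sketched as below. `TailBuffer` and its methods are hypothetical names for illustration; the real TokioCommandRunner may buffer bytes rather than lines.

```rust
use std::collections::VecDeque;

/// Hypothetical line-based tail buffer with a fixed capacity.
struct TailBuffer {
    cap: usize,
    lines: VecDeque<String>,
}

impl TailBuffer {
    fn new(cap: usize) -> Self {
        Self { cap, lines: VecDeque::with_capacity(cap) }
    }

    /// Appends a line, evicting the oldest one once the buffer is full.
    fn push(&mut self, line: impl Into<String>) {
        if self.cap == 0 {
            return; // a zero-capacity buffer retains nothing
        }
        if self.lines.len() == self.cap {
            self.lines.pop_front();
        }
        self.lines.push_back(line.into());
    }

    /// Returns the retained lines, oldest first, joined by newlines.
    fn tail(&self) -> String {
        self.lines
            .iter()
            .map(String::as_str)
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```

However long the test run is, the buffer never holds more than `cap` lines, which is the property that prevents the out-of-memory case described above.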
…k - Refs adf-fleet#36
Replaces the Step B log-only stub for DispatchTask::PostMergeTestGate with a real handler that:
- Resolves project working_dir as repo_root
- Runs post_merge_gate::run_workspace_tests with GateConfig built from orchestrator.toml [post_merge_gate] overrides (default 10 min budget)
- On green: logs post_merge_gate_verified at info
- On red: classifies failure, calls post_merge_gate::revert_merge to create revert commit + push to remote, opens [ADF] post-merge test gate reverted issue on the project's Gitea repo with merge_sha, revert_sha, top failing tests, stderr tail, wall_time
- Logs post_merge_gate_reverted at warn
handle_post_merge_test_gate_with_runner<R: CommandRunner + ?Sized> is parametrised so integration tests can inject ScriptedRunner without spawning cargo.
Adds truncate_for_issue helper for char-boundary-safe trimming of stderr tails when opening the [ADF] issue (avoids corrupting UTF-8 mid-sequence).
All orchestrator tests pass; clippy clean; fmt clean.
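Char-boundary-safe trimming can be sketched as follows. The signature is an assumption, and this version keeps the head of the string; the actual truncate_for_issue may keep the tail or append an ellipsis.

```rust
/// Sketch of char-boundary-safe truncation: returns the longest prefix of `s`
/// that is at most `max_bytes` long and ends on a UTF-8 character boundary.
/// The signature is assumed, not copied from the orchestrator.
fn truncate_for_issue(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Walk back until `end` no longer splits a multi-byte character.
    // Index 0 is always a boundary, so the loop terminates.
    while end > 0 && !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

Slicing a `&str` at a non-boundary index panics in Rust, so the backward walk is what makes the helper safe for arbitrary stderr content.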
…/merge — Refs adf-fleet#37
Add OrchestratorEvent enum (PrReviewed, PrAutoMerged, PrAutoMergedVerified, PrAutoReverted) to quickwit.rs with emit_event helper on QuickwitFleetSink. All emits are gated behind the quickwit feature and tolerate Quickwit down (warn log, business logic unblocked).
Wire emit sites:
- PrReviewed in poll_pending_reviews_for_project after verdict parse
- PrAutoMerged in handle_auto_merge_for_project after merge_pr succeeds
- PrAutoMergedVerified in handle_post_merge_test_gate_with_runner on green
- PrAutoReverted in handle_post_merge_test_gate_with_runner on revert
Replace the two quickwit_event_placeholder log lines from Step H with the typed emits above.
Add tests/quickwit_events_tests.rs: 4 field-mapping tests + 1 tolerance test (all gated behind quickwit feature).
501 tests pass; clippy clean; fmt clean.
Co-Authored-By: Terraphim AI <noreply@terraphim.cloud>
…e flow (ROC v1 Step I)' (#726) from task/roc-v1-step-i-quickwit-events into main
…reviewers event-driven) — Refs adf-fleet#39 No template change needed: implementation-swarm is not yet in scripts/adf-setup/agents/. Developer cron bump to */20 applies to live conf.d/*.toml during Step L rollout.
…ump (ROC v1 Step K)' (#727) from task/roc-v1-step-k-cron-extension into main
…lo -> terraphim -- Refs adf-fleet#40 Co-Authored-By: Terraphim AI <noreply@terraphim.cloud>
…728) from task/roc-v1-step-l-rollout-runbook into main
Updates the adf.rs provider registration from Kimi K2.5 to K2.6 so the ProviderHealthMap probe + KeywordRouter fallback target the current model. KG routing tables (planning_tier.md) and conf.d/digital-twins.toml already reference kimi-for-coding/k2p6; this aligns register_providers. Tests unchanged (test fixtures use synthetic k2p5 strings that remain valid — provider_key_for_model is generic and C1 allow-list checks the kimi-for-coding prefix, not the specific model string).
…tion to k2p6' (#734) from task/kimi-k2p6-provider into main
…rand
- Patch rustls-webpki from 0.103.10 to 0.103.12 (fixes name constraints bypass)
- Add RUSTSEC-2026-0098/0099 to deny.toml (serenity/discord-feature-only)
- Replace rand with WASM-compatible fastrand in terraphim_multi_agent and terraphim_kg_agents
- Remove two direct rand dependencies from workspace crates
Refs #630
…#1024) from sync/reconcile-origin-to-gitea-20260427 into main
…_string Refs #1020
…n with path filter' (#952) from task/950-pr-spec-validator-phase-2b into main
…+ reconciliation docs' (#1025) from sync/step10-agents-md-sync-protocol into main
Adds the operator-installable agent template for pr-test-guardian, the
ADF Phase 2e PR-open fan-out entry. It mirrors the pr-reviewer skeleton
but applies a path filter ("**/tests/**" or "crates/**/src/**") and
posts adf/test=success with description "n/a -- no test-relevant
changes" when the diff carries no test-relevant paths (skip-with-success
so branch protection on adf/test does not wedge docs/config-only PRs).
When test-relevant changes are present, invokes claude with the testing
skill chain on the diff and posts a structured PR comment plus an
adf/test commit status reflecting the verdict (pass / concerns / fail).
Idempotent: skips when a comment carrying "Last test-reviewed commit:
<short-sha>" already exists within the last 2 hours.
The Rust dispatch loop (handle_review_pr) is already generic over
agents_on_pr_open entries -- no orchestrator code change required.
Operators wire the new entry by appending to [pr_dispatch] and the
project's [[agents]] list.
Refs #954
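The skip-with-success path filter can be sketched in Rust for illustration. The template itself implements the filter in its agent script, so `is_test_relevant` and `skip_status` are hypothetical names, and the hand-rolled matching below only approximates the "**/tests/**" and "crates/**/src/**" globs.

```rust
/// Approximates the two globs from the commit message without a glob crate:
/// "**/tests/**" (any directory named "tests") and
/// "crates/**/src/**" (a "src" directory somewhere under top-level "crates").
fn is_test_relevant(path: &str) -> bool {
    let parts: Vec<&str> = path.split('/').collect();
    // Directory components only; the final component is the file name.
    let dirs = &parts[..parts.len().saturating_sub(1)];
    let in_tests = dirs.iter().any(|d| *d == "tests");
    let in_crate_src =
        dirs.first() == Some(&"crates") && dirs.iter().skip(1).any(|d| *d == "src");
    in_tests || in_crate_src
}

/// Returns the (state, description) to post when the diff has no
/// test-relevant paths, or None when the review should actually run.
fn skip_status(changed_paths: &[&str]) -> Option<(&'static str, &'static str)> {
    if changed_paths.iter().any(|p| is_test_relevant(p)) {
        None
    } else {
        Some(("success", "n/a -- no test-relevant changes"))
    }
}
```

Returning a success status instead of posting nothing is what keeps a required adf/test check from blocking docs-only or config-only PRs.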
… #954 Rebased onto main (post #952, #1021 merges). Helper and tests added after spec-validator equivalents without conflict.
… with path filter' (#956) from worktree-agent-a2799aaa into main
…eports' (#1027) from sync/vv-reports into main
…efs #955 Rebased onto main after Phase 2e merge.
…R open with path filter' (#958) from task/955-adf-phase-2d-compliance-watchdog into main
…eFragment' (#999) from worktree-agent-ad4db636 into main
Adds the agent template that the Phase 2 fan-out loop spawns when `pr-security-sentinel` appears in `[pr_dispatch.agents_on_pr_open]`. Mirrors the pr-spec-validator skeleton:
- sonnet (subscription-only) — security-audit needs deeper reasoning than a structural diff scan to spot data-flow vulnerabilities and unsafe-code subtleties; haiku is the wrong tier
- skill_chain = ["security-audit"]
- 2-hour idempotency check via the existing comment-marker pattern (uses agent-scoped marker "Last security-audited commit: <sha>")
- best-effort POST_STATUS helper mirroring build-runner.toml
Path filter (early-exit happy path) — implemented in this script, NOT in the orchestrator:
- skips the LLM call when no `Cargo.toml`, `Cargo.lock`, `crates/**/src/**`, or `**/secrets/**` paths changed
- posts `success` (not no-post) with description "n/a — no security-relevant changes" because `adf/security` becomes a required check on `main` post-deploy; a never-resolved `pending` would block docs-only PRs forever
Verdict parsing prefers an explicit `Verdict:` line; falls back to a `Risk:` line (critical/high → fail, medium → concerns, low → pass); finally falls back to `state=success` "audit posted; manual gate" when neither parses — so the commit status is never left in `pending` even on transient agent failures.
The cron-scheduled `security-sentinel` agent stays in tree for full-repo audits; this PR-event variant is additive and diff-scoped.
Operator follow-up after merge:
1. Rebuild + restart the orchestrator on bigbox so the new dispatch arm goes live.
2. PATCH branch_protections/main to add `adf/security` to required contexts.
Refs #953
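The three-stage verdict fallback described above can be sketched in Rust. The template does this in its script, so the `Verdict` enum, the `ManualGate` variant name, and `parse_verdict` are hypothetical; the precedence order and the risk-to-verdict mapping follow the commit message.

```rust
/// Hypothetical verdict type; ManualGate stands for the final fallback
/// (state=success, "audit posted; manual gate").
#[derive(Debug, PartialEq)]
enum Verdict {
    Pass,
    Concerns,
    Fail,
    ManualGate,
}

fn parse_verdict(output: &str) -> Verdict {
    // 1. Prefer an explicit "Verdict:" line.
    for line in output.lines() {
        if let Some(rest) = line.trim().strip_prefix("Verdict:") {
            match rest.trim().to_ascii_lowercase().as_str() {
                "pass" => return Verdict::Pass,
                "concerns" => return Verdict::Concerns,
                "fail" => return Verdict::Fail,
                _ => {} // unparseable value: keep looking
            }
        }
    }
    // 2. Fall back to a "Risk:" line.
    for line in output.lines() {
        if let Some(rest) = line.trim().strip_prefix("Risk:") {
            match rest.trim().to_ascii_lowercase().as_str() {
                "critical" | "high" => return Verdict::Fail,
                "medium" => return Verdict::Concerns,
                "low" => return Verdict::Pass,
                _ => {}
            }
        }
    }
    // 3. Neither parsed: resolve the status anyway so it never stays pending.
    Verdict::ManualGate
}
```

Because every path returns a verdict, the commit status always resolves, even when the agent output is malformed.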
…2c) Refs #953 Rebased onto main after Phase 2b/d/e merges.
…open with path filter' (#1028) from task/953-adf-phase-2c-security-sentinel into main
…eMeta Refs #1026' (#1029) from sync/1026-query-role-meta into main
…eports for #1026' (#1030) from sync/vv2-1026 into main
…ams SDK tests Refs #1034
… #1012
- Intercept args before clap parsing to apply typo correction, alias expansion, and case-insensitive matching
- Add build_cli_forgiving_parser() with actual CLI subcommands
- Add apply_forgiving_parsing() heuristic for subcommand detection
- Print correction notifications to stderr
- Add 3 integration tests: auto-correction, alias expansion, case-insensitive matching
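One way to combine case-insensitive matching, alias expansion, and typo correction before handing arguments to clap is sketched below. This is not the actual apply_forgiving_parsing() logic: `correct_subcommand`, `edit_distance`, the alias table, and the distance threshold of 2 are all assumptions for illustration.

```rust
/// Levenshtein edit distance, computed row by row.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Resolves user input to a known subcommand (expected in lowercase), trying:
/// case-insensitive exact match, then alias lookup, then the closest
/// subcommand within an assumed edit-distance threshold of 2.
fn correct_subcommand(input: &str, known: &[&str], aliases: &[(&str, &str)]) -> Option<String> {
    if let Some(k) = known.iter().find(|k| k.eq_ignore_ascii_case(input)) {
        return Some(k.to_string());
    }
    if let Some((_, target)) = aliases.iter().find(|(a, _)| a.eq_ignore_ascii_case(input)) {
        return Some(target.to_string());
    }
    let lower = input.to_lowercase();
    known
        .iter()
        .map(|k| (edit_distance(&lower, k), *k))
        .filter(|(d, _)| *d <= 2)
        .min_by_key(|(d, _)| *d)
        .map(|(_, k)| k.to_string())
}
```

A caller would rewrite the matching argument in place and print a correction notice to stderr before invoking the real parser.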
…PI Refs #1011
- Add Robot variant to Command enum with subcommands: capabilities, schemas, examples
- Add RobotFormat enum (json, table, minimal) for --format flag
- Add handle_robot_command() and format_robot_output() helpers
- Wire Robot command in main() before catch-all server/offline dispatch
- Add unreachable! arms in run_offline_command and run_server_command to satisfy exhaustive match (Robot is handled in main)
- Add 4 integration tests: capabilities JSON, schemas JSON, examples text, all formats honoured
- Fix pre-existing compilation errors in terraphim_orchestrator (config clone + concurrency_permit field)
…memory exhaustion Refs #664
- Add concurrency_controller to AgentOrchestrator using existing ConcurrencyController
- Initialize controller from config with sensible defaults (global_max=10)
- Acquire concurrency permit in spawn_agent before spawning; store in ManagedAgent
- Permit auto-released on drop when agent exits or is stopped
- Fix test helpers: update pr_dispatch_per_project instead of pr_dispatch so per-project lookup returns all 3 fan-out entries (build-runner, pr-reviewer, plus spec/test/security/compliance sentinel)
Diff: 552 tests pass, 0 failures.
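The permit-released-on-drop pattern can be sketched with the standard library alone. `ConcurrencyLimiter` and `Permit` are hypothetical stand-ins; the existing ConcurrencyController is probably built on an async semaphore and has a different API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical limiter capping the number of live permits at `global_max`.
struct ConcurrencyLimiter {
    global_max: usize,
    in_use: Arc<AtomicUsize>,
}

/// Holding a Permit counts against the limit; dropping it frees the slot.
struct Permit {
    in_use: Arc<AtomicUsize>,
}

impl ConcurrencyLimiter {
    fn new(global_max: usize) -> Self {
        Self { global_max, in_use: Arc::new(AtomicUsize::new(0)) }
    }

    /// Non-blocking acquire: returns None when the limit is reached.
    fn try_acquire(&self) -> Option<Permit> {
        let mut current = self.in_use.load(Ordering::Acquire);
        loop {
            if current >= self.global_max {
                return None;
            }
            match self.in_use.compare_exchange(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return Some(Permit { in_use: Arc::clone(&self.in_use) }),
                Err(actual) => current = actual, // lost a race: retry
            }
        }
    }

    fn active(&self) -> usize {
        self.in_use.load(Ordering::Acquire)
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        self.in_use.fetch_sub(1, Ordering::AcqRel);
    }
}
```

Storing the permit inside the per-agent struct ties the slot to the agent's lifetime, so no explicit release call is needed when the agent exits or is stopped.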
…ch clap error output Refs #1044
Summary
Fixes the test assertion in exit_codes_integration_test.rs that expected a custom error message when listen mode is invoked with the --server flag. Clap generates its own error message for unexpected arguments, so the assertion is updated to match clap's error output.
Changes
Verification
Refs #1044