feat(thesaurus): flush compiled thesaurus cache after graph markdown edits #852
Open
AlexMikhalev wants to merge 2204 commits into main from
Conversation
…ause - Refs terraphim/adf-fleet#6

Wire the existing CostTracker budget check into the spawn path. `CostTracker::check()` was only consumed by the routing engine to apply BudgetPressure scoring penalties; dispatch still went ahead even when BudgetVerdict::Exhausted came back. Now the gate short-circuits at the top of spawn_agent so an agent whose monthly cap is blown does not run at all this cycle. Also emits a warn-level trace on NearExhaustion so operators see the soft-limit crossing before the hard pause.

Placed after the disk-space guard and before the pre-check gate so we never waste pre-check work on an agent that cannot spawn.

Tests:
- test_spawn_agent_skips_when_budget_exhausted: registers a $1 cap, records $2 spend, asserts spawn is a no-op (no entry in active_agents, Ok returned).
- test_spawn_agent_runs_when_budget_uncapped: confirms an agent with budget_monthly_cents=None spawns even after large recorded spend.
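The gate described above can be sketched roughly as follows. BudgetVerdict, CostTracker's check, and spawn_agent are named in the commit; everything else here (the 80% soft threshold, the cents-based fields) is an illustrative assumption, not the real implementation.

```rust
// Hedged sketch of the spawn-path budget gate. The soft-limit
// threshold (80% of cap) is an assumption for illustration.
enum BudgetVerdict {
    Uncapped,
    Ok,
    NearExhaustion,
    Exhausted,
}

fn check(cap_cents: Option<u64>, spent_cents: u64) -> BudgetVerdict {
    match cap_cents {
        None => BudgetVerdict::Uncapped,
        Some(cap) if spent_cents >= cap => BudgetVerdict::Exhausted,
        // Hypothetical soft threshold at 80% of the cap.
        Some(cap) if spent_cents * 10 >= cap * 8 => BudgetVerdict::NearExhaustion,
        Some(_) => BudgetVerdict::Ok,
    }
}

/// Returns true when the agent may spawn this cycle.
fn budget_gate(cap_cents: Option<u64>, spent_cents: u64) -> bool {
    match check(cap_cents, spent_cents) {
        BudgetVerdict::Exhausted => false, // short-circuit: skip spawn entirely
        BudgetVerdict::NearExhaustion => {
            eprintln!("warn: agent budget near exhaustion"); // stand-in for tracing::warn!
            true
        }
        _ => true,
    }
}
```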
Replace the single top-level `adf/mention_cursor` persistence key with per-project keys `adf/mention_cursor/<project_id>` so each project can advance its repo-wide comment poll cursor independently. Legacy single-project installations pass the synthetic `__global__` project id and continue to work without config changes.

- `MentionCursor::load_or_now(project_id: &str)` and `MentionCursor::save(&self, project_id: &str)` take the project id explicitly; both use the new `cursor_key()` helper. Project id is included on log events for multi-project debuggability.
- Orchestrator swaps `mention_cursor: Option<MentionCursor>` for `mention_cursors: HashMap<String, MentionCursor>` and routes the poll/webhook paths through that map.
- Webhook-dispatched comment ids are now stamped onto every project cursor (or the legacy `__global__` cursor when `projects` is empty) so subsequent polls skip them regardless of which project the comment originated from. The webhook payload does not yet carry project info; stamping every cursor is a safe superset.
- `poll_mentions` continues to use the legacy single-project path under the `__global__` key; per-project fan-out lands in the next commit.

Refs terraphim/adf-fleet#5
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
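The key scheme above reduces to a one-line helper. `cursor_key` is named in the commit; its exact signature, and treating `__global__` as a plain project id, are assumptions:

```rust
// Hedged sketch of the per-project cursor key scheme. The legacy
// synthetic project id simply flows through the same helper.
const LEGACY_PROJECT_ID: &str = "__global__";

fn cursor_key(project_id: &str) -> String {
    format!("adf/mention_cursor/{project_id}")
}
```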
Add `mention::migrate_legacy_mention_cursor(projects)` to copy the legacy top-level `adf/mention_cursor` key to per-project keys `adf/mention_cursor/<project_id>` on first startup after the schema change, then delete the legacy key so the migration is idempotent.

- Migration targets every configured project id plus the synthetic `__global__` id so both multi-project and legacy single-project installations see their cursor preserved.
- Per-project targets that already have a cursor (operator-provided `stat()` succeeds) are left untouched -- the poller's advance wins over the pre-migration snapshot.
- Unparsable legacy cursors are deleted rather than propagated: the poller will then synthesise fresh per-project `now()` cursors on first use, preserving the replay-storm guard.
- Storage errors are logged but non-fatal; the reconciliation loop continues regardless so a transient sqlite hiccup cannot block orchestrator startup.
- Wired into `AgentOrchestrator::run()` between telemetry restore and the safety-agent spawn so it runs exactly once per process and before any poll tick could observe the old key.

Refs terraphim/adf-fleet#5
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ws - Refs terraphim/adf-fleet#6

Add a provider-level spend tracker that complements the per-agent CostTracker. Tracks accumulated USD spend per external LLM provider (opencode-go, kimi-for-coding, ...) in tumbling UTC hour and day buckets and returns the existing BudgetVerdict so the routing engine can uniformly gate on either signal.

Design choices (audited against existing primitives):
- Re-use BudgetVerdict from cost_tracker (no parallel enum).
- Do NOT extend TokenBucketLimiter: it is async, uses std::Instant (not serialisable), and tracks a single minute window. A cost-based tumbling-bucket tracker with persistence would require rewriting half of it; a new focused module is clearer.
- Tumbling UTC buckets mirror CostTracker's calendar-month reset pattern, which operators already understand.
- std::sync::Mutex for the two per-provider window cells -- critical sections are tiny; async locks would add needless complexity.
- Atomic JSON snapshot write via `.tmp` + rename for crash safety.
- apply_snapshot discards state for providers that are no longer in the current config so stale entries do not linger after edits.

Public API mirrors CostTracker: `new`, `with_persistence`, `record_cost`, `check`, `snapshot`, `persist`, plus `record_cost_at` and `check_at` test hooks so boundary-crossing tests do not depend on wall-clock drift. A small `provider_has_budget` helper is exposed for the routing filter that lands next.

Tests (9): unknown-provider uncapped; hour-window exhaustion; hour reset on the next hour; day cap trips across hours; day reset on the next day; snapshot round-trip via tempfile; combine_verdicts picks the worst signal; helper reports exhaustion; stale snapshot entry for a removed provider is discarded on reload.
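A tumbling bucket of this kind can be sketched minimally: the bucket id is the epoch timestamp divided by the window length, and the counter resets whenever the id rolls over. WindowState's real fields and the `record_cost_at`-style timestamp hook are assumptions based on the commit text:

```rust
// Hedged sketch of a tumbling UTC window. The real tracker keys
// buckets off UTC calendar boundaries; epoch-second division stands
// in here and agrees with it for hour/day windows.
struct WindowState {
    bucket_id: u64, // hour (or day) number since the Unix epoch
    spent_usd: f64,
}

impl WindowState {
    // Record spend at an explicit timestamp (the test-hook style);
    // the window resets lazily when the bucket rolls over.
    fn record_at(&mut self, epoch_secs: u64, bucket_secs: u64, usd: f64) {
        let bucket = epoch_secs / bucket_secs;
        if bucket != self.bucket_id {
            self.bucket_id = bucket; // tumbling reset on the boundary
            self.spent_usd = 0.0;
        }
        self.spent_usd += usd;
    }
}
```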
…t gitea repos

Rewrite poll_mentions to build a list of (project_id, gitea_cfg, mention_cfg) targets from config.projects. Each project's gitea config is polled with its own MentionConfig override (falling back to top-level mentions, then default). When projects is empty, falls back to legacy __global__ target using the top-level gitea block.

Active mention-agent count now filters by the agent definition's project field when running per-project; legacy mode counts all spawned_by_mention agents.

Refs terraphim/adf-fleet#5
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…r, migration

New integration test file mention_multi_repo_tests.rs with 21 tests covering:

Regex (parse_mention_tokens):
- unqualified / qualified capture
- mixed qualified + unqualified in one comment
- rejection of uppercase / over-long project prefix
- trailing punctuation and plain @-mentions

resolve_mention (project-aware):
- qualified exact match, not found, ambiguous
- unqualified legacy-mode matches any project
- unqualified hinted-project preference
- unqualified fallback to unbound agent
- unqualified ambiguous (hinted + unbound) returns None
- unqualified unknown name returns None

parse_mentions: stamps legacy and hinted project_id onto DetectedMention
MentionCursor: in-memory per-project isolation, monotonic advance_to
migrate_legacy_mention_cursor: no-op safety under memory-only storage

Refs terraphim/adf-fleet#5
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…+ routing - Refs terraphim/adf-fleet#6

Add [[providers]] config block and `provider_budget_state_file` pointing at an optional JSON snapshot. When configured, the orchestrator builds a `ProviderBudgetTracker` and threads it through the routing engine via `RoutingDecisionEngine::with_provider_budget`. The engine now:

- Strips `Exhausted` providers from the candidate set before scoring (with a warning log and rationale citing the provider key).
- Multiplies scores by 0.6 for `NearExhaustion` providers so healthier alternatives win without hard-banning the near-limit one.
- Extends `budget_influenced` so observers see that either agent-level or provider-level pressure biased the selection.

A helper `provider_key_for_model(&str)` maps model strings (`opencode-go/model`, bare `sonnet`, ...) to the budget-bucket key. Three routing unit tests cover exhausted-drop, near-exhaustion deprioritisation, and uncapped-provider pass-through.
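The drop-then-deprioritise pass above can be sketched as a pure function over scored candidates. Only the 0.6 multiplier and the Exhausted/NearExhaustion behaviour come from the commit; Candidate and the verdict lookup closure are hypothetical stand-ins for the real engine types:

```rust
// Hedged sketch of the provider-budget routing adjustment: Exhausted
// candidates are removed, NearExhaustion candidates are scaled by 0.6.
#[derive(PartialEq)]
enum BudgetVerdict { Uncapped, Ok, NearExhaustion, Exhausted }

struct Candidate {
    provider_key: &'static str,
    score: f64,
}

fn apply_provider_budget(
    mut candidates: Vec<Candidate>,
    verdict_for: impl Fn(&str) -> BudgetVerdict,
) -> Vec<Candidate> {
    // Hard gate: exhausted providers never reach scoring.
    candidates.retain(|c| verdict_for(c.provider_key) != BudgetVerdict::Exhausted);
    for c in &mut candidates {
        if verdict_for(c.provider_key) == BudgetVerdict::NearExhaustion {
            c.score *= 0.6; // deprioritise, don't hard-ban
        }
    }
    candidates
}
```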
…rraphim/adf-fleet#6
…w - Refs terraphim/adf-fleet#6
The bare-name branch previously returned true for any unknown id, so
model = "minimax" (bare, no slash) silently bypassed the C3 banlist.
Switch to an explicit allow-list: only CLAUDE_CLI_BARE_MODELS,
ANTHROPIC_BARE_PROVIDERS, and ALLOWED_PROVIDER_PREFIXES pass as bare
names; everything else rejects.
Apply the same tightening in validate_model_provider so bare banned
ids are caught at config load time, not only at runtime.
Adds unit tests covering:
- is_allowed_provider("minimax") -> false
- is_allowed_provider("opencode") -> false
- is_allowed_provider("unknown") -> false
- validate_model_provider rejects bare "minimax"
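The switch from deny-list to allow-list can be sketched as below. The three list names are taken from the commit; their contents here are illustrative assumptions, not the real lists:

```rust
// Hedged sketch of the explicit bare-name allow-list. Unknown bare
// ids now reject, closing the hole that let bare "minimax" through.
const CLAUDE_CLI_BARE_MODELS: &[&str] = &["sonnet", "opus", "haiku"];
const ANTHROPIC_BARE_PROVIDERS: &[&str] = &["anthropic"];
const ALLOWED_PROVIDER_PREFIXES: &[&str] = &["opencode-go", "kimi-for-coding"];

fn is_allowed_bare(id: &str) -> bool {
    // Membership in any of the three lists is required; there is no
    // default-true branch any more.
    CLAUDE_CLI_BARE_MODELS.contains(&id)
        || ANTHROPIC_BARE_PROVIDERS.contains(&id)
        || ALLOWED_PROVIDER_PREFIXES.contains(&id)
}
```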
…record_telemetry - Refs terraphim/adf-fleet#6
record_telemetry was the only place where real dispatch cost arrived
from CLI completion events, and it only fed cost_tracker. The new
ProviderBudgetTracker never received spend, so Layer 3 of the
subscription gate (Exhausted drop / NearExhaustion penalty in the
routing engine) was read-only at runtime.
Extend record_telemetry to also call provider_budget_tracker.record_cost
for each event with cost > 0, using provider_key_for_model to derive
the tracker key from the model string. Unknown or unconfigured
providers remain no-ops (tracker returns Uncapped and skips).
Adds:
- record_telemetry_for_test and provider_budget_tracker accessors
(doc-hidden) so integration tests can exercise the wiring without
spinning up the full reconcile loop
- record_telemetry_feeds_provider_budget_tracker: confirms a real
CompletionEvent drives the hour/day counters and trips Exhausted
- record_telemetry_ignores_zero_cost_and_unknown_model: guards
against regressions that would poison unrelated buckets
…efs terraphim/adf-fleet#6

The snapshot file referenced by provider_budget_state_file was only ever read at startup (via with_persistence); no production call site ever invoked tracker.persist(), so the cross-restart promise was structural only.

Call tracker.persist() at the end of every reconcile tick (step 16, paired with step 15 telemetry persistence), and on graceful shutdown after the main select! loop exits. Failures log a warning and do not abort the tick, matching the fire-and-forget pattern used for the telemetry store.

The persistence round-trip is covered by the existing provider_budget_persistence_round_trip_via_orchestrator test, which exercises a simulated restart end-to-end.
…im/adf-fleet#6

Previously each provider kept Mutex<WindowState> for hour and day separately, so a concurrent recorder could interleave: update hour, release, acquire day, update. An observer calling check() between the two locks saw the hour bucket advanced but not the day bucket, temporarily violating the day >= sum(hours-in-day) invariant.

Collapse both windows behind a single Mutex<ProviderWindows> so record_cost_at and check_at observe a consistent snapshot across both windows. update_window/check_window refactored to operate on borrowed WindowState slots held by the caller's single lock.

No user-visible behaviour change for the single-threaded tests; this only tightens the invariant under concurrent load.
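The single-lock shape can be sketched as follows; field names on ProviderWindows are assumptions, but the structure (one Mutex guarding both windows) is exactly what the commit describes:

```rust
use std::sync::Mutex;

// Hedged sketch of collapsing the hour and day windows behind one
// lock so readers can never see the hour advanced while the day lags.
struct WindowState { spent_usd: f64 }

struct ProviderWindows {
    hour: WindowState,
    day: WindowState,
}

struct ProviderBudget {
    windows: Mutex<ProviderWindows>, // one mutex guards both windows
}

impl ProviderBudget {
    fn record_cost(&self, usd: f64) {
        let mut w = self.windows.lock().unwrap();
        w.hour.spent_usd += usd; // both updates happen inside the
        w.day.spent_usd += usd;  // same critical section
    }

    fn check(&self) -> (f64, f64) {
        let w = self.windows.lock().unwrap();
        (w.hour.spent_usd, w.day.spent_usd) // consistent snapshot
    }
}
```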
…mant models - Refs terraphim/adf-fleet#6
…spatch path - Refs terraphim/adf-fleet#5
P1: `resolve_mention` and `parse_mention_tokens` now called from both
dispatch sites, not just tests.
Poll path (`poll_mentions_for_project`):
- Before the AdfCommandParser loop, run `parse_mention_tokens` on each
comment body and dispatch qualified `@adf:project/name` tokens directly
via `resolve_mention(Some(proj), project_id, agent, agents)`.
- Replace `agents.iter().find(|a| a.name == agent_name)` with
`resolve_mention(None, project_id, agent_name, agents)` so unqualified
mentions in multi-project mode prefer the hinted-project agent.
Webhook path (`handle_webhook_dispatch`):
- Add `detected_project: Option<String>` to `WebhookDispatch::SpawnAgent`.
- In `handle_gitea_webhook`, parse mention tokens before the Aho-Corasick
pass; collect qualified tokens into separate dispatches (qualified
mentions are not substrings of `@adf:{name}` patterns).
- Replace `.find(|a| a.name == agent_name)` with
`resolve_mention(detected_project.as_deref(), LEGACY_PROJECT_ID, ...)`.
P2 (both):
- Deduplicate `LEGACY_PROJECT_ID` in mention.rs: replace local `const`
with `pub(crate) use crate::dispatcher::LEGACY_PROJECT_ID`.
- Emit `tracing::debug!` for projects that lack a `gitea` block so
operators see which projects are skipped during mention polling.
Tests: +3 dispatch-wiring integration tests (24 total in
mention_multi_repo_tests; 513 total in crate, 0 failed).
Co-Authored-By: Terraphim AI <noreply@terraphim.ai>
…epo support' (#619) from task/adf-fleet-5-mention-multi-repo into main
…ead CostTracker path' (#620) from task/adf-fleet-6-provider-gate into main
…, spawner race

Five test fixes that surfaced after the auto-route landing.

- terraphim_agent shell_dispatch: try /usr/bin/false (macOS) before /bin/false (Linux) so the exit-code-capture test works on both.
- terraphim_mcp_server integration_test: pass role=Default explicitly in test_mcp_server_integration and test_search_pagination so the new auto-route text content does not throw off content-shape assertions.
- terraphim_mcp_server mcp_rolegraph_validation_test: count resource contents directly (filter c.as_resource().is_some()) instead of content.len()-1, robust to the auto-route prepend.
- terraphim_spawner: replace try_recv-after-sleep with a timeout-bounded recv loop and write a sleep-then-pwd shell script so the broadcast channel always has a subscriber by the time the child writes.
- terraphim_service: drop the deprecated JMAP_MISSING_TOKEN_DOWNWEIGHT re-export and constant -- design kept it for fixture link-compat, no fixture used it, removing it removes the deprecation warning.

Refs #617
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tate

Two clippy warnings introduced by the auto-route work:

- terraphim_service auto_route: tied.iter().any(|n| *n == sel) is cleaner as tied.contains(&sel).
- terraphim_mcp_server lib: &*self.config_state is auto-deref'd to &self.config_state.

clippy clean for the affected crates.

Refs #617
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…fixtures - Refs terraphim/adf-fleet#9

- PEP 723 inline-metadata script using uv run (tomllib stdlib + tomli-w)
- --input (repeatable), --output-dir, --base-output, --dry-run flags
- project_id derived from filename stem (orchestrator.toml -> terraphim)
- Fixture files: orchestrator.toml (3 agents, 1 flow), odilo-orchestrator.toml (2 agents), banned-provider.toml

Co-Authored-By: Terraphim AI <noreply@anthropic.com>
…ection - Refs terraphim/adf-fleet#9

Already bundled in migration script:
- build_project_entry: extracts working_dir/gitea/quickwit/workflow/mentions into [[projects]]
- build_agent_entries: injects project = "<project_id>" into each [[agents]] entry
- build_flow_entries: injects project = "<project_id>" into each [[flows]] entry
- build_base_doc: assembles global settings + include = ["conf.d/*.toml"]
- Idempotent: tomli_w serialises deterministically; running twice is byte-identical

Co-Authored-By: Terraphim AI <noreply@anthropic.com>
…terraphim/adf-fleet#9

validate_models() checks model/fallback_model fields on all [[agents]] and compound_review.

Banned prefixes: opencode/ github-copilot/ google/ huggingface/

Exits non-zero with message: ERROR: Agent 'NAME' uses banned provider 'VALUE'

Co-Authored-By: Terraphim AI <noreply@anthropic.com>
…rejection - Refs terraphim/adf-fleet#9

- 6 tests covering round-trip structure, idempotence, banned-provider rejection, flow project injection, dry-run no-write, github-copilot/ ban
- Fixed non-deterministic dict key ordering in build_base_doc() using sorted()
- Tests invoke script via subprocess (black-box, no mocks)
- All 6 pass: uv run --with pytest pytest tests/test_migrate.py -v

Co-Authored-By: Terraphim AI <noreply@anthropic.com>
…s - Refs terraphim/adf-fleet#9
Reference output generated by:
uv run migrate-to-confd.py \
  --input tests/fixtures/orchestrator.toml \
  --input tests/fixtures/odilo-orchestrator.toml \
  --output-dir tests/expected/ \
  --base-output tests/expected/orchestrator.toml
- tests/expected/orchestrator.toml: base config with include = ["conf.d/*.toml"]
- tests/expected/terraphim.toml: [[projects]], 3 [[agents]], 1 [[flows]] with project="terraphim"
- tests/expected/odilo.toml: [[projects]], 2 [[agents]] with project="odilo"
Co-Authored-By: Terraphim AI <noreply@anthropic.com>
…er - Refs terraphim/adf-fleet#7

Introduces ProviderErrorSignatures config schema (throttle/flake regex lists) alongside a compiled runtime layer that classifies spawned-agent stderr into Throttle, Flake, or Unknown verdicts. Throttle beats Flake when both match so a 'rate-limit timeout' line is not treated as retryable.

Adds ProviderBudgetTracker::force_exhaust so Throttle verdicts can push hour+day windows past their caps, forcing the routing gate to drop the provider until the next UTC window rolls over.

Patches the three in-tree ProviderBudgetConfig literals (routing.rs + tests) to carry the new field explicitly rather than relying on Default.
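The Throttle-beats-Flake precedence reduces to checking the throttle list first. The real implementation compiles operator-supplied regex lists; plain substring matching stands in here so the sketch stays std-only:

```rust
// Hedged sketch of the stderr classifier's verdict precedence.
// Pattern matching is substring-based here as a stand-in for the
// real compiled regex lists.
#[derive(Debug, PartialEq)]
enum Verdict { Throttle, Flake, Unknown }

fn classify(stderr: &str, throttle: &[&str], flake: &[&str]) -> Verdict {
    if throttle.iter().any(|p| stderr.contains(p)) {
        return Verdict::Throttle; // checked first: beats Flake
    }
    if flake.iter().any(|p| stderr.contains(p)) {
        return Verdict::Flake;
    }
    Verdict::Unknown // fail-safe default
}
```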
… + drift-detection test - Refs terraphim/adf-fleet#9

Add minimax/ to BANNED_PREFIXES (was missing, causing divergence from Rust BANNED_PROVIDER_PREFIXES). Add test_banned_list_matches_rust that parses the Rust source and asserts list equality to prevent future drift. Add test_minimax_bare_prefix_rejected as regression coverage.

Co-Authored-By: Terraphim AI <noreply@anthropic.com>
…sformation - Refs terraphim/adf-fleet#9

Update fixture orchestrator.toml to use the correct WorkflowConfig schema (enabled, workflow_file, tracker sub-table). The previous schema had wrong field names (gitea_base_url, gitea_token) that did not match the Rust struct, causing adf --check to fail on the generated output. Regenerate tests/expected/terraphim.toml from the corrected fixture.

Co-Authored-By: Terraphim AI <noreply@anthropic.com>
… banned-list drift tests - Refs terraphim/adf-fleet#9

Add three new tests:
- test_banned_list_matches_rust: parses Rust BANNED_PROVIDER_PREFIXES and asserts script BANNED_PREFIXES matches after normalising trailing '/'.
- test_minimax_bare_prefix_rejected: regression coverage for minimax/ ban.
- test_adf_check_accepts_generated_output: runs adf --check as a subprocess on the temp output of a full migration; asserts exit 0.

Co-Authored-By: Terraphim AI <noreply@anthropic.com>
…n exit path - Refs terraphim/adf-fleet#7
Runs the per-provider classifier on stderr after the existing KG-based
ExitClass match so we cover two blind spots of the code-based check:
* providers whose CLI exits 0 on quota hits ("returning partial
output") still trip the breaker and force hour+day budget exhaustion;
* providers whose CLI emits bespoke error text that ExitClassifier
doesn't know about are caught by operator-tunable regex lists.
Throttle verdict records a provider failure and calls the new
ProviderBudgetTracker::force_exhaust so the routing gate drops the
provider until the next UTC window rolls.
Flake verdict only logs; dispatch already retries the next pool entry.
Unknown (with real stderr + failure-shaped exit) opens one `[ADF]`
Gitea issue via the OutputPoster's default tracker, deduped in-process
by error_signatures::unknown_dedupe_key so we don't spam fleet-meta
with duplicates for the same stderr shape. Unknown is also counted as
a soft failure so a pathological provider eventually opens the breaker.
…ust coverage - Refs terraphim/adf-fleet#7
Captures realistic stderr fixtures for every subscription-only provider
(claude-code, opencode-go, zai-coding-plan, kimi-for-coding) under
tests/fixtures/stderr/ -- 429 / usage-limit / timeout / EOF / insufficient
balance / quota-exceeded / unknown panic -- and exercises the classifier
end-to-end with the regex lists an operator would ship in
orchestrator.toml.
Covers:
* per-provider fixture -> expected verdict matrix;
* throttle beats flake when both patterns match;
* missing provider in the map falls back to Unknown (fail-safe);
* line-by-line capture path via classify_lines;
* dedupe key collapses minor shape variance (case, trailing newline,
extra detail) so retries don't spam fleet-meta.
Also adds three unit tests for ProviderBudgetTracker::force_exhaust so
the Throttle -> breaker + budget pairing is protected:
* force_exhaust trips both windows even without recorded cost;
* force_exhaust is a no-op on uncapped providers (intentional);
* force_exhaust silently ignores unknown provider ids.
No mocks; every stderr line is captured text from real CLI runs.
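A dedupe key that collapses case, trailing-newline, and extra-detail variance (as the tests above require) can be sketched as below. `unknown_dedupe_key` is named in the commit; the exact normalisation rules here are assumptions:

```rust
// Hedged sketch of error_signatures::unknown_dedupe_key. Retries with
// the same stderr shape map to one key so fleet-meta gets one issue.
fn unknown_dedupe_key(provider: &str, stderr: &str) -> String {
    let normalised = stderr.trim().to_ascii_lowercase();
    // Keep only the first line: later "extra detail" lines vary per run.
    let first_line = normalised.lines().next().unwrap_or("");
    format!("{provider}:{first_line}")
}
```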
#844)

* feat(drift-detector): agent work [auto-commit]
* feat(security-sentinel): agent work [auto-commit]
* feat(spec-validator): agent work [auto-commit]
* feat(codebase-eval): add manifest types and TOML loader

Refs #680

New crate terraphim_codebase_eval with typed manifest models for the before/after AI-agent codebase evaluation flow:
- HaystackDescriptor, RoleDefinition, QuerySpec, MetricRecord, Thresholds
- EvaluationManifest with validate() for role_id consistency
- load_manifest() auto-detects format by extension (.toml)
- thiserror ManifestError variants: InvalidPath, ParseError, Validation
- 11 unit tests + 1 doc test covering round-trip, validation, edge cases
- Fixture at fixtures/manifest-minimal.toml
- Zero clippy warnings, passes cargo fmt
… Refs #796

- Add opencode-connector and codex-connector feature flags to Cargo.toml
- Implement OpenCodeConnector parsing ~/.local/state/opencode/prompt-history.jsonl
- Implement CodexConnector parsing ~/.codex/sessions/*.jsonl with session_meta and response_item entries
- Register both connectors in ConnectorRegistry behind feature flags
- Add comprehensive unit tests for both connectors with sample JSONL data
- Mirror terraphim-session-analyzer connector patterns for zero drift
- Return empty thesaurus for roles without KG instead of hard error
- Support GITHUB_TOKEN env var for authenticated update checks
- Improve 403 error message to explain rate limiting
- Downgrade startup update check error from stderr to debug log

Refs #921
Refs #914
Refs #914 #915 #916 #917 #918 #919
- Change exit code from 1 to 2 for listen --server rejection
- Error message and stderr output unchanged
…lt Refs #923

- Initialise current_role from service.get_selected_role() in run()
- REPL now shows the actual configured role on startup
…fs #892

Wire classify_error() into the main error path for offline and server commands. Split ensure_thesaurus_loaded to error on kg:null (exit 3) vs degrade gracefully on missing files. Add --fail-on-empty flag to search subcommand (exit 4). Add integration tests and no_kg fixture.
…efs #936

Update spec checkboxes for Task 1.4 (REPL integration) and Task 1.5 (token budget management) from unchecked to checked. All subtasks and acceptance criteria verified against codebase with passing tests. Phase 1-5 disciplined development: research, design, implementation, verification (22 tests pass), validation (stakeholder approved).
Implements StatusReporter primitive for ADF Phase 1: posts a commit
status to POST /api/v1/repos/{owner}/{repo}/statuses/{sha} with
configurable retry on 5xx/429 (default backoff 1s, 2s, 4s) and no
retry on 4xx. Description is truncated to 140 chars on a Unicode
boundary; target_url=None omits the field entirely.
Tests use a real loopback HTTP server bound via axum on an ephemeral
127.0.0.1 port — no function mocks. Backoff is overridable via
GiteaTracker::with_status_backoff so retry tests run in milliseconds.
Refs #928
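The 140-char Unicode-boundary truncation above can be sketched with `char_indices`, which guarantees cuts land on char boundaries. Whether the real StatusReporter counts chars or bytes is an assumption; this version counts chars:

```rust
// Hedged sketch of description truncation on a Unicode boundary.
// char_indices yields byte offsets that are always valid slice points.
fn truncate_description(desc: &str, max_chars: usize) -> &str {
    match desc.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &desc[..byte_idx], // cut on a char boundary
        None => desc, // already short enough
    }
}
```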
After the pr-reviewer is successfully spawned in handle_review_pr, post a Gitea commit status with state=pending and context=adf/pr-reviewer against the PR head SHA so the PR's checks gate in the Gitea UI shows the agent is running. The agent itself transitions the status to success/failure/error when its task script finishes.

Skip-paths (no pr-reviewer configured, banned provider, budget exhausted) remain status-silent -- posting pending on a check that never resolves would block merges indefinitely.

Helper post_pr_reviewer_pending_status is best-effort: when no workflow tracker is configured (e.g. in unit tests) or the API call fails, we log and return without surfacing the error.

Refs #928
After the structural-pr-review claude invocation completes, parse the Confidence Score from the agent output and render file, then POST a Gitea commit status with adf/pr-reviewer context against the PR head SHA so the PR check gate reflects the verdict:

- 4-5/5 -> success
- 3/5 -> success (with 'concerns flagged' description)
- 0-2/5 -> failure
- parse fail -> error

The status post is best-effort -- a curl failure logs but does not fail the agent run, since the gtr comment already carries the full review.

Refs #928
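The score-to-state table above maps onto a small match. The real logic lives in the agent's shell script; this Rust stand-in only mirrors the mapping, and the description strings for the non-"concerns flagged" arms are assumptions:

```rust
// Hedged sketch of the confidence-score -> commit-status mapping.
// CommitState mirrors Gitea's status states.
#[derive(Debug, PartialEq)]
enum CommitState { Success, Failure, Error }

fn state_for_score(score: Option<u8>) -> (CommitState, &'static str) {
    match score {
        Some(4..=5) => (CommitState::Success, "review passed"),
        Some(3) => (CommitState::Success, "concerns flagged"),
        Some(0..=2) => (CommitState::Failure, "review failed"),
        _ => (CommitState::Error, "could not parse confidence score"),
    }
}
```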
…uild-runner

Implements Phase 3 of the ADF-replaces-Gitea-Actions plan:

- New `WebhookDispatch::Push` variant in webhook.rs carrying project, ref_name, before/after SHA, pusher login and the deduplicated union of files added/removed/modified across all commits in the payload.
- `GiteaPushPayload`/`GiteaPusher`/`GiteaPushCommit` deserialisation structs and a `handle_push_event` axum handler. Detection prefers the `X-Gitea-Event: push` header and falls back to JSON shape sniffing (`ref` + `before` + `after` + `commits`).
- `aggregate_files_changed` helper preserves first-seen insertion order while deduplicating across commits.
- Mirror `DispatchTask::Push` variant in the dispatcher with priority 350 so deterministic build verdicts race ahead of LLM PR review.
- `handle_push` orchestrator method mirrors `handle_review_pr` shape: resolves the project's `build-runner` agent (warn-and-skip when absent so repos without one don't break the drain loop), runs the subscription allow-list and monthly budget gates, logs an observability routing decision row (model = n/a, cost = 0 since build-runner is bash, not LLM), then spawns it with `ADF_PUSH_*` env injection (SHA, REF, PROJECT, BEFORE_SHA, PUSHER, FILES).
- Wire the new arm into `handle_webhook_dispatch` and the dispatcher drain loop alongside ReviewPr/AutoMerge/PostMergeTestGate.
- `CommandKind::Push` plus a normalise arm in control_plane events.

Tests (cargo test -p terraphim_orchestrator --test webhook_tests):
- push_event_parses_correct_shape
- push_event_files_changed_aggregated_from_commits
- push_event_zero_commits_yields_empty_files_changed
- push_event_rejected_without_hmac
- push_event_malformed_payload_returns_200

Fixtures: push_main.json (2 commits, overlapping path sets) and push_tag_zero_commits.json (tag push edge case).

Refs #929
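The order-preserving dedup of `aggregate_files_changed` can be sketched with a seen-set plus an output vector. The simplified commit struct stands in for the real GiteaPushCommit:

```rust
// Hedged sketch of aggregate_files_changed: union of added/removed/
// modified paths across commits, first-seen insertion order kept.
struct PushCommit {
    added: Vec<String>,
    removed: Vec<String>,
    modified: Vec<String>,
}

fn aggregate_files_changed(commits: &[PushCommit]) -> Vec<String> {
    let mut seen = std::collections::HashSet::new();
    let mut out = Vec::new();
    for c in commits {
        for path in c.added.iter().chain(&c.removed).chain(&c.modified) {
            if seen.insert(path.clone()) {
                out.push(path.clone()); // first occurrence wins the slot
            }
        }
    }
    out
}
```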
Adds the bash-only ADF agent template that handle_push spawns on a Gitea push webhook event. NO LLM, NO model, NO skill_chain -- pure shell wrapping `rch exec`. Compute path: rch dispatches to bigbox + SeaweedFS S3 cache, the same pipeline as .github/workflows/ci-pr.yml (82.83 percent cache hit rate observed on GitHub Actions run 24929848023).

Cargo step list lifted verbatim from ci-pr.yml:
- cargo fmt --all -- --check
- cargo clippy --workspace --all-targets -- -D warnings
- cargo test --workspace --no-fail-fast

Posts adf/build commit status (pending -> success/failure) directly to the Gitea Commit Status API. Uses curl rather than the (Phase 1) set_commit_status Rust wrapper so this agent template stands alone on the Phase 3 branch.

The toml LANDS in this PR but is NOT yet wired into agents_on_pr_open -- that wiring is Phase 2's responsibility per the locked rollout sequencing in .docs/plan-adf-agents-replace-gitea-actions.md section 6.

Refs #929
New binary at crates/terraphim_orchestrator/src/bin/adf-ctl.rs providing four subcommands via SSH+curl to the webhook endpoint:

- trigger <agent> [--context] [--wait] [--timeout] - fire any agent/persona
- status [--since] - recent spawns and exits from journalctl
- cancel <agent> - locate worktrees and PIDs for manual kill
- agents - list all configured agent names from orchestrator TOML

Key design decisions:
- issue_number=0 in synthetic webhook payload bypasses should_skip_dispatch
- JSON piped via --data-binary @- stdin (avoids shell quoting)
- HMAC-SHA256 computed locally; secret resolved from flag > env > SSH TOML read
- SSH transport via BatchMode=yes; no tunnel management needed
- status/cancel print explicit [best-effort] banner

8 unit tests pass; zero clippy warnings.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces a top-level [pr_dispatch] config block listing the agents that should be dispatched on a Gitea pull_request.opened event. Each entry pairs an agent name with the Gitea commit-status context the orchestrator must POST `pending` for after a successful spawn.

When the [pr_dispatch] block is absent, OrchestratorConfig::agents_on_pr_open() falls back to a legacy default with a single pr-reviewer entry -- every existing deployment continues to work unchanged after Phase 2 lands.

Updates all OrchestratorConfig literal init sites (lib + integration tests) to set pr_dispatch: None.

Refs #944
Refs adf-phase-2
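A block of that shape might look like the sketch below. The key names (`agent`, `status_context`) and the array-of-tables layout are assumptions; only the pairing of agent name with status context, and the `adf/pr-reviewer` context, come from the surrounding commits.

```toml
# Hypothetical sketch of [pr_dispatch] entries; key names are assumed.
[[pr_dispatch]]
agent = "pr-reviewer"
status_context = "adf/pr-reviewer"

[[pr_dispatch]]
agent = "build-runner"
status_context = "adf/build"
```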
…F Phase 2)
Refactors handle_review_pr to iterate the agents_on_pr_open list
introduced in the previous commit. Each entry is dispatched through one
of two helpers based on agent name:
- dispatch_pr_reviewer_for_pr: extracted unchanged from the legacy
handle_review_pr body. Drives the routing engine, applies the static
+ routed allow-list gates, the per-agent monthly budget gate, then
spawns with the existing ADF_PR_* env injection.
- dispatch_build_runner_for_pr: new helper mirroring handle_push.
Skips the routing engine (build-runner is bash, no LLM, no model)
and logs a synthetic 'model = n/a' row for dashboard parity. Injects
ADF_PUSH_* env using refs/pull/<n>/head as the synthetic ref so the
same task script handles both push events and PR opens.
Each helper returns Ok(true) on successful spawn or Ok(false) when
gated out. The caller posts a 'pending' commit status only when the
helper returns true — a pending from a skipped agent would block the
PR forever.
post_pr_reviewer_pending_status is generalised into post_pending_status
taking context + description as parameters, so the same helper covers
every fan-out entry. The Phase 1 call site passes 'adf/pr-reviewer'
and 'pr-reviewer dispatched' to preserve the existing behaviour.
Tests:
- handle_review_pr_spawns_both_build_runner_and_pr_reviewer
- handle_review_pr_injects_per_agent_env_correctly (verifies
ADF_PUSH_REF=refs/pull/641/head and ADF_PR_NUMBER=641 reach
the correct child processes)
- handle_review_pr_skips_missing_agents (no panic, no hung pending)
- handle_review_pr_pending_status_posted_per_agent (axum loopback,
asserts two distinct context POSTs both with state=pending)
- handle_review_pr_skipped_agent_does_not_post_pending (banned model
on build-runner; asserts adf/build pending NOT posted)
Refs #944
Refs adf-phase-2
…edits

Implement lazy, on-demand cache invalidation for compiled thesauri by storing a content hash of the source knowledge-graph markdown files alongside the cached thesaurus. When a thesaurus is loaded, compare the stored hash against a freshly computed hash of the source files; on mismatch, invalidate the cache and rebuild. Also expose a manual cache flush CLI subcommand.

New modules:
- terraphim_automata::hash - graph directory hash computation (twox-hash)
- terraphim_storage::hash_store - hash storage/retrieval in KV store

Changes:
- Add hash check to TerraphimService::ensure_thesaurus_loaded()
- Add invalidate_thesaurus_cache() and flush_thesaurus_cache()
- Clear #[cached] memoization on persistent cache invalidation
- Add 'cache flush [--role ROLE]' to terraphim-agent CLI and REPL
- Add regression test for stale-cache scenario

Refs #945
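The load-time staleness check above reduces to hash-and-compare. The real code hashes the markdown directory with twox-hash; std's DefaultHasher stands in here, and both function names are illustrative rather than the crate's actual API:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hedged sketch of the stale-cache check: hash path + content of
// every source file and compare against the stored digest.
fn content_hash(sources: &[(&str, &str)]) -> u64 {
    let mut h = DefaultHasher::new();
    for (path, body) in sources {
        path.hash(&mut h); // path and content both feed the digest
        body.hash(&mut h);
    }
    h.finish()
}

/// True when the cached thesaurus must be invalidated and rebuilt.
fn cache_is_stale(stored_hash: Option<u64>, sources: &[(&str, &str)]) -> bool {
    match stored_hash {
        Some(h) => h != content_hash(sources), // mismatch => rebuild
        None => true, // no recorded hash: treat as stale
    }
}
```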
Implements automatic thesaurus cache invalidation when knowledge graph markdown source files are edited.
Changes
Verification
Refs #945