Releases: agent-fox-dev/agent-fox

v3.4.1

26 Apr 19:33

What's Changed

Bug Fixes

  • fix(#547): Add errata markdown-to-DuckDB indexing path — errata files in docs/errata/ are now indexed into the DuckDB errata table, closing the write-only gap where errata were created but never retrievable
  • fix(#548): Fix audit-review task_group partitioning causing supersession silos — audit findings no longer use a hardcoded empty task_group, enabling proper supersession across review modes
  • fix(#549): Move steering.md from .agent-fox/specs/ to .agent-fox/
  • fix(#546): Fix ruff format violations in harvest warning strings
  • fix(#545): Serialize AuditJsonlSink writes with threading.Lock to prevent interleaved concurrent appends
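
The locking fix in #545 can be pictured with a minimal sketch. `LockedJsonlSink` below is a hypothetical stand-in, not the project's `AuditJsonlSink`; the method name `write` and the open-per-append strategy are assumptions made for illustration only.

```python
import json
import threading
from pathlib import Path


class LockedJsonlSink:
    """Hypothetical stand-in for AuditJsonlSink (illustration only)."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._lock = threading.Lock()

    def write(self, event: dict) -> None:
        # Serialize to text outside the lock; only the append is critical.
        line = json.dumps(event, sort_keys=True) + "\n"
        with self._lock:
            # One writer at a time, so each event lands as one whole line.
            with self._path.open("a", encoding="utf-8") as fh:
                fh.write(line)
```

Holding the lock only around the append keeps the critical section short while still guaranteeing that concurrent writers cannot interleave partial lines.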

Features

  • feat: ADR ingestion pipeline (spec 117) — MADR parser, validator, DuckDB migration v22 for adr_entries table, and integration into FoxKnowledgeProvider for retrieval during coder sessions

Documentation

  • docs: ADR 07 — Define audit JSONL event format (envelope schema + complete event type catalog)
  • docs: Code quality audit (specs 7–9)
  • docs: Parking service 3.4.0 audit

Chores

  • Bump version to 3.4.1

v3.4.0

24 Apr 10:38

What's Changed

Fixes

  • #543: Drop dead knowledge system columns retrieval_summary and coverage_data
  • #542: Fix ruff format violation in warning string
  • #541: Fix ruff format violation in list comprehension
  • #539: Add quick-triage bail-out to coder prompt
  • #537: Rename CLI command findings to insights
  • #536: Add AC-4 test and fix ruff format violation
  • #534: Add AC-3 integration test for verifier dispatch without phantom task group

Chores

  • Upgrade dependency version pins (pydantic, rich, duckdb, sentence-transformers, scikit-learn, pathspec, tree-sitter, pytest, ruff, mypy, and more)
  • Update auto-generated errata dates

v3.3.1

23 Apr 12:37

What's New

Features

  • Test coverage regression gate — measures per-file line coverage before and after coder sessions; blocks the task if coverage decreases on modified files (#520)
    • Multi-language coverage tool detection: pytest-cov (Python), cargo-tarpaulin (Rust), go test -cover (Go)
    • Coverage data stored in session outcomes for trend tracking (migration v20)
    • Blocking findings emitted via review_findings table on regression
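
The gate's core comparison can be sketched roughly as follows. The function name, the dictionary shapes, and the choice to skip files without a baseline are assumptions for illustration; the release note only states that a decrease on modified files blocks the task.

```python
def coverage_regressions(
    before: dict[str, float],
    after: dict[str, float],
    modified: set[str],
) -> dict[str, tuple[float, float]]:
    """Return {path: (before_pct, after_pct)} for modified files that lost coverage."""
    regressions: dict[str, tuple[float, float]] = {}
    for path in sorted(modified):
        # Assumption: files missing from either snapshot (new or deleted)
        # have nothing to compare against and are skipped.
        if path not in before or path not in after:
            continue
        if after[path] < before[path]:
            regressions[path] = (before[path], after[path])
    return regressions


before = {"a.py": 90.0, "b.py": 80.0, "c.py": 50.0}
after = {"a.py": 85.0, "b.py": 80.0, "c.py": 40.0}
blocked = coverage_regressions(before, after, modified={"a.py", "b.py"})
```

Here `blocked` contains only `a.py`: `c.py` also lost coverage, but it was not modified in the session, so under this sketch it does not count against the task.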

Full Changelog: v3.3.0...v3.3.1

v3.3.0

23 Apr 11:55

What's New

Features

  • Verification checklist & task completion enforcement — structured verification checklist for spec compliance (#521)
  • State transition validation in GraphSync — validates engine state transitions to catch illegal graph moves (#523)
  • Eager pre-review with retry-predecessor — restores eager pre-review behavior with retry on predecessor failure (#519)
  • Lightweight errata generation from blocking — reinstates errata generation when issues are blocked (#522)
  • Knowledge system pruning — migration v18 removes causal links and dead knowledge modules (spec 116)
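
State transition validation of the kind described for GraphSync is commonly built on an allow-list of moves. The sketch below shows that general pattern only; the state names, the allowed moves, and the exception type are all hypothetical and are not taken from the agent-fox code.

```python
from enum import Enum


class NodeState(Enum):
    # Hypothetical states, chosen for illustration.
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    BLOCKED = "blocked"


# Hypothetical allow-list: each state maps to the states it may move to.
ALLOWED = {
    NodeState.PENDING: {NodeState.IN_PROGRESS, NodeState.BLOCKED},
    NodeState.IN_PROGRESS: {NodeState.COMPLETED, NodeState.BLOCKED, NodeState.PENDING},
    NodeState.BLOCKED: {NodeState.PENDING},
    NodeState.COMPLETED: set(),
}


class IllegalTransition(ValueError):
    """Raised when a requested move is not in the allow-list."""


def validate_transition(current: NodeState, new: NodeState) -> None:
    if new not in ALLOWED[current]:
        raise IllegalTransition(f"{current.value} -> {new.value} is not allowed")
```

Keeping the rules in one table means an illegal move fails loudly at the point of the write instead of surfacing later as an inconsistent graph.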

Bug Fixes

  • Fix max_items in property test to avoid retrieval cap masking failures
  • Use Path-typed specs_path variable in plan_cmd (#516)
  • Fix ruff format violation in RuntimeError f-string (#515)
  • Add proper type annotations for embedder and backend variables (#514)

Refactoring

  • Extract strategy classes from engine, fix_pipeline, and result_handler (#518)
  • Inline single-consumer modules and deduplicate review parser
  • Remove dead code and consolidate single-consumer modules (2 passes)
  • Remove dead code and consolidate near-identical abstractions
  • Delete dead knowledge modules (blocking_history, errata_store, gotcha_extraction, gotcha_store) and simplify provider

Full Changelog: v3.2.0...v3.3.0

v3.2.0

22 Apr 14:56

What's Changed

Features

  • knowledge: Decouple knowledge subsystem via KnowledgeProvider protocol (spec 114)
  • knowledge: Pluggable knowledge provider with gotcha extraction, errata store, and content hashing (spec 115)
  • engine: Wire FoxKnowledgeProvider into engine startup
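
Decoupling through a protocol might look roughly like the sketch below. Only the name `KnowledgeProvider` comes from the notes above; the `retrieve` method, its signature, `InMemoryProvider`, and `build_context` are assumptions for illustration and may differ from spec 114.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class KnowledgeProvider(Protocol):
    # Hypothetical method; the real protocol surface is defined by spec 114.
    def retrieve(self, query: str, limit: int = 5) -> list[str]: ...


class InMemoryProvider:
    """Toy provider: satisfies the protocol structurally, no inheritance needed."""

    def __init__(self, entries: list[str]) -> None:
        self._entries = list(entries)

    def retrieve(self, query: str, limit: int = 5) -> list[str]:
        needle = query.lower()
        return [e for e in self._entries if needle in e.lower()][:limit]


def build_context(provider: KnowledgeProvider, query: str) -> str:
    # The caller depends only on the protocol, not on a concrete provider.
    return "\n".join(provider.retrieve(query))


provider = InMemoryProvider(["Use DuckDB for storage", "Avoid global state"])
context = build_context(provider, "duckdb")
```

Because the engine side only sees the protocol, a provider can be swapped or deleted without touching its callers, which is the property the refactor in this release relies on.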

Refactors

  • knowledge: Delete 40+ legacy knowledge pipeline modules (lang analyzers, retrieval, consolidation, embeddings, etc.)
  • config: Remove obsolete knowledge pipeline configuration options
  • cli: Remove onboard command and legacy nightshift streams

Chores

  • Supersede specs 112 (sleep time compute) and 113 (knowledge effectiveness)
  • Fix Unicode edge case in content hash determinism property test
  • Clean up leftover __pycache__ directories in deleted knowledge subdirectories

v3.1.4

22 Apr 08:34

What's Changed

Bug Fixes

  • engine: Close AsyncAnthropic clients to prevent event loop shutdown crash (fixes #506)
  • engine: Skip redundant cleanup ingestion when barrier already ran (fixes #505)
  • knowledge: Always write agent trace JSONL for transcript reconstruction (fixes #507)
  • Guard trace reconstruction behind debug flag to suppress spurious warning

Features

  • engine: Add pre-flight check to skip coder sessions when work is done (fixes #511)

Other

  • New specs 114 (knowledge decoupling), 115 (pluggable knowledge)
  • Coding-session architecture documentation
  • General cleanup

Full Changelog: v3.1.3...v3.1.4

v3.1.3

21 Apr 15:54

What's Changed

Bug Fixes

  • Budget exhaustion detection: Sessions that hit the SDK max-budget-usd limit are now detected and blocked immediately instead of being wastefully retried. The SDK returns is_error=True with no message on budget exhaustion — previously mapped to "Unknown error" and retried through the escalation ladder.
  • AssessmentManager config: Pass full_config (not OrchestratorConfig) to AssessmentManager, fixing missing attribute errors.
  • Escalation ladder starting tier: The escalation ladder now respects config.models.coding for the starting tier instead of always defaulting to STANDARD.
  • Timed-out session metrics: Emit descriptive error messages and metrics for sessions that time out.
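
The budget exhaustion rule can be illustrated with a small classifier. `SessionResult`, `Outcome`, and `classify` are hypothetical names; only the signature `is_error=True` with no message is taken from the note above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    SUCCESS = "success"
    RETRY = "retry"
    BLOCK = "block"


@dataclass
class SessionResult:
    # Hypothetical shape of what the SDK hands back after a session.
    is_error: bool
    message: Optional[str] = None


def classify(result: SessionResult) -> Outcome:
    if not result.is_error:
        return Outcome.SUCCESS
    if not result.message:
        # An error with no message is the budget exhaustion signature:
        # retrying would only spend more budget, so block immediately.
        return Outcome.BLOCK
    # Errors that carry a message go through the normal retry path.
    return Outcome.RETRY
```

Under this sketch, `classify(SessionResult(is_error=True))` yields `Outcome.BLOCK`, where the earlier behavior would have treated the same result as an unknown error and retried it.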

Features

  • Knowledge system effectiveness (spec 113): Transcript reconstruction, compaction improvements, entity signal activation, cold-start handling, git extraction, audit consumption, retrieval quality validation, and audit prompt injection.

Other

  • Parking service audit report
  • Session budget increased for lengthy tasks

v3.1.2

21 Apr 03:31

Bug Fixes

  • engine: Move review concurrency cap before _prepare_launch to prevent phantom retry exhaustion (fixes #503)

    The review concurrency cap in _fill_parallel_pool was checked after _prepare_launch(), which increments the attempt tracker on "allowed" verdicts. When the single review slot was occupied, audit-review tasks were skipped but their attempt counter was already incremented. After max_retries + 1 (default 3) such pool-refill cycles, the circuit breaker permanently blocked the task with "Retry limit exceeded" — without ever starting a session. This cascade-blocked all downstream coding and verifier tasks, exceeding the block budget and halting the entire run.
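
    The ordering issue can be reproduced in a simplified model. `fill_pool` below is a hypothetical reduction of the pool-refill loop, not the project's `_fill_parallel_pool`; the `review:` task-name prefix and the parameter names are assumptions for illustration.

```python
from collections import Counter


def fill_pool(ready, attempts, review_slots_free, max_attempts=3):
    """Return the tasks launched in this refill cycle (simplified model)."""
    launched = []
    for task in ready:
        is_review = task.startswith("review:")
        # Cap check first: a task skipped for lack of a slot costs no attempt.
        if is_review and review_slots_free == 0:
            continue
        if attempts[task] >= max_attempts:
            continue  # circuit breaker: retry limit exceeded
        # Only now is the attempt counted, mirroring the corrected order.
        attempts[task] += 1
        if is_review:
            review_slots_free -= 1
        launched.append(task)
    return launched


attempts = Counter()
for _ in range(5):
    fill_pool(["review:audit"], attempts, review_slots_free=0)
```

    After five cycles with no free review slot, `attempts["review:audit"]` is still 0 in this model, so the task can launch as soon as a slot opens. With the increment placed before the cap check, the same five cycles would have exhausted the retry limit without a single session starting.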

Recovery for affected runs

If you have a stuck run with audit-review tasks blocked by "Retry limit exceeded", clear the stale state:

agent-fox reset --spec <affected_spec_name>

Full Changelog: v3.1.1...v3.1.2

v3.1.1

20 Apr 14:54

Bug Fixes

  • reset: clear session-scoped tables on reset to prevent block_limit death-loop (#501)

    After a block_limit run, reset --hard (and soft reset) left stale data in six session-scoped DB tables (runs, session_outcomes, review_findings, verification_results, drift_findings, blocking_history). The stale runs.status='block_limit' caused load_state_from_db() to load a terminal status, making the engine loop exit immediately on every subsequent agent-fox code invocation — a self-perpetuating death-loop with no CLI recovery path.

    All four reset paths (reset_all, reset_task, reset_spec, hard_reset_all/hard_reset_task) now clear session-scoped tables so that plan and code start from a clean state.
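
    As a rough illustration of the cleanup step, the sketch below empties the six tables named above. It uses the standard-library `sqlite3` module as a stand-in for DuckDB so that it runs without extra dependencies; `clear_session_tables` is a hypothetical helper, not the project's implementation.

```python
import sqlite3

# Table names as listed in the release note.
SESSION_TABLES = (
    "runs",
    "session_outcomes",
    "review_findings",
    "verification_results",
    "drift_findings",
    "blocking_history",
)


def clear_session_tables(conn: sqlite3.Connection) -> None:
    # One transaction: either every table is cleared or none is.
    with conn:
        for table in SESSION_TABLES:
            # Names come from the fixed tuple above, never from user input.
            conn.execute(f"DELETE FROM {table}")
```

    Clearing `runs` is the part that breaks the loop: with no stale `block_limit` row left, the next invocation has no terminal status to load.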

Full Changelog: v3.1.0...v3.1.1

v3.1.0

20 Apr 13:40

What's New

Sleep-Time Compute (Spec 112)

A new knowledge-processing pipeline that runs background computation during idle periods:

  • Core protocol & orchestrator — schema, configuration, and orchestration layer for sleep-time tasks
  • ContextRewriter — sleep task that rewrites and enriches knowledge context
  • BundleBuilder — sleep task that builds consolidated knowledge bundles
  • Retriever & integration wiring — retrieval layer with full integration into the existing knowledge system
  • Wiring verification — end-to-end verification of the sleep-time compute pipeline

Full Changelog

  • feat(112): implement core protocol, orchestrator, config, and schema
  • feat(112): implement ContextRewriter sleep task
  • feat(112): implement retriever and integration wiring
  • test(112): failing spec tests, checkpoint, and wiring verification