feat(fact-check): Phase 1 — AST-based post-polish fact-check#28
Merged
Phase 1 of the polish-fact-check umbrella spec (docs/specs/polish-fact-check/), shipped as its own PR per the "four phases, four PRs" plan in the spec.

Adds `src/attune_author/fact_check/`, a stdlib-only post-polish verification layer that runs against every polished template emitted by `apply_polish_results`. Four checks, zero LLM cost:

- `check_python_refs`: parses Python code fences with `ast`, then resolves each import and prose dotted path via `importlib.import_module` in the active venv. Catches the `attune.ops._readers` class of hallucination (the path parses fine but doesn't exist), the most damaging failure mode in the attune-ai #351 regression fixture.
- `check_cli_refs`: parses references of the form `attune <subcommand> --flag` and verifies the flag appears in the cached `--help` output for that subcommand chain. Every finding carries a version-coupling messaging block (installed attune-ai version + per-file override snippet) so the operator can resolve false positives across version drift without spelunking.
- `check_md_links`: verifies that relative `[label](target.md)` link targets exist. External URLs and pure anchors are skipped.
- `check_numeric_refs`: verifies counts like `N templates`, `N features`, `N kinds` against the project filesystem and manifest. Unverifiable nouns (workflows, skills, agents) surface as warnings asking for human review.

Wired into the polish pipeline at `generator.apply_polish_results`. Default mode is soft-fail: findings append an `## Unresolved references` table to the polished file. Strict mode raises `FactCheckError`. Control via the `ATTUNE_AUTHOR_FACT_CHECK` env var (`off | soft | strict`, default `soft`) and the `[tool.attune-author.fact-check]` table in `pyproject.toml` (per-check toggles + per-file skip list).

Regression fixture: `tests/fixtures/fact_check_ops_dashboard/` ships pre-fix and post-fix versions of the four attune-ai #351 docs. The fixture-based test suite asserts each check fires on the pre-fix files and is silent on the post-fix files, exercising the spec's "5/6 ops-dashboard errors caught" exit gate.

Coverage: 55 new tests (`tests/unit/fact_check/`). One integration test verifies multi-check aggregation; per-check tests cover the happy path, the regression-fixture cases, de-duplication, and the version-coupling block.

Spec tasks completed: 1.1–1.8, 1.10, 1.11, 1.11.1, 1.12, 1.13, 1.14, 1.15, 1.16.
Deferred: 1.9 (CLI flags; the env var ships in this PR, and the named flags can land as a small follow-up).

Phase 2 (ground-truth context injection), Phase 3 (faithfulness judge integration), and Phase 4 (tutorial static check) remain in spec; each ships as its own PR.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
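
As an illustration of the `check_python_refs` idea described in this commit (parse fenced Python with `ast`, then try to import what it references), here is a minimal sketch. It is not the code in `src/attune_author/fact_check/`: the function names `imported_modules` and `unresolved_imports` are hypothetical, it handles only import statements (not prose dotted paths), and it assumes fences are opened with a `python` info string.

```python
import ast
import importlib
import re

# Build the fence marker programmatically so this example can itself live
# inside a fenced block without confusing Markdown tooling.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def imported_modules(source: str) -> list[str]:
    """Return the absolute module names imported anywhere in `source`."""
    names: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped: they cannot be
            # resolved without knowing the enclosing package.
            names.append(node.module)
    return names


def unresolved_imports(markdown: str) -> list[str]:
    """Return modules referenced in Python fences that fail to import."""
    missing: list[str] = []
    for block in FENCE_RE.findall(markdown):
        try:
            modules = imported_modules(block)
        except SyntaxError:
            continue  # Not valid Python; this sketch leaves that to other checks.
        for name in modules:
            try:
                # Note: importing executes the module's top-level code.
                importlib.import_module(name)
            except ImportError:
                if name not in missing:  # de-duplicate repeated references
                    missing.append(name)
    return missing
```

A path such as `attune.ops._readers` parses without error, so only the import attempt reveals that it does not exist. The trade-off of this approach is that results depend on whatever is installed in the active environment.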
The cli_refs check at src/attune_author/fact_check/cli_refs.py early-returns ``[]`` when ``shutil.which(cli)`` returns None. In CI, ``attune`` isn't installed (attune-ai is not in attune-author's dev deps), so the check silently produces no findings, and the four CLI-ref tests fail with "expected --turbo to surface" / "assert []".

Locally the tests passed because the dev venv resolves ``attune`` via the uv workspace setup. CI is a clean checkout without that, hence the divergence across 8 platforms (ubuntu × 3.10–3.13 and windows × 3.10–3.13 in PR #28's matrix).

Fix is one line per affected test: monkey-patch ``cli_refs.shutil.which`` to return a non-None path so the guard passes and the rest of the test's monkey-patches (over _resolve_cli_name, _help_text, _installed_version) actually take effect.

Verified:
- 65/65 fact_check tests pass locally (with attune on PATH).
- Same 11 fixture-based tests pass with PATH stripped of attune (the CI scenario): ``PATH=/usr/bin:/bin pytest …``.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
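
The failure mode and fix described in this commit can be reproduced in miniature. The sketch below is an assumption-laden stand-in, not the real `cli_refs` module: `check_cli_flags` and `run_with_cli_present` are hypothetical names, and it uses `unittest.mock.patch` rather than pytest's `monkeypatch` fixture so that it runs on its own.

```python
import shutil
from unittest import mock


def check_cli_flags(cli: str, flags: list[str], help_text: str) -> list[str]:
    """Hypothetical guarded check: return the flags missing from `help_text`."""
    if shutil.which(cli) is None:
        # CLI not on PATH: report nothing. This is the silent early return
        # that made the tests see an empty findings list in CI.
        return []
    return [flag for flag in flags if flag not in help_text]


def run_with_cli_present(cli: str, flags: list[str], help_text: str) -> list[str]:
    """Run the check while pretending `cli` is installed."""
    # Patch shutil.which so the guard passes regardless of the real PATH.
    with mock.patch("shutil.which", return_value="/usr/bin/" + cli):
        return check_cli_flags(cli, flags, help_text)
```

Without the patch, a missing executable yields `[]` even for a fabricated flag; with it, the comparison against the (stubbed) help text actually runs.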
Closes task 1.9 in docs/specs/polish-fact-check/tasks.md.

The fact-check pass is controlled by ATTUNE_AUTHOR_FACT_CHECK (``off | soft | strict``, default ``soft``). Phase 1 of the spec landed this env-var path in e11feb5. This commit adds a matching CLI surface on the two commands that invoke the polish pipeline:

    attune-author generate <feat> --fact-check strict
    attune-author regenerate --no-fact-check

Argparse adds the flags as a mutually exclusive group on both ``generate`` and ``regenerate`` subparsers. ``--no-fact-check`` is shorthand for ``--fact-check off``. ``_apply_fact_check_args`` translates either flag into the env var before the dispatch function imports the generator.

Precedence (matches existing --rag pattern):

1. ATTUNE_AUTHOR_FACT_CHECK env var if set: shell-level intent wins over per-invocation flags so the operator can enforce a policy across an entire session.
2. ``--fact-check`` / ``--no-fact-check`` CLI flags: per-invocation override of the project default.
3. ``[tool.attune-author.fact-check]`` in pyproject.toml: project-level defaults loaded by load_config().

Tests added at tests/unit/fact_check/test_cli_flags.py (10 cases): each precedence rule, mutual-exclusivity enforcement, argparse choice validation, and the four-pass argparse shape across generate + regenerate. All 65 fact_check tests still pass.

CHANGELOG/README updated to describe the three-layer control surface (the existing entries only documented the env var).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
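
The flag layout and precedence order described in this commit can be sketched as follows. This is a rough illustration under stated assumptions, not the project's CLI code: `build_parser` and `resolve_mode` are hypothetical names, the `feature` positional stands in for `<feat>`, and the project-level default is passed in as a plain argument instead of being loaded from `pyproject.toml`.

```python
import argparse

MODES = ("off", "soft", "strict")
ENV_VAR = "ATTUNE_AUTHOR_FACT_CHECK"


def build_parser() -> argparse.ArgumentParser:
    """Add --fact-check / --no-fact-check to both subcommands."""
    parser = argparse.ArgumentParser(prog="attune-author")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in ("generate", "regenerate"):
        cmd = sub.add_parser(name)
        if name == "generate":
            cmd.add_argument("feature")  # hypothetical name for <feat>
        # argparse rejects invocations that pass both flags of this group.
        group = cmd.add_mutually_exclusive_group()
        group.add_argument("--fact-check", choices=MODES, default=None)
        group.add_argument("--no-fact-check", action="store_true")
    return parser


def resolve_mode(args, environ, project_default="soft"):
    """Apply the three-layer precedence: env var, then flags, then project."""
    env_value = environ.get(ENV_VAR)
    if env_value:
        return env_value  # 1. shell-level setting wins (not validated here)
    if args.no_fact_check:
        return "off"  # 2. --no-fact-check is shorthand for --fact-check off
    if args.fact_check is not None:
        return args.fact_check  # 2. explicit per-invocation flag
    return project_default  # 3. project-level default


args = build_parser().parse_args(["generate", "my-feature", "--fact-check", "strict"])
print(resolve_mode(args, {}))                 # flag applies: strict
print(resolve_mode(args, {ENV_VAR: "off"}))   # env var wins: off
```

Passing the environment as a mapping (for example `os.environ` in real use) keeps the precedence logic easy to test without mutating process state.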
Summary
Phase 1 of the polish-fact-check spec (umbrella PR #27). Adds an AST-based post-polish verification layer that catches LLM-fabricated technical detail without calling an LLM.
Four checks, zero LLM cost, ~600 LOC including tests + regression fixture:
- `check_python_refs` (e.g. `attune.ops._readers`)
- `check_cli_refs` (e.g. `--allow-run`)
- `check_md_links`
- `check_numeric_refs` (e.g. `498 templates`)

Behavior
Wired into `generator.apply_polish_results` so every polished template runs through the check immediately after being written.

- soft (default): findings are appended as an `## Unresolved references` table at the bottom of the polished file.
- strict: `FactCheckError` is raised after the write; the bad file is left on disk for inspection.

Controlled via the `ATTUNE_AUTHOR_FACT_CHECK` env var (`off | soft | strict`, default `soft`) and the `[tool.attune-author.fact-check]` table in `pyproject.toml` (per-check toggles and a per-file skip list).

Version coupling
`check_cli_refs` resolves against whichever attune-ai is installed in the active venv. Every finding includes proactive context (the installed attune-ai version and a per-file override snippet) so an operator running against a different attune-ai version can resolve false positives without spelunking.

Regression fixture
`tests/fixtures/fact_check_ops_dashboard/` ships pre-fix and post-fix versions of the four attune-ai PR #351 docs. The fixture-based test suite asserts each check fires on the pre-fix files and is silent on the post-fix files, exercising the spec's "5/6 ops-dashboard errors caught" exit gate.

Test plan
- `tests/unit/fact_check/`: all green locally on Py 3.10
- `ruff check` clean on new code

Spec tasks
Completed in this PR: 1.1–1.8, 1.10, 1.11, 1.11.1, 1.12, 1.13, 1.14, 1.15, 1.16.
Deferred: 1.9 (CLI flags `--fact-check=strict` / `--no-fact-check`). The env var `ATTUNE_AUTHOR_FACT_CHECK` ships in this PR; the named CLI flags can land as a small follow-up that wraps the existing env hook.
Phase 2 (ground-truth context injection), Phase 3 (faithfulness judge), and Phase 4 (tutorial static check) remain to ship as separate PRs per the spec.
🤖 Generated with Claude Code