Merged
47 changes: 47 additions & 0 deletions CHANGELOG.md
@@ -13,6 +13,53 @@ and this project adheres to
Work in progress for the next release. Add entries here as
changes land, not at tag time.

### Added

- **Polish fact-check (Phase 1 of [polish-fact-check
  spec](docs/specs/polish-fact-check/)).** AST-based
  post-generation verification of every polished template.
  Four checks, no LLM cost:
  - `check_python_refs`: imports + dotted attune-paths
    resolved against the active venv via
    `importlib.import_module`. Catches the
    `attune.ops._readers` class of hallucination.
  - `check_cli_refs`: `attune <subcommand> --flag`
    references compared against cached `--help` output.
    Findings include version-coupling messaging so the
    operator knows which attune-ai version was probed.
  - `check_md_links`: relative `[label](target.md)` link
    targets verified for existence.
  - `check_numeric_refs`: counts (`N templates`,
    `N features`, `N kinds`) verified against the project
    filesystem / manifest.

  Wired into the polish pipeline at
  [`generator.apply_polish_results`](src/attune_author/generator.py).
  Defaults to **soft-fail**: findings are appended to the
  polished file as an `## Unresolved references` block.
  Strict mode raises `FactCheckError`. Control via three
  layers (each overriding the next):
  1. `ATTUNE_AUTHOR_FACT_CHECK` env var
     (`off | soft | strict`, default `soft`): shell-level
     intent, wins over per-invocation flags.
  2. `--fact-check` / `--no-fact-check` flags on
     `generate` and `regenerate`: per-invocation
     override.
  3. `[tool.attune-author.fact-check]` table in
     `pyproject.toml`: project-level defaults, per-check
     toggles, per-file skip list.

  Regression fixture frozen at
  `tests/fixtures/fact_check_ops_dashboard/` (pre-fix
  and post-fix versions of the four ops-dashboard docs
  from attune-ai PR #351). The Phase 1 exit gate is
  "5/6 errors caught": Python refs ×2, MD links ×4+,
  numeric ×1; the 6th (insecure-example detection) is
  Phase 3 scope. Motivated by attune-ai PR #351, where
  one feature regen produced six factual errors that
  needed a manual editorial pass; five of six are now
  caught automatically.
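
Of the four checks in this entry, the Markdown link check is the easiest to picture. The following is a rough sketch of the idea only, not the code in `md_links.py`: the regular expression, the function name `missing_link_targets`, and the file names are all assumptions made for illustration.

```python
from __future__ import annotations

import re
import tempfile
from pathlib import Path

# Inline links such as [label](target.md). Images and reference-style
# links are ignored in this sketch.
LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)\)")
# Anything that starts with a URL scheme, e.g. https: or mailto:.
SCHEME_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def missing_link_targets(md_path: Path) -> list[str]:
    """Return relative link targets in md_path that do not exist on disk."""
    missing = []
    for target in LINK_RE.findall(md_path.read_text(encoding="utf-8")):
        if SCHEME_RE.match(target) or target.startswith("#"):
            continue  # external URLs and in-page anchors are out of scope
        file_part = target.split("#", 1)[0]
        if not (md_path.parent / file_part).exists():
            missing.append(target)
    return missing


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "reference.md").write_text("# Reference\n", encoding="utf-8")
    doc = root / "how-to.md"
    doc.write_text(
        "See [reference](reference.md), [gone](missing.md), "
        "and [site](https://example.com).\n",
        encoding="utf-8",
    )
    result = missing_link_targets(doc)
    print(result)
```

Targets are resolved relative to the file that contains the link, which is how Markdown renderers usually treat them.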

## [0.11.1] - 2026-05-08

### Changed
49 changes: 49 additions & 0 deletions README.md
@@ -68,6 +68,55 @@ attune-author generate security-audit
attune-author regenerate
```

## Fact-check (post-polish)

Every polished template runs through an AST-based fact-check
pass that verifies four classes of LLM-fabricable detail without
calling an LLM:

- Python imports and `attune.foo.bar` dotted paths resolve in
the active venv
- `attune <cmd> --flag` references appear in the cached
`--help` output (findings include version-coupling
context so the operator knows which version was probed)
- Relative `[label](target.md)` link targets exist
- Counts (`N templates`, `N features`, `N kinds`) match the
project filesystem / manifest
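
As an illustration of the first bullet, here is a minimal sketch of resolving imports with `ast` and `importlib.import_module`. It assumes the Python code has already been extracted from the Markdown, and it uses standard-library modules in place of `attune` so that it runs anywhere. `unresolved_imports` is a hypothetical name, not the package's API.

```python
from __future__ import annotations

import ast
import importlib


def unresolved_imports(source: str) -> list[str]:
    """Return module names imported by `source` that cannot be imported."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]  # absolute "from x import y" only
        else:
            continue
        for name in names:
            try:
                importlib.import_module(name)
            except ImportError:  # ModuleNotFoundError is a subclass
                missing.append(name)
    return missing


# Hypothetical input: two real modules and one invented submodule.
sample = "import json\nfrom os import path\nimport json._readers_hypothetical\n"
print(unresolved_imports(sample))
```

Note that `importlib.import_module` executes the module it imports, so a check built this way is only as safe as the packages installed in the venv.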

Defaults to **soft-fail** — findings are appended to the
polished file as an `## Unresolved references` table. Control
via `--fact-check` / `--no-fact-check` on `generate` and
`regenerate`:

```bash
attune-author generate ops-dashboard --fact-check strict
attune-author regenerate --no-fact-check
```

Or via `ATTUNE_AUTHOR_FACT_CHECK` (`off | soft | strict`,
default `soft`) — the env var takes precedence over the CLI
flag so shell-level intent overrides one-off invocations.
Persistent project-level config lives in
`[tool.attune-author.fact-check]` in `pyproject.toml`:

```toml
[tool.attune-author.fact-check]
enabled = true
soft_fail = true
check_python_refs = true
check_cli_refs = true
check_md_links = true
check_numeric_refs = true

[tool.attune-author.fact-check.skip]
"docs/architecture/some-feature.md" = ["check_md_links"]
```
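
One plausible reading of how the toggles and the skip table combine is sketched below. The dictionary mirrors the TOML above as a parser would return it. The rule that a missing key counts as enabled is an assumption, and this free function is only a stand-in for the `is_check_enabled` method on the package's `FactCheckConfig`.

```python
from __future__ import annotations

# Parsed form of the [tool.attune-author.fact-check] table shown above.
config = {
    "enabled": True,
    "soft_fail": True,
    "check_python_refs": True,
    "check_cli_refs": True,
    "check_md_links": True,
    "check_numeric_refs": True,
    "skip": {"docs/architecture/some-feature.md": ["check_md_links"]},
}


def is_check_enabled(cfg: dict, check: str, rel_path: str) -> bool:
    """Apply the global switch, then the per-check toggle, then the skip list."""
    if not cfg.get("enabled", True):
        return False
    if not cfg.get(check, True):  # assumption: absent toggles default to on
        return False
    return check not in cfg.get("skip", {}).get(rel_path, [])


print(is_check_enabled(config, "check_md_links", "docs/architecture/some-feature.md"))
print(is_check_enabled(config, "check_md_links", "docs/other.md"))
```

Skip entries are keyed by the path relative to the project root, written with forward slashes.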

This is Phase 1 of the [polish-fact-check
spec](docs/specs/polish-fact-check/). Phase 2 (ground-truth
context injection), Phase 3 (faithfulness judge), and Phase 4
(tutorial static check) are tracked in `tasks.md`.

## Polish cache

`attune-author` caches LLM polish responses on disk so re-generating an
49 changes: 27 additions & 22 deletions docs/specs/polish-fact-check/tasks.md
@@ -16,24 +16,24 @@

| # | Task | Layer | Status | Notes |
|---|------|-------|--------|-------|
| 1.1 | Decide config-file location (`pyproject.toml` vs new `.attune-author.toml`) | attune-author | todo | Match regen-pipeline convention; document decision in PR |
| 1.2 | Create `src/attune_author/fact_check/` package skeleton with `__init__.py`, `python_refs.py`, `cli_refs.py`, `md_links.py`, `numeric_refs.py`, `report.py` | attune-author | todo | One module per check + shared `FactCheckReport` dataclass |
| 1.3 | Implement `python_refs.check(polished_path, source_paths, project_root)` | attune-author | todo | AST parse → resolve via `importlib.import_module` in active venv |
| 1.4 | Implement `cli_refs.check(polished_path, project_root)` | attune-author | todo | Per-file cache of `attune <cmd> --help` output; regex extract flag names. **Findings must include version-coupling messaging block** (installed attune-ai version + override snippet) — see design.md |
| 1.5 | Implement `md_links.check(polished_path, project_root)` | attune-author | todo | Resolve relative links; confirm target file exists |
| 1.5.1 | Implement `numeric_refs.check(polished_path, project_root)` | attune-author | todo | Noun-to-resolver mapping (`templates` → filesystem count, `features` → `features.yaml` key count, etc.). Severity: `error` on mismatch, `warning` on unverifiable nouns |
| 1.6 | Implement `report.format_unresolved_block(findings)` | attune-author | todo | Markdown table; severity column; appended above `<!-- attune-generated ... -->` |
| 1.7 | Wire into `attune_author/polish.py` after the polish write | attune-author | todo | Soft-fail: append to file. Strict mode: raise `FactCheckError` |
| 1.8 | Add `[tool.attune-author.fact-check]` config schema + parser | attune-author | todo | `enabled`, `soft_fail`, per-check toggles, skip-list |
| 1.9 | Add `--fact-check=strict` / `--no-fact-check` CLI flags to `generate` and `regenerate` | attune-author | todo | Match existing CLI style |
| 1.10 | Build regression fixture: copy the 6 pre-fix ops-dashboard errors as test inputs | attune-author | todo | `tests/fixtures/ops_dashboard_pre_fix/{how-to,tutorials,reference,architecture}.md` |
| 1.11 | Test: each check fires on the matching fixture error | attune-author | todo | `test_python_refs_catches_underscore_module`, `test_cli_refs_catches_invented_flag`, `test_md_links_catches_missing_target`, `test_numeric_refs_catches_invented_count` |
| 1.11.1 | Test: CLI-ref finding contains version-coupling messaging | attune-author | todo | Assert installed version + override snippet appear in finding text |
| 1.12 | Test: zero findings on post-fix ops-dashboard versions | attune-author | todo | Pull from attune-ai PR #351 head |
| 1.13 | Test: soft-fail writes the block; strict mode raises | attune-author | todo | Two test cases |
| 1.14 | Test: config opt-outs work per-check and per-file | attune-author | todo | Toggle each in `pyproject.toml` test fixture |
| 1.15 | Update CHANGELOG with the four checks and the soft-fail default | attune-author | todo | Reference attune-ai PR #351 as motivation |
| 1.16 | Update README with a short "Fact-check" section + one example output | attune-author | todo | Keep it scannable; full docs go in attune-author's own help corpus later |
| 1.1 | Decide config-file location (`pyproject.toml` vs new `.attune-author.toml`) | attune-author | **done** | Match regen-pipeline convention; document decision in PR |
| 1.2 | Create `src/attune_author/fact_check/` package skeleton with `__init__.py`, `python_refs.py`, `cli_refs.py`, `md_links.py`, `numeric_refs.py`, `report.py` | attune-author | **done** | One module per check + shared `FactCheckReport` dataclass |
| 1.3 | Implement `python_refs.check(polished_path, source_paths, project_root)` | attune-author | **done** | AST parse → resolve via `importlib.import_module` in active venv |
| 1.4 | Implement `cli_refs.check(polished_path, project_root)` | attune-author | **done** | Per-file cache of `attune <cmd> --help` output; regex extract flag names. **Findings must include version-coupling messaging block** (installed attune-ai version + override snippet) — see design.md |
| 1.5 | Implement `md_links.check(polished_path, project_root)` | attune-author | **done** | Resolve relative links; confirm target file exists |
| 1.5.1 | Implement `numeric_refs.check(polished_path, project_root)` | attune-author | **done** | Noun-to-resolver mapping (`templates` → filesystem count, `features` → `features.yaml` key count, etc.). Severity: `error` on mismatch, `warning` on unverifiable nouns |
| 1.6 | Implement `report.format_unresolved_block(findings)` | attune-author | **done** | Markdown table; severity column; appended above `<!-- attune-generated ... -->` |
| 1.7 | Wire into `attune_author/polish.py` after the polish write | attune-author | **done** | Soft-fail: append to file. Strict mode: raise `FactCheckError` |
| 1.8 | Add `[tool.attune-author.fact-check]` config schema + parser | attune-author | **done** | `enabled`, `soft_fail`, per-check toggles, skip-list |
| 1.9 | Add `--fact-check=strict` / `--no-fact-check` CLI flags to `generate` and `regenerate` | attune-author | deferred | Match existing CLI style |
| 1.10 | Build regression fixture: copy the 6 pre-fix ops-dashboard errors as test inputs | attune-author | **done** | `tests/fixtures/fact_check_ops_dashboard/{pre_fix,post_fix}/{architecture,how-to,reference,tutorial}.md` |
| 1.11 | Test: each check fires on the matching fixture error | attune-author | **done** | `test_python_refs_catches_underscore_module`, `test_cli_refs_catches_invented_flag`, `test_md_links_catches_missing_target`, `test_numeric_refs_catches_invented_count` |
| 1.11.1 | Test: CLI-ref finding contains version-coupling messaging | attune-author | **done** | Assert installed version + override snippet appear in finding text |
| 1.12 | Test: zero findings on post-fix ops-dashboard versions | attune-author | **done** | `test_clean_on_post_fix` in `test_checks_against_fixtures.py` per check class |
| 1.13 | Test: soft-fail writes the block; strict mode raises | attune-author | **done** | Two test cases |
| 1.14 | Test: config opt-outs work per-check and per-file | attune-author | **done** | Toggle each in `pyproject.toml` test fixture |
| 1.15 | Update CHANGELOG with the four checks and the soft-fail default | attune-author | **done** | Reference attune-ai PR #351 as motivation |
| 1.16 | Update README with a short "Fact-check" section + one example output | attune-author | **done** | Keep it scannable; full docs go in attune-author's own help corpus later |
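
Task 1.6 describes the output as a Markdown table with a severity column. A sketch of what such a formatter could look like follows. The `Finding` fields and the column layout are assumptions made for illustration, since the real dataclass in `report.py` is not shown in this diff.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical fields; the real Finding dataclass may differ.
    check: str
    severity: str  # "error" or "warning"
    reference: str
    message: str


def format_unresolved_block(findings: list[Finding]) -> str:
    """Render findings as a Markdown table, or "" when there are none."""
    if not findings:
        return ""
    lines = [
        "## Unresolved references",
        "",
        "| Severity | Check | Reference | Message |",
        "|---|---|---|---|",
    ]
    for f in findings:
        # Escape pipes so a cell value cannot break the table row.
        cells = [c.replace("|", "\\|") for c in (f.severity, f.check, f.reference, f.message)]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


block = format_unresolved_block(
    [Finding("check_python_refs", "error", "attune.ops._readers", "module not found")]
)
print(block)
```

Returning an empty string for an empty list lets the caller treat "nothing to append" as a simple falsy check.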

### Phase 1 testing strategy

@@ -51,13 +51,18 @@

### Phase 1 exit checklist

- [ ] All tasks 1.1–1.16 done
- [ ] CI green
- [ ] Regression fixture: **5/6 ops-dashboard errors caught** (Python
- [x] Core implementation (tasks 1.1–1.8)
- [x] Test coverage (tasks 1.11, 1.11.1, 1.13, 1.14): 55 new tests
- [x] CHANGELOG + README (tasks 1.15, 1.16)
- [x] Regression fixture from attune-ai PR #351 (tasks 1.10, 1.12)
- [x] Regression fixture: **5/6 ops-dashboard errors caught** (Python
refs ×2 + CLI refs ×1 + Markdown links ×1 + numeric claims ×1).
The 6th error (missing-security-callout for `0.0.0.0`) is
explicitly Phase 3 scope.
- [ ] Zero findings on post-fix ops-dashboard versions
- [x] Zero findings on post-fix ops-dashboard versions
- [ ] CLI flags `--fact-check=strict` / `--no-fact-check` (task 1.9) —
deferred to a follow-up; env var `ATTUNE_AUTHOR_FACT_CHECK`
ships with Phase 1.
- [ ] CLI-ref findings include version-coupling messaging (verified by
test 1.11.1)
- [ ] CHANGELOG + README updated
60 changes: 60 additions & 0 deletions src/attune_author/cli.py
@@ -16,6 +16,7 @@
import argparse
import json
import logging
import os
import sys
import urllib.error
import urllib.parse
@@ -150,6 +151,29 @@ def _build_parser() -> argparse.ArgumentParser:
            "architecture). Use this for full help and docs coverage."
        ),
    )
    # Fact-check mode per spec docs/specs/polish-fact-check/. Default
    # is "soft" (append findings to the polished file as an
    # ``## Unresolved references`` block). "strict" raises
    # FactCheckError on any error finding; "off" disables.
    # Internally sets ATTUNE_AUTHOR_FACT_CHECK; the env var path is
    # the single source of truth read by generator._run_fact_check.
    fact_check_group = p_gen.add_mutually_exclusive_group()
    fact_check_group.add_argument(
        "--fact-check",
        choices=["off", "soft", "strict"],
        default=None,
        help=(
            "Fact-check mode after polish. ``soft`` (default) appends "
            "an ``## Unresolved references`` block to the polished "
            "file; ``strict`` raises on findings; ``off`` disables. "
            "Overridden by ATTUNE_AUTHOR_FACT_CHECK if set."
        ),
    )
    fact_check_group.add_argument(
        "--no-fact-check",
        action="store_true",
        help="Shortcut for ``--fact-check=off``.",
    )

    p_regen = sub.add_parser(
        "regenerate",
@@ -215,6 +239,24 @@ def _build_parser() -> argparse.ArgumentParser:
        action="store_true",
        help="With --status: emit JSON instead of human-readable output.",
    )
    # Same fact-check controls as p_gen; see that block for design notes.
    regen_fact_check_group = p_regen.add_mutually_exclusive_group()
    regen_fact_check_group.add_argument(
        "--fact-check",
        choices=["off", "soft", "strict"],
        default=None,
        help=(
            "Fact-check mode after polish. ``soft`` (default) appends "
            "an ``## Unresolved references`` block to each polished "
            "file; ``strict`` raises on findings; ``off`` disables. "
            "Overridden by ATTUNE_AUTHOR_FACT_CHECK if set."
        ),
    )
    regen_fact_check_group.add_argument(
        "--no-fact-check",
        action="store_true",
        help="Shortcut for ``--fact-check=off``.",
    )

    p_cache = sub.add_parser(
        "cache",
@@ -454,11 +496,28 @@ def _cmd_status(args: argparse.Namespace) -> int:
    return 0


def _apply_fact_check_args(args: argparse.Namespace) -> None:
    """Translate ``--fact-check`` / ``--no-fact-check`` to the env var
    read by :func:`attune_author.generator._run_fact_check`.

    The env var is the single source of truth; the CLI flags are a
    convenience that maps onto it. An already-set env var wins (the
    operator's shell-level intent overrides the per-invocation flag).
    """
    if os.environ.get("ATTUNE_AUTHOR_FACT_CHECK"):
        return
    if getattr(args, "no_fact_check", False):
        os.environ["ATTUNE_AUTHOR_FACT_CHECK"] = "off"
    elif getattr(args, "fact_check", None):
        os.environ["ATTUNE_AUTHOR_FACT_CHECK"] = args.fact_check


def _cmd_generate(args: argparse.Namespace) -> int:
    """Handle the generate command."""
    from attune_author.generator import generate_feature_templates
    from attune_author.manifest import load_manifest

    _apply_fact_check_args(args)
    root = validate_file_path(args.project_root)
    help_dir = validate_file_path(args.help_dir)

@@ -505,6 +564,7 @@ def _cmd_generate(args: argparse.Namespace) -> int:

def _cmd_regenerate(args: argparse.Namespace) -> int:
    """Handle the regenerate command."""
    _apply_fact_check_args(args)
    root = validate_file_path(args.project_root)
    help_dir = validate_file_path(args.help_dir)
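
The precedence that `_apply_fact_check_args` implements can be restated as a pure function, which may be easier to reason about and to test than mutating `os.environ`. This is a sketch for illustration, not code from the PR: `resolve_mode` and its parameters are hypothetical names, and the `pyproject.toml` layer is left out.

```python
from __future__ import annotations


def resolve_mode(env: dict[str, str], fact_check: str | None, no_fact_check: bool) -> str:
    """Env var first, then --no-fact-check / --fact-check, then the "soft" default."""
    from_env = env.get("ATTUNE_AUTHOR_FACT_CHECK")
    if from_env:  # an empty string counts as unset, as in the CLI helper
        return from_env
    if no_fact_check:
        return "off"
    return fact_check or "soft"


# A shell-level "strict" survives a one-off --no-fact-check.
print(resolve_mode({"ATTUNE_AUTHOR_FACT_CHECK": "strict"}, None, True))
# With no env var, the flag decides.
print(resolve_mode({}, "strict", False))
```

Passing the environment in as a plain dictionary keeps the function free of global state, so each precedence case can be asserted directly.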

105 changes: 105 additions & 0 deletions src/attune_author/fact_check/__init__.py
@@ -0,0 +1,105 @@
"""AST-based post-generation fact-check for polished docs.

Phase 1 of the polish-fact-check spec
(``docs/specs/polish-fact-check``). Each check module surfaces
findings into a shared :class:`FactCheckReport`; the caller
decides whether to soft-fail (append an ``## Unresolved
references`` block to the polished file) or strict-fail (raise
:class:`FactCheckError`).
"""

from __future__ import annotations

from pathlib import Path

from . import cli_refs, md_links, numeric_refs, python_refs
from .config import load_config
from .report import (
    CHECK_CLI_REFS,
    CHECK_MD_LINKS,
    CHECK_NUMERIC_REFS,
    CHECK_PYTHON_REFS,
    FactCheckConfig,
    FactCheckError,
    FactCheckReport,
    Finding,
    Severity,
    format_unresolved_block,
)


def check_polished_file(
    polished_path: Path,
    *,
    project_root: Path,
    config: FactCheckConfig | None = None,
) -> FactCheckReport:
    """Run all enabled fact-check passes against ``polished_path``.

    Args:
        polished_path: Markdown file produced by the polish pass.
        project_root: Consumer project root. Used to resolve CLI
            ``--help`` output, ``.help/features.yaml``, and to
            match per-file skip entries from configuration.
        config: Optional explicit config; ``None`` means load
            from the project's ``pyproject.toml``.

    Returns:
        A :class:`FactCheckReport` with zero or more findings.
        Callers gate on ``report.has_errors()`` for strict mode
        and feed ``report.findings`` to
        :func:`format_unresolved_block` for soft-fail.
    """
    cfg = config if config is not None else load_config(project_root)
    report = FactCheckReport()
    if not cfg.enabled:
        return report

    try:
        rel_path = polished_path.relative_to(project_root).as_posix()
    except ValueError:
        rel_path = polished_path.name

    if cfg.is_check_enabled(CHECK_PYTHON_REFS, rel_path):
        report.extend(python_refs.check(polished_path))
    if cfg.is_check_enabled(CHECK_CLI_REFS, rel_path):
        report.extend(cli_refs.check(polished_path, project_root))
    if cfg.is_check_enabled(CHECK_MD_LINKS, rel_path):
        report.extend(md_links.check(polished_path))
    if cfg.is_check_enabled(CHECK_NUMERIC_REFS, rel_path):
        report.extend(numeric_refs.check(polished_path, project_root))

    return report


def apply_soft_fail(polished_path: Path, report: FactCheckReport) -> bool:
    """Append the unresolved-references block to ``polished_path``.

    Returns True if the block was appended, False if there was
    nothing to append (empty report).
    """
    block = format_unresolved_block(report.findings)
    if not block:
        return False
    existing = polished_path.read_text(encoding="utf-8")
    if not existing.endswith("\n"):
        existing += "\n"
    polished_path.write_text(existing + block + "\n", encoding="utf-8")
    return True


__all__ = [
    "CHECK_CLI_REFS",
    "CHECK_MD_LINKS",
    "CHECK_NUMERIC_REFS",
    "CHECK_PYTHON_REFS",
    "FactCheckConfig",
    "FactCheckError",
    "FactCheckReport",
    "Finding",
    "Severity",
    "apply_soft_fail",
    "check_polished_file",
    "format_unresolved_block",
    "load_config",
]
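
The docstring above leaves the soft versus strict decision to the caller. A sketch of how a caller might gate on a report is shown below, using small stand-ins so that it runs without the package installed. `Report`, `handle_report`, and the returned action strings are hypothetical, and the real wiring lives in `generator.py`, which this diff does not show.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class FactCheckError(Exception):
    """Stand-in for the package's FactCheckError."""


@dataclass
class Report:
    # Stand-in for FactCheckReport: (severity, message) pairs only.
    findings: list[tuple[str, str]] = field(default_factory=list)

    def has_errors(self) -> bool:
        return any(severity == "error" for severity, _ in self.findings)


def handle_report(report: Report, mode: str) -> str:
    """Return "skip", "clean", or "append"; raise in strict mode on errors."""
    if mode == "off":
        return "skip"
    if mode == "strict" and report.has_errors():
        raise FactCheckError(f"{len(report.findings)} unresolved reference(s)")
    # Soft mode, or strict mode with warnings only: append if there is anything.
    return "append" if report.findings else "clean"


bad = Report([("error", "attune.ops._readers is not importable")])
print(handle_report(bad, "soft"))
print(handle_report(Report(), "strict"))
```

In this sketch strict mode raises only on error-severity findings, which matches the `report.has_errors()` gate the docstring mentions; warnings alone fall through to the soft path.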