diff --git a/CHANGELOG.md b/CHANGELOG.md index d4af0bd..131442a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -13,6 +13,53 @@ and this project adheres to Work in progress for the next release. Add entries here as changes land, not at tag time. +### Added + +- **Polish fact-check (Phase 1 of [polish-fact-check + spec](docs/specs/polish-fact-check/)).** AST-based + post-generation verification of every polished template. + Four checks, no LLM cost: + - `check_python_refs` — imports + dotted attune-paths + resolved against the active venv via + `importlib.import_module`. Catches the + `attune.ops._readers` class of hallucination. + - `check_cli_refs` — `attune --flag` + references compared against cached `--help` output. + Findings include version-coupling messaging so the + operator knows which attune-ai version was probed. + - `check_md_links` — relative `[label](target.md)` link + targets verified for existence. + - `check_numeric_refs` — counts (`N templates`, + `N features`, `N kinds`) verified against the project + filesystem / manifest. + + Wired into the polish pipeline at + [`generator.apply_polish_results`](src/attune_author/generator.py). + Defaults to **soft-fail**: findings are appended to the + polished file as an `## Unresolved references` block. + Strict mode raises `FactCheckError`. Control via three + layers (each overriding the next): + 1. `ATTUNE_AUTHOR_FACT_CHECK` env var + (`off | soft | strict`, default `soft`) — shell-level + intent, wins over per-invocation flags. + 2. `--fact-check` / `--no-fact-check` flags on + `generate` and `regenerate` — per-invocation + override. + 3. `[tool.attune-author.fact-check]` table in + `pyproject.toml` — project-level defaults, per-check + toggles, per-file skip list. + + Regression fixture frozen at + `tests/fixtures/fact_check_ops_dashboard/` (pre-fix + and post-fix versions of the four ops-dashboard docs + from attune-ai PR #351). 
The Phase 1 exit gate is + "5/6 errors caught" — Python refs ×2, MD links ×4+, + numeric ×1; the 6th (insecure-example detection) is + Phase 3 scope. Motivated by attune-ai PR #351, where + one feature regen produced six factual errors that + needed a manual editorial pass — five of six are now + caught automatically. + ## [0.11.1] - 2026-05-08 ### Changed diff --git a/README.md b/README.md index 20885ba..202fa7b 100644 --- a/README.md +++ b/README.md @@ -68,6 +68,55 @@ attune-author generate security-audit attune-author regenerate ``` +## Fact-check (post-polish) + +Every polished template runs through an AST-based fact-check +pass that verifies four classes of LLM-fabricable detail without +calling an LLM: + +- Python imports and `attune.foo.bar` dotted paths resolve in + the active venv +- `attune --flag` references appear in the cached + `--help` output (findings include version-coupling + context so the operator knows which version was probed) +- Relative `[label](target.md)` link targets exist +- Counts (`N templates`, `N features`, `N kinds`) match the + project filesystem / manifest + +Defaults to **soft-fail** — findings are appended to the +polished file as an `## Unresolved references` table. Control +via `--fact-check` / `--no-fact-check` on `generate` and +`regenerate`: + +```bash +attune-author generate ops-dashboard --fact-check strict +attune-author regenerate --no-fact-check +``` + +Or via `ATTUNE_AUTHOR_FACT_CHECK` (`off | soft | strict`, +default `soft`) — the env var takes precedence over the CLI +flag so shell-level intent overrides one-off invocations. 
+Persistent project-level config lives in +`[tool.attune-author.fact-check]` in `pyproject.toml`: + +```toml +[tool.attune-author.fact-check] +enabled = true +soft_fail = true +check_python_refs = true +check_cli_refs = true +check_md_links = true +check_numeric_refs = true + +[tool.attune-author.fact-check.skip] +"docs/architecture/some-feature.md" = ["check_md_links"] +``` + +This is Phase 1 of the [polish-fact-check +spec](docs/specs/polish-fact-check/). Phase 2 (ground-truth +context injection), Phase 3 (faithfulness judge), and Phase 4 +(tutorial static check) are tracked in `tasks.md`. + ## Polish cache `attune-author` caches LLM polish responses on disk so re-generating an diff --git a/docs/specs/polish-fact-check/tasks.md b/docs/specs/polish-fact-check/tasks.md index d957c87..1d493ef 100644 --- a/docs/specs/polish-fact-check/tasks.md +++ b/docs/specs/polish-fact-check/tasks.md @@ -16,24 +16,24 @@ | # | Task | Layer | Status | Notes | |---|------|-------|--------|-------| -| 1.1 | Decide config-file location (`pyproject.toml` vs new `.attune-author.toml`) | attune-author | todo | Match regen-pipeline convention; document decision in PR | -| 1.2 | Create `src/attune_author/fact_check/` package skeleton with `__init__.py`, `python_refs.py`, `cli_refs.py`, `md_links.py`, `numeric_refs.py`, `report.py` | attune-author | todo | One module per check + shared `FactCheckReport` dataclass | -| 1.3 | Implement `python_refs.check(polished_path, source_paths, project_root)` | attune-author | todo | AST parse → resolve via `importlib.import_module` in active venv | -| 1.4 | Implement `cli_refs.check(polished_path, project_root)` | attune-author | todo | Per-file cache of `attune --help` output; regex extract flag names. 
**Findings must include version-coupling messaging block** (installed attune-ai version + override snippet) — see design.md | -| 1.5 | Implement `md_links.check(polished_path, project_root)` | attune-author | todo | Resolve relative links; confirm target file exists | -| 1.5.1 | Implement `numeric_refs.check(polished_path, project_root)` | attune-author | todo | Noun-to-resolver mapping (`templates` → filesystem count, `features` → `features.yaml` key count, etc.). Severity: `error` on mismatch, `warning` on unverifiable nouns | -| 1.6 | Implement `report.format_unresolved_block(findings)` | attune-author | todo | Markdown table; severity column; appended above `` | -| 1.7 | Wire into `attune_author/polish.py` after the polish write | attune-author | todo | Soft-fail: append to file. Strict mode: raise `FactCheckError` | -| 1.8 | Add `[tool.attune-author.fact-check]` config schema + parser | attune-author | todo | `enabled`, `soft_fail`, per-check toggles, skip-list | -| 1.9 | Add `--fact-check=strict` / `--no-fact-check` CLI flags to `generate` and `regenerate` | attune-author | todo | Match existing CLI style | -| 1.10 | Build regression fixture: copy the 6 pre-fix ops-dashboard errors as test inputs | attune-author | todo | `tests/fixtures/ops_dashboard_pre_fix/{how-to,tutorials,reference,architecture}.md` | -| 1.11 | Test: each check fires on the matching fixture error | attune-author | todo | `test_python_refs_catches_underscore_module`, `test_cli_refs_catches_invented_flag`, `test_md_links_catches_missing_target`, `test_numeric_refs_catches_invented_count` | -| 1.11.1 | Test: CLI-ref finding contains version-coupling messaging | attune-author | todo | Assert installed version + override snippet appear in finding text | -| 1.12 | Test: zero findings on post-fix ops-dashboard versions | attune-author | todo | Pull from attune-ai PR #351 head | -| 1.13 | Test: soft-fail writes the block; strict mode raises | attune-author | todo | Two test cases | -| 1.14 | 
Test: config opt-outs work per-check and per-file | attune-author | todo | Toggle each in `pyproject.toml` test fixture | -| 1.15 | Update CHANGELOG with the four checks and the soft-fail default | attune-author | todo | Reference attune-ai PR #351 as motivation | -| 1.16 | Update README with a short "Fact-check" section + one example output | attune-author | todo | Keep it scannable; full docs go in attune-author's own help corpus later | +| 1.1 | Decide config-file location (`pyproject.toml` vs new `.attune-author.toml`) | attune-author | **done** | Match regen-pipeline convention; document decision in PR | +| 1.2 | Create `src/attune_author/fact_check/` package skeleton with `__init__.py`, `python_refs.py`, `cli_refs.py`, `md_links.py`, `numeric_refs.py`, `report.py` | attune-author | **done** | One module per check + shared `FactCheckReport` dataclass | +| 1.3 | Implement `python_refs.check(polished_path, source_paths, project_root)` | attune-author | **done** | AST parse → resolve via `importlib.import_module` in active venv | +| 1.4 | Implement `cli_refs.check(polished_path, project_root)` | attune-author | **done** | Per-file cache of `attune --help` output; regex extract flag names. **Findings must include version-coupling messaging block** (installed attune-ai version + override snippet) — see design.md | +| 1.5 | Implement `md_links.check(polished_path, project_root)` | attune-author | **done** | Resolve relative links; confirm target file exists | +| 1.5.1 | Implement `numeric_refs.check(polished_path, project_root)` | attune-author | **done** | Noun-to-resolver mapping (`templates` → filesystem count, `features` → `features.yaml` key count, etc.). 
Severity: `error` on mismatch, `warning` on unverifiable nouns | +| 1.6 | Implement `report.format_unresolved_block(findings)` | attune-author | **done** | Markdown table; severity column; appended above `` | +| 1.7 | Wire into `attune_author/polish.py` after the polish write | attune-author | **done** | Soft-fail: append to file. Strict mode: raise `FactCheckError` | +| 1.8 | Add `[tool.attune-author.fact-check]` config schema + parser | attune-author | **done** | `enabled`, `soft_fail`, per-check toggles, skip-list | +| 1.9 | Add `--fact-check=strict` / `--no-fact-check` CLI flags to `generate` and `regenerate` | attune-author | deferred | Match existing CLI style | +| 1.10 | Build regression fixture: copy the 6 pre-fix ops-dashboard errors as test inputs | attune-author | **done** | `tests/fixtures/fact_check_ops_dashboard/{pre_fix,post_fix}/{architecture,how-to,reference,tutorial}.md` | +| 1.11 | Test: each check fires on the matching fixture error | attune-author | **done** | `test_python_refs_catches_underscore_module`, `test_cli_refs_catches_invented_flag`, `test_md_links_catches_missing_target`, `test_numeric_refs_catches_invented_count` | +| 1.11.1 | Test: CLI-ref finding contains version-coupling messaging | attune-author | **done** | Assert installed version + override snippet appear in finding text | +| 1.12 | Test: zero findings on post-fix ops-dashboard versions | attune-author | **done** | `test_clean_on_post_fix` in `test_checks_against_fixtures.py` per check class | +| 1.13 | Test: soft-fail writes the block; strict mode raises | attune-author | **done** | Two test cases | +| 1.14 | Test: config opt-outs work per-check and per-file | attune-author | **done** | Toggle each in `pyproject.toml` test fixture | +| 1.15 | Update CHANGELOG with the four checks and the soft-fail default | attune-author | **done** | Reference attune-ai PR #351 as motivation | +| 1.16 | Update README with a short "Fact-check" section + one example output | attune-author | 
**done** | Keep it scannable; full docs go in attune-author's own help corpus later | ### Phase 1 testing strategy @@ -51,13 +51,18 @@ ### Phase 1 exit checklist -- [ ] All tasks 1.1–1.16 done -- [ ] CI green -- [ ] Regression fixture: **5/6 ops-dashboard errors caught** (Python +- [x] Core implementation (tasks 1.1–1.8) +- [x] Test coverage (tasks 1.11, 1.11.1, 1.13, 1.14): 55 new tests +- [x] CHANGELOG + README (tasks 1.15, 1.16) +- [x] Regression fixture from attune-ai PR #351 (tasks 1.10, 1.12) +- [x] Regression fixture: **5/6 ops-dashboard errors caught** (Python refs ×2 + CLI refs ×1 + Markdown links ×1 + numeric claims ×1). The 6th error (missing-security-callout for `0.0.0.0`) is explicitly Phase 3 scope. -- [ ] Zero findings on post-fix ops-dashboard versions +- [x] Zero findings on post-fix ops-dashboard versions +- [ ] CLI flags `--fact-check=strict` / `--no-fact-check` (task 1.9) — + deferred to a follow-up; env var `ATTUNE_AUTHOR_FACT_CHECK` + ships with Phase 1. - [ ] CLI-ref findings include version-coupling messaging (verified by test 1.11.1) - [ ] CHANGELOG + README updated diff --git a/src/attune_author/cli.py b/src/attune_author/cli.py index 2ae7af6..1277b1d 100644 --- a/src/attune_author/cli.py +++ b/src/attune_author/cli.py @@ -16,6 +16,7 @@ import argparse import json import logging +import os import sys import urllib.error import urllib.parse @@ -150,6 +151,29 @@ def _build_parser() -> argparse.ArgumentParser: "architecture). Use this for full help and docs coverage." ), ) + # Fact-check mode per spec docs/specs/polish-fact-check/. Default + # is "soft" (append findings to the polished file as an + # ``## Unresolved references`` block). "strict" raises + # FactCheckError on any error finding; "off" disables. + # Internally sets ATTUNE_AUTHOR_FACT_CHECK; the env var path is + # the single source of truth read by generator._run_fact_check. 
+ fact_check_group = p_gen.add_mutually_exclusive_group() + fact_check_group.add_argument( + "--fact-check", + choices=["off", "soft", "strict"], + default=None, + help=( + "Fact-check mode after polish. ``soft`` (default) appends " + "an ``## Unresolved references`` block to the polished " + "file; ``strict`` raises on findings; ``off`` disables. " + "Overridden by ATTUNE_AUTHOR_FACT_CHECK if set." + ), + ) + fact_check_group.add_argument( + "--no-fact-check", + action="store_true", + help="Shortcut for ``--fact-check=off``.", + ) p_regen = sub.add_parser( "regenerate", @@ -215,6 +239,24 @@ def _build_parser() -> argparse.ArgumentParser: action="store_true", help="With --status: emit JSON instead of human-readable output.", ) + # Same fact-check controls as p_gen; see that block for design notes. + regen_fact_check_group = p_regen.add_mutually_exclusive_group() + regen_fact_check_group.add_argument( + "--fact-check", + choices=["off", "soft", "strict"], + default=None, + help=( + "Fact-check mode after polish. ``soft`` (default) appends " + "an ``## Unresolved references`` block to each polished " + "file; ``strict`` raises on findings; ``off`` disables. " + "Overridden by ATTUNE_AUTHOR_FACT_CHECK if set." + ), + ) + regen_fact_check_group.add_argument( + "--no-fact-check", + action="store_true", + help="Shortcut for ``--fact-check=off``.", + ) p_cache = sub.add_parser( "cache", @@ -454,11 +496,28 @@ def _cmd_status(args: argparse.Namespace) -> int: return 0 +def _apply_fact_check_args(args: argparse.Namespace) -> None: + """Translate ``--fact-check`` / ``--no-fact-check`` to the env var + read by :func:`attune_author.generator._run_fact_check`. + + The env var is the single source of truth; the CLI flags are a + convenience that maps onto it. An already-set env var wins (the + operator's shell-level intent overrides the per-invocation flag). 
+ """ + if os.environ.get("ATTUNE_AUTHOR_FACT_CHECK"): + return + if getattr(args, "no_fact_check", False): + os.environ["ATTUNE_AUTHOR_FACT_CHECK"] = "off" + elif getattr(args, "fact_check", None): + os.environ["ATTUNE_AUTHOR_FACT_CHECK"] = args.fact_check + + def _cmd_generate(args: argparse.Namespace) -> int: """Handle the generate command.""" from attune_author.generator import generate_feature_templates from attune_author.manifest import load_manifest + _apply_fact_check_args(args) root = validate_file_path(args.project_root) help_dir = validate_file_path(args.help_dir) @@ -505,6 +564,7 @@ def _cmd_generate(args: argparse.Namespace) -> int: def _cmd_regenerate(args: argparse.Namespace) -> int: """Handle the regenerate command.""" + _apply_fact_check_args(args) root = validate_file_path(args.project_root) help_dir = validate_file_path(args.help_dir) diff --git a/src/attune_author/fact_check/__init__.py b/src/attune_author/fact_check/__init__.py new file mode 100644 index 0000000..24ec513 --- /dev/null +++ b/src/attune_author/fact_check/__init__.py @@ -0,0 +1,105 @@ +"""AST-based post-generation fact-check for polished docs. + +Phase 1 of the polish-fact-check spec +(``docs/specs/polish-fact-check``). Each check module surfaces +findings into a shared :class:`FactCheckReport`; the caller +decides whether to soft-fail (append an ``## Unresolved +references`` block to the polished file) or strict-fail (raise +:class:`FactCheckError`). +""" + +from __future__ import annotations + +from pathlib import Path + +from . 
import cli_refs, md_links, numeric_refs, python_refs +from .config import load_config +from .report import ( + CHECK_CLI_REFS, + CHECK_MD_LINKS, + CHECK_NUMERIC_REFS, + CHECK_PYTHON_REFS, + FactCheckConfig, + FactCheckError, + FactCheckReport, + Finding, + Severity, + format_unresolved_block, +) + + +def check_polished_file( + polished_path: Path, + *, + project_root: Path, + config: FactCheckConfig | None = None, +) -> FactCheckReport: + """Run all enabled fact-check passes against ``polished_path``. + + Args: + polished_path: Markdown file produced by the polish pass. + project_root: Consumer project root. Used to resolve CLI + ``--help`` output, ``.help/features.yaml``, and to + match per-file skip entries from configuration. + config: Optional explicit config; ``None`` means load + from the project's ``pyproject.toml``. + + Returns: + A :class:`FactCheckReport` with zero or more findings. + Callers gate on ``report.has_errors()`` for strict mode + and feed ``report.findings`` to + :func:`format_unresolved_block` for soft-fail. + """ + cfg = config if config is not None else load_config(project_root) + report = FactCheckReport() + if not cfg.enabled: + return report + + try: + rel_path = polished_path.relative_to(project_root).as_posix() + except ValueError: + rel_path = polished_path.name + + if cfg.is_check_enabled(CHECK_PYTHON_REFS, rel_path): + report.extend(python_refs.check(polished_path)) + if cfg.is_check_enabled(CHECK_CLI_REFS, rel_path): + report.extend(cli_refs.check(polished_path, project_root)) + if cfg.is_check_enabled(CHECK_MD_LINKS, rel_path): + report.extend(md_links.check(polished_path)) + if cfg.is_check_enabled(CHECK_NUMERIC_REFS, rel_path): + report.extend(numeric_refs.check(polished_path, project_root)) + + return report + + +def apply_soft_fail(polished_path: Path, report: FactCheckReport) -> bool: + """Append the unresolved-references block to ``polished_path``. 
+ + Returns True if the block was appended, False if there was + nothing to append (empty report). + """ + block = format_unresolved_block(report.findings) + if not block: + return False + existing = polished_path.read_text(encoding="utf-8") + if not existing.endswith("\n"): + existing += "\n" + polished_path.write_text(existing + block + "\n", encoding="utf-8") + return True + + +__all__ = [ + "CHECK_CLI_REFS", + "CHECK_MD_LINKS", + "CHECK_NUMERIC_REFS", + "CHECK_PYTHON_REFS", + "FactCheckConfig", + "FactCheckError", + "FactCheckReport", + "Finding", + "Severity", + "apply_soft_fail", + "check_polished_file", + "format_unresolved_block", + "load_config", +] diff --git a/src/attune_author/fact_check/cli_refs.py b/src/attune_author/fact_check/cli_refs.py new file mode 100644 index 0000000..db0556d --- /dev/null +++ b/src/attune_author/fact_check/cli_refs.py @@ -0,0 +1,211 @@ +"""Check CLI flag references against locally-installed CLI help. + +For each ``attune --flag`` pattern referenced in +the polished file, run ``attune --help`` once, +parse the flag set, and assert the referenced flag appears. +Findings carry a "version coupling" disclaimer block per spec +§1.4 so consumers know which attune-ai version was probed and +how to override. +""" + +from __future__ import annotations + +import re +import shutil +import subprocess +from pathlib import Path + +from .report import CHECK_CLI_REFS, Finding + +#: Match prose like ``attune ops --read-only`` or +#: ``attune workflow run security-audit --path src/``. Captures +#: the subcommand chain and the flag separately. ``cli`` is +#: parameterized so the same module works for non-attune +#: consumers in future phases. +_FLAG_PATTERN_TMPL = ( + r"`\s*(?P<cli>{cli})" r"(?P<sub>(?:\s+[a-z][a-z0-9-]*)*?)" r"\s+(?P<flag>--[a-z][a-z0-9-]*)" +) + +_FLAG_IN_HELP = re.compile(r"--[a-z][a-z0-9-]*") + + +def _resolve_cli_name(project_root: Path) -> str: + """Pick the consumer CLI name. Defaults to ``attune``. 
+ + Hook point for future phases — for now we read from + ``[tool.attune-author.fact-check].cli_name`` if present. + """ + pyproject = project_root / "pyproject.toml" + if not pyproject.exists(): + return "attune" + try: + import tomllib + except ImportError: # pragma: no cover - Py <3.11 fallback + import tomli as tomllib # type: ignore[import-not-found,no-redef] + try: + data = tomllib.loads(pyproject.read_text(encoding="utf-8")) + except (OSError, ValueError): + return "attune" + return ( + data.get("tool", {}) + .get("attune-author", {}) + .get("fact-check", {}) + .get("cli_name", "attune") + ) + + +def _installed_version(cli: str) -> str: + """Return the installed package version for ``cli``. + + Attempts ``importlib.metadata.version`` against a best-guess + package name, then falls back to running ``<cli> --version``, + then to ``"unknown"`` if both fail. + """ + pkg_guess = "attune-ai" if cli == "attune" else cli + try: + from importlib.metadata import PackageNotFoundError, version + + return version(pkg_guess) + except PackageNotFoundError: + pass + except Exception: # noqa: BLE001 + # INTENTIONAL: any metadata-system error falls through to + # the CLI probe; we don't want to fail the whole check on + # a malformed dist-info. 
+ pass + try: + result = subprocess.run( + [cli, "--version"], + capture_output=True, + text=True, + timeout=5, + check=False, + ) + out = (result.stdout or result.stderr).strip() + if out: + return out.splitlines()[0] + except (OSError, subprocess.SubprocessError): + pass + return "unknown" + + +def _help_text(cli: str, subcommand_chain: list[str], timeout: int = 5) -> str | None: + """Run ``<cli> <subcommand chain> --help`` and return stdout, or None.""" + if shutil.which(cli) is None: + return None + try: + result = subprocess.run( + [cli, *subcommand_chain, "--help"], + capture_output=True, + text=True, + timeout=timeout, + check=False, + ) + except (OSError, subprocess.SubprocessError): + return None + # ``--help`` typically exits 0; if it didn't, the chain was + # likely invalid — treat as "no flags found" so the chain + # itself becomes the finding. + if result.returncode != 0: + return None + return result.stdout or "" + + +def _extract_flags(help_text: str) -> set[str]: + """Return the set of flag names that appear in ``help_text``.""" + return set(_FLAG_IN_HELP.findall(help_text)) + + +def _version_block(cli: str, version: str, finding_path: str) -> str: + """Build the per-finding version-coupling messaging block.""" + return ( + f"\n\nDetected against {cli} {version} (installed in active venv). " + "If you are regenerating against a different version, verify the " + f"flag exists in that version's `{cli} --help`.\n" + "To override:\n" + f" - One-off: attune-author generate FEATURE --skip-check {CHECK_CLI_REFS}\n" + " - Per file: [tool.attune-author.fact-check.skip]\n" + f' "{finding_path}" = ["{CHECK_CLI_REFS}"]' + ) + + +def check(polished_path: Path, project_root: Path) -> list[Finding]: + """Run the cli-refs check on ``polished_path``. + + Returns findings for any ``<cli> <subcommand> --flag`` reference + whose flag does not appear in the cached ``--help`` output + for that subcommand chain. 
+ """ + cli = _resolve_cli_name(project_root) + if shutil.which(cli) is None: + # No CLI to probe against — skip silently. The check is + # opportunistic; raising on a missing dev dep would be + # worse than the false positives we're trying to prevent. + return [] + pattern = re.compile(_FLAG_PATTERN_TMPL.format(cli=re.escape(cli))) + text = polished_path.read_text(encoding="utf-8") + + try: + rel_path = polished_path.relative_to(project_root).as_posix() + except ValueError: + rel_path = polished_path.name + + # Cache: (chain-tuple) -> set of flags. None means the chain + # itself was rejected; we surface that as a separate finding. + flag_cache: dict[tuple[str, ...], set[str] | None] = {} + version: str | None = None + findings: list[Finding] = [] + seen: set[tuple[tuple[str, ...], str]] = set() + + for lineno, line in enumerate(text.splitlines(), start=1): + for match in pattern.finditer(line): + sub_raw = (match.group("sub") or "").strip() + chain = tuple(sub_raw.split()) if sub_raw else () + flag = match.group("flag") + + key = (chain, flag) + if key in seen: + continue + seen.add(key) + + if chain not in flag_cache: + help_text = _help_text(cli, list(chain)) + flag_cache[chain] = _extract_flags(help_text) if help_text is not None else None + + flag_set = flag_cache[chain] + if flag_set is None: + if version is None: + version = _installed_version(cli) + chain_str = " ".join([cli, *chain]).strip() + findings.append( + Finding( + check=CHECK_CLI_REFS, + severity="error", + location=f"Line {lineno}", + message=( + f"`{chain_str}` — subcommand not found" + + _version_block(cli, version, rel_path) + ), + ) + ) + continue + if flag not in flag_set: + if version is None: + version = _installed_version(cli) + chain_str = " ".join([cli, *chain]).strip() + findings.append( + Finding( + check=CHECK_CLI_REFS, + severity="error", + location=f"Line {lineno}", + message=( + f"`{chain_str} {flag}` — flag not found in `{chain_str} --help`" + + _version_block(cli, version, 
rel_path) + ), + ) + ) + + return findings + + +__all__ = ["check"] diff --git a/src/attune_author/fact_check/config.py b/src/attune_author/fact_check/config.py new file mode 100644 index 0000000..eec9427 --- /dev/null +++ b/src/attune_author/fact_check/config.py @@ -0,0 +1,59 @@ +"""Load fact-check configuration from ``pyproject.toml``. + +Reads the ``[tool.attune-author.fact-check]`` table. Missing +sections fall back to :class:`FactCheckConfig` defaults — which +match the spec's "all checks on, soft-fail" Phase 1 defaults. +""" + +from __future__ import annotations + +from pathlib import Path + +from .report import FactCheckConfig + + +def _read_toml(path: Path) -> dict[str, object]: + if not path.is_file(): + return {} + try: + import tomllib + except ImportError: # pragma: no cover - Py <3.11 fallback + import tomli as tomllib # type: ignore[import-not-found,no-redef] + try: + return tomllib.loads(path.read_text(encoding="utf-8")) + except (OSError, ValueError): + return {} + + +def load_config(project_root: Path) -> FactCheckConfig: + """Build a :class:`FactCheckConfig` from the project's pyproject. + + Unknown keys are ignored — the goal is forward-compat with + future phases that add their own toggles under the same table. 
+ """ + data = _read_toml(project_root / "pyproject.toml") + tool = data.get("tool", {}) if isinstance(data, dict) else {} + author = tool.get("attune-author", {}) if isinstance(tool, dict) else {} + section = author.get("fact-check", {}) if isinstance(author, dict) else {} + skip = section.get("skip", {}) if isinstance(section, dict) else {} + + def _bool(key: str, default: bool) -> bool: + value = section.get(key, default) if isinstance(section, dict) else default + return bool(value) + + cfg = FactCheckConfig( + enabled=_bool("enabled", True), + soft_fail=_bool("soft_fail", True), + check_python_refs=_bool("check_python_refs", True), + check_cli_refs=_bool("check_cli_refs", True), + check_md_links=_bool("check_md_links", True), + check_numeric_refs=_bool("check_numeric_refs", True), + ) + if isinstance(skip, dict): + cfg.skip = { + str(k): [str(c) for c in v] if isinstance(v, list) else [] for k, v in skip.items() + } + return cfg + + +__all__ = ["load_config"] diff --git a/src/attune_author/fact_check/md_links.py b/src/attune_author/fact_check/md_links.py new file mode 100644 index 0000000..a7c5ae7 --- /dev/null +++ b/src/attune_author/fact_check/md_links.py @@ -0,0 +1,62 @@ +"""Check relative markdown link targets exist. + +For each ``[label](target)`` reference in the polished file: +- If ``target`` looks like an external URL or mailto, skip. +- Otherwise resolve relative to the polished file's directory + and assert the target file exists. + +Anchor existence is out of scope for Phase 1 per spec §1.5. +""" + +from __future__ import annotations + +import re +from pathlib import Path + +from .report import CHECK_MD_LINKS, Finding + +#: Match inline markdown links ``[label](target)``. Captures +#: the target. Greedy-safe (no nested parens supported, which +#: is fine for our doc style). +_LINK = re.compile(r"\[(?P