feat(clio): auto-ingest documenter output (docs/**/*.md) into Clio#50

Merged
fstamatelopoulos merged 1 commit into main from iteration-7/clio-ingest-docs
May 15, 2026

@fstamatelopoulos
Owner

Summary

Closes a long-standing inconsistency: cfcf auto-ingests almost every workspace artifact into Clio (iteration logs, judge assessments, reflection analyses, plan.md, decision-log, architect-review, problem-pack, context-pack…), but explicitly excluded the documenter's docs/*.md output. The exclusion's stated rationale ("docs/ is canonical, Clio is redundant") applied equally to plan.md — which IS auto-ingested. The carve-out was inconsistent and cost cross-workspace discoverability of the most polished, integrative artifact a workspace produces.

Surfaced by a user question while planning the v0.24.4 dogfood of the freshly finished gmbot run: "Is the documenter ingesting the generated docs into Clio? Or better the harness doing that?" The answer was neither, by design. The design was wrong. This PR fixes it.

Behaviour

After the documenter completes (both auto-document path inside the iteration loop AND standalone cfcf document):

  • Walk <repo>/docs/ recursively
  • For each *.md file: ingest into Clio as a separate document
  • Stable per-file title: <workspace>: docs/<relative-path> (e.g. gmbot: docs/architecture.md, gmbot: docs/api/auth.md)
  • updateIfExists: true — re-running the documenter overwrites in place, never duplicates; sha256 dedup makes unchanged content a no-op
  • Author stamp: documenter|<adapter>|<model>
  • Metadata: {role: "documenter", artifact_type: "documenter-output", file_path: "<rel>", tier: "semantic", ingest_trigger: "loop-auto" | "manual", …}. The documenter-output artifact_type makes the new docs filterable via cfcf clio search --metadata
  • Non-.md files ignored (images, JSON config); dot-dirs (.git, .vscode) skipped; empty files skipped
  • Per-file errors logged + counted but never fail the batch (best-effort, same policy as other auto-ingest hooks)
  • Respects clio.ingestPolicy: "off" → no-op; "summaries-only" and "all" → runs
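
The selection and labelling rules in the list above can be sketched as a pure function over files that have already been read from docs/. This is a hedged illustration, not the code in loop-ingest.ts: the names DocFile, PlannedDoc, PlanOptions, isIngestible and planDocumenterIngest are hypothetical, only the metadata keys spelled out above are included (the elided ones are left out), and two details are assumptions: "empty" is taken to mean zero-length content, and file_path is taken to be relative to docs/.

```typescript
// Illustrative sketch only: every name here is hypothetical, not cfcf's real API.

type IngestPolicy = "off" | "summaries-only" | "all";
type IngestTrigger = "loop-auto" | "manual";

interface DocFile {
  relPath: string; // path relative to <repo>/docs/, using "/" separators
  content: string;
}

interface PlannedDoc {
  title: string;
  author: string;
  metadata: Record<string, string>;
}

interface PlanOptions {
  workspace: string;
  adapter: string;
  model: string;
  policy: IngestPolicy;
  trigger: IngestTrigger;
}

// Keep only non-empty *.md files that do not live under a dot-directory.
// Assumption: "empty" means zero-length content; the extension check is case-sensitive.
function isIngestible(file: DocFile): boolean {
  const dirs = file.relPath.split("/").slice(0, -1);
  const inDotDir = dirs.some((dir) => dir.startsWith("."));
  return !inDotDir && file.relPath.endsWith(".md") && file.content.length > 0;
}

// Turn the surviving files into one planned Clio document each.
function planDocumenterIngest(files: DocFile[], opts: PlanOptions): PlannedDoc[] {
  // "off" is a no-op; "summaries-only" and "all" both run.
  if (opts.policy === "off") return [];
  return files.filter(isIngestible).map((file) => ({
    title: `${opts.workspace}: docs/${file.relPath}`,
    author: `documenter|${opts.adapter}|${opts.model}`,
    metadata: {
      role: "documenter",
      artifact_type: "documenter-output",
      file_path: file.relPath, // assumption: relative to docs/
      tier: "semantic",
      ingest_trigger: opts.trigger,
    },
  }));
}
```

Under these assumptions, a file at api/auth.md in workspace gmbot yields the title gmbot: docs/api/auth.md, while .vscode/notes.md, diagram.png and a zero-length empty.md are dropped before anything reaches Clio.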

Files

  • packages/core/src/clio/loop-ingest.ts — new ingestDocumenterOutput() + walkMarkdownFiles() helper
  • packages/core/src/iteration-loop.ts — call site inside auto-document branch
  • packages/core/src/documenter-runner.ts — call site for standalone cfcf document
  • packages/core/src/templates/cfcf-documenter-instructions.md — prose updated to match reality

Test plan

  • bun run test — 1049 tests pass (+11 new)
  • bun run typecheck — clean
  • 11 new tests in loop-ingest.test.ts cover: multi-file ingest with stable titles, recursive walk, updateIfExists round-trip (one doc per file, content updates in place), non-.md filtering, dot-dir skipping, missing docs/ no-op, empty file skipping, author stamp, trigger metadata (loop-auto vs manual), clio.ingestPolicy: "off" → no-op, clio.ingestPolicy: "summaries-only" → runs
  • Manual smoke (post-merge): run cfcf document on a workspace with docs/architecture.md; verify cfcf clio search "<term>" --metadata '{"artifact_type":"documenter-output"}' returns the doc with the expected title format
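
The updateIfExists round-trip that the tests exercise can be pictured with a small in-memory stand-in for the document store. This is a sketch under stated assumptions, not Clio's implementation: InMemoryDocStore and its upsert method are hypothetical, and it assumes the stable title is the key that updateIfExists matches on and that the sha256 is taken over the raw file content.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory model of the store; Clio's real storage is not shown in this PR.
type UpsertResult = "created" | "updated" | "unchanged";

interface StoredDoc {
  content: string;
  sha256: string;
}

class InMemoryDocStore {
  // Assumption: documents are matched by their stable title.
  private docs = new Map<string, StoredDoc>();

  upsert(title: string, content: string): UpsertResult {
    const sha256 = createHash("sha256").update(content, "utf8").digest("hex");
    const existing = this.docs.get(title);
    if (existing === undefined) {
      this.docs.set(title, { content, sha256 });
      return "created";
    }
    // Same hash means same content: re-ingesting is a no-op.
    if (existing.sha256 === sha256) return "unchanged";
    // Different content under the same title overwrites in place.
    this.docs.set(title, { content, sha256 });
    return "updated";
  }

  get(title: string): StoredDoc | undefined {
    return this.docs.get(title);
  }

  get size(): number {
    return this.docs.size;
  }
}
```

In this model, re-running the documenter against an untouched docs/architecture.md reports "unchanged", an edited file reports "updated", and the store never holds more than one entry per title.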

Versioning

Target: v0.24.4 patch. New behaviour is additive and gated by the existing clio.ingestPolicy (no opt-out is needed beyond the existing config knob). Same shape as v0.24.3's release.

🤖 Generated with Claude Code

Closes a long-standing inconsistency: cfcf auto-ingests almost
every workspace artifact (iteration logs, judge assessments,
reflection analyses, plan.md, decision-log, architect-review,
problem-pack, context-pack...) — but explicitly excluded the
documenter's docs/*.md output. The exclusion's stated rationale
("docs/ is canonical, Clio is redundant") applied equally to
plan.md, which IS auto-ingested. Carve-out was inconsistent and
cost cross-workspace discoverability of the most polished,
integrative artifact a workspace produces.

Behaviour:
- After documenter completes (both auto-document path in
  iteration-loop AND standalone cfcf document), walk
  <repo>/docs/ recursively; ingest every *.md as a separate
  Clio doc.
- Stable per-file title: <workspace>: docs/<relative-path>
- updateIfExists: true → re-runs overwrite in place; sha256
  dedup makes unchanged content a no-op.
- Author: documenter|<adapter>|<model>
- Metadata includes artifact_type=documenter-output + file_path
  + ingest_trigger ("loop-auto" | "manual") for filtering.
- Non-.md files skipped; dot-dirs skipped; empty files skipped.
- Per-file errors logged + counted but never fail the batch.
- Respects clio.ingestPolicy (off / summaries-only / all).
  Documenter output runs on summaries-only — it IS a summary.

Implementation: new ingestDocumenterOutput() in loop-ingest.ts +
walkMarkdownFiles() helper; two call sites (iteration-loop
auto-document path, documenter-runner standalone path);
documenter template prose updated to match reality.

Test coverage: 11 new tests covering multi-file ingest, recursive
walk, updateIfExists round-trip, non-.md filtering, dot-dir
skipping, missing docs/ no-op, empty file skipping, author stamp,
trigger metadata, ingestPolicy gates. All 1049 tests pass.
Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@fstamatelopoulos fstamatelopoulos merged commit b551159 into main May 15, 2026
3 checks passed
@fstamatelopoulos fstamatelopoulos deleted the iteration-7/clio-ingest-docs branch May 15, 2026 00:20
fstamatelopoulos added a commit that referenced this pull request May 15, 2026
Bumps version 0.24.3 -> 0.24.4 and consolidates the [Unreleased]
section into a [0.24.4] release entry. Shipped via PR #50.

Net-new since v0.24.3:
- Clio auto-ingest of documenter docs/**/*.md output (loop-auto
  + standalone cfcf document, with updateIfExists, stable per-file
  titles, gated by clio.ingestPolicy)

Test coverage: 1049 tests pass (+11 new). Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>