diff --git a/.claude/rules/platform/memory-protocol.md b/.claude/rules/platform/memory-protocol.md
index 2fe2dc8..bda6d09 100644
--- a/.claude/rules/platform/memory-protocol.md
+++ b/.claude/rules/platform/memory-protocol.md
@@ -14,6 +14,24 @@ The workaround comes from the issue thread (see also [#36636](https://github.com

 **Do not remove the `@MEMORY.md` line from `CLAUDE.md`.** Without it, workspace `MEMORY.md` exists on disk but never enters the agent's initial context — your memory index becomes invisible to the agent.

+## Surfacing pending lint items
+
+When workspace `MEMORY.md` contains a `## Pending Review (Lint findings)` section, the agent **MUST** proactively surface those items in the current conversation. Ninja has stated explicitly that he will not proactively ask — without this rule, the items rot indefinitely. Context: ADR-069 (memory-lint Phase B.5 trial 2026-05-17 → 2026-06-17, beads `workspace-txyu`).
+
+### Surfacing strategy
+
+- **Preferred trigger (topic-related):** when the current conversation's topic naturally relates to a pending item, bring it up inline as part of the relevant answer (e.g., "By the way, there is an unresolved contradiction about X: which of the options is current?"). Topic-relatedness is the agent's own judgment, not a strict keyword match — err on the side of mentioning when there is a plausible connection.
+- **Aged escalation (>14 days):** if a pending item is older than 14 days AND no topic-relevant opportunity has arisen during the session, surface it at a natural pause or at task end. Do not let aged items sit silent indefinitely.
+- **One per session max:** never dump multiple pending items in a single message or session. Pick the most topic-relevant item, or if none is relevant, the oldest one.
+- **Never interrupt urgency:** if Ninja is mid-urgent-task (incident, time-pressured debugging, mid-deploy), do not derail the flow with a pending item — wait for a natural break or for the urgent work to finish.
+- **After resolution:** once Ninja resolves a contradiction, update BOTH memory files of the resolved pair (under `memory/auto/`) with the resolved value AND remove the corresponding bullet from the `## Pending Review (Lint findings)` section in workspace `MEMORY.md` — in the same operation. Write anti-loop fields to BOTH files symmetrically, not just the "losing" one — the nightly per-pair exclusion lookup checks EITHER file's `do_not_reopen` list, so a one-sided write still suppresses the pair, but symmetric writes match the canonical SKILL.md Phase B.5 step 5 invariant and keep nightly and interactive flows aligned. + - **Concurrency: take the consolidation lock, do not just check it.** Before any edit, acquire the same lock the nightly consolidator uses — a file-existence check is TOCTOU-racy (cron can grab the lock between the check and the first write) and never reclaims a stale lock. **All paths in this protocol are anchored at the workspace root via `${CLAUDE_PROJECT_DIR}`, never `$PWD` or relative paths.** The agent's shell may be in any subdirectory when invoked, so a relative `.claude/...` path or a `$PWD`-relative lock can target the wrong file (or fail to find the script entirely) and silently bypass coordination with the nightly consolidator. Run `bash "${CLAUDE_PROJECT_DIR}/.claude/skills/memory-consolidation/scripts/lock.sh" acquire "${CLAUDE_PROJECT_DIR}/.consolidation.lock" 60` and capture the `ACQUIRED ` value. If the script prints `LOCKED` (exit 1), tell Ninja the resolution is deferred and re-attempt at the next opportunity (cron runs are minutes, not hours). When acquired, pass the token to every later `refresh`/`release` call to prove ownership and release the lock at the end of the resolution — including on failure paths (after any rollback). The `60` arg is the stale-TTL in minutes; the script reclaims abandoned locks automatically. 
Also check `.maintenance.lock` first via `bash "${CLAUDE_PROJECT_DIR}/.claude/skills/memory-consolidation/scripts/lock.sh" check-maintenance "${CLAUDE_PROJECT_DIR}"`; defer if it returns `MAINTENANCE`. + - **Use the same safe-edit flow as the nightly path.** Wrap every affected file edit with `"${CLAUDE_PROJECT_DIR}/.claude/skills/memory-consolidation/scripts/safe-edit.sh" backup → write → verify → clean` (and `rollback` on verify failure), exactly as Phase C does. Always reference the script and target files via `${CLAUDE_PROJECT_DIR}` absolute paths so the call is correct regardless of the agent's current working directory. When the resolution touches two memory files plus `MEMORY.md`, apply the paired two-phase commit pattern from Phase B.5 step 5: `backup` ALL affected files first, apply all edits, run `verify` on ALL of them, and only `clean` the backups once every `verify` passes. If any `verify` fails, `rollback` every file from its backup and report the failure to Ninja — never leave a partial resolution where one file is annotated but another is untouched. + - **MEMORY.md `SUSPICIOUS_SHRINK` allowance.** When the only `MEMORY.md` change is to the Pending Review section (bullets and/or the section heading itself — interactive resolution typically just removes one bullet, but the bypass is defined to cover any in-section change), `safe-edit.sh verify` may legitimately return `SUSPICIOUS_SHRINK` (the section can be a large fraction of a small `MEMORY.md`). Apply the same bypass as the nightly path (SKILL.md Phase D step 2): accept the verify failure if (a) the post-edit file still contains the `# Memory Index` heading AND (b) everything OUTSIDE the Pending Review section is byte-identical between the backup (`${FILEPATH}.consolidation-backup`) and the post-edit file. 
Concretely, `outside_bytes(file)` is the file content with the section excised — from the `## Pending Review (Lint findings)` heading line through the byte immediately preceding the next `## ` heading or EOF, inclusive of any trailing blank line separating the section from what follows; if the section is absent in a given state, `outside_bytes` equals that full file. The bypass holds iff `outside_bytes(backup) == outside_bytes(post-edit)` (exact byte-string equality — content outside the section is not supposed to move). Otherwise rollback as usual. Note the bypass briefly when reporting the resolution to Ninja so the audit trail is preserved. + - **Anti-loop frontmatter.** Add `resolved_at`, `resolution_basis`, and update the `do_not_reopen` list on BOTH files of the resolved pair. `do_not_reopen` is a YAML list of records, each with `partner` (the other file's bare name) and `before` (a YYYY-MM-DD date). For this pair's partner: find any existing entry and apply `MAX(existing.before, new_value)` to its `before` field — NEVER shorten this pair's window. If no entry exists for this partner, APPEND a new `{partner, before}` record. Do NOT touch entries for unrelated partners — that's what makes the cooldown genuinely per-pair. `resolved_at` and `resolution_basis` are scalars and reflect the most recent resolution across all of this file's pairs. Default `before` value: `today + 90 days`. The `before` field is always a date — semantic-condition values like `"Ninja revisits topic X"` are not supported (the consolidation skill has no mechanism to auto-detect such events; use a far-future date if indefinite suppression is genuinely required). Match the canonical YAML in `.claude/skills/memory-consolidation/SKILL.md` Phase B.5 step 5. + - **Pending Review bullet format.** Preserve the bullet format `- detected_at=YYYY-MM-DD — file-A vs file-B — ` when editing — do not restyle existing bullets or rename the heading, since the consolidation skill parses both. 
If removing the last bullet empties the section, remove the section heading itself (do not leave an empty `## Pending Review (Lint findings)` heading behind). + - **Bullet match key when removing.** Remove ONLY the bullet whose triple `(file-A, file-B, reason)` matches the specific contradiction you are resolving — the same triple used as the dedup key when the bullet was written (SKILL.md Phase D step 2). The match key is the bullet's literal `reason` field (the third ` — `-separated segment after sanitization), NOT the frontmatter `resolution_basis` field which is a separate human-readable summary. The unordered file pair must match (order-insensitive) and the sanitized `reason` text must match exactly. If the same pair has multiple unresolved bullets with different reasons, leave the non-matching bullets in place — they represent distinct unresolved contradictions that still need resolution. + ## What goes WHERE: rules vs memory ### Rule (`.claude/rules/custom/`) diff --git a/.claude/skills/memory-consolidation/SKILL.md b/.claude/skills/memory-consolidation/SKILL.md index aa3fd1a..633289d 100644 --- a/.claude/skills/memory-consolidation/SKILL.md +++ b/.claude/skills/memory-consolidation/SKILL.md @@ -2,6 +2,12 @@ Nightly skill that crystallizes recent human conversation sessions into organized persistent memory. Reads session transcripts from the last 48 hours, extracts noteworthy facts, updates MEMORY.md and memory/auto/ files, and writes a narrative diary digest to memory/diary/. +## Feature Flags + +- `LINT_PHASE_B5_ENABLED=true` — Trial: 2026-05-17 → 2026-06-17. **The line above is the sole source of truth** — not an environment variable, not external config, not a settings file lookup. To roll back: edit that line to `LINT_PHASE_B5_ENABLED=false`, commit; the next nightly run reads this skill file and respects the new value. No data migration. See ADR-069. 
+ +When the flag value (as read from the line above) is `false`, the skill executes Phases 0/A/B/C/D as before — no cross-file lint, no Pending Review writes to MEMORY.md, no appends to `memory/lint-stats.jsonl`. Phase C still applies the new frontmatter fields (`confidence`, `revisit_if`) when creating or updating files, since those are forward-compatible regardless of the lint pass. + ## Context This skill runs as a nightly cron. It is the agent's equivalent of sleep — a time for absorption and crystallization of information, not mechanical fact transfer. The goal is to understand what new information means in the context of existing memory, update stale entries, resolve contradictions, and produce diary entries as narrative digests. @@ -105,12 +111,70 @@ bash "${CLAUDE_SKILL_DIR}/scripts/lock.sh" refresh "${CLAUDE_PROJECT_DIR}/.conso ``` If refresh returns `STOLEN`, another run has reclaimed the lock — abort the pipeline immediately and output NO_REPLY. +### Phase B.5: Cross-file Lint (contradiction detection) + +**Gated by `LINT_PHASE_B5_ENABLED`** declared in the Feature Flags section at the top of this skill file (line ~7). The flag is read by the skill agent from this file — it is NOT an environment variable, NOT an external config lookup. During the 2026-05-17 → 2026-06-17 trial window the documented value is `true`. This entire phase is skipped only when the documented value reads exactly `LINT_PHASE_B5_ENABLED=false` — the rollback path (no data migration) for aborting the trial early. The documented value does NOT auto-flip after the trial window ends; the trial post-mortem decides whether to land the feature, in which case the value is changed by editing this file. Skipped runs MUST NOT write to `memory/lint-stats.jsonl` or touch the workspace `MEMORY.md` "Pending Review" section. 
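The flag lookup described above can be sketched as follows. This is a minimal illustration, assuming the flag is declared at the start of a Markdown bullet as in the Feature Flags section; the helper name `phase_b5_enabled` and the regex are hypothetical, and the skill itself is read by an agent rather than by code like this. Anchoring on the start of the bullet matters because the same bullet also quotes `LINT_PHASE_B5_ENABLED=false` inside its rollback instructions, so a plain substring search would misread an enabled flag as disabled.

```python
import re

# Hypothetical pattern: matches only a flag declaration that opens a markdown
# bullet, not later mentions of the flag inside prose on the same line.
_FLAG_RE = re.compile(r"^\s*-\s*`LINT_PHASE_B5_ENABLED=([A-Za-z]+)`", re.MULTILINE)


def phase_b5_enabled(skill_text: str) -> bool:
    """Return False only when the declared value reads exactly `false`."""
    match = _FLAG_RE.search(skill_text)
    if match is None:
        # Assumption: with no declaration found the phase is not skipped,
        # because the spec skips only on an exact `false` value.
        return True
    return match.group(1) != "false"
```

Treating a missing declaration as "enabled" is one reading of "skipped only when the documented value reads exactly `LINT_PHASE_B5_ENABLED=false`"; a stricter caller could instead abort when no declaration is found.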
+ +Cross-file scan of existing `memory/auto/*.md` files for contradictions that the per-fact Phase B check cannot catch (Phase B only compares new vs existing, not existing vs existing). + +Initialize accumulators: `candidates_found = 0`, `contradictions_detected = 0`, `auto_resolved = 0`, `pending_added = 0`, `pending_removed = 0`, `pending_review = []`, `pending_resolved = []`. Also initialize `mutations_applied = 0` here — this counter is **shared with Phase C** (do not re-zero on entry to Phase C). `pending_resolved` accumulates `{files:[A,B], reason: canonical_reason}` records for pairs that this run auto-resolved (used by Phase D step 2 to clear any pre-existing MEMORY.md "Pending Review" bullet for the same contradiction so stale flags do not accumulate across runs). `pending_removed` is the count of actual MEMORY.md bullets cleared by Phase D step 2 — incremented per bullet match, NOT per `pending_resolved` entry (since most entries are no-ops on the first run where the contradiction is both detected and resolved before any bullet was ever written). + +1. **Build lightweight representation.** Iterate `memory/auto/*.md` and for each file extract: `{file, type, name, tags, title_tokens, body_predicates, negation_markers, do_not_reopen, recent_diary_mention}`. Source the fields from frontmatter (`type`, `name`, optional `tags`, the anti-loop list). `do_not_reopen` is read as a YAML list of records, each with `partner` (filename) and `before` (YYYY-MM-DD date) — if the field is absent treat as empty list. Normalize each `partner` to its bare filename (apply `basename`, strip any directory prefix) and dedupe by `partner` (if duplicate entries exist for the same partner, keep the one with the latest `before` — never shorten an existing pair's cooldown). Tokenize the `name` slug and the first body heading (or the `description` frontmatter field if no heading exists, else just the `name` slug tokens) into `title_tokens`. 
Skip stop words (`a, an, the, of, for, with, and, or, to, in, on`) and tokens of length < 3. Also scan the file body (everything after the closing `---` of the frontmatter) for two sets used by step 2's signals: `body_predicates` = the subset of `{prefers, uses, hates, requires}` that appear as case-insensitive whole-word matches (single-token only — negation-style phrases like "do not"/"don't" are covered by `negation_markers`); `negation_markers` = the subset of `{not, never, avoid, instead}` that appear as case-insensitive whole-word matches. The two sets are disjoint by construction so a single shared word cannot satisfy both signals in step 2. Both sets are empty if no match is found. + + **Provenance extraction for step 4a.** Populate `recent_diary_mention: bool` by scanning diary files in `memory/diary/` whose filename date (`YYYY-MM-DD.md`) is within the last 48 hours of today's date. In each diary file, look only at lines that fall under a `### Memory Changes` subsection (between that heading and the next `### ` or `## ` heading or EOF — Phase D step 1 writes `### Memory Changes` as the canonical heading for the per-file create/update log). Set `recent_diary_mention = true` if any such line contains either (a) the file's bare basename (e.g., `topic-slug.md`) as a substring, OR (b) the path form `memory/auto/` as a substring. This is the operational definition of "direct diary/session evidence" used by step 4a — a file mentioned in a recent diary's Memory Changes block was grounded by a real conversation event, whereas a file absent from recent diaries is older/inferred. No frontmatter field is required; provenance is derived entirely from the diary log Phase D already writes. If no diary files exist or none fall within 48 hours, `recent_diary_mention = false` for all files (step 4a then yields no winner and the resolver falls through to step 4b). + +2. **Cheap candidate generation FIRST** — do not blindly LLM-judge all `O(n^2)` pairs. 
For each unordered pair `(A, B)`, count matches across these signals: + - same `type` field + - overlapping `title_tokens` (≥ 1 shared token, ignoring stop words) + - overlapping `tags` (≥ 1 shared tag, if present) + - non-empty intersection of `body_predicates` between the two files (from the set extracted in step 1) + - non-empty intersection of `negation_markers` between the two files (from the set extracted in step 1) + + A pair is a **candidate** only if at least two of the above signals match. Each signal contributes at most 1 to the match count regardless of how many tokens/tags overlap. Increment `candidates_found` for each candidate pair. With ~40 files, false positives are the primary concern; this filter keeps LLM calls bounded. + + **Per-pair exclusion (anti-loop).** Skip the pair entirely if EITHER file's `do_not_reopen` list contains an entry whose `partner` matches the other file's bare name AND that entry's `before` date is later than today. Exclusion is genuinely per-pair: each partner has its own `before` date stored in its own record, so resolving A↔C cannot extend A↔B's cooldown. If the matching entry's `before` is absent or fails the `^\d{4}-\d{2}-\d{2}$` regex (malformed or legacy prior write), treat the exclusion as inactive for that specific pair and let the pair proceed to judgment — anti-loop requires a well-formed date. Only YYYY-MM-DD dates are recognized; the spec does not support semantic-condition values like `"Ninja revisits topic X"`, since the skill has no mechanism to auto-detect such events (use a far-future date if indefinite suppression is genuinely needed). + +3. **LLM judgment per candidate.** For each candidate pair, first establish canonical ordering: sort the two files by basename so that A's basename is lexicographically less than B's basename, and present them to the LLM in that order. 
This ensures `claim_a` / `claim_b` map deterministically to the lex-smaller / lex-larger file regardless of pair iteration order — without this, dedup keys produced by step 4c can differ between runs when filesystem enumeration flips the order. Then ask one in-skill question: "Do these two claims contradict each other, or is one a time-scoped evolution of the other?" The LLM must return a structured response: `{verdict, claim_a, claim_b}` where `verdict` is `contradiction` | `evolution` | `unrelated`, and `claim_a` / `claim_b` are the single full body lines (verbatim, including leading bullet/heading markers if any) from each file that carry the contradicting claim — with `claim_a` belonging to canonical-A (lex-smaller basename) and `claim_b` belonging to canonical-B. The `claim_a` / `claim_b` strings are used as exact match anchors in step 5; if either is empty or does not appear verbatim in the corresponding file body, downgrade the verdict to `unrelated` and log the mismatch in the Phase D diary Issues section. Only `contradiction` proceeds to step 4. **Time-scoped changes are NOT contradictions** — a fact like "used X then, uses Y now" is evolution, not contradiction. On malformed LLM output or transient error, treat as `unrelated` and log the failure in the Phase D diary Issues section. Increment `contradictions_detected` for each `contradiction` verdict. + +4. **Auto-resolve hierarchy** (apply in order, stop at first match): the rule is `evidence > confidence`. A "recency" tie-breaker was considered but dropped: `resolved_at` reflects a file's unrelated prior resolution history, not the freshness of the currently contradicting claim, so it is not a valid freshness proxy. Direct freshness evidence is already handled by (a). 
+ + **Compute `canonical_reason` FIRST (applies to every contradiction, regardless of resolution path).** Before evaluating a–c below, derive the deterministic `reason` string from the LLM's `claim_a` and `claim_b` strings returned in step 3 — NOT from any free-form LLM summary, which would be reworded between runs and break dedup. Step 3's canonical-ordering rule (lex-smaller basename = canonical-A) guarantees `claim_a` / `claim_b` map to the same files across runs, so the composed string is order-invariant. Composition: for each claim, strip any leading bullet/heading markers (`-`, `*`, `#`, plus a single following space) and surrounding whitespace, collapse internal runs of whitespace to a single space, then truncate to 80 characters at a UTF-8 codepoint boundary (count codepoints, never split a multi-byte sequence mid-character — invalid UTF-8 written to MEMORY.md will break downstream `json.loads`-per-line stats parsing), with no trailing whitespace. Compose as the literal string ` | ` (single ASCII pipe with spaces). Then apply the standard `reason` sanitization documented in Phase D step 2 (single line, strip leading `#`, collapse newlines to `; `, replace em-dashes `—` with hyphen-space `- `, truncate to 200 chars — again at a UTF-8 codepoint boundary). The resulting string is `canonical_reason`. It is the dedup key shared by ALL routes that flag this contradiction (4c plus the deferral routes in steps 5 and 6) AND by the resolution-removal path (step 5 successful auto-resolves append `{files, reason: canonical_reason}` to `pending_resolved` so Phase D can clear any pre-existing MEMORY.md bullet for this same contradiction). Because identity is the file pair plus the underlying claim lines, all routes use the SAME `canonical_reason` — deferral causes (verify failure, anchor invalidation, mutation-limit) are forensic metadata logged in the Phase D diary Issues section, NOT embedded in the bullet's `reason` field. 
Embedding cause text would split one contradiction's bullets across multiple variants and defeat dedup; it would also prevent a later run from removing the same contradiction's stale bullet when it succeeds in auto-resolving. + + a. **Direct evidence wins over inferred.** Use the `recent_diary_mention` boolean populated in step 1 (true iff the file's basename appears under a `### Memory Changes` heading in any diary file from the last 48 hours — see step 1's "Provenance extraction" paragraph for the exact definition). If exactly one side of the pair has `recent_diary_mention == true`, that side wins; the other is treated as the losing claim. If both sides have it (both recently grounded) or neither does (both stale/inferred), this leg yields no winner — fall through to step 4b. + b. **Higher confidence wins** if both sides have a `confidence` field and `|confidence_A − confidence_B| >= 0.2`. If either side lacks `confidence` (legacy files predating the schema), treat it as `0.7` for this comparison only. + c. **Otherwise flag for review** — do NOT edit either file. Append an entry to `pending_review` using the canonical shape `{files:[A,B], reason: canonical_reason, detected_at:}`. This is the ONLY in-phase append for the unresolved path; do NOT increment a separate `pending_added` counter here — Phase D step 2 computes the final post-dedup value from the actual bullets written. The same `{files, reason: canonical_reason, detected_at:}` shape is reused by the deferral routes in steps 5 and 6 below, so Phase D's parser regex matches every entry uniformly and dedup correctly merges a future 4c finding with an earlier deferral for the same contradiction. + +5. **Apply auto-resolved edits.** For each auto-resolved pair: + - **Edits MUST use `safe-edit.sh` with paired two-phase commit semantics.** Each resolved pair touches two `memory/auto/*.md` files (A and B); they must succeed or fail together. 
The flow is: (i) `backup` BOTH files first; (ii) apply the annotation + frontmatter edits to BOTH files; (iii) run `verify` on BOTH files; (iv) ONLY if both verifies succeed, `clean` both backups. If `verify` fails on either file, `rollback` BOTH files from their backups and route the pair to `pending_review` using `{files:[A,B], reason: canonical_reason, detected_at:}` — the SAME `canonical_reason` computed in step 4, so a future run's 4c finding for this contradiction dedup-merges into the existing bullet. Log the deferral cause `"edit verify failed (file=, error=)"` in the Phase D diary Issues section; do NOT embed the cause in the bullet's `reason` field. Never `clean` one backup before the other has verified — otherwise a partial mutation can leave file A annotated while file B is untouched, breaking the symmetric anti-loop guarantee. + - **Re-verify anchor before each file's edit.** A prior pair within the same run may have already annotated the same line. Immediately before applying the annotation (after `backup` but before mutating the file), re-read the target file's current body and confirm the relevant `claim_a` / `claim_b` string still appears verbatim AND appears exactly ONCE. Count verbatim occurrences with whole-line equality (each body line, post-strip of trailing whitespace, compared against the claim string post-strip). If the exact match no longer holds (zero occurrences), abort this pair, rollback both files from their backups, and route the pair to `pending_review` using `{files:[A,B], reason: canonical_reason, detected_at:}`. Log the deferral cause `"anchor invalidated by prior edit (file=)"` in the Phase D diary Issues section; do NOT embed the cause in the bullet's `reason` field. 
If the claim string appears MORE than once (ambiguous anchor: the LLM only returned a verbatim line, not which occurrence it judged — auto-annotating any single occurrence risks marking the wrong line while still writing the 90-day `do_not_reopen` cooldown), likewise abort this pair, rollback both files from their backups, and route to `pending_review` using the same shape. Log the deferral cause `"ambiguous anchor: claim line appears N times in file="` (substitute N) in the Phase D diary Issues section; do NOT embed the cause in the bullet's `reason` field. Never auto-annotate when the anchor count is not exactly 1 — Ninja can disambiguate during interactive resolution. + - **Never silent-delete.** Locate the losing claim line by exact-match against the `claim_a` / `claim_b` string returned in step 3 (whichever side lost the auto-resolve in step 4). The anchor re-verify above guarantees exactly one verbatim occurrence at this point. Append ` (superseded YYYY-MM-DD: )` to that line — do NOT delete the original text. The annotation lives as a trailing parenthetical on the same line so audit history is preserved. + - Add anti-loop fields to BOTH files' frontmatter. `do_not_reopen` is a **list of records** keyed by `partner` — locate the existing entry for this pair's partner (if any) and apply `MAX(existing.before, new_value)` to its `before` field; NEVER shorten this pair's cooldown. If no entry exists for this partner, APPEND a new `{partner, before}` record. Other files' entries (for unrelated partners) are untouched — resolving A↔C must not extend A↔B's cooldown. `resolved_at` and `resolution_basis` are scalars and reflect the most recent resolution across all of this file's pairs. **Default `new_value` for `before`: `today + 90 days`** (same default as the interactive resolution path in `.claude/rules/platform/memory-protocol.md`, so nightly and manual cooldowns match). 
Use a far-future date (e.g., year 2099) if indefinite suppression is genuinely required; the field is always a YYYY-MM-DD date.
+     ```yaml
+     resolved_at: YYYY-MM-DD
+     resolution_basis: ""
+     do_not_reopen:                # accumulating list — one record per resolved pair
+       - partner:                  # filename only, no path
+         before: YYYY-MM-DD        # always a date; other partners' dates are untouched
+     ```
+     `resolution_basis` MUST be sanitized: single line, max 200 chars, replace embedded newlines with `; `, strip leading `#`, double-quote and escape `"` as `\"` and `\` as `\\`. `before` values are always date-form (matching `^\d{4}-\d{2}-\d{2}$`) and written unquoted as YAML dates — semantic-condition values are not supported (the skill has no auto-trigger for them; use a far-future date if indefinite suppression is required).
+   - Increment `auto_resolved` by 1 per resolved pair. Increment `mutations_applied` by 2 per resolved pair — one per file edit, matching Phase C's "each file modification counts as one mutation" rule so the shared 5-per-run budget is counted consistently across phases. Append `{files:[A,B], reason: canonical_reason}` to `pending_resolved` so Phase D step 2 can remove any pre-existing MEMORY.md "Pending Review" bullet for this same contradiction (idempotent — a no-op if no matching bullet exists). **Timing:** all three (increments + `pending_resolved` append) fire ONLY after both files' `verify` calls succeed and both backups have been cleaned. A pair that fails verify and is rolled back does NOT consume the budget and is NOT appended to `pending_resolved` (the increments are not applied) — the remaining budget is preserved for subsequent pairs in the same run, and the failed pair instead lands in `pending_review` per the verify-failure bullet above.
+
+6. **Mutation limit is shared with Phase C.** The shared per-run budget is 5 mutations counted on `mutations_applied`. Before starting an auto-resolve pair, verify `mutations_applied + 2 <= 5` (a pair consumes 2).
If the remaining budget cannot fit a full pair, stop applying further auto-resolves; remaining detections go to `pending_review` using `{files:[A,B], reason: canonical_reason, detected_at:}` — the SAME `canonical_reason` so dedup correctly merges with any earlier or later finding for this contradiction. Log the deferral cause `"mutation limit reached"` in the Phase D diary Issues section; do NOT embed the cause in the bullet's `reason` field.
+
+7. **Carry accumulators into Phase D.** Pass `candidates_found`, `contradictions_detected`, `auto_resolved`, `pending_added`, `pending_removed`, `pending_review`, and `pending_resolved` to Phase D for stats, Pending Review adds, and stale-bullet removals. `pending_removed` enters Phase D at 0 and is incremented by Phase D step 2 (per actual bullet matched and removed, not per `pending_resolved` entry).
+
+**Lock refresh:** Before continuing, refresh the lock:
+```bash
+bash "${CLAUDE_SKILL_DIR}/scripts/lock.sh" refresh "${CLAUDE_PROJECT_DIR}/.consolidation.lock" ""
+```
+If refresh returns `STOLEN`, another run has reclaimed the lock — abort the pipeline immediately and output NO_REPLY.
+
 ### Phase C: Apply Changes

-**Mutation limit: 5 per run.** Each file creation or modification counts as one mutation.
+**Mutation limit: 5 per run, shared with Phase B.5.** Each file creation or modification counts as one mutation. Phase B.5 may have already consumed part of this budget — do NOT re-initialize `mutations_applied` here. If `mutations_applied >= 5` on entry to Phase C, skip Phase C mutations entirely and proceed to Phase D.
 If any mutation fails, stop further mutations immediately (stop-on-failure).

-Track: `mutations_applied = 0`, `mutations_failed = 0`.
+Carry `mutations_applied` from Phase B.5 (if Phase B.5 was skipped via the feature flag, initialize `mutations_applied = 0` here).
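The shared budget rules from Phase B.5 steps 5 and 6 and the Phase C entry check can be sketched as a small counter. This is an illustration under assumptions: the class name `MutationBudget` and its methods are hypothetical, and the skill is carried out by an agent following the prose, not by this code. The sketch shows the two properties the text insists on: a pair is admitted only if both of its edits fit, and the counter advances only after `verify` and `clean` have succeeded.

```python
from dataclasses import dataclass

PER_RUN_LIMIT = 5  # shared per-run budget stated in the spec
PAIR_COST = 2      # Phase B.5: one mutation per file in a resolved pair


@dataclass
class MutationBudget:
    """Hypothetical tracker for the counter shared by Phase B.5 and Phase C."""

    mutations_applied: int = 0

    def can_fit(self, cost: int) -> bool:
        # Phase B.5 step 6: check before starting, e.g. applied + 2 <= 5.
        return self.mutations_applied + cost <= PER_RUN_LIMIT

    def commit(self, cost: int) -> None:
        # Call only after verify succeeded and backups were cleaned;
        # a rolled-back edit never reaches this point.
        if not self.can_fit(cost):
            raise ValueError("mutation budget exceeded")
        self.mutations_applied += cost

    def exhausted(self) -> bool:
        # Phase C entry check: skip Phase C mutations when applied >= 5.
        return self.mutations_applied >= PER_RUN_LIMIT


def run_example() -> tuple[int, bool, bool]:
    budget = MutationBudget()
    budget.commit(PAIR_COST)  # first resolved pair, counter is now 2
    budget.commit(PAIR_COST)  # second resolved pair, counter is now 4
    third_pair_fits = budget.can_fit(PAIR_COST)  # 4 + 2 exceeds the limit
    single_edit_fits = budget.can_fit(1)  # Phase C can still apply one edit
    return budget.mutations_applied, third_pair_fits, single_edit_fits
```

In this scenario a third contradiction would be deferred to `pending_review`, while Phase C still has room for one single-file mutation.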
 For each approved change (confidence >= 0.9), in priority order (updates before creates):
@@ -121,17 +185,26 @@ For each approved change (confidence >= 0.9), in priority order (updates before

 2. **Apply the edit** — update existing `memory/auto/` file, create new one, or update `MEMORY.md` index.

-   For `memory/auto/` files, use this frontmatter format:
-   ```markdown
+   For `memory/auto/` files, use this base frontmatter format on every create or update. Persist `confidence` and `revisit_if` on every write. Do NOT include the optional Phase B.5 fields unless they actually apply (do not copy commented-out lines from the template into new files).
+   ```yaml
    ---
    name: topic-slug
    description: One-line description used for relevance matching in future sessions
    type: user|project|reference|feedback
+   confidence: 0.9
+   revisit_if: "Ninja decides to move"
+   # Optional, user-set only — this skill does not write `tags`, but Phase B.5
+   # reads them as a candidate-generation signal when present:
+   # tags: [editor, tooling]
    ---

    Body content here. For feedback/project types, include **Why:** and **How to apply:** sections.
    ```

+   When Phase B.5 resolves a contradiction touching this file, additionally write `resolved_at`, `resolution_basis`, and the `do_not_reopen` list (one `{partner, before}` record per resolved pair) — see Phase B.5 step 5 for the exact YAML and sanitization rules. These fields are absent from files that have never participated in a resolved contradiction.
+
+   `revisit_if` is free-text and must be a single line. Useful phrasings: a concrete user-action trigger ("Ninja switches editors"), a date ("after 2026-09-01"), or `"Never"` for facts that are stable by nature (e.g. timezone). Apply the same sanitization as `resolution_basis` (max 200 chars, no embedded newlines, no leading `#`, double-quote and escape `"` as `\"` and `\` as `\\`). If Phase B's scoring did not yield a semantic trigger, default to `"Never"`.
`confidence` mirrors the Phase B scoring rubric (1.0 / 0.9 / 0.7 / 0.5 / discarded below 0.5). The `revisit_if` field is written for human/agent inspection during interactive sessions; this skill does not read it back. + 3. **After editing MEMORY.md:** ```bash bash "${CLAUDE_SKILL_DIR}/scripts/safe-edit.sh" verify "${CLAUDE_PROJECT_DIR}/MEMORY.md" @@ -146,8 +219,8 @@ For each approved change (confidence >= 0.9), in priority order (updates before bash "${CLAUDE_SKILL_DIR}/scripts/safe-edit.sh" clean "${CLAUDE_PROJECT_DIR}/MEMORY.md" ``` -5. Increment `mutations_applied`. If `mutations_applied >= 5`, stop applying changes. - If any mutation fails, increment `mutations_failed` and stop further mutations. +5. Increment `mutations_applied` ONLY after the edit's `safe-edit.sh verify` succeeded and `clean` completed — matching Phase B.5 step 5's "fire after verify succeeds" rule so a rolled-back edit does NOT consume the shared budget. If `mutations_applied >= 5` after the increment, stop applying changes. + If `verify` fails for an edit, `rollback` and stop further mutations (stop-on-failure); the rolled-back edit's slot remains available for future runs. **Critical: Never modify CLAUDE.md, USER.md, or IDENTITY.md.** @@ -159,6 +232,17 @@ If refresh returns `STOLEN`, another run has reclaimed the lock — abort the pi ### Phase D: Report & Cleanup +Phase D substeps do NOT execute in literal numerical order. The diary (step 1) is written LAST among steps 1–3 so it can summarize the actual outcomes of steps 2 and 3 — including their issues — in a single write (the spec does NOT support amending an already-written diary section). Step 3 is also split: its compute-and-validate work runs early (to surface any issues), and the actual JSONL append runs after the diary. Execution order: + + 1. 
**Step 2** (MEMORY.md "Pending Review" edit) — applies `pending_resolved` removals first, then dedup-appends `pending_review` adds in a single MEMORY.md mutation, finalizes `pending_added`; may surface a `SUSPICIOUS_SHRINK` rollback issue. + 2. **Step 3 compute-and-validate** — compute `pending_total` and `avg_age_days` from the post-step-2 MEMORY.md, compose the JSONL line, validate it via `jq -e`. This sub-step MAY surface issues to the Issues collector: malformed bullets whose `detected_at` fails to parse (counted with age 0; logged), or a candidate JSON line that fails `jq` validation (NOT appended; logged). Do NOT perform the actual `>>` append yet — defer it to after the diary. + 3. **Step 1** (diary) — write diary using the finalized `pending_added` AND the full Issues set collected in (1) and (2). + 4. **Step 3 append** — perform the actual `printf '%s\n' "$LINE" >> memory/lint-stats.jsonl`, only if (2)'s `jq` validation passed. + 5. **Step 4** (release lock). + 6. **Step 5** (NO_REPLY). + +Two adjustments can lower `pending_added` relative to the count carried out of Phase B.5: (i) the dedup pass MAY remove items already present in MEMORY.md, and (ii) if step 2's edit fails `verify` and rolls back (excluding the documented `SUSPICIOUS_SHRINK` bypass), NO bullets were persisted and `pending_added` MUST be reset to 0. The same rollback condition also resets `pending_removed` to 0 (rollback restores the pre-edit MEMORY.md, so the removals never landed either). If a rollback occurs, the diary's "Lint" line MUST report `pending added: 0` and `pending removed: 0`, and the Issues section MUST note the rollback; step 3's stats line MUST also use the rolled-back value. Never write a diary line claiming new bullets or removed bullets when MEMORY.md was not actually modified. + 1. **Write diary entry** to `memory/diary/YYYY-MM-DD.md` (using today's date). The diary is a narrative digest — write it as if reflecting on the day's conversations. 
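As a rough illustration of the compute-and-validate sub-step and the rollback reset described above, the sketch below derives `pending_total` and `avg_age_days` from a post-edit `MEMORY.md` and zeroes the counters after a rollback. It is a minimal Python sketch under stated assumptions: the function names `pending_stats` and `finalize_counts` are hypothetical, the bullet pattern is the `detected_at=YYYY-MM-DD` format this spec defines for Pending Review bullets, and the returned issue strings stand in for whatever the Issues collector actually records.

```python
import re
from datetime import date
from typing import List, Tuple

SECTION_TITLE = "## Pending Review (Lint findings)"
BULLET_RE = re.compile(r"^- detected_at=(\d{4}-\d{2}-\d{2}) ")


def pending_stats(memory_md: str, today: date) -> Tuple[int, float, List[str]]:
    """Return (pending_total, avg_age_days, issues) for the Pending Review section."""
    in_section = False
    ages: List[int] = []
    issues: List[str] = []
    for line in memory_md.splitlines():
        if not in_section:
            # Only an exact heading match opens the section.
            in_section = line.rstrip() == SECTION_TITLE
            continue
        if line.startswith("## "):
            break  # next top-level section ends the scan
        match = BULLET_RE.match(line)
        if match is None:
            continue
        try:
            ages.append((today - date.fromisoformat(match.group(1))).days)
        except ValueError:
            # Matches the regex but is not a real date: count with age 0 and log.
            ages.append(0)
            issues.append(f"malformed detected_at: {line}")
    total = len(ages)
    avg = round(sum(ages) / total, 1) if total else 0
    return total, avg, issues


def finalize_counts(pending_added: int, pending_removed: int, rolled_back: bool) -> Tuple[int, int]:
    """After a step 2 rollback nothing was persisted, so both counters reset to 0."""
    return (0, 0) if rolled_back else (pending_added, pending_removed)
```

The diary and the stats line would then both read from the same finalized values, which keeps them from disagreeing after a rollback.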
@@ -167,22 +251,28 @@ If refresh returns `STOLEN`, another run has reclaimed the lock — abort the pi - What was learned or confirmed - What memory changes were made (and why) - Items noted for manual curation (confidence 0.5–0.9) + - Lint findings from Phase B.5: candidates considered, contradictions detected, auto-resolves applied, items deferred to Pending Review (omit this bullet when `LINT_PHASE_B5_ENABLED=false`, since the accumulators were never populated) - Any errors or partial failures encountered If a diary file for today already exists, append a new section with a timestamp header. + **Parseable prefix.** New diary sections written from now on use this header line so a recent-activity log can be grepped: `## [YYYY-MM-DD HH:MM] consolidation | `. A consumer can run `grep "^## \[" memory/diary/*.md | tail -10` to see recent consolidations at a glance. **This change is forward-only** — do not rewrite existing diary headers; only newly written sections use the parseable prefix. + Format: ```markdown # Diary — YYYY-MM-DD - ## Consolidation at HH:MM + ## [YYYY-MM-DD HH:MM] consolidation | ### Sessions Reviewed - [topic]: brief description of what was discussed - ### Memory Changes + ### Memory Changes # REQUIRED heading — do not rename; Phase B.5 step 1 parses this heading text to populate `recent_diary_mention` (renaming silently breaks the evidence leg of the auto-resolve hierarchy). - Created/Updated memory/auto/filename.md — reason + ### Lint (Phase B.5) # omit this entire block when LINT_PHASE_B5_ENABLED=false + - Candidates: N, contradictions: N, auto-resolved: N, pending added: N, pending removed: N # `pending removed` = count of stale MEMORY.md bullets cleared this run by `pending_resolved` (Phase D step 2). Forensic deferral causes (verify failure, anchor invalidation, mutation-limit) are logged in the Issues section below. 
+ ### Noted for Review - [confidence 0.7] Possible insight — context @@ -190,12 +280,30 @@ If refresh returns `STOLEN`, another run has reclaimed the lock — abort the pi - Any errors encountered during processing ``` -2. **Release consolidation lock:** +2. **Update workspace `MEMORY.md` "Pending Review" section.** Gated by `LINT_PHASE_B5_ENABLED`; skip if false. This step applies BOTH the nightly-auto-resolve removals (`pending_resolved` from Phase B.5 step 5) AND the new-flag adds (`pending_review`) in a single MEMORY.md mutation — removals first, then adds, so dedup decisions in the add pass see the post-removal state. + - **Process `pending_resolved` removals FIRST (nightly auto-resolve cleanup).** For each `{files, reason}` record in `pending_resolved`, find and remove the bullet whose triple `(file-A, file-B, reason)` matches per the removal rule below. Increment `pending_removed` by 1 ONLY when a matching bullet was actually found and removed; do NOT increment for no-op records where no matching bullet existed (e.g., the auto-resolve happened on the same run the contradiction was first detected, so no prior bullet exists to clear). Using `len(pending_resolved)` here would overcount — most first-run resolutions are no-ops. This prevents stale flags from accumulating: a contradiction previously surfaced in `MEMORY.md` and auto-resolved in a later nightly run no longer inflates `pending_total` / `avg_age_days`. Removal is idempotent. + - **Then process `pending_review` adds.** If the accumulator is non-empty, ensure `MEMORY.md` contains a section titled exactly `## Pending Review (Lint findings)`. Each unresolved item is one bullet in this strict, machine-parseable format (parser regex `^- detected_at=\d{4}-\d{2}-\d{2} `): `- detected_at=YYYY-MM-DD — file-A vs file-B — `. 
Sanitize `` the same way as `resolution_basis` (strip leading `#`, collapse newlines to `; `, truncate to 200 chars) AND additionally replace any em-dash characters (`—`, U+2014) in the reason with a hyphen-space (`- `) so the bullet's three-field structure can be split unambiguously on the literal ` — ` separator. New bullets MUST use `canonical_reason` from Phase B.5 step 4 as the `reason` field — deferral causes (verify failure, anchor invalidation, mutation-limit) are forensic and belong in the diary Issues section, NOT in the bullet. + - Before appending a new bullet, deduplicate on the triple `(file-A, file-B, reason)` (unordered file pair, exact reason after sanitization): if an existing bullet matches all three fields, do NOT append again. Two genuinely distinct contradictions between the same pair (different reasons) produce two separate bullets — do not collapse them. Update `pending_added` to count only newly written bullets. + - If `pending_review` is empty AND no prior unresolved bullets remain in the section (after removals), the section MUST be absent from `MEMORY.md` — do NOT leave an empty heading. + - **Removal rule (used by both the nightly `pending_resolved` path above AND interactive resolutions).** Remove ONLY the bullet whose triple `(file-A, file-B, reason)` matches the specific contradiction being resolved — the same triple used as the dedup key when the bullet was written. The match key is the bullet's literal `reason` field (the third dash-separated segment, sanitized as on write), NOT the frontmatter `resolution_basis` field which is a separate human-readable summary. The unordered file pair `(file-A, file-B)` must match (order-insensitive) and the sanitized `reason` text must match exactly. If the same pair has multiple unresolved bullets with different reasons, leave the non-matching bullets in place — they represent distinct unresolved contradictions. 
When the last bullet in the section is removed, the section heading itself is removed in the same edit. Match the section by its exact title `## Pending Review (Lint findings)` and remove only between that heading and the next `## ` heading or EOF — do not touch unrelated occurrences of the string. + - This edit uses the standard `safe-edit.sh backup / verify / rollback / clean` flow, with one allowance: when the edit's only effect is to add, remove, or rewrite Pending Review bullets and/or the section heading (i.e. nothing OUTSIDE the section changed), `safe-edit.sh verify` may legitimately return `SUSPICIOUS_SHRINK` (the section can be a large fraction of a small `MEMORY.md`, and a mixed run that removes a backlog of stale bullets while adding only one or two new ones can still net-shrink below the 20% threshold). The bypass covers BOTH removal-only and mixed (remove + add) edits, since Phase D step 2 applies removals and adds in a single mutation. In that case, accept the result if (a) the post-edit file still contains the `# Memory Index` heading AND (b) everything OUTSIDE the Pending Review section is byte-identical between the pre-edit (backup) and post-edit states. Concretely, define `outside_bytes(file)` as the byte-content of `file` with the Pending Review section excised — from the `## Pending Review (Lint findings)` heading line through the byte immediately preceding the next `## ` heading or EOF, inclusive of any trailing blank line that visually separates the section from what follows. If the section is absent in either state, `outside_bytes` equals the full file content for that state. The bypass holds iff `outside_bytes(backup) == outside_bytes(post-edit)` (exact byte-string equality, not ± tolerance — anything outside the section is not supposed to move). 
The runner MUST capture the backup state from `${FILEPATH}.consolidation-backup` (which is identical to the pre-edit file) and the post-edit state by re-reading `${FILEPATH}` after the write; both reads happen AFTER the write completes. Otherwise rollback as usual. Document the bypass in the diary Issues section so the audit trail is preserved. + +3. **Append a line to `memory/lint-stats.jsonl`.** Gated by `LINT_PHASE_B5_ENABLED`; skip if false. The file is created on first run if absent. Format is one strict JSON object per line, parseable by Python `json.loads` per line. Each angle-bracketed placeholder below is substituted with the actual value (`` becomes a literal integer like `7`, `` becomes a float like `4.5`): + ```json + {"date":"","candidates_found":,"contradictions_detected":,"auto_resolved":,"pending_added":,"pending_total":,"avg_age_days":} + ``` + - Before appending, validate the candidate line with `printf '%s' "$LINE" | jq -e . > /dev/null` — if validation fails, do NOT append; log the malformed line in the diary Issues section instead. + - Append the validated line with a trailing newline so the file remains parseable as one JSON object per line — e.g., `printf '%s\n' "$LINE" >> memory/lint-stats.jsonl`. Never use `printf '%s'` (no newline) when writing; that produces a single concatenated line and breaks `json.loads`-per-line. + - `pending_total` counts bullets in `MEMORY.md`'s "Pending Review (Lint findings)" section matching the regex `^- detected_at=\d{4}-\d{2}-\d{2} ` between the section heading and the next `## ` heading (or EOF), measured after this run's writes. + - `avg_age_days` is the mean age in days of all current pending bullets, computed as `mean(today − detected_at)` where `today` is the same YYYY-MM-DD used in this run's `date` field, and `detected_at` is parsed from each bullet via the regex above. Round to one decimal place. If `pending_total == 0`, write `0`. 
If any bullet's `detected_at` fails to parse, count it with age `0` and log the malformed bullet in the diary Issues section. + - Append-only — never rewrite earlier lines. + +4. **Release consolidation lock:** ```bash bash "${CLAUDE_SKILL_DIR}/scripts/lock.sh" release "${CLAUDE_PROJECT_DIR}/.consolidation.lock" "" ``` -3. **Output NO_REPLY** — this skill runs silently, never sends messages to chat. +5. **Output NO_REPLY** — this skill runs silently, never sends messages to chat. ## Error Handling diff --git a/.github/workflows/pii-scan.yml b/.github/workflows/pii-scan.yml index 2a28f45..b7b3c33 100644 --- a/.github/workflows/pii-scan.yml +++ b/.github/workflows/pii-scan.yml @@ -6,6 +6,15 @@ on: push: branches: [main] +# gitleaks-action@v2 lists PR commits to determine scan scope. In private +# repos, the default GITHUB_TOKEN does not grant pull-requests:read, so the +# action returns 403 "Resource not accessible by integration" and fails +# before scanning. Public repos worked under the more permissive default — +# private consumers of this reusable workflow need the explicit grant. +permissions: + contents: read + pull-requests: read + jobs: gitleaks: uses: ./.github/workflows/gitleaks-reusable.yml diff --git a/docs/plans/completed/2026-05-17-memory-lint-trial.md b/docs/plans/completed/2026-05-17-memory-lint-trial.md new file mode 100644 index 0000000..a84f6d6 --- /dev/null +++ b/docs/plans/completed/2026-05-17-memory-lint-trial.md @@ -0,0 +1,157 @@ +# Memory-Lint Phase B.5 Trial — Provenance Frontmatter + Contradiction Detection + Surfacing Rule + +## Goal + +30-day trial (2026-05-17 → 2026-06-17) of automated cross-file contradiction detection in memory, with proactive in-conversation surfacing. Adds to existing `memory-consolidation` skill: new Phase B.5 (lint pass), expanded Phase C (frontmatter persistence), expanded Phase D (Pending Review section in workspace MEMORY.md + stats file). 
Adds platform rule requiring the agent to surface pending items in conversation. + +Feature-flagged for instant rollback. Anti-loop fields prevent re-triggering same contradiction nightly. Auto-resolve uses `evidence > confidence` hierarchy (codex-recommended; a "recency" leg was dropped during review because `resolved_at` reflects unrelated prior resolutions and is not a valid freshness proxy for the current claim). + +References upstream: ADR-069, beads workspace-txyu, [Karpathy LLM Wiki gist](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f) (abstract — no algorithm). Codex provided concrete algorithm. + +## Validation Commands + +```bash +grep -q 'LINT_PHASE_B5_ENABLED' .claude/skills/memory-consolidation/SKILL.md && \ +grep -q '### Phase B.5' .claude/skills/memory-consolidation/SKILL.md && \ +grep -q 'evidence > confidence' .claude/skills/memory-consolidation/SKILL.md && \ +grep -q 'resolved_at' .claude/skills/memory-consolidation/SKILL.md && \ +grep -qE 'do_not_reopen.{1,5}is a \*\*list of records\*\*' .claude/skills/memory-consolidation/SKILL.md && \ +grep -q '## Surfacing pending lint items' .claude/rules/platform/memory-protocol.md && \ +echo "All checks passed" +``` + +## Reference: Current `memory-consolidation` skill + +File: `.claude/skills/memory-consolidation/SKILL.md`. + +Existing phases: +- Phase 0: Validate (locks, dirs) +- Phase A: Gather sessions (last 48h) +- Phase B: Diff & Score — confidence scoring (1.0/0.9/0.7/0.5), supersession check on new vs existing +- Phase C: Apply changes (mutation limit 5/run, safe-edit with rollback) +- Phase D: Diary digest + release lock + +Existing contradiction handling: Phase B step 1 — when ingesting new info, LLM checks if it contradicts existing memory, newer-info supersedes. NO cross-file scan between existing files. 
+ +Existing memory file frontmatter: +```yaml +--- +name: +description: +type: user|project|reference|feedback +--- +``` + +## Reference: Codex algorithm for contradiction detection + +**Cheap candidate generation FIRST** (don't blindly LLM-judge all pairs): +- Same `type` filter +- Overlapping entities (filename tokens, title words, frontmatter tags) +- Normalized predicate phrases: `prefers`, `uses`, `hates`, `requires`, `do not` +- Negation/opposition markers: `not`, `never`, `avoid`, `instead`, changed values + +Only candidate bundles → LLM judgment. With 40 files, false positives are the cost concern, not compute. + +**Auto-resolve hierarchy:** +1. Direct diary/session evidence beats inferred +2. Else higher confidence wins if `Δ confidence >= 0.2` +3. Else flag — do not edit + +(An earlier draft included a "newer evidence-date wins" leg; it was dropped during review — see `evidence > confidence` note above.) + +**Anti-loop fields** (added per memory file when resolved): +- `resolved_at: ` — scalar, last resolution touching this file +- `resolution_basis: ""` — scalar +- `do_not_reopen:` — accumulating list of `{partner, before}` records, one per resolved pair (per-partner cooldown, so resolving A↔C never extends A↔B's window). `before` is a YYYY-MM-DD date; default `today + 90 days`. + +**Time-scoped changes are NOT contradictions** ("used X then, uses Y now" is evolution). + +## Tasks + +### Task 1: Add feature flag, Phase B.5 lint, frontmatter persistence, and stats file to `memory-consolidation/SKILL.md` + +The skill must gain a feature flag at top, a new Phase B.5 (after Phase B, before Phase C) that scans existing memory files for cross-file contradictions, an updated Phase C that persists `confidence` and `revisit_if` in frontmatter, and an updated Phase D that writes a "Pending Review" section to workspace `MEMORY.md` and appends a structured line to `memory/lint-stats.jsonl`. Diary entries gain a parseable prefix. 
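The auto-resolve hierarchy and the per-partner cooldown record from the Codex reference above can be sketched as follows. This is a hedged Python illustration, not the skill's implementation: `Claim`, `auto_resolve`, and `cooldown_record` are hypothetical names, and treating two claims that both have direct evidence as a tie that falls through to the confidence leg is an assumption the reference does not spell out.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, Optional


@dataclass
class Claim:
    file: str
    confidence: float
    has_direct_evidence: bool  # backed by diary/session evidence rather than inferred


def auto_resolve(a: Claim, b: Claim, min_delta: float = 0.2) -> Optional[Claim]:
    """Return the winning claim, or None when the pair must be flagged for review."""
    # Leg 1: direct diary/session evidence beats inferred.
    if a.has_direct_evidence != b.has_direct_evidence:
        return a if a.has_direct_evidence else b
    # Leg 2: higher confidence wins if the gap is at least min_delta.
    # Rounding guards against float noise (0.7 - 0.5 is slightly below 0.2).
    delta = round(a.confidence - b.confidence, 6)
    if abs(delta) >= min_delta:
        return a if delta > 0 else b
    # Leg 3: otherwise flag, do not edit.
    return None


def cooldown_record(partner: str, today: date, days: int = 90) -> Dict[str, str]:
    """Build one `{partner, before}` entry for a file's `do_not_reopen` list."""
    return {"partner": partner, "before": (today + timedelta(days=days)).isoformat()}
```

A `None` result corresponds to the "flag, do not edit" branch, so the caller would add the pair to the pending-review accumulator instead of touching either file.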
+ +What we want: + +- **Feature flag** at the top of SKILL.md (before "Context"): + ```markdown + ## Feature Flags + + - `LINT_PHASE_B5_ENABLED=true` — Trial: 2026-05-17 → 2026-06-17. When false, skip Phase B.5 entirely. Rollback = flip to false (no data migration). See ADR-069. + ``` + When false, the skill executes Phases 0/A/B/C/D as before, no lint, no Pending Review writes, no stats file appends. + +- **Phase B.5 inserted between Phase B and Phase C.** Steps: + 1. Iterate `memory/auto/*.md` and build a lightweight in-memory representation: `{file, type, name, tags, title_tokens, claim_phrases}` extracted from frontmatter and body. Claim extraction uses bullet/paragraph splits. + 2. Candidate generation: for each pair of files, only proceed if at least two of these match — same `type` field, overlapping `title_tokens`, overlapping `tags`, or matching normalized predicate ("prefers", "uses", "hates", "requires", "do not"). Per-pair exclusion: skip a pair if either file's `do_not_reopen` list has a record whose `partner` matches the other file's bare name AND whose `before` date is later than today (per-pair, not global). + 3. For each candidate pair, ask the LLM (in-skill prompt) one question: "Do these two claims contradict, or is one time-scoped evolution of the other?" Return: `contradiction` | `evolution` | `unrelated`. Only `contradiction` proceeds. + 4. For each detected contradiction, attempt auto-resolve using hierarchy: (a) direct diary/session evidence in last 48h wins over inferred; (b) higher confidence wins if delta >= 0.2; (c) otherwise flag for review. + 5. Auto-resolved: edit the losing file to either remove the contradicting claim or mark it superseded. **Never silent-delete** — always replace with a `(superseded: ...)` annotation. Add `resolved_at`, `resolution_basis` (scalars) and append/merge a `{partner, before}` record in the `do_not_reopen` list of BOTH files' frontmatter (anti-loop, per-pair cooldown). + 6. 
Flagged unresolved: add an entry to a `pending_review` accumulator (used in Phase D). + 7. Respect mutation limit from Phase C (5 per run total across B.5 and C combined). + +- **Phase C** must now persist `confidence` (existing 0.0-1.0 float from Phase B scoring) and `revisit_if` (semantic trigger string, free-text — see ADR-style examples) in the frontmatter when creating or updating files. Update the frontmatter format documentation block in SKILL.md accordingly: + ```yaml + --- + name: topic-slug + description: One-line description + type: user|project|reference|feedback + confidence: 0.9 # 0.0-1.0, matches Phase B scoring + revisit_if: "Ninja decides to move" # semantic trigger, like ADR Revisit-if; "Never" valid + # Optional, added when resolved by Phase B.5: + # resolved_at: 2026-05-18 + # resolution_basis: "diary 2026-05-15 §3 explicit user statement" + # do_not_reopen: # per-pair cooldown list + # - partner: + # before: 2026-08-18 # YYYY-MM-DD; default = today + 90 days + --- + ``` + +- **Phase D** must: + 1. Update workspace `MEMORY.md` with a `## Pending Review (Lint findings)` section listing unresolved items, one bullet per item with file references and reason. If accumulator is empty, the section must be ABSENT from MEMORY.md (do not leave an empty heading). When resolving an item, the corresponding bullet is removed; when the last bullet is removed, the section itself is removed. + 2. Append one structured JSON line to `memory/lint-stats.jsonl`: + ```json + {"date":"YYYY-MM-DD","candidates_found":N,"contradictions_detected":N,"auto_resolved":N,"pending_added":N,"pending_total":N,"avg_age_days":N} + ``` + 3. Diary entry format gains a parseable prefix: `## [YYYY-MM-DD HH:MM] consolidation | `. This allows `grep "^## \[" memory/diary/*.md | tail -10` to retrieve a recent-activity log. Apply forward-only — do not rewrite existing diary entries. + 4. 
Continue to follow "Silent operation: never send messages to any chat" (unchanged — no push notifications). + +- **`memory/lint-stats.jsonl` creation**: file is created on first run if absent; subsequent runs append. Format is strict JSON-per-line, parseable by Python's `json.loads` per line. + +- All existing safety mechanisms preserved: lock checks, mutation limit, safe-edit with rollback, never modify CLAUDE.md/USER.md/IDENTITY.md. + +- [x] `.claude/skills/memory-consolidation/SKILL.md` contains `LINT_PHASE_B5_ENABLED` feature flag at the top +- [x] SKILL.md contains a `### Phase B.5` section between Phase B and Phase C +- [x] Phase B.5 documents candidate generation, LLM judgment, auto-resolve hierarchy (`evidence > confidence`), and anti-loop fields +- [x] Phase B.5 explicitly excludes time-scoped changes from being treated as contradictions +- [x] Phase B.5 documents "never silent-delete" — losing claim is replaced with `(superseded: ...)` annotation +- [x] Phase C frontmatter format documented in SKILL.md now includes `confidence` and `revisit_if` fields (with `resolved_at`, `resolution_basis`, and the `do_not_reopen` per-pair list as optional) +- [x] Phase D documents the workspace `MEMORY.md` "Pending Review" section format and its add/remove rules +- [x] Phase D documents `memory/lint-stats.jsonl` format with one JSON line per run +- [x] Phase D diary format documented to use `## [YYYY-MM-DD HH:MM] consolidation | ...` parseable prefix +- [x] Phase D notes the parseable-prefix change is forward-only (existing diary entries unchanged) +- [x] "Silent operation" line in SKILL.md is unchanged (no push notifications added) +- [x] Existing safety lines preserved: lock-check, mutation-limit, safe-edit, "Never modify CLAUDE.md, USER.md, or IDENTITY.md" + +### Task 2: Add "Surfacing pending lint items" section to `.claude/rules/platform/memory-protocol.md` + +The platform memory-protocol rule must gain a new section requiring the agent to proactively surface 
pending lint items during conversation. Without this rule, the agent has no behavioral reason to mention them — and the user has stated explicitly they will not proactively ask. + +What we want: + +- New section `## Surfacing pending lint items` added to `.claude/rules/platform/memory-protocol.md`, placed coherently within existing structure (after "Auto-load mechanism" if present, otherwise after the initial storage-locations section). +- Section explains: when workspace `MEMORY.md` contains a `## Pending Review (Lint findings)` section, the agent MUST proactively surface items in the current conversation. +- Section gives the surfacing strategy in concrete rules: + - **Preferred trigger**: when the conversation's topic relates to a pending item, bring it up inline as part of the relevant answer ("Кстати, есть unresolved contradiction про X — ..."). Topic-related = the agent's natural assessment, not a strict keyword match. + - **Aged escalation**: if a pending item is older than 14 days AND no topic-relevant opportunity has arisen, surface it at a natural pause or task end. + - **One per session max**: never dump multiple items in one message. Pick the most relevant or oldest. + - **Never interrupt urgency**: if the user is mid-urgent-task, do not derail — wait for natural break. + - **After resolution**: update the contradicting memory file(s) with the resolved value, then remove the bullet from the MEMORY.md "Pending Review" section in the same operation. Add `resolved_at` / `resolution_basis` and append/merge a `{partner, before}` record into the file's `do_not_reopen` list per the consolidation skill's pattern. +- Section references ADR-069 and beads `workspace-txyu` for trial context. 
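One way to read the surfacing rules above as a decision procedure is sketched below. This is a minimal Python sketch under stated assumptions: `PendingItem`, its `relevance` score, and `pick_item_to_surface` are hypothetical, topic-relatedness is modeled as a number the agent assigns from its own judgment, and items that are neither topic-related nor older than 14 days are assumed to stay silent for now.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

AGED_AFTER_DAYS = 14


@dataclass
class PendingItem:
    summary: str
    detected_at: date
    relevance: float = 0.0  # hypothetical score; 0 means unrelated to the current topic


def pick_item_to_surface(
    items: List[PendingItem],
    today: date,
    urgent: bool,
    surfaced_this_session: bool,
    at_natural_pause: bool,
) -> Optional[PendingItem]:
    """Return at most one item to mention, or None to stay silent."""
    # Never interrupt urgent work, and never surface more than one item per session.
    if urgent or surfaced_this_session or not items:
        return None
    # Preferred trigger: the most topic-relevant item, raised inline.
    related = [item for item in items if item.relevance > 0]
    if related:
        return max(related, key=lambda item: item.relevance)
    # Aged escalation: the oldest item past 14 days, only at a natural pause or task end.
    if at_natural_pause:
        aged = [item for item in items if (today - item.detected_at).days > AGED_AFTER_DAYS]
        if aged:
            return min(aged, key=lambda item: item.detected_at)
    return None
```

The caller would set `surfaced_this_session` after the first mention so later turns in the same session return `None`.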
+ +- [x] `.claude/rules/platform/memory-protocol.md` contains a section titled `## Surfacing pending lint items` +- [x] Section explicitly states "MUST" requirement to proactively surface +- [x] Section lists at least: preferred trigger (topic-related), aged escalation (>14 days), one-per-session max, no-interrupt-urgency, after-resolution actions +- [x] Section references ADR-069 for context +- [x] No existing content in `memory-protocol.md` is removed or contradicted (additive change)