release(v0.24.2): PA prompt refinements — lock-step digest trigger + mtime self-check #48
Merged
Round-2 dogfood feedback from gmbot PA. Two of the four refinements matched what v0.24.1 already shipped (digest-first ordering, supersession pattern); the other two had subtle but real refinements applied here.

**Refinement A: digest trigger becomes lock-step with the session log per-turn rule.** v0.24.1 had a four-clause testable trigger (Problem Pack edit / preference / supersession / rejection). It worked, but still required the agent to judge "did any of these happen this turn?" Replaced with one lock-step rule that mirrors the existing session-log discipline:

> After every substantive edit to a Problem Pack file, append a
> Decisions bullet to the digest in the same turn — same hard
> discipline as the per-turn session-log rule.

Preferences / supersession / rejection remain, but as additional triggers under the lock-step rule, not the load-bearing rule.

**Refinement B: self-check ritual names mtime comparison explicitly.** v0.24.1 said "scan session log for entries since last digest update; flush if 2+." That implicitly required parsing the session log, but the header format is agent-controlled, so there's no guaranteed pattern to count. Rewrote as a concrete file-mtime comparison the agent can execute via shell:

> 1. stat workspace-summary.md
> 2. stat session-<id>.md
> 3. If session log is newer AND gap >= 2 turns, catch up.
>
> This is a file-mtime comparison, not a vibe check.

**Harness-side F.32 dropped.** The same mtime heuristic was originally scoped as a cfcf-side diagnostic (cfcf doctor + dashboard chip). Killed: a harness check is a passive observer (it would warn on workspaces stopped weeks ago and can't distinguish mid-decision drift); the agent self-check is an active write barrier with full context. The agent check obviates the harness check.

Test coverage: 2 new tests in prompt-assembler.test.ts pin the lock-step phrasing + the mtime/stat literals. All 28 tests pass. Full suite (1016 tests across packages) green; typecheck clean.

No code-surface changes outside the prompt template + tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- … `stat` / mtime comparison on `workspace-summary.md` + `session-<id>.md` explicitly, replacing the v0.24.1 "scan session log entries" framing (which had assumed a stable header format that doesn't exist; the session log shape is agent-controlled).
- … `pa-digest-staleness`) dropped as a scope item. The agent self-check is an active write barrier with full context; a harness check would be a passive observer that warns on workspaces stopped weeks ago. The agent check obviates the harness check.

Round 2 of the gmbot PA dogfood feedback. Strictly an extension of v0.24.1's PA-discipline arc: no code-surface changes outside the prompt template + its tests.
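For illustration only, the three-step self-check (stat the digest, stat the session log, catch up if the log is newer and the gap is at least 2 turns) could be sketched in TypeScript roughly as below. This PR ships prompt text, not code, so none of this is in the change itself: the function names `digestNeedsCatchUp` and `checkDigestStaleness` are hypothetical, and because a file mtime cannot count turns, the sketch assumes the caller already knows how many turns have passed since the last digest update.

```typescript
import { statSync } from "node:fs";

// Hypothetical pure decision rule mirroring step 3 of the self-check:
// catch up only if the session log is newer than the digest AND at least
// `minGapTurns` turns have passed since the digest was last updated.
function digestNeedsCatchUp(
  digestMtimeMs: number,
  sessionMtimeMs: number,
  turnsSinceDigestUpdate: number,
  minGapTurns: number = 2,
): boolean {
  const sessionIsNewer = sessionMtimeMs > digestMtimeMs;
  return sessionIsNewer && turnsSinceDigestUpdate >= minGapTurns;
}

// Hypothetical wrapper covering steps 1 and 2: stat both files and compare.
// Paths and the turn count are supplied by the caller (assumption: the
// agent tracks the turn count itself; mtime alone cannot provide it).
function checkDigestStaleness(
  digestPath: string,
  sessionLogPath: string,
  turnsSinceDigestUpdate: number,
): boolean {
  const digestMtimeMs = statSync(digestPath).mtimeMs;
  const sessionMtimeMs = statSync(sessionLogPath).mtimeMs;
  return digestNeedsCatchUp(digestMtimeMs, sessionMtimeMs, turnsSinceDigestUpdate);
}
```

Keeping the comparison in a pure function separates the filesystem reads from the rule, so the rule can be checked without touching real workspace files.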
Why two of the four PA suggestions aren't in this PR
PA's refined list had 4 items; #1 (digest-first ordering) and #3 (supersession pattern) were already shipped exactly as described in v0.24.1. Only #2 (sharpened trigger) and #4 (concrete self-check) had subtle refinements worth applying — that's this patch.
Test plan
- `bun test packages/core/src/product-architect/prompt-assembler.test.ts`: 28 pass (26 existing + 2 new)
- `bun run test`: full suite green (1016 across packages)
- `bun run typecheck`: clean
- `cfcf spec` on a fresh workspace, verify the new prompt sections render correctly + the agent picks up the lock-step framing

Versioning rationale
v0.24.2 (patch). Prompt-template-only change with no code-surface impact — same shape as the v0.24.1 PA-discipline commit. No new config keys, no new endpoints, no new code paths. Eligible for patch tag.
🤖 Generated with Claude Code