release(v0.24.2): PA prompt refinements — lock-step digest trigger + mtime self-check#48

Merged
fstamatelopoulos merged 1 commit into main from iteration-7/pa-prompt-refinements-v2 on May 12, 2026

Conversation

@fstamatelopoulos
Owner

Summary

  • Refinement A: digest write trigger collapses from a 4-clause "ANY of these" rule into one lock-step rule — every substantive Problem Pack edit gets a Decisions bullet in the same turn, same hard discipline as the per-turn session-log rule. Preferences / supersession / rejection demoted to "additional triggers."
  • Refinement B: turn-start self-check names stat / mtime comparison on workspace-summary.md + session-<id>.md explicitly, replacing the v0.24.1 "scan session log entries" framing (which had assumed a stable header format that doesn't exist — the session log shape is agent-controlled).
  • Harness-side F.32 (cfcf doctor pa-digest-staleness) dropped as a scope item. The agent self-check is an active write barrier with full context; a harness check would be a passive observer that warns on workspaces stopped weeks ago. The agent check obviates the harness check.

Round 2 of the gmbot PA dogfood feedback. Strictly an extension of v0.24.1's PA-discipline arc — no code-surface changes outside the prompt template + its tests.

Why two of the four PA suggestions aren't in this PR

PA's refined list had 4 items; #1 (digest-first ordering) and #3 (supersession pattern) were already shipped exactly as described in v0.24.1. Only #2 (sharpened trigger) and #4 (concrete self-check) had subtle refinements worth applying — that's this patch.

Test plan

  • bun test packages/core/src/product-architect/prompt-assembler.test.ts — 28 pass (26 existing + 2 new)
  • bun run test — full suite green (1016 across packages)
  • bun run typecheck — clean
  • Manual smoke: run cfcf spec on a fresh workspace, verify the new prompt sections render correctly + the agent picks up the lock-step framing
  • After merge: dogfood on the gmbot repo for at least one PA session, observe whether the digest-write rate matches the lock-step expectation

Versioning rationale

v0.24.2 (patch). Prompt-template-only change with no code-surface impact — same shape as the v0.24.1 PA-discipline commit. No new config keys, no new endpoints, no new code paths. Eligible for patch tag.

🤖 Generated with Claude Code

…mtime self-check

Round-2 dogfood feedback from gmbot PA. Two of the four refinements
matched what v0.24.1 already shipped (digest-first ordering,
supersession pattern); the other two had subtle but real
refinements applied here.

**Refinement A — digest trigger becomes lock-step with the session
log per-turn rule.**

v0.24.1 had a four-clause testable trigger (Problem Pack edit /
preference / supersession / rejection). It worked, but the agent
still had to judge "did any of these happen this turn?" It is now
replaced with one lock-step rule that mirrors the existing
session-log discipline:

> After every substantive edit to a Problem Pack file, append a
> Decisions bullet to the digest in the same turn — same hard
> discipline as the per-turn session-log rule.

Preferences / supersession / rejection remain, but as additional
triggers under the lock-step rule, not the load-bearing rule.
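
The lock-step rule reduces the trigger to one observable condition per turn, which can be sketched as a predicate. This is an illustration only, not code from the prompt template or the harness: the `TurnRecord` shape, its field names, and the example file path are assumptions. It also assumes that one Decisions bullet per turn with edits is enough to satisfy the rule.

```typescript
// Hypothetical record of one agent turn; the field names are illustrative.
interface TurnRecord {
  // Problem Pack files that received a substantive edit this turn.
  problemPackEdits: string[];
  // Decisions bullets appended to the digest during the same turn.
  decisionBulletsAppended: number;
}

// True when a turn edited the Problem Pack but left the digest untouched.
function violatesLockStep(turn: TurnRecord): boolean {
  return turn.problemPackEdits.length > 0 && turn.decisionBulletsAppended === 0;
}

const compliant: TurnRecord = {
  problemPackEdits: ["pack/example.md"], // made-up path
  decisionBulletsAppended: 1,
};
const drifting: TurnRecord = {
  problemPackEdits: ["pack/example.md"],
  decisionBulletsAppended: 0,
};
const chatOnly: TurnRecord = { problemPackEdits: [], decisionBulletsAppended: 0 };

console.log(violatesLockStep(compliant)); // false
console.log(violatesLockStep(drifting)); // true
console.log(violatesLockStep(chatOnly)); // false
```

Unlike the four-clause version, the predicate has no "did any of these happen?" branch: an edit either has a matching bullet in the same turn or it does not.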

**Refinement B — self-check ritual names mtime comparison
explicitly.**

v0.24.1 said "scan session log for entries since last digest
update; flush if 2+." That implicitly required parsing the session
log — but the header format is agent-controlled, so there's no
guaranteed pattern to count. Rewrote as a concrete file-mtime
comparison the agent can execute via shell:

>  1. stat workspace-summary.md
>  2. stat session-<id>.md
>  3. If session log is newer AND gap >= 2 turns, catch up.
>
> This is a file-mtime comparison, not a vibe check.
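
The comparison in the steps above could be sketched in TypeScript roughly as follows. This is an illustration only; the prompt asks the agent to run `stat` from a shell, and nothing here is taken from the cfcf codebase. The function name `digestNeedsCatchUp`, the `turnsSinceDigest` parameter, and the temp-file demo are assumptions. File mtimes alone cannot count turns, so the sketch takes the turn gap as a value the agent supplies.

```typescript
import { mkdtempSync, statSync, utimesSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: true when the session log was modified after the
// digest AND at least `minGap` turns have passed since the last digest write.
function digestNeedsCatchUp(
  digestPath: string,
  sessionLogPath: string,
  turnsSinceDigest: number,
  minGap: number = 2
): boolean {
  const digestMtimeMs = statSync(digestPath).mtimeMs;
  const sessionMtimeMs = statSync(sessionLogPath).mtimeMs;
  return sessionMtimeMs > digestMtimeMs && turnsSinceDigest >= minGap;
}

// Demo with throwaway files; "session-demo.md" stands in for session-<id>.md.
const demoDir = mkdtempSync(join(tmpdir(), "pa-digest-"));
const demoDigest = join(demoDir, "workspace-summary.md");
const demoSession = join(demoDir, "session-demo.md");
writeFileSync(demoDigest, "## Decisions\n");
writeFileSync(demoSession, "turn 1\nturn 2\n");

// Force the session log to be five minutes newer than the digest.
const digestTime = new Date("2026-05-12T10:00:00Z");
const sessionTime = new Date("2026-05-12T10:05:00Z");
utimesSync(demoDigest, digestTime, digestTime);
utimesSync(demoSession, sessionTime, sessionTime);

console.log(digestNeedsCatchUp(demoDigest, demoSession, 2)); // true
console.log(digestNeedsCatchUp(demoDigest, demoSession, 1)); // false
```

Both conditions must hold, so a session log that is newer by a single turn does not trigger a catch-up.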

**Harness-side F.32 dropped.**

The same mtime heuristic was originally scoped as a cfcf-side
diagnostic (cfcf doctor + dashboard chip). Dropped: a harness check
is a passive observer (it would warn on workspaces stopped weeks
ago and can't distinguish mid-decision drift), while the agent
self-check is an active write barrier with full context. The agent
check obviates the harness check.

Test coverage: 2 new tests in prompt-assembler.test.ts pin the
lock-step phrasing + the mtime/stat literals. All 28 tests pass.
Full suite (1016 tests across packages) green; typecheck clean.
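
Pinning phrasing in a prompt usually comes down to substring assertions on the assembled text. The sketch below shows that idea in plain TypeScript; it is not the contents of prompt-assembler.test.ts. The helper `missingLiterals`, the stand-in prompt fragment, and the choice of pinned strings are all assumptions.

```typescript
// Hypothetical helper: returns the required literals that the prompt lacks.
function missingLiterals(prompt: string, required: string[]): string[] {
  return required.filter((literal) => !prompt.includes(literal));
}

// Stand-in prompt fragment assembled from the wording quoted in this commit.
const samplePrompt = [
  "After every substantive edit to a Problem Pack file, append a",
  "Decisions bullet to the digest in the same turn.",
  "1. stat workspace-summary.md",
  "2. stat session-<id>.md",
  "This is a file-mtime comparison, not a vibe check.",
].join("\n");

// Illustrative pins: one for the lock-step wording, two for the self-check.
const required = [
  "in the same turn",
  "stat workspace-summary.md",
  "file-mtime comparison",
];

console.log(missingLiterals(samplePrompt, required)); // []
```

A check of this kind fails as soon as a later edit to the template rewords or drops one of the pinned phrases.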

No code-surface changes outside the prompt template + tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fstamatelopoulos merged commit 737519d into main on May 12, 2026
2 checks passed
fstamatelopoulos deleted the iteration-7/pa-prompt-refinements-v2 branch on May 12, 2026 23:44
fstamatelopoulos added a commit that referenced this pull request on May 12, 2026
Post-merge polish. Matches the style of earlier entries (PR #23,
PR #26) that surface the merged PR # at the top of the section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>