fix(phase 6): require Learn smoke + walk-Learn-first Deliver smoke for two-app opps#354

Merged: jjackson merged 1 commit into main from emdash/phase6-learn-deliver-gate on May 18, 2026

Conversation

@jjackson
Owner

Summary

For multi-stage two-app opps, Phase 6 produced recipe-vs-app-state mismatches. Connect's UI gates the Deliver app behind Learn-assessment completion, yet Phase 2 emitted 0 Learn journeys, Phase 3 wrote `smoke_journeys_per_app: {learn: 0, deliver: 1}`, and Phase 6 ran the Deliver smoke anyway. That smoke physically cannot reach Deliver from the post-claim handoff.

Three-skill structural fix + new static palette + learning doc.

Evidence

malaria-itn-app run `20260517-1829` Phase 6 verdict:

  • `structural_progress_fraction: 0.83` (9 valid screenshots, all Learn-app surfaces)
  • `recipe_failure_step: "Tap on 'V1 Long Visit'"` — the device was sitting on Learn's `actionBar="Malaria ITN SBC Training (Learn)"` with Learn modules visible. The Deliver V1 form was never reachable from that surface.

Connect's opp-detail screen makes the contract explicit: "Once you have completed the learning assessment, you will transition to delivery."

Changes

  • `skills/pdd-to-app-journeys/SKILL.md` — new coverage rule 4: every PDD with a Learn app MUST emit a `training-completion-smoke` journey (`app: learn`, `is_smoke: true`).
  • `skills/app-test-cases/SKILL.md` — Step 2 codifies the two-app coverage invariant (`learn=1 AND deliver=1`; halt at Phase 2 if not). Documents the faithful Deliver-smoke composition: walk all Learn modules to completion + sync + chain `deliver-launch.yaml` to reach Deliver. Step 5 enforces pre-write.
  • `skills/app-screenshot-capture/SKILL.md` — Step 2's pre-flight table was already correct (`halt on learn count != 1 OR deliver count != 1`); strengthened commentary with the malaria incident and an explicit anti-pattern callout.
  • `mcp/mobile/recipes/static/deliver-launch.yaml` (new) — drives atlas §§ 8/9/10 transitions: certificate (`Congratulations` text-anchor) → Download Delivery gate (`DOWNLOAD` text-anchor) → Deliver-mode `StandardHomeActivity` (structural anchor on `id/viewJobCard`).
  • `docs/learnings/2026-05-18-connect-gates-deliver-on-learn-completion.md` (new) — full invariant + cross-references to the atlas + outstanding resource-ID dump work.
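
The two-app coverage invariant can be sketched as a small pre-flight check. This is a minimal illustration, not the skills' actual implementation: only the `smoke_journeys_per_app` key and its `learn` / `deliver` counts come from this PR, while the `SmokeCounts` type and the `checkTwoAppCoverage` function are hypothetical names.

```typescript
// Hypothetical shape of the per-app smoke counts. Only the `learn` and
// `deliver` keys (from `smoke_journeys_per_app`) appear in the PR text.
interface SmokeCounts {
  learn: number;
  deliver: number;
}

// Returns a list of violations; an empty list means the pre-flight passes.
// Mirrors the rule "halt on learn count != 1 OR deliver count != 1".
function checkTwoAppCoverage(counts: SmokeCounts): string[] {
  const violations: string[] = [];
  if (counts.learn !== 1) {
    violations.push(`learn smoke count is ${counts.learn}, expected 1`);
  }
  if (counts.deliver !== 1) {
    violations.push(`deliver smoke count is ${counts.deliver}, expected 1`);
  }
  return violations;
}

// The counts written before this fix: one violation, so the run should halt.
const preFixViolations = checkTwoAppCoverage({ learn: 0, deliver: 1 });

// The counts expected after this fix: no violations.
const postFixViolations = checkTwoAppCoverage({ learn: 1, deliver: 1 });
```

Returning a list of violations rather than a boolean lets a caller report both miscounts in a single halt message.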

Outstanding

  • Atlas §§ 8/9 resource-IDs are still TBD (palette uses text anchors + coordinate fallbacks from the 2026-05-14 turmeric session). A future Phase 6 run mid-window between Learn-pass and Deliver-download should `ui_dump` these surfaces.
  • "Walk all Learn modules to completion" composition in Phase 3 is described in prose, not codified. For 6+ module opps the recipe gets long; a `walk-learn-to-completion` helper palette is worth doing after at least one passing multi-stage smoke characterizes the composition shape.

Test plan

  • Skill markdown only (+ one new static recipe + learning doc) — no code paths touched
  • Version bump via `scripts/version-bump.sh` (0.13.278)
  • Next `/ace:run` on a multi-stage opp should produce Phase 2 = 10 journeys (9 Deliver + 1 Learn smoke), Phase 3 = `smoke_journeys_per_app: {learn: 1, deliver: 1}`, and Phase 6 captures both apps' screenshots

🤖 Generated with Claude Code

…r two-app opps

Connect's mobile UI only surfaces the Deliver app after the FLW completes
the Learn-app modules + final assessment AND that assessment-pass syncs
to Connect. There is no UI affordance to jump from claim-opp into Deliver
without walking Learn to completion first ("Once you have completed the
learning assessment, you will transition to delivery"). Phase 6 was
producing recipe-vs-app-state mismatches for multi-stage two-app opps
because Phase 2 emitted 0 Learn journeys, Phase 3 wrote `learn: 0`
smoke counts, and Phase 6 ran the Deliver smoke anyway — which physically
cannot reach Deliver from the claim handoff.

Three-skill structural fix + supporting palette + learning doc:

- skills/pdd-to-app-journeys: new coverage rule 4 — every PDD with a
  Learn app MUST emit `training-completion-smoke` (app: learn,
  is_smoke: true). Deeper Learn journeys move to /ace:qa-deep.
- skills/app-test-cases: Step 2 codifies the two-app coverage invariant
  (learn=1 AND deliver=1, halt at Phase 2 otherwise) + documents the
  faithful Deliver-smoke composition (walk all Learn modules to
  completion → sync → tap Resume → certificate → Download Delivery
  gate → Deliver mode). Step 5 enforces the invariant pre-write.
  Step 3's entry-point template references the new deliver-launch
  palette and updates the Learn-vs-Deliver guidance.
- skills/app-screenshot-capture: Step 2's pre-flight table was already
  correct (halt on `learn count != 1 OR deliver count != 1`), but agent
  discipline drifted past it. Strengthened the commentary with the
  malaria-itn-app incident and an explicit "don't rationalize past
  this" anti-pattern callout.
- mcp/mobile/recipes/static/deliver-launch.yaml: new static palette
  driving atlas §§ 8/9/10 transitions (certificate → Download Delivery
  gate → Deliver mode StandardHomeActivity). Anchors on text labels at
  §§ 8/9 (resource-IDs TBD) + structural `viewJobCard` at § 10
  (verified per atlas).
- docs/learnings/2026-05-18-connect-gates-deliver-on-learn-completion.md:
  captures the invariant + cross-references the atlas + outstanding
  resource-ID dump work.

Surfaced on malaria-itn-app run 20260517-1829 Phase 6 verdict
(structural_progress_fraction 0.83, 9 valid Learn-app screenshots,
recipe halt at `tapOn "V1 Long Visit"` because the device was sitting
on Learn's `actionBar="Malaria ITN SBC Training (Learn)"` — the
Deliver V1 form was never reachable from that surface).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jjackson jjackson enabled auto-merge May 18, 2026 21:06
@jjackson jjackson merged commit 7f42822 into main May 18, 2026
2 checks passed
jjackson added a commit that referenced this pull request May 19, 2026
…step ordering

Two orthogonal bugs surfaced on malaria-itn-fgd/20260515-1645 Phase 6
attempt 12. Both unrelated to PR #354's Learn-Deliver chain scope.

Bug A — learn-tap-module.yaml Branch-B over-fires when form-name != module-name:
The same-name-suppressed-auto-skip branch guarded only on
`screen_suite_menu_list visible AND nav_btn_next NOT visible`, then re-tapped
`text:${MODULE_NAME}`. On J1 (module "Briefing Acknowledgement" → form
"Acknowledge Readiness"), the form-list screen's toolbar still reads
${MODULE_NAME}, so Branch-B fired and re-tapped the non-tappable toolbar
TextView instead of the form row. Fix: nested runFlow with inner guard
`visible: text: ${MODULE_NAME}, below: id: screen_suite_menu_list` so the
text-match is scoped to the list body. When form-name != module-name no
body row matches, Branch-B skips, caller's next learn-tap-module(FORM_NAME)
drills the form row by its own label.
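
The Branch-B guard change can be modelled as a predicate over a simplified screen state. This is only a sketch of the logic described above, not the recipe itself (which is YAML): the `ScreenState` type and both function names are hypothetical, and treating the inner guard as "the text matches a row in the list body" is an assumption based on the commit's stated intent.

```typescript
// Hypothetical, simplified model of what is on screen. Field names are
// illustrative; only the element ids in the comments come from the commit text.
interface ScreenState {
  suiteMenuListVisible: boolean; // screen_suite_menu_list
  navNextVisible: boolean;       // nav_btn_next
  toolbarText: string;           // toolbar title shown above the list
  listRowTexts: string[];        // row labels inside the list body
}

// Old guard: list visible AND next button not visible. It never checks where
// the module-name text actually sits on screen.
function oldBranchBFires(s: ScreenState): boolean {
  return s.suiteMenuListVisible && !s.navNextVisible;
}

// New guard: same outer condition, plus an inner check that the module name
// matches a row in the list body (assumed reading of the scoped text match).
function newBranchBFires(s: ScreenState, moduleName: string): boolean {
  return oldBranchBFires(s) && s.listRowTexts.indexOf(moduleName) !== -1;
}

// J1 from the commit: the toolbar shows the module name, the only row is the form.
const j1: ScreenState = {
  suiteMenuListVisible: true,
  navNextVisible: false,
  toolbarText: "Briefing Acknowledgement",
  listRowTexts: ["Acknowledge Readiness"],
};

const oldFiresOnJ1 = oldBranchBFires(j1);                             // true: over-fires
const newFiresOnJ1 = newBranchBFires(j1, "Briefing Acknowledgement"); // false: skips
```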

Bug B — sweepStaleEmulatorState ordering (PR #349 follow-up):
PR #349 wired up the orphan-qemu sweep and adb-server restart but
(a) used a conservative `qemuPids.length >= liveCount + 2` kill threshold
that left the attempt-12 signature (2 orphan qemu + 1 stale adb-devices
entry, 2 NOT >= 1+2) below the kill bar, and (b) ran the adb-restart
immediately after the kill without waiting for the kernel to release the
emulator-NNNN TCP sockets — letting the freshly-restarted daemon adopt
the wedged-port state. Result: 2 of the next 3 ensureAvdRunning calls
still failed with "package service did not bind" until a second manual
adb kill-server/start-server fired inside the dispatch. Fix:
  1. Loosen the orphan kill to fire whenever qemuPids.length > liveCount.
  2. Add a 500ms socket-release wait between the last orphan SIGKILL and
     adb kill-server.
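
The threshold change and step ordering can be sketched as pure functions. This is a hedged illustration rather than the real `sweepStaleEmulatorState`: the function names, the `SweepStep` labels, and the choice to wait only when a kill actually happened are assumptions; the two comparisons and the 500ms figure come from the commit text.

```typescript
// Old threshold from PR #349: kill only when at least two more qemu PIDs
// than live devices are present.
function oldShouldKillOrphans(qemuPidCount: number, liveCount: number): boolean {
  return qemuPidCount >= liveCount + 2;
}

// Loosened threshold: kill whenever there are more qemu PIDs than live devices.
function newShouldKillOrphans(qemuPidCount: number, liveCount: number): boolean {
  return qemuPidCount > liveCount;
}

// Hypothetical step labels, used only to make the ordering explicit.
type SweepStep = "kill-orphans" | "wait-500ms" | "adb-restart";

// Plans the sweep: the orphan kill (if any) comes first, then the
// socket-release wait, then the adb server restart.
function planSweep(qemuPidCount: number, liveCount: number): SweepStep[] {
  const steps: SweepStep[] = [];
  if (newShouldKillOrphans(qemuPidCount, liveCount)) {
    steps.push("kill-orphans");
    steps.push("wait-500ms"); // assumption: wait only after an actual kill
  }
  steps.push("adb-restart");
  return steps;
}

// Attempt-12 signature: 2 qemu PIDs, 1 live device.
const attempt12Old = oldShouldKillOrphans(2, 1); // false, since 2 is not >= 3
const attempt12New = newShouldKillOrphans(2, 1); // true, since 2 > 1
const attempt12Plan = planSweep(2, 1);
```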

Tests:
  - static-recipe-invariants.test.ts asserts Branch-B's inner runFlow has
    `visible: text: ${MODULE_NAME}, below: id: screen_suite_menu_list`.
  - avd.test.ts asserts orphan-qemu kill precedes adb kill-server AND
    there's a ≥400ms gap between them (we wait 500ms, 100ms scheduling
    slack). Second test asserts the loosened threshold actually fires
    kills on the attempt-12 2-PID + 1-live-device signature.

Live reproducer: /Users/jjackson/.maestro/tests/2026-05-19_151650/
  screenshot-❌-1779218294468-(chunk-0.yaml).png.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>