From 25ce460c7418b73bc90298328180897d4a7fcf83 Mon Sep 17 00:00:00 2001
From: Jonathan Jackson <jjackson@dimagi.com>
Date: Fri, 22 May 2026 02:56:07 -0600
Subject: [PATCH] fix(solicitation-create): align with current labs reality +
 document ideal end-state (connect-labs#212)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Surfaced during the malaria-itn-app/20260521-1400 Phase 8 republish
(solicitation 3140). Three labs-side gaps required inline workarounds
in the skill until jjackson/connect-labs#212 ships:

(a) Wire-shape drift. Atom inputSchema declares {data: {...fields...}}
    but deployed labs server expects fields flat at JSON-RPC params
    level. Step 6 now documents both paths: try the MCP atom as
    documented first; if labs returns "unknown fields: ['data']",
    fallback to direct JSON-RPC POST with flat shape.

(b) questions[].framing rejected as unknown key. Adopted the literal
    'Why we're asking: <framing>\n\n<text>' inline anchor convention
    in the text field. Load-bearing for solicitation-review which
    parses the framing back out on this anchor.

(c) evaluation_criteria[].id required by server but undocumented in
    atom inputSchema. Added id as a required field on the criterion
    shape with slugify(name) as the derivation fallback.

Step 7a verifier rewritten from "curl the public URL" to
"get_solicitation round-trip" — the labs public-detail page now 302s
to /labs/login/ even when is_public: true (also tracked in #212), so
the curl-grep verifier can't run. The round-trip via
get_solicitation provides the same structural check at the
persistence layer.

When all four #212 items ship: drop the inline workarounds, emit
framing as a structured key, drop the wire-shape fallback paragraph,
restore the curl-the-public-URL verifier as a second post-round-trip
check.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 .claude-plugin/marketplace.json     |   4 +-
 .claude-plugin/plugin.json          |   2 +-
 VERSION                             |   2 +-
 package.json                        |   2 +-
 skills/solicitation-create/SKILL.md | 161 +++++++++++++++++++++-------
 5 files changed, 125 insertions(+), 46 deletions(-)
diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
index 6987c3d3..447d125c 100644
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@@ -6,13 +6,13 @@
     "url": "https://github.com/jjackson"
   },
   "metadata": {
-    "version": "0.13.330"
+    "version": "0.13.331"
   },
   "plugins": [
     {
       "name": "ace",
       "source": "./",
-      "version": "0.13.330",
+      "version": "0.13.331",
       "description": "AI Connect Engine — orchestrates the CRISPR-Connect lifecycle from idea through app building, Connect setup, LLO management, and closeout"
     }
   ]
diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
index 13ae57c7..d519303a 100644
--- a/.claude-plugin/plugin.json
+++ b/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "ace",
-  "version": "0.13.330",
+  "version": "0.13.331",
   "description": "AI Connect Engine — orchestrates the CRISPR-Connect lifecycle from idea through app building, Connect setup, LLO management, and closeout",
   "author": {
     "name": "Jonathan Jackson",
diff --git a/VERSION b/VERSION
index 8bf173bc..7a026946 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-0.13.330
+0.13.331
diff --git a/package.json b/package.json
index eddbfdad..b7ff3858 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "ace",
-  "version": "0.13.330",
+  "version": "0.13.331",
   "description": "AI Connect Engine - orchestrator for building Connect Opps using AI",
   "type": "module",
   "scripts": {
diff --git a/skills/solicitation-create/SKILL.md b/skills/solicitation-create/SKILL.md
index d65b29ec..a13c9089 100644
--- a/skills/solicitation-create/SKILL.md
+++ b/skills/solicitation-create/SKILL.md
@@ -175,8 +175,8 @@ contract.
    | `expected_end_date` | string `YYYY-MM-DD` | Phase 4 opp `end_date` if available, else PDD `## Timeline` → end. **NOT** `anticipated_end`. |
    | `estimated_scale` | string | Human-readable summary of expected reach, e.g. "30–50 verified HH visits per LLO; 2–3 LLOs total (90–150 HH end-to-end)". Sourced from PDD `## Target Population` → `Expected reach`. **NOT** `sample_target`. |
    | `contact_email` | string | Operator-monitored address. **NOT** `ace@dimagi-ai.com` (that's the service-account bot). Use `${ACE_SOLICITATION_CONTACT_EMAIL}` env var; halt with `[BLOCKER]` if unset rather than defaulting to the bot inbox. |
-   | `evaluation_criteria` | array of `{name, description, weight, scoring_guide, linked_questions}` | Composed locally — see Step 3. **NOT** `rubric`. **NOT** `[{dimension, criterion, weight}]`. Each criterion MUST have a populated `scoring_guide` and at least one `linked_questions` id. Weights sum to 100 (integers). |
-   | `questions` | array of `{id, text, required, type, framing}` | Composed locally — see Step 4. **NOT** `response_questions`. Field is `text`, not `question`. **Every question MUST have a `framing` field** (1-2 sentence "why we're asking this" preface) — the public template hides empty framings, but the eval rubric and downstream solicitation-review consume the framing. `type` is one of `"textarea"` (default for open-ended), `"multiple_choice"`, `"number"`. |
+   | `evaluation_criteria` | array of `{id, name, description, weight, scoring_guide, linked_questions}` | Composed locally — see Step 3. **NOT** `rubric`. **NOT** `[{dimension, criterion, weight}]`. **`id` is REQUIRED** — kebab-case identifier (e.g. `field-ops-realism`), unique within the rubric. Server rejects criteria without `id` even though the labs atom's documented inputSchema doesn't flag it (see `jjackson/connect-labs#212`). Each criterion MUST also have a populated `scoring_guide` and at least one `linked_questions` id. Weights sum to 100 (integers). |
+   | `questions` | array of `{id, text, required, type}` (+ inlined framing — see below) | Composed locally — see Step 4. **NOT** `response_questions`. Field is `text`, not `question`. `id` is kebab-case (`field-ops-realism`). `type` is one of `"textarea"` (default), `"multiple_choice"`, `"number"`. **Framing handling — current labs reality:** the deployed labs server (as of 2026-05-22) rejects a `framing` key with `INVALID_SCHEMA`. Until `jjackson/connect-labs#212` ships, inline the framing as a prefix in the `text` field using the literal convention `Why we're asking: <framing-sentence>\n\n<actual-prompt>`. Two-line gap between framing and prompt is load-bearing — `solicitation-review` parses the prefix back out on this anchor. Once `connect-labs#212` lands, drop the inline prefix and emit `framing` as a structured field. |
    | `status` | string | `'active'` (publishes immediately; `'draft'` for dry-run mode). |
    | `is_public` | bool | `true` (so unsolicited orgs can find it on the public marketplace). |
    | `connect_opportunity_id` | int | Phase 4 opp internal id (not the UUID). Stored on the record for downstream solicitation-review linkage. |
@@ -314,19 +314,24 @@ contract.
    **Required shape per criterion:**
 
    ```yaml
-   - name: string                # short title, e.g. "Field operations realism"
+   - id: string                  # kebab-case identifier, e.g. "field-ops-realism" — REQUIRED
+     name: string                # short title, e.g. "Field operations realism"
      description: string         # 1-2 sentence explanation of what this measures
      weight: int                 # 5-30, integer; all criteria weights sum to 100
      scoring_guide: string       # what makes a 10/10 vs 5/10 vs 0/10 — concrete and falsifiable
      linked_questions: [string]  # one or more question `id`s from the questions block; each criterion links to ≥1 question
    ```
 
-   **Every field is required.** Specifically, `scoring_guide` and
-   `linked_questions` are NOT optional — the labs template renders both,
-   and an empty `scoring_guide` makes the rubric uninterpretable to
-   responding LLOs. **Weights must sum to 100** (integers — the labs
-   template formats them as `<weight>%`). 5-8 criteria total; more than
-   8 dilutes the signal, fewer than 4 misses dimensions.
+   **Every field is required, including `id`.** The deployed labs server
+   enforces `id` even though the MCP atom's documented inputSchema does
+   not currently flag it (tracked in `jjackson/connect-labs#212`).
+   Derive `id` from `slugify(name)` if you're tempted to elide it.
+   `scoring_guide` and `linked_questions` are NOT optional — the labs
+   template renders both, and an empty `scoring_guide` makes the rubric
+   uninterpretable to responding LLOs. **Weights must sum to 100**
+   (integers — the labs template formats them as `<weight>%`). 5-8
+   criteria total; more than 8 dilutes the signal, fewer than 4 misses
+   dimensions.
 
    **`scoring_guide` shape — what a strong response looks like:**
    Each `scoring_guide` MUST describe (a) what a strong (8-10) answer
@@ -386,12 +391,37 @@ contract.
 
    ```yaml
    - id: string         # short kebab-case, e.g. "field-ops-realism"
-     framing: string    # 1-2 sentences: why this question, what we're looking for
-     text: string       # the actual prompt the LLO reads + responds to
+     text: string       # framing prefix + prompt — see "Framing convention" below
      required: bool     # default true
      type: string       # "textarea" (default), "multiple_choice", "number"
    ```
 
+   **Framing convention (current labs reality).** The deployed labs
+   server rejects a top-level `framing` key on questions (`INVALID_SCHEMA`).
+   Until `jjackson/connect-labs#212` ships, inline the framing inside
+   `text` using the literal anchor `Why we're asking: ` followed by the
+   framing sentence(s), a blank line, then the prompt:
+
+   ```yaml
+   - id: field-ops-realism
+     text: |
+       Why we're asking: a strong response names supervisor:FLW ratios,
+       handles the photo-heavy visit logistics, and treats the mid-pilot
+       checkpoint as a real planning anchor.
+
+       Propose a week-by-week schedule from award through Week 10 closeout,
+       including LLO mobilization, Connect onboarding, Learn calibration,
+       field launch, mid-pilot checkpoint, and closeout.
+     required: true
+     type: textarea
+   ```
+
+   The `Why we're asking: ` prefix + blank-line gap is load-bearing —
+   `solicitation-review` parses the framing back out on this exact
+   anchor. Once `connect-labs#212` lands and the server accepts
+   `framing` as a structured field, drop the inline prefix and emit
+   the framing separately. Until then this is the canonical convention.
+
    **Dedupe by intent.** The PDD's `## Solicitation` → `Response
    template` may list opp-specific questions that overlap with the
    default set. Combine them — never publish two questions that ask
@@ -475,7 +505,39 @@ contract.
       shadow programs on first opportunity sync). Surface the Connect
       program name and the list of labs programs the caller can see.
 
-6. **Publish.** Call:
+6. **Publish.**
+
+   **Wire-shape note (current labs reality).** The MCP atom's
+   `tools/list` inputSchema declares `{program_id, data: {...fields...}}`
+   (fields wrapped in `data`). The deployed labs server (as of
+   2026-05-22) instead expects the fields **flat** at the JSON-RPC
+   params level and rejects the `data` envelope with `unknown fields:
+   ['data']`. This is a drift between the documented atom schema and
+   the server validator, tracked in `jjackson/connect-labs#212`.
+
+   Two paths exist while the drift is live:
+
+   - **Preferred:** call the MCP atom as documented (`data: {...}` wrap)
+     and let the proxy translate; the labs MCP server will be updated
+     to accept the wrapped shape once `connect-labs#212` lands. If the
+     MCP call succeeds, you're done.
+   - **Fallback (if MCP atom returns `unknown fields: ['data']`):** POST
+     the JSON-RPC frame directly to `https://labs.connect.dimagi.com/mcp/`
+     with the fields flat at `params.arguments`. Set
+     `Authorization: Bearer ${LABS_MCP_TOKEN}` and
+     `Content-Type: application/json`. Body shape:
+     ```json
+     {"jsonrpc":"2.0","id":1,"method":"tools/call","params":{
+       "name":"create_solicitation",
+       "arguments":{"program_id":"153","title":"...","description":"...", /* flat fields */}
+     }}
+     ```
+     Parse the `result` envelope to extract `solicitation_id`.
+
+   Either way, the payload field SHAPE (names + types + required-ness)
+   is the same. Only the outer wrap differs.
+
+   Call:
 
    ```
    mcp__connect-labs__create_solicitation(
@@ -545,35 +607,51 @@ contract.
    reports `is_public: true` but the public listing page renders empty
    (verified jjackson/ace e2e malaria-itn-app run 20260517-1829).
 
-7a. **Verify the rendered public page.** After publish, fetch the
-   public URL with `curl -s` and grep for structural signals that
-   confirm each load-bearing section actually rendered:
-
-   ```bash
-   curl -s "${LABS_BASE_URL}/solicitations/<id>/" | tee /tmp/sol.html
-   ```
-
-   Required matches (all must be present):
-
-   - `<h2[^>]*>.*Description` AND a non-empty markdown body following it
-   - `Application Deadline` AND a parseable date in `Month D, YYYY` form (NOT "No deadline")
-   - `Timeline` AND a non-empty `Mon YYYY — Mon YYYY` value (NOT "TBD — TBD")
-   - `<h2[^>]*>.*Scope of Work` AND markdown bullets / sub-headings (NOT a `['...', '...']` Python repr)
-   - `<h2[^>]*>.*Application Questions` AND `N questions` count text (where N matches len(questions))
-   - `<h2[^>]*>.*Evaluation Criteria` AND `N criteria` count text (where N matches len(evaluation_criteria))
-
-   **Any miss → `[BLOCKER]`** naming the missing section + the
-   probable field-name drift (e.g. "Application Deadline reads `No
-   deadline` — likely the payload sent `response_window_days` instead
-   of `application_deadline`"). Do NOT proceed to write
-   `published.md` against a half-rendered solicitation. The class
-   this catches is the silent-echo failure mode where the labs API
-   accepts the create cleanly but the public page is empty.
-
-   This is the structural backstop for the field-name canonical-schema
-   contract. Verified live on solicitation 3130 (jjackson/ace
-   `malaria-itn-app/20260521-1400`) where all 6 sections were broken
-   simultaneously because the entire payload schema had drifted.
+7a. **Verify the round-trip via `get_solicitation`.** Immediately after
+   publish (whether via MCP atom or direct JSON-RPC fallback), call
+   `mcp__connect-labs__get_solicitation(solicitation_id=<returned>,
+   program_id=<labs_program_id>)` and assert every load-bearing field
+   round-trips intact. Treat the response as the canonical view of
+   what labs persisted — if it diverges from what you sent, that's
+   either silent-drop (an extra field labs rejected without surfacing)
+   or silent-mutation (labs reformatting something).
+
+   Required round-trip assertions:
+
+   - `title` matches what was sent.
+   - `description` length is non-empty and ≥ 80% of the sent length (small whitespace/markdown normalization is OK; 20%+ shrinkage is silent-drop).
+   - `scope_of_work` is a string (not an array) AND length ≥ 80% of sent.
+   - `application_deadline` parses as `YYYY-MM-DD` AND matches the computed deadline.
+   - `expected_start_date` + `expected_end_date` both parse and match.
+   - `estimated_scale` is non-empty.
+   - `contact_email` matches.
+   - `len(questions)` matches sent; every question has `id`, `text`, `required`, `type`; `text` contains the `Why we're asking: ` framing anchor.
+   - `len(evaluation_criteria)` matches sent; every criterion has `id`, `name`, `weight`, `scoring_guide`, `linked_questions` (non-empty); `sum(weights) == 100`.
+   - `is_public: true`; `status: 'active'`; `connect_opportunity_id` matches sent.
+
+   **Any miss → `[BLOCKER]`** naming the missing or mutated field +
+   the probable cause (e.g. "scope_of_work returned as an array — labs
+   string→array coerced; check the payload was sent as a single
+   markdown string and not Python-list-coerced upstream"). Do NOT
+   proceed to write `published.md` against a half-persisted solicitation.
+
+   **Why round-trip and not curl-the-public-URL.** The labs
+   `/solicitations/<id>/` detail page currently 302s to `/labs/login/`
+   for unauthenticated visitors even when `is_public: true` — tracked
+   in `jjackson/connect-labs#212`. A vanilla curl gets the login page,
+   not the solicitation. Once `connect-labs#212` ships the public-page
+   ACL fix, add a second verifier step that curls the public URL and
+   greps for the rendered sections (the original Step 7a, surfaced as
+   the structural backstop for the *rendered* layer rather than just
+   the persistence layer). Until then, `get_solicitation` round-trip
+   is the canonical structural check.
+
+   This catches the class of bugs where the labs MCP accepts the
+   create cleanly but persisted state diverges from sent payload.
+   Verified live on solicitation 3130 (`malaria-itn-app/20260521-1400`)
+   where all 6 sections were broken simultaneously because the entire
+   payload schema had drifted — round-trip would have caught it at
+   write time instead of human-eye time.
 
 7b. **Verify the round-trip.** Immediately after publish, call:
 
@@ -735,4 +813,5 @@ Each row this skill writes uses `phase: 8-solicitation-management` and
 | 2026-05-15 | Three archetype-branches added for `focus-group`: (1) scope_of_work concatenation in Step 2 — FGD PDD has no `## Learn App Specification` (uses `## Facilitation Protocol` instead); the scope opens with a "PER VERIFIED SESSION, THREE ARTIFACTS" block listing audio + gdoc + 5-field attestation form with explicit "NOT in the form" callout. (2) evaluation_criteria in Step 3 — focus-group goes from a 4-axis sketch to a 6-axis starter rubric (qualitative-research experience, facilitator skill + language, homogeneous-group recruitment, coordinator gdoc-review capacity, audio handling out-of-band, timeline + per-session payment economics). (3) default questions in Step 3 — swap CHW-deployment vocabulary for qualitative-research vocabulary on q1 + q5 + q6. Prompted by `malaria-itn-fgd/20260514-2352` Phase 8 observations. | ACE team |
 | 2026-05-15 | Codify the **"per-unit payment is negotiated, not declared"** design principle at the top of `## Process`. Solicitations express payment as a range with rationale in `scope_of_work` prose; the `questions` block asks the responding LLO to propose their actual rate + why. Closes the loop on the "labs `per_unit_payment` schema gap" surfaced in Phase 8 — it's not a gap, it's an intentional design choice (per-unit shape varies by archetype; the rate is opp-and-LLO-specific and negotiated through the response). | ACE team |
 | 2026-05-21 | **Work-order-as-primary-input + canonical-schema field names + comprehensive-content shape.** Three bundled rewrites prompted by solicitation 3130 on `malaria-itn-app/20260521-1400` where the public page rendered blank Description, "TBD" timeline, "No deadline," Python-list-repr Scope, and zero questions / zero rubric simultaneously. (1) Inputs now read Phase 1's work order (`1-design/pdd-to-work-order.gdoc`) as the primary content source + `decisions.yaml` for later run decisions, alongside the PDD (now used for problem-framing only, not for scope). (2) Field names migrated to the labs canonical schema (`description` not `overview`, `application_deadline` not `response_window_days`, `expected_start_date/_end_date` not `anticipated_*`, `estimated_scale` not `sample_target`, `questions[].text` not `response_questions[].question`, `evaluation_criteria[].name/.scoring_guide/.linked_questions` not `rubric[].dimension/.criterion`; `solicitation_type: 'eoi'` lowercase). Top-level fields not in `solicitations/models.py` (`pass_bar`, `eligibility_criteria`, `geographic_scope`, `per_hh_payment_band_usd`, `budget`) folded into `description`/`scope_of_work` prose. (3) Content shape demands comprehensive prose: `description` 500-800 words foundation-pitch tone; `scope_of_work` 600-1000+ words derived section-by-section from the work order with explicit de-prescription rules (exact dollars → ranges, exact weeks → windows); every question has a required `framing` field; every evaluation criterion has a required `scoring_guide` + `linked_questions`. (4) Added Step 7a — a curl-the-public-URL structural verifier that catches field-name drift at write time instead of at human-eye time. | ACE team |
+| 2026-05-22 | **Align with current labs reality + document the ideal end-state (`jjackson/connect-labs#212`).** Surfaced during the malaria-itn-app `20260521-1400` Phase 8 republish (solicitation 3140). Three labs-side gaps required inline workarounds: (a) atom inputSchema `{data: {...}}` shape vs deployed server's flat-fields validator — added Step 6's wire-shape fallback documenting both paths; (b) `questions[].framing` rejected as unknown key — adopted the literal `Why we're asking: <framing>\n\n<text>` inline anchor convention, load-bearing for `solicitation-review` parsing; (c) `evaluation_criteria[].id` required by server but undocumented in atom inputSchema — added `id` to the required criterion shape with `slugify(name)` as the derivation fallback. Step 7a verifier rewritten from "curl the public URL" to "`get_solicitation` round-trip" because the labs public-detail page now 302s to login for unauthenticated visitors even when `is_public: true` (tracked in `connect-labs#212`). When all four labs items in #212 ship, drop the inline workarounds: emit `framing` as a structured key, drop the wire-shape fallback paragraph, restore the curl-the-public-URL verifier as a second post-round-trip check. | ACE team |
 | 2026-05-22 | **Architecture decision: ACE owns composition; labs validates.** PR #396 had floated a future labs-side `create_solicitation_from_brief` MCP tool that would compose content server-side via labs's `solicitation_agent`. Walked back — operator chose to keep composition in ACE so this skill retains full control over voice, archetype-branched scope, framing/scoring_guide quality, and decisions-log integration (all of which are ACE-context that labs would have to learn). Labs's tightened MCP (forthcoming deploy: `create_solicitation` + `update_solicitation` now validate the canonical schema and fail loudly with `INVALID_SCHEMA` + `error.details.fields` on drift) is the right server-side contribution: schema enforcement, not content generation. This skill is the long-term home for solicitation composition; Step 6's payload shape is bound to labs's `tools/list` inputSchema rather than to a future composer call. Removal of the prior "Removal criteria" line. | ACE team |