From 5324d7f3d9c5fd053a2ddf3a0be99ce1a9b5faa6 Mon Sep 17 00:00:00 2001 From: cafzal Date: Tue, 5 May 2026 15:39:00 -0700 Subject: [PATCH 01/22] Add reproducibility runbooks to four multi-reasoner templates Each runbook is an agent prompt sequence to recreate the template's multi-reasoner pipeline using the bundled CSVs in ../data/, mapping each stage to the template's actual concepts, properties, and outputs: - telco_network_recovery (5-stage: descriptive -> rules -> graph -> predictive -> prescriptive; mirrors PR #56's existing structure) - energy_grid_planning (4-stage: predictive -> graph -> rules -> prescriptive with InvestmentLevel scenarios) - supply_chain_resilience (4-stage: blast-radius -> graph -> rules -> min-cost flow + scenarios) - machine_maintenance (5-stage: querying -> graph -> rules -> prescriptive maintenance schedule -> resilience cross-training) Reproducible against the bundled template CSVs; one-line notes on swapping to a Snowflake schema for users wiring to their own data. --- v1/energy_grid_planning/references/runbook.md | 440 +++++++++++++++++ v1/machine_maintenance/references/runbook.md | 215 +++++++++ .../references/runbook.md | 359 ++++++++++++++ .../references/runbook.md | 449 ++++++++++++++++++ 4 files changed, 1463 insertions(+) create mode 100644 v1/energy_grid_planning/references/runbook.md create mode 100644 v1/machine_maintenance/references/runbook.md create mode 100644 v1/supply_chain_resilience/references/runbook.md create mode 100644 v1/telco_network_recovery/references/runbook.md diff --git a/v1/energy_grid_planning/references/runbook.md b/v1/energy_grid_planning/references/runbook.md new file mode 100644 index 0000000..4dc729a --- /dev/null +++ b/v1/energy_grid_planning/references/runbook.md @@ -0,0 +1,440 @@ +# Runbook: Energy Grid Planning — Multi-Reasoner Walkthrough + +Walk-through of the chained-reasoner pattern this template is built on. 
One realistic business thread — **ERCOT processes 10 hyperscaler interconnection requests against the 12-substation Texas grid** — traced across four RAI reasoner families, each stage writing properties back to the same ontology that downstream stages consume. + +The template's combined script (`energy_grid_planning.py`) implements stages 1–4 directly; this runbook walks through how an agent would derive the same pipeline prompt-by-prompt, skill-by-skill, so a non-OR reader can follow the full reasoning thread end-to-end. + +--- + +## TL;DR — the chain in one screen + +``` +ERCOT has 10 hyperscaler interconnection requests totalling 2,930 MW +on a 12-substation grid with one structurally constrained bottleneck (DFW). +The chain produces a Pareto frontier across 5 budget levels — the knee at +$300M unlocks 5 DCs (1,500 MW, $264M net value) including xAI Colossus. + + ───────────────────────────────────────────────────────────────── + STAGE 1 Predictive ──► Substation.predicted_load (12) + DFW: 1,100 → 1,700 MW (+54.6%) ── breaches + 1,600 MW capacity at 24mo. The only + substation predicted to breach. + ───────────────────────────────────────────────────────────────── + STAGE 2 Graph ──► Substation.betweenness (12) + (WCC/ Substation.grid_community (3 regions) + Louvain/ Substation.is_structurally_critical (3) + centrality) DFW, Houston, San Antonio dominate. 7 of 10 + DC requests target critical substations. + ───────────────────────────────────────────────────────────────── + STAGE 3 Rules ──► DataCenterRequest.is_compliant (2) + fails_capacity / fails_structural / + fails_low_carbon flags written back. + Only Crusoe (Midland) and Oracle + (Corpus Christi) pass all three. + ───────────────────────────────────────────────────────────────── + STAGE 4 Prescriptive ──► DataCenterRequest.x_approve (per InvestmentLevel) + SubstationUpgrade.x_upgrade (per InvestmentLevel) + OPTIMAL across 5 budget levels in one solve. + Knee $300M · 5 DCs · 1,500 MW · $264M net. 
+ Google + Lambda never approved — DFW full. + ───────────────────────────────────────────────────────────────── +``` + +A single-reasoner approach can't answer this. Predictive alone forecasts demand but doesn't decide. Graph alone finds bottlenecks but doesn't weigh revenue. Rules alone flag noncompliance but can't optimize. Prescriptive alone has no way to weigh "critical" without graph + predictive upstream. + +--- + +## Prerequisites + +The template ships everything needed to run the chain end-to-end: + +- Bundled CSVs in `../data/` (12 substations, 15 generators, 18 transmission lines, 10 DC requests, 10 substation upgrades, and supporting load/forecast tables). To run against your own Snowflake schema, swap CSV loaders for `model.Table(...)` references. +- `raiconfig.yaml` pointed at your RAI-enabled Snowflake account. +- The combined script `../energy_grid_planning.py` runs the full chain. + +The narrative below follows the actual stage outputs of `energy_grid_planning.py`. Steps 1–3 below are an agent-style walk-through of how the ontology would be built from scratch via skills like `/rai-build-starter-ontology` and `/rai-discovery`; the template ships a pre-built ontology in `energy_grid_planning.py` for users who want to skip ahead to the reasoner stages. + +--- + +## Walk-through (agent-style) + +Steps are sequential — each depends on prior steps. Steps without a skill are presentation-only. + +| # | Step | Skill | Prompt | Expected Output | +|---|------|-------|--------|-----------------| +| 1 | Ontology | `/rai-build-starter-ontology` | "Build an ontology for ERCOT grid infrastructure planning from the bundled CSVs (or your Snowflake schema)." | 13 concepts: `Substation`, `Generator`, `TransmissionLine`, `LoadZone`, `DemandPeriod`, `RenewableProfile`, `MaintenanceWindow`, `Customer`, `DataCenterRequest`, `SubstationUpgrade`, `DemandForecast`, `LoadHistory`, `DCAnnouncement`. 
12 substations, 15 generators, 18 transmission lines, 10 DC requests (2,930 MW). | +| 2 | Visualize | — | "Show the ontology as an ASCII diagram." | Concept map with `Substation` as central hub — `Generator`, `TransmissionLine`, `DataCenterRequest`, `SubstationUpgrade`, `DemandForecast` all relate to it. | +| 3 | Discovery | `/rai-discovery` | "What questions can we answer with this ontology? We're evaluating data center interconnection requests." | 4 reasoning paths: substation load forecast (predictive), grid topology + structural criticality (graph), interconnection compliance (rules), joint approval + upgrade investment (prescriptive). | +| 4 | Explore: generation mix | `/rai-querying` | "What's our current generation mix by fuel type? How much renewable capacity vs fossil?" | 15 generators, 8,135 MW total. Nuclear leads (2,560 MW, 31.5%), then gas (2,290 MW, 28.1%), wind (1,250 MW, 15.4%), coal (1,020 MW, 12.5%), solar (630 MW, 7.7%), battery (300 MW), hydro (85 MW). Renewable: 2,265 MW (28%). Requests with 100% low-carbon mandates (Google, Crusoe) face a structural constraint. | +| 5 | Explore: capacity headroom | `/rai-querying` | "Which substations have the most and least spare capacity right now, before any new DC load?" | Tightest: Houston Ship Channel (69.4% utilized, 550 MW headroom), Austin Energy (68.9%, 280 MW), DFW (68.8%, 500 MW). Most spare: Midland-Permian (38.2%, 680 MW headroom), Lubbock (44.3%, 390 MW). DFW has only 500 MW headroom but 1,100 MW of DC requests stacked on it. | +| 6 | Explore: DC request landscape | `/rai-querying` | "Summarize the 10 DC requests — total MW per substation, revenue per MW, low-carbon requirements." | 2,930 MW total, $528M/yr revenue across 6 substations. DFW most stacked (1,100 MW, 3 requests: Google $195K/MW, xAI $210K/MW, Lambda $150K/MW). xAI is highest revenue ($210K/MW/yr, $105M/yr total). Google and Crusoe require 100% low-carbon. 
Top 3 substations (DFW, Houston, San Antonio) account for 78% of requested MW. | +| 7 | Stage 1 — Predict | `/rai-querying` (or GNN) | "Forecast which substations are losing headroom fastest and which will breach capacity." | `Substation.predicted_load` written for all 12. DFW: 1,700 MW predicted vs 1,600 MW capacity at 24 months (+54.6% growth). Houston Ship Channel: 1,797 MW (+43.8%, within capacity). DFW is the only substation predicted to breach. | +| 8 | Stage 2 — Graph | `/rai-graph-analysis` | "Build a graph on the transmission grid. Find structurally critical substations using betweenness, degree, and eigenvector centrality. Identify connected components and Louvain communities." | 1 connected component, 3 communities (North Texas, West Texas, Gulf Coast). Top 3 by combined centrality rank: DFW, Houston, San Antonio — all flagged `is_structurally_critical`. 7 of 10 DC requests target critical substations. | +| 9 | Stage 3 — Rules | `/rai-rules-authoring` | "Check each request against capacity (using `predicted_load`), low-carbon mandate, and structural risk (using `is_structurally_critical`)." | 3 declarative `Relationship` rules (`fails_capacity`, `fails_structural`, `fails_low_carbon`) + composite `is_compliant`. All 10 pass low-carbon. 2 compliant: Crusoe (Midland) and Oracle (Corpus Christi). 8 flagged on capacity + structural risk. | +| 10 | Stage 4 — Optimize | `/rai-prescriptive-problem-formulation` | "Which DCs to approve and which upgrades to fund across 5 budget levels ($200M-$600M)? Use `predicted_load` for capacity. Show the Pareto frontier — DCs, MW, revenue, net value at each level." | Pareto frontier across `InvestmentLevel` Scenario Concept. Knee at $300M (5 DCs, 1,500 MW, $264M net value). xAI Colossus unlocks at $300M. Google and Lambda never approved (DFW full). | +| 11 | Results | `/rai-prescriptive-results-interpretation` | "How do approvals and upgrades vary by investment level? Where's the knee? What's marginal return per $M?"
| Per-level DC list + selected upgrades, queried via `model.select(...).where(x_approve > 0.5)`. Marginal: $200→$300M = $995K/$M (knee); declines to $400K/$M by $600M. | + +--- + +## Stage 1 — Predictive: substation load forecasting + +**Method:** load max forecasted load per substation as `Substation.predicted_load`. The template aggregates `DemandForecast.predicted_load_mw` over forecast horizons (6/12/18/24 months) and writes the max back to the substation. A pre-trained GNN can replace the table lookup; the script falls back gracefully when the GNN model registry is unavailable. + +``` +Substation load forecast (max across 6/12/18/24-month horizons) + + Houston Ship Channel ──── pred 1,797 MW ████████████ +43.8% safe + Dallas-Fort Worth ──── pred 1,700 MW ██████████ +54.6% ▲ breach 24mo + San Antonio Metro ──── pred 1,069 MW ███████ +37.1% safe + Austin Energy ──── pred 819 MW █████ +32.1% safe + Waco Gateway ──── pred 600 MW ████ +22.4% safe + Corpus Christi Coast ──── pred 600 MW ████ +11.1% safe + Midland-Permian ──── pred 520 MW ███ +23.8% safe + El Paso Border ──── pred 470 MW ███ +14.6% safe + Abilene Central ──── pred 400 MW ███ +14.3% safe + Lubbock West Texas ──── pred 360 MW ██ +16.1% safe + Brownsville Valley ──── pred 355 MW ██ +10.9% safe + Amarillo Panhandle ──── pred 315 MW ██ +12.5% safe + ▲ + ──────────────────────────────────────────│────────────────────── + DFW is the only substation predicted to │ + breach capacity (1,700 vs 1,600 MW cap). │ + Stage 3 rules and Stage 4 capacity │ + constraint both consume this property. 
│ + ────────────────────────────────────────────────────────────────── + +✓ Substation.predicted_load written back to all 12 substations +``` + +**Stage 3 rule and Stage 4 constraint with the predictive term:** + +```python +# Both reuse the same effective_load expression +effective_load = Substation.predicted_load | Substation.current_load_mw +``` + +**Caveats:** +- The bundled forecasts are pre-computed in `data/demand_forecasts.csv` to keep the template self-contained. To wire in a live GNN, point `GNN(...)` at your model registry — the template's `try` block already handles both paths. +- DFW's projected breach partially encodes the stacked DC request itself ("we expect demand to keep rising at the substation everyone is targeting"). For an independent baseline, train on a pre-announcement slice and compare. + +--- + +## Stage 2 — Graph: grid topology & structural vulnerability + +**Construction** — `Substation` as the node concept directly (no mirror concept): +- Node concept: `Substation` (12 nodes) +- Edges: active `TransmissionLine` rows, `from_substation` → `to_substation` +- Direction: undirected, unweighted + +**Algorithms:** weakly connected components, Louvain community detection, betweenness/degree/eigenvector centrality. Combined rank picks the top-3 as `is_structurally_critical`. 
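The WCC + centrality pass can be reproduced outside RAI in a few dozen lines. The sketch below is a plain-Python Brandes betweenness implementation plus a BFS component finder, run over a toy four-node adjacency map — the node names are illustrative, not the template's 12-substation edge list:

```python
from collections import deque

def components(adj):
    """Connected components of an undirected graph given as node -> set(neighbours)."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, q = {s}, deque([s])
        seen.add(s)
        while q:
            v = q.popleft()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    q.append(w)
        comps.append(comp)
    return comps

def betweenness(adj):
    """Brandes' betweenness centrality for an unweighted, undirected graph."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, pred = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0); sigma[s] = 1   # shortest-path counts
        dist = dict.fromkeys(adj, -1); dist[s] = 0
        q = deque([s])
        while q:                                      # BFS from s
            v = q.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    pred[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:                                  # back-propagate dependencies
            w = stack.pop()
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: c / 2.0 for v, c in bc.items()}        # undirected: each pair counted twice

# Toy star: Waco as hub (illustrative nodes only)
adj = {"DFW": {"Waco"}, "Waco": {"DFW", "Austin", "Abilene"},
       "Austin": {"Waco"}, "Abilene": {"Waco"}}
```

On the real data, the adjacency map would be built from active `TransmissionLine` rows (`from_substation` → `to_substation`, undirected), with the scores written back as `Substation.betweenness` by the graph skill.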
+ +``` +Connectivity ──► 1 component (12 of 12 reachable) + +Louvain communities ──► 3 regions + + Region 1 — North Texas Dallas-Fort Worth, Austin Energy, Waco Gateway + Region 2 — West Texas Midland-Permian, Lubbock, El Paso, Amarillo, Abilene + Region 3 — Gulf Coast Houston Ship Channel, San Antonio Metro, + Corpus Christi Coast, Brownsville Valley + +Centrality (top-3 marked is_structurally_critical) + + #1 Dallas-Fort Worth betw 31.67 ████████████ [CRITICAL] + #2 Houston Ship Channel betw 15.83 ██████ [CRITICAL] + #3 San Antonio Metro betw 4.33 ██ [CRITICAL] + #4 Austin Energy betw ~3.0 + … + #12 Brownsville Valley betw 0.00 + + ────────────────────────────────────────────────────────────────── + DC requests targeting structurally critical substations: + Microsoft Horizon Campus (350 MW) ─► Houston [CRITICAL] + Meta Bayou DC (300 MW) ─► Houston [CRITICAL] + Google Metroplex DC (400 MW) ─► DFW [CRITICAL] + xAI Colossus Texas (500 MW) ─► DFW [CRITICAL] + Lambda Labs DFW (200 MW) ─► DFW [CRITICAL] + Amazon SA Cloud (280 MW) ─► San Antonio [CRITICAL] + Apple iCloud Texas (250 MW) ─► San Antonio [CRITICAL] + ──────────── + 7 of 10 DC requests sit on the 3 most structurally critical nodes. + ────────────────────────────────────────────────────────────────── + +✓ Substation.betweenness / degree_centrality / eigenvector_centrality +✓ Substation.grid_community written back (12 rows) +✓ Substation.is_structurally_critical written back (3 rows) +``` + +--- + +## Stage 3 — Rules: interconnection queue compliance + +Three declarative `Relationship` rules consume Stages 1–2 enrichments. Each is written as a `model.where(...).define(...)` block; a composite `is_compliant` fires only when none of the three failure flags fire. + +**Rule 1 — Capacity** (consumes `Substation.predicted_load` from Stage 1): + +```python +DataCenterRequest.fails_capacity = model.Relationship(...) 
+effective_load = SubRef.predicted_load | SubRef.current_load_mw +model.where( + DataCenterRequest.substation(SubRef), + DataCenterRequest.requested_mw + effective_load > SubRef.max_capacity_mw, +).define(DataCenterRequest.fails_capacity()) +``` + +**Rule 2 — Structural risk** (consumes `Substation.is_structurally_critical` from Stage 2): + +```python +model.where( + DataCenterRequest.substation(SubRef), + SubRef.is_structurally_critical(), +).define(DataCenterRequest.fails_structural()) +``` + +**Rule 3 — Low-carbon mandate** (zero-emission share at the substation must meet the request's requirement; nuclear + renewable count): + +```python +model.where( + DataCenterRequest.substation(SubRef), + (SubRef.low_carbon_gen_mw | 0.0) * 100 + < DataCenterRequest.low_carbon_requirement_pct * (SubRef.total_gen_mw | 0.001), +).define(DataCenterRequest.fails_low_carbon()) +``` + +``` + DC Request Hyper Q# MW Cap LowC Crit OK? + ────────────────────────────────────────────────────────────────────── + Microsoft Horizon Campus Microsoft 1 350 FAIL PASS FAIL N + Meta Bayou DC Meta 2 300 FAIL PASS FAIL N + Google Metroplex DC Google 3 400 FAIL PASS FAIL N + xAI Colossus Texas xAI 4 500 FAIL PASS FAIL N + Lambda Labs DFW Lambda Labs 5 200 FAIL PASS FAIL N + Amazon SA Cloud Amazon 6 280 FAIL PASS FAIL N + Apple iCloud Texas Apple 7 250 FAIL PASS FAIL N + CoreWeave Austin GPU CoreWeave 8 320 FAIL PASS PASS N + Crusoe Permian DC Crusoe Energy 9 180 PASS PASS PASS Y + Oracle Coastal DC Oracle 10 150 PASS PASS PASS Y + ────────────────────────────────────────────────────────────────────── + Summary: 2 compliant, 8 flagged +``` + +Every request passes low-carbon — ERCOT's nuclear (STP, Comanche Peak) plus its wind/solar fleet provides enough zero-emission generation. The two compliant requests sit on substations that are neither structurally critical (Stage 2) nor predicted to breach (Stage 1). Eight requests need either upgrades, redirection, or both — Stage 4 picks the optimal mix. 
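Stripped of the ontology machinery, the three rule bodies are three boolean predicates. A minimal stand-alone sketch — dict rows stand in for `DataCenterRequest` / `Substation`; the capacity and criticality numbers come from the tables above, while the generation-mix figures and xAI's low-carbon percentage are illustrative:

```python
def compliance(req, sub):
    """Plain-Python mirror of the three Stage 3 rule bodies."""
    # Rule 1 — capacity, using Stage 1's predicted_load (fallback: current load)
    effective_load = sub.get("predicted_load", sub["current_load_mw"])
    fails_capacity = req["requested_mw"] + effective_load > sub["max_capacity_mw"]
    # Rule 2 — structural risk, using Stage 2's criticality flag
    fails_structural = sub["is_structurally_critical"]
    # Rule 3 — zero-emission share at the substation vs the request's mandate
    fails_low_carbon = (sub.get("low_carbon_gen_mw", 0.0) * 100
                        < req["low_carbon_requirement_pct"] * sub.get("total_gen_mw", 0.001))
    return not (fails_capacity or fails_structural or fails_low_carbon)

# DFW is predicted to breach and structurally critical; Midland is neither.
dfw     = {"predicted_load": 1700, "current_load_mw": 1100, "max_capacity_mw": 1600,
           "is_structurally_critical": True,  "low_carbon_gen_mw": 800,  "total_gen_mw": 2000}
midland = {"predicted_load": 520,  "current_load_mw": 420,  "max_capacity_mw": 1100,
           "is_structurally_critical": False, "low_carbon_gen_mw": 2000, "total_gen_mw": 2000}
xai    = {"requested_mw": 500, "low_carbon_requirement_pct": 0}    # pct illustrative
crusoe = {"requested_mw": 180, "low_carbon_requirement_pct": 100}
```

With these rows, `compliance(crusoe, midland)` passes all three checks while `compliance(xai, dfw)` fails on both capacity and structural grounds — matching the table.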
+ +``` +✓ DataCenterRequest.fails_capacity written back +✓ DataCenterRequest.fails_structural written back +✓ DataCenterRequest.fails_low_carbon written back +✓ DataCenterRequest.is_compliant written back (2 rows) +``` + +--- + +## Stage 4 — Prescriptive: joint DC approval + upgrade MIP + +``` +FORMULATION + + Scenario Concept + InvestmentLevel(name, budget_cap) 5 levels: $200M, $300M, $400M, $500M, $600M + + Decision variables (binary, indexed by InvestmentLevel) + DataCenterRequest.x_approve(InvestmentLevel) 50 binaries (10 DCs × 5 levels) + SubstationUpgrade.x_upgrade(InvestmentLevel) 50 binaries (10 upgrades × 5 levels) + + Constraints + 1. Substation capacity per InvestmentLevel + max_capacity − predicted_load + Σ(x_upgrade · capacity_increase) + ≥ Σ(x_approve · requested_mw) per (Substation, InvestmentLevel) + 2. Budget per InvestmentLevel + Σ(x_upgrade · cost_million) ≤ budget_cap per InvestmentLevel + + Objective (maximize) + Σ x_approve · annual_revenue_per_mw · requested_mw summed over all levels + +────────────────────────────────────────────────────────────────────── +SOLVE (HiGHS, single solve, all 5 levels in one MIP) → OPTIMAL +────────────────────────────────────────────────────────────────────── +``` + +``` +PARETO FRONTIER (queried directly from ontology) + + Level Budget DCs DC MW Revenue $/yr Upg $M Net Value + ────── ──────── ──── ────── ───────────── ──────── ─────────── + $200M $200M 4 1,000 $174,350,000 $190.0 $164,850,000 + $300M $300M 5 1,500 $279,350,000 $300.0 $264,350,000 ◄ KNEE + $400M $400M 6 1,800 $328,850,000 $385.0 $309,600,000 + $500M $500M 7 2,080 $376,450,000 $430.0 $354,950,000 + $600M $600M 8 2,330 $420,200,000 $505.0 $394,950,000 + + Per-level DC approvals (cumulative as budget increases) + + $200M: Microsoft (350) · CoreWeave (320) · Crusoe (180) · Oracle (150) ── 1,000 MW + $300M: + xAI Colossus (500) ── 1,500 MW + $400M: + Meta Bayou (300) ── 1,800 MW + $500M: + Amazon SA (280) ── 2,080 MW + $600M: + Apple iCloud (250) ── 
2,330 MW + + Never approved at any level: + Google Metroplex DC (400 MW, DFW) + Lambda Labs DFW (200 MW, DFW) + ────────────────────────────────── + DFW is physically full once xAI takes 500 MW. + + Marginal value per $M added budget + + $200 → $300M ── $995K / $M ◄ knee — xAI Colossus unlocks at $300M + $300 → $400M ── $453K / $M + $400 → $500M ── $454K / $M + $500 → $600M ── $400K / $M + +✓ DataCenterRequest.x_approve(InvestmentLevel) written back +✓ SubstationUpgrade.x_upgrade(InvestmentLevel) written back + Each variable is now a queryable property of the model — no + parsing of solver output, no per-level re-solve loops. +``` + +### Reading the solve + +- **Knee at $300M.** xAI Colossus alone delivers $105M/yr ($210K × 500 MW), the single highest-revenue request. Unlocking it requires the $110M UPG-003 upgrade at DFW. Below $300M, xAI doesn't fit; above $300M, the next-best DCs (Meta, Amazon, Apple) deliver $0.4M/$M-ish — diminishing returns. +- **DFW is the binding bottleneck.** With xAI's 500 MW absorbing the post-upgrade headroom, neither Google (400 MW) nor Lambda (200 MW) ever fit — they share DFW. The optimizer correctly prioritizes revenue per MW over queue position. +- **All compliant requests get approved by $200M.** The 2 fully compliant requests from Stage 3 (Crusoe, Oracle) sit on sub-critical substations; they're cheap upgrades and clear the bar at the lowest budget. +- **`predicted_load` propagates the chain.** Capacity slack at every substation reflects Stage 1's forecast, not raw historical load — the same signal the rules engine used. + +--- + +## Stage 5 — Interpretation + +``` +THE PLAN, IN BUSINESS TERMS + + • Recommended budget: $300M (the knee). + 5 DCs, 1,500 MW connected, $279M/yr revenue, $264M net value. + • Marginal $100M from $200M unlocks xAI Colossus alone — $105M/yr — at + $995K return per $1M spent. Beyond the knee, marginal return roughly + halves. 
+ • Google (400 MW) and Lambda (200 MW) are structurally infeasible at any + budget level — DFW is full once xAI takes its 500 MW. Either redirect + them to a different substation, or de-prioritize. + • Crusoe (Midland) and Oracle (Corpus Christi) approve cheaply at any + budget — they were the only two that passed all three rules in Stage 3. + +ONTOLOGY ENRICHMENT — what each stage contributed back + + Stage 1 (predictive) Substation.predicted_load [12] + + Stage 2 (graph) Substation.betweenness [12] + Substation.degree_centrality [12] + Substation.eigenvector_centrality [12] + Substation.grid_community [12] + Substation.is_structurally_critical [3] + + Stage 3 (rules) DataCenterRequest.fails_capacity [8] + DataCenterRequest.fails_structural [7] + DataCenterRequest.fails_low_carbon [0] + DataCenterRequest.is_compliant [2] + Substation.low_carbon_gen_mw [12] + Substation.total_gen_mw [12] + + Stage 4 (prescriptive) DataCenterRequest.x_approve [50] + SubstationUpgrade.x_upgrade [50] + + ────────────────────────────────────────────────────────────────── + Each stage reads what the previous stage wrote. + Re-running any downstream stage automatically picks up enrichments. + No glue code, no DataFrame round-trip — same ontology throughout. + ────────────────────────────────────────────────────────────────── +``` + +--- + +## The chain — accretive ontology enrichment + +``` +THE ENERGY GRID PLANNING CHAIN + + STAGE 1 PREDICTIVE + "Where is demand growing? Who breaches first?" + reads: DemandForecast.predicted_load_mw, Substation.current_load_mw + writes: Substation.predicted_load ── per substation + │ + ▼ + STAGE 2 GRAPH (WCC / Louvain / centrality) + "How is the grid connected? Which substations are structural bottlenecks?" 
+ reads: Substation nodes, TransmissionLine edges (active) + writes: Substation.betweenness / degree_centrality / eigenvector_centrality + Substation.grid_community ── 3 regions + Substation.is_structurally_critical ── 3 substations + │ + ▼ + STAGE 3 RULES (declarative Relationships) + "Which DC requests pass capacity, structural, low-carbon checks?" + reads: Substation.predicted_load ◄── Stage 1 + Substation.is_structurally_critical ◄── Stage 2 + Generator.emissions_rate, DataCenterRequest.low_carbon_requirement_pct + writes: DataCenterRequest.fails_capacity / fails_structural / fails_low_carbon + DataCenterRequest.is_compliant ── 2 requests + │ + ▼ + STAGE 4 PRESCRIPTIVE (HiGHS MIP, Scenario Concept) + "Which DCs to approve and which upgrades to fund across 5 budget levels?" + reads: Substation.predicted_load ◄── Stage 1 + SubstationUpgrade.cost_million / capacity_increase_mw + DataCenterRequest.requested_mw / annual_revenue_per_mw + InvestmentLevel.budget_cap ── 5 levels + writes: DataCenterRequest.x_approve(InvestmentLevel) + SubstationUpgrade.x_upgrade(InvestmentLevel) + │ + ▼ + Pareto frontier, + queryable directly from the ontology. + + ────────────────────────────────────────────────────────────────── + No glue. No DataFrame ping-pong. No per-level re-solve loop. + Four reasoners, one ontology, one accretive thread. + ────────────────────────────────────────────────────────────────── +``` + +--- + +## Why the chain matters (vs. 
any single stage) + +| Stage alone | What it tells you | What it doesn't | +|---|---|---| +| Predictive | "DFW will breach in 24 months" | What to do; which requests matter | +| Graph alone | "DFW, Houston, San Antonio are bottlenecks" | Whether they have headroom; what to approve | +| Rules alone | (won't fire — no `predicted_load`, no `is_structurally_critical`) | Pipeline misses | +| Prescriptive alone | (no flagged set, no critical-node info, no forecast) | Whole pipeline misses | + +| Combined | Output | +|---|---| +| Predictive → Graph | Forecasted load + structural bottleneck map | +| + Rules | Per-request compliance vs. capacity / structural / low-carbon | +| + Prescriptive | Pareto frontier across 5 budget levels in one solve | + +**Multi-reasoner chaining grounded in (and contributing to) the ontology.** + +--- + +## Data Reference + +**Substations with DC requests:** + +| Substation | Location | Capacity | DC Requests | DC MW | +|------------|----------|----------|-------------|-------| +| SUB-001 | Houston Ship Channel | 1,800 MW | Microsoft (350), Meta (300) | 650 MW | +| SUB-002 | Dallas-Fort Worth | 1,600 MW | Google (400), xAI (500), Lambda (200) | 1,100 MW | +| SUB-003 | San Antonio Metro | 1,200 MW | Amazon (280), Apple (250) | 530 MW | +| SUB-004 | Austin Energy | 900 MW | CoreWeave (320) | 320 MW | +| SUB-005 | Midland-Permian | 1,100 MW | Crusoe (180) | 180 MW | +| SUB-007 | Corpus Christi Coast | 800 MW | Oracle (150) | 150 MW | + +**ERCOT regions (Louvain):** North Texas (DFW, Austin, Waco) | West Texas (Midland, Lubbock, El Paso, Amarillo, Abilene) | Gulf Coast (Houston, San Antonio, Corpus Christi, Brownsville) + +**DFW breach:** 1,600 MW capacity, 1,700 MW predicted (24mo), +54.6% growth, 1,100 MW DC requests stacked on top. Google and Lambda permanently infeasible. + +**Upgrades:** 10 available, $630M total, 2,900 MW combined capacity. Only $300M of upgrades are needed at the knee. 
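The per-$M marginal returns quoted in the Stage 4 readout fall straight out of the frontier's net-value column. A small sketch, with the net values copied from the Pareto table:

```python
# Net value ($/yr) at each budget level ($M), copied from the Stage 4 Pareto table
NET_VALUE = {200: 164_850_000, 300: 264_350_000, 400: 309_600_000,
             500: 354_950_000, 600: 394_950_000}

def marginal_per_million(frontier):
    """Incremental net value per $1M of budget between consecutive levels."""
    levels = sorted(frontier)
    return {(lo, hi): (frontier[hi] - frontier[lo]) / (hi - lo)
            for lo, hi in zip(levels, levels[1:])}
```

The $200M→$300M step yields $995K/$M — the knee — and the $500M→$600M step yields $400K/$M, reproducing the figures above.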
+ +- **Source data**: bundled CSVs in `../data/` (12 substations, 15 generators, 18 transmission lines, 10 DC requests, 10 upgrade options, plus historical load and forecast tables). +- **Ontology**: defined in `../energy_grid_planning.py` (13 concepts). +- **Stages**: implemented in `../energy_grid_planning.py` as a single combined script with stage banners. diff --git a/v1/machine_maintenance/references/runbook.md b/v1/machine_maintenance/references/runbook.md new file mode 100644 index 0000000..98760d9 --- /dev/null +++ b/v1/machine_maintenance/references/runbook.md @@ -0,0 +1,215 @@ +# Runbook: Machine Maintenance — Multi-Reasoner Walkthrough + +Walk-through of the chained-reasoner pattern this template is built on. One realistic plant-floor thread — **scheduling preventive maintenance for a 30-machine, 3-plant operation** — traced across querying, graph, rules, and prescriptive reasoners, each stage writing properties back to the same ontology that downstream stages consume. + +The template's combined script (`machine_maintenance.py`) implements all five stages directly; this runbook expands the surrounding narrative — what each prompt asks, what shape of output to expect, and how each enrichment feeds the next — so a reader can follow the reasoning thread end-to-end without re-running the script. + +--- + +## TL;DR — the chain in one screen + +``` +Plant_B looks worst on OEE (61.4%). Plant_A looks mid-tier (68.2%). +The chain shows Plant_A is actually the highest-risk plant — and that +all 3 Turbine techs sit in one city, a $3,200 fix away from resolved. + + ───────────────────────────────────────────────────────────────── + STAGE 0 Querying ──► Machine.performance_ratio (30) + Machine.quality_ratio (30) + Machine.anomaly_count (30) + MachinePeriod.predicted_fp (120) + Plant_C 79.8% > Plant_A 68.2% > Plant_B 61.4% + 7 of 9 sensor anomalies are at Plant_A. 
+ ───────────────────────────────────────────────────────────────── + STAGE 1 Graph ──► Machine.betweenness (30) + 30 machines → 1 connected component. + Pumps tie for top centrality (24.0). + ───────────────────────────────────────────────────────────────── + STAGE 2 Rules ──► Machine.is_overdue_maintenance (6) + Machine.is_high_risk (1) + Machine.is_chronic_downtime (3) + Machine.risk_tier (30) + M013 (Pump, Plant_A) = Critical (3 of 3). + ───────────────────────────────────────────────────────────────── + STAGE 3 Prescriptive ──► MachinePeriod.x_maintain (120 binary) + MachinePeriod.x_vulnerable (120 binary) + TechnicianMachinePeriod.x_assigned + OPTIMAL · 20 jobs · 4 periods · $605,241 + ───────────────────────────────────────────────────────────────── + STAGE 4 Resilience ──► Concentration analysis on the solve + Turbine: all 3 techs in Houston_TX. + 67% of scheduled Turbine jobs travel. + Cross-train T006 (Chicago) — $3,200 / 5 wks. + ───────────────────────────────────────────────────────────────── +``` + +A single-reasoner approach can't surface this. OEE alone says Plant_B is the problem. Sensor counts alone don't quantify forward risk. Rules alone flag machines but don't allocate scarce technician time. The optimizer alone produces a feasible schedule — but doesn't know that a single weather event in Houston blocks all on-site Turbine work. + +--- + +## Setup + +See the template's main `README.md` for installation, RAI connection setup, and how to run the script. The narrative below follows the actual stage outputs of `machine_maintenance.py`. 
+ +**Prerequisites** + +- Template's `data/` CSVs available (or your own Snowflake schema with equivalent tables — `machines`, `technicians`, `qualifications`, `availability`, `parts_inventory`, `certification_expiry`, `sensors`, `sensor_readings`, `failure_predictions`, `downtime_events`, `production_runs`, `training_options`) +- `raiconfig.yaml` pointing at your RAI engine +- Python ≥ 3.10 with `relationalai >= 1.0.14` + +--- + +## Workflow + +The runbook walks the same chain stage-by-stage, prompt-by-prompt, in agent-skill order. Each row maps to a section of the script. + +| # | Step | Skill | Prompt | Expected Output | +|---|------|-------|--------|-----------------| +| 1 | Build ontology | `/rai-build-starter-ontology` | "Build a RAI ontology for a manufacturing maintenance scheduling problem from the CSVs in `data/`. Concepts: Machine, Technician, Qualification, Period, MachinePeriod, TechnicianPeriod, TechnicianMachinePeriod, PartsInventory, CertificationExpiry, Sensor, SensorReading, FailurePrediction, DowntimeEvent, ProductionRun." | Model `machine_maintenance` with 14 user-facing concepts. 30 Machine rows (3 plants × 10 machines, 5 types × 6). 10 Technician rows (4 Chicago_IL, 3 Houston_TX, 3 Phoenix_AZ). 16 Qualification rows. 4 Period rows. | +| 2 | Discovery | `/rai-discovery` | "What questions can we answer with this ontology? We want to schedule preventive maintenance and surface hidden operational risk." | Querying: OEE by facility, anomaly counts, failure trajectory deltas. Graph: machine dependency clusters, bottleneck centrality on shared-technician edges. Rules: overdue, high-risk, chronic-downtime flags chained to a composite risk tier. Prescriptive: maintain × period × technician assignment minimizing failure + labor + travel cost. Resilience: post-solve concentration analysis. | +| 3 | Stage 0 — OEE proxy | `/rai-querying` | "Compute OEE proxy (Performance × Quality) by facility. 
Performance is total_actual / total_planned across ProductionRuns; Quality is total_good / total_actual." | Plant_C: Perf 81.3%, Qual 98.1%, OEE 79.8%. Plant_A: Perf 69.8%, Qual 97.8%, OEE 68.2%. Plant_B: Perf 62.6%, Qual 98.1%, OEE 61.4%. Quality is uniform; Performance is the differentiator. | +| 4 | Stage 0 — Sensor anomalies | `/rai-querying` | "List machines with above-threshold sensor readings (`SensorReading.is_anomaly == 1`), grouped by facility." | 9 anomaly readings across 5 machines. Plant_A: 7 (M013 Pump:3, M001 Turbine:2, M016 Turbine:2). Plant_B: 1 (M002 Compressor). Plant_C: 1 (M006 Turbine). Plant_A's anomaly load is 7× Plant_B's despite Plant_A's higher OEE. | +| 5 | Stage 0 — Failure trajectories | `/rai-querying` | "For each machine, compute the failure-probability delta from period 1 to period 4 from `FailurePrediction`. Show the steepest 6." | M001 (Turbine, Plant_A): 0.102 → 0.332 (+0.230, bearing_wear). M013 (Pump, Plant_A): 0.435 → 0.663 (+0.228, impeller_erosion). M016 (Turbine, Plant_A): 0.263 → 0.482 (+0.219, bearing_wear). All three steepest curves are Plant_A. Stored back as `MachinePeriod.predicted_fp` (120 rows) for Stage 3's objective. | +| 6 | Stage 1 — Dependency graph | `/rai-graph-analysis` | "Build a graph with `Machine` as `node_concept`. Two machines are adjacent when at least one technician is qualified for both machine types. Run weakly connected components." | 30 nodes, edges joined via `Qualification`. WCC: 1 cluster of 30 — every machine is reachable from every other through shared qualifications. No isolated subgraphs. | +| 7 | Stage 1 — Bottleneck centrality | `/rai-graph-analysis` | "Compute betweenness centrality on the dependency graph. Normalize and store as `Machine.betweenness`." | Pump-type machines tie at the top (raw betweenness 24.0 → normalized 1.0): M003 (Plant_C), M008 (Plant_B), M013 (Plant_A). Turbines, Generators, Motors, Compressors lower. 
`Machine.betweenness` written back for all 30 machines and consumed by Stage 3's failure-cost multiplier. |
+| 8 | Stage 2 — Compliance flags | `/rai-rules-authoring` | "Define six derived flags: overdue (`remaining_useful_life < maintenance_duration_hours`), high-risk (`failure_probability > 0.3 AND criticality >= 4`), anomalous (`anomaly_count > 0`), chronic-downtime (`downtime_event_count > 8`), parts-reorder (`stock_level <= min_order_qty`), expiring-cert (`days_remaining < 30`). Use `model.where(...).define(...)`." | Overdue (6): M002, M006, M013, M016, M022, M025 (RUL below required maintenance hours). High-risk (1): M013 (fp=0.435, crit=4). Anomalous (5): M013, M001, M016, M002, M006. Chronic downtime (3, threshold > 8 events): M001 (12 events), M016 (11), M013 (10). Parts reorder (4): P001, P003, P004, P006. Expiring certs (5): T001 Compressor 22d, T004 Pump 8d, T003 Compressor 15d, T006 Motor 25d, T009 Motor 12d. |
+| 9 | Stage 2 — Composite risk tier | `/rai-rules-authoring` | "Chain `is_chronic_downtime`, `is_high_risk`, `is_overdue_maintenance` into `Machine.risk_tier`: Critical if all three, Elevated if exactly two, Standard otherwise. Enumerate all 8 combinations using `model.not_()` for negation." | Critical (1): M013 (Pump, Plant_A) — chronic + high-risk + overdue. Elevated (1): M016 (Turbine, Plant_A) — chronic + overdue, not high-risk. Standard (28): rest. Plant_A holds the only Critical and the only Elevated machine despite ranking second on OEE. |
+| 10 | Stage 3 — Formulation | `/rai-prescriptive-problem-formulation` | "Formulate the maintenance schedule. Decision variables: `MachinePeriod.x_maintain` (bin), `MachinePeriod.x_vulnerable` (bin), `TechnicianMachinePeriod.x_assigned` (bin) — restricted to qualified pairs.
Constraints: cumulative coverage (`Σ x_maintain[m,1..τ] + x_vulnerable[m,τ] = 1` per machine and period), assignment-maintenance linkage (`Σ x_assigned over techs = x_maintain` per (m,τ)), technician hour capacity (`Σ x_assigned · duration ≤ available_hours`), parts/bay capacity (`≤ 5 jobs per period`), and overdue deadline (`Σ x_maintain[m, τ ≤ 2] ≥ 1` for every overdue machine — feeds from Stage 2). Objective: minimize failure_cost + labor_cost + travel_cost." | 120 `x_maintain` binaries (30 machines × 4 periods). 120 `x_vulnerable` binaries. ~250 `x_assigned` binaries (qualification-restricted). 5 constraint families. Failure cost uses `MachinePeriod.predicted_fp` (Stage 0) × `Machine.criticality` × `(1 + 2.0 × Machine.betweenness)` (Stage 1). |
+| 11 | Stage 3 — Solve | `/rai-prescriptive-solver-management` | "Solve with HiGHS, time limit 120s, assert OPTIMAL." | OPTIMAL. Objective = $605,240.61. 20 maintenance jobs scheduled across 4 periods (capacity-binding at 5 jobs/period). All six overdue machines (including Plant_A's M013 and M016) maintained by period 2 — overdue constraint satisfied. |
+| 12 | Stage 3 — Schedule readout | `/rai-prescriptive-results-interpretation` | "Show the period-by-period schedule and technician assignments. Flag any travel (`base_location != machine.location`)." | Period 1 includes M002 (Plant_B), M006 (Plant_C), M013 (Plant_A), M016 (Plant_A) — high-priority overdue/critical machines. Periods 2–4 cover the remaining 16 jobs. Multiple Turbine assignments require travel because all 3 Turbine-qualified techs (T001, T002, T003) are based in Houston_TX while Turbines exist at all 3 plants. Travel cost is paid at $50/hr × duration. |
+| 13 | Stage 4 — Concentration analysis | `/rai-graph-analysis`, `/rai-querying` | "From the qualification table, find machine types whose qualified technicians are all in one location. For each concentrated type, count how many scheduled jobs require travel."
| Compressor: techs in Chicago_IL, Houston_TX (gap: Phoenix_AZ). Generator: Chicago_IL, Phoenix_AZ (gap: Houston_TX). Motor: Chicago_IL, Phoenix_AZ (gap: Houston_TX). Pump: Chicago_IL, Phoenix_AZ (gap: Houston_TX). **Turbine: all 3 techs in Houston_TX — CONCENTRATED.** Of 3 scheduled Turbine jobs, 2 require travel (67%). 4 of 6 Turbines are at remote plants. The optimizer found the cheapest plan but cannot fix the structural fragility — losing T001's Compressor cert (22 days remaining) doesn't break Turbines, but losing any of T001/T002/T003 from Houston shrinks Turbine coverage by a third. | +| 14 | Stage 4 — Cross-training recommendation | `/rai-prescriptive-results-interpretation` | "From `training_options.csv`, find the cheapest Turbine-cross-training candidate based outside Houston_TX." | Best candidate: **T006 (Fiona_Garcia, Senior, Chicago_IL) — $3,200 / 5 weeks.** Other non-Houston options: T005 ($3,500/6w, Chicago), T008 ($3,800/6w, Phoenix), T009 ($4,200/8w, Phoenix), T004 ($5,500/10w, Chicago). Training T006 adds the first non-Houston Turbine tech, eliminates the single-point-of-failure for Plant_B and Plant_C Turbines, and pays back the first time travel or a cert lapse would have idled a Turbine job. The prescriptive reasoner produced the schedule; the resilience layer produced the structural action item. 
| + +--- + +## Stage outputs — what each reasoner contributes back + +``` +ONTOLOGY ENRICHMENT — what each stage wrote back + + Stage 0 (querying) Machine.total_planned_qty [30] + Machine.total_actual_qty [30] + Machine.total_good_qty [30] + Machine.performance_ratio [30] + Machine.quality_ratio [30] + Machine.total_downtime_minutes [30] + Machine.downtime_event_count [30] + Machine.anomaly_count [30] + MachinePeriod.predicted_fp [120] + + Stage 1 (graph) Machine.betweenness_raw [30] + Machine.betweenness (normalized) [30] + + Stage 2 (rules) Machine.is_overdue_maintenance [6] + Machine.is_high_risk [1] + Machine.is_anomalous [5] + Machine.is_chronic_downtime [3] + Machine.risk_tier [30] + PartsInventory.needs_reorder [4] + CertificationExpiry.is_expiring [5] + + Stage 3 (prescriptive) MachinePeriod.x_maintain [120 binary] + MachinePeriod.x_vulnerable [120 binary] + TechnicianMachinePeriod.x_assigned + + Stage 4 (analysis) (terminal — prints concentration risk and + costed cross-training recommendation) + + ────────────────────────────────────────────────────────────────── + Each stage reads what the previous stage wrote. + Re-running any downstream stage automatically picks up enrichments. + No glue code, no DataFrame round-trip — same ontology throughout. + ────────────────────────────────────────────────────────────────── +``` + +--- + +## The chain — accretive ontology enrichment + +``` +THE MACHINE-MAINTENANCE CHAIN + + STAGE 0 QUERYING + "Where does the operation actually hurt?" + reads: ProductionRun, SensorReading, FailurePrediction + writes: Machine.performance_ratio / quality_ratio + Machine.anomaly_count + Machine.downtime_event_count / total_downtime_minutes + MachinePeriod.predicted_fp + │ + ▼ + STAGE 1 GRAPH (betweenness centrality) + "Which machines are scheduling bottlenecks?" + reads: Qualification, Machine (as node_concept) + writes: Machine.betweenness ── normalized 0..1 + │ + ▼ + STAGE 2 RULES + "Which machines violate which compliance flags?" 
+ reads: Machine.failure_probability, Machine.criticality, + Machine.remaining_useful_life, Machine.maintenance_duration_hours, + Machine.anomaly_count, Machine.downtime_event_count, + PartsInventory.stock_level, CertificationExpiry.days_remaining + writes: Machine.is_overdue_maintenance / is_high_risk / + is_chronic_downtime / is_anomalous + Machine.risk_tier ── Critical / Elevated / Standard + PartsInventory.needs_reorder + CertificationExpiry.is_expiring + │ + ▼ + STAGE 3 PRESCRIPTIVE (HiGHS MIP) + "What's the optimal maintain-and-assign plan?" + reads: MachinePeriod.predicted_fp ──► failure cost (period-specific) + Machine.betweenness ──► failure cost multiplier + Machine.is_overdue_maintenance ──► hard deadline constraint + Qualification (assignment scope), TechnicianPeriod.capacity_hours + writes: MachinePeriod.x_maintain ── 20 jobs flagged + MachinePeriod.x_vulnerable + TechnicianMachinePeriod.x_assigned + │ + ▼ + STAGE 4 RESILIENCE + "Where is the schedule structurally fragile?" + reads: Solution variables, Qualification, Technician, TrainingOption + writes: (terminal — concentration risk + cross-training recommendation) + + ────────────────────────────────────────────────────────────────── + No glue. No DataFrame ping-pong. No re-derivation per-reasoner. + Five reasoners, one ontology, one accretive thread. + ────────────────────────────────────────────────────────────────── +``` + +--- + +## Why the chain matters (vs. 
any single stage) + +| Stage alone | What it tells you | What it doesn't | +|---|---|---| +| Querying | "Plant_B has worst OEE; Plant_A has the most anomalies" | Whether anomalies translate to scheduling priority | +| Graph alone | "Pumps are the most central machine type" | Which Pump matters most or what to do | +| Rules alone | "M013 is Critical-tier" | How to fit M013 plus 19 others into a 4-period horizon with 10 techs | +| Prescriptive alone | (won't run — no per-period failure curve, no centrality weight, no overdue flag) | Whole pipeline misses | +| Resilience alone | (won't run — no schedule to analyze) | — | + +| Combined | Output | +|---|---| +| Querying → Graph | Per-machine signals plus structural bottleneck weight | +| + Rules | Composite risk tier + a hard "must maintain by period 2" constraint | +| + Prescriptive | $605K plan, 20 jobs, Plant_A's Critical machine handled in Period 1 | +| + Resilience | Cross-train T006 for $3,200 → eliminates Houston Turbine concentration | + +**Multi-reasoner chaining grounded in (and contributing to) the ontology.** + +--- + +## Optional extension — operator-shift assignment + +A second optimization pass — assigning operators to line-shifts to maximize a skill-match bonus, subject to a same-facility constraint — is a natural follow-on but not part of this template's main script. It would consume the same `Machine` and facility ontology and add `Operator`, `Shift`, and `OperatorShift` concepts. Out of scope for this runbook. + +--- + +## Data Reference + +- **Source data**: bundled CSVs in `../data/` (30 machines across 3 plants × 5 machine types, 10 technicians across 3 cities, 16 qualifications, 4 planning periods, 60 sensors with 240 readings, 120 per-period failure predictions, 129 downtime events, 120 production runs, 13 cross-training options). 
To run against your own Snowflake schema instead, swap the `read_csv(...)` loads for `model.Table(...)` references in `machine_maintenance.py`; the rest of the pipeline is unchanged. +- **Stages**: implemented in `../machine_maintenance.py` as a single combined script with stage banners (Stage 0 → Stage 4). +- **Ontology**: 14 user-facing concepts. Run `inspect.schema(model)` after the pipeline (see template README) to dump the full concept/property/relationship surface, filtering out reasoner-owned concepts (`Variable`, `Constraint`, etc.) and the auto-generated `graph_Edge` from Stage 1. diff --git a/v1/supply_chain_resilience/references/runbook.md b/v1/supply_chain_resilience/references/runbook.md new file mode 100644 index 0000000..0fac804 --- /dev/null +++ b/v1/supply_chain_resilience/references/runbook.md @@ -0,0 +1,359 @@ +# Runbook: Supply Chain Resilience — Multi-Reasoner Walkthrough + +Walk-through of the chained-reasoner pattern this template is built on. One realistic business thread — **risk-adjusted network flow with disruption scenarios** — traced across four RAI reasoning stages, each writing properties back to the same ontology that downstream stages consume. + +The template's combined script (`supply_chain_resilience.py`) implements all four stages directly. This runbook expands the surrounding narrative — what each stage finds, why the next stage needs it, what the optimizer does with the enrichment — so a non-OR reader can follow the full reasoning thread end-to-end. + +--- + +## TL;DR — the chain in one screen + +``` +Two HIGH-priority customers depend on 6 upstream suppliers — one of which +(PowerCell, B003) is flagged "watch" by rules. The chain produces a +$1,865 baseline plan, then quantifies disruption: top hub offline = +88.5%, +watch->avoid downgrade = +0.0% (optimizer already routed around it). 
+ + ───────────────────────────────────────────────────────────────── + STAGE 0 Reachability ──► Business.is_high_priority_customer (2) + Upstream supplier dependency map for + each HIGH-priority customer (B008, B009). + ───────────────────────────────────────────────────────────────── + STAGE 1 Graph ──► Site.centrality (normalized) + Top hubs: S004 TechAssembly 1.000, + S006 West Coast DC 0.776, S003 PowerCell 0.735. + 2 weakly-connected components. + ───────────────────────────────────────────────────────────────── + STAGE 2 Rules ──► Business.is_unreliable (1) + Business.has_high_delay_risk (2) + Business.is_watch_level (2) + Demand.is_escalated (9) + [X] B017 avoid · [!] B003 watch + 37 of 262 shipments late (14%). + ───────────────────────────────────────────────────────────────── + STAGE 3 Prescriptive ──► Operation.x_flow / Demand.x_unmet + OPTIMAL · $1,865 · 8 active flows · 0 unmet + + 2 scenario re-solves (S004 offline, watch->avoid) + ───────────────────────────────────────────────────────────────── +``` + +A single-reasoner approach can't answer this. Reachability alone names the suppliers in scope but doesn't rank them. Graph alone ranks hubs but doesn't decide flow. Rules alone classifies suppliers but doesn't route around them. Prescriptive alone has no way to hard-block bad suppliers, surcharge watch suppliers, or penalize bottleneck hubs without those upstream signals. + +--- + +## Setup + +See the template's main `README.md` for installation, RAI connection setup, and how to run the script. The narrative below follows the actual stage outputs of `supply_chain_resilience.py` against the bundled CSVs in `../data/`. + +--- + +## Stage 0 — Reachability: blast-radius pre-analysis + + +**Construction** — directed `Business` graph, edges from `Business.ships_to` (derived from `Shipment.supplier` -> `Shipment.customer`). + +**Targets** — `Business.is_high_priority_customer` is set wherever a `Demand` with `priority == "HIGH"` is placed by that business. 
From the bundled data, this fires for **2 buyers**: B008 MegaCorp Enterprise and B009 TechGiant Inc (9 HIGH-priority demands between them, all for ProPhone X1 / ProTab T1). + +**Algorithm** — `biz_graph.reachable(to=target_customer)` filtered to nodes with `business_type == "SUPPLIER"`. + +``` +Upstream supplier dependencies (HIGH-priority customers) + + MegaCorp Enterprise (B008) depends on 6 suppliers: + - CellChem China (reliability 78%) ← will become AVOID + - EuroCell Poland (reliability 88%) + - EuroChip Germany (reliability 94%) + - EuroDisplay Czech (reliability 91%) + - GlassCorp Korea (reliability 89%) + - WaferTech Taiwan (reliability 97%) + + TechGiant Inc (B009) depends on 6 suppliers: (same set) + + ────────────────────────────────────────────────────────────────── + Both HIGH-priority customers share the same 6-supplier upstream. + CellChem (B017) sits in BOTH dependency cones — Stage 2 will flag + it AVOID, and Stage 3's baseline already excludes it. + ────────────────────────────────────────────────────────────────── + +✓ Business.is_high_priority_customer written back (2 buyers) +``` + +The point of running reachability before the MILP: when the scenario in Stage 3 downgrades watch suppliers to avoid, the cost delta has to be read against this dependency map. If a downgraded supplier sits in zero customer cones, the optimizer just shrugs and the cost stays flat — which is exactly what happens here for the watch-tier supplier B003. + +--- + +## Stage 1 — Graph: site centrality + connected components + + +**Construction:** +- Node concept: `Site` (31 sites) +- Edges: built from `Operation` rows where `op_type == "SHIP"` (undirected, unweighted) +- Aggregator: `"sum"` (collapse parallel ship lanes between the same pair) + +**Algorithms:** `weakly_connected_component()` for cluster discovery, then `eigenvector_centrality()` for hub importance. 
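To make the centrality call concrete, here is a plain-Python sketch, not the RAI graph reasoner's implementation: the lane pairs and the damped (A + I) power iteration are illustrative assumptions, and only the max-normalization to [0, 1] mirrors what Stage 1 writes back as `Site.centrality`.

```python
from collections import defaultdict

def eigenvector_centrality(edges, iters=100):
    """Damped power iteration (A + I) on an undirected graph of (u, v) pairs.
    Scores are max-normalized to [0, 1], like Site.centrality in Stage 1."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    score = {n: 1.0 for n in adj}
    for _ in range(iters):
        # (A + I) update: converges even on bipartite structures
        nxt = {n: score[n] + sum(score[m] for m in adj[n]) for n in adj}
        top = max(nxt.values())
        score = {n: s / top for n, s in nxt.items()}
    return score

# Toy hub-and-spoke echoing the template's IDs: three factories feed S004,
# which feeds the S006 DC. Edges are invented for illustration.
lanes = [("S001", "S004"), ("S002", "S004"), ("S003", "S004"), ("S004", "S006")]
scores = eigenvector_centrality(lanes)
# The convergence point S004 max-normalizes to 1.0; every degree-1
# neighbor of the hub settles at 0.5.
```

`weakly_connected_component()` is then just connected components on the same adjacency (a BFS from each unvisited node), which is how the 25-site and 6-site clusters below fall out.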
+ +``` +Connected components: 2 + + Component A: 25 sites (APAC + AMERICAS, joined by S004->S006 long-haul) + Component B: 6 sites (EMEA distribution loop) + +Top critical sites — eigenvector centrality (FACTORY/DC only) + + S004 TechAssembly Factory (FACTORY, APAC) 0.5016 ████████████ ★ central hub + S006 West Coast DC (DC, AMERICAS) 0.3895 █████████ + S003 PowerCell Facility (FACTORY, APAC) 0.3688 █████████ ⚠ also Stage 2 watch + S002 DisplayCorp Plant (FACTORY, APAC) 0.3145 ████████ + S001 ChipTech Factory (FACTORY, APAC) 0.3145 ████████ + S012 SiliconWorks Factory (FACTORY, APAC) 0.2456 ██████ + S013 ScreenTech Plant (FACTORY, APAC) 0.2456 ██████ + S014 EnergyPlus Facility (FACTORY, APAC) 0.2280 ██████ + + ────────────────────────────────────────────────────────────────── + S004 TechAssembly is the convergence point: every APAC component + factory ships into it before finished goods radiate to DCs. + S003 PowerCell shows up here AND in Stage 2 — structural and + behavioural risk overlap on the same supplier. + ────────────────────────────────────────────────────────────────── + +✓ Site.centrality written back, normalized to [0, 1] + (S004 = 1.000, S006 = 0.776, S003 = 0.735, ...) 
+``` + +--- + +## Stage 2 — Rules: supplier risk classification + + +**Late-shipment context** (computed in pandas, not RAI): + +``` +Late shipments: 37 of 262 (14%) + + B006 West Coast DC 7 late ████████ + B007 East Coast DC 5 late ██████ + B004 TechAssembly 4 late █████ + B022 EMEA DC Central 3 late ████ + B003 PowerCell 2 late ███ ← also high predicted Q1 + B017 CellChem 2 late ███ ← also low reliability +``` + +**Properties added to the ontology** (via `model.where(...).define(...)`): + +```python +# Rule 1 — reliability gate +m.where(Business.reliability_score < 0.80).define(Business.is_unreliable()) + +# Rule 2 — ML delay-risk gate (Q1-2025 GNN predictions) +m.where( + DelayPrediction.supplier_business(Business), + DelayPrediction.fiscal_quarter == "Q1-2025", + DelayPrediction.predicted_delay_prob > 0.15, +).define(Business.has_high_delay_risk()) + +# Rule 3 — union into watch level (rule chaining) +m.where(Business.is_unreliable()).define(Business.is_watch_level()) +m.where(Business.has_high_delay_risk()).define(Business.is_watch_level()) + +# Rule 4 — escalate HIGH-priority demands +m.where(Demand.priority == "HIGH").define(Demand.is_escalated()) +``` + +**Resulting classification** (Python combines the two RAI flags into avoid/watch/reliable): + +``` +Supplier risk classification + + [X] B017 CellChem China reliability 0.78 Q1 delay 0.22 ── AVOID + [!] 
B003 PowerCell Ltd reliability 0.81 Q1 delay 0.28 ── WATCH + [ ] B005 GlobalBuild Inc reliability 0.85 ── reliable + [ ] B014 EnergyPlus India reliability 0.85 ── reliable + [ ] B024 EuroAssembly Pol reliability 0.87 ── reliable + [ ] B020 EuroCell Poland reliability 0.88 ── reliable + [ ] B002 DisplayCorp reliability 0.88 ── reliable + [ ] B016 GlassCorp Korea reliability 0.89 ── reliable + [ ] B004 TechAssembly Co reliability 0.90 ── reliable + [ ] B019 EuroDisplay Czech reliability 0.91 ── reliable + [ ] B012 SiliconWorks Corp reliability 0.91 ── reliable + [ ] B013 ScreenTech Japan reliability 0.93 ── reliable + [ ] B018 EuroChip Germany reliability 0.94 ── reliable + [ ] B001 ChipTech Industries reliability 0.95 ── reliable + [ ] B015 WaferTech Taiwan reliability 0.97 ── reliable + + AVOID = is_unreliable AND has_high_delay_risk (both flags) + WATCH = is_unreliable OR has_high_delay_risk (one flag) + +Escalated demands (HIGH priority): 9 ── all from B008 / B009 + +✓ Business.is_unreliable [1 supplier: B017] +✓ Business.has_high_delay_risk [2 suppliers: B003, B017] +✓ Business.is_watch_level [2 suppliers: B003, B017] +✓ Demand.is_escalated [9 demands] +``` + +Stage 3 reads `is_watch_level` for the surcharge term and `is_unreliable AND has_high_delay_risk` (collapsed to the `avoid` set) for the hard block. + +--- + +## Stage 3 — Prescriptive: risk-adjusted minimum-cost flow + + +``` +FORMULATION + + Decision variables + Operation.x_flow (continuous, 70 ops, 0 ≤ x ≤ capacity_per_day) + Demand.x_unmet (continuous slack, 20 demands, ≥ 0) + + Constraints + 1. Demand satisfaction + Σ x_flow into customer-site for the demanded SKU + x_unmet ≥ quantity + 2. Avoid suppliers blocked + For every operation sourced from B017 (CellChem): x_flow == 0 + 3. 
(Scenario only) Site offline / extra blocks
+
+  Objective (minimize)
+    Σ Operation.cost_per_unit · x_flow                  ── transport
+    + RISK_SURCHARGE · Σ x_flow on watch-supplier ops   ── Stage 2
+    + CENTRALITY_WEIGHT · Σ x_flow · Site.centrality    ── Stage 1
+    + UNMET_PENALTY · Σ x_unmet                         ── slack
+
+  Tunables: UNMET_PENALTY=100, RISK_SURCHARGE=5, CENTRALITY_WEIGHT=2,
+            DELAY_PROB_THRESHOLD=0.15, RELIABILITY_THRESHOLD=0.80,
+            PREDICTION_QUARTER="Q1-2025"
+
+──────────────────────────────────────────────────────────────────────
+SOLVE (HiGHS)  →  OPTIMAL   8 active flows   $1,865.00   0 unmet
+──────────────────────────────────────────────────────────────────────
+
+✓ Operation.x_flow / Demand.x_unmet written back as model properties.
+```
+
+What the baseline buys: enough finished-goods flow on the shortest cost-weighted lanes to cover all 20 demand orders in full. CellChem's operations (B017) are hard-blocked. PowerCell (B003) operations carry a +5/unit surcharge — the optimizer accepts a small amount of B003-sourced flow only when no cheaper non-watch alternative exists.
+
+### Reading the solve
+
+- **8 active flows from 70 candidate operations** — the network is sparse at optimum; most capacity is idle.
+- **$1,865 total cost** — pure transport without the risk terms would run ~$1,500; the centrality and watch-surcharge terms add the remaining ~$365.
+- **Zero unmet demand** — capacity is plentiful, so the slack term is inactive at baseline.
+
+---
+
+## Scenario analysis — quantify disruption
+
+
+The same `solve_flow(...)` function re-runs with modified constraints.
Two scenarios surface different aspects of the chain's value: + +``` +SCENARIO COMPARISON + + Scenario Status Cost Δ vs baseline Unmet + ──────────────────────── ──────── ────────── ────────────── ───── + Baseline OPTIMAL $1,865.00 — 0 + Site S004 offline OPTIMAL $3,515.00 +88.5% 0 + Watch->Avoid OPTIMAL $1,865.00 +0.0% 0 +``` + +**Scenario A — top-centrality site offline (S004 TechAssembly).** The optimizer reroutes finished goods through S005 GlobalBuild Plant (Mexico) and longer EMEA lanes; cost jumps 88.5% but all demand is still covered. This is the **structural-risk** signal: losing the highest-centrality node forces expensive secondary routing. + +**Scenario B — downgrade all watch suppliers to avoid.** Adds B003 PowerCell to the hard-block set. **Cost is unchanged.** Why? B003 wasn't on any optimal lane — the centrality penalty + risk surcharge already discouraged the optimizer from routing through it at baseline. **This asymmetry is the punchline:** structural risk (Stage 1) costs 88.5% to disrupt; behavioural risk on already-deprioritized suppliers (Stage 2) costs 0% to harden against. The chain reveals which mitigations actually move the needle. + +(Cross-check the Stage 0 dependency map: B003 wasn't in either HIGH-priority customer's *direct* upstream — it ships components to manufacturers, who then route via the optimizer's preferred S004/S005 corridor. Reachability surfaced the supplier; centrality + objective weights ensured baseline never relied on it.) + +--- + +## The chain — accretive ontology enrichment + +``` +THE SUPPLY-CHAIN-RESILIENCE CHAIN + + STAGE 0 REACHABILITY (directed Business graph) + "Which suppliers do my high-priority customers transitively depend on?" 
+ reads: Shipment.supplier / .customer ──► Business.ships_to (derived) + Demand.priority == "HIGH" ──► Business.is_high_priority_customer + writes: Business.is_high_priority_customer ── 2 buyers + │ + ▼ + STAGE 1 GRAPH (eigenvector centrality, WCC) + "Which sites are network bottlenecks?" + reads: Operation (op_type == "SHIP"), Site + writes: Site.centrality ── normalized [0,1] per site + │ + ▼ + STAGE 2 RULES (chained derivations) + "Which suppliers are risky, and which demands are escalated?" + reads: Business.reliability_score, DelayPrediction, Demand.priority + writes: Business.is_unreliable ── 1 supplier + Business.has_high_delay_risk ── 2 suppliers + Business.is_watch_level ── 2 suppliers + Demand.is_escalated ── 9 demands + │ + ▼ + STAGE 3 PRESCRIPTIVE (HiGHS LP) + "What's the minimum-cost flow plan that respects all of the above?" + reads: Site.centrality ──► objective coefficient (penalty) + Business.is_watch_level ──► objective coefficient (surcharge) + {avoid suppliers} ──► hard block (x_flow == 0) + Operation cost / capacity / SKU, Demand quantity / SKU / business + writes: Operation.x_flow ── 8 active flows + Demand.x_unmet ── 0 across all 20 demands + │ + ▼ + Re-solve per scenario (S004 offline, watch->avoid) + → cost-of-disruption table + + ────────────────────────────────────────────────────────────────── + No glue. No DataFrame ping-pong. No re-derivation per-reasoner. + Four stages, one ontology, one accretive thread. + ────────────────────────────────────────────────────────────────── +``` + +--- + +## Why the chain matters (vs. 
any single stage) + +| Stage alone | What it tells you | What it doesn't | +|---|---|---| +| Reachability alone | "These 6 suppliers feed my critical customers" | Which are risky; which the optimizer would have used anyway | +| Graph alone | "S004 is the central hub" | Whether losing it is recoverable; at what cost | +| Rules alone | "B017 avoid, B003 watch" | Whether routing actually depends on them | +| Prescriptive alone | (degenerate — no risk filter, no bottleneck weight) | Picks cheapest lanes regardless of supplier risk or hub fragility | + +| Combined | Output | +|---|---| +| Reachability -> Graph | Customer-supplier dependency map + structural hub ranking | +| + Rules | Per-supplier risk class (avoid / watch / reliable) on top of the dependency map | +| + Prescriptive | Risk-adjusted min-cost flow ($1,865) + scenario deltas (+88.5% on S004 offline, +0% on watch->avoid) | + +**Multi-reasoner chaining grounded in (and contributing to) the ontology.** + +--- + +## Optional extension — predictive forecasting + +The bundled `delay_prediction.csv` is treated as already-trained ML output (a quarterly per-supplier delay probability table, `model_version='gnn_v2.0'`). A natural extension to the template is to replace the static CSV with a live GNN that retrains on `Shipment.delay_days` history. Skill: `/rai-predictive-modeling` + `/rai-predictive-training`. Reference templates with end-to-end GNN training: `templates/v1/fraud-detection`, `templates/v1/retail_planning`. The downstream stages (rules + prescriptive) wouldn't change — they read `DelayPrediction.predicted_delay_prob` either way. + +--- + +## Agent prompt sequence — recreate this template skill-by-skill + +Each row is a single agent prompt. Skills are loaded in order; each writes properties the next stage reads. 
+ +| # | Skill | Prompt | What it produces | +|---|-------|--------|------------------| +| 1 | `/rai-build-starter-ontology` | "Build a starter ontology for a supply-chain dataset with 7 CSVs: site, business, operation, sku, demand, shipment, delay_prediction. Use Site/Business/Operation/SKU/Demand/Shipment/DelayPrediction as concepts. Render the result as an ASCII concept-relationship diagram." | Concepts, properties, relationships matching `supply_chain_resilience.py` lines 57–250 | +| 2 | `/rai-discovery` | "Given this ontology, what questions can each reasoner family answer? Group by graph / rules / prescriptive." | A reasoner-routing plan that covers Stages 0–3 below | +| 3 | `/rai-graph-analysis` | "Build a directed Business graph from Shipment.supplier->Shipment.customer. Run upstream reachability from every Business with a HIGH-priority demand. List the suppliers each high-priority customer transitively depends on." | Stage 0 — `Business.is_high_priority_customer`, blast-radius dependency map | +| 4 | `/rai-graph-analysis` | "Build an undirected Site graph from Operation rows where op_type == 'SHIP'. Compute weakly-connected components, then eigenvector centrality (filter to FACTORY/DC). Normalize and write the score back as Site.centrality." | Stage 1 — `Site.centrality` | +| 5 | `/rai-rules-authoring` | "Define three derived relationships on Business: is_unreliable (reliability_score < 0.80), has_high_delay_risk (any DelayPrediction for Q1-2025 with predicted_delay_prob > 0.15), and is_watch_level (union of the two). Also flag Demand.is_escalated for HIGH priority. Print the avoid (both flags) / watch (one flag) / reliable classification." | Stage 2 — risk flags, escalation flag | +| 6 | `/rai-prescriptive-problem-formulation` | "Formulate a minimum-cost network flow on Operation.x_flow with Demand.x_unmet slack. Constraint: inbound flow at customer site for demanded SKU + slack >= quantity. Hard-block operations sourced from avoid-tier businesses. 
Objective: transport cost + RISK_SURCHARGE * watch-supplier flow + CENTRALITY_WEIGHT * Σ flow · Site.centrality + UNMET_PENALTY * Σ unmet. Describe the formulation before solving — decision variables, constraints, objective, tunables." | Stage 3 formulation | +| 7 | `/rai-prescriptive-solver-management` | "Solve with HiGHS, time limit 120s. Report status, objective value, count of active flows, and total unmet demand." | Baseline solve — OPTIMAL, $1,865, 8 flows, 0 unmet | +| 8 | `/rai-prescriptive-results-interpretation` | "Re-solve two scenarios side-by-side: (a) top-centrality site offline (S004 TechAssembly), (b) all watch-level suppliers downgraded to avoid. Show cost delta vs baseline as a table. Explain why the deltas are asymmetric." | Scenario table + interpretation (S004 offline = +88.5%, watch->avoid = +0%) | + +--- + +## Data Reference + +- **Source data**: bundled CSVs in `../data/` — 31 sites across APAC / AMERICAS / EMEA, 31 businesses (6 suppliers, 6 component manufacturers, 2 manufacturers, 8 warehouses, 9 buyers), 9 SKUs (raw materials -> components -> finished goods ProPhone X1 / ProTab T1), 70 operations (SHIP + TRANSFER), 20 demand orders (9 HIGH, 5 MEDIUM, 6 LOW), 262 historical shipments (37 late), 36 quarterly delay predictions (4 quarters × 9 suppliers). To run against your own Snowflake schema instead, swap `read_csv(...)` for typed `model.Table(...)` loads against the equivalent table set. +- **Ontology**: defined inline in `../supply_chain_resilience.py` (lines 57–250) — 7 concepts plus the derived `Business.ships_to` and `Operation.source_business` relationships. +- **Stages**: implemented in `../supply_chain_resilience.py` as a single combined script with stage banners (`STAGE 0` through `STAGE 3` plus `SCENARIO ANALYSIS`). 
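For intuition on how the Stage 3 objective trades off its terms, here is a deliberately tiny plain-Python sketch. It is not the template's HiGHS model: with a single demand and a single commodity, the LP collapses to filling lanes in order of effective unit cost, and the lane dicts below are invented for illustration; only the tunables (RISK_SURCHARGE=5, CENTRALITY_WEIGHT=2, UNMET_PENALTY=100) come from the formulation.

```python
def plan_flow(lanes, demand, risk_surcharge=5, centrality_weight=2, unmet_penalty=100):
    """Exact solution of the single-demand, single-commodity relaxation:
    greedily fill the cheapest admissible lanes by effective unit cost."""
    def unit_cost(lane):
        # transport + watch surcharge + centrality penalty, as in the objective
        return (lane["cost"]
                + (risk_surcharge if lane["watch"] else 0)
                + centrality_weight * lane["centrality"])

    flows, cost, remaining = {}, 0.0, demand
    for lane in sorted((l for l in lanes if not l["avoid"]), key=unit_cost):
        take = min(remaining, lane["capacity"])
        if take > 0:
            flows[lane["id"]] = take
            cost += take * unit_cost(lane)
            remaining -= take
    return flows, cost + remaining * unmet_penalty, remaining

# Invented lanes echoing the narrative: B017 hard-blocked, B003 surcharged.
lanes = [
    {"id": "via_B017", "cost": 1.0, "capacity": 20, "watch": False, "avoid": True,  "centrality": 0.70},
    {"id": "via_B003", "cost": 2.0, "capacity": 20, "watch": True,  "avoid": False, "centrality": 0.735},
    {"id": "via_S005", "cost": 4.0, "capacity": 20, "watch": False, "avoid": False, "centrality": 0.30},
]
flows, cost, unmet = plan_flow(lanes, demand=10)
# B003's surcharge (2 + 5 + 2*0.735 = 8.47/unit) makes the nominally
# pricier S005 lane (4 + 2*0.30 = 4.60/unit) the optimal route.
```

The same mechanics explain the watch->avoid scenario's +0.0% delta: when the surcharge already prices a lane out of the optimum, hard-blocking it changes nothing.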
diff --git a/v1/telco_network_recovery/references/runbook.md b/v1/telco_network_recovery/references/runbook.md new file mode 100644 index 0000000..d1b3003 --- /dev/null +++ b/v1/telco_network_recovery/references/runbook.md @@ -0,0 +1,449 @@ +# Runbook: Telco WEST Recovery — Multi-Reasoner Walkthrough + +Walk-through of the chained-reasoner pattern this template is built on. One realistic business thread — **WEST region recovery** — traced across all five RAI reasoner families, each stage writing properties back to the same ontology that downstream stages consume. + +The template's combined script (`telco_network_recovery.py`) implements stages 2–5 directly; this runbook expands the surrounding narrative including the descriptive Stage 1 diagnosis and the Stage 6 interpretation, so a non-OR reader can follow the full reasoning thread end-to-end. + +--- + +## TL;DR — the chain in one screen + +``` +WEST is bleeding $791K/quarter from a network operations crisis. +The chain produces a $5M plan that recovers 122 Gbps capacity +across all 15 critical towers, prioritized by social blast radius. + + ───────────────────────────────────────────────────────────────── + STAGE 1 Descriptive ──► WEST: Q3-Q4 revenue −22% to −26%, + avail 94.6 vs 99.5, 15 of 81 DEGRADED. + Retention angle? No — 0 high-risk + subs; this is operational. + ───────────────────────────────────────────────────────────────── + STAGE 2 Rules ──► CellTower.is_critical_restore (15) + 4 derived health metrics + a compound + flag: WEST + DEGRADED + health < 0.85. + ───────────────────────────────────────────────────────────────── + STAGE 3 Graph ──► Subscriber.influence_score (PageRank) + CellTower.weighted_impact (15) + 404 distinct subs (33% of base) route + calls through a critical tower. 
+ ───────────────────────────────────────────────────────────────── + STAGE 4 Predictive ──► CellTower.projected_demand_growth (15) + (GNN) WEST: 0.993× ── shrinking ~0.7%/yr + while 8 other regions sit at +0.59 to +0.75%/day. + ───────────────────────────────────────────────────────────────── + STAGE 5 Prescriptive ──► TowerUpgradeOption.selected (15) + OPTIMAL · 12 GOLD · 2 SILVER · 1 BRONZE + $4.96M of $5M (binding) · 122 Gbps + 164 of 200 install-weeks (slack) + ───────────────────────────────────────────────────────────────── +``` + +A single-reasoner approach can't answer this. Descriptive alone tells the story but no plan. Rules alone identifies broken towers but not which matter most. Graph alone ranks subscribers but doesn't decide. Predictive alone forecasts but doesn't act. Prescriptive alone has no way to weigh "important" without rules + graph upstream. + +--- + +## Setup + +See the template's main `README.md` for installation, RAI connection setup, and how to run the script. The narrative below follows the actual stage outputs of `telco_network_recovery.py`. + +--- + +## Stage 1 — Descriptive: diagnose WEST + + +``` +Q4 2024 — Daily KPIs by region + + avg avail % avg churn rate avg daily revenue + ────────────── ──────────────── ────────────────── + WEST 94.60 ███ 0.0256 █████████ $72,558 ███████ + CENTRAL 99.52 ████████ 0.0046 █ $101,517 ██████████ + SOUTH 99.53 ████████ 0.0049 █ $102,030 ██████████ + EAST 99.55 ████████ 0.0049 █ $103,840 ██████████ + NORTH 99.56 ████████ 0.0050 █ $103,810 ██████████ + NORTHEAST 99.56 ████████ 0.0050 █ $99,569 ██████████ + SOUTHEAST 99.58 ████████ 0.0050 █ $100,429 ██████████ + NORTHWEST 99.54 ████████ 0.0049 █ $100,995 ██████████ + SOUTHWEST 99.55 ████████ 0.0049 █ $101,716 ██████████ + ▲ + WEST is 5× worse on availability AND churn, │ + ~30% lower on daily revenue. 
│ + │ + Q3-Q4 monthly revenue gaps stack to $791K: ──────┘ + + Sep 2024 forecast $1.40M → actual $1.04M ── −25.9% (−$362K) + Oct 2024 forecast $0.94M → actual $0.70M ── −25.1% (−$236K) + Nov 2024 forecast $0.88M → actual $0.68M ── −22.0% (−$193K) + + WEST tower fleet (81 towers) + + ACTIVE ████████████████████████ 49 + DEGRADED ███████ 15 ← 15 critical_restore + MAINTENANCE ████████ 17 + + Top performance offenders (NetworkPerformance, all WEST DEGRADED): + TWR-0015 ── 8.87% loss 190ms (424 measurements) + TWR-0014 ── 8.75% loss 189ms (273 measurements) + TWR-0010 ── 8.73% loss 188ms (429 measurements) + (… all 15 cluster 8.1-8.9% loss / 185-200ms latency) + ──────────────────────────────────────── + Healthy WEST towers sit at <1% loss / ~30ms latency. + + Retention angle? Zero subs in collections, zero with churn_risk ≥ 0.5. + High-risk subscribers concentrate in SOUTH/NORTH/CENTRAL, not WEST. + This is a NETWORK-side crisis, not subscriber retention. +``` + +Subscriber.churn_risk_score is a static feature that hasn't caught up to WEST's empirical churn (TimeSeriesMetric.churn_rate of 2.6% is 5x other regions). The thread focuses on infrastructure recovery. 
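The per-region rollup behind the KPI table above is a plain group-by-average; here is a stdlib sketch on invented rows shaped like the daily KPI metric (region, availability, churn, revenue), showing how the outlier region surfaces:

```python
from collections import defaultdict
from statistics import mean

# Invented daily KPI rows: (region, availability_pct, churn_rate, daily_revenue)
rows = [
    ("WEST", 94.6, 0.0256, 72_558),
    ("WEST", 94.7, 0.0250, 72_100),
    ("EAST", 99.5, 0.0049, 103_840),
    ("EAST", 99.6, 0.0050, 103_500),
]

# Group rows by region
by_region = defaultdict(list)
for region, avail, churn, revenue in rows:
    by_region[region].append((avail, churn, revenue))

# Average each KPI per region
summary = {
    region: {
        "avg_avail": mean(v[0] for v in vals),
        "avg_churn": mean(v[1] for v in vals),
        "avg_revenue": mean(v[2] for v in vals),
    }
    for region, vals in by_region.items()
}

# The outlier check: rank regions by average churn
worst = max(summary, key=lambda r: summary[r]["avg_churn"])
print(worst, summary[worst])
```

The real stage runs the same aggregation as ontology queries over a full year of rows; the toy version only shows the group-by-mean shape that makes WEST stand out.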
+ +--- + +## Stage 2 — Rules: flag critical_restore towers + + +**Properties added to the ontology** (via `model.define(...)`): +- `CellTower.avg_packet_loss` (Float) — `aggs.avg(NetworkPerformance.packet_loss_pct).per(CellTower)` +- `CellTower.avg_latency_ms` (Float) +- `CellTower.avg_error_rate` (Float) +- `CellTower.avg_health_score` (Float) — across attached equipment via two-hop join + +**Rule** — `CellTower.is_critical_restore` (unary `Relationship`): + +```python +# Branch 1: WEST + DEGRADED + degraded equipment health +m.where( + CellTower.region == "WEST", + CellTower.status == "DEGRADED", + CellTower.avg_health_score < 0.85, # health is 0-1 scale +).define(CellTower.is_critical_restore()) + +# Branch 2: WEST + poor performance + degraded health (catches ACTIVE-but-failing) +m.where( + CellTower.region == "WEST", + CellTower.avg_packet_loss > 5.0, + CellTower.avg_health_score < 0.85, +).define(CellTower.is_critical_restore()) +``` + +``` +RULE FIRES → 15 towers flagged is_critical_restore (all WEST DEGRADED) + + TWR-0010 ── health 0.48 loss 8.73% cap_gbps 18 ███ + TWR-0015 ── health 0.60 loss 8.87% cap_gbps 60 ██████ + TWR-0009 ── health 0.62 loss 8.49% cap_gbps 17 ███ + TWR-0012 ── health 0.63 loss 8.59% cap_gbps 84 █████████ + TWR-0008 ── health 0.64 loss 8.51% cap_gbps 71 ████████ + TWR-0014 ── health 0.66 loss 8.75% cap_gbps 36 █████ + TWR-0002 ── health 0.66 loss 8.56% cap_gbps 17 ███ + TWR-0001 ── health 0.66 loss 8.56% cap_gbps 31 █████ + TWR-0011 ── health 0.66 loss 8.68% cap_gbps 61 ███████ + TWR-0005 ── health 0.68 loss 8.12% cap_gbps 94 ██████████ + TWR-0003 ── health 0.69 loss 8.45% cap_gbps 43 █████ + TWR-0007 ── health 0.72 loss 8.54% cap_gbps 17 ███ + TWR-0013 ── health 0.77 loss 8.62% cap_gbps 94 ██████████ + TWR-0006 ── health 0.78 loss 8.51% cap_gbps 37 █████ + TWR-0004 ── health 0.81 loss 8.54% cap_gbps 64 ███████ + +✓ is_critical_restore written back to CellTower (15 rows) +✓ avg_packet_loss / avg_latency_ms / avg_error_rate / 
avg_health_score + written back to all CellTowers (250 rows) +``` + +Branch 2 didn't fire — none of WEST's ACTIVE towers fall below health 0.85. The 15 flagged are exactly the WEST DEGRADED set, all with packet loss 8.1–8.9% and latency 185–200ms — sharp gap from the rest of WEST. + +--- + +## Stage 3 — Graph: subscriber influence + tower blast radius + + +**Construction** — Pattern 3 (`edge_concept`): +- Node concept: `Subscriber` (1,200 nodes) +- Edge concept: `CallDetailRecord`, with `caller`→`callee` (directed) +- Aggregator: `"sum"` (collapse parallel calls between the same pair) + +**Algorithm:** `pagerank()` (default for directed graphs). + +``` +PageRank — top 10 social influencers (of 1,200 subscribers) + + SUB-CON-00900 CONSUMER $3,793 LTV ████████████ 0.002963 + SUB-CON-00723 CONSUMER $3,049 LTV ████████████ 0.002956 + SUB-CON-00262 CONSUMER $3,764 LTV ███████████ 0.002790 + SUB-CON-00274 CONSUMER $2,850 LTV ███████████ 0.002695 + SUB-ENT-0038 ENTERPRISE $283,233 LTV ██████████ 0.002637 ★ + SUB-CON-00705 CONSUMER $765 LTV ██████████ 0.002599 + SUB-CON-00393 CONSUMER $3,219 LTV ██████████ 0.002581 + SUB-ENT-0001 ENTERPRISE $393,340 LTV ██████████ 0.002575 ★ + SUB-CON-01066 CONSUMER $1,146 LTV ██████████ 0.002570 + SUB-CON-00762 CONSUMER $307 LTV ██████████ 0.002525 + + ★ Top enterprise accounts also rank — heavy inbound call traffic. + PageRank captures structural influence independent of LTV. 
+ +Per-critical-tower blast radius (sorted by weighted_impact) + + TWR-0014 61 subs ████████████ 0.0502 ← largest social footprint + TWR-0008 56 subs ██████████ 0.0430 + TWR-0011 48 subs ██████████ 0.0428 + TWR-0012 50 subs █████████ 0.0394 + TWR-0003 43 subs █████████ 0.0393 + TWR-0013 46 subs █████████ 0.0379 + TWR-0004 46 subs █████████ 0.0378 + TWR-0010 48 subs █████████ 0.0375 + TWR-0015 45 subs ████████ 0.0361 + TWR-0002 46 subs ████████ 0.0331 + TWR-0007 44 subs ████████ 0.0330 + TWR-0005 45 subs ████████ 0.0330 + TWR-0009 44 subs ████████ 0.0330 + TWR-0001 41 subs ████████ 0.0322 + TWR-0006 41 subs ████████ 0.0316 + + ────────────────────────────────────────────────────────────────── + 404 distinct subscribers (33% of the 1,200-sub base) route at least + one call through a critical WEST tower. TWR-0014's failure ripples + to 61 subs whose combined social influence is highest. + ────────────────────────────────────────────────────────────────── + +✓ Subscriber.influence_score written back to all 1,200 subscribers +✓ CellTower.impact_count + weighted_impact written back to CellTower +``` + +--- + +## Stage 4 — Predictive: forecast WEST capacity demand + + +**Method:** GNN node regression on `TimeSeriesMetric` (composite key `metric_date` + `region`). Target: `subscriber_growth_rate`. Features: the other 12 daily KPIs + 3 lag features (`prev_day_growth`, `prev_week_growth`, `growth_7d_mean`) + `region` as a category. Graph: same-region 1-day-lag temporal edges. Train < 2024-11-01 (includes the Sep–Oct WEST decline onset); validate on Nov 2024; test on Dec 2024. 
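The three lag features named above can be sketched as a single trailing-window pass over each region's daily series (the helper and its toy series below are illustrative; the template computes these in pandas before loading):

```python
# Trailing-window lag features over an oldest-first daily series.
# Feature names mirror the runbook; the series itself is made up.
def lag_features(series):
    feats = []
    for t in range(len(series)):
        prev_day = series[t - 1] if t >= 1 else None
        prev_week = series[t - 7] if t >= 7 else None
        window = series[max(0, t - 7):t]  # trailing 7 days, excluding today
        feats.append({
            "prev_day_growth": prev_day,
            "prev_week_growth": prev_week,
            "growth_7d_mean": sum(window) / len(window) if window else None,
        })
    return feats

west = [-0.001 * t for t in range(10)]  # toy declining growth-rate series
f = lag_features(west)
print(f[0], f[8])
```

Entries earlier than the lag horizon stay `None`, mirroring the rows a training split would drop before fitting.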
+ +``` +Per-region GNN-predicted subscriber-growth-rate (Dec 2024 test horizon) + + CENTRAL ───── +0.0075 ████████ ▲ + EAST ───── +0.0073 ████████ │ + NORTH ───── +0.0071 ████████ │ 8 regions cluster + NORTHEAST ───── +0.0070 ████████ │ +0.59 to +0.75%/day + NORTHWEST ───── +0.0067 ████████ │ (mean predicted growth) + SOUTH ───── +0.0065 ████████ │ + SOUTHEAST ───── +0.0063 ███████ │ + SOUTHWEST ───── +0.0059 ███████ ▼ + ▲ + │ + WEST ───── −0.0071 ← anomaly: contracting + multiplier 0.993× + + ────────────────────────────────────────────────────────────────── + WEST projection: 0.7% demand decline over the test horizon. + Stage 5 picks up this multiplier as the 3rd objective coefficient. + ────────────────────────────────────────────────────────────────── + +✓ CellTower.projected_demand_growth written back to all 15 critical towers + (uniform 0.992871 — regional, not per-tower) +``` + +**Stage 5 objective with the predictive term:** + +``` +objective = sum( selected[t,tier] * + capacity_increase_gbps[t,tier] * + weighted_impact[t] * + projected_demand_growth[t] ) # ← Stage 4 contribution +``` + +**Snowflake setup for the GNN:** the template's main script computes lag features (prev-day, prev-week, 7-day mean) and same-region 1-day-lag temporal edges in pandas before loading, so no extra Snowflake DDL is required. To run on your own Snowflake schema instead of the bundled CSV, the equivalent SQL would be a typed copy of the time-series table plus per-region `LAG()` window functions. + +**Caveats:** +- The GNN was tuned for a single 80-epoch run with seed-42 reproducibility; production deployment would expand to a multi-seed average + a held-out holdout window. +- The WEST projection partially encodes the same network-degradation pattern Stages 2/3 flagged ("things have gotten worse and we expect them to keep getting worse if we don't act"). For an independent baseline, train on a pre-degradation slice (H1 2024 only) and compare. 
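The Stage 5 objective shown above, together with the at-most-one-tier, budget, and install-week caps, can be exercised on a three-tower toy instance by exhaustive enumeration (tier uplifts, costs, weeks, and impact weights below are invented; the template solves the real 45-binary MIP with a solver):

```python
from itertools import product

# Invented toy instance: tier -> (capacity uplift Gbps, cost $, install weeks)
TIERS = {"BRONZE": (3, 100_000, 2), "SILVER": (5, 200_000, 3), "GOLD": (9, 400_000, 5)}
TOWERS = {"T1": 0.050, "T2": 0.040, "T3": 0.033}  # weighted_impact per tower
GROWTH = 0.993        # uniform projected_demand_growth multiplier
BUDGET, WEEKS = 700_000, 10

def objective(plan):
    # sum(selected * capacity_increase * weighted_impact * projected_demand_growth)
    return sum(TIERS[tier][0] * TOWERS[tower] * GROWTH
               for tower, tier in plan.items() if tier is not None)

best = None
for choice in product([None, *TIERS], repeat=len(TOWERS)):  # at most one tier per tower
    plan = dict(zip(TOWERS, choice))
    cost = sum(TIERS[t][1] for t in choice if t is not None)
    weeks = sum(TIERS[t][2] for t in choice if t is not None)
    if cost <= BUDGET and weeks <= WEEKS:
        if best is None or objective(plan) > objective(best):
            best = plan
print(best)  # highest-impact tower gets GOLD; budget pushes cheaper tiers elsewhere
```

Even at this scale the solve's behavior shows up: the highest-impact tower earns GOLD, and the binding budget forces cheaper tiers onto the lower-impact towers.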
+
+---
+
+## Stage 5 — Prescriptive: tower upgrade selection MIP
+
+
+```
+FORMULATION
+
+  Decision variable
+      TowerUpgradeOption.selected  (binary)
+      45 binaries = 15 critical-restore towers × {BRONZE, SILVER, GOLD}
+
+  Constraints
+    1. At-most-one tier per tower    sum(selected).per(CellTower) ≤ 1
+    2. Total cost                    Σ selected · cost ≤ $5,000,000
+    3. Total install_weeks           Σ selected · install_weeks ≤ 200
+
+  Objective (maximize)
+    Σ selected · capacity_increase_gbps · weighted_impact · projected_demand_growth
+      └────── Stage 2 (rules) ─────┘└── Stage 3 ──┘└── Stage 4 ──┘
+
+──────────────────────────────────────────────────────────────────────
+SOLVE (Gurobi) → OPTIMAL    15 towers selected    122 Gbps    $4,956,843
+──────────────────────────────────────────────────────────────────────
+
+Tower-tier assignment (sorted by weighted_impact)
+
+  TWR-0014 ── GOLD    +6 Gbps   $350,864  wgt 0.0502  ████████████
+  TWR-0008 ── GOLD   +10 Gbps   $416,455  wgt 0.0430  ██████████
+  TWR-0011 ── GOLD    +9 Gbps   $481,914  wgt 0.0428  ██████████
+  TWR-0012 ── GOLD    +8 Gbps   $445,825  wgt 0.0394  █████████
+  TWR-0003 ── GOLD   +11 Gbps   $360,785  wgt 0.0393  █████████
+  TWR-0013 ── GOLD    +9 Gbps   $273,831  wgt 0.0379  █████████
+  TWR-0004 ── GOLD    +9 Gbps   $275,353  wgt 0.0378  █████████
+  TWR-0010 ── GOLD   +12 Gbps   $332,694  wgt 0.0375  █████████
+  TWR-0015 ── GOLD   +11 Gbps   $438,932  wgt 0.0361  ████████
+  TWR-0002 ── GOLD   +11 Gbps   $420,363  wgt 0.0331  ████████
+  TWR-0007 ── GOLD    +9 Gbps   $416,640  wgt 0.0330  ████████
+  TWR-0005 ── SILVER  +3 Gbps   $220,435  wgt 0.0330  ████████ ⚐
+  TWR-0009 ── BRONZE  +3 Gbps    $97,784  wgt 0.0330  ████████ ⚐
+  TWR-0001 ── GOLD    +6 Gbps   $274,561  wgt 0.0322  ████████
+  TWR-0006 ── SILVER  +5 Gbps   $150,407  wgt 0.0316  ████████ ⚐
+
+  ⚐ Lowest weighted_impact towers — solver buys cheaper tiers
+    to free budget for the higher-impact GOLDs.
+ +Budget gauge + Cost ████████████████████████████████████████ $4,956,843 / $5,000,000 ── BINDING + Install-wks █████████████████████████████████ 164 / 200 (slack: 36) + +Headline metrics + Capacity restored: 122 Gbps + Tier mix: 12 GOLD · 2 SILVER · 1 BRONZE + Towers covered: 15 of 15 (no triage tradeoff) + Subs serviced: 404 distinct (33% of all 1,200) + Objective without Stage 4 (Σ capacity × weighted_impact): 4.6024 + Objective with Stage 4 (× 0.992871 uniform multiplier): 4.5696 + +✓ TowerUpgradeOption.selected written back — the optimization output + is now a queryable property of the model. +``` + +(Full decision matrix: `outputs/stage5_solution.csv`.) + +### Reading the solve + +- **GOLD dominates** (12/15) — for towers with high social blast radius, GOLD's 6–12 Gbps uplift outweighs its higher cost. +- **Budget is binding** ($4.96M / $5M) — relaxing to $6M would let TWR-0009 jump from BRONZE to GOLD ($481K → +9 Gbps) and lift the objective meaningfully. +- **Install-weeks are not binding** (164/200) — schedule is the looser constraint; budget holds back the plan. +- **Stage 4's uniform multiplier doesn't shift tiers** — the forecast says WEST is contracting (-0.7%), so every upgrade is slightly less valuable in absolute terms, but relative tower priority is unchanged. **A non-uniform forecast would be the more revealing test of the chain's value** — if some WEST towers sat in growth pockets and others in decline, the tier mix would shift accordingly. + +--- + +## Stage 6 — Interpretation + + +``` +THE PLAN, IN BUSINESS TERMS + + • 122 Gbps of network capacity restored across all 15 critical towers + within the $5M capex budget. + • Every WEST DEGRADED tower gets an upgrade — no triage tradeoff. + • Service-affected subscribers drop from 404 to ~0 over the install + schedule (164 crew-weeks; 4-month rollout at 2 crews of 5). + • Budget binding — if CFO can flex to $6M, promote TWR-0009 to GOLD + for +9 Gbps marginal lift. 
+ +ONTOLOGY ENRICHMENT — what each stage contributed back + + Stage 2 (rules) CellTower.is_critical_restore [15] + CellTower.avg_packet_loss [250] + CellTower.avg_latency_ms [250] + CellTower.avg_error_rate [250] + CellTower.avg_health_score [250] + + Stage 3 (graph) Subscriber.influence_score [1,200] + CellTower.impact_count [120] + CellTower.weighted_impact [120] + + Stage 4 (predictive) CellTower.projected_demand_growth [15] + + Stage 5 (prescriptive) TowerUpgradeOption.selected [45] + + ────────────────────────────────────────────────────────────────── + Each stage reads what the previous stage wrote. + Re-running any downstream stage automatically picks up enrichments. + No glue code, no DataFrame round-trip — same ontology throughout. + ────────────────────────────────────────────────────────────────── +``` + +--- + +## The chain — accretive ontology enrichment + +``` +THE WEST RECOVERY CHAIN + + STAGE 1 DESCRIPTIVE + "Where is the bleed coming from?" + reads: RevenueForecast, TimeSeriesMetric, NetworkPerformance, CellTower + writes: (situational summary — no ontology mutation) + │ + ▼ + STAGE 2 RULES + "Which towers are critical to restore?" + reads: NetworkPerformance, EquipmentHealth, NetworkEquipment, CellTower + writes: CellTower.is_critical_restore ── 15 towers flagged + CellTower.avg_packet_loss / latency_ms / error_rate / health_score + │ + ▼ + STAGE 3 GRAPH (PageRank) + "Whose service depends on these towers — and who is socially central?" + reads: CallDetailRecord (caller→callee), CDR.routed_through(CellTower) + writes: Subscriber.influence_score ── per subscriber + CellTower.impact_count ── distinct subs served + CellTower.weighted_impact ── Σ subscriber influence + │ + ▼ + STAGE 4 PREDICTIVE (GNN node regression) + "What does the forecast say about future demand?" 
+ reads: TimeSeriesMetric.subscriber_growth_rate × 365d × 9 regions + + 12 daily KPIs + 3 lag features + same-region temporal edges + writes: CellTower.projected_demand_growth ── per critical tower + │ + ▼ + STAGE 5 PRESCRIPTIVE (gurobi MIP) + "What's the optimal $5M tier-selection plan?" + reads: CellTower.is_critical_restore ──► decision-variable scope + CellTower.weighted_impact ──► objective coefficient + CellTower.projected_demand_growth ──► objective coefficient + TowerUpgradeOption.cost / capacity_increase / install_weeks + writes: TowerUpgradeOption.selected ── 15 upgrades chosen + │ + ▼ + Actionable plan, + grounded end-to-end in the same ontology. + + ────────────────────────────────────────────────────────────────── + No glue. No DataFrame ping-pong. No re-derivation per-reasoner. + Five reasoners, one ontology, one accretive thread. + ────────────────────────────────────────────────────────────────── +``` + +--- + +## Why the chain matters (vs. any single stage) + +| Stage alone | What it tells you | What it doesn't | +|---|---|---| +| Descriptive | "WEST is broken" | Which towers, how to fix | +| Rules alone | "These 15 towers are critical" | Which matter most; what to do | +| Graph alone | "These subscribers are influential" | Which towers serve them | +| Predictive alone | "WEST demand is contracting" | Where to spend the recovery budget | +| Prescriptive alone | (won't run — no flagged set, no impact weights, no forecast) | Whole pipeline misses | + +| Combined | Output | +|---|---| +| Descriptive → Rules | Crisis scoped + critical-tower set flagged | +| + Graph | Each flagged tower scored by social blast radius | +| + Predictive | Forward-looking demand multiplier per tower | +| + Prescriptive | $5M plan, 122 Gbps, all 15 covered, prioritized by social impact | + +**Multi-reasoner chaining grounded in (and contributing to) the ontology.** + +--- + +## Data Reference + +- **Source data**: bundled CSVs in `../data/` (the main template ships ~1.2 MB 
of synthetic-but-realistic telco data — 250 cell towers, 1,200 subscribers, 6,000 CDRs, 3,285 daily KPI rows across 9 regions). +- **Ontology**: the template's main script uses a focused 7-concept subset of a broader 18-concept telco knowledge graph (PostalArea, Subscriber, Contract, BillingEvent, CellTower, NetworkEquipment, EquipmentHealth, NetworkEvent, CallDetailRecord, SupplierOrder, Campaign, PromotionRedemption, RevenueForecast, NetworkPerformance, SupportTicket, TimeSeriesMetric, TowerUpgradeOption, Part) — sufficient for the four-stage chain. +- **Stages**: implemented in `../telco_network_recovery.py` as a single combined script with stage banners. From c391890995d7757b900dad8ab14f1d6c7840c83a Mon Sep 17 00:00:00 2001 From: cafzal Date: Wed, 6 May 2026 08:55:58 -0700 Subject: [PATCH 02/22] Hybrid runbook pass: discovery + per-stage skill/prompt blocks Apply the dual-audience plan in dev_temp/pr57_runbook_hybrid_plan.md to all five runbooks. Each runbook now serves both the stakeholder (narrative + ASCII visualizations) and the practitioner (explicit skill + prompt to recreate the stage): - "How to read this runbook" preface explaining the dual purpose - Step 0 discovery section using rai-discovery to scope sub-questions to reasoner families before any chain stage runs - Skill / Prompt boxed callout immediately under every Stage heading (rai-querying, rai-rules-authoring, rai-graph-analysis, rai-prescriptive-*) - "Adapting this recipe to a new domain" closing section Adds portfolio_balancing/references/runbook.md (4 stages: rules -> graph clustering -> bi-objective Markowitz frontier -> crisis-regime stress test, all aligned to the template's actual 8-stock dataset and epsilon-rate frontier sweep). Numbers cross-referenced against each template's README and main script; reflects the templates as shipped (not the larger demos they were sourced from). Prompts use domain-natural language with no Concept.property syntax inside the prompt strings. 
--- v1/energy_grid_planning/references/runbook.md | 78 +++ v1/machine_maintenance/references/runbook.md | 79 +++ v1/portfolio_balancing/references/runbook.md | 498 ++++++++++++++++++ .../references/runbook.md | 53 ++ .../references/runbook.md | 57 +- 5 files changed, 764 insertions(+), 1 deletion(-) create mode 100644 v1/portfolio_balancing/references/runbook.md diff --git a/v1/energy_grid_planning/references/runbook.md b/v1/energy_grid_planning/references/runbook.md index 4dc729a..d963954 100644 --- a/v1/energy_grid_planning/references/runbook.md +++ b/v1/energy_grid_planning/references/runbook.md @@ -44,6 +44,41 @@ A single-reasoner approach can't answer this. Predictive alone forecasts demand --- +## How to read this runbook + +This runbook serves two audiences: + +- **Reading top-to-bottom**: the narrative + ASCII visualizations show what + the chain produces stage-by-stage, with the same business framing the + stakeholder would see. +- **Per-stage skill blocks**: the boxed `Skill / Prompt` callout at the + start of each stage is the recipe — load that RAI agent skill, give it + that prompt against the bundled demo data, and the agent will reproduce + the stage. + +--- + +## Step 0 — Scope the question with `rai-discovery` + +> **Skill:** `rai-discovery` · +> **Prompt:** "We have 10 hyperscaler interconnection requests totalling 2,930 MW queued against a 12-substation ERCOT-style Texas grid. Classify the sub-questions we need to answer to decide which to approve, which substation upgrades to fund, and at what budget level — map each sub-question to the reasoner family that should handle it." + +Discovery classifies the question by reasoner family and tells you which +downstream skills to load: + +| Sub-question | Reasoner | Skill | +|---|---|---| +| Which substations are losing headroom fastest and will breach capacity? 
| Predictive | _no public skill yet — see `v1/subscriber_retention/` and `v1/demand_forecasting/` as worked-example references_ | +| Which substations are structural bottlenecks on the transmission grid? Which communities? | Graph | `rai-graph-analysis` | +| Which DC requests fail capacity, structural-risk, or low-carbon compliance? | Rules | `rai-rules-authoring` | +| Across budget levels, which DCs should we approve and which upgrades should we fund? | Prescriptive | `rai-prescriptive-problem-formulation` | +| What does the Pareto frontier tell us about the knee, marginal value, and binding constraints? | Prescriptive (post-solve) | `rai-prescriptive-results-interpretation` | + +Discovery's output is a *plan*, not code. Everything that follows +materializes that plan. + +--- + ## Prerequisites The template ships everything needed to run the chain end-to-end: @@ -78,6 +113,9 @@ Steps are sequential — each depends on prior steps. Steps without a skill are ## Stage 1 — Predictive: substation load forecasting +> **Skill:** _no public skill yet — see `v1/subscriber_retention/` and `v1/demand_forecasting/` as worked-example references_ · +> **Prompt:** "Forecast each substation's future peak load by aggregating the maximum predicted load across the 6/12/18/24-month forecast horizons in the demand-forecast table, and write the result back to every substation as a derived load-projection property. The downstream rules engine and optimizer both need to read this same forecasted headroom — fall back to the substation's current load only when no forecast row exists. Flag substations whose predicted load exceeds their nameplate capacity within the horizon and report which one breaches first." + **Method:** load max forecasted load per substation as `Substation.predicted_load`. The template aggregates `DemandForecast.predicted_load_mw` over forecast horizons (6/12/18/24 months) and writes the max back to the substation. 
A pre-trained GNN can replace the table lookup; the script falls back gracefully when the GNN model registry is unavailable. ``` @@ -121,6 +159,9 @@ effective_load = Substation.predicted_load | Substation.current_load_mw ## Stage 2 — Graph: grid topology & structural vulnerability +> **Skill:** `rai-graph-analysis` · +> **Prompt:** "Build an undirected, unweighted graph using `Substation` directly as the node concept and active transmission lines as edges. Run weakly connected components to confirm grid connectivity, Louvain community detection to surface ERCOT regional clusters, and the betweenness/degree/eigenvector centrality trio. Combine the three centrality ranks into a composite rank and flag the top 3 substations as structurally critical, writing the centrality scores, community label, and criticality flag back to each substation." + **Construction** — `Substation` as the node concept directly (no mirror concept): - Node concept: `Substation` (12 nodes) - Edges: active `TransmissionLine` rows, `from_substation` → `to_substation` @@ -169,6 +210,9 @@ Centrality (top-3 marked is_structurally_critical) ## Stage 3 — Rules: interconnection queue compliance +> **Skill:** `rai-rules-authoring` · +> **Prompt:** "Author three declarative compliance rules per data-center request, each consuming an upstream enrichment. Rule 1 fails capacity when the request's MW plus the substation's forecasted load (with current load as fallback) exceeds the substation's nameplate capacity. Rule 2 fails structural risk when the target substation is flagged structurally critical from Stage 2. Rule 3 fails the low-carbon mandate when the substation's zero-emission generation share is below the request's required percentage; sum capacity for generators with emissions rate of zero. Add a composite `is_compliant` flag that fires only when none of the three failure flags fire." + Three declarative `Relationship` rules consume Stages 1–2 enrichments. 
Each is written as a `model.where(...).define(...)` block; a composite `is_compliant` fires only when none of the three failure flags fire. **Rule 1 — Capacity** (consumes `Substation.predicted_load` from Stage 1): @@ -231,6 +275,9 @@ Every request passes low-carbon — ERCOT's nuclear (STP, Comanche Peak) plus it ## Stage 4 — Prescriptive: joint DC approval + upgrade MIP +> **Skill:** `rai-prescriptive-problem-formulation` · +> **Prompt:** "Formulate a single MIP that picks DC approvals and substation upgrades jointly across five budget scenarios at $200M, $300M, $400M, $500M, and $600M. Model the budget as an `InvestmentLevel` Scenario Concept and index both the binary approve and binary upgrade decision variables by it, so one solve produces the full Pareto frontier — no per-budget re-solve loop. Constrain per substation per scenario that approved DC load fits within nameplate capacity minus forecasted load plus selected upgrade headroom, and per scenario that selected upgrade costs stay within the scenario's budget cap. Maximize total annual interconnection revenue summed across all scenarios." + ``` FORMULATION @@ -305,6 +352,9 @@ PARETO FRONTIER (queried directly from ontology) ## Stage 5 — Interpretation +> **Skill:** `rai-prescriptive-results-interpretation` · +> **Prompt:** "Read the per-scenario decision-variable properties straight from the ontology — approved DCs, selected upgrades, total MW, revenue, and amortized upgrade cost — and present the Pareto frontier across the five investment levels. Compute marginal value per added budget dollar between adjacent levels and call out the knee where the marginal jump is largest. Translate the result into business language: which DCs unlock at the knee, which substation is the binding bottleneck, and which requests are infeasible at every budget level." 
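To see why indexing the decision variables by an `InvestmentLevel` scenario yields a frontier rather than a single plan, here is a stylized stdlib sweep over budget levels on one substation and three invented requests (the template expresses all scenarios inside a single MIP solve instead of this per-level loop; every number below is illustrative):

```python
from itertools import combinations

# Invented toy data: (name, MW load, $M/yr interconnection revenue)
REQUESTS = [("DC_A", 300, 30), ("DC_B", 500, 55), ("DC_C", 400, 40)]
HEADROOM = 500        # substation capacity minus forecasted load, in MW
UPGRADE = (250, 600)  # ($M cost, added MW of headroom)
LEVELS = [0, 200, 300]  # budget caps, $M

def best_plan(budget):
    best = (0, (), False)  # (revenue, approved names, upgraded?)
    for upgraded in (False, True):
        cost = UPGRADE[0] if upgraded else 0
        if cost > budget:
            continue  # budget constraint: upgrade must fit the scenario's cap
        cap = HEADROOM + (UPGRADE[1] if upgraded else 0)
        for k in range(len(REQUESTS) + 1):
            for subset in combinations(REQUESTS, k):
                # capacity constraint: approved load fits available headroom
                if sum(mw for _, mw, _ in subset) <= cap:
                    revenue = sum(rev for _, _, rev in subset)
                    if revenue > best[0]:
                        best = (revenue, tuple(name for name, _, _ in subset), upgraded)
    return best

frontier = {budget: best_plan(budget) for budget in LEVELS}
print(frontier)  # revenue jumps once the budget clears the upgrade cost
```

The shape matches the template's story: below the upgrade cost, the plan is stuck at the largest request that fits existing headroom, and the frontier jumps once the budget clears the upgrade, which is exactly the knee the interpretation stage looks for.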
+ ``` THE PLAN, IN BUSINESS TERMS @@ -416,6 +466,34 @@ THE ENERGY GRID PLANNING CHAIN --- +## Adapting this recipe to a new domain + +The chain pattern transfers cleanly. To rebuild for a different problem: + +1. Re-run `rai-discovery` on the new business question — does it actually + need all 4 reasoner families, or is one or two sufficient? +2. Strip the demo ontology to the concepts the new chain needs (lean is + better for type inference and solver compile time). +3. Stage 1 (Predictive) is optional — if you have forecast tables already, + a simple `aggs.max(...).per(...)` derived property is enough; swap in + a GNN later by pointing the predictive reasoner at your model registry. +4. Stages 2–4 are the load-bearing chain: graph centrality flags + structurally critical nodes, rules consume both the predictive forecast + and the criticality flag to fail or pass each request, and the + prescriptive MIP reads the same forecast as a capacity baseline while + indexing decision variables by an `InvestmentLevel` Scenario Concept so + one solve produces the full Pareto frontier. +5. Keep the validation checks at every stage: assert the predicted-load + write covers all nodes, the top-N critical set looks plausible against + the topology, the compliance table has at least one PASS and one FAIL, + and the optimizer reports OPTIMAL with a non-zero objective. + +The shape this template demonstrates — *each reasoner writes a property +the next reasoner reads* — is what makes the chain accretive rather than +serial. The agent skills are how you reliably author each link. + +--- + ## Data Reference **Substations with DC requests:** diff --git a/v1/machine_maintenance/references/runbook.md b/v1/machine_maintenance/references/runbook.md index 98760d9..42c4abc 100644 --- a/v1/machine_maintenance/references/runbook.md +++ b/v1/machine_maintenance/references/runbook.md @@ -47,6 +47,34 @@ A single-reasoner approach can't surface this. 
OEE alone says Plant_B is the pro --- +## How to read this runbook + +This runbook serves two audiences: + +- **Reading top-to-bottom**: the narrative + ASCII visualizations show what the chain produces stage-by-stage, with the same business framing the stakeholder would see. +- **Per-stage skill blocks**: the boxed `Skill / Prompt` callout at the start of each stage is the recipe — load that RAI agent skill, give it that prompt against the bundled demo data, and the agent will reproduce the stage. + +--- + +## Step 0 — Scope the question with `rai-discovery` + +> **Skill:** `rai-discovery` · +> **Prompt:** "We need to schedule preventive maintenance for a multi-plant operation: 30 machines across 3 plants, 10 technicians across 3 cities, a 4-period horizon. Classify the sub-questions we must answer, map each to the right reasoner family, and tell us which downstream skills to load. Where does OEE alone mislead us, and what structural risks (qualifications, technician concentration) won't a pure optimizer surface?" + +Discovery classifies the question by reasoner family and tells you which downstream skills to load: + +| Sub-question | Reasoner | Skill | +|---|---|---| +| Where does the operation actually hurt — OEE, anomalies, failure trajectories? | Querying / Descriptive | `rai-querying` | +| Which machines are scheduling bottlenecks given shared technician pools? | Graph | `rai-graph-analysis` | +| Which machines are overdue, high-risk, chronic, or composite-Critical? | Rules | `rai-rules-authoring` | +| What's the optimal maintain-and-assign plan across 4 periods? | Prescriptive | `rai-prescriptive-problem-formulation` | +| Where is the schedule structurally fragile, and what cross-training fixes it? | Prescriptive (re-solve / interpretation) | `rai-prescriptive-solver-management` + `rai-prescriptive-results-interpretation` | + +Discovery's output is a *plan*, not code. Everything that follows materializes that plan. 
+ +--- + ## Setup See the template's main `README.md` for installation, RAI connection setup, and how to run the script. The narrative below follows the actual stage outputs of `machine_maintenance.py`. @@ -82,6 +110,43 @@ The runbook walks the same chain stage-by-stage, prompt-by-prompt, in agent-skil --- +## Stage 0 — Querying: operational intelligence + +> **Skill:** `rai-querying` · +> **Prompt:** "From the loaded production runs, sensor readings, and per-period failure predictions, compute an OEE proxy by facility (Performance × Quality), list machines with above-threshold sensor anomalies grouped by facility, and identify the six machines with the steepest failure-probability rise from period 1 to period 4. Write the per-period failure prediction back as a `MachinePeriod` property so downstream stages can read it." + +This stage establishes the operational baseline. Plant_C leads at 79.8% OEE; Plant_B trails at 61.4%. But Plant_A — middle of the OEE pack at 68.2% — owns 7 of 9 sensor anomalies and the three steepest failure trajectories (M001, M013, M016). The querying stage writes nine derived properties on `Machine` plus `MachinePeriod.predicted_fp` (120 rows), and Stage 3 reads `predicted_fp` directly into the failure-cost objective term. + +## Stage 1 — Graph: dependency clusters and bottleneck centrality + +> **Skill:** `rai-graph-analysis` · +> **Prompt:** "Build an undirected graph using `Machine` directly as the node concept. Two machines are adjacent when at least one technician is qualified for both of their machine types. Run weakly connected components to find dependency clusters, then compute betweenness centrality, normalize it to 0..1 against the max, and store both raw and normalized scores back on the machine." + +The 30 machines form a single connected component — every machine is reachable through shared qualifications. 
Pump-type machines tie at the top of betweenness (raw 24.0, normalized 1.0): M003 (Plant_C), M008 (Plant_B), M013 (Plant_A). The normalized centrality is consumed by Stage 3's failure-cost multiplier `(1 + 2.0 × betweenness)`, so leaving a bottleneck Pump vulnerable is markedly more expensive than leaving a peripheral Motor vulnerable. + +## Stage 2 — Rules: compliance flags and composite risk tier + +> **Skill:** `rai-rules-authoring` · +> **Prompt:** "Author six derived flags on the existing concepts: overdue (remaining useful life below required maintenance hours), high-risk (failure probability above 0.3 and criticality at least 4), anomalous (any sensor anomaly), chronic-downtime (more than 8 events), parts-reorder (stock at or below minimum order), and certification-expiring (under 30 days remaining). Then chain three of those flags — chronic, high-risk, overdue — into a `risk_tier` property of Critical / Elevated / Standard, exhaustively enumerating all eight three-flag combinations using negation." + +Six machines overdue, one high-risk (M013), three chronic-downtime, four parts-reorder, five expiring certs. The composite tier surfaces a single Critical machine — M013 (Pump, Plant_A) — and a single Elevated machine — M016 (Turbine, Plant_A). The overdue flag is consumed by Stage 3 as a hard constraint: every overdue machine must be scheduled by period 2. + +## Stage 3 — Prescriptive: maintenance schedule + +> **Skill:** `rai-prescriptive-problem-formulation` · +> **Prompt:** "Schedule preventive maintenance over 4 periods. Decide for each machine and period whether to maintain it, whether it remains vulnerable, and which qualified technician services it.
Enforce cumulative coverage (each machine is either maintained or vulnerable through every period), assignment-maintenance linkage (one tech per maintained job), per-tech hour capacity, a 5-jobs-per-period parts/bay limit, and the hard rule that every overdue machine from Stage 2 is maintained by period 2. Minimize failure cost (period-specific failure probability × parts cost × criticality, scaled up by `(1 + 2.0 × betweenness)`) plus labor cost (duration × hourly rate) plus travel cost ($50/hour for non-co-located assignments)." + +The solver returns OPTIMAL with objective $605,240.61 and 20 maintenance jobs across the four periods (capacity-binding at 5 jobs/period). M013 and M016 — Plant_A's Critical and Elevated machines — are both scheduled by period 1, satisfying the overdue deadline. Several Turbine assignments require travel because all three Turbine-qualified techs are based in Houston_TX while four of six Turbines sit at Plant_A and Plant_C. The optimizer pays the travel cost; it cannot restructure the qualification pool. + +## Stage 4 — Resilience: concentration sweep and cross-training + +> **Skill:** `rai-prescriptive-solver-management` + `rai-prescriptive-results-interpretation` · +> **Prompt:** "Take the optimal schedule and stress-test the qualification structure underneath it. For each machine type, identify whether all qualified technicians are concentrated in one location, count the scheduled jobs of that type that required travel, and surface the geographic single-point-of-failure. Then re-rank the cross-training options for the concentrated type by cost, prefer non-local candidates, and recommend the cheapest one with its training cost and duration. This is a follow-up sensitivity sweep over the existing solve, not a new formulation." + +Turbine is the concentrated type — all three qualified techs (T001, T002, T003) sit in Houston_TX, and 67% of scheduled Turbine jobs already require travel. 
The recommended fix: cross-train T006 (Senior, Chicago_IL) for $3,200 over 5 weeks. That single addition eliminates the Houston single-point-of-failure for Turbine work at Plant_B and Plant_C, and pays back the first time a weather event, illness, or expiring cert would have idled a Turbine job that the optimizer would otherwise have left uncovered. + +--- + ## Stage outputs — what each reasoner contributes back ``` @@ -202,6 +267,20 @@ THE MACHINE-MAINTENANCE CHAIN --- +## Adapting this recipe to a new domain + +The chain pattern transfers cleanly. To rebuild for a different scheduling-with-resilience problem: + +1. Re-run `rai-discovery` on the new business question — does it actually need all five reasoner families, or is one or two sufficient? A pure dispatch problem may only need querying + prescriptive; a pure compliance problem may only need rules. +2. Strip the demo ontology to the concepts the new chain needs (lean is better for type inference and solver compile time). Keep the cross-product concept (`MachinePeriod`-equivalent) — it's where most decision variables and per-period derived properties live. +3. Stage 1 (querying) is required scaffolding: the optimization objective leans on derived per-period signals, not raw inputs. +4. Stages 2–5 are the load-bearing chain: graph centrality writes a multiplier the cost objective consumes; rules write a hard deadline the optimizer must satisfy; the optimizer writes solution variables the resilience sweep reads; the resilience layer doesn't re-solve a new problem — it stress-tests the structure underneath the existing solve and recommends a structural fix. +5. Keep the validation checks at every stage: assert flagged-set size, betweenness top-N looks plausible, the OPTIMAL gate, the objective is not zero, and the resilience pass surfaces at least one actionable recommendation when concentration exists. 
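A minimal sketch of the stage-gate checks in step 5, assuming hypothetical result shapes (`flags`, `solve_status`, `objective`, `recommendations` are illustrative stand-ins, not the template's actual API):

```python
# Hypothetical stage-gate checks mirroring step 5 above. The argument
# shapes are illustrative stand-ins, not the template's real objects.
def validate_chain(flags, centrality, solve_status, objective,
                   recommendations, concentration_exists, fleet_size=30):
    # Rules stage: flag rule should match something, but not everything.
    assert 0 < len(flags) < fleet_size, "degenerate flag rule"
    # Graph stage: normalized betweenness must land in 0..1.
    assert all(0.0 <= c <= 1.0 for c in centrality.values())
    # Prescriptive stage: hard OPTIMAL gate and a non-trivial objective.
    assert solve_status == "OPTIMAL", f"solver returned {solve_status}"
    assert objective > 0.0, "zero objective usually means an empty model"
    # Resilience stage: concentration must surface a recommendation.
    if concentration_exists:
        assert recommendations, "concentration found but no fix proposed"
    return True

ok = validate_chain(
    flags={"M013": "overdue", "M016": "overdue"},
    centrality={"M003": 1.0, "M008": 1.0, "M013": 1.0, "M020": 0.4},
    solve_status="OPTIMAL",
    objective=605_240.61,
    recommendations=[("T006", 3200, "5 weeks")],
    concentration_exists=True,
)
```

Each assert corresponds to one of the validation checks listed above; in a real port these would read the properties the stages wrote back.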
+ +The shape this template demonstrates — *each reasoner writes a property the next reasoner reads* — is what makes the chain accretive rather than serial. The agent skills are how you reliably author each link. + +--- + ## Optional extension — operator-shift assignment A second optimization pass — assigning operators to line-shifts to maximize a skill-match bonus, subject to a same-facility constraint — is a natural follow-on but not part of this template's main script. It would consume the same `Machine` and facility ontology and add `Operator`, `Shift`, and `OperatorShift` concepts. Out of scope for this runbook. diff --git a/v1/portfolio_balancing/references/runbook.md b/v1/portfolio_balancing/references/runbook.md new file mode 100644 index 0000000..ea6f43c --- /dev/null +++ b/v1/portfolio_balancing/references/runbook.md @@ -0,0 +1,498 @@ +# Runbook: Portfolio Balancing — Multi-Reasoner Walkthrough + +Walk-through of the chained-reasoner pattern this template is built on. One realistic portfolio thread — **rebalance an 8-stock universe under compliance + crisis stress** — traced across rules, graph, and prescriptive reasoners, each stage writing properties back to the same ontology that downstream stages consume. + +The template's combined script (`portfolio_balancing.py`) implements all four stages directly; this runbook expands the surrounding narrative — what each prompt asks, what shape of output to expect, and how each enrichment feeds the next — so a reader can follow the reasoning thread end-to-end without re-running the script. + +--- + +## TL;DR — the chain in one screen + +``` +The current book breaks compliance on 4 holdings + 2 sectors. Naive +"diversification" hides correlated bets. The chain collapses 8 stocks +into 5 distinct cluster representatives, traces the efficient frontier +under base + crisis covariance, and shows crisis vol sits 25-30% above +base at every lambda — without the cluster collapse, the gap would grow. 
+ + ───────────────────────────────────────────────────────────────── + STAGE 1 Rules ──► Holding.is_overconcentrated (4) + Holding.is_sector_concentrated (2) + User.is_high_risk_trader (2) + 4 holdings > 15% of balance, 2 sectors + > 30%, 2 traders with risk > 0.8 + flagged. + ───────────────────────────────────────────────────────────────── + STAGE 2 Graph ──► Stock.variance / volatility / correlation + Stock.cluster, Stock.is_representative (5) + 4 edges (|rho| >= 0.3), 5 Louvain clusters, + intra +0.683 vs inter +0.131. + ───────────────────────────────────────────────────────────────── + STAGE 3 Prescriptive ──► Stock.x_quantity (per Scenario) + (QP) 6 scenarios = 3 budgets x 2 regimes. + Anchors + 5 epsilon points = 7-point + frontier per scenario. Knee at eps_1. + ───────────────────────────────────────────────────────────────── + STAGE 4 Stress ──► Stock.regime_covar (PSD-preserving) + Crisis vol 25-30% above base at every + lambda. Gap peaks mid-frontier, narrows + toward the concentrated end. + ───────────────────────────────────────────────────────────────── +``` + +A single-reasoner approach can't answer this. Rules alone flag broken positions but don't rebuild the book. Graph alone clusters stocks but doesn't allocate. Prescriptive alone solves a Markowitz QP — but on the full universe it stacks near-duplicate cluster members, and on a single covariance it ignores regime risk. Each stage feeds the next: rules surface the violations, graph collapses redundant bets, prescriptive optimizes over the reduced universe across both regimes in one solve. + +--- + +## How to read this runbook + +This runbook serves two audiences: + +- **Reading top-to-bottom**: the narrative + ASCII visualizations show what the chain produces stage-by-stage, with the same business framing the stakeholder would see. 
+- **Per-stage skill blocks**: the boxed `Skill / Prompt` callout at the start of each stage is the recipe — load that RAI agent skill, give it that prompt against the bundled demo data, and the agent will reproduce the stage. + +--- + +## Step 0 — Scope the question with `rai-discovery` + +> **Skill:** `rai-discovery` · +> **Prompt:** "We have an 8-stock portfolio book that breaks compliance on 4 holdings and 2 sectors. We want to rebuild the book under a Markowitz mean-variance objective with position and sector caps, deduplicate redundant bets via correlation clustering, and stress-test the resulting frontier under a crisis regime where correlations spike. Classify the sub-questions by reasoner family and tell us which RAI skills to load." + +Discovery classifies the question by reasoner family and tells you which downstream skills to load: + +| Sub-question | Reasoner | Skill | +|---|---|---| +| Where is the current book breaking compliance (per-stock, per-sector, per-trader)? | Rules | `rai-rules-authoring` | +| Which stocks are really the same bet (correlation clusters + cluster representative)? | Graph | `rai-graph-analysis` | +| What's the optimal allocation under position + sector caps for each (budget, regime) scenario? | Prescriptive | `rai-prescriptive-problem-formulation` | +| How does the optimal portfolio degrade under a PSD-preserving crisis covariance? | Prescriptive (re-solve) | `rai-prescriptive-solver-management` + `rai-prescriptive-results-interpretation` | + +Discovery's output is a *plan*, not code. Everything that follows materializes that plan. + +--- + +## Setup + +See the template's main `README.md` for installation, RAI connection setup, and how to run the script. The narrative below follows the actual stage outputs of `portfolio_balancing.py` against the bundled CSVs in `../data/`. 
+ +**Prerequisites** + +- Template's `data/` CSVs available — `returns.csv` (8 stocks), `covar.csv` (64 covariance entries), `users.csv`, `accounts.csv`, `holdings.csv`, `transactions.csv`. Or your own Snowflake schema with equivalent tables. +- `raiconfig.yaml` pointing at your RAI engine +- Python >= 3.10 with `relationalai >= 1.0.14` + +--- + +## Workflow + +The runbook walks the same chain stage-by-stage, prompt-by-prompt, in agent-skill order. Each row maps to a section of the script. + +| # | Step | Skill | Prompt | Expected Output | +|---|------|-------|--------|-----------------| +| 1 | Build ontology | `/rai-build-starter-ontology` | "Build a RAI ontology for portfolio balancing from the CSVs in `data/`. Concepts: Stock (index, ticker, sector, returns, covar), Sector (derived from Stock sectors), User (with risk_score), Account (balance, account_type), Holding (quantity, purchase_price, value), Transaction (amount, category, flagged indicator)." | Model `portfolio` with 6 user-facing concepts. 8 Stocks across 5 sectors (Technology x3, Healthcare x2, Financials, Energy, Consumer Staples). 6 Users, 4 Accounts, 15 Holdings, 21 Transactions. Pairwise `Stock.covar(i, j)` two-arg property loaded from the long-form covariance CSV (64 entries, symmetric). | +| 2 | Discovery | `/rai-discovery` | "What questions can we answer with this ontology? We want to surface compliance violations on the current book, then rebuild it under a Markowitz objective with crisis-regime stress." | Rules: position-limit, sector-concentration, high-risk-trader flags as derived Relationships. Graph: covariance clustering (Louvain on |correlation| >= threshold) -> per-cluster representative by Sharpe -> investable-universe collapse. Prescriptive: bi-objective QP via epsilon constraint over the representative-only universe, indexed by a `Scenario` Concept that combines budget and regime. Stress: same `solve_epsilon` call under a PSD-preserving crisis covariance. 
Predictive: DATA_GAP (no time series). | +| 3 | Stage 1 — Compliance: overconcentrated holdings | `/rai-rules-authoring` | "Define `Holding.value = quantity * purchase_price`, then flag `Holding.is_overconcentrated` whenever `value > POSITION_LIMIT * Account.balance` (default 15%)." | 4 holdings flagged: AAPL (Account 1, 18.0%), MSFT (Account 1, 16.0%), JNJ (Account 4, 16.0%), PFE (Account 4, 16.2%). All sit in Account 1 (Alice Chen, balance $100K) or Account 4 (Dan Wilson, balance $80K) — both accounts have multiple positions clipping the 15% cap. | +| 4 | Stage 1 — Compliance: sector concentration | `/rai-rules-authoring` | "Aggregate `Holding.value` per (Account, Sector) and flag `Holding.is_sector_concentrated` whenever the sector total > `SECTOR_LIMIT * Account.balance` (default 30%)." | 2 (account, sector) pairs flagged: Account 1 Technology 34.0% (AAPL + MSFT), Account 4 Healthcare 32.2% (JNJ + PFE). Pattern: the same accounts flagged in step 3 stack their positions within a sector. Stage 3 enforces the same 30% cap as a hard constraint. | +| 5 | Stage 1 — Compliance: high-risk traders | `/rai-rules-authoring` | "Flag `User.is_high_risk_trader` whenever `risk_score > 0.8` AND > 5 flagged transactions." | 2 users flagged: Alice Chen (risk 0.85), Eve Taylor (risk 0.92). Both have multiple `Transaction.is_flagged == True` rows in the bundled data. Standalone signal — used as a reviewer trigger, not a solver constraint. | +| 6 | Stage 2 — Derive volatility + correlation in PyRel | `/rai-ontology-design` | "Add `Stock.variance` (covariance diagonal where i == j), `Stock.volatility = sqrt(variance)`, and pairwise `Stock.correlation(i, j) = covar(i, j) / (vol_i * vol_j)`. All in PyRel — no numpy precompute." | `Stock.variance` and `Stock.volatility` written for all 8 stocks. `Stock.correlation` two-arg property populated from the 64 covariance pairs. Volatilities span ~0.06 to ~0.10 (small dataset, daily-scale).
Storing in the ontology lets every downstream stage read the same source of truth instead of recomputing. | +| 7 | Stage 2 — Cluster the correlation graph | `/rai-graph-analysis` | "Build an undirected `Graph` with `Stock` as `node_concept`. Add an edge wherever `|correlation| >= CORR_THRESHOLD` (default 0.3) and `i < j` (deduplicate symmetric pairs). Run Louvain and persist `Stock.cluster`." | 4 edges retained (8 stocks, sparse graph). Louvain yields 5 communities: Cluster 1 = {JNJ, PFE} (Healthcare), Cluster 2 = {AAPL, MSFT, GOOGL} (Technology), plus singletons Cluster 3 = {JPM}, Cluster 4 = {PG}, Cluster 5 = {XOM}. Intra-cluster avg correlation = +0.683, inter-cluster = +0.131 (~5x separation — clean). | +| 8 | Stage 2 — Pick cluster representatives by Sharpe | `/rai-graph-analysis` | "Define `Stock.sharpe = returns / volatility`. For each cluster, the stock whose Sharpe equals the cluster max is the representative — set `Stock.is_representative`. Singletons are their own representative. Define `Stock.is_non_representative` as the positive complement (the prescriptive rewriter doesn't accept `model.not_(...)` inside a solver `.where()`)." | 5 representatives picked from 8 stocks: Cluster 1 PFE (Healthcare, Sharpe 0.530), Cluster 2 GOOGL (Technology, 0.605), Cluster 3 JPM (Financials, 0.500), Cluster 4 PG (Consumer Staples, 0.444), Cluster 5 XOM (Energy, 0.588). Investable universe shrinks from 8 to 5. Non-representatives — AAPL, MSFT, JNJ — get `Stock.is_non_representative` and are forced to zero in Stage 3. | +| 9 | Stage 3 — Scenario + regime + decision variable | `/rai-prescriptive-problem-formulation` | "Define `Regime` (`base`, `crisis`) and a `Scenario` Concept indexed by name with `budget` and `regime` properties. Load 6 scenarios = {500, 1000, 2000} x {base, crisis}. 
Add `Stock.regime_covar(i, j, Regime) = base covariance for base; alpha * covar(i,j) + (1 - alpha) * vol_i * vol_j for crisis` (PSD-preserving correlation shrinkage in covariance units, alpha = 0.7). Decision variable: `Stock.x_quantity(Stock, Scenario)` continuous." | 1 `Stock.x_quantity` property, 8 stocks x 6 scenarios = 48 continuous variables (18 of which will be hard-zero from the representative-only constraint: 3 non-representatives x 6 scenarios). Regime-conditioned covariance lives in the ontology — the QP picks each scenario's matching regime without any branching in solver code. | +| 10 | Stage 3 — Compliance constraints on the decision variable | `/rai-prescriptive-problem-formulation` | "Add: non-negative (`x_qty >= 0`); fully invested per scenario (`sum(x) == Scenario.budget`); per-rep position cap (`x_qty <= REP_POSITION_LIMIT * Scenario.budget`, default 0.30); per-sector cap (`sum(x_qty per sector) <= SECTOR_LIMIT * Scenario.budget`, default 0.30); representative-only (`x_qty == 0` where `Stock.is_non_representative()`)." | 5 constraint families. `REP_POSITION_LIMIT = 0.30` is intentionally higher than Stage 1's `POSITION_LIMIT = 0.15`: a representative carries its cluster's combined exposure, and feasibility requires `REP_POSITION_LIMIT * num_reps >= 1.0` (5 x 0.30 = 1.5, OK). `SECTOR_LIMIT = 0.30` is reused verbatim from Stage 1's compliance threshold — same parameter binds the existing book and the rebuilt one. | +| 11 | Stage 3 — Anchors + epsilon sweep | `/rai-prescriptive-solver-management` | "Anchor 1: minimize risk (no return constraint). Anchor 2: maximize return. Compute return-rate range across all 6 scenarios. Then sweep 5 interior epsilon-rate points uniformly across the rate range and re-solve `min risk s.t. return_rate >= eps_rate`. Use Ipopt, time limit 60s." | Per-scenario rate range: [0.0634, 0.0840] per unit invested (regime-independent because `Stock.returns` is regime-independent — only the covariance differs).
7 solves total: 2 anchors + 5 epsilon points = 42 optimal portfolios (6 scenarios x 7 points), all `LOCALLY_SOLVED`. base_500 anchor 1: return 32.43, risk 1160.39. base_500 anchor 2: return 42.00. crisis_500 anchor 1: return 31.69, risk 1913.60 — same investable universe, different regime covariance. | +| 12 | Stage 3 — Pareto + knee | `/rai-prescriptive-results-interpretation` | "For each scenario, list the 7-point frontier (return, risk). Print marginal `delta_risk / delta_return` between adjacent points and identify the knee — where the marginal jumps most." | base_500: returns 32.43 -> 33.41 -> 35.12 -> 36.84 -> 38.56 -> 40.28; risk 1160 -> 1177 -> 1263 -> 1386 -> 1546 -> 1742. Marginal climbs 16.85 -> 49.94 -> 71.72 -> 93.03 -> 114.43. Knee at Point 2 (`eps_1`) — marginal cost jumps ~3x beyond. base_1000 / base_2000 / crisis_* show the same shape (risk scales as budget^2 because the QP is quadratic, but the rate-form frontier is identical). | +| 13 | Stage 4 — Crisis stress comparison | `/rai-prescriptive-results-interpretation` | "From the Stage 3 sweep, emit a side-by-side `vol_base` vs `vol_crisis` table per (budget, lambda). Vol = sqrt(risk). Compute the absolute and percentage gap." | Same 7-point frontier resolved at each budget x regime. Crisis vol sits ~25-30% above base at every lambda. Budget 500: at `min_risk` vol_base 34.06 / vol_crisis 43.74 (+28.4%); at `eps_1` 34.30 / 44.54 (+29.8%); peaks at `eps_1`-`eps_2`; narrows to +25.2% at `eps_5`. The gap peaks mid-frontier and narrows toward the concentrated end — the inversion is the payoff of the cluster collapse: at the concentrated end the optimizer holds the highest-Sharpe distinct bet per cluster (weighted toward Energy/Consumer Staples here, which carry lower crisis correlations than the middle of the frontier). Without the representative collapse, the concentrated end would stack near-duplicates and the crisis gap would grow instead of shrink. 
| + +--- + +## Stage 1 — Rules: compliance scan + +> **Skill:** `rai-rules-authoring` · +> **Prompt:** "Add a derived `Holding.value` from `quantity * purchase_price`, then author three compliance flags as Relationships on the existing book. Flag a holding as overconcentrated when its value exceeds 15% of the parent account's balance. Flag it as sector-concentrated when the per-account, per-sector total of holding values exceeds 30% of balance. Flag a user as a high-risk trader when their risk score is above 0.8 and they have more than five flagged transactions." + +``` +COMPLIANCE VIOLATIONS — current book (4 accounts, 15 holdings, 6 users) + + Rule 1: Holding.is_overconcentrated (position > 15% of balance) + ──────────────────────────────────────────────────────────────── + AAPL Account 1 $18,000 / $100,000 18.0% ─── Alice Chen + MSFT Account 1 $16,000 / $100,000 16.0% + JNJ Account 4 $12,800 / $80,000 16.0% ─── Dan Wilson + PFE Account 4 $13,000 / $80,000 16.2% + + Rule 2: Holding.is_sector_concentrated (sector > 30% of balance) + ──────────────────────────────────────────────────────────────── + Account 1 Technology $34,000 / $100,000 34.0% + Account 4 Healthcare $25,800 / $80,000 32.2% + + Rule 3: User.is_high_risk_trader (risk_score > 0.8 AND >5 flagged txns) + ──────────────────────────────────────────────────────────────── + Alice Chen risk_score 0.85 ── flagged transactions + Eve Taylor risk_score 0.92 + + ────────────────────────────────────────────────────────────────── + The same accounts that breach the per-stock cap also breach the + sector cap — Stage 3 will use the SECTOR_LIMIT (0.30) as a hard + constraint when rebuilding both books. The trader flag is a + reviewer signal, not a solver input. + ────────────────────────────────────────────────────────────────── + + Holding.is_overconcentrated [4] + Holding.is_sector_concentrated [2] + User.is_high_risk_trader [2] +``` + +`POSITION_LIMIT` (0.15) and `SECTOR_LIMIT` (0.30) are top-level constants. 
`SECTOR_LIMIT` is reused verbatim by Stage 3; `POSITION_LIMIT` is replaced in Stage 3 by `REP_POSITION_LIMIT = 0.30` because a representative carries its cluster's combined exposure (Stage 1's 0.15 would be infeasible, since 5 reps x 0.15 = 0.75 cannot cover a fully invested budget, and even 0.20 would pin it exactly at 5 x 0.20 = 1.00; 0.30 leaves headroom). + +--- + +## Stage 2 — Graph: covariance clustering + cluster representatives + +> **Skill:** `rai-graph-analysis` · +> **Prompt:** "Derive per-stock variance from the covariance diagonal, volatility as the square root of variance, and pairwise correlation as covariance over the product of volatilities — all in PyRel, no numpy precompute. Build an undirected stock graph with an edge wherever the absolute correlation is at least 0.3 and the index pair is deduplicated. Run Louvain community detection and persist the cluster id back onto each stock. Then derive a Sharpe ratio per stock (returns over volatility), and flag the stock whose Sharpe equals the cluster maximum as the cluster representative; flag the others as non-representative so the optimizer can force them to zero." + +**Construction** — undirected, unweighted graph: +- Node concept: `Stock` (8 nodes) +- Edges built from the derived `Stock.correlation(i, j)` property where `|correlation| >= 0.3` and `i < j` +- Aggregator: `"sum"` (no parallel edges expected) + +**Algorithm:** `louvain()` for community detection. + +**Volatility, correlation, and crisis covariance are all PyRel derived properties** — no numpy precompute. The covariance matrix loaded from `covar.csv` is the only solver input not derived from another property.
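Outside RAI, the same derivations can be sanity-checked in plain Python (illustrative two-stock numbers, not the bundled `covar.csv`):

```python
import math

# Plain-Python sketch of Stage 2's derivations. The template does this
# in PyRel; the covariance values below are illustrative only.
covar = {
    ("AAPL", "AAPL"): 0.0064, ("MSFT", "MSFT"): 0.0049,
    ("AAPL", "MSFT"): 0.0042, ("MSFT", "AAPL"): 0.0042,
}
stocks = {"AAPL", "MSFT"}

variance = {s: covar[(s, s)] for s in stocks}                # diagonal, i == j
volatility = {s: math.sqrt(v) for s, v in variance.items()}  # sqrt(variance)

def correlation(i, j):
    # correlation(i, j) = covar(i, j) / (vol_i * vol_j)
    return covar[(i, j)] / (volatility[i] * volatility[j])

rho = correlation("AAPL", "MSFT")   # 0.0042 / (0.08 * 0.07) = 0.75
edge = abs(rho) >= 0.30             # passes the CORR_THRESHOLD edge test
```

The keep-in-the-ontology point stands regardless: these derived values feed the edge test, the Sharpe ratio, and the crisis covariance without being recomputed per stage.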
+ +``` +DERIVED IN PYREL + Stock.variance <- covar(i, j) where i == j + Stock.volatility <- sqrt(variance) + Stock.correlation(i,j) <- covar(i, j) / (vol_i * vol_j) + +CORRELATION GRAPH + Edges with |correlation| >= 0.30: 4 + Stocks above threshold pairwise: {AAPL,MSFT}, {AAPL,GOOGL}, + {MSFT,GOOGL}, {JNJ,PFE} + +LOUVAIN COMMUNITIES 5 clusters + Cluster 1 (size 2): JNJ (Healthcare), PFE (Healthcare) + Cluster 2 (size 3): AAPL (Technology), MSFT (Technology), + GOOGL (Technology) + Cluster 3 (size 1): JPM (Financials) ← singleton + Cluster 4 (size 1): PG (Consumer Staples) ← singleton + Cluster 5 (size 1): XOM (Energy) ← singleton + + Avg correlation: intra-cluster = +0.683 + inter-cluster = +0.131 ── ~5x separation +``` + +Singletons (Cluster 3-5) are their own representatives. The non-trivial choice happens inside Cluster 1 (JNJ vs PFE) and Cluster 2 (AAPL/MSFT/GOOGL): + +``` +REPRESENTATIVE = HIGHEST SHARPE PER CLUSTER (returns / volatility) + + Cluster 1 (Healthcare): + JNJ Sharpe 0.500 PFE Sharpe 0.530 ← REP + + Cluster 2 (Technology): + AAPL Sharpe 0.582 MSFT Sharpe 0.560 + GOOGL Sharpe 0.605 ← REP + + Cluster 3 (Financials): JPM Sharpe 0.500 ← REP (singleton) + Cluster 4 (Consumer Staples): PG Sharpe 0.444 ← REP (singleton) + Cluster 5 (Energy): XOM Sharpe 0.588 ← REP (singleton) + + ────────────────────────────────────────────────────────────────── + Investable universe collapses 8 -> 5. + AAPL, MSFT, JNJ get Stock.is_non_representative — Stage 3 forces + their decision variables to zero. + + This is "collapse, don't cap" — Stage 3 doesn't allow the full 8 + with caps inside a redundant cluster; it removes the duplicates + before the optimizer sees them. 
+ ────────────────────────────────────────────────────────────────── + + ✓ Stock.variance / volatility / correlation written back [8 / 8 / 64] + ✓ Stock.cluster, Stock.sharpe, Stock.cluster_max_sharpe [8 each] + ✓ Stock.is_representative [5] + ✓ Stock.is_non_representative [3] +``` + +--- + +## Stage 3 — Prescriptive: bi-objective QP with epsilon constraint + +> **Skill:** `rai-prescriptive-problem-formulation` · +> **Prompt:** "Formulate a Markowitz mean-variance QP with a continuous decision variable `Stock.x_quantity` indexed by a `Scenario` Concept that combines budget and regime — load the six scenarios for budgets {500, 1000, 2000} crossed with regimes {base, crisis}. Add five constraint families per scenario: non-negativity, fully invested at the scenario budget, per-representative position cap at 30% of budget, per-sector cap at 30% of budget, and a hard zero on every non-representative stock. Anchor the frontier with a min-risk solve and a max-return solve, then sweep five interior epsilon-rate points uniformly across the resulting return-rate range, each one re-solving min-risk subject to a per-scenario return floor that scales with budget. Use Ipopt with a 60-second time limit." + +``` +FORMULATION + + Decision variable + Stock.x_quantity(Stock, Scenario) continuous, >= 0 + 8 stocks x 6 scenarios = 48 vars + 40 forced to 0 by Stock.is_non_representative() + 8 active = 5 representatives x — wait: 5 reps x 6 scenarios = 30 + (the 18 singleton non-rep slots are also forced — same effect) + + Scenarios (3 budgets x 2 regimes = 6 tuples) + base_500 base_1000 base_2000 crisis_500 crisis_1000 crisis_2000 + + Constraints (per scenario) + 1. Non-negative x_qty >= 0 + 2. Fully invested sum(x_qty) == Scenario.budget + 3. Per-rep position x_qty <= 0.30 * Scenario.budget + 4. Per-sector sum(x_qty per sector) <= 0.30 * Scenario.budget + 5. 
Representative-only x_qty == 0 where is_non_representative + + Risk objective (regime-aware) + minimize Sigma_ij regime_covar(i, j, Scenario.regime) * x_i * x_j + └── PyRel-derived per regime; PSD-preserving for crisis ──┘ + + Return constraint (epsilon, scaled by budget) + sum(Stock.returns * x_qty) >= eps_rate * Scenario.budget + +────────────────────────────────────────────────────────────────────── +SOLVE (Ipopt, time limit 60s) → LOCALLY_SOLVED + Anchor 1 (min risk) + Anchor 2 (max return) + 5 epsilon points + = 7 solves, 42 optimal portfolios (one per scenario per point) +────────────────────────────────────────────────────────────────────── + +ANCHOR 1 — minimize risk (no return floor) + base_500 return 32.43 risk 1,160.39 + base_1000 return 64.87 risk 4,641.57 + base_2000 return 129.73 risk 18,566.28 + crisis_500 return 31.69 risk 1,913.60 ← higher risk, same universe + crisis_1000 return 63.37 risk 7,654.40 + crisis_2000 return 126.75 risk 30,617.59 + +ANCHOR 2 — maximize return + base_500 / crisis_500 return 42.00 + base_1000 / crisis_1000 return 84.00 + base_2000 / crisis_2000 return 168.00 + (returns are regime-independent) + +Return-rate range [0.0634, 0.0840] per unit invested +Epsilon sweep 5 interior points evenly spaced across the range +``` + +--- + +## Stage 3 — Reading the frontier (per scenario) + +> **Skill:** `rai-prescriptive-results-interpretation` · +> **Prompt:** "For each of the six scenarios, list the seven-point Pareto frontier (return, risk) from the anchors and epsilon points. Compute the marginal `delta_risk / delta_return` between adjacent points and identify the knee — the point where the marginal cost jumps the most relative to the previous step. Confirm the rate-form frontier shape is consistent across budgets (risk scales as budget squared because the QP is quadratic in x, but the rate-form curve and knee location are budget-independent)." 
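The knee rule in the prompt above can be sketched in plain Python against the base_500 frontier printed below; because it recomputes marginals from rounded (return, risk) values, the numbers differ slightly from the script's own output:

```python
# Knee detection as the prompt describes: compute the marginal
# delta_risk / delta_return between adjacent frontier points and find
# where it jumps most. Values are this runbook's base_500 frontier.
frontier = [  # (return, risk)
    (32.43, 1160.39), (33.41, 1176.78), (35.12, 1262.61),
    (36.84, 1385.89), (38.56, 1545.79), (40.28, 1742.47),
]

marginals = [
    (risk_b - risk_a) / (ret_b - ret_a)
    for (ret_a, risk_a), (ret_b, risk_b) in zip(frontier, frontier[1:])
]
# Knee = point whose outgoing marginal jumps most relative to its
# incoming marginal (~3x here, between eps_1 and eps_2).
jumps = [later / earlier for earlier, later in zip(marginals, marginals[1:])]
knee_index = jumps.index(max(jumps)) + 1   # frontier[1] == eps_1
```

The same logic works per scenario: the rate-form frontier is budget-independent, so the knee lands at `eps_1` for every budget.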
+ +``` +EFFICIENT FRONTIER — base_500 (budget = 500, regime = base) + + # Label Return Risk + ──────────────────────────────────────── + 1 min_risk 32.43 1,160.39 + 2 eps_1 33.41 1,176.78 ← KNEE + 3 eps_2 35.12 1,262.61 + 4 eps_3 36.84 1,385.89 + 5 eps_4 38.56 1,545.79 + 6 eps_5 40.28 1,742.47 + 7 max_return 42.00 (separate anchor) + + Marginal delta_risk / delta_return: + min_risk → eps_1 16.85 + eps_1 → eps_2 49.94 ← +3.0x ── KNEE + eps_2 → eps_3 71.72 + eps_3 → eps_4 93.03 + eps_4 → eps_5 114.43 + + ────────────────────────────────────────────────────────────────── + base_1000 / base_2000 / crisis_* show the SAME shape — risk scales + as budget^2 (the QP is quadratic in x), but the rate-form frontier + and the knee location are budget-independent. + ────────────────────────────────────────────────────────────────── + + ✓ Stock.x_quantity written back, indexed by (Stock, Scenario) +``` + +--- + +## Stage 4 — Crisis stress test + +> **Skill:** `rai-prescriptive-solver-management` + `rai-prescriptive-results-interpretation` · +> **Prompt:** "Define a regime-conditioned covariance `Stock.regime_covar(i, j, Regime)` derived in PyRel: for the base regime use the input covariance unchanged, and for the crisis regime use a PSD-preserving correlation shrinkage toward all-ones with weight 0.7 on the base covariance plus 0.3 on the outer product of volatilities. The Stage 3 solver call is unchanged — `Scenario.regime` selects the matching `regime_covar` per scenario, so all six base and crisis scenarios solve in one call. After the sweep, emit a side-by-side `vol_base` versus `vol_crisis` table per (budget, epsilon point) where `vol = sqrt(risk)`, with the absolute and percentage gap, and explain whether the gap peaks mid-frontier or at the concentrated end." 
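The PSD claim in the prompt can be verified numerically on a toy 2x2 matrix (illustrative values, not the bundled covariance):

```python
import math

# Sketch of the crisis covariance from the prompt above:
#   Sigma_crisis(i,j) = alpha * Sigma(i,j) + (1 - alpha) * vol_i * vol_j
# i.e. correlation shrinkage toward all-ones, in covariance units.
# A convex combination of two PSD matrices (Sigma and the rank-one
# outer product vol vol^T) is PSD, so the crisis QP stays well-posed.
alpha = 0.7
sigma = [[0.0064, 0.0042],
         [0.0042, 0.0049]]
vol = [math.sqrt(sigma[0][0]), math.sqrt(sigma[1][1])]

crisis = [[alpha * sigma[i][j] + (1 - alpha) * vol[i] * vol[j]
           for j in range(2)] for i in range(2)]

# For a symmetric 2x2, PSD <=> non-negative diagonal and determinant.
det = crisis[0][0] * crisis[1][1] - crisis[0][1] * crisis[1][0]
assert crisis[0][0] >= 0 and det >= 0

# Diagonal is unchanged (vol_i * vol_i == variance), so volatilities
# are preserved and only correlations move toward +1:
rho_base = sigma[0][1] / (vol[0] * vol[1])        # 0.75
rho_crisis = crisis[0][1] / (vol[0] * vol[1])     # 0.7*0.75 + 0.3 = 0.825
```

This is why the swap needs no new model: the ontology holds both covariances, and `Scenario.regime` picks which one the quadratic objective reads.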
+ +Same `solve_epsilon` call, no separate model — `Scenario.regime` selects between two `Stock.regime_covar` definitions: + +- `base`: Sigma(i, j) +- `crisis`: alpha * Sigma(i, j) + (1 - alpha) * vol_i * vol_j (alpha = 0.7) + +The crisis formula is correlation shrinkage toward all-ones (`rho_crisis = alpha * rho + (1 - alpha) * J`) re-expressed in covariance units. PSD is preserved by construction (convex combination of PSD matrices), so every lambda solves cleanly. + +``` +VOLATILITY COMPARISON vol = sqrt(risk) + + Budget 500: + Label vol_base vol_crisis gap gap_% + ──────────────────────────────────────────────────── + min_risk 34.06 43.74 +9.68 +28.4% + eps_1 34.30 44.54 +10.24 +29.8% ← peak + eps_2 35.53 46.11 +10.58 +29.8% + eps_3 37.23 47.94 +10.72 +28.8% + eps_4 39.32 49.99 +10.68 +27.2% + eps_5 41.74 52.27 +10.53 +25.2% + + Budget 1000 + Budget 2000: same gap_% pattern (vol scales with budget; + the rate-form gap is budget-independent). + + ────────────────────────────────────────────────────────────────── + Crisis vol sits 25-30% above base at EVERY lambda. The gap PEAKS in + the middle of the frontier (eps_1..eps_2 at +29.8%) and NARROWS at + the concentrated end (eps_5 at +25.2%). + + Why the inversion: at the concentrated end the optimizer is picking + the highest-Sharpe distinct bet per cluster — the bundled data + weights this toward Energy + Consumer Staples, which happen to + carry lower crisis correlations than the middle of the frontier. + Without the representative collapse, the concentrated end would + stack near-duplicates and the crisis gap would GROW, not shrink. + ────────────────────────────────────────────────────────────────── + + ✓ Stock.regime_covar written back (64 base + 64 crisis = 128 entries) +``` + +**Why `Scenario` is a Concept, not a loop:** all 6 (budget, regime) tuples solve in a single call to the solver, against the matching `regime_covar`. 
Adding a fourth regime or a fifth budget is a data edit in `scenario_data`, not a change to `solve_epsilon`. Scenarios are data. + +--- + +## Stage outputs — what each reasoner contributes back + +``` +ONTOLOGY ENRICHMENT — what each stage wrote back + + Stage 1 (rules) Holding.value [15] + Holding.is_overconcentrated [4] + Holding.is_sector_concentrated [2] + User.is_high_risk_trader [2] + + Stage 2 (graph) Stock.variance [8] + Stock.volatility [8] + Stock.correlation (i, j) [64] + Stock.cluster [8] + Stock.sharpe [8] + Stock.cluster_max_sharpe [8] + Stock.is_representative [5] + Stock.is_non_representative [3] + + Stage 3 (prescriptive) Stock.regime_covar (i, j, Regime) [128] + Stock.x_quantity (Stock, Scenario) [48] + + Stage 4 (stress) (terminal — vol_base vs vol_crisis table) + + ────────────────────────────────────────────────────────────────── + Each stage reads what the previous stage wrote. + Re-running any downstream stage automatically picks up enrichments. + No glue code, no DataFrame round-trip — same ontology throughout. + ────────────────────────────────────────────────────────────────── +``` + +--- + +## The chain — accretive ontology enrichment + +``` +THE PORTFOLIO-BALANCING CHAIN + + STAGE 1 RULES + "Where is the current book breaking compliance?" + reads: Holding, Account, User, Transaction, Stock.sector + writes: Holding.value + Holding.is_overconcentrated / is_sector_concentrated + User.is_high_risk_trader + │ + ▼ + STAGE 2 GRAPH (Louvain) + "Which stocks are really the same bet?" + reads: Stock.covar (loaded), Stock.returns + writes: Stock.variance / volatility / correlation (i, j) + Stock.cluster ── 5 communities + Stock.sharpe / cluster_max_sharpe + Stock.is_representative ── 5 of 8 stocks + Stock.is_non_representative ── forced to zero in Stage 3 + │ + ▼ + STAGE 3 PRESCRIPTIVE (Ipopt QP) + "What's the optimal allocation under position + sector caps, + for each (budget, regime) scenario?" 
+ reads: Stock.is_representative ──► decision-variable scope + Stock.is_non_representative ──► hard-zero constraint + Stock.returns ──► epsilon return target + Stock.regime_covar ──► quadratic risk objective + Scenario.budget / regime ──► per-scenario constraints + SECTOR_LIMIT (= Stage 1's) ──► hard sector cap + writes: Stock.regime_covar (PSD-preserving) [base + crisis] + Stock.x_quantity (Stock, Scenario) [42 active portfolios] + │ + ▼ + STAGE 4 STRESS (same solver, regime swap) + "How does the optimal portfolio degrade under a crisis covariance?" + reads: Stock.regime_covar (regime = "crisis") + Stage 3's pareto results + writes: (terminal — vol gap table) + + ────────────────────────────────────────────────────────────────── + No glue. No DataFrame ping-pong. No re-derivation per-reasoner. + Three reasoner families, one ontology, one accretive thread. + Stage 4 is a regime swap on the same solve — not a separate model. + ────────────────────────────────────────────────────────────────── +``` + +--- + +## Why the chain matters (vs. 
any single stage) + +| Stage alone | What it tells you | What it doesn't | +|---|---|---| +| Rules alone | "4 holdings + 2 sectors break compliance" | How to rebuild the book | +| Graph alone | "AAPL/MSFT/GOOGL move together; JNJ/PFE move together" | Which to keep, how much to allocate | +| Prescriptive alone (full universe) | A "diversified" frontier that stacks near-duplicates inside a cluster | That two of those names are one bet — and that the crisis gap will grow under stress | +| Prescriptive alone (single regime) | A base-case efficient frontier | What it costs you when correlations spike | + +| Combined | Output | +|---|---| +| Rules → Graph | Compliance violations + redundant-bet map | +| + Prescriptive (rep-only universe) | 7-point Pareto frontier per scenario; knee at eps_1 | +| + Stress (Scenario.regime swap) | Crisis vol 25-30% above base, gap narrows at concentrated end (the cluster-collapse payoff) | + +**Multi-reasoner chaining grounded in (and contributing to) the ontology.** + +--- + +## Crisis Regime Construction + +**Do not** naively scale off-diagonal covariance by a constant — it frequently breaks positive semidefiniteness, the lambda=0 pure min-variance solve fails (Ipopt rejects non-convex QP), and the whole frontier anchors wrong. PSD-preserving alternatives: + +| Approach | Formula | Preserves PSD? | +|----------|---------|----------------| +| **Correlation shrinkage toward all-ones** (this template) | `rho' = alpha * rho + (1 - alpha) * J`, alpha in [0.6, 0.9]. 
In covariance units: `cov'(i,j) = alpha * cov(i,j) + (1 - alpha) * vol_i * vol_j` | Yes (convex combination of PSD matrices) | +| Eigenvalue flooring | Eigendecompose, replace lambda_i with max(lambda_i, eps), recompose | Yes by construction | +| Scale off-diagonals + PSD projection | Scale, then find nearest PSD matrix via eigendecomposition | Yes after projection | +| ~~Uniform 1.5x off-diagonals~~ | `Sigma'[i,j] = 1.5 * Sigma[i,j]` for i != j | **No** — frequently non-PSD | + +`CRISIS_ALPHA = 0.7` produces a clearly "crisis-like" regime (intra-cluster correlations bump toward 1) while keeping the QP well-conditioned at every lambda. Lower values (0.5-0.6) give more severe crises; values > 0.9 get close to the base case. + +--- + +## Data Reference + +- **Source data**: bundled CSVs in `../data/` (8 stocks across 5 sectors with 64-entry symmetric covariance, 6 users, 4 accounts, 15 holdings, 21 transactions). To run against your own Snowflake schema instead, swap the `read_csv(...)` loads for `model.Table(...)` references in `portfolio_balancing.py`; the rest of the pipeline is unchanged. +- **Stages**: implemented in `../portfolio_balancing.py` as a single combined script with stage banners (Stage 1 → Stage 4). +- **Ontology**: 6 user-facing concepts (`Stock`, `Sector`, `User`, `Account`, `Holding`, `Transaction`) plus the `Regime` and `Scenario` concepts introduced in Stage 3. Run `inspect.schema(model)` after the pipeline (see template README) to dump the full concept/property/relationship surface, filtering out reasoner-owned concepts (`Variable`, `Constraint`, etc.) and the auto-generated `graph_Edge` from Stage 2. + +--- + +## Adapting this recipe to a new domain + +The chain pattern transfers cleanly. To rebuild for a different problem: + +1. Re-run `rai-discovery` on the new business question — does it actually need all four reasoner roles (rules, graph, prescriptive, prescriptive re-solve), or is one or two sufficient? 
A pure compliance audit stops after Stage 1; a rebalancer without redundant bets in the universe can skip Stage 2. +2. Strip the demo ontology to the concepts the new chain needs (lean is better for type inference and solver compile time). For portfolio variants, the load-bearing concepts are the asset entity (here `Stock`), its pairwise covariance, an entity that holds compliance thresholds (here `Account`), and the `Scenario` Concept that parameterizes the optimizer. +3. Stage 1 (rules) is where every threshold the optimizer will later enforce gets named once and reused — keep `SECTOR_LIMIT` (or its equivalent) shared between the compliance scan and the prescriptive constraints, so the rebuilt book obeys the same caps the diagnostic flagged. +4. Stages 2–4 are the load-bearing chain: Graph collapses redundant bets via clustering + per-cluster representative selection, writing `is_representative` and `is_non_representative` flags the optimizer reads. Prescriptive uses those flags to scope the decision variable and adds the position + sector caps. The stress-test stage is the *same solver call* under a regime-swapped covariance — `Scenario.regime` and `Stock.regime_covar(i, j, Regime)` keep base and crisis as data, not separate models. +5. Keep the validation checks at every stage: assert flagged-set size, the cluster count and intra-vs-inter average correlation gap, anchor solves return `LOCALLY_SOLVED`, the return-rate range is non-degenerate, and `REP_POSITION_LIMIT * num_representatives >= 1.0` so the fully-invested constraint stays feasible. +6. When constructing a stress regime, never naively scale off-diagonal covariance — eigenvalue flooring, projection to nearest PSD, or correlation shrinkage toward all-ones (this template) all preserve PSD; arbitrary scaling does not, and the min-risk anchor will fail. + +The shape this template demonstrates — *each reasoner writes a property the next reasoner reads* — is what makes the chain accretive rather than serial. 
The agent skills are how you reliably author each link. diff --git a/v1/supply_chain_resilience/references/runbook.md b/v1/supply_chain_resilience/references/runbook.md index 0fac804..3d91928 100644 --- a/v1/supply_chain_resilience/references/runbook.md +++ b/v1/supply_chain_resilience/references/runbook.md @@ -41,6 +41,35 @@ A single-reasoner approach can't answer this. Reachability alone names the suppl --- +## How to read this runbook + +This runbook serves two audiences: + +- **Reading top-to-bottom**: the narrative + ASCII visualizations show what the chain produces stage-by-stage, with the same business framing the stakeholder would see. +- **Per-stage skill blocks**: the boxed `Skill / Prompt` callout at the start of each stage is the recipe — load that RAI agent skill, give it that prompt against the bundled demo data in `../data/`, and the agent will reproduce the stage. + +The two views share one ontology: each stage writes properties back to the model that the next stage reads. The skill blocks are how an engineer rebuilds the chain link-by-link; the narrative is how a stakeholder reads what the chain produced. + +--- + +## Step 0 — Scope the question with `rai-discovery` + +> **Skill:** `rai-discovery` · +> **Prompt:** "Given a supply-chain ontology with sites, businesses (suppliers / manufacturers / warehouses / buyers), shipping operations, SKUs, demand orders, historical shipments, and per-supplier quarterly delay predictions, what questions can each reasoner family answer for a risk-adjusted routing problem? I need to know which high-priority customers are exposed to which suppliers, which sites are bottlenecks, which suppliers are unreliable, and what the minimum-cost flow looks like once those risks are priced in." + +Discovery classifies the question by reasoner family and tells you which downstream skills to load: + +| Sub-question | Reasoner | Skill | +|---|---|---| +| "Which suppliers do my high-priority customers transitively depend on?" 
| Graph (reachability) | `rai-graph-analysis` | +| "Which sites are network bottlenecks?" | Graph (centrality, components) | `rai-graph-analysis` | +| "Which suppliers are risky? Which demands are escalated?" | Rules | `rai-rules-authoring` | +| "What's the minimum-cost flow plan that respects all of the above, and how much do disruptions cost?" | Prescriptive | `rai-prescriptive-problem-formulation`, `rai-prescriptive-solver-management`, `rai-prescriptive-results-interpretation` | + +Discovery's output is a *plan*, not code. Everything that follows materializes that plan. + +--- + ## Setup See the template's main `README.md` for installation, RAI connection setup, and how to run the script. The narrative below follows the actual stage outputs of `supply_chain_resilience.py` against the bundled CSVs in `../data/`. @@ -49,6 +78,8 @@ See the template's main `README.md` for installation, RAI connection setup, and ## Stage 0 — Reachability: blast-radius pre-analysis +> **Skill:** `rai-graph-analysis` · +> **Prompt:** "Build a **directed** business graph from shipment records, with edges going from each shipment's supplier to its customer. Then run upstream reachability **into** every business that holds at least one HIGH-priority demand, and filter the reachable set to nodes whose business type is SUPPLIER. The directed-not-undirected choice matters here: we want the suppliers that feed each high-priority customer's upstream cone, not the symmetric neighborhood. List, per high-priority customer, the suppliers it transitively depends on and their reliability scores." **Construction** — directed `Business` graph, edges from `Business.ships_to` (derived from `Shipment.supplier` -> `Shipment.customer`). 
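Stated outside RAI, the upstream-reachability pass is a reverse BFS over the `ships_to` edges. A minimal sketch with hypothetical business IDs (seven nodes, not the bundled 31):

```python
from collections import defaultdict, deque

# Hypothetical shipment-derived ships_to edges: (supplier, customer)
ships_to = [
    ("SUP-A", "CM-1"), ("SUP-B", "CM-1"), ("CM-1", "MFG-1"),
    ("SUP-C", "CM-2"), ("CM-2", "MFG-1"),
    ("MFG-1", "WH-1"), ("WH-1", "BUYER-1"),
]
business_type = {"SUP-A": "SUPPLIER", "SUP-B": "SUPPLIER", "SUP-C": "SUPPLIER",
                 "CM-1": "COMPONENT_MFG", "CM-2": "COMPONENT_MFG",
                 "MFG-1": "MANUFACTURER", "WH-1": "WAREHOUSE", "BUYER-1": "BUYER"}

inbound = defaultdict(set)  # customer -> businesses that ship to it
for src, dst in ships_to:
    inbound[dst].add(src)

def upstream_suppliers(customer):
    # Walk edges backwards from the customer, then keep SUPPLIER nodes only.
    seen, queue = set(), deque([customer])
    while queue:
        for up in inbound[queue.popleft()]:
            if up not in seen:
                seen.add(up)
                queue.append(up)
    return {b for b in seen if business_type[b] == "SUPPLIER"}

print(sorted(upstream_suppliers("BUYER-1")))  # ['SUP-A', 'SUP-B', 'SUP-C']
```

The direction is the point: an undirected neighborhood would also pull in downstream nodes, inflating the exposure set beyond the customer's actual supply cone.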
@@ -84,6 +115,8 @@ The point of running reachability before the MILP: when the scenario in Stage 3 ## Stage 1 — Graph: site centrality + connected components +> **Skill:** `rai-graph-analysis` · +> **Prompt:** "Build an **undirected, unweighted** site graph using SHIP-type operations as edges between source and output sites, with a sum aggregator to collapse parallel ship lanes. Compute weakly-connected components to surface fragmentation, then eigenvector centrality to rank hubs — restrict the centrality output to FACTORY and DC sites (drop STORE / OFFICE). Normalize the centrality scores to [0, 1] and write them back to the ontology as a per-site property so downstream stages can use them as a bottleneck weight." **Construction:** - Node concept: `Site` (31 sites) @@ -124,6 +157,8 @@ Top critical sites — eigenvector centrality (FACTORY/DC only) ## Stage 2 — Rules: supplier risk classification +> **Skill:** `rai-rules-authoring` · +> **Prompt:** "Add three derived flags to Business and one to Demand. A business is unreliable when its reliability score is below 0.80. A business has high delay risk when at least one Q1-2025 delay prediction for that supplier exceeds 0.15. A business is watch-level when either of those holds. A demand is escalated when its priority is HIGH. Chain the rules so watch-level fires from either underlying flag. Downstream, suppliers with **both** flags are the avoid set (hard-blocked in routing) and suppliers with **either** flag are the watch set (surcharged)." **Late-shipment context** (computed in pandas, not RAI): @@ -197,6 +232,8 @@ Stage 3 reads `is_watch_level` for the surcharge term and `is_unreliable AND has ## Stage 3 — Prescriptive: risk-adjusted minimum-cost flow +> **Skill:** `rai-prescriptive-problem-formulation` · +> **Prompt:** "Formulate a continuous minimum-cost network flow over the operations. 
The decision variable is per-operation flow, bounded by each operation's daily capacity, with a non-negative unmet-demand slack per demand order. Constraint: for each demand, inbound flow at the customer's site for the demanded SKU plus its slack must cover the order quantity. Hard-block any operation whose source business is in the avoid set (both Stage-2 flags fire). The objective minimizes transport cost plus a $5/unit surcharge on flow through watch-level suppliers, plus a centrality-weighted penalty on flow into bottleneck sites (using the normalized score from Stage 1, weight 2.0), plus a $100/unit unmet-demand penalty." ``` FORMULATION @@ -241,6 +278,8 @@ The baseline buys: enough finished-goods flow on the shortest cost-weighted lane ## Scenario analysis — quantify disruption +> **Skill:** `rai-prescriptive-solver-management` + `rai-prescriptive-results-interpretation` · +> **Prompt:** "Re-solve the same formulation under two disruptions and compare them to the baseline. Scenario A: take the highest-centrality site offline by adding a zero-flow constraint on every operation sourced from that site. Scenario B: downgrade every watch-level supplier to avoid by adding zero-flow blocks for every supplier in the union of the two Stage-2 flags. Report status, objective, active flow count, and unmet demand for each. Then explain *why* the cost deltas are asymmetric — the structural-vs-behavioural distinction is the punchline." The same `solve_flow(...)` function re-runs with modified constraints. Two scenarios surface different aspects of the chain's value: @@ -357,3 +396,17 @@ Each row is a single agent prompt. 
Skills are loaded in order; each writes prope - **Source data**: bundled CSVs in `../data/` — 31 sites across APAC / AMERICAS / EMEA, 31 businesses (6 suppliers, 6 component manufacturers, 2 manufacturers, 8 warehouses, 9 buyers), 9 SKUs (raw materials -> components -> finished goods ProPhone X1 / ProTab T1), 70 operations (SHIP + TRANSFER), 20 demand orders (9 HIGH, 5 MEDIUM, 6 LOW), 262 historical shipments (37 late), 36 quarterly delay predictions (4 quarters × 9 suppliers). To run against your own Snowflake schema instead, swap `read_csv(...)` for typed `model.Table(...)` loads against the equivalent table set. - **Ontology**: defined inline in `../supply_chain_resilience.py` (lines 57–250) — 7 concepts plus the derived `Business.ships_to` and `Operation.source_business` relationships. - **Stages**: implemented in `../supply_chain_resilience.py` as a single combined script with stage banners (`STAGE 0` through `STAGE 3` plus `SCENARIO ANALYSIS`). + +--- + +## Adapting this recipe to a new domain + +The chain pattern transfers cleanly. To rebuild for a different problem: + +1. Re-run `rai-discovery` on the new business question — does it actually need all four reasoner families, or is one or two sufficient? Reachability + prescriptive alone is a viable shortcut if you don't need a centrality-based bottleneck weight or a per-entity risk classifier. +2. Strip the demo ontology to the concepts the new chain needs (lean is better for type inference and solver compile time). The load-bearing concepts here are the node concept of each graph (`Business`, `Site`), the entity that carries risk flags (`Business`), and the entity that carries decision variables (`Operation`, `Demand`). +3. Stage 0 (reachability) is *optional context* — it scopes the conversation around the disruption table without changing any constraint. Keep it when stakeholders need to see the dependency cone; drop it when the optimizer's hard blocks already make exposure obvious. +4. 
Stages 1–3 are the load-bearing chain: graph centrality writes a per-site bottleneck weight, rules write per-supplier risk flags, and prescriptive consumes both — centrality as an objective coefficient, the avoid set as a hard block, the watch set as a surcharge. The scenario re-solves are cheap once the formulation exists. +5. Keep the validation checks at every stage: assert the source-business derivation populates, the centrality top-N looks plausible, the avoid/watch sets are non-empty (or non-empty by design), the baseline objective is not zero, and the disruption deltas have the sign you expected. + +The shape this template demonstrates — *each reasoner writes a property the next reasoner reads* — is what makes the chain accretive rather than serial. The agent skills are how you reliably author each link. diff --git a/v1/telco_network_recovery/references/runbook.md b/v1/telco_network_recovery/references/runbook.md index d1b3003..9486f52 100644 --- a/v1/telco_network_recovery/references/runbook.md +++ b/v1/telco_network_recovery/references/runbook.md @@ -2,7 +2,16 @@ Walk-through of the chained-reasoner pattern this template is built on. One realistic business thread — **WEST region recovery** — traced across all five RAI reasoner families, each stage writing properties back to the same ontology that downstream stages consume. -The template's combined script (`telco_network_recovery.py`) implements stages 2–5 directly; this runbook expands the surrounding narrative including the descriptive Stage 1 diagnosis and the Stage 6 interpretation, so a non-OR reader can follow the full reasoning thread end-to-end. +The template's combined script (`telco_network_recovery.py`) implements the predictive, rules, graph, and prescriptive stages directly; this runbook frames them with a descriptive Stage 1 diagnosis and a Stage 6 interpretation, so a non-OR reader can follow the full reasoning thread end-to-end. 
+ +--- + +## How to read this runbook + +This runbook serves two audiences: + +- **Reading top-to-bottom**: the narrative + ASCII visualizations show what the chain produces stage-by-stage, with the same business framing the stakeholder would see. +- **Per-stage skill blocks**: the boxed `Skill / Prompt` callout at the start of each stage is the recipe — load that RAI agent skill, give it that prompt against the bundled demo data in `../data/`, and the agent will reproduce the stage. --- @@ -43,6 +52,26 @@ A single-reasoner approach can't answer this. Descriptive alone tells the story --- +## Step 0 — Scope the question with `rai-discovery` + +> **Skill:** `rai-discovery` · +> **Prompt:** "WEST is missing revenue targets while every other region grows. Diagnose whether this is a subscriber-retention problem or a network-operations problem, then produce a defensible tower-upgrade plan inside a $5M capex cap and a 200 install-crew-week schedule that prioritizes towers by both who depends on them and where regional demand is heading." + +Discovery classifies the question by reasoner family and tells you which downstream skills to load: + +| Sub-question | Reasoner | Skill | +|---|---|---| +| Where is the revenue bleed coming from, and is it retention or operational? | Descriptive | `rai-querying` | +| Which WEST towers are technically broken enough to be in scope for upgrade? | Rules | `rai-rules-authoring` | +| Whose service depends on each broken tower — what is the social blast radius? | Graph (PageRank) | `rai-graph-analysis` | +| Is regional demand growing or contracting at those sites over the forward horizon? | Predictive (GNN) | _no public skill yet — see `v1/subscriber_retention/` and `v1/demand_forecasting/` as worked-example references_ | +| Given the cost cap and install-crew budget, which tier should each critical tower receive? 
| Prescriptive (MIP) | `rai-prescriptive-problem-formulation` | +| Which constraint is binding, and what would change if we relaxed it? | Prescriptive (post-solve) | `rai-prescriptive-results-interpretation` | + +Discovery's output is a *plan*, not code. Everything that follows materializes that plan. + +--- + ## Setup See the template's main `README.md` for installation, RAI connection setup, and how to run the script. The narrative below follows the actual stage outputs of `telco_network_recovery.py`. @@ -51,6 +80,8 @@ See the template's main `README.md` for installation, RAI connection setup, and ## Stage 1 — Descriptive: diagnose WEST +> **Skill:** `rai-querying` · +> **Prompt:** "Run a regional triage on the daily KPIs and tower fleet for Q3–Q4 2024. Compare WEST against the other eight regions on network availability, churn rate, daily revenue, and the revenue-forecast vs. actual gap, and break down the WEST tower fleet by status. Identify the top performance offenders by per-tower packet loss and latency, and check whether any WEST subscribers carry elevated churn risk. Conclude whether this looks like a retention problem or an operational one." ``` Q4 2024 — Daily KPIs by region @@ -101,6 +132,8 @@ Subscriber.churn_risk_score is a static feature that hasn't caught up to WEST's ## Stage 2 — Rules: flag critical_restore towers +> **Skill:** `rai-rules-authoring` · +> **Prompt:** "Define per-tower derived averages for packet loss, latency, and error rate from the network-performance measurements, plus an average equipment-health score from the two-hop join through network equipment to equipment-health snapshots. Then add a unary critical-restore flag on the tower that fires when the tower is in WEST and either DEGRADED with average health below 0.85, or shows average packet loss above 5% with the same health threshold (so an ACTIVE-but-failing tower is still caught)." 
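Restated imperatively, the two rule branches reduce to a single predicate. A plain-Python sketch with illustrative field names — the template itself expresses this declaratively via `model.define(...)`:

```python
def is_critical_restore(tower):
    # Stage 2 flag, restated imperatively (field names are illustrative).
    unhealthy = tower["avg_health_score"] < 0.85
    degraded_branch = tower["status"] == "DEGRADED" and unhealthy
    lossy_branch = tower["avg_packet_loss"] > 5.0 and unhealthy  # ACTIVE-but-failing case
    return tower["region"] == "WEST" and (degraded_branch or lossy_branch)

# Hypothetical towers
degraded = {"region": "WEST", "status": "DEGRADED",
            "avg_health_score": 0.71, "avg_packet_loss": 3.2}
active_failing = {"region": "WEST", "status": "ACTIVE",
                  "avg_health_score": 0.80, "avg_packet_loss": 6.5}
healthy = {"region": "WEST", "status": "ACTIVE",
           "avg_health_score": 0.95, "avg_packet_loss": 1.0}

assert is_critical_restore(degraded)        # branch 1 fires
assert is_critical_restore(active_failing)  # branch 2 catches the ACTIVE tower
assert not is_critical_restore(healthy)
```

The shared health threshold is what keeps the two branches coherent: a tower escapes the flag only by being healthy, not by merely reporting an ACTIVE status.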
**Properties added to the ontology** (via `model.define(...)`): - `CellTower.avg_packet_loss` (Float) — `aggs.avg(NetworkPerformance.packet_loss_pct).per(CellTower)` @@ -156,6 +189,8 @@ Branch 2 didn't fire — none of WEST's ACTIVE towers fall below health 0.85. Th ## Stage 3 — Graph: subscriber influence + tower blast radius +> **Skill:** `rai-graph-analysis` · +> **Prompt:** "Build a directed subscriber-to-subscriber call graph from the call-detail records, with caller pointing to callee and parallel calls between the same pair summed into a single edge. Score each subscriber with PageRank and write that influence back to the subscriber. Then per critical-restore tower, aggregate the distinct subscribers whose calls route through it and the sum of their PageRank — that's the social blast radius if the tower fails." **Construction** — Pattern 3 (`edge_concept`): - Node concept: `Subscriber` (1,200 nodes) @@ -213,6 +248,8 @@ Per-critical-tower blast radius (sorted by weighted_impact) ## Stage 4 — Predictive: forecast WEST capacity demand +> **Skill:** _no public skill yet — see `v1/subscriber_retention/` and `v1/demand_forecasting/` as worked-example references_ · +> **Prompt:** "Train a regression GNN on per-region daily KPIs predicting subscriber growth rate. Use same-region 1-day-lag temporal edges, region as a category feature, and three lag features (previous-day growth, previous-week growth, and a 7-day rolling mean) computed before load. Train on rows before November 2024, validate on November, test on December, then bind each region's mean predicted growth back to every cell tower in that region as a per-tower demand multiplier." **Method:** GNN node regression on `TimeSeriesMetric` (composite key `metric_date` + `region`). Target: `subscriber_growth_rate`. Features: the other 12 daily KPIs + 3 lag features (`prev_day_growth`, `prev_week_growth`, `growth_7d_mean`) + `region` as a category. Graph: same-region 1-day-lag temporal edges. 
Train < 2024-11-01 (includes the Sep–Oct WEST decline onset); validate on Nov 2024; test on Dec 2024. @@ -260,6 +297,8 @@ objective = sum( selected[t,tier] * ## Stage 5 — Prescriptive: tower upgrade selection MIP +> **Skill:** `rai-prescriptive-problem-formulation` · +> **Prompt:** "Pick at most one upgrade tier (BRONZE, SILVER, or GOLD) per critical-restore tower using a binary decision variable on the tower-upgrade-option junction. Stay within a $5M total cost cap and 200 total install crew-weeks. Maximize the sum across selected options of capacity-increase × tower weighted-impact × tower projected-demand-growth, so the optimizer favors towers that are both broken and high-blast-radius, scaled by the regional demand forecast. Solve with Gurobi." ``` FORMULATION @@ -331,6 +370,8 @@ Headline metrics ## Stage 6 — Interpretation +> **Skill:** `rai-prescriptive-results-interpretation` · +> **Prompt:** "Summarize the optimal plan in business terms: total cost vs. budget, capacity restored, tier mix, towers covered, and how many subscribers stop being served by a critical tower over the install schedule. Identify which constraint is binding and what would change if it were relaxed by 10–20% (which tower would jump tiers, what the marginal capacity lift would be). List the per-stage ontology enrichments so the reader can see what each reasoner contributed back." ``` THE PLAN, IN BUSINESS TERMS @@ -442,6 +483,20 @@ THE WEST RECOVERY CHAIN --- +## Adapting this recipe to a new domain + +The chain pattern transfers cleanly. To rebuild for a different problem: + +1. Re-run `rai-discovery` on the new business question — does it actually need all 5 reasoner families, or is one or two sufficient? +2. Strip the demo ontology to the concepts the new chain needs (lean is better for type inference and solver compile time). +3. 
Stage 1 (descriptive triage) is *optional but high-leverage*: it scopes the problem and rules out a misdiagnosis (e.g., is this a retention crisis or a network crisis?) before any rule, graph, GNN, or solver runs. +4. Stages 2–5 are the load-bearing chain: rules write the flag that scopes graph aggregations and the solver's decision variables; graph writes the per-entity impact weight that becomes a solver objective coefficient; predictive writes the forward-looking multiplier that becomes the second objective coefficient; prescriptive composes both upstream signals into the final plan. +5. Keep the validation checks at every stage: assert flagged-set size, PageRank top-N looks plausible, the GNN forecast separates the anomalous segment from the rest, the solve status is OPTIMAL, the objective is not zero, and at least one constraint is binding (otherwise you're under-constrained). + +The shape this template demonstrates — *each reasoner writes a property the next reasoner reads* — is what makes the chain accretive rather than serial. The agent skills are how you reliably author each link. + +--- + ## Data Reference - **Source data**: bundled CSVs in `../data/` (the main template ships ~1.2 MB of synthetic-but-realistic telco data — 250 cell towers, 1,200 subscribers, 6,000 CDRs, 3,285 daily KPI rows across 9 regions). From 20c4bfafdc680e33dd22b89f831bcf28aa9c5c02 Mon Sep 17 00:00:00 2001 From: cafzal Date: Wed, 6 May 2026 09:13:45 -0700 Subject: [PATCH 03/22] Rewrite runbook prompts in user-question style Replace the prior verbose, mechanical prompts with short natural questions a user would actually type, modeled on the reasoner-eval QA catalog. Where an eval QA exists for the same skill+pattern, the runbook prompt mirrors that question directly. Also fix the predictive skill names: rai-predictive-modeling + rai-predictive-training (the public skills are available; the "no public skill yet" placeholder was stale). 
Each prompt is now: - 1-2 sentences asking what the user wants to know - Domain-natural language - Aligned to a real eval-style question Affects all five runbooks: telco_network_recovery, energy_grid_planning, supply_chain_resilience, machine_maintenance, portfolio_balancing. --- v1/energy_grid_planning/references/runbook.md | 14 +++++++------- v1/machine_maintenance/references/runbook.md | 12 ++++++------ v1/portfolio_balancing/references/runbook.md | 12 ++++++------ v1/supply_chain_resilience/references/runbook.md | 12 ++++++------ v1/telco_network_recovery/references/runbook.md | 16 ++++++++-------- 5 files changed, 33 insertions(+), 33 deletions(-) diff --git a/v1/energy_grid_planning/references/runbook.md b/v1/energy_grid_planning/references/runbook.md index d963954..b98db6b 100644 --- a/v1/energy_grid_planning/references/runbook.md +++ b/v1/energy_grid_planning/references/runbook.md @@ -61,7 +61,7 @@ This runbook serves two audiences: ## Step 0 — Scope the question with `rai-discovery` > **Skill:** `rai-discovery` · -> **Prompt:** "We have 10 hyperscaler interconnection requests totalling 2,930 MW queued against a 12-substation ERCOT-style Texas grid. Classify the sub-questions we need to answer to decide which to approve, which substation upgrades to fund, and at what budget level — map each sub-question to the reasoner family that should handle it." +> **Prompt:** "We have 10 hyperscaler interconnection requests against a 12-substation grid. Which to approve, which substation upgrades to fund, at what budget level?" Discovery classifies the question by reasoner family and tells you which downstream skills to load: @@ -113,8 +113,8 @@ Steps are sequential — each depends on prior steps. 
Steps without a skill are
## Stage 1 — Predictive: substation load forecasting
-> **Skill:** _no public skill yet — see `v1/subscriber_retention/` and `v1/demand_forecasting/` as worked-example references_ ·
-> **Prompt:** "Forecast each substation's future peak load by aggregating the maximum predicted load across the 6/12/18/24-month forecast horizons in the demand-forecast table, and write the result back to every substation as a derived load-projection property. The downstream rules engine and optimizer both need to read this same forecasted headroom — fall back to the substation's current load only when no forecast row exists. Flag substations whose predicted load exceeds their nameplate capacity within the horizon and report which one breaches first."
+> **Skill:** `rai-predictive-modeling` + `rai-predictive-training` ·
+> **Prompt:** "Can we forecast substation load growth over the next 24 months based on historical demand, planned generator additions, and the data center request pipeline? Bind each substation's predicted peak load back to the ontology so the rules engine and optimizer can read it."
**Method:** load max forecasted load per substation as `Substation.predicted_load`. The template aggregates `DemandForecast.predicted_load_mw` over forecast horizons (6/12/18/24 months) and writes the max back to the substation. A pre-trained GNN can replace the table lookup; the script falls back gracefully when the GNN model registry is unavailable.
@@ -160,7 +160,7 @@ effective_load = Substation.predicted_load | Substation.current_load_mw
## Stage 2 — Graph: grid topology & structural vulnerability
> **Skill:** `rai-graph-analysis` ·
-> **Prompt:** "Build an undirected, unweighted graph using `Substation` directly as the node concept and active transmission lines as edges. Run weakly connected components to confirm grid connectivity, Louvain community detection to surface ERCOT regional clusters, and the betweenness/degree/eigenvector centrality trio.
Combine the three centrality ranks into a composite rank and flag the top 3 substations as structurally critical, writing the centrality scores, community label, and criticality flag back to each substation." +> **Prompt:** "Which substations are most critical to power flow based on grid topology? Use centrality on the transmission graph, then flag the top 3 as structurally critical and persist the scores back to the ontology." **Construction** — `Substation` as the node concept directly (no mirror concept): - Node concept: `Substation` (12 nodes) @@ -211,7 +211,7 @@ Centrality (top-3 marked is_structurally_critical) ## Stage 3 — Rules: interconnection queue compliance > **Skill:** `rai-rules-authoring` · -> **Prompt:** "Author three declarative compliance rules per data-center request, each consuming an upstream enrichment. Rule 1 fails capacity when the request's MW plus the substation's forecasted load (with current load as fallback) exceeds the substation's nameplate capacity. Rule 2 fails structural risk when the target substation is flagged structurally critical from Stage 2. Rule 3 fails the low-carbon mandate when the substation's zero-emission generation share is below the request's required percentage; sum capacity for generators with emissions rate of zero. Add a composite `is_compliant` flag that fires only when none of the three failure flags fire." +> **Prompt:** "Screen each data center request against three criteria: (1) substation must have enough capacity after predicted load, (2) if 100% low-carbon required, region must have 25%+ renewable, (3) substation shouldn't be most structurally critical. Which requests pass all three?" Three declarative `Relationship` rules consume Stages 1–2 enrichments. Each is written as a `model.where(...).define(...)` block; a composite `is_compliant` fires only when none of the three failure flags fire. 
@@ -276,7 +276,7 @@ Every request passes low-carbon — ERCOT's nuclear (STP, Comanche Peak) plus it

 ## Stage 4 — Prescriptive: joint DC approval + upgrade MIP

 > **Skill:** `rai-prescriptive-problem-formulation` ·
-> **Prompt:** "Formulate a single MIP that picks DC approvals and substation upgrades jointly across five budget scenarios at $200M, $300M, $400M, $500M, and $600M. Model the budget as an `InvestmentLevel` Scenario Concept and index both the binary approve and binary upgrade decision variables by it, so one solve produces the full Pareto frontier — no per-budget re-solve loop. Constrain per substation per scenario that approved DC load fits within nameplate capacity minus forecasted load plus selected upgrade headroom, and per scenario that selected upgrade costs stay within the scenario's budget cap. Maximize total annual interconnection revenue summed across all scenarios."
+> **Prompt:** "Decide which data center requests to approve and which substation upgrades to fund at $200M, $300M, $400M, $500M, and $600M investment levels. Maximize annual revenue. A request can only be approved if its substation has enough capacity after upgrades."

 ```
 FORMULATION
@@ -353,7 +353,7 @@ PARETO FRONTIER (queried directly from ontology)

 ## Stage 5 — Interpretation

 > **Skill:** `rai-prescriptive-results-interpretation` ·
-> **Prompt:** "Read the per-scenario decision-variable properties straight from the ontology — approved DCs, selected upgrades, total MW, revenue, and amortized upgrade cost — and present the Pareto frontier across the five investment levels. Compute marginal value per added budget dollar between adjacent levels and call out the knee where the marginal jump is largest. Translate the result into business language: which DCs unlock at the knee, which substation is the binding bottleneck, and which requests are infeasible at every budget level."
+> **Prompt:** "Which data centers get approved, which upgrades are selected, and where's the biggest return on investment at each budget level?"

 ```
 THE PLAN, IN BUSINESS TERMS
diff --git a/v1/machine_maintenance/references/runbook.md b/v1/machine_maintenance/references/runbook.md
index 42c4abc..3f8cc04 100644
--- a/v1/machine_maintenance/references/runbook.md
+++ b/v1/machine_maintenance/references/runbook.md
@@ -59,7 +59,7 @@ This runbook serves two audiences:

 ## Step 0 — Scope the question with `rai-discovery`

 > **Skill:** `rai-discovery` ·
-> **Prompt:** "We need to schedule preventive maintenance for a multi-plant operation: 30 machines across 3 plants, 10 technicians across 3 cities, a 4-period horizon. Classify the sub-questions we must answer, map each to the right reasoner family, and tell us which downstream skills to load. Where does OEE alone mislead us, and what structural risks (qualifications, technician concentration) won't a pure optimizer surface?"
+> **Prompt:** "We need to schedule preventive maintenance for 30 machines across 3 plants. Where does OEE alone mislead us, and what structural risks won't a pure optimizer surface?"

 Discovery classifies the question by reasoner family and tells you which downstream skills to load:

@@ -113,35 +113,35 @@ The runbook walks the same chain stage-by-stage, prompt-by-prompt, in agent-skil

 ## Stage 0 — Querying: operational intelligence

 > **Skill:** `rai-querying` ·
-> **Prompt:** "From the loaded production runs, sensor readings, and per-period failure predictions, compute an OEE proxy by facility (Performance × Quality), list machines with above-threshold sensor anomalies grouped by facility, and identify the six machines with the steepest failure-probability rise from period 1 to period 4. Write the per-period failure prediction back as a `MachinePeriod` property so downstream stages can read it."
+> **Prompt:** "What's the OEE by plant? Which machines have the most sensor anomalies, and which are most likely to fail by the end of the planning horizon?"

 This stage establishes the operational baseline. Plant_C leads at 79.8% OEE; Plant_B trails at 61.4%. But Plant_A — middle of the OEE pack at 68.2% — owns 7 of 9 sensor anomalies and the three steepest failure trajectories (M001, M013, M016). The querying stage writes nine derived properties on `Machine` plus `MachinePeriod.predicted_fp` (120 rows), and Stage 3 reads `predicted_fp` directly into the failure-cost objective term.

 ## Stage 1 — Graph: dependency clusters and bottleneck centrality

 > **Skill:** `rai-graph-analysis` ·
-> **Prompt:** "Build an undirected graph using `Machine` directly as the node concept. Two machines are adjacent when at least one technician is qualified for both of their machine types. Run weakly connected components to find dependency clusters, then compute betweenness centrality, normalize it to 0..1 against the max, and store both raw and normalized scores back on the machine."
+> **Prompt:** "Which machines share qualified technicians, and which are bottlenecks in the qualification network? Compute centrality and write it back to each machine so the optimizer can weight critical machines."

 The 30 machines form a single connected component — every machine is reachable through shared qualifications. Pump-type machines tie at the top of betweenness (raw 24.0, normalized 1.0): M003 (Plant_C), M008 (Plant_B), M013 (Plant_A). The normalized centrality is consumed by Stage 3's failure-cost multiplier `(1 + 2.0 × betweenness)`, so leaving a bottleneck Pump vulnerable is markedly more expensive than leaving a peripheral Motor vulnerable.
 ## Stage 2 — Rules: compliance flags and composite risk tier

 > **Skill:** `rai-rules-authoring` ·
-> **Prompt:** "Author seven derived flags on the existing concepts: overdue (remaining useful life below required maintenance hours), high-risk (failure probability above 0.3 and criticality at least 4), anomalous (any sensor anomaly), chronic-downtime (more than 8 events), parts-reorder (stock at or below minimum order), and certification-expiring (under 30 days remaining). Then chain three of those flags — chronic, high-risk, overdue — into a `risk_tier` property of Critical / Elevated / Standard, exhaustively enumerating all eight three-flag combinations using negation."
+> **Prompt:** "Rate each machine's risk: chronic if >8 downtime events, high-risk if failure prob >0.3 AND criticality 4+, plus overdue for maintenance. All three flags = Critical, two = Elevated, otherwise Standard."

 Six machines overdue, one high-risk (M013), three chronic-downtime, four parts-reorder, five expiring certs. The composite tier surfaces a single Critical machine — M013 (Pump, Plant_A) — and a single Elevated machine — M016 (Turbine, Plant_A). The overdue flag is consumed by Stage 3 as a hard constraint: every overdue machine must be scheduled by period 2.

 ## Stage 3 — Prescriptive: maintenance schedule

 > **Skill:** `rai-prescriptive-problem-formulation` ·
-> **Prompt:** "Schedule preventive maintenance over 4 periods. Decide for each machine and period whether to maintain it, whether it remains vulnerable, and which qualified technician services it. Enforce cumulative coverage (each machine is either maintained or vulnerable through every period), assignment-maintenance linkage (one tech per maintained job), per-tech hour capacity, a 5-jobs-per-period parts/bay limit, and the hard rule that every overdue machine from Stage 2 is maintained by period 2. Minimize failure cost (period-specific failure probability × parts cost × criticality, scaled up by `(1 + 2.0 × betweenness)`) plus labor cost (duration × hourly rate) plus travel cost ($50/hour for non-co-located assignments)."
+> **Prompt:** "Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and Turbines need an on-site qualified technician. Minimize expected failure cost weighted by criticality and centrality, plus labor and travel."

 The solver returns OPTIMAL with objective $605,240.61 and 20 maintenance jobs across the four periods (capacity-binding at 5 jobs/period). M013 and M016 — Plant_A's Critical and Elevated machines — are both scheduled by period 1, satisfying the overdue deadline. Several Turbine assignments require travel because all three Turbine-qualified techs are based in Houston_TX while four of six Turbines sit at Plant_A and Plant_C. The optimizer pays the travel cost; it cannot restructure the qualification pool.

 ## Stage 4 — Resilience: concentration sweep and cross-training

 > **Skill:** `rai-prescriptive-solver-management` + `rai-prescriptive-results-interpretation` ·
-> **Prompt:** "Take the optimal schedule and stress-test the qualification structure underneath it. For each machine type, identify whether all qualified technicians are concentrated in one location, count the scheduled jobs of that type that required travel, and surface the geographic single-point-of-failure. Then re-rank the cross-training options for the concentrated type by cost, prefer non-local candidates, and recommend the cheapest one with its training cost and duration. This is a follow-up sensitivity sweep over the existing solve, not a new formulation."
+> **Prompt:** "For each machine type, are all qualified technicians concentrated in one location? How many scheduled jobs required travel, and what's the cheapest cross-training option to eliminate the single-point-of-failure?"

 Turbine is the concentrated type — all three qualified techs (T001, T002, T003) sit in Houston_TX, and 67% of scheduled Turbine jobs already require travel. The recommended fix: cross-train T006 (Senior, Chicago_IL) for $3,200 over 5 weeks. That single addition eliminates the Houston single-point-of-failure for Turbine work at Plant_B and Plant_C, and pays back the first time a weather event, illness, or expiring cert would have idled a Turbine job that the optimizer would otherwise have left uncovered.

diff --git a/v1/portfolio_balancing/references/runbook.md b/v1/portfolio_balancing/references/runbook.md
index ea6f43c..6640840 100644
--- a/v1/portfolio_balancing/references/runbook.md
+++ b/v1/portfolio_balancing/references/runbook.md
@@ -55,7 +55,7 @@ This runbook serves two audiences:

 ## Step 0 — Scope the question with `rai-discovery`

 > **Skill:** `rai-discovery` ·
-> **Prompt:** "We have an 8-stock portfolio book that breaks compliance on 4 holdings and 2 sectors. We want to rebuild the book under a Markowitz mean-variance objective with position and sector caps, deduplicate redundant bets via correlation clustering, and stress-test the resulting frontier under a crisis regime where correlations spike. Classify the sub-questions by reasoner family and tell us which RAI skills to load."
+> **Prompt:** "Our 8-stock book breaks compliance and concentrates risk. Rebuild it under Markowitz mean-variance with caps, deduplicate redundant bets via correlation clustering, and stress-test under crisis. What questions does each reasoner family handle?"

 Discovery classifies the question by reasoner family and tells you which downstream skills to load:

@@ -107,7 +107,7 @@ The runbook walks the same chain stage-by-stage, prompt-by-prompt, in agent-skil

 ## Stage 1 — Rules: compliance scan

 > **Skill:** `rai-rules-authoring` ·
-> **Prompt:** "Add a derived `Holding.value` from `quantity * purchase_price`, then author three compliance flags as Relationships on the existing book. Flag a holding as overconcentrated when its value exceeds 15% of the parent account's balance. Flag it as sector-concentrated when the per-account, per-sector total of holding values exceeds 30% of balance. Flag a user as a high-risk trader when their risk score is above 0.8 and they have more than five flagged transactions."
+> **Prompt:** "Flag any holding worth more than 15% of its account, any sector worth more than 30% of the account, and any user with a risk score above 0.8 and more than five flagged transactions."

 ```
 COMPLIANCE VIOLATIONS — current book (4 accounts, 15 holdings, 6 users)
@@ -148,7 +148,7 @@ COMPLIANCE VIOLATIONS — current book (4 accounts, 15 holdings, 6 users)

 ## Stage 2 — Graph: covariance clustering + cluster representatives

 > **Skill:** `rai-graph-analysis` ·
-> **Prompt:** "Derive per-stock variance from the covariance diagonal, volatility as the square root of variance, and pairwise correlation as covariance over the product of volatilities — all in PyRel, no numpy precompute. Build an undirected stock graph with an edge wherever the absolute correlation is at least 0.3 and the index pair is deduplicated. Run Louvain community detection and persist the cluster id back onto each stock. Then derive a Sharpe ratio per stock (returns over volatility), and flag the stock whose Sharpe equals the cluster maximum as the cluster representative; flag the others as non-representative so the optimizer can force them to zero."
+> **Prompt:** "Cluster stocks by correlation — anything above 0.3 absolute is a redundant bet. Pick one representative per cluster (highest Sharpe ratio) and force the rest to zero in optimization."

 **Construction** — undirected, unweighted graph:
 - Node concept: `Stock` (8 nodes)
@@ -219,7 +219,7 @@ REPRESENTATIVE = HIGHEST SHARPE PER CLUSTER (returns / volatility)

 ## Stage 3 — Prescriptive: bi-objective QP with epsilon constraint

 > **Skill:** `rai-prescriptive-problem-formulation` ·
-> **Prompt:** "Formulate a Markowitz mean-variance QP with a continuous decision variable `Stock.x_quantity` indexed by a `Scenario` Concept that combines budget and regime — load the six scenarios for budgets {500, 1000, 2000} crossed with regimes {base, crisis}. Add five constraint families per scenario: non-negativity, fully invested at the scenario budget, per-representative position cap at 30% of budget, per-sector cap at 30% of budget, and a hard zero on every non-representative stock. Anchor the frontier with a min-risk solve and a max-return solve, then sweep five interior epsilon-rate points uniformly across the resulting return-rate range, each one re-solving min-risk subject to a per-scenario return floor that scales with budget. Use Ipopt with a 60-second time limit."
+> **Prompt:** "Build a Markowitz mean-variance frontier across 6 scenarios = 3 budgets × 2 regimes. Position cap 30% of budget, sector cap 30%, non-representatives forced to zero. Anchor with min-risk and max-return, then sweep 5 epsilon points across the return range."

 ```
 FORMULATION
@@ -277,7 +277,7 @@ Epsilon sweep 5 interior points evenly spaced across the range

 ## Stage 3 — Reading the frontier (per scenario)

 > **Skill:** `rai-prescriptive-results-interpretation` ·
-> **Prompt:** "For each of the six scenarios, list the seven-point Pareto frontier (return, risk) from the anchors and epsilon points. Compute the marginal `delta_risk / delta_return` between adjacent points and identify the knee — the point where the marginal cost jumps the most relative to the previous step. Confirm the rate-form frontier shape is consistent across budgets (risk scales as budget squared because the QP is quadratic in x, but the rate-form curve and knee location are budget-independent)."
+> **Prompt:** "For each scenario, list the seven-point Pareto frontier and find the knee — where does the marginal risk per unit return jump the most? Is the rate-form frontier shape consistent across budgets?"

 ```
 EFFICIENT FRONTIER — base_500 (budget = 500, regime = base)
@@ -313,7 +313,7 @@ EFFICIENT FRONTIER — base_500 (budget = 500, regime = base)

 ## Stage 4 — Crisis stress test

 > **Skill:** `rai-prescriptive-solver-management` + `rai-prescriptive-results-interpretation` ·
-> **Prompt:** "Define a regime-conditioned covariance `Stock.regime_covar(i, j, Regime)` derived in PyRel: for the base regime use the input covariance unchanged, and for the crisis regime use a PSD-preserving correlation shrinkage toward all-ones with weight 0.7 on the base covariance plus 0.3 on the outer product of volatilities. The Stage 3 solver call is unchanged — `Scenario.regime` selects the matching `regime_covar` per scenario, so all six base and crisis scenarios solve in one call. After the sweep, emit a side-by-side `vol_base` versus `vol_crisis` table per (budget, epsilon point) where `vol = sqrt(risk)`, with the absolute and percentage gap, and explain whether the gap peaks mid-frontier or at the concentrated end."
+> **Prompt:** "Stress-test the frontier under crisis: shrink correlations toward all-ones with weight 0.7 on base covariance + 0.3 on outer-product. How much volatility expansion at each frontier point — does the gap peak mid-frontier or at the concentrated end?"

 Same `solve_epsilon` call, no separate model — `Scenario.regime` selects between two `Stock.regime_covar` definitions:

diff --git a/v1/supply_chain_resilience/references/runbook.md b/v1/supply_chain_resilience/references/runbook.md
index 3d91928..cd2670f 100644
--- a/v1/supply_chain_resilience/references/runbook.md
+++ b/v1/supply_chain_resilience/references/runbook.md
@@ -55,7 +55,7 @@ The two views share one ontology: each stage writes properties back to the model

 ## Step 0 — Scope the question with `rai-discovery`

 > **Skill:** `rai-discovery` ·
-> **Prompt:** "Given a supply-chain ontology with sites, businesses (suppliers / manufacturers / warehouses / buyers), shipping operations, SKUs, demand orders, historical shipments, and per-supplier quarterly delay predictions, what questions can each reasoner family answer for a risk-adjusted routing problem? I need to know which high-priority customers are exposed to which suppliers, which sites are bottlenecks, which suppliers are unreliable, and what the minimum-cost flow looks like once those risks are priced in."
+> **Prompt:** "We need a risk-adjusted routing plan. What's our exposure to each supplier, which sites are bottlenecks, which suppliers are unreliable, and what does the minimum-cost flow look like once those risks are priced in?"

 Discovery classifies the question by reasoner family and tells you which downstream skills to load:

@@ -79,7 +79,7 @@ See the template's main `README.md` for installation, RAI connection setup, and

 ## Stage 0 — Reachability: blast-radius pre-analysis

 > **Skill:** `rai-graph-analysis` ·
-> **Prompt:** "Build a **directed** business graph from shipment records, with edges going from each shipment's supplier to its customer. Then run upstream reachability **into** every business that holds at least one HIGH-priority demand, and filter the reachable set to nodes whose business type is SUPPLIER. The directed-not-undirected choice matters here: we want the suppliers that feed each high-priority customer's upstream cone, not the symmetric neighborhood. List, per high-priority customer, the suppliers it transitively depends on and their reliability scores."
+> **Prompt:** "If a key supplier goes offline, which downstream buyers and finished products are at risk? For each HIGH-priority customer, list the suppliers it transitively depends on through the shipment graph, with their reliability scores."

 **Construction** — directed `Business` graph, edges from `Business.ships_to` (derived from `Shipment.supplier` -> `Shipment.customer`).

@@ -116,7 +116,7 @@ The point of running reachability before the MILP: when the scenario in Stage 3

 ## Stage 1 — Graph: site centrality + connected components

 > **Skill:** `rai-graph-analysis` ·
-> **Prompt:** "Build an **undirected, unweighted** site graph using SHIP-type operations as edges between source and output sites, with a sum aggregator to collapse parallel ship lanes. Compute weakly-connected components to surface fragmentation, then eigenvector centrality to rank hubs — restrict the centrality output to FACTORY and DC sites (drop STORE / OFFICE). Normalize the centrality scores to [0, 1] and write them back to the ontology as a per-site property so downstream stages can use them as a bottleneck weight."
+> **Prompt:** "Which sites are the most influential hubs in the supply network — sites that connect to other influential sites, not just sites with many direct connections? Persist the centrality score back to each site so the optimizer can use it as a bottleneck weight."

 **Construction:**
 - Node concept: `Site` (31 sites)
@@ -158,7 +158,7 @@ Top critical sites — eigenvector centrality (FACTORY/DC only)

 ## Stage 2 — Rules: supplier risk classification

 > **Skill:** `rai-rules-authoring` ·
-> **Prompt:** "Add three derived flags to Business and one to Demand. A business is unreliable when its reliability score is below 0.80. A business has high delay risk when at least one Q1-2025 delay prediction for that supplier exceeds 0.15. A business is watch-level when either of those holds. A demand is escalated when its priority is HIGH. Chain the rules so watch-level fires from either underlying flag. Downstream, suppliers with **both** flags are the avoid set (hard-blocked in routing) and suppliers with **either** flag are the watch set (surcharged)."
+> **Prompt:** "Rate each supplier's delivery reliability. Flag any with reliability score below 0.80 as unreliable, any with a Q1 delay prediction above 0.15 as high-delay-risk, and call them 'watch-level' if either fires. Suppliers with **both** flags are 'avoid' (hard-blocked downstream); suppliers with **either** flag are 'watch' (surcharged)."

 **Late-shipment context** (computed in pandas, not RAI):

@@ -233,7 +233,7 @@ Stage 3 reads `is_watch_level` for the surcharge term and `is_unreliable AND has

 ## Stage 3 — Prescriptive: risk-adjusted minimum-cost flow

 > **Skill:** `rai-prescriptive-problem-formulation` ·
-> **Prompt:** "Formulate a continuous minimum-cost network flow over the operations. The decision variable is per-operation flow, bounded by each operation's daily capacity, with a non-negative unmet-demand slack per demand order. Constraint: for each demand, inbound flow at the customer's site for the demanded SKU plus its slack must cover the order quantity. Hard-block any operation whose source business is in the avoid set (both Stage-2 flags fire). The objective minimizes transport cost plus a $5/unit surcharge on flow through watch-level suppliers, plus a centrality-weighted penalty on flow into bottleneck sites (using the normalized score from Stage 1, weight 2.0), plus a $100/unit unmet-demand penalty."
+> **Prompt:** "Solve a minimum-cost flow that fulfills all open demand orders at minimum total transport cost. Hard-block 'avoid' suppliers, surcharge 'watch' suppliers $5/unit, weight bottleneck sites by their centrality, and penalize unmet demand at $100/unit."

 ```
 FORMULATION
@@ -279,7 +279,7 @@ The baseline buys: enough finished-goods flow on the shortest cost-weighted lane

 ## Scenario analysis — quantify disruption

 > **Skill:** `rai-prescriptive-solver-management` + `rai-prescriptive-results-interpretation` ·
-> **Prompt:** "Re-solve the same formulation under two disruptions and compare them to the baseline. Scenario A: take the highest-centrality site offline by adding a zero-flow constraint on every operation sourced from that site. Scenario B: downgrade every watch-level supplier to avoid by adding zero-flow blocks for every supplier in the union of the two Stage-2 flags. Report status, objective, active flow count, and unmet demand for each. Then explain *why* the cost deltas are asymmetric — the structural-vs-behavioural distinction is the punchline."
+> **Prompt:** "Re-solve with the highest-centrality site offline, and again with watch-level suppliers downgraded to avoid. What's the cost delta in each, and why are they asymmetric?"

 The same `solve_flow(...)` function re-runs with modified constraints. Two scenarios surface different aspects of the chain's value:

diff --git a/v1/telco_network_recovery/references/runbook.md b/v1/telco_network_recovery/references/runbook.md
index 9486f52..188533e 100644
--- a/v1/telco_network_recovery/references/runbook.md
+++ b/v1/telco_network_recovery/references/runbook.md
@@ -55,7 +55,7 @@ A single-reasoner approach can't answer this. Descriptive alone tells the story

 ## Step 0 — Scope the question with `rai-discovery`

 > **Skill:** `rai-discovery` ·
-> **Prompt:** "WEST is missing revenue targets while every other region grows. Diagnose whether this is a subscriber-retention problem or a network-operations problem, then produce a defensible tower-upgrade plan inside a $5M capex cap and a 200 install-crew-week schedule that prioritizes towers by both who depends on them and where regional demand is heading."
+> **Prompt:** "WEST is missing revenue while every other region grows. What questions do we need to answer to figure out where to spend $5M to fix it?"

 Discovery classifies the question by reasoner family and tells you which downstream skills to load:

@@ -81,7 +81,7 @@ See the template's main `README.md` for installation, RAI connection setup, and

 ## Stage 1 — Descriptive: diagnose WEST

 > **Skill:** `rai-querying` ·
-> **Prompt:** "Run a regional triage on the daily KPIs and tower fleet for Q3–Q4 2024. Compare WEST against the other eight regions on network availability, churn rate, daily revenue, and the revenue-forecast vs. actual gap, and break down the WEST tower fleet by status. Identify the top performance offenders by per-tower packet loss and latency, and check whether any WEST subscribers carry elevated churn risk. Conclude whether this looks like a retention problem or an operational one."
+> **Prompt:** "Where are we missing revenue targets? Which 10 cell towers have the worst average packet loss over 2024, and which region has the worst Q4 network availability?"

 ```
 Q4 2024 — Daily KPIs by region
@@ -133,7 +133,7 @@ Subscriber.churn_risk_score is a static feature that hasn't caught up to WEST's

 ## Stage 2 — Rules: flag critical_restore towers

 > **Skill:** `rai-rules-authoring` ·
-> **Prompt:** "Define per-tower derived averages for packet loss, latency, and error rate from the network-performance measurements, plus an average equipment-health score from the two-hop join through network equipment to equipment-health snapshots. Then add a unary critical-restore flag on the tower that fires when the tower is in WEST and either DEGRADED with average health below 0.85, or shows average packet loss above 5% with the same health threshold (so an ACTIVE-but-failing tower is still caught)."
+> **Prompt:** "Flag CellTowers as 'critical-restore' if region is WEST AND status is DEGRADED AND avg equipment health is below 0.85, OR if avg packet loss > 5% with health below 0.85."

 **Properties added to the ontology** (via `model.define(...)`):
 - `CellTower.avg_packet_loss` (Float) — `aggs.avg(NetworkPerformance.packet_loss_pct).per(CellTower)`
@@ -190,7 +190,7 @@ Branch 2 didn't fire — none of WEST's ACTIVE towers fall below health 0.85. Th

 ## Stage 3 — Graph: subscriber influence + tower blast radius

 > **Skill:** `rai-graph-analysis` ·
-> **Prompt:** "Build a directed subscriber-to-subscriber call graph from the call-detail records, with caller pointing to callee and parallel calls between the same pair summed into a single edge. Score each subscriber with PageRank and write that influence back to the subscriber. Then per critical-restore tower, aggregate the distinct subscribers whose calls route through it and the sum of their PageRank — that's the social blast radius if the tower fails."
+> **Prompt:** "Who are our most socially influential subscribers based on call patterns? For each critical-restore tower, count the distinct subscribers whose calls route through it and rank by total PageRank influence — that's the blast radius if it fails."

 **Construction** — Pattern 3 (`edge_concept`):
 - Node concept: `Subscriber` (1,200 nodes)
@@ -248,8 +248,8 @@ Per-critical-tower blast radius (sorted by weighted_impact)

 ## Stage 4 — Predictive: forecast WEST capacity demand

-> **Skill:** _no public skill yet — see `v1/subscriber_retention/` and `v1/demand_forecasting/` as worked-example references_ ·
-> **Prompt:** "Train a regression GNN on per-region daily KPIs predicting subscriber growth rate. Use same-region 1-day-lag temporal edges, region as a category feature, and three lag features (previous-day growth, previous-week growth, and a 7-day rolling mean) computed before load. Train on rows before November 2024, validate on November, test on December, then bind each region's mean predicted growth back to every cell tower in that region as a per-tower demand multiplier."
+> **Skill:** `rai-predictive-modeling` + `rai-predictive-training` ·
+> **Prompt:** "Predict next-quarter subscriber-growth-rate per region using TimeSeriesMetric history (subscriber_growth_rate, churn_rate, marketing_spend, network_availability_pct, total_calls). Bind each region's forecast back to its towers as a demand multiplier."

 **Method:** GNN node regression on `TimeSeriesMetric` (composite key `metric_date` + `region`). Target: `subscriber_growth_rate`. Features: the other 12 daily KPIs + 3 lag features (`prev_day_growth`, `prev_week_growth`, `growth_7d_mean`) + `region` as a category. Graph: same-region 1-day-lag temporal edges. Train < 2024-11-01 (includes the Sep–Oct WEST decline onset); validate on Nov 2024; test on Dec 2024.

@@ -298,7 +298,7 @@ objective = sum( selected[t,tier] *

 ## Stage 5 — Prescriptive: tower upgrade selection MIP

 > **Skill:** `rai-prescriptive-problem-formulation` ·
-> **Prompt:** "Pick at most one upgrade tier (BRONZE, SILVER, or GOLD) per critical-restore tower using a binary decision variable on the tower-upgrade-option junction. Stay within a $5M total cost cap and 200 total install crew-weeks. Maximize the sum across selected options of capacity-increase × tower weighted-impact × tower projected-demand-growth, so the optimizer favors towers that are both broken and high-blast-radius, scaled by the regional demand forecast. Solve with Gurobi."
+> **Prompt:** "Recover WEST capacity within $5M and 200 install-weeks, prioritizing towers by social blast radius and forward-looking demand. From TowerUpgradeOption, pick at most one upgrade tier (BRONZE/SILVER/GOLD) per critical-restore tower, maximizing Σ capacity_increase × weighted_impact × projected_demand_growth."

 ```
 FORMULATION
@@ -371,7 +371,7 @@ Headline metrics

 ## Stage 6 — Interpretation

 > **Skill:** `rai-prescriptive-results-interpretation` ·
-> **Prompt:** "Summarize the optimal plan in business terms: total cost vs. budget, capacity restored, tier mix, towers covered, and how many subscribers stop being served by a critical tower over the install schedule. Identify which constraint is binding and what would change if it were relaxed by 10–20% (which tower would jump tiers, what the marginal capacity lift would be). List the per-stage ontology enrichments so the reader can see what each reasoner contributed back."
+> **Prompt:** "Summarize the plan: total cost, capacity restored, tier mix, towers covered. Which constraint is binding, and what would relaxing it by 10-20% unlock?"

 ```
 THE PLAN, IN BUSINESS TERMS

From 9681e2cf7935cf9f57afef1d8a87804ef01add33 Mon Sep 17 00:00:00 2001
From: cafzal
Date: Wed, 6 May 2026 09:24:46 -0700
Subject: [PATCH 04/22] Inline skill invocation in prompt callouts
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replace the two-line `**Skill:** ... · **Prompt:** "..."` callout with a
single-line `> /rai-skill "question"` form that mirrors how a user actually
invokes a skill in chat. Multi-skill stages stay readable as
`> /rai-A + /rai-B "question"`. One change per stage callout, all five
runbooks. No content rewrite.
--- v1/energy_grid_planning/references/runbook.md | 18 ++++++---------- v1/machine_maintenance/references/runbook.md | 18 ++++++---------- v1/portfolio_balancing/references/runbook.md | 18 ++++++---------- .../references/runbook.md | 18 ++++++---------- .../references/runbook.md | 21 +++++++------------ 5 files changed, 31 insertions(+), 62 deletions(-) diff --git a/v1/energy_grid_planning/references/runbook.md b/v1/energy_grid_planning/references/runbook.md index b98db6b..f1329ac 100644 --- a/v1/energy_grid_planning/references/runbook.md +++ b/v1/energy_grid_planning/references/runbook.md @@ -60,8 +60,7 @@ This runbook serves two audiences: ## Step 0 — Scope the question with `rai-discovery` -> **Skill:** `rai-discovery` · -> **Prompt:** "We have 10 hyperscaler interconnection requests against a 12-substation grid. Which to approve, which substation upgrades to fund, at what budget level?" +> `/rai-discovery` "We have 10 hyperscaler interconnection requests against a 12-substation grid. Which to approve, which substation upgrades to fund, at what budget level?" Discovery classifies the question by reasoner family and tells you which downstream skills to load: @@ -113,8 +112,7 @@ Steps are sequential — each depends on prior steps. Steps without a skill are ## Stage 1 — Predictive: substation load forecasting -> **Skill:** `rai-predictive-modeling` + `rai-predictive-training` · -> **Prompt:** "Can we forecast substation load growth over the next 36 months based on historical demand, planned generator additions, and the data center request pipeline? Bind each substation's predicted peak load back to the ontology so the rules engine and optimizer can read it." +> `/rai-predictive-modeling` + `/rai-predictive-training` "Can we forecast substation load growth over the next 36 months based on historical demand, planned generator additions, and the data center request pipeline? 
Bind each substation's predicted peak load back to the ontology so the rules engine and optimizer can read it." **Method:** load max forecasted load per substation as `Substation.predicted_load`. The template aggregates `DemandForecast.predicted_load_mw` over forecast horizons (6/12/18/24 months) and writes the max back to the substation. A pre-trained GNN can replace the table lookup; the script falls back gracefully when the GNN model registry is unavailable. @@ -159,8 +157,7 @@ effective_load = Substation.predicted_load | Substation.current_load_mw ## Stage 2 — Graph: grid topology & structural vulnerability -> **Skill:** `rai-graph-analysis` · -> **Prompt:** "Which substations are most critical to power flow based on grid topology? Use centrality on the transmission graph, then flag the top 3 as structurally critical and persist the scores back to the ontology." +> `/rai-graph-analysis` "Which substations are most critical to power flow based on grid topology? Use centrality on the transmission graph, then flag the top 3 as structurally critical and persist the scores back to the ontology." **Construction** — `Substation` as the node concept directly (no mirror concept): - Node concept: `Substation` (12 nodes) @@ -210,8 +207,7 @@ Centrality (top-3 marked is_structurally_critical) ## Stage 3 — Rules: interconnection queue compliance -> **Skill:** `rai-rules-authoring` · -> **Prompt:** "Screen each data center request against three criteria: (1) substation must have enough capacity after predicted load, (2) if 100% low-carbon required, region must have 25%+ renewable, (3) substation shouldn't be most structurally critical. Which requests pass all three?" +> `/rai-rules-authoring` "Screen each data center request against three criteria: (1) substation must have enough capacity after predicted load, (2) if 100% low-carbon required, region must have 25%+ renewable, (3) substation shouldn't be most structurally critical. Which requests pass all three?" 
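The three-flag screen in the prompt above can be sketched in plain Python. This is an illustrative sketch only, not the template's RAI rules DSL (which expresses the same logic as `model.where(...).define(...)` blocks); the field names and the Crusoe/DFW numbers are assumptions drawn loosely from the runbook's figures.

```python
# Illustrative plain-Python version of the Stage 3 compliance screen.
# Field names are assumptions, not the template's actual schema.

def screen_request(req, substation, region):
    # (1) substation must still have headroom after predicted load
    fails_capacity = (
        substation["predicted_load"] + req["requested_mw"]
        > substation["capacity_mw"]
    )
    # (2) a 100% low-carbon mandate needs a 25%+ renewable region
    fails_low_carbon = (
        req["requires_low_carbon"] and region["renewable_pct"] < 0.25
    )
    # (3) the substation shouldn't be among the most structurally critical
    fails_structural = substation["is_structurally_critical"]
    return {
        "fails_capacity": fails_capacity,
        "fails_low_carbon": fails_low_carbon,
        "fails_structural": fails_structural,
        "is_compliant": not (
            fails_capacity or fails_low_carbon or fails_structural
        ),
    }

# A Crusoe-like request on Midland: headroom, renewables, non-critical.
crusoe = screen_request(
    {"requested_mw": 200, "requires_low_carbon": True},
    {"predicted_load": 800, "capacity_mw": 1_500,
     "is_structurally_critical": False},
    {"renewable_pct": 0.40},
)

# A DFW-like request: predicted 1,700 MW vs 1,600 MW capacity, and the
# substation is flagged structurally critical -> both flags fire.
dfw = screen_request(
    {"requested_mw": 500, "requires_low_carbon": False},
    {"predicted_load": 1_700, "capacity_mw": 1_600,
     "is_structurally_critical": True},
    {"renewable_pct": 0.28},
)
```

The composite mirrors the runbook's `is_compliant`: it fires only when none of the three failure flags fire.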
Three declarative `Relationship` rules consume Stages 1–2 enrichments. Each is written as a `model.where(...).define(...)` block; a composite `is_compliant` fires only when none of the three failure flags fire. @@ -275,8 +271,7 @@ Every request passes low-carbon — ERCOT's nuclear (STP, Comanche Peak) plus it ## Stage 4 — Prescriptive: joint DC approval + upgrade MIP -> **Skill:** `rai-prescriptive-problem-formulation` · -> **Prompt:** "Decide which data center requests to approve and which substation upgrades to fund at $200M, $300M, $400M, $500M, and $600M investment levels. Maximize annual revenue. A request can only be approved if its substation has enough capacity after upgrades." +> `/rai-prescriptive-problem-formulation` "Decide which data center requests to approve and which substation upgrades to fund at $200M, $300M, $400M, $500M, and $600M investment levels. Maximize annual revenue. A request can only be approved if its substation has enough capacity after upgrades." ``` FORMULATION @@ -352,8 +347,7 @@ PARETO FRONTIER (queried directly from ontology) ## Stage 5 — Interpretation -> **Skill:** `rai-prescriptive-results-interpretation` · -> **Prompt:** "Which data centers get approved, which upgrades are selected, and where's the biggest return on investment at each budget level?" +> `/rai-prescriptive-results-interpretation` "Which data centers get approved, which upgrades are selected, and where's the biggest return on investment at each budget level?" ``` THE PLAN, IN BUSINESS TERMS diff --git a/v1/machine_maintenance/references/runbook.md b/v1/machine_maintenance/references/runbook.md index 3f8cc04..ca6201c 100644 --- a/v1/machine_maintenance/references/runbook.md +++ b/v1/machine_maintenance/references/runbook.md @@ -58,8 +58,7 @@ This runbook serves two audiences: ## Step 0 — Scope the question with `rai-discovery` -> **Skill:** `rai-discovery` · -> **Prompt:** "We need to schedule preventive maintenance for 30 machines across 3 plants. 
Where does OEE alone mislead us, and what structural risks won't a pure optimizer surface?" +> `/rai-discovery` "We need to schedule preventive maintenance for 30 machines across 3 plants. Where does OEE alone mislead us, and what structural risks won't a pure optimizer surface?" Discovery classifies the question by reasoner family and tells you which downstream skills to load: @@ -112,36 +111,31 @@ The runbook walks the same chain stage-by-stage, prompt-by-prompt, in agent-skil ## Stage 0 — Querying: operational intelligence -> **Skill:** `rai-querying` · -> **Prompt:** "What's the OEE by plant? Which machines have the most sensor anomalies, and which are most likely to fail by the end of the planning horizon?" +> `/rai-querying` "What's the OEE by plant? Which machines have the most sensor anomalies, and which are most likely to fail by the end of the planning horizon?" This stage establishes the operational baseline. Plant_C leads at 79.8% OEE; Plant_B trails at 61.4%. But Plant_A — middle of the OEE pack at 68.2% — owns 7 of 9 sensor anomalies and the three steepest failure trajectories (M001, M013, M016). The querying stage writes nine derived properties on `Machine` plus `MachinePeriod.predicted_fp` (120 rows), and Stage 3 reads `predicted_fp` directly into the failure-cost objective term. ## Stage 1 — Graph: dependency clusters and bottleneck centrality -> **Skill:** `rai-graph-analysis` · -> **Prompt:** "Which machines share qualified technicians, and which are bottlenecks in the qualification network? Compute centrality and write it back to each machine so the optimizer can weight critical machines." +> `/rai-graph-analysis` "Which machines share qualified technicians, and which are bottlenecks in the qualification network? Compute centrality and write it back to each machine so the optimizer can weight critical machines." The 30 machines form a single connected component — every machine is reachable through shared qualifications. 
Pump-type machines tie at the top of betweenness (raw 24.0, normalized 1.0): M003 (Plant_C), M008 (Plant_B), M013 (Plant_A). The normalized centrality is consumed by Stage 3's failure-cost multiplier `(1 + 2.0 × betweenness)`, so leaving a bottleneck Pump vulnerable is markedly more expensive than leaving a peripheral Motor vulnerable. ## Stage 2 — Rules: compliance flags and composite risk tier -> **Skill:** `rai-rules-authoring` · -> **Prompt:** "Rate each machine's risk: chronic if >8 downtime events, high-risk if failure prob >0.3 AND criticality 4+, plus overdue for maintenance. All three flags = Critical, two = Elevated, otherwise Standard." +> `/rai-rules-authoring` "Rate each machine's risk: chronic if >8 downtime events, high-risk if failure prob >0.3 AND criticality 4+, plus overdue for maintenance. All three flags = Critical, two = Elevated, otherwise Standard." Six machines overdue, one high-risk (M013), three chronic-downtime, four parts-reorder, five expiring certs. The composite tier surfaces a single Critical machine — M013 (Pump, Plant_A) — and a single Elevated machine — M016 (Turbine, Plant_A). The overdue flag is consumed by Stage 3 as a hard constraint: every overdue machine must be scheduled by period 2. ## Stage 3 — Prescriptive: maintenance schedule -> **Skill:** `rai-prescriptive-problem-formulation` · -> **Prompt:** "Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and Turbines need an on-site qualified technician. Minimize expected failure cost weighted by criticality and centrality, plus labor and travel." +> `/rai-prescriptive-problem-formulation` "Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and Turbines need an on-site qualified technician. 
Minimize expected failure cost weighted by criticality and centrality, plus labor and travel." The solver returns OPTIMAL with objective $605,240.61 and 20 maintenance jobs across the four periods (capacity-binding at 5 jobs/period). M013 and M016 — Plant_A's Critical and Elevated machines — are both scheduled by period 1, satisfying the overdue deadline. Several Turbine assignments require travel because all three Turbine-qualified techs are based in Houston_TX while four of six Turbines sit at Plant_A and Plant_C. The optimizer pays the travel cost; it cannot restructure the qualification pool. ## Stage 4 — Resilience: concentration sweep and cross-training -> **Skill:** `rai-prescriptive-solver-management` + `rai-prescriptive-results-interpretation` · -> **Prompt:** "For each machine type, are all qualified technicians concentrated in one location? How many scheduled jobs required travel, and what's the cheapest cross-training option to eliminate the single-point-of-failure?" +> `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "For each machine type, are all qualified technicians concentrated in one location? How many scheduled jobs required travel, and what's the cheapest cross-training option to eliminate the single-point-of-failure?" Turbine is the concentrated type — all three qualified techs (T001, T002, T003) sit in Houston_TX, and 67% of scheduled Turbine jobs already require travel. The recommended fix: cross-train T006 (Senior, Chicago_IL) for $3,200 over 5 weeks. That single addition eliminates the Houston single-point-of-failure for Turbine work at Plant_B and Plant_C, and pays back the first time a weather event, illness, or expiring cert would have idled a Turbine job that the optimizer would otherwise have left uncovered. 
diff --git a/v1/portfolio_balancing/references/runbook.md b/v1/portfolio_balancing/references/runbook.md index 6640840..3df9efa 100644 --- a/v1/portfolio_balancing/references/runbook.md +++ b/v1/portfolio_balancing/references/runbook.md @@ -54,8 +54,7 @@ This runbook serves two audiences: ## Step 0 — Scope the question with `rai-discovery` -> **Skill:** `rai-discovery` · -> **Prompt:** "Our 8-stock book breaks compliance and concentrates risk. Rebuild it under Markowitz mean-variance with caps, deduplicate redundant bets via correlation clustering, and stress-test under crisis. What questions does each reasoner family handle?" +> `/rai-discovery` "Our 8-stock book breaks compliance and concentrates risk. Rebuild it under Markowitz mean-variance with caps, deduplicate redundant bets via correlation clustering, and stress-test under crisis. What questions does each reasoner family handle?" Discovery classifies the question by reasoner family and tells you which downstream skills to load: @@ -106,8 +105,7 @@ The runbook walks the same chain stage-by-stage, prompt-by-prompt, in agent-skil ## Stage 1 — Rules: compliance scan -> **Skill:** `rai-rules-authoring` · -> **Prompt:** "Flag any holding worth more than 15% of its account, any sector worth more than 30% of the account, and any user with a risk score above 0.8 and more than five flagged transactions." +> `/rai-rules-authoring` "Flag any holding worth more than 15% of its account, any sector worth more than 30% of the account, and any user with a risk score above 0.8 and more than five flagged transactions." ``` COMPLIANCE VIOLATIONS — current book (4 accounts, 15 holdings, 6 users) @@ -147,8 +145,7 @@ COMPLIANCE VIOLATIONS — current book (4 accounts, 15 holdings, 6 users) ## Stage 2 — Graph: covariance clustering + cluster representatives -> **Skill:** `rai-graph-analysis` · -> **Prompt:** "Cluster stocks by correlation — anything above 0.3 absolute is a redundant bet. 
Pick one representative per cluster (highest Sharpe ratio) and force the rest to zero in optimization." +> `/rai-graph-analysis` "Cluster stocks by correlation — anything above 0.3 absolute is a redundant bet. Pick one representative per cluster (highest Sharpe ratio) and force the rest to zero in optimization." **Construction** — undirected, unweighted graph: - Node concept: `Stock` (8 nodes) @@ -218,8 +215,7 @@ REPRESENTATIVE = HIGHEST SHARPE PER CLUSTER (returns / volatility) ## Stage 3 — Prescriptive: bi-objective QP with epsilon constraint -> **Skill:** `rai-prescriptive-problem-formulation` · -> **Prompt:** "Build a Markowitz mean-variance frontier across 6 scenarios = 3 budgets × 2 regimes. Position cap 30% of budget, sector cap 30%, non-representatives forced to zero. Anchor with min-risk and max-return, then sweep 5 epsilon points across the return range." +> `/rai-prescriptive-problem-formulation` "Build a Markowitz mean-variance frontier across 6 scenarios = 3 budgets × 2 regimes. Position cap 30% of budget, sector cap 30%, non-representatives forced to zero. Anchor with min-risk and max-return, then sweep 5 epsilon points across the return range." ``` FORMULATION @@ -276,8 +272,7 @@ Epsilon sweep 5 interior points evenly spaced across the range ## Stage 3 — Reading the frontier (per scenario) -> **Skill:** `rai-prescriptive-results-interpretation` · -> **Prompt:** "For each scenario, list the seven-point Pareto frontier and find the knee — where does the marginal risk per unit return jump the most? Is the rate-form frontier shape consistent across budgets?" +> `/rai-prescriptive-results-interpretation` "For each scenario, list the seven-point Pareto frontier and find the knee — where does the marginal risk per unit return jump the most? Is the rate-form frontier shape consistent across budgets?" 
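Knee-finding as described in this prompt can be sketched as a second-difference scan over the frontier: the knee is the interior point where the slope (marginal risk per unit return) jumps the most between consecutive segments. The frontier values below are made up for illustration, not the template's solve output.

```python
# Hedged sketch of knee detection on a (return, risk) Pareto frontier.
# Input points are illustrative, not the template's actual frontier.

def find_knee(frontier):
    # frontier: list of (ret, risk) pairs sorted by increasing return
    slopes = [
        (r2 - r1) / (e2 - e1)  # marginal risk per unit return
        for (e1, r1), (e2, r2) in zip(frontier, frontier[1:])
    ]
    jumps = [b - a for a, b in zip(slopes, slopes[1:])]
    # The knee is the point at which the outgoing slope exceeds the
    # incoming slope by the largest margin.
    return frontier[jumps.index(max(jumps)) + 1]

points = [(0.05, 0.08), (0.06, 0.085), (0.07, 0.095),
          (0.08, 0.13), (0.09, 0.20)]
knee = find_knee(points)  # -> (0.08, 0.13)
```

Past the knee, each extra unit of return buys disproportionately more volatility, which is the "is the shape consistent across budgets" question the prompt then asks.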
``` EFFICIENT FRONTIER — base_500 (budget = 500, regime = base) @@ -312,8 +307,7 @@ EFFICIENT FRONTIER — base_500 (budget = 500, regime = base) ## Stage 4 — Crisis stress test -> **Skill:** `rai-prescriptive-solver-management` + `rai-prescriptive-results-interpretation` · -> **Prompt:** "Stress-test the frontier under crisis: shrink correlations toward all-ones with weight 0.7 on base covariance + 0.3 on outer-product. How much volatility expansion at each frontier point — does the gap peak mid-frontier or at the concentrated end?" +> `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Stress-test the frontier under crisis: shrink correlations toward all-ones with weight 0.7 on base covariance + 0.3 on outer-product. How much volatility expansion at each frontier point — does the gap peak mid-frontier or at the concentrated end?" Same `solve_epsilon` call, no separate model — `Scenario.regime` selects between two `Stock.regime_covar` definitions: diff --git a/v1/supply_chain_resilience/references/runbook.md b/v1/supply_chain_resilience/references/runbook.md index cd2670f..877c451 100644 --- a/v1/supply_chain_resilience/references/runbook.md +++ b/v1/supply_chain_resilience/references/runbook.md @@ -54,8 +54,7 @@ The two views share one ontology: each stage writes properties back to the model ## Step 0 — Scope the question with `rai-discovery` -> **Skill:** `rai-discovery` · -> **Prompt:** "We need a risk-adjusted routing plan. What's our exposure to each supplier, which sites are bottlenecks, which suppliers are unreliable, and what does the minimum-cost flow look like once those risks are priced in?" +> `/rai-discovery` "We need a risk-adjusted routing plan. What's our exposure to each supplier, which sites are bottlenecks, which suppliers are unreliable, and what does the minimum-cost flow look like once those risks are priced in?" 
Discovery classifies the question by reasoner family and tells you which downstream skills to load: @@ -78,8 +77,7 @@ See the template's main `README.md` for installation, RAI connection setup, and ## Stage 0 — Reachability: blast-radius pre-analysis -> **Skill:** `rai-graph-analysis` · -> **Prompt:** "If a key supplier goes offline, which downstream buyers and finished products are at risk? For each HIGH-priority customer, list the suppliers it transitively depends on through the shipment graph, with their reliability scores." +> `/rai-graph-analysis` "If a key supplier goes offline, which downstream buyers and finished products are at risk? For each HIGH-priority customer, list the suppliers it transitively depends on through the shipment graph, with their reliability scores." **Construction** — directed `Business` graph, edges from `Business.ships_to` (derived from `Shipment.supplier` -> `Shipment.customer`). @@ -115,8 +113,7 @@ The point of running reachability before the MILP: when the scenario in Stage 3 ## Stage 1 — Graph: site centrality + connected components -> **Skill:** `rai-graph-analysis` · -> **Prompt:** "Which sites are the most influential hubs in the supply network — sites that connect to other influential sites, not just sites with many direct connections? Persist the centrality score back to each site so the optimizer can use it as a bottleneck weight." +> `/rai-graph-analysis` "Which sites are the most influential hubs in the supply network — sites that connect to other influential sites, not just sites with many direct connections? Persist the centrality score back to each site so the optimizer can use it as a bottleneck weight." **Construction:** - Node concept: `Site` (31 sites) @@ -157,8 +154,7 @@ Top critical sites — eigenvector centrality (FACTORY/DC only) ## Stage 2 — Rules: supplier risk classification -> **Skill:** `rai-rules-authoring` · -> **Prompt:** "Rate each supplier's delivery reliability. 
Flag any with reliability score below 0.80 as unreliable, any with a Q1 delay prediction above 0.15 as high-delay-risk, and call them 'watch-level' if either fires. Suppliers with **both** flags are 'avoid' (hard-blocked downstream); suppliers with **either** flag are 'watch' (surcharged)." +> `/rai-rules-authoring` "Rate each supplier's delivery reliability. Flag any with reliability score below 0.80 as unreliable, any with a Q1 delay prediction above 0.15 as high-delay-risk, and call them 'watch-level' if either fires. Suppliers with **both** flags are 'avoid' (hard-blocked downstream); suppliers with **either** flag are 'watch' (surcharged)." **Late-shipment context** (computed in pandas, not RAI): @@ -232,8 +228,7 @@ Stage 3 reads `is_watch_level` for the surcharge term and `is_unreliable AND has ## Stage 3 — Prescriptive: risk-adjusted minimum-cost flow -> **Skill:** `rai-prescriptive-problem-formulation` · -> **Prompt:** "Solve a minimum-cost flow that fulfills all open demand orders at minimum total transport cost. Hard-block 'avoid' suppliers, surcharge 'watch' suppliers $5/unit, weight bottleneck sites by their centrality, and penalize unmet demand at $100/unit." +> `/rai-prescriptive-problem-formulation` "Solve a minimum-cost flow that fulfills all open demand orders at minimum total transport cost. Hard-block 'avoid' suppliers, surcharge 'watch' suppliers $5/unit, weight bottleneck sites by their centrality, and penalize unmet demand at $100/unit." ``` FORMULATION @@ -278,8 +273,7 @@ The baseline buys: enough finished-goods flow on the shortest cost-weighted lane ## Scenario analysis — quantify disruption -> **Skill:** `rai-prescriptive-solver-management` + `rai-prescriptive-results-interpretation` · -> **Prompt:** "Re-solve with the highest-centrality site offline, and again with watch-level suppliers downgraded to avoid. What's the cost delta in each, and why are they asymmetric?" 
+> `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Re-solve with the highest-centrality site offline, and again with watch-level suppliers downgraded to avoid. What's the cost delta in each, and why are they asymmetric?" The same `solve_flow(...)` function re-runs with modified constraints. Two scenarios surface different aspects of the chain's value: diff --git a/v1/telco_network_recovery/references/runbook.md b/v1/telco_network_recovery/references/runbook.md index 188533e..8cb676e 100644 --- a/v1/telco_network_recovery/references/runbook.md +++ b/v1/telco_network_recovery/references/runbook.md @@ -54,8 +54,7 @@ A single-reasoner approach can't answer this. Descriptive alone tells the story ## Step 0 — Scope the question with `rai-discovery` -> **Skill:** `rai-discovery` · -> **Prompt:** "WEST is missing revenue while every other region grows. What questions do we need to answer to figure out where to spend $5M to fix it?" +> `/rai-discovery` "WEST is missing revenue while every other region grows. What questions do we need to answer to figure out where to spend $5M to fix it?" Discovery classifies the question by reasoner family and tells you which downstream skills to load: @@ -80,8 +79,7 @@ See the template's main `README.md` for installation, RAI connection setup, and ## Stage 1 — Descriptive: diagnose WEST -> **Skill:** `rai-querying` · -> **Prompt:** "Where are we missing revenue targets? Which 10 cell towers have the worst average packet loss over 2024, and which region has the worst Q4 network availability?" +> `/rai-querying` "Where are we missing revenue targets? Which 10 cell towers have the worst average packet loss over 2024, and which region has the worst Q4 network availability?" 
``` Q4 2024 — Daily KPIs by region @@ -132,8 +130,7 @@ Subscriber.churn_risk_score is a static feature that hasn't caught up to WEST's ## Stage 2 — Rules: flag critical_restore towers -> **Skill:** `rai-rules-authoring` · -> **Prompt:** "Flag CellTowers as 'critical-restore' if region is WEST AND status is DEGRADED AND avg equipment health is below 0.85, OR if avg packet loss > 5% with health below 0.85." +> `/rai-rules-authoring` "Flag CellTowers as 'critical-restore' if region is WEST AND status is DEGRADED AND avg equipment health is below 0.85, OR if avg packet loss > 5% with health below 0.85." **Properties added to the ontology** (via `model.define(...)`): - `CellTower.avg_packet_loss` (Float) — `aggs.avg(NetworkPerformance.packet_loss_pct).per(CellTower)` @@ -189,8 +186,7 @@ Branch 2 didn't fire — none of WEST's ACTIVE towers fall below health 0.85. Th ## Stage 3 — Graph: subscriber influence + tower blast radius -> **Skill:** `rai-graph-analysis` · -> **Prompt:** "Who are our most socially influential subscribers based on call patterns? For each critical-restore tower, count the distinct subscribers whose calls route through it and rank by total PageRank influence — that's the blast radius if it fails." +> `/rai-graph-analysis` "Who are our most socially influential subscribers based on call patterns? For each critical-restore tower, count the distinct subscribers whose calls route through it and rank by total PageRank influence — that's the blast radius if it fails." **Construction** — Pattern 3 (`edge_concept`): - Node concept: `Subscriber` (1,200 nodes) @@ -248,8 +244,7 @@ Per-critical-tower blast radius (sorted by weighted_impact) ## Stage 4 — Predictive: forecast WEST capacity demand -> **Skill:** `rai-predictive-modeling` + `rai-predictive-training` · -> **Prompt:** "Predict next-quarter subscriber-growth-rate per region using TimeSeriesMetric history (subscriber_growth_rate, churn_rate, marketing_spend, network_availability_pct, total_calls). 
Bind each region's forecast back to its towers as a demand multiplier." +> `/rai-predictive-modeling` + `/rai-predictive-training` "Predict next-quarter subscriber-growth-rate per region using TimeSeriesMetric history (subscriber_growth_rate, churn_rate, marketing_spend, network_availability_pct, total_calls). Bind each region's forecast back to its towers as a demand multiplier." **Method:** GNN node regression on `TimeSeriesMetric` (composite key `metric_date` + `region`). Target: `subscriber_growth_rate`. Features: the other 12 daily KPIs + 3 lag features (`prev_day_growth`, `prev_week_growth`, `growth_7d_mean`) + `region` as a category. Graph: same-region 1-day-lag temporal edges. Train < 2024-11-01 (includes the Sep–Oct WEST decline onset); validate on Nov 2024; test on Dec 2024. @@ -297,8 +292,7 @@ objective = sum( selected[t,tier] * ## Stage 5 — Prescriptive: tower upgrade selection MIP -> **Skill:** `rai-prescriptive-problem-formulation` · -> **Prompt:** "Recover WEST capacity within $5M and 200 install-weeks, prioritizing towers by social blast radius and forward-looking demand. From TowerUpgradeOption, pick at most one upgrade tier (BRONZE/SILVER/GOLD) per critical-restore tower, maximizing Σ capacity_increase × weighted_impact × projected_demand_growth." +> `/rai-prescriptive-problem-formulation` "Recover WEST capacity within $5M and 200 install-weeks, prioritizing towers by social blast radius and forward-looking demand. From TowerUpgradeOption, pick at most one upgrade tier (BRONZE/SILVER/GOLD) per critical-restore tower, maximizing Σ capacity_increase × weighted_impact × projected_demand_growth." ``` FORMULATION @@ -370,8 +364,7 @@ Headline metrics ## Stage 6 — Interpretation -> **Skill:** `rai-prescriptive-results-interpretation` · -> **Prompt:** "Summarize the plan: total cost, capacity restored, tier mix, towers covered. Which constraint is binding, and what would relaxing it by 10-20% unlock?" 
+> `/rai-prescriptive-results-interpretation` "Summarize the plan: total cost, capacity restored, tier mix, towers covered. Which constraint is binding, and what would relaxing it by 10-20% unlock?" ``` THE PLAN, IN BUSINESS TERMS From 78323e33e9249d0a4ad316b6538a4550da99eace Mon Sep 17 00:00:00 2001 From: cafzal Date: Wed, 6 May 2026 09:27:19 -0700 Subject: [PATCH 05/22] Trim runbooks to overview + workflow table + data footer Each runbook is now ~50 lines: 1-paragraph intro, the TL;DR chain ASCII, a workflow table (skill + prompt + expected output per step), and a brief data footer. All per-stage narrative subsections, "how to read" / "adapting" / "why the chain matters" sections, and duplicate enrichment diagrams are gone. Total: 5 files, ~2000 lines removed, ~260 retained. --- v1/energy_grid_planning/references/runbook.md | 486 +----------------- v1/machine_maintenance/references/runbook.md | 257 +-------- v1/portfolio_balancing/references/runbook.md | 465 +---------------- .../references/runbook.md | 383 +------------- .../references/runbook.md | 473 +---------------- 5 files changed, 64 insertions(+), 2000 deletions(-) diff --git a/v1/energy_grid_planning/references/runbook.md b/v1/energy_grid_planning/references/runbook.md index f1329ac..87242eb 100644 --- a/v1/energy_grid_planning/references/runbook.md +++ b/v1/energy_grid_planning/references/runbook.md @@ -1,12 +1,8 @@ # Runbook: Energy Grid Planning — Multi-Reasoner Walkthrough -Walk-through of the chained-reasoner pattern this template is built on. One realistic business thread — **ERCOT processes 10 hyperscaler interconnection requests against the 12-substation Texas grid** — traced across four RAI reasoner families, each stage writing properties back to the same ontology that downstream stages consume. +ERCOT processes 10 hyperscaler interconnection requests (2,930 MW) against a 12-substation Texas grid. 
The chain forecasts substation load, finds structural bottlenecks, screens compliance, and produces a Pareto frontier across 5 budget levels — no single reasoner can answer this end-to-end. -The template's combined script (`energy_grid_planning.py`) implements stages 1–4 directly; this runbook walks through how an agent would derive the same pipeline prompt-by-prompt, skill-by-skill, so a non-OR reader can follow the full reasoning thread end-to-end. - ---- - -## TL;DR — the chain in one screen +## The chain ``` ERCOT has 10 hyperscaler interconnection requests totalling 2,930 MW @@ -40,473 +36,17 @@ $300M unlocks 5 DCs (1,500 MW, $264M net value) including xAI Colossus. ───────────────────────────────────────────────────────────────── ``` -A single-reasoner approach can't answer this. Predictive alone forecasts demand but doesn't decide. Graph alone finds bottlenecks but doesn't weigh revenue. Rules alone flag noncompliance but can't optimize. Prescriptive alone has no way to weigh "critical" without graph + predictive upstream. - ---- - -## How to read this runbook - -This runbook serves two audiences: - -- **Reading top-to-bottom**: the narrative + ASCII visualizations show what - the chain produces stage-by-stage, with the same business framing the - stakeholder would see. -- **Per-stage skill blocks**: the boxed `Skill / Prompt` callout at the - start of each stage is the recipe — load that RAI agent skill, give it - that prompt against the bundled demo data, and the agent will reproduce - the stage. - ---- - -## Step 0 — Scope the question with `rai-discovery` - -> `/rai-discovery` "We have 10 hyperscaler interconnection requests against a 12-substation grid. Which to approve, which substation upgrades to fund, at what budget level?" 
- -Discovery classifies the question by reasoner family and tells you which -downstream skills to load: - -| Sub-question | Reasoner | Skill | -|---|---|---| -| Which substations are losing headroom fastest and will breach capacity? | Predictive | _no public skill yet — see `v1/subscriber_retention/` and `v1/demand_forecasting/` as worked-example references_ | -| Which substations are structural bottlenecks on the transmission grid? Which communities? | Graph | `rai-graph-analysis` | -| Which DC requests fail capacity, structural-risk, or low-carbon compliance? | Rules | `rai-rules-authoring` | -| Across budget levels, which DCs should we approve and which upgrades should we fund? | Prescriptive | `rai-prescriptive-problem-formulation` | -| What does the Pareto frontier tell us about the knee, marginal value, and binding constraints? | Prescriptive (post-solve) | `rai-prescriptive-results-interpretation` | - -Discovery's output is a *plan*, not code. Everything that follows -materializes that plan. - ---- - -## Prerequisites - -The template ships everything needed to run the chain end-to-end: - -- Bundled CSVs in `../data/` (12 substations, 15 generators, 18 transmission lines, 10 DC requests, 10 substation upgrades, and supporting load/forecast tables). To run against your own Snowflake schema, swap CSV loaders for `model.Table(...)` references. -- `raiconfig.yaml` pointed at your RAI-enabled Snowflake account. -- The combined script `../energy_grid_planning.py` runs the full chain. - -The narrative below follows the actual stage outputs of `energy_grid_planning.py`. Steps 1–3 below are an agent-style walk-through of how the ontology would be built from scratch via skills like `/rai-build-starter-ontology` and `/rai-discovery`; the template ships a pre-built ontology in `energy_grid_planning.py` for users who want to skip ahead to the reasoner stages. - ---- - -## Walk-through (agent-style) - -Steps are sequential — each depends on prior steps. 
Steps without a skill are presentation-only. - -| # | Step | Skill | Prompt | Expected Output | -|---|------|-------|--------|-----------------| -| 1 | Ontology | `/rai-build-starter-ontology` | "Build an ontology for ERCOT grid infrastructure planning from the bundled CSVs (or your Snowflake schema)." | 13 concepts: `Substation`, `Generator`, `TransmissionLine`, `LoadZone`, `DemandPeriod`, `RenewableProfile`, `MaintenanceWindow`, `Customer`, `DataCenterRequest`, `SubstationUpgrade`, `DemandForecast`, `LoadHistory`, `DCAnnouncement`. 12 substations, 15 generators, 18 transmission lines, 10 DC requests (2,930 MW). | -| 2 | Visualize | — | "Show the ontology as an ASCII diagram." | Concept map with `Substation` as central hub — `Generator`, `TransmissionLine`, `DataCenterRequest`, `SubstationUpgrade`, `DemandForecast` all relate to it. | -| 3 | Discovery | `/rai-discovery` | "What questions can we answer with this ontology? We're evaluating data center interconnection requests." | 4 reasoning paths: substation load forecast (predictive), grid topology + structural criticality (graph), interconnection compliance (rules), joint approval + upgrade investment (prescriptive). | -| 4 | Explore: generation mix | `/rai-querying` | "What's our current generation mix by fuel type? How much renewable capacity vs fossil?" | 15 generators, 8,135 MW total. Nuclear leads (2,560 MW, 31.5%), then gas (2,290 MW, 28.1%), wind (1,250 MW, 15.4%), coal (1,020 MW, 12.5%), solar (630 MW, 7.7%), battery (300 MW), hydro (85 MW). Renewable: 2,265 MW (28%). Requests with 100% low-carbon mandates (Google, Crusoe) face a structural constraint. | -| 5 | Explore: capacity headroom | `/rai-querying` | "Which substations have the most and least spare capacity right now, before any new DC load?" | Tightest: Houston Ship Channel (69.4% utilized, 550 MW headroom), Austin Energy (68.9%, 280 MW), DFW (68.8%, 500 MW). Most spare: Midland-Permian (38.2%, 680 MW headroom), Lubbock (44.3%, 390 MW). 
DFW has only 500 MW headroom but 1,100 MW of DC requests stacked on it. | -| 6 | Explore: DC request landscape | `/rai-querying` | "Summarize the 10 DC requests — total MW per substation, revenue per MW, low-carbon requirements." | 2,930 MW total, $528M/yr revenue across 6 substations. DFW most stacked (1,100 MW, 3 requests: Google $195K/MW, xAI $210K/MW, Lambda $150K/MW). xAI is highest revenue ($210K/MW/yr, $105M/yr total). Google and Crusoe require 100% low-carbon. Top 3 substations (DFW, Houston, San Antonio) account for 78% of requested MW. | -| 7 | Stage 1 — Predict | `/rai-querying` (or GNN) | "Forecast which substations are losing headroom fastest and which will breach capacity." | `Substation.predicted_load` written for all 12. DFW: 1,700 MW predicted vs 1,600 MW capacity at 24 months (+54.6% growth). Houston Ship Channel: 1,797 MW (+43.8%, within capacity). The only substation predicted to breach. | -| 8 | Stage 2 — Graph | `/rai-graph-analysis` | "Build a graph on the transmission grid. Find structurally critical substations using betweenness, degree, and eigenvector centrality. Identify connected components and Louvain communities." | 1 connected component, 3 communities (North Texas, West Texas, Gulf Coast). Top 3 by combined centrality rank: DFW, Houston, San Antonio — all flagged `is_structurally_critical`. 7 of 10 DC requests target critical substations. | -| 9 | Stage 3 — Rules | `/rai-rules-authoring` | "Check each request against capacity (using `predicted_load`), low-carbon mandate, and structural risk (using `is_structurally_critical`)." | 3 declarative `Relationship` rules (`fails_capacity`, `fails_structural`, `fails_low_carbon`) + composite `is_compliant`. All 10 pass low-carbon. 2 compliant: Crusoe (Midland) and Oracle (Corpus Christi). 8 flagged on capacity + structural risk. | -| 10 | Stage 4 — Optimize | `/rai-prescriptive-problem-formulation` | "Which DCs to approve and which upgrades to fund across 5 budget levels ($200M-$600M)? 
Use `predicted_load` for capacity. Show the Pareto frontier — DCs, MW, revenue, net value at each level." | Pareto frontier across `InvestmentLevel` Scenario Concept. Knee at $300M (5 DCs, 1,500 MW, $264M net value). xAI Colossus unlocks at $300M. Google and Lambda never approved (DFW full). | -| 11 | Results | `/rai-prescriptive-results-interpretation` | "How do approvals and upgrades vary by investment level? Where's the knee? What's marginal return per $M?" | Per-level DC list + selected upgrades, queried via `model.select(...).where(x_approve > 0.5)`. Marginal: $200→$300M = $995K/$M (knee); declines to $400K/$M by $600M. | - ---- - -## Stage 1 — Predictive: substation load forecasting - -> `/rai-predictive-modeling` + `/rai-predictive-training` "Can we forecast substation load growth over the next 36 months based on historical demand, planned generator additions, and the data center request pipeline? Bind each substation's predicted peak load back to the ontology so the rules engine and optimizer can read it." - -**Method:** load max forecasted load per substation as `Substation.predicted_load`. The template aggregates `DemandForecast.predicted_load_mw` over forecast horizons (6/12/18/24 months) and writes the max back to the substation. A pre-trained GNN can replace the table lookup; the script falls back gracefully when the GNN model registry is unavailable. 
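The max-over-horizons write-back just described can be sketched in plain Python. The rows below are illustrative (shaped to echo DFW's numbers); the template itself expresses this as a derived property aggregating over `DemandForecast`:

```python
# DemandForecast-style rows: (substation_id, horizon_months, predicted_load_mw).
forecasts = [
    ("SUB-002", 6, 1280.0), ("SUB-002", 12, 1430.0),
    ("SUB-002", 18, 1590.0), ("SUB-002", 24, 1700.0),
    ("SUB-007", 6, 560.0), ("SUB-007", 24, 600.0),
]
# Current load per substation; SUB-005 deliberately has no forecast rows.
current_load = {"SUB-002": 1100.0, "SUB-007": 540.0, "SUB-005": 420.0}

# Substation.predicted_load = max forecast across all horizons.
predicted = {}
for sub, _, mw in forecasts:
    predicted[sub] = max(predicted.get(sub, 0.0), mw)

# effective_load = predicted_load | current_load_mw: coalesce to current load
# when no forecast exists, mirroring the expression Stages 3 and 4 reuse.
effective_load = {sub: predicted.get(sub, load) for sub, load in current_load.items()}
```

The coalescing fallback is what lets the downstream rules and optimizer run even for substations the predictive stage never scored.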
- -``` -Substation load forecast (max across 6/12/18/24-month horizons) - - Houston Ship Channel ──── pred 1,797 MW ████████████ +43.8% safe - Dallas-Fort Worth ──── pred 1,700 MW ██████████ +54.6% ▲ breach 24mo - San Antonio Metro ──── pred 1,069 MW ███████ +37.1% safe - Austin Energy ──── pred 819 MW █████ +32.1% safe - Waco Gateway ──── pred 600 MW ████ +22.4% safe - Corpus Christi Coast ──── pred 600 MW ████ +11.1% safe - Midland-Permian ──── pred 520 MW ███ +23.8% safe - El Paso Border ──── pred 470 MW ███ +14.6% safe - Abilene Central ──── pred 400 MW ███ +14.3% safe - Lubbock West Texas ──── pred 360 MW ██ +16.1% safe - Brownsville Valley ──── pred 355 MW ██ +10.9% safe - Amarillo Panhandle ──── pred 315 MW ██ +12.5% safe - ▲ - ──────────────────────────────────────────│────────────────────── - DFW is the only substation predicted to │ - breach capacity (1,700 vs 1,600 MW cap). │ - Stage 3 rules and Stage 4 capacity │ - constraint both consume this property. │ - ────────────────────────────────────────────────────────────────── - -✓ Substation.predicted_load written back to all 12 substations -``` - -**Stage 3 rule and Stage 4 constraint with the predictive term:** - -```python -# Both reuse the same effective_load expression -effective_load = Substation.predicted_load | Substation.current_load_mw -``` - -**Caveats:** -- The bundled forecasts are pre-computed in `data/demand_forecasts.csv` to keep the template self-contained. To wire in a live GNN, point `GNN(...)` at your model registry — the template's `try` block already handles both paths. -- DFW's projected breach partially encodes the stacked DC request itself ("we expect demand to keep rising at the substation everyone is targeting"). For an independent baseline, train on a pre-announcement slice and compare. - ---- - -## Stage 2 — Graph: grid topology & structural vulnerability - -> `/rai-graph-analysis` "Which substations are most critical to power flow based on grid topology? 
Use centrality on the transmission graph, then flag the top 3 as structurally critical and persist the scores back to the ontology." - -**Construction** — `Substation` as the node concept directly (no mirror concept): -- Node concept: `Substation` (12 nodes) -- Edges: active `TransmissionLine` rows, `from_substation` → `to_substation` -- Direction: undirected, unweighted - -**Algorithms:** weakly connected components, Louvain community detection, betweenness/degree/eigenvector centrality. Combined rank picks the top-3 as `is_structurally_critical`. - -``` -Connectivity ──► 1 component (12 of 12 reachable) - -Louvain communities ──► 3 regions - - Region 1 — North Texas Dallas-Fort Worth, Austin Energy, Waco Gateway - Region 2 — West Texas Midland-Permian, Lubbock, El Paso, Amarillo, Abilene - Region 3 — Gulf Coast Houston Ship Channel, San Antonio Metro, - Corpus Christi Coast, Brownsville Valley - -Centrality (top-3 marked is_structurally_critical) - - #1 Dallas-Fort Worth betw 31.67 ████████████ [CRITICAL] - #2 Houston Ship Channel betw 15.83 ██████ [CRITICAL] - #3 San Antonio Metro betw 4.33 ██ [CRITICAL] - #4 Austin Energy betw ~3.0 - … - #12 Brownsville Valley betw 0.00 - - ────────────────────────────────────────────────────────────────── - DC requests targeting structurally critical substations: - Microsoft Horizon Campus (350 MW) ─► Houston [CRITICAL] - Meta Bayou DC (300 MW) ─► Houston [CRITICAL] - Google Metroplex DC (400 MW) ─► DFW [CRITICAL] - xAI Colossus Texas (500 MW) ─► DFW [CRITICAL] - Lambda Labs DFW (200 MW) ─► DFW [CRITICAL] - Amazon SA Cloud (280 MW) ─► San Antonio [CRITICAL] - Apple iCloud Texas (250 MW) ─► San Antonio [CRITICAL] - ──────────── - 7 of 10 DC requests sit on the 3 most structurally critical nodes. 
- ────────────────────────────────────────────────────────────────── - -✓ Substation.betweenness / degree_centrality / eigenvector_centrality -✓ Substation.grid_community written back (12 rows) -✓ Substation.is_structurally_critical written back (3 rows) -``` - ---- - -## Stage 3 — Rules: interconnection queue compliance - -> `/rai-rules-authoring` "Screen each data center request against three criteria: (1) substation must have enough capacity after predicted load, (2) if 100% low-carbon required, region must have 25%+ renewable, (3) substation shouldn't be most structurally critical. Which requests pass all three?" - -Three declarative `Relationship` rules consume Stages 1–2 enrichments. Each is written as a `model.where(...).define(...)` block; a composite `is_compliant` fires only when none of the three failure flags fire. - -**Rule 1 — Capacity** (consumes `Substation.predicted_load` from Stage 1): - -```python -DataCenterRequest.fails_capacity = model.Relationship(...) -effective_load = SubRef.predicted_load | SubRef.current_load_mw -model.where( - DataCenterRequest.substation(SubRef), - DataCenterRequest.requested_mw + effective_load > SubRef.max_capacity_mw, -).define(DataCenterRequest.fails_capacity()) -``` - -**Rule 2 — Structural risk** (consumes `Substation.is_structurally_critical` from Stage 2): - -```python -model.where( - DataCenterRequest.substation(SubRef), - SubRef.is_structurally_critical(), -).define(DataCenterRequest.fails_structural()) -``` - -**Rule 3 — Low-carbon mandate** (zero-emission share at the substation must meet the request's requirement; nuclear + renewable count): - -```python -model.where( - DataCenterRequest.substation(SubRef), - (SubRef.low_carbon_gen_mw | 0.0) * 100 - < DataCenterRequest.low_carbon_requirement_pct * (SubRef.total_gen_mw | 0.001), -).define(DataCenterRequest.fails_low_carbon()) -``` - -``` - DC Request Hyper Q# MW Cap LowC Crit OK? 
- ────────────────────────────────────────────────────────────────────── - Microsoft Horizon Campus Microsoft 1 350 FAIL PASS FAIL N - Meta Bayou DC Meta 2 300 FAIL PASS FAIL N - Google Metroplex DC Google 3 400 FAIL PASS FAIL N - xAI Colossus Texas xAI 4 500 FAIL PASS FAIL N - Lambda Labs DFW Lambda Labs 5 200 FAIL PASS FAIL N - Amazon SA Cloud Amazon 6 280 FAIL PASS FAIL N - Apple iCloud Texas Apple 7 250 FAIL PASS FAIL N - CoreWeave Austin GPU CoreWeave 8 320 FAIL PASS PASS N - Crusoe Permian DC Crusoe Energy 9 180 PASS PASS PASS Y - Oracle Coastal DC Oracle 10 150 PASS PASS PASS Y - ────────────────────────────────────────────────────────────────────── - Summary: 2 compliant, 8 flagged -``` - -Every request passes low-carbon — ERCOT's nuclear (STP, Comanche Peak) plus its wind/solar fleet provides enough zero-emission generation. The two compliant requests sit on substations that are neither structurally critical (Stage 2) nor predicted to breach (Stage 1). Eight requests need either upgrades, redirection, or both — Stage 4 picks the optimal mix. - -``` -✓ DataCenterRequest.fails_capacity written back -✓ DataCenterRequest.fails_structural written back -✓ DataCenterRequest.fails_low_carbon written back -✓ DataCenterRequest.is_compliant written back (2 rows) -``` - ---- - -## Stage 4 — Prescriptive: joint DC approval + upgrade MIP - -> `/rai-prescriptive-problem-formulation` "Decide which data center requests to approve and which substation upgrades to fund at $200M, $300M, $400M, $500M, and $600M investment levels. Maximize annual revenue. A request can only be approved if its substation has enough capacity after upgrades." 
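The formulation that follows indexes decisions by budget level so one solve yields the whole frontier. A brute-force toy version of the same shape (a made-up 3-DC, 2-upgrade instance in plain Python, not the RAI prescriptive API or the template data) makes the constraint structure concrete:

```python
from itertools import product

# Toy instance: (substation, requested MW, annual revenue $M) per DC request.
dcs = [("S1", 500, 105.0), ("S1", 400, 78.0), ("S2", 180, 27.0)]
# (substation, added MW, cost $M) per upgrade option.
upgrades = [("S1", 600, 110.0), ("S2", 200, 40.0)]
headroom = {"S1": 0.0, "S2": 150.0}  # capacity minus predicted load

def solve(budget):
    """Brute-force the best total revenue under one budget level."""
    best = 0.0
    for approve in product([0, 1], repeat=len(dcs)):
        for fund in product([0, 1], repeat=len(upgrades)):
            # Budget constraint over funded upgrades.
            if sum(f * cost for f, (_, _, cost) in zip(fund, upgrades)) > budget:
                continue
            # Per-substation capacity check after funded upgrades.
            feasible = all(
                sum(a * mw for a, (s, mw, _) in zip(approve, dcs) if s == sub)
                <= headroom[sub]
                + sum(f * add for f, (s, add, _) in zip(fund, upgrades) if s == sub)
                for sub in headroom
            )
            if feasible:
                best = max(best, sum(a * rev for a, (_, _, rev) in zip(approve, dcs)))
    return best

# One point per budget level: the shape of the Pareto frontier.
pareto = {b: solve(b) for b in (50, 150, 200)}
```

The MIP replaces this enumeration with binary variables and lets HiGHS search, but the feasibility logic (budget cap plus per-substation capacity after upgrades) is exactly what the constraint families above encode.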
- -``` -FORMULATION - - Scenario Concept - InvestmentLevel(name, budget_cap) 5 levels: $200M, $300M, $400M, $500M, $600M - - Decision variables (binary, indexed by InvestmentLevel) - DataCenterRequest.x_approve(InvestmentLevel) 50 binaries (10 DCs × 5 levels) - SubstationUpgrade.x_upgrade(InvestmentLevel) 50 binaries (10 upgrades × 5 levels) - - Constraints - 1. Substation capacity per InvestmentLevel - max_capacity − predicted_load + Σ(x_upgrade · capacity_increase) - ≥ Σ(x_approve · requested_mw) per (Substation, InvestmentLevel) - 2. Budget per InvestmentLevel - Σ(x_upgrade · cost_million) ≤ budget_cap per InvestmentLevel - - Objective (maximize) - Σ x_approve · annual_revenue_per_mw · requested_mw summed over all levels - -────────────────────────────────────────────────────────────────────── -SOLVE (HiGHS, single solve, all 5 levels in one MIP) → OPTIMAL -────────────────────────────────────────────────────────────────────── -``` - -``` -PARETO FRONTIER (queried directly from ontology) - - Level Budget DCs DC MW Revenue $/yr Upg $M Net Value - ────── ──────── ──── ────── ───────────── ──────── ─────────── - $200M $200M 4 1,000 $174,350,000 $190.0 $164,850,000 - $300M $300M 5 1,500 $279,350,000 $300.0 $264,350,000 ◄ KNEE - $400M $400M 6 1,800 $328,850,000 $385.0 $309,600,000 - $500M $500M 7 2,080 $376,450,000 $430.0 $354,950,000 - $600M $600M 8 2,330 $420,200,000 $505.0 $394,950,000 - - Per-level DC approvals (cumulative as budget increases) - - $200M: Microsoft (350) · CoreWeave (320) · Crusoe (180) · Oracle (150) ── 1,000 MW - $300M: + xAI Colossus (500) ── 1,500 MW - $400M: + Meta Bayou (300) ── 1,800 MW - $500M: + Amazon SA (280) ── 2,080 MW - $600M: + Apple iCloud (250) ── 2,330 MW - - Never approved at any level: - Google Metroplex DC (400 MW, DFW) - Lambda Labs DFW (200 MW, DFW) - ────────────────────────────────── - DFW is physically full once xAI takes 500 MW. 
- - Marginal value per $M added budget - - $200 → $300M ── $995K / $M ◄ knee — xAI Colossus unlocks at $300M - $300 → $400M ── $453K / $M - $400 → $500M ── $454K / $M - $500 → $600M ── $400K / $M - -✓ DataCenterRequest.x_approve(InvestmentLevel) written back -✓ SubstationUpgrade.x_upgrade(InvestmentLevel) written back - Each variable is now a queryable property of the model — no - parsing of solver output, no per-level re-solve loops. -``` - -### Reading the solve - -- **Knee at $300M.** xAI Colossus alone delivers $105M/yr ($210K × 500 MW), the single highest-revenue request. Unlocking it requires the $110M UPG-003 upgrade at DFW. Below $300M, xAI doesn't fit; above $300M, the next-best DCs (Meta, Amazon, Apple) deliver $0.4M/$M-ish — diminishing returns. -- **DFW is the binding bottleneck.** With xAI's 500 MW absorbing the post-upgrade headroom, neither Google (400 MW) nor Lambda (200 MW) ever fit — they share DFW. The optimizer correctly prioritizes revenue per MW over queue position. -- **All compliant requests get approved by $200M.** The 2 fully compliant requests from Stage 3 (Crusoe, Oracle) sit on sub-critical substations; they're cheap upgrades and clear the bar at the lowest budget. -- **`predicted_load` propagates the chain.** Capacity slack at every substation reflects Stage 1's forecast, not raw historical load — the same signal the rules engine used. - ---- - -## Stage 5 — Interpretation - -> `/rai-prescriptive-results-interpretation` "Which data centers get approved, which upgrades are selected, and where's the biggest return on investment at each budget level?" - -``` -THE PLAN, IN BUSINESS TERMS - - • Recommended budget: $300M (the knee). - 5 DCs, 1,500 MW connected, $279M/yr revenue, $264M net value. - • Marginal $100M from $200M unlocks xAI Colossus alone — $105M/yr — at - $995K return per $1M spent. Beyond the knee, marginal return roughly - halves. 
- • Google (400 MW) and Lambda (200 MW) are structurally infeasible at any - budget level — DFW is full once xAI takes its 500 MW. Either redirect - them to a different substation, or de-prioritize. - • Crusoe (Midland) and Oracle (Corpus Christi) approve cheaply at any - budget — they were the only two that passed all three rules in Stage 3. - -ONTOLOGY ENRICHMENT — what each stage contributed back - - Stage 1 (predictive) Substation.predicted_load [12] - - Stage 2 (graph) Substation.betweenness [12] - Substation.degree_centrality [12] - Substation.eigenvector_centrality [12] - Substation.grid_community [12] - Substation.is_structurally_critical [3] - - Stage 3 (rules) DataCenterRequest.fails_capacity [8] - DataCenterRequest.fails_structural [7] - DataCenterRequest.fails_low_carbon [0] - DataCenterRequest.is_compliant [2] - Substation.low_carbon_gen_mw [12] - Substation.total_gen_mw [12] - - Stage 4 (prescriptive) DataCenterRequest.x_approve [50] - SubstationUpgrade.x_upgrade [50] - - ────────────────────────────────────────────────────────────────── - Each stage reads what the previous stage wrote. - Re-running any downstream stage automatically picks up enrichments. - No glue code, no DataFrame round-trip — same ontology throughout. - ────────────────────────────────────────────────────────────────── -``` - ---- - -## The chain — accretive ontology enrichment - -``` -THE ENERGY GRID PLANNING CHAIN - - STAGE 1 PREDICTIVE - "Where is demand growing? Who breaches first?" - reads: DemandForecast.predicted_load_mw, Substation.current_load_mw - writes: Substation.predicted_load ── per substation - │ - ▼ - STAGE 2 GRAPH (WCC / Louvain / centrality) - "How is the grid connected? Which substations are structural bottlenecks?" 
- reads: Substation nodes, TransmissionLine edges (active) - writes: Substation.betweenness / degree_centrality / eigenvector_centrality - Substation.grid_community ── 3 regions - Substation.is_structurally_critical ── 3 substations - │ - ▼ - STAGE 3 RULES (declarative Relationships) - "Which DC requests pass capacity, structural, low-carbon checks?" - reads: Substation.predicted_load ◄── Stage 1 - Substation.is_structurally_critical ◄── Stage 2 - Generator.emissions_rate, DataCenterRequest.low_carbon_requirement_pct - writes: DataCenterRequest.fails_capacity / fails_structural / fails_low_carbon - DataCenterRequest.is_compliant ── 2 requests - │ - ▼ - STAGE 4 PRESCRIPTIVE (HiGHS MIP, Scenario Concept) - "Which DCs to approve and which upgrades to fund across 5 budget levels?" - reads: Substation.predicted_load ◄── Stage 1 - SubstationUpgrade.cost_million / capacity_increase_mw - DataCenterRequest.requested_mw / annual_revenue_per_mw - InvestmentLevel.budget_cap ── 5 levels - writes: DataCenterRequest.x_approve(InvestmentLevel) - SubstationUpgrade.x_upgrade(InvestmentLevel) - │ - ▼ - Pareto frontier, - queryable directly from the ontology. - - ────────────────────────────────────────────────────────────────── - No glue. No DataFrame ping-pong. No per-level re-solve loop. - Four reasoners, one ontology, one accretive thread. - ────────────────────────────────────────────────────────────────── -``` - ---- - -## Why the chain matters (vs. 
any single stage) - -| Stage alone | What it tells you | What it doesn't | -|---|---|---| -| Predictive | "DFW will breach in 24 months" | What to do; which requests matter | -| Graph alone | "DFW, Houston, San Antonio are bottlenecks" | Whether they have headroom; what to approve | -| Rules alone | (won't fire — no `predicted_load`, no `is_structurally_critical`) | Pipeline misses | -| Prescriptive alone | (no flagged set, no critical-node info, no forecast) | Whole pipeline misses | - -| Combined | Output | -|---|---| -| Predictive → Graph | Forecasted load + structural bottleneck map | -| + Rules | Per-request compliance vs. capacity / structural / low-carbon | -| + Prescriptive | Pareto frontier across 5 budget levels in one solve | - -**Multi-reasoner chaining grounded in (and contributing to) the ontology.** - ---- - -## Adapting this recipe to a new domain - -The chain pattern transfers cleanly. To rebuild for a different problem: - -1. Re-run `rai-discovery` on the new business question — does it actually - need all 4 reasoner families, or is one or two sufficient? -2. Strip the demo ontology to the concepts the new chain needs (lean is - better for type inference and solver compile time). -3. Stage 1 (Predictive) is optional — if you have forecast tables already, - a simple `aggs.max(...).per(...)` derived property is enough; swap in - a GNN later by pointing the predictive reasoner at your model registry. -4. Stages 2–4 are the load-bearing chain: graph centrality flags - structurally critical nodes, rules consume both the predictive forecast - and the criticality flag to fail or pass each request, and the - prescriptive MIP reads the same forecast as a capacity baseline while - indexing decision variables by an `InvestmentLevel` Scenario Concept so - one solve produces the full Pareto frontier. -5. 
Keep the validation checks at every stage: assert the predicted-load - write covers all nodes, the top-N critical set looks plausible against - the topology, the compliance table has at least one PASS and one FAIL, - and the optimizer reports OPTIMAL with a non-zero objective. - -The shape this template demonstrates — *each reasoner writes a property -the next reasoner reads* — is what makes the chain accretive rather than -serial. The agent skills are how you reliably author each link. - ---- - -## Data Reference - -**Substations with DC requests:** - -| Substation | Location | Capacity | DC Requests | DC MW | -|------------|----------|----------|-------------|-------| -| SUB-001 | Houston Ship Channel | 1,800 MW | Microsoft (350), Meta (300) | 650 MW | -| SUB-002 | Dallas-Fort Worth | 1,600 MW | Google (400), xAI (500), Lambda (200) | 1,100 MW | -| SUB-003 | San Antonio Metro | 1,200 MW | Amazon (280), Apple (250) | 530 MW | -| SUB-004 | Austin Energy | 900 MW | CoreWeave (320) | 320 MW | -| SUB-005 | Midland-Permian | 1,100 MW | Crusoe (180) | 180 MW | -| SUB-007 | Corpus Christi Coast | 800 MW | Oracle (150) | 150 MW | - -**ERCOT regions (Louvain):** North Texas (DFW, Austin, Waco) | West Texas (Midland, Lubbock, El Paso, Amarillo, Abilene) | Gulf Coast (Houston, San Antonio, Corpus Christi, Brownsville) +## Workflow -**DFW breach:** 1,600 MW capacity, 1,700 MW predicted (24mo), +54.6% growth, 1,100 MW DC requests stacked on top. Google and Lambda permanently infeasible. +| # | Skill + prompt | Expected output | +|---|----------------|-----------------| +| 0 | `/rai-discovery` "We have 10 hyperscaler interconnection requests against a 12-substation grid. Which to approve, which substation upgrades to fund, at what budget level?" | Plan routing sub-questions to predictive, graph, rules, and prescriptive reasoners. 
|
+| 1 | `/rai-predictive-modeling` + `/rai-predictive-training` "Can we forecast substation load growth over the next 36 months based on historical demand, planned generator additions, and the data center request pipeline? Bind each substation's predicted peak load back to the ontology so the rules engine and optimizer can read it." | `Substation.predicted_load` for all 12; DFW breaches at 1,700 MW vs 1,600 MW cap at 24 months (+54.6%). |
+| 2 | `/rai-graph-analysis` "Which substations are most critical to power flow based on grid topology? Use centrality on the transmission graph, then flag the top 3 as structurally critical and persist the scores back to the ontology." | 1 component, 3 Louvain communities; DFW, Houston, San Antonio flagged `is_structurally_critical`; 7 of 10 DC requests on critical nodes. |
+| 3 | `/rai-rules-authoring` "Screen each data center request against three criteria: (1) substation must have enough capacity after predicted load, (2) if 100% low-carbon required, region must have 25%+ renewable, (3) substation shouldn't be most structurally critical. Which requests pass all three?" | `fails_capacity` / `fails_structural` / `fails_low_carbon` + `is_compliant`; 2 pass (Crusoe, Oracle), 8 flagged. |
+| 4 | `/rai-prescriptive-problem-formulation` "Decide which data center requests to approve and which substation upgrades to fund at $200M, $300M, $400M, $500M, and $600M investment levels. Maximize annual revenue. A request can only be approved if its substation has enough capacity after upgrades." | OPTIMAL MIP across 5 `InvestmentLevel` values in one solve; `x_approve` and `x_upgrade` written back per level. |
+| 5 | `/rai-prescriptive-results-interpretation` "Which data centers get approved, which upgrades are selected, and where's the biggest return on investment at each budget level?" | Pareto frontier with knee at $300M (5 DCs, 1,500 MW, $264M net); marginal $995K/$M at knee, declining to $400K/$M by $600M; Google + Lambda never approved (DFW full). |

-**Upgrades:** 10 available, $630M total, 2,900 MW combined capacity. Only $300M of upgrades are needed at the knee.
+## Data

-- **Source data**: bundled CSVs in `../data/` (12 substations, 15 generators, 18 transmission lines, 10 DC requests, 10 upgrade options, plus historical load and forecast tables).
-- **Ontology**: defined in `../energy_grid_planning.py` (13 concepts).
-- **Stages**: implemented in `../energy_grid_planning.py` as a single combined script with stage banners.
+Bundled CSVs in `../data/`: 12 substations, 15 generators, 18 transmission lines, 10 DC requests (2,930 MW), 10 substation upgrades ($630M total), plus historical load and forecast tables. Full chain implemented in `../energy_grid_planning.py`.

diff --git a/v1/machine_maintenance/references/runbook.md b/v1/machine_maintenance/references/runbook.md
index ca6201c..d4efa03 100644
--- a/v1/machine_maintenance/references/runbook.md
+++ b/v1/machine_maintenance/references/runbook.md
@@ -1,12 +1,8 @@
 # Runbook: Machine Maintenance — Multi-Reasoner Walkthrough

-Walk-through of the chained-reasoner pattern this template is built on. One realistic plant-floor thread — **scheduling preventive maintenance for a 30-machine, 3-plant operation** — traced across querying, graph, rules, and prescriptive reasoners, each stage writing properties back to the same ontology that downstream stages consume.
+This template schedules preventive maintenance for a 30-machine, 3-plant operation. OEE alone misranks the plants; sensor counts don't quantify forward risk; rules flag machines but don't allocate scarce technician time; the optimizer produces a feasible schedule but can't see that all Turbine techs sit in one city. The chain threads querying, graph, rules, and prescriptive reasoners through one ontology so each stage's enrichments feed the next.
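The OEE proxy behind the opening claim (Performance × Quality) is one line of arithmetic. The unit counts below are invented so the result reproduces Plant_B's 62.6% performance and ~98.1% quality from the chain summary:

```python
def oee_proxy(planned_units, actual_units, good_units):
    """OEE proxy = Performance x Quality; the template's proxy omits Availability."""
    performance = actual_units / planned_units  # total_actual / total_planned
    quality = good_units / actual_units         # total_good / total_actual
    return performance * quality

# Plant_B-shaped numbers: Perf 62.6%, Qual ~98.1%, so OEE proxy ~61.4%.
plant_b = oee_proxy(planned_units=10_000, actual_units=6_260, good_units=6_141)
```

Because quality is nearly uniform across plants, the proxy is dominated by performance, which is exactly why OEE alone misranks the plants once forward-looking risk enters the picture.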
-The template's combined script (`machine_maintenance.py`) implements all five stages directly; this runbook expands the surrounding narrative — what each prompt asks, what shape of output to expect, and how each enrichment feeds the next — so a reader can follow the reasoning thread end-to-end without re-running the script. - ---- - -## TL;DR — the chain in one screen +## The chain ``` Plant_B looks worst on OEE (61.4%). Plant_A looks mid-tier (68.2%). @@ -43,246 +39,17 @@ all 3 Turbine techs sit in one city, a $3,200 fix away from resolved. ───────────────────────────────────────────────────────────────── ``` -A single-reasoner approach can't surface this. OEE alone says Plant_B is the problem. Sensor counts alone don't quantify forward risk. Rules alone flag machines but don't allocate scarce technician time. The optimizer alone produces a feasible schedule — but doesn't know that a single weather event in Houston blocks all on-site Turbine work. - ---- - -## How to read this runbook - -This runbook serves two audiences: - -- **Reading top-to-bottom**: the narrative + ASCII visualizations show what the chain produces stage-by-stage, with the same business framing the stakeholder would see. -- **Per-stage skill blocks**: the boxed `Skill / Prompt` callout at the start of each stage is the recipe — load that RAI agent skill, give it that prompt against the bundled demo data, and the agent will reproduce the stage. - ---- - -## Step 0 — Scope the question with `rai-discovery` - -> `/rai-discovery` "We need to schedule preventive maintenance for 30 machines across 3 plants. Where does OEE alone mislead us, and what structural risks won't a pure optimizer surface?" - -Discovery classifies the question by reasoner family and tells you which downstream skills to load: - -| Sub-question | Reasoner | Skill | -|---|---|---| -| Where does the operation actually hurt — OEE, anomalies, failure trajectories? 
| Querying / Descriptive | `rai-querying` | -| Which machines are scheduling bottlenecks given shared technician pools? | Graph | `rai-graph-analysis` | -| Which machines are overdue, high-risk, chronic, or composite-Critical? | Rules | `rai-rules-authoring` | -| What's the optimal maintain-and-assign plan across 4 periods? | Prescriptive | `rai-prescriptive-problem-formulation` | -| Where is the schedule structurally fragile, and what cross-training fixes it? | Prescriptive (re-solve / interpretation) | `rai-prescriptive-solver-management` + `rai-prescriptive-results-interpretation` | - -Discovery's output is a *plan*, not code. Everything that follows materializes that plan. - ---- - -## Setup - -See the template's main `README.md` for installation, RAI connection setup, and how to run the script. The narrative below follows the actual stage outputs of `machine_maintenance.py`. - -**Prerequisites** - -- Template's `data/` CSVs available (or your own Snowflake schema with equivalent tables — `machines`, `technicians`, `qualifications`, `availability`, `parts_inventory`, `certification_expiry`, `sensors`, `sensor_readings`, `failure_predictions`, `downtime_events`, `production_runs`, `training_options`) -- `raiconfig.yaml` pointing at your RAI engine -- Python ≥ 3.10 with `relationalai >= 1.0.14` - ---- - ## Workflow -The runbook walks the same chain stage-by-stage, prompt-by-prompt, in agent-skill order. Each row maps to a section of the script. - -| # | Step | Skill | Prompt | Expected Output | -|---|------|-------|--------|-----------------| -| 1 | Build ontology | `/rai-build-starter-ontology` | "Build a RAI ontology for a manufacturing maintenance scheduling problem from the CSVs in `data/`. Concepts: Machine, Technician, Qualification, Period, MachinePeriod, TechnicianPeriod, TechnicianMachinePeriod, PartsInventory, CertificationExpiry, Sensor, SensorReading, FailurePrediction, DowntimeEvent, ProductionRun." 
| Model `machine_maintenance` with 14 user-facing concepts. 30 Machine rows (3 plants × 10 machines, 5 types × 6). 10 Technician rows (4 Chicago_IL, 3 Houston_TX, 3 Phoenix_AZ). 16 Qualification rows. 4 Period rows. | -| 2 | Discovery | `/rai-discovery` | "What questions can we answer with this ontology? We want to schedule preventive maintenance and surface hidden operational risk." | Querying: OEE by facility, anomaly counts, failure trajectory deltas. Graph: machine dependency clusters, bottleneck centrality on shared-technician edges. Rules: overdue, high-risk, chronic-downtime flags chained to a composite risk tier. Prescriptive: maintain × period × technician assignment minimizing failure + labor + travel cost. Resilience: post-solve concentration analysis. | -| 3 | Stage 0 — OEE proxy | `/rai-querying` | "Compute OEE proxy (Performance × Quality) by facility. Performance is total_actual / total_planned across ProductionRuns; Quality is total_good / total_actual." | Plant_C: Perf 81.3%, Qual 98.1%, OEE 79.8%. Plant_A: Perf 69.8%, Qual 97.8%, OEE 68.2%. Plant_B: Perf 62.6%, Qual 98.1%, OEE 61.4%. Quality is uniform; Performance is the differentiator. | -| 4 | Stage 0 — Sensor anomalies | `/rai-querying` | "List machines with above-threshold sensor readings (`SensorReading.is_anomaly == 1`), grouped by facility." | 9 anomaly readings across 5 machines. Plant_A: 7 (M013 Pump:3, M001 Turbine:2, M016 Turbine:2). Plant_B: 1 (M002 Compressor). Plant_C: 1 (M006 Turbine). Plant_A's anomaly load is 7× Plant_B's despite Plant_A's higher OEE. | -| 5 | Stage 0 — Failure trajectories | `/rai-querying` | "For each machine, compute the failure-probability delta from period 1 to period 4 from `FailurePrediction`. Show the steepest 6." | M001 (Turbine, Plant_A): 0.102 → 0.332 (+0.230, bearing_wear). M013 (Pump, Plant_A): 0.435 → 0.663 (+0.228, impeller_erosion). M016 (Turbine, Plant_A): 0.263 → 0.482 (+0.219, bearing_wear). All three steepest curves are Plant_A. 
Stored back as `MachinePeriod.predicted_fp` (120 rows) for Stage 3's objective. | -| 6 | Stage 1 — Dependency graph | `/rai-graph-analysis` | "Build a graph with `Machine` as `node_concept`. Two machines are adjacent when at least one technician is qualified for both machine types. Run weakly connected components." | 30 nodes, edges joined via `Qualification`. WCC: 1 cluster of 30 — every machine is reachable from every other through shared qualifications. No isolated subgraphs. | -| 7 | Stage 1 — Bottleneck centrality | `/rai-graph-analysis` | "Compute betweenness centrality on the dependency graph. Normalize and store as `Machine.betweenness`." | Pump-type machines tie at the top (raw betweenness 24.0 → normalized 1.0): M003 (Plant_C), M008 (Plant_B), M013 (Plant_A). Turbines, Generators, Motors, Compressors lower. `Machine.betweenness` written back for all 30 machines and consumed by Stage 3's failure-cost multiplier. | -| 8 | Stage 2 — Compliance flags | `/rai-rules-authoring` | "Define six derived flags: overdue (`remaining_useful_life < maintenance_duration_hours`), high-risk (`failure_probability > 0.3 AND criticality >= 4`), anomalous (`anomaly_count > 0`), chronic-downtime (`downtime_event_count > 8`), parts-reorder (`stock_level <= min_order_qty`), expiring-cert (`days_remaining < 30`). Use `model.where(...).define(...)`." | Overdue (6): M002, M006, M013, M016, M022, M025 (RUL below required maintenance hours). High-risk (1): M013 (fp=0.435, crit=4). Anomalous (5): M013, M001, M016, M002, M006. Chronic downtime (3, threshold > 8 events): M001 (12 events), M016 (11), M013 (10). Parts reorder (4): P001, P003, P004, P006. Expiring certs (5): T001 Compressor 22d, T004 Pump 8d, T003 Compressor 15d, T006 Motor 25d, T009 Motor 12d. | -| 9 | Stage 2 — Composite risk tier | `/rai-rules-authoring` | "Chain `is_chronic_downtime`, `is_high_risk`, `is_overdue_maintenance` into `Machine.risk_tier`: Critical if all three, Elevated if exactly two, Standard otherwise.
Enumerate all 8 combinations using `model.not_()` for negation." | Critical (1): M013 (Pump, Plant_A) — chronic + high-risk + overdue. Elevated (1): M016 (Turbine, Plant_A) — chronic + overdue, not high-risk. Standard (28): rest. Plant_A holds the only Critical and the only Elevated machine despite ranking second on OEE. | -| 10 | Stage 3 — Formulation | `/rai-prescriptive-problem-formulation` | "Formulate the maintenance schedule. Decision variables: `MachinePeriod.x_maintain` (bin), `MachinePeriod.x_vulnerable` (bin), `TechnicianMachinePeriod.x_assigned` (bin) — restricted to qualified pairs. Constraints: cumulative coverage (`Σ x_maintain[m,1..τ] + x_vulnerable[m,τ] = 1` per machine and period), assignment-maintenance linkage (`Σ x_assigned over techs = x_maintain` per (m,τ)), technician hour capacity (`Σ x_assigned · duration ≤ available_hours`), parts/bay capacity (`≤ 5 jobs per period`), and overdue deadline (`Σ x_maintain[m, τ ≤ 2] ≥ 1` for every overdue machine — feeds from Stage 2). Objective: minimize failure_cost + labor_cost + travel_cost." | 120 `x_maintain` binaries (30 machines × 4 periods). 120 `x_vulnerable` binaries. ~250 `x_assigned` binaries (qualification-restricted). 5 constraint families. Failure cost uses `MachinePeriod.predicted_fp` (Stage 0) × `Machine.criticality` × `(1 + 2.0 × Machine.betweenness)` (Stage 1). | -| 11 | Stage 3 — Solve | `/rai-prescriptive-solver-management` | "Solve with HiGHS, time limit 120s, assert OPTIMAL." | OPTIMAL. Objective = $605,240.61. 20 maintenance jobs scheduled across 4 periods (capacity-binding at 5 jobs/period). All six overdue machines (M002, M006, M013, M016, M022, M025) maintained by period 2 — overdue constraint satisfied. | -| 12 | Stage 3 — Schedule readout | `/rai-prescriptive-results-interpretation` | "Show the period-by-period schedule and technician assignments. Flag any travel (`base_location != machine.location`)."
| Period 1 includes M002 (Plant_B), M006 (Plant_C), M013 (Plant_A), M016 (Plant_A) — high-priority overdue/critical machines. Periods 2–4 cover the remaining 16 jobs. Multiple Turbine assignments require travel because all 3 Turbine-qualified techs (T001, T002, T003) are based in Houston_TX while Turbines exist at all 3 plants. Travel cost is paid at $50/hr × duration. | -| 13 | Stage 4 — Concentration analysis | `/rai-graph-analysis`, `/rai-querying` | "From the qualification table, find machine types whose qualified technicians are all in one location. For each concentrated type, count how many scheduled jobs require travel." | Compressor: techs in Chicago_IL, Houston_TX (gap: Phoenix_AZ). Generator: Chicago_IL, Phoenix_AZ (gap: Houston_TX). Motor: Chicago_IL, Phoenix_AZ (gap: Houston_TX). Pump: Chicago_IL, Phoenix_AZ (gap: Houston_TX). **Turbine: all 3 techs in Houston_TX — CONCENTRATED.** Of 3 scheduled Turbine jobs, 2 require travel (67%). 4 of 6 Turbines are at remote plants. The optimizer found the cheapest plan but cannot fix the structural fragility — losing T001's Compressor cert (22 days remaining) doesn't break Turbines, but losing any of T001/T002/T003 from Houston shrinks Turbine coverage by a third. | -| 14 | Stage 4 — Cross-training recommendation | `/rai-prescriptive-results-interpretation` | "From `training_options.csv`, find the cheapest Turbine-cross-training candidate based outside Houston_TX." | Best candidate: **T006 (Fiona_Garcia, Senior, Chicago_IL) — $3,200 / 5 weeks.** Other non-Houston options: T005 ($3,500/6w, Chicago), T008 ($3,800/6w, Phoenix), T009 ($4,200/8w, Phoenix), T004 ($5,500/10w, Chicago). Training T006 adds the first non-Houston Turbine tech, eliminates the single-point-of-failure for Plant_B and Plant_C Turbines, and pays back the first time travel or a cert lapse would have idled a Turbine job. The prescriptive reasoner produced the schedule; the resilience layer produced the structural action item. 
| - ---- - -## Stage 0 — Querying: operational intelligence - -> `/rai-querying` "What's the OEE by plant? Which machines have the most sensor anomalies, and which are most likely to fail by the end of the planning horizon?" - -This stage establishes the operational baseline. Plant_C leads at 79.8% OEE; Plant_B trails at 61.4%. But Plant_A — middle of the OEE pack at 68.2% — owns 7 of 9 sensor anomalies and the three steepest failure trajectories (M001, M013, M016). The querying stage writes nine derived properties on `Machine` plus `MachinePeriod.predicted_fp` (120 rows), and Stage 3 reads `predicted_fp` directly into the failure-cost objective term. - -## Stage 1 — Graph: dependency clusters and bottleneck centrality - -> `/rai-graph-analysis` "Which machines share qualified technicians, and which are bottlenecks in the qualification network? Compute centrality and write it back to each machine so the optimizer can weight critical machines." - -The 30 machines form a single connected component — every machine is reachable through shared qualifications. Pump-type machines tie at the top of betweenness (raw 24.0, normalized 1.0): M003 (Plant_C), M008 (Plant_B), M013 (Plant_A). The normalized centrality is consumed by Stage 3's failure-cost multiplier `(1 + 2.0 × betweenness)`, so leaving a bottleneck Pump vulnerable is markedly more expensive than leaving a peripheral Motor vulnerable. - -## Stage 2 — Rules: compliance flags and composite risk tier - -> `/rai-rules-authoring` "Rate each machine's risk: chronic if >8 downtime events, high-risk if failure prob >0.3 AND criticality 4+, plus overdue for maintenance. All three flags = Critical, two = Elevated, otherwise Standard." - -Six machines overdue, one high-risk (M013), three chronic-downtime, four parts-reorder, five expiring certs. The composite tier surfaces a single Critical machine — M013 (Pump, Plant_A) — and a single Elevated machine — M016 (Turbine, Plant_A). 
The overdue flag is consumed by Stage 3 as a hard constraint: every overdue machine must be scheduled by period 2. - -## Stage 3 — Prescriptive: maintenance schedule - -> `/rai-prescriptive-problem-formulation` "Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and Turbines need an on-site qualified technician. Minimize expected failure cost weighted by criticality and centrality, plus labor and travel." - -The solver returns OPTIMAL with objective $605,240.61 and 20 maintenance jobs across the four periods (capacity-binding at 5 jobs/period). M013 and M016 — Plant_A's Critical and Elevated machines — are both scheduled by period 1, satisfying the overdue deadline. Several Turbine assignments require travel because all three Turbine-qualified techs are based in Houston_TX while four of six Turbines sit at Plant_A and Plant_C. The optimizer pays the travel cost; it cannot restructure the qualification pool. - -## Stage 4 — Resilience: concentration sweep and cross-training - -> `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "For each machine type, are all qualified technicians concentrated in one location? How many scheduled jobs required travel, and what's the cheapest cross-training option to eliminate the single-point-of-failure?" - -Turbine is the concentrated type — all three qualified techs (T001, T002, T003) sit in Houston_TX, and 67% of scheduled Turbine jobs already require travel. The recommended fix: cross-train T006 (Senior, Chicago_IL) for $3,200 over 5 weeks. That single addition eliminates the Houston single-point-of-failure for Turbine work at Plant_B and Plant_C, and pays back the first time a weather event, illness, or expiring cert would have idled a Turbine job that the optimizer would otherwise have left uncovered. 
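The Stage 4 recommendation reduces to a filter-and-min over the training options once they are loaded. A plain-Python sketch of that selection logic (illustrative only; the template reads `training_options.csv` through the ontology), using the candidate figures quoted above:

```python
# Cross-training candidate selection: cheapest Turbine training option whose
# technician is based outside Houston_TX (the concentrated site).
# Tuples: (technician, base_location, cost_usd, duration_weeks).
options = [
    ("T004", "Chicago_IL", 5500, 10),
    ("T005", "Chicago_IL", 3500, 6),
    ("T006", "Chicago_IL", 3200, 5),
    ("T008", "Phoenix_AZ", 3800, 6),
    ("T009", "Phoenix_AZ", 4200, 8),
]

# Keep only non-Houston candidates, then take the cheapest.
non_houston = [o for o in options if o[1] != "Houston_TX"]
best = min(non_houston, key=lambda o: o[2])
print(best[0], best[2])  # -> T006 3200
```

With the real CSV rows in place of the hard-coded list, the same two lines reproduce the template's recommendation.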
- ---- - -## Stage outputs — what each reasoner contributes back - -``` -ONTOLOGY ENRICHMENT — what each stage wrote back - - Stage 0 (querying) Machine.total_planned_qty [30] - Machine.total_actual_qty [30] - Machine.total_good_qty [30] - Machine.performance_ratio [30] - Machine.quality_ratio [30] - Machine.total_downtime_minutes [30] - Machine.downtime_event_count [30] - Machine.anomaly_count [30] - MachinePeriod.predicted_fp [120] - - Stage 1 (graph) Machine.betweenness_raw [30] - Machine.betweenness (normalized) [30] - - Stage 2 (rules) Machine.is_overdue_maintenance [6] - Machine.is_high_risk [1] - Machine.is_anomalous [5] - Machine.is_chronic_downtime [3] - Machine.risk_tier [30] - PartsInventory.needs_reorder [4] - CertificationExpiry.is_expiring [5] - - Stage 3 (prescriptive) MachinePeriod.x_maintain [120 binary] - MachinePeriod.x_vulnerable [120 binary] - TechnicianMachinePeriod.x_assigned - - Stage 4 (analysis) (terminal — prints concentration risk and - costed cross-training recommendation) - - ────────────────────────────────────────────────────────────────── - Each stage reads what the previous stage wrote. - Re-running any downstream stage automatically picks up enrichments. - No glue code, no DataFrame round-trip — same ontology throughout. - ────────────────────────────────────────────────────────────────── -``` - ---- - -## The chain — accretive ontology enrichment - -``` -THE MACHINE-MAINTENANCE CHAIN - - STAGE 0 QUERYING - "Where does the operation actually hurt?" - reads: ProductionRun, SensorReading, FailurePrediction - writes: Machine.performance_ratio / quality_ratio - Machine.anomaly_count - Machine.downtime_event_count / total_downtime_minutes - MachinePeriod.predicted_fp - │ - ▼ - STAGE 1 GRAPH (betweenness centrality) - "Which machines are scheduling bottlenecks?" - reads: Qualification, Machine (as node_concept) - writes: Machine.betweenness ── normalized 0..1 - │ - ▼ - STAGE 2 RULES - "Which machines violate which compliance flags?" 
- reads: Machine.failure_probability, Machine.criticality, - Machine.remaining_useful_life, Machine.maintenance_duration_hours, - Machine.anomaly_count, Machine.downtime_event_count, - PartsInventory.stock_level, CertificationExpiry.days_remaining - writes: Machine.is_overdue_maintenance / is_high_risk / - is_chronic_downtime / is_anomalous - Machine.risk_tier ── Critical / Elevated / Standard - PartsInventory.needs_reorder - CertificationExpiry.is_expiring - │ - ▼ - STAGE 3 PRESCRIPTIVE (HiGHS MIP) - "What's the optimal maintain-and-assign plan?" - reads: MachinePeriod.predicted_fp ──► failure cost (period-specific) - Machine.betweenness ──► failure cost multiplier - Machine.is_overdue_maintenance ──► hard deadline constraint - Qualification (assignment scope), TechnicianPeriod.capacity_hours - writes: MachinePeriod.x_maintain ── 20 jobs flagged - MachinePeriod.x_vulnerable - TechnicianMachinePeriod.x_assigned - │ - ▼ - STAGE 4 RESILIENCE - "Where is the schedule structurally fragile?" - reads: Solution variables, Qualification, Technician, TrainingOption - writes: (terminal — concentration risk + cross-training recommendation) - - ────────────────────────────────────────────────────────────────── - No glue. No DataFrame ping-pong. No re-derivation per-reasoner. - Five reasoners, one ontology, one accretive thread. - ────────────────────────────────────────────────────────────────── -``` - ---- - -## Why the chain matters (vs. 
any single stage) - -| Stage alone | What it tells you | What it doesn't | -|---|---|---| -| Querying | "Plant_B has worst OEE; Plant_A has the most anomalies" | Whether anomalies translate to scheduling priority | -| Graph alone | "Pumps are the most central machine type" | Which Pump matters most or what to do | -| Rules alone | "M013 is Critical-tier" | How to fit M013 plus 19 others into a 4-period horizon with 10 techs | -| Prescriptive alone | (won't run — no per-period failure curve, no centrality weight, no overdue flag) | Everything the pipeline would have produced | -| Resilience alone | (won't run — no schedule to analyze) | — | - -| Combined | Output | -|---|---| -| Querying → Graph | Per-machine signals plus structural bottleneck weight | -| + Rules | Composite risk tier + a hard "must maintain by period 2" constraint | -| + Prescriptive | $605K plan, 20 jobs, Plant_A's Critical machine handled in Period 1 | -| + Resilience | Cross-train T006 for $3,200 → eliminates Houston Turbine concentration | - -**Multi-reasoner chaining grounded in (and contributing to) the ontology.** - ---- - -## Adapting this recipe to a new domain - -The chain pattern transfers cleanly. To rebuild for a different scheduling-with-resilience problem: - -1. Re-run `rai-discovery` on the new business question — does it actually need all five reasoner families, or are one or two sufficient? A pure dispatch problem may only need querying + prescriptive; a pure compliance problem may only need rules. -2. Strip the demo ontology to the concepts the new chain needs (lean is better for type inference and solver compile time). Keep the cross-product concept (`MachinePeriod`-equivalent) — it's where most decision variables and per-period derived properties live. -3. Stage 0 (querying) is required scaffolding: the optimization objective leans on derived per-period signals, not raw inputs. -4. 
Stages 1–4 are the load-bearing chain: graph centrality writes a multiplier the cost objective consumes; rules write a hard deadline the optimizer must satisfy; the optimizer writes solution variables the resilience sweep reads; the resilience layer doesn't re-solve a new problem — it stress-tests the structure underneath the existing solve and recommends a structural fix. -5. Keep the validation checks at every stage: assert flagged-set sizes, sanity-check the betweenness top-N, gate on OPTIMAL, assert a nonzero objective, and confirm the resilience pass surfaces at least one actionable recommendation when concentration exists. - -The shape this template demonstrates — *each reasoner writes a property the next reasoner reads* — is what makes the chain accretive rather than serial. The agent skills are how you reliably author each link. - ---- - -## Optional extension — operator-shift assignment - -A second optimization pass — assigning operators to line-shifts to maximize a skill-match bonus, subject to a same-facility constraint — is a natural follow-on but not part of this template's main script. It would consume the same `Machine` and facility ontology and add `Operator`, `Shift`, and `OperatorShift` concepts. Out of scope for this runbook. - ---- +| # | Skill + prompt | Expected output | +|---|----------------|-----------------| +| 0 | `/rai-discovery` "We need to schedule preventive maintenance for 30 machines across 3 plants. Where does OEE alone mislead us, and what structural risks won't a pure optimizer surface?" | Plan routing sub-questions to querying, graph, rules, prescriptive, and resilience skills. | +| 1 | `/rai-querying` "What's the OEE by plant? Which machines have the most sensor anomalies, and which are most likely to fail by the end of the planning horizon?" | Plant_C 79.8% > Plant_A 68.2% > Plant_B 61.4%; 7 of 9 anomalies at Plant_A; `MachinePeriod.predicted_fp` written for 120 rows. 
| +| 2 | `/rai-graph-analysis` "Which machines share qualified technicians, and which are bottlenecks in the qualification network? Compute centrality and write it back to each machine so the optimizer can weight critical machines." | 30 machines → 1 connected component; Pumps tie at top betweenness (24.0 raw, 1.0 normalized); `Machine.betweenness` stored. | +| 3 | `/rai-rules-authoring` "Rate each machine's risk: chronic if >8 downtime events, high-risk if failure prob >0.3 AND criticality 4+, plus overdue for maintenance. All three flags = Critical, two = Elevated, otherwise Standard." | 6 overdue, 1 high-risk, 3 chronic; M013 (Pump, Plant_A) = Critical; M016 (Turbine, Plant_A) = Elevated. | +| 4 | `/rai-prescriptive-problem-formulation` "Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and Turbines need an on-site qualified technician. Minimize expected failure cost weighted by criticality and centrality, plus labor and travel." | 120 `x_maintain` + 120 `x_vulnerable` + ~250 `x_assigned` binaries; 5 constraint families; failure cost uses `predicted_fp × criticality × (1 + 2.0 × betweenness)`. | +| 5 | `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Solve with HiGHS, then for each machine type check whether all qualified techs sit in one location and recommend the cheapest cross-training fix." | OPTIMAL · 20 jobs · $605,241; Turbine concentrated in Houston_TX (67% of jobs travel); cross-train T006 (Chicago_IL, Senior) for $3,200 / 5 weeks. | -## Data Reference +## Data -- **Source data**: bundled CSVs in `../data/` (30 machines across 3 plants × 5 machine types, 10 technicians across 3 cities, 16 qualifications, 4 planning periods, 60 sensors with 240 readings, 120 per-period failure predictions, 129 downtime events, 120 production runs, 13 cross-training options). 
To run against your own Snowflake schema instead, swap the `read_csv(...)` loads for `model.Table(...)` references in `machine_maintenance.py`; the rest of the pipeline is unchanged. -- **Stages**: implemented in `../machine_maintenance.py` as a single combined script with stage banners (Stage 0 → Stage 4). -- **Ontology**: 14 user-facing concepts. Run `inspect.schema(model)` after the pipeline (see template README) to dump the full concept/property/relationship surface, filtering out reasoner-owned concepts (`Variable`, `Constraint`, etc.) and the auto-generated `graph_Edge` from Stage 1. +Bundled CSVs in `../data/`: 30 machines (3 plants × 5 types), 10 technicians (3 cities), 16 qualifications, 4 periods, 60 sensors / 240 readings, 120 failure predictions, 129 downtime events, 120 production runs, 13 training options. All five stages run in `../machine_maintenance.py`. diff --git a/v1/portfolio_balancing/references/runbook.md b/v1/portfolio_balancing/references/runbook.md index 3df9efa..4a89259 100644 --- a/v1/portfolio_balancing/references/runbook.md +++ b/v1/portfolio_balancing/references/runbook.md @@ -1,12 +1,8 @@ # Runbook: Portfolio Balancing — Multi-Reasoner Walkthrough -Walk-through of the chained-reasoner pattern this template is built on. One realistic portfolio thread — **rebalance an 8-stock universe under compliance + crisis stress** — traced across rules, graph, and prescriptive reasoners, each stage writing properties back to the same ontology that downstream stages consume. +Rebalance an 8-stock book under compliance + crisis stress. Rules surface broken positions, graph collapses redundant bets via correlation clustering, prescriptive solves a Markowitz QP across 6 (budget, regime) scenarios. No single reasoner does all three: rules don't allocate, graph doesn't optimize, prescriptive on the full universe stacks near-duplicate cluster members. 
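One mechanism worth seeing up front: the crisis regime used in the QP is built by correlation shrinkage toward 1 in covariance units, `alpha * covar(i, j) + (1 - alpha) * vol_i * vol_j` with `alpha = 0.7`. Because the result is a convex combination of two positive semidefinite matrices, it stays PSD, and the diagonal (the variances) is unchanged. A pure-Python sketch on a toy 2-asset matrix (the numbers are illustrative, not the bundled data):

```python
import math

ALPHA = 0.7  # shrinkage weight from the template; 1.0 reproduces the base regime

def crisis_covar(cov):
    # Convex combination of the base covariance and the rank-1 matrix
    # vol_i * vol_j (all correlations = 1), so the result stays PSD and the
    # diagonal is unchanged: alpha*var + (1 - alpha)*vol**2 == var.
    n = len(cov)
    vols = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[ALPHA * cov[i][j] + (1 - ALPHA) * vols[i] * vols[j]
             for j in range(n)] for i in range(n)]

base = [[0.0100, 0.0024],
        [0.0024, 0.0064]]  # toy 2-asset covariance; correlation = 0.30
crisis = crisis_covar(base)
# Variances preserved; the off-diagonal rises to 0.00408,
# i.e. crisis correlation = 0.7 * 0.30 + 0.3 * 1.0 = 0.51.
```

The same construction, written as a PyRel property, is what Stage 3 stores as the regime-conditioned covariance.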
-The template's combined script (`portfolio_balancing.py`) implements all four stages directly; this runbook expands the surrounding narrative — what each prompt asks, what shape of output to expect, and how each enrichment feeds the next — so a reader can follow the reasoning thread end-to-end without re-running the script. - ---- - -## TL;DR — the chain in one screen +## The chain ``` The current book breaks compliance on 4 holdings + 2 sectors. Naive @@ -39,454 +35,17 @@ base at every lambda — without the cluster collapse, the gap would grow. ───────────────────────────────────────────────────────────────── ``` -A single-reasoner approach can't answer this. Rules alone flag broken positions but don't rebuild the book. Graph alone clusters stocks but doesn't allocate. Prescriptive alone solves a Markowitz QP — but on the full universe it stacks near-duplicate cluster members, and on a single covariance it ignores regime risk. Each stage feeds the next: rules surface the violations, graph collapses redundant bets, prescriptive optimizes over the reduced universe across both regimes in one solve. - ---- - -## How to read this runbook - -This runbook serves two audiences: - -- **Reading top-to-bottom**: the narrative + ASCII visualizations show what the chain produces stage-by-stage, with the same business framing the stakeholder would see. -- **Per-stage skill blocks**: the boxed `Skill / Prompt` callout at the start of each stage is the recipe — load that RAI agent skill, give it that prompt against the bundled demo data, and the agent will reproduce the stage. - ---- - -## Step 0 — Scope the question with `rai-discovery` - -> `/rai-discovery` "Our 8-stock book breaks compliance and concentrates risk. Rebuild it under Markowitz mean-variance with caps, deduplicate redundant bets via correlation clustering, and stress-test under crisis. What questions does each reasoner family handle?" 
- -Discovery classifies the question by reasoner family and tells you which downstream skills to load: - -| Sub-question | Reasoner | Skill | -|---|---|---| -| Where is the current book breaking compliance (per-stock, per-sector, per-trader)? | Rules | `rai-rules-authoring` | -| Which stocks are really the same bet (correlation clusters + cluster representative)? | Graph | `rai-graph-analysis` | -| What's the optimal allocation under position + sector caps for each (budget, regime) scenario? | Prescriptive | `rai-prescriptive-problem-formulation` | -| How does the optimal portfolio degrade under a PSD-preserving crisis covariance? | Prescriptive (re-solve) | `rai-prescriptive-solver-management` + `rai-prescriptive-results-interpretation` | - -Discovery's output is a *plan*, not code. Everything that follows materializes that plan. - ---- - -## Setup - -See the template's main `README.md` for installation, RAI connection setup, and how to run the script. The narrative below follows the actual stage outputs of `portfolio_balancing.py` against the bundled CSVs in `../data/`. - -**Prerequisites** - -- Template's `data/` CSVs available — `returns.csv` (8 stocks), `covar.csv` (64 covariance entries), `users.csv`, `accounts.csv`, `holdings.csv`, `transactions.csv`. Or your own Snowflake schema with equivalent tables. -- `raiconfig.yaml` pointing at your RAI engine -- Python >= 3.10 with `relationalai >= 1.0.14` - ---- - ## Workflow -The runbook walks the same chain stage-by-stage, prompt-by-prompt, in agent-skill order. Each row maps to a section of the script. - -| # | Step | Skill | Prompt | Expected Output | -|---|------|-------|--------|-----------------| -| 1 | Build ontology | `/rai-build-starter-ontology` | "Build a RAI ontology for portfolio balancing from the CSVs in `data/`. 
Concepts: Stock (index, ticker, sector, returns, covar), Sector (derived from Stock sectors), User (with risk_score), Account (balance, account_type), Holding (quantity, purchase_price, value), Transaction (amount, category, flagged indicator)." | Model `portfolio` with 6 user-facing concepts. 8 Stocks across 5 sectors (Technology x3, Healthcare x2, Financials, Energy, Consumer Staples). 6 Users, 4 Accounts, 15 Holdings, 21 Transactions. Pairwise `Stock.covar(i, j)` two-arg property loaded from the long-form covariance CSV (64 entries, symmetric). | -| 2 | Discovery | `/rai-discovery` | "What questions can we answer with this ontology? We want to surface compliance violations on the current book, then rebuild it under a Markowitz objective with crisis-regime stress." | Rules: position-limit, sector-concentration, high-risk-trader flags as derived Relationships. Graph: covariance clustering (Louvain on |correlation| >= threshold) -> per-cluster representative by Sharpe -> investable-universe collapse. Prescriptive: bi-objective QP via epsilon constraint over the representative-only universe, indexed by a `Scenario` Concept that combines budget and regime. Stress: same `solve_epsilon` call under a PSD-preserving crisis covariance. Predictive: DATA_GAP (no time series). | -| 3 | Stage 1 — Compliance: overconcentrated holdings | `/rai-rules-authoring` | "Define `Holding.value = quantity * purchase_price`, then flag `Holding.is_overconcentrated` whenever `value > POSITION_LIMIT * Account.balance` (default 15%)." | 4 holdings flagged: AAPL (Account 1, 18.0%), MSFT (Account 1, 16.0%), JNJ (Account 4, 16.0%), PFE (Account 4, 16.2%). All four sit in Account 1 (Alice Chen, balance $100K) or Account 4 (Dan Wilson, balance $80K) — both accounts hold multiple positions clipping the 15% cap.
| -| 4 | Stage 1 — Compliance: sector concentration | `/rai-rules-authoring` | "Aggregate `Holding.value` per (Account, Sector) and flag `Holding.is_sector_concentrated` whenever the sector total > `SECTOR_LIMIT * Account.balance` (default 30%)." | 2 (account, sector) pairs flagged: Account 1 Technology 34.0% (AAPL + MSFT), Account 4 Healthcare 32.2% (JNJ + PFE). Pattern: the same accounts flagged by the position-limit rule stack within a sector. Stage 3 enforces the same 30% cap as a hard constraint. | -| 5 | Stage 1 — Compliance: high-risk traders | `/rai-rules-authoring` | "Flag `User.is_high_risk_trader` whenever `risk_score > 0.8` AND > 5 flagged transactions." | 2 users flagged: Alice Chen (risk 0.85), Eve Taylor (risk 0.92). Both have multiple `Transaction.is_flagged == True` rows in the bundled data. Standalone signal — used as a reviewer trigger, not a solver constraint. | -| 6 | Stage 2 — Derive volatility + correlation in PyRel | `/rai-ontology-design` | "Add `Stock.variance` (covariance diagonal where i == j), `Stock.volatility = sqrt(variance)`, and pairwise `Stock.correlation(i, j) = covar(i, j) / (vol_i * vol_j)`. All in PyRel — no numpy precompute." | `Stock.variance` and `Stock.volatility` written for all 8 stocks. `Stock.correlation` two-arg property populated from the 64 covariance pairs. Volatilities span ~0.06 to ~0.10 (small dataset, daily-scale). Storing in the ontology lets every downstream stage read the same source of truth instead of recomputing. | -| 7 | Stage 2 — Cluster the correlation graph | `/rai-graph-analysis` | "Build an undirected `Graph` with `Stock` as `node_concept`. Add an edge wherever `|correlation| >= CORR_THRESHOLD` (default 0.3) and `i < j` (deduplicate symmetric pairs). Run Louvain and persist `Stock.cluster`." | 4 edges retained (8 stocks, sparse graph).
Louvain yields 5 communities: Cluster 1 = {JNJ, PFE} (Healthcare), Cluster 2 = {AAPL, MSFT, GOOGL} (Technology), plus singletons Cluster 3 = {JPM}, Cluster 4 = {PG}, Cluster 5 = {XOM}. Intra-cluster avg correlation = +0.683, inter-cluster = +0.131 (~5x separation — clean). | -| 8 | Stage 2 — Pick cluster representatives by Sharpe | `/rai-graph-analysis` | "Define `Stock.sharpe = returns / volatility`. For each cluster, the stock whose Sharpe equals the cluster max is the representative — set `Stock.is_representative`. Singletons are their own representative. Define `Stock.is_non_representative` as the positive complement (the prescriptive rewriter doesn't accept `model.not_(...)` inside a solver `.where()`)." | 5 representatives picked from 8 stocks: Cluster 1 PFE (Healthcare, Sharpe 0.530), Cluster 2 GOOGL (Technology, 0.605), Cluster 3 JPM (Financials, 0.500), Cluster 4 PG (Consumer Staples, 0.444), Cluster 5 XOM (Energy, 0.588). Investable universe shrinks from 8 to 5. Non-representatives — AAPL, MSFT, JNJ — get `Stock.is_non_representative` and are forced to zero in Stage 3. | -| 9 | Stage 3 — Scenario + regime + decision variable | `/rai-prescriptive-problem-formulation` | "Define `Regime` (`base`, `crisis`) and a `Scenario` Concept indexed by name with `budget` and `regime` properties. Load 6 scenarios = {500, 1000, 2000} x {base, crisis}. Add `Stock.regime_covar(i, j, Regime) = base covariance for base; alpha * covar(i,j) + (1 - alpha) * vol_i * vol_j for crisis` (PSD-preserving correlation shrinkage in covariance units, alpha = 0.7). Decision variable: `Stock.x_quantity(Stock, Scenario)` continuous." | 1 `Stock.x_quantity` property, 8 stocks x 6 scenarios = 48 continuous variables (18 of which, the 3 non-representatives across 6 scenarios, will be hard-zero from the representative-only constraint). Regime-conditioned covariance lives in the ontology — the QP picks each scenario's matching regime without any branching in solver code.
| -| 10 | Stage 3 — Compliance constraints on the decision variable | `/rai-prescriptive-problem-formulation` | "Add: non-negative (`x_qty >= 0`); fully invested per scenario (`sum(x) == Scenario.budget`); per-rep position cap (`x_qty <= REP_POSITION_LIMIT * Scenario.budget`, default 0.30); per-sector cap (`sum(x_qty per sector) <= SECTOR_LIMIT * Scenario.budget`, default 0.30); representative-only (`x_qty == 0` where `Stock.is_non_representative()`)." | 5 constraint families. `REP_POSITION_LIMIT = 0.30` is intentionally higher than Stage 1's `POSITION_LIMIT = 0.15`: a representative carries its cluster's combined exposure, and feasibility requires `REP_POSITION_LIMIT * num_reps >= 1.0` (5 x 0.30 = 1.5, OK). `SECTOR_LIMIT = 0.30` is reused verbatim from Stage 1's compliance threshold — same parameter binds the existing book and the rebuilt one. | -| 11 | Stage 3 — Anchors + epsilon sweep | `/rai-prescriptive-solver-management` | "Anchor 1: minimize risk (no return constraint). Anchor 2: maximize return. Compute return-rate range across all 6 scenarios. Then sweep 5 interior epsilon-rate points uniformly across the rate range and re-solve `min risk s.t. return_rate >= eps_rate`. Use Ipopt, time limit 60s." | Per-scenario rate range: [0.0634, 0.0840] per unit invested (regime-independent because `Stock.returns` is regime-independent — only the covariance differs). 7 solves per scenario (2 anchors + 5 epsilon points) = 42 optimal portfolios (6 scenarios x 7 points), all `LOCALLY_SOLVED`. base_500 anchor 1: return 32.43, risk 1160.39. base_500 anchor 2: return 42.00. crisis_500 anchor 1: return 31.69, risk 1913.60 — same investable universe, different regime covariance. | -| 12 | Stage 3 — Pareto + knee | `/rai-prescriptive-results-interpretation` | "For each scenario, list the 7-point frontier (return, risk). Print marginal `delta_risk / delta_return` between adjacent points and identify the knee — where the marginal jumps most."
| base_500: returns 32.43 -> 33.41 -> 35.12 -> 36.84 -> 38.56 -> 40.28; risk 1160 -> 1177 -> 1263 -> 1386 -> 1546 -> 1742. Marginal climbs 16.85 -> 49.94 -> 71.72 -> 93.03 -> 114.43. Knee at Point 2 (`eps_1`) — marginal cost jumps ~3x beyond. base_1000 / base_2000 / crisis_* show the same shape (risk scales as budget^2 because the QP is quadratic, but the rate-form frontier is identical). | -| 13 | Stage 4 — Crisis stress comparison | `/rai-prescriptive-results-interpretation` | "From the Stage 3 sweep, emit a side-by-side `vol_base` vs `vol_crisis` table per (budget, lambda). Vol = sqrt(risk). Compute the absolute and percentage gap." | Same 7-point frontier resolved at each budget x regime. Crisis vol sits ~25-30% above base at every lambda. Budget 500: at `min_risk` vol_base 34.06 / vol_crisis 43.74 (+28.4%); at `eps_1` 34.30 / 44.54 (+29.8%); peaks at `eps_1`-`eps_2`; narrows to +25.2% at `eps_5`. The gap peaks mid-frontier and narrows toward the concentrated end — the inversion is the payoff of the cluster collapse: at the concentrated end the optimizer holds the highest-Sharpe distinct bet per cluster (weighted toward Energy/Consumer Staples here, which carry lower crisis correlations than the middle of the frontier). Without the representative collapse, the concentrated end would stack near-duplicates and the crisis gap would grow instead of shrink. | - ---- - -## Stage 1 — Rules: compliance scan - -> `/rai-rules-authoring` "Flag any holding worth more than 15% of its account, any sector worth more than 30% of the account, and any user with a risk score above 0.8 and more than five flagged transactions." 
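Outside the RAI stack, the three thresholds in this prompt can be prototyped with a few pandas group-bys. A minimal sketch on hypothetical mini-book data (column names and figures assumed for illustration, not the template's PyRel rules):

```python
import pandas as pd

# Hypothetical mini-book; the bundled CSVs carry the same shape at larger scale.
holdings = pd.DataFrame({
    "account": ["A1", "A1", "A1", "A4", "A4"],
    "symbol":  ["AAPL", "MSFT", "JPM", "JNJ", "PFE"],
    "sector":  ["Technology", "Technology", "Financials", "Healthcare", "Healthcare"],
    "value":   [18_000, 16_000, 10_000, 12_800, 13_000],
})
balances = {"A1": 100_000, "A4": 80_000}
users = pd.DataFrame({"name": ["Alice", "Bob"],
                      "risk_score": [0.85, 0.40],
                      "flagged_txns": [7, 1]})

holdings["balance"] = holdings["account"].map(balances)

# Rule 1: position worth more than 15% of its account balance.
holdings["is_overconcentrated"] = holdings["value"] > 0.15 * holdings["balance"]

# Rule 2: the account's per-sector total worth more than 30% of the balance.
sector_total = holdings.groupby(["account", "sector"])["value"].transform("sum")
holdings["is_sector_concentrated"] = sector_total > 0.30 * holdings["balance"]

# Rule 3: risk score above 0.8 AND more than five flagged transactions.
users["is_high_risk_trader"] = (users["risk_score"] > 0.8) & (users["flagged_txns"] > 5)

print(holdings.loc[holdings["is_overconcentrated"], "symbol"].tolist())
# -> ['AAPL', 'MSFT', 'JNJ', 'PFE']
```

The PyRel version below writes these flags back as ontology properties instead of DataFrame columns; the predicate logic is the same.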
- -``` -COMPLIANCE VIOLATIONS — current book (4 accounts, 15 holdings, 6 users) - - Rule 1: Holding.is_overconcentrated (position > 15% of balance) - ──────────────────────────────────────────────────────────────── - AAPL Account 1 $18,000 / $100,000 18.0% ─── Alice Chen - MSFT Account 1 $16,000 / $100,000 16.0% - JNJ Account 4 $12,800 / $80,000 16.0% ─── Dan Wilson - PFE Account 4 $13,000 / $80,000 16.2% - - Rule 2: Holding.is_sector_concentrated (sector > 30% of balance) - ──────────────────────────────────────────────────────────────── - Account 1 Technology $34,000 / $100,000 34.0% - Account 4 Healthcare $25,800 / $80,000 32.2% - - Rule 3: User.is_high_risk_trader (risk_score > 0.8 AND >5 flagged txns) - ──────────────────────────────────────────────────────────────── - Alice Chen risk_score 0.85 ── flagged transactions - Eve Taylor risk_score 0.92 - - ────────────────────────────────────────────────────────────────── - The same accounts that breach the per-stock cap also breach the - sector cap — Stage 3 will use the SECTOR_LIMIT (0.30) as a hard - constraint when rebuilding both books. The trader flag is a - reviewer signal, not a solver input. - ────────────────────────────────────────────────────────────────── - - Holding.is_overconcentrated [4] - Holding.is_sector_concentrated [2] - User.is_high_risk_trader [2] -``` - -`POSITION_LIMIT` (0.15) and `SECTOR_LIMIT` (0.30) are top-level constants. `SECTOR_LIMIT` is reused verbatim by Stage 3; `POSITION_LIMIT` is replaced in Stage 3 by `REP_POSITION_LIMIT = 0.30` because a representative carries its cluster's combined exposure (and 5 reps x 0.20 = 1.00 would already pin the budget — 0.30 leaves headroom). - ---- - -## Stage 2 — Graph: covariance clustering + cluster representatives - -> `/rai-graph-analysis` "Cluster stocks by correlation — anything above 0.3 absolute is a redundant bet. Pick one representative per cluster (highest Sharpe ratio) and force the rest to zero in optimization." 
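The cluster-then-pick-representative logic can be sketched with networkx on a toy correlation matrix (values hypothetical; the template derives `Stock.correlation` in PyRel and runs Louvain inside RAI):

```python
import numpy as np
import networkx as nx

# Hypothetical correlations over 5 tickers, not the template's covar.csv.
tickers = ["AAPL", "MSFT", "GOOGL", "JNJ", "PFE"]
rho = np.array([
    [1.00, 0.75, 0.70, 0.10, 0.05],
    [0.75, 1.00, 0.65, 0.08, 0.06],
    [0.70, 0.65, 1.00, 0.12, 0.09],
    [0.10, 0.08, 0.12, 1.00, 0.60],
    [0.05, 0.06, 0.09, 0.60, 1.00],
])
sharpe = {"AAPL": 0.58, "MSFT": 0.56, "GOOGL": 0.61, "JNJ": 0.50, "PFE": 0.53}

G = nx.Graph()
G.add_nodes_from(tickers)
for i in range(len(tickers)):
    for j in range(i + 1, len(tickers)):
        if abs(rho[i, j]) >= 0.3:          # redundant-bet threshold
            G.add_edge(tickers[i], tickers[j])

communities = nx.community.louvain_communities(G, seed=42)
# One representative per community: the highest-Sharpe member.
reps = sorted(max(c, key=sharpe.get) for c in communities)
print(reps)  # -> ['GOOGL', 'PFE']
```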
- -**Construction** — undirected, unweighted graph: -- Node concept: `Stock` (8 nodes) -- Edges built from the derived `Stock.correlation(i, j)` property where `|correlation| >= 0.3` and `i < j` -- Aggregator: `"sum"` (no parallel edges expected) - -**Algorithm:** `louvain()` for community detection. - -**Volatility, correlation, and crisis covariance are all PyRel derived properties** — no numpy precompute. The covariance matrix loaded from `covar.csv` is the only solver input not derived from another property. - -``` -DERIVED IN PYREL - Stock.variance <- covar(i, j) where i == j - Stock.volatility <- sqrt(variance) - Stock.correlation(i,j) <- covar(i, j) / (vol_i * vol_j) - -CORRELATION GRAPH - Edges with |correlation| >= 0.30: 4 - Stocks above threshold pairwise: {AAPL,MSFT}, {AAPL,GOOGL}, - {MSFT,GOOGL}, {JNJ,PFE} - -LOUVAIN COMMUNITIES 5 clusters - Cluster 1 (size 2): JNJ (Healthcare), PFE (Healthcare) - Cluster 2 (size 3): AAPL (Technology), MSFT (Technology), - GOOGL (Technology) - Cluster 3 (size 1): JPM (Financials) ← singleton - Cluster 4 (size 1): PG (Consumer Staples) ← singleton - Cluster 5 (size 1): XOM (Energy) ← singleton - - Avg correlation: intra-cluster = +0.683 - inter-cluster = +0.131 ── ~5x separation -``` - -Singletons (Cluster 3-5) are their own representatives. The non-trivial choice happens inside Cluster 1 (JNJ vs PFE) and Cluster 2 (AAPL/MSFT/GOOGL): - -``` -REPRESENTATIVE = HIGHEST SHARPE PER CLUSTER (returns / volatility) - - Cluster 1 (Healthcare): - JNJ Sharpe 0.500 PFE Sharpe 0.530 ← REP - - Cluster 2 (Technology): - AAPL Sharpe 0.582 MSFT Sharpe 0.560 - GOOGL Sharpe 0.605 ← REP - - Cluster 3 (Financials): JPM Sharpe 0.500 ← REP (singleton) - Cluster 4 (Consumer Staples): PG Sharpe 0.444 ← REP (singleton) - Cluster 5 (Energy): XOM Sharpe 0.588 ← REP (singleton) - - ────────────────────────────────────────────────────────────────── - Investable universe collapses 8 -> 5. 
- AAPL, MSFT, JNJ get Stock.is_non_representative — Stage 3 forces - their decision variables to zero. - - This is "collapse, don't cap" — Stage 3 doesn't allow the full 8 - with caps inside a redundant cluster; it removes the duplicates - before the optimizer sees them. - ────────────────────────────────────────────────────────────────── - - ✓ Stock.variance / volatility / correlation written back [8 / 8 / 64] - ✓ Stock.cluster, Stock.sharpe, Stock.cluster_max_sharpe [8 each] - ✓ Stock.is_representative [5] - ✓ Stock.is_non_representative [3] -``` - ---- - -## Stage 3 — Prescriptive: bi-objective QP with epsilon constraint - -> `/rai-prescriptive-problem-formulation` "Build a Markowitz mean-variance frontier across 6 scenarios = 3 budgets × 2 regimes. Position cap 30% of budget, sector cap 30%, non-representatives forced to zero. Anchor with min-risk and max-return, then sweep 5 epsilon points across the return range." - -``` -FORMULATION - - Decision variable - Stock.x_quantity(Stock, Scenario) continuous, >= 0 - 8 stocks x 6 scenarios = 48 vars - 18 forced to 0 by Stock.is_non_representative() (3 non-reps x 6 scenarios) - 30 active = 5 representatives x 6 scenarios - - Scenarios (3 budgets x 2 regimes = 6 tuples) - base_500 base_1000 base_2000 crisis_500 crisis_1000 crisis_2000 - - Constraints (per scenario) - 1. Non-negative x_qty >= 0 - 2. Fully invested sum(x_qty) == Scenario.budget - 3. Per-rep position x_qty <= 0.30 * Scenario.budget - 4. Per-sector sum(x_qty per sector) <= 0.30 * Scenario.budget - 5.
Representative-only x_qty == 0 where is_non_representative - - Risk objective (regime-aware) - minimize Sigma_ij regime_covar(i, j, Scenario.regime) * x_i * x_j - └── PyRel-derived per regime; PSD-preserving for crisis ──┘ - - Return constraint (epsilon, scaled by budget) - sum(Stock.returns * x_qty) >= eps_rate * Scenario.budget - -────────────────────────────────────────────────────────────────────── -SOLVE (Ipopt, time limit 60s) → LOCALLY_SOLVED - Anchor 1 (min risk) + Anchor 2 (max return) + 5 epsilon points - = 7 solves, 42 optimal portfolios (one per scenario per point) -────────────────────────────────────────────────────────────────────── - -ANCHOR 1 — minimize risk (no return floor) - base_500 return 32.43 risk 1,160.39 - base_1000 return 64.87 risk 4,641.57 - base_2000 return 129.73 risk 18,566.28 - crisis_500 return 31.69 risk 1,913.60 ← higher risk, same universe - crisis_1000 return 63.37 risk 7,654.40 - crisis_2000 return 126.75 risk 30,617.59 - -ANCHOR 2 — maximize return - base_500 / crisis_500 return 42.00 - base_1000 / crisis_1000 return 84.00 - base_2000 / crisis_2000 return 168.00 - (returns are regime-independent) - -Return-rate range [0.0634, 0.0840] per unit invested -Epsilon sweep 5 interior points evenly spaced across the range -``` - ---- - -## Stage 3 — Reading the frontier (per scenario) - -> `/rai-prescriptive-results-interpretation` "For each scenario, list the seven-point Pareto frontier and find the knee — where does the marginal risk per unit return jump the most? Is the rate-form frontier shape consistent across budgets?" 
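The anchor-plus-epsilon-sweep recipe in the formulation above can be reproduced outside Ipopt with scipy's SLSQP on a toy universe. All numbers here are hypothetical (four representatives instead of five), so only the mechanics carry over:

```python
import numpy as np
from scipy.optimize import minimize

mu = np.array([0.080, 0.063, 0.070, 0.075])        # hypothetical returns per unit
cov = np.array([[0.040, 0.006, 0.004, 0.005],
                [0.006, 0.020, 0.002, 0.003],
                [0.004, 0.002, 0.030, 0.004],
                [0.005, 0.003, 0.004, 0.035]])
budget = 500.0
cap = 0.30 * budget                                # cap * 4 reps = 600 >= budget: feasible

def min_risk(eps_rate=None):
    """Minimize variance, optionally s.t. return_rate >= eps_rate (epsilon constraint)."""
    cons = [{"type": "eq", "fun": lambda x: x.sum() - budget}]
    if eps_rate is not None:
        cons.append({"type": "ineq", "fun": lambda x: mu @ x - eps_rate * budget})
    res = minimize(lambda x: x @ cov @ x, np.full(4, budget / 4),
                   bounds=[(0.0, cap)] * 4, constraints=cons, method="SLSQP")
    assert res.success
    return res.x

# Anchor 1: unconstrained min-risk. Anchor 2: max return = fill caps greedily by mu.
x_min = min_risk()
x_max = np.zeros(4)
for i in np.argsort(mu)[::-1]:
    x_max[i] = min(cap, budget - x_max.sum())
lo, hi = mu @ x_min / budget, mu @ x_max / budget   # return-rate range

# Sweep 5 interior epsilon points; risk is non-decreasing along the frontier.
frontier = [float(x @ cov @ x) for eps in np.linspace(lo, hi, 7)[1:-1]
            for x in [min_risk(eps)]]
assert all(r1 <= r2 + 1e-4 * (1 + r2) for r1, r2 in zip(frontier, frontier[1:]))
```

The template's actual solve does this once across all (budget, regime) scenarios in a single call; the sketch loops per epsilon point instead.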
- -``` -EFFICIENT FRONTIER — base_500 (budget = 500, regime = base) - - # Label Return Risk - ──────────────────────────────────────── - 1 min_risk 32.43 1,160.39 - 2 eps_1 33.41 1,176.78 ← KNEE - 3 eps_2 35.12 1,262.61 - 4 eps_3 36.84 1,385.89 - 5 eps_4 38.56 1,545.79 - 6 eps_5 40.28 1,742.47 - 7 max_return 42.00 (separate anchor) - - Marginal delta_risk / delta_return: - min_risk → eps_1 16.85 - eps_1 → eps_2 49.94 ← +3.0x ── KNEE - eps_2 → eps_3 71.72 - eps_3 → eps_4 93.03 - eps_4 → eps_5 114.43 - - ────────────────────────────────────────────────────────────────── - base_1000 / base_2000 / crisis_* show the SAME shape — risk scales - as budget^2 (the QP is quadratic in x), but the rate-form frontier - and the knee location are budget-independent. - ────────────────────────────────────────────────────────────────── - - ✓ Stock.x_quantity written back, indexed by (Stock, Scenario) -``` - ---- - -## Stage 4 — Crisis stress test - -> `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Stress-test the frontier under crisis: shrink correlations toward all-ones with weight 0.7 on base covariance + 0.3 on outer-product. How much volatility expansion at each frontier point — does the gap peak mid-frontier or at the concentrated end?" - -Same `solve_epsilon` call, no separate model — `Scenario.regime` selects between two `Stock.regime_covar` definitions: - -- `base`: Sigma(i, j) -- `crisis`: alpha * Sigma(i, j) + (1 - alpha) * vol_i * vol_j (alpha = 0.7) - -The crisis formula is correlation shrinkage toward all-ones (`rho_crisis = alpha * rho + (1 - alpha) * J`) re-expressed in covariance units. PSD is preserved by construction (convex combination of PSD matrices), so every lambda solves cleanly. 
- -``` -VOLATILITY COMPARISON vol = sqrt(risk) - - Budget 500: - Label vol_base vol_crisis gap gap_% - ──────────────────────────────────────────────────── - min_risk 34.06 43.74 +9.68 +28.4% - eps_1 34.30 44.54 +10.24 +29.8% ← peak - eps_2 35.53 46.11 +10.58 +29.8% - eps_3 37.23 47.94 +10.72 +28.8% - eps_4 39.32 49.99 +10.68 +27.2% - eps_5 41.74 52.27 +10.53 +25.2% - - Budget 1000 + Budget 2000: same gap_% pattern (vol scales with budget; - the rate-form gap is budget-independent). - - ────────────────────────────────────────────────────────────────── - Crisis vol sits 25-30% above base at EVERY lambda. The gap PEAKS in - the middle of the frontier (eps_1..eps_2 at +29.8%) and NARROWS at - the concentrated end (eps_5 at +25.2%). - - Why the inversion: at the concentrated end the optimizer is picking - the highest-Sharpe distinct bet per cluster — the bundled data - weights this toward Energy + Consumer Staples, which happen to - carry lower crisis correlations than the middle of the frontier. - Without the representative collapse, the concentrated end would - stack near-duplicates and the crisis gap would GROW, not shrink. - ────────────────────────────────────────────────────────────────── - - ✓ Stock.regime_covar written back (64 base + 64 crisis = 128 entries) -``` - -**Why `Scenario` is a Concept, not a loop:** all 6 (budget, regime) tuples solve in a single call to the solver, against the matching `regime_covar`. Adding a fourth regime or a fifth budget is a data edit in `scenario_data`, not a change to `solve_epsilon`. Scenarios are data. 
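The PSD claim is cheap to verify numerically. A small numpy sketch with made-up volatilities and correlations (not the template's `covar.csv`):

```python
import numpy as np

vol = np.array([0.20, 0.25, 0.30])                 # hypothetical per-stock volatilities
rho = np.array([[1.00, 0.90, 0.80],
                [0.90, 1.00, 0.85],
                [0.80, 0.85, 1.00]])
base = np.outer(vol, vol) * rho                    # base covariance (PSD)

alpha = 0.7                                        # CRISIS_ALPHA
# Correlation shrinkage toward all-ones, expressed in covariance units:
# cov'(i,j) = alpha * cov(i,j) + (1 - alpha) * vol_i * vol_j
crisis = alpha * base + (1 - alpha) * np.outer(vol, vol)
assert np.linalg.eigvalsh(crisis).min() >= -1e-12  # convex combo of PSD matrices

# Naive 1.5x scaling of the off-diagonals pushes implied correlations above 1
# and breaks PSD, the failure mode the Crisis Regime Construction section calls out.
naive = base.copy()
naive[~np.eye(3, dtype=bool)] *= 1.5
assert np.linalg.eigvalsh(naive).min() < 0         # indefinite: Ipopt would reject it
```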
- ---- - -## Stage outputs — what each reasoner contributes back - -``` -ONTOLOGY ENRICHMENT — what each stage wrote back - - Stage 1 (rules) Holding.value [15] - Holding.is_overconcentrated [4] - Holding.is_sector_concentrated [2] - User.is_high_risk_trader [2] - - Stage 2 (graph) Stock.variance [8] - Stock.volatility [8] - Stock.correlation (i, j) [64] - Stock.cluster [8] - Stock.sharpe [8] - Stock.cluster_max_sharpe [8] - Stock.is_representative [5] - Stock.is_non_representative [3] - - Stage 3 (prescriptive) Stock.regime_covar (i, j, Regime) [128] - Stock.x_quantity (Stock, Scenario) [48] - - Stage 4 (stress) (terminal — vol_base vs vol_crisis table) - - ────────────────────────────────────────────────────────────────── - Each stage reads what the previous stage wrote. - Re-running any downstream stage automatically picks up enrichments. - No glue code, no DataFrame round-trip — same ontology throughout. - ────────────────────────────────────────────────────────────────── -``` - ---- - -## The chain — accretive ontology enrichment - -``` -THE PORTFOLIO-BALANCING CHAIN - - STAGE 1 RULES - "Where is the current book breaking compliance?" - reads: Holding, Account, User, Transaction, Stock.sector - writes: Holding.value - Holding.is_overconcentrated / is_sector_concentrated - User.is_high_risk_trader - │ - ▼ - STAGE 2 GRAPH (Louvain) - "Which stocks are really the same bet?" - reads: Stock.covar (loaded), Stock.returns - writes: Stock.variance / volatility / correlation (i, j) - Stock.cluster ── 5 communities - Stock.sharpe / cluster_max_sharpe - Stock.is_representative ── 5 of 8 stocks - Stock.is_non_representative ── forced to zero in Stage 3 - │ - ▼ - STAGE 3 PRESCRIPTIVE (Ipopt QP) - "What's the optimal allocation under position + sector caps, - for each (budget, regime) scenario?" 
- reads: Stock.is_representative ──► decision-variable scope - Stock.is_non_representative ──► hard-zero constraint - Stock.returns ──► epsilon return target - Stock.regime_covar ──► quadratic risk objective - Scenario.budget / regime ──► per-scenario constraints - SECTOR_LIMIT (= Stage 1's) ──► hard sector cap - writes: Stock.regime_covar (PSD-preserving) [base + crisis] - Stock.x_quantity (Stock, Scenario) [42 active portfolios] - │ - ▼ - STAGE 4 STRESS (same solver, regime swap) - "How does the optimal portfolio degrade under a crisis covariance?" - reads: Stock.regime_covar (regime = "crisis") - Stage 3's pareto results - writes: (terminal — vol gap table) - - ────────────────────────────────────────────────────────────────── - No glue. No DataFrame ping-pong. No re-derivation per-reasoner. - Three reasoner families, one ontology, one accretive thread. - Stage 4 is a regime swap on the same solve — not a separate model. - ────────────────────────────────────────────────────────────────── -``` - ---- - -## Why the chain matters (vs. 
any single stage) - -| Stage alone | What it tells you | What it doesn't | -|---|---|---| -| Rules alone | "4 holdings + 2 sectors break compliance" | How to rebuild the book | -| Graph alone | "AAPL/MSFT/GOOGL move together; JNJ/PFE move together" | Which to keep, how much to allocate | -| Prescriptive alone (full universe) | A "diversified" frontier that stacks near-duplicates inside a cluster | That two of those names are one bet — and that the crisis gap will grow under stress | -| Prescriptive alone (single regime) | A base-case efficient frontier | What it costs you when correlations spike | - -| Combined | Output | -|---|---| -| Rules → Graph | Compliance violations + redundant-bet map | -| + Prescriptive (rep-only universe) | 7-point Pareto frontier per scenario; knee at eps_1 | -| + Stress (Scenario.regime swap) | Crisis vol 25-30% above base, gap narrows at concentrated end (the cluster-collapse payoff) | - -**Multi-reasoner chaining grounded in (and contributing to) the ontology.** - ---- - -## Crisis Regime Construction - -**Do not** naively scale off-diagonal covariance by a constant — it frequently breaks positive semidefiniteness, the lambda=0 pure min-variance solve fails (Ipopt rejects non-convex QP), and the whole frontier anchors wrong. PSD-preserving alternatives: - -| Approach | Formula | Preserves PSD? | -|----------|---------|----------------| -| **Correlation shrinkage toward all-ones** (this template) | `rho' = alpha * rho + (1 - alpha) * J`, alpha in [0.6, 0.9]. 
In covariance units: `cov'(i,j) = alpha * cov(i,j) + (1 - alpha) * vol_i * vol_j` | Yes (convex combination of PSD matrices) | -| Eigenvalue flooring | Eigendecompose, replace lambda_i with max(lambda_i, eps), recompose | Yes by construction | -| Scale off-diagonals + PSD projection | Scale, then find nearest PSD matrix via eigendecomposition | Yes after projection | -| ~~Uniform 1.5x off-diagonals~~ | `Sigma'[i,j] = 1.5 * Sigma[i,j]` for i != j | **No** — frequently non-PSD | - -`CRISIS_ALPHA = 0.7` produces a clearly "crisis-like" regime (intra-cluster correlations bump toward 1) while keeping the QP well-conditioned at every lambda. Lower values (0.5-0.6) give more severe crises; values > 0.9 get close to the base case. - ---- - -## Data Reference - -- **Source data**: bundled CSVs in `../data/` (8 stocks across 5 sectors with 64-entry symmetric covariance, 6 users, 4 accounts, 15 holdings, 21 transactions). To run against your own Snowflake schema instead, swap the `read_csv(...)` loads for `model.Table(...)` references in `portfolio_balancing.py`; the rest of the pipeline is unchanged. -- **Stages**: implemented in `../portfolio_balancing.py` as a single combined script with stage banners (Stage 1 → Stage 4). -- **Ontology**: 6 user-facing concepts (`Stock`, `Sector`, `User`, `Account`, `Holding`, `Transaction`) plus the `Regime` and `Scenario` concepts introduced in Stage 3. Run `inspect.schema(model)` after the pipeline (see template README) to dump the full concept/property/relationship surface, filtering out reasoner-owned concepts (`Variable`, `Constraint`, etc.) and the auto-generated `graph_Edge` from Stage 2. - ---- - -## Adapting this recipe to a new domain - -The chain pattern transfers cleanly. To rebuild for a different problem: +| # | Skill + prompt | Expected output | +|---|----------------|-----------------| +| 0 | `/rai-discovery` "Our 8-stock book breaks compliance and concentrates risk. 
Rebuild it under Markowitz mean-variance with caps, deduplicate redundant bets via correlation clustering, and stress-test under crisis. What questions does each reasoner family handle?" | Plan: rules for compliance flags, graph for correlation clustering + representatives, prescriptive QP indexed by Scenario, stress as regime-swap re-solve. | +| 1 | `/rai-rules-authoring` "Flag any holding worth more than 15% of its account, any sector worth more than 30% of the account, and any user with a risk score above 0.8 and more than five flagged transactions." | 4 holdings flagged (AAPL/MSFT on Account 1, JNJ/PFE on Account 4); 2 (account, sector) pairs flagged (Account 1 Tech 34.0%, Account 4 Healthcare 32.2%); 2 users flagged (Alice Chen 0.85, Eve Taylor 0.92). | +| 2 | `/rai-graph-analysis` "Cluster stocks by correlation — anything above 0.3 absolute is a redundant bet. Pick one representative per cluster (highest Sharpe ratio) and force the rest to zero in optimization." | 4 edges (\|rho\| >= 0.3), 5 Louvain clusters, intra +0.683 vs inter +0.131. 5 representatives picked: PFE, GOOGL, JPM, PG, XOM. AAPL/MSFT/JNJ flagged `is_non_representative`. | +| 3 | `/rai-prescriptive-problem-formulation` "Build a Markowitz mean-variance frontier across 6 scenarios = 3 budgets x 2 regimes. Position cap 30% of budget, sector cap 30%, non-representatives forced to zero. Anchor with min-risk and max-return, then sweep 5 epsilon points across the return range." | 48 decision vars (8 stocks x 6 scenarios), 5 constraint families. Return-rate range [0.0634, 0.0840]. 7 solves x 6 scenarios = 42 `LOCALLY_SOLVED` portfolios via Ipopt. | +| 4 | `/rai-prescriptive-results-interpretation` "For each scenario, list the seven-point Pareto frontier and find the knee — where does the marginal risk per unit return jump the most?" | base_500 frontier: returns 32.43 -> 40.28, risk 1160 -> 1742. Marginal `delta_risk/delta_return` jumps ~3x at eps_1 (knee).
Same shape across all 6 scenarios — risk scales as budget^2, rate-form frontier is budget-independent. | +| 5 | `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Stress-test the frontier under crisis: shrink correlations toward all-ones with weight 0.7 on base covariance + 0.3 on outer-product. How much volatility expansion at each frontier point?" | Crisis vol +25-30% above base at every lambda (budget 500: min_risk 34.06 -> 43.74, eps_1 34.30 -> 44.54 peak). Gap peaks mid-frontier, narrows to +25.2% at eps_5 — the cluster-collapse payoff. | -1. Re-run `rai-discovery` on the new business question — does it actually need all four reasoner roles (rules, graph, prescriptive, prescriptive re-solve), or is one or two sufficient? A pure compliance audit stops after Stage 1; a rebalancer without redundant bets in the universe can skip Stage 2. -2. Strip the demo ontology to the concepts the new chain needs (lean is better for type inference and solver compile time). For portfolio variants, the load-bearing concepts are the asset entity (here `Stock`), its pairwise covariance, an entity that holds compliance thresholds (here `Account`), and the `Scenario` Concept that parameterizes the optimizer. -3. Stage 1 (rules) is where every threshold the optimizer will later enforce gets named once and reused — keep `SECTOR_LIMIT` (or its equivalent) shared between the compliance scan and the prescriptive constraints, so the rebuilt book obeys the same caps the diagnostic flagged. -4. Stages 2–4 are the load-bearing chain: Graph collapses redundant bets via clustering + per-cluster representative selection, writing `is_representative` and `is_non_representative` flags the optimizer reads. Prescriptive uses those flags to scope the decision variable and adds the position + sector caps.
The stress-test stage is the *same solver call* under a regime-swapped covariance — `Scenario.regime` and `Stock.regime_covar(i, j, Regime)` keep base and crisis as data, not separate models. -5. Keep the validation checks at every stage: assert flagged-set size, the cluster count and intra-vs-inter average correlation gap, anchor solves return `LOCALLY_SOLVED`, the return-rate range is non-degenerate, and `REP_POSITION_LIMIT * num_representatives >= 1.0` so the fully-invested constraint stays feasible. -6. When constructing a stress regime, never naively scale off-diagonal covariance — eigenvalue flooring, projection to nearest PSD, or correlation shrinkage toward all-ones (this template) all preserve PSD; arbitrary scaling does not, and the min-risk anchor will fail. +## Data -The shape this template demonstrates — *each reasoner writes a property the next reasoner reads* — is what makes the chain accretive rather than serial. The agent skills are how you reliably author each link. +Bundled CSVs in `../data/`: `returns.csv` (8 stocks across 5 sectors), `covar.csv` (64 symmetric covariance entries), plus `users.csv` (6), `accounts.csv` (4), `holdings.csv` (15), `transactions.csv` (21). All four stages run in `../portfolio_balancing.py`. diff --git a/v1/supply_chain_resilience/references/runbook.md b/v1/supply_chain_resilience/references/runbook.md index 877c451..c7a68ad 100644 --- a/v1/supply_chain_resilience/references/runbook.md +++ b/v1/supply_chain_resilience/references/runbook.md @@ -1,12 +1,8 @@ # Runbook: Supply Chain Resilience — Multi-Reasoner Walkthrough -Walk-through of the chained-reasoner pattern this template is built on. One realistic business thread — **risk-adjusted network flow with disruption scenarios** — traced across four RAI reasoning stages, each writing properties back to the same ontology that downstream stages consume. +Risk-adjusted network flow with disruption scenarios, traced across four RAI reasoning stages. 
Each stage writes properties back to the same ontology that downstream stages consume, so the optimizer can hard-block bad suppliers, surcharge watch suppliers, and weight bottleneck hubs using upstream graph and rules signals. -The template's combined script (`supply_chain_resilience.py`) implements all four stages directly. This runbook expands the surrounding narrative — what each stage finds, why the next stage needs it, what the optimizer does with the enrichment — so a non-OR reader can follow the full reasoning thread end-to-end. - ---- - -## TL;DR — the chain in one screen +## The chain ``` Two HIGH-priority customers depend on 6 upstream suppliers — one of which @@ -37,370 +33,17 @@ watch->avoid downgrade = +0.0% (optimizer already routed around it). ───────────────────────────────────────────────────────────────── ``` -A single-reasoner approach can't answer this. Reachability alone names the suppliers in scope but doesn't rank them. Graph alone ranks hubs but doesn't decide flow. Rules alone classifies suppliers but doesn't route around them. Prescriptive alone has no way to hard-block bad suppliers, surcharge watch suppliers, or penalize bottleneck hubs without those upstream signals. - ---- - -## How to read this runbook - -This runbook serves two audiences: - -- **Reading top-to-bottom**: the narrative + ASCII visualizations show what the chain produces stage-by-stage, with the same business framing the stakeholder would see. -- **Per-stage skill blocks**: the boxed `Skill / Prompt` callout at the start of each stage is the recipe — load that RAI agent skill, give it that prompt against the bundled demo data in `../data/`, and the agent will reproduce the stage. - -The two views share one ontology: each stage writes properties back to the model that the next stage reads. The skill blocks are how an engineer rebuilds the chain link-by-link; the narrative is how a stakeholder reads what the chain produced. 
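Stage 0's blast-radius question reduces to graph ancestry. A toy networkx sketch with hypothetical shipment edges (the template derives the real edges from `Shipment.supplier` -> `Shipment.customer`):

```python
import networkx as nx

# Hypothetical supplier -> customer shipment edges.
ships_to = [
    ("CellChem", "PowerCell"), ("PowerCell", "TechAssembly"),
    ("WaferTech", "ChipTech"), ("ChipTech", "TechAssembly"),
    ("TechAssembly", "MegaCorp"),
]
suppliers = {"CellChem", "PowerCell", "WaferTech", "ChipTech", "TechAssembly"}

G = nx.DiGraph(ships_to)
# Upstream dependency cone of a HIGH-priority customer = all ancestors in the graph.
upstream = nx.ancestors(G, "MegaCorp") & suppliers
print(sorted(upstream))
# -> ['CellChem', 'ChipTech', 'PowerCell', 'TechAssembly', 'WaferTech']
```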
- ---- - -## Step 0 — Scope the question with `rai-discovery` - -> `/rai-discovery` "We need a risk-adjusted routing plan. What's our exposure to each supplier, which sites are bottlenecks, which suppliers are unreliable, and what does the minimum-cost flow look like once those risks are priced in?" - -Discovery classifies the question by reasoner family and tells you which downstream skills to load: - -| Sub-question | Reasoner | Skill | -|---|---|---| -| "Which suppliers do my high-priority customers transitively depend on?" | Graph (reachability) | `rai-graph-analysis` | -| "Which sites are network bottlenecks?" | Graph (centrality, components) | `rai-graph-analysis` | -| "Which suppliers are risky? Which demands are escalated?" | Rules | `rai-rules-authoring` | -| "What's the minimum-cost flow plan that respects all of the above, and how much do disruptions cost?" | Prescriptive | `rai-prescriptive-problem-formulation`, `rai-prescriptive-solver-management`, `rai-prescriptive-results-interpretation` | - -Discovery's output is a *plan*, not code. Everything that follows materializes that plan. - ---- - -## Setup - -See the template's main `README.md` for installation, RAI connection setup, and how to run the script. The narrative below follows the actual stage outputs of `supply_chain_resilience.py` against the bundled CSVs in `../data/`. - ---- - -## Stage 0 — Reachability: blast-radius pre-analysis - -> `/rai-graph-analysis` "If a key supplier goes offline, which downstream buyers and finished products are at risk? For each HIGH-priority customer, list the suppliers it transitively depends on through the shipment graph, with their reliability scores." - -**Construction** — directed `Business` graph, edges from `Business.ships_to` (derived from `Shipment.supplier` -> `Shipment.customer`). - -**Targets** — `Business.is_high_priority_customer` is set wherever a `Demand` with `priority == "HIGH"` is placed by that business. 
From the bundled data, this fires for **2 buyers**: B008 MegaCorp Enterprise and B009 TechGiant Inc (9 HIGH-priority demands between them, all for ProPhone X1 / ProTab T1). - -**Algorithm** — `biz_graph.reachable(to=target_customer)` filtered to nodes with `business_type == "SUPPLIER"`. - -``` -Upstream supplier dependencies (HIGH-priority customers) - - MegaCorp Enterprise (B008) depends on 6 suppliers: - - CellChem China (reliability 78%) ← will become AVOID - - EuroCell Poland (reliability 88%) - - EuroChip Germany (reliability 94%) - - EuroDisplay Czech (reliability 91%) - - GlassCorp Korea (reliability 89%) - - WaferTech Taiwan (reliability 97%) - - TechGiant Inc (B009) depends on 6 suppliers: (same set) - - ────────────────────────────────────────────────────────────────── - Both HIGH-priority customers share the same 6-supplier upstream. - CellChem (B017) sits in BOTH dependency cones — Stage 2 will flag - it AVOID, and Stage 3's baseline already excludes it. - ────────────────────────────────────────────────────────────────── - -✓ Business.is_high_priority_customer written back (2 buyers) -``` - -The point of running reachability before the MILP: when the scenario in Stage 3 downgrades watch suppliers to avoid, the cost delta has to be read against this dependency map. If a downgraded supplier sits in zero customer cones, the optimizer just shrugs and the cost stays flat — which is exactly what happens here for the watch-tier supplier B003. - ---- - -## Stage 1 — Graph: site centrality + connected components - -> `/rai-graph-analysis` "Which sites are the most influential hubs in the supply network — sites that connect to other influential sites, not just sites with many direct connections? Persist the centrality score back to each site so the optimizer can use it as a bottleneck weight." 
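Eigenvector centrality rewards connections to other well-connected sites, not raw degree. A networkx sketch on a hypothetical lane graph (site IDs echo the template's, the scores do not):

```python
import networkx as nx

# Hypothetical ship lanes between sites; parallel lanes collapse in an undirected Graph.
lanes = [("S001", "S004"), ("S002", "S004"), ("S003", "S004"),
         ("S004", "S006"), ("S006", "S007"), ("S001", "S002")]
G = nx.Graph(lanes)

raw = nx.eigenvector_centrality(G, max_iter=1000)
assert max(raw, key=raw.get) == "S004"   # the convergence hub dominates

# Normalize to [0, 1] the way the template persists Site.centrality.
top = max(raw.values())
centrality = {site: score / top for site, score in raw.items()}
print(centrality["S004"])                # -> 1.0
```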
- -**Construction:** -- Node concept: `Site` (31 sites) -- Edges: built from `Operation` rows where `op_type == "SHIP"` (undirected, unweighted) -- Aggregator: `"sum"` (collapse parallel ship lanes between the same pair) - -**Algorithms:** `weakly_connected_component()` for cluster discovery, then `eigenvector_centrality()` for hub importance. - -``` -Connected components: 2 - - Component A: 25 sites (APAC + AMERICAS, joined by S004->S006 long-haul) - Component B: 6 sites (EMEA distribution loop) - -Top critical sites — eigenvector centrality (FACTORY/DC only) - - S004 TechAssembly Factory (FACTORY, APAC) 0.5016 ████████████ ★ central hub - S006 West Coast DC (DC, AMERICAS) 0.3895 █████████ - S003 PowerCell Facility (FACTORY, APAC) 0.3688 █████████ ⚠ also Stage 2 watch - S002 DisplayCorp Plant (FACTORY, APAC) 0.3145 ████████ - S001 ChipTech Factory (FACTORY, APAC) 0.3145 ████████ - S012 SiliconWorks Factory (FACTORY, APAC) 0.2456 ██████ - S013 ScreenTech Plant (FACTORY, APAC) 0.2456 ██████ - S014 EnergyPlus Facility (FACTORY, APAC) 0.2280 ██████ - - ────────────────────────────────────────────────────────────────── - S004 TechAssembly is the convergence point: every APAC component - factory ships into it before finished goods radiate to DCs. - S003 PowerCell shows up here AND in Stage 2 — structural and - behavioural risk overlap on the same supplier. - ────────────────────────────────────────────────────────────────── - -✓ Site.centrality written back, normalized to [0, 1] - (S004 = 1.000, S006 = 0.776, S003 = 0.735, ...) -``` - ---- - -## Stage 2 — Rules: supplier risk classification - -> `/rai-rules-authoring` "Rate each supplier's delivery reliability. Flag any with reliability score below 0.80 as unreliable, any with a Q1 delay prediction above 0.15 as high-delay-risk, and call them 'watch-level' if either fires. Suppliers with **both** flags are 'avoid' (hard-blocked downstream); suppliers with **either** flag are 'watch' (surcharged)." 
- -**Late-shipment context** (computed in pandas, not RAI): - -``` -Late shipments: 37 of 262 (14%) - - B006 West Coast DC 7 late ████████ - B007 East Coast DC 5 late ██████ - B004 TechAssembly 4 late █████ - B022 EMEA DC Central 3 late ████ - B003 PowerCell 2 late ███ ← also high predicted Q1 - B017 CellChem 2 late ███ ← also low reliability -``` - -**Properties added to the ontology** (via `model.where(...).define(...)`): - -```python -# Rule 1 — reliability gate -m.where(Business.reliability_score < 0.80).define(Business.is_unreliable()) - -# Rule 2 — ML delay-risk gate (Q1-2025 GNN predictions) -m.where( - DelayPrediction.supplier_business(Business), - DelayPrediction.fiscal_quarter == "Q1-2025", - DelayPrediction.predicted_delay_prob > 0.15, -).define(Business.has_high_delay_risk()) - -# Rule 3 — union into watch level (rule chaining) -m.where(Business.is_unreliable()).define(Business.is_watch_level()) -m.where(Business.has_high_delay_risk()).define(Business.is_watch_level()) - -# Rule 4 — escalate HIGH-priority demands -m.where(Demand.priority == "HIGH").define(Demand.is_escalated()) -``` - -**Resulting classification** (Python combines the two RAI flags into avoid/watch/reliable): - -``` -Supplier risk classification - - [X] B017 CellChem China reliability 0.78 Q1 delay 0.22 ── AVOID - [!] 
B003 PowerCell Ltd reliability 0.81 Q1 delay 0.28 ── WATCH - [ ] B005 GlobalBuild Inc reliability 0.85 ── reliable - [ ] B014 EnergyPlus India reliability 0.85 ── reliable - [ ] B024 EuroAssembly Pol reliability 0.87 ── reliable - [ ] B020 EuroCell Poland reliability 0.88 ── reliable - [ ] B002 DisplayCorp reliability 0.88 ── reliable - [ ] B016 GlassCorp Korea reliability 0.89 ── reliable - [ ] B004 TechAssembly Co reliability 0.90 ── reliable - [ ] B019 EuroDisplay Czech reliability 0.91 ── reliable - [ ] B012 SiliconWorks Corp reliability 0.91 ── reliable - [ ] B013 ScreenTech Japan reliability 0.93 ── reliable - [ ] B018 EuroChip Germany reliability 0.94 ── reliable - [ ] B001 ChipTech Industries reliability 0.95 ── reliable - [ ] B015 WaferTech Taiwan reliability 0.97 ── reliable - - AVOID = is_unreliable AND has_high_delay_risk (both flags) - WATCH = is_unreliable OR has_high_delay_risk (one flag) - -Escalated demands (HIGH priority): 9 ── all from B008 / B009 - -✓ Business.is_unreliable [1 supplier: B017] -✓ Business.has_high_delay_risk [2 suppliers: B003, B017] -✓ Business.is_watch_level [2 suppliers: B003, B017] -✓ Demand.is_escalated [9 demands] -``` - -Stage 3 reads `is_watch_level` for the surcharge term and `is_unreliable AND has_high_delay_risk` (collapsed to the `avoid` set) for the hard block. - ---- - -## Stage 3 — Prescriptive: risk-adjusted minimum-cost flow - -> `/rai-prescriptive-problem-formulation` "Solve a minimum-cost flow that fulfills all open demand orders at minimum total transport cost. Hard-block 'avoid' suppliers, surcharge 'watch' suppliers $5/unit, weight bottleneck sites by their centrality, and penalize unmet demand at $100/unit." - -``` -FORMULATION - - Decision variables - Operation.x_flow (continuous, 70 ops, 0 ≤ x ≤ capacity_per_day) - Demand.x_unmet (continuous slack, 20 demands, ≥ 0) - - Constraints - 1. Demand satisfaction - Σ x_flow into customer-site for the demanded SKU + x_unmet ≥ quantity - 2. 
Avoid suppliers blocked - For every operation sourced from B017 (CellChem): x_flow == 0 - 3. (Scenario only) Site offline / extra blocks - - Objective (minimize) - Σ Operation.cost_per_unit · x_flow ── transport - + RISK_SURCHARGE · Σ x_flow on watch-supplier ops ── Stage 2 - + CENTRALITY_WEIGHT · Σ x_flow · Site.centrality ── Stage 1 - + UNMET_PENALTY · Σ x_unmet ── slack - - Tunables: UNMET_PENALTY=100, RISK_SURCHARGE=5, CENTRALITY_WEIGHT=2, - DELAY_PROB_THRESHOLD=0.15, RELIABILITY_THRESHOLD=0.80, - PREDICTION_QUARTER="Q1-2025" - -────────────────────────────────────────────────────────────────────── -SOLVE (HiGHS) → OPTIMAL 8 active flows $1,865.00 0 unmet -────────────────────────────────────────────────────────────────────── - -✓ Operation.x_flow / Demand.x_unmet written back as model properties. -``` - -The baseline buys: enough finished-goods flow on the shortest cost-weighted lanes to cover all 20 demand orders in full. CellChem's operations (B017) are hard-blocked. PowerCell (B003) operations carry a +5/unit surcharge — the optimizer accepts a small amount of B003-sourced flow only when no cheaper non-watch alternative exists. - -### Reading the solve - -- **8 active flows from 70 candidate operations** — the network is sparse at optimum; most capacity is idle. -- **$1,865 total cost** vs. unconstrained transport cost would be ~$1,500 — the centrality and watch-surcharge terms together add ~$365. -- **Zero unmet demand** — capacity is plentiful, so the slack term is inactive at baseline. - ---- - -## Scenario analysis — quantify disruption - -> `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Re-solve with the highest-centrality site offline, and again with watch-level suppliers downgraded to avoid. What's the cost delta in each, and why are they asymmetric?" - -The same `solve_flow(...)` function re-runs with modified constraints. 
Two scenarios surface different aspects of the chain's value: - -``` -SCENARIO COMPARISON - - Scenario Status Cost Δ vs baseline Unmet - ──────────────────────── ──────── ────────── ────────────── ───── - Baseline OPTIMAL $1,865.00 — 0 - Site S004 offline OPTIMAL $3,515.00 +88.5% 0 - Watch->Avoid OPTIMAL $1,865.00 +0.0% 0 -``` - -**Scenario A — top-centrality site offline (S004 TechAssembly).** The optimizer reroutes finished goods through S005 GlobalBuild Plant (Mexico) and longer EMEA lanes; cost jumps 88.5% but all demand is still covered. This is the **structural-risk** signal: losing the highest-centrality node forces expensive secondary routing. - -**Scenario B — downgrade all watch suppliers to avoid.** Adds B003 PowerCell to the hard-block set. **Cost is unchanged.** Why? B003 wasn't on any optimal lane — the centrality penalty + risk surcharge already discouraged the optimizer from routing through it at baseline. **This asymmetry is the punchline:** structural risk (Stage 1) costs 88.5% to disrupt; behavioural risk on already-deprioritized suppliers (Stage 2) costs 0% to harden against. The chain reveals which mitigations actually move the needle. - -(Cross-check the Stage 0 dependency map: B003 wasn't in either HIGH-priority customer's *direct* upstream — it ships components to manufacturers, who then route via the optimizer's preferred S004/S005 corridor. Reachability surfaced the supplier; centrality + objective weights ensured baseline never relied on it.) - ---- - -## The chain — accretive ontology enrichment - -``` -THE SUPPLY-CHAIN-RESILIENCE CHAIN - - STAGE 0 REACHABILITY (directed Business graph) - "Which suppliers do my high-priority customers transitively depend on?" 
- reads: Shipment.supplier / .customer ──► Business.ships_to (derived) - Demand.priority == "HIGH" ──► Business.is_high_priority_customer - writes: Business.is_high_priority_customer ── 2 buyers - │ - ▼ - STAGE 1 GRAPH (eigenvector centrality, WCC) - "Which sites are network bottlenecks?" - reads: Operation (op_type == "SHIP"), Site - writes: Site.centrality ── normalized [0,1] per site - │ - ▼ - STAGE 2 RULES (chained derivations) - "Which suppliers are risky, and which demands are escalated?" - reads: Business.reliability_score, DelayPrediction, Demand.priority - writes: Business.is_unreliable ── 1 supplier - Business.has_high_delay_risk ── 2 suppliers - Business.is_watch_level ── 2 suppliers - Demand.is_escalated ── 9 demands - │ - ▼ - STAGE 3 PRESCRIPTIVE (HiGHS LP) - "What's the minimum-cost flow plan that respects all of the above?" - reads: Site.centrality ──► objective coefficient (penalty) - Business.is_watch_level ──► objective coefficient (surcharge) - {avoid suppliers} ──► hard block (x_flow == 0) - Operation cost / capacity / SKU, Demand quantity / SKU / business - writes: Operation.x_flow ── 8 active flows - Demand.x_unmet ── 0 across all 20 demands - │ - ▼ - Re-solve per scenario (S004 offline, watch->avoid) - → cost-of-disruption table - - ────────────────────────────────────────────────────────────────── - No glue. No DataFrame ping-pong. No re-derivation per-reasoner. - Four stages, one ontology, one accretive thread. - ────────────────────────────────────────────────────────────────── -``` - ---- - -## Why the chain matters (vs. 
any single stage) - -| Stage alone | What it tells you | What it doesn't | -|---|---|---| -| Reachability alone | "These 6 suppliers feed my critical customers" | Which are risky; which the optimizer would have used anyway | -| Graph alone | "S004 is the central hub" | Whether losing it is recoverable; at what cost | -| Rules alone | "B017 avoid, B003 watch" | Whether routing actually depends on them | -| Prescriptive alone | (degenerate — no risk filter, no bottleneck weight) | Picks cheapest lanes regardless of supplier risk or hub fragility | - -| Combined | Output | -|---|---| -| Reachability -> Graph | Customer-supplier dependency map + structural hub ranking | -| + Rules | Per-supplier risk class (avoid / watch / reliable) on top of the dependency map | -| + Prescriptive | Risk-adjusted min-cost flow ($1,865) + scenario deltas (+88.5% on S004 offline, +0% on watch->avoid) | - -**Multi-reasoner chaining grounded in (and contributing to) the ontology.** - ---- - -## Optional extension — predictive forecasting - -The bundled `delay_prediction.csv` is treated as already-trained ML output (a quarterly per-supplier delay probability table, `model_version='gnn_v2.0'`). A natural extension to the template is to replace the static CSV with a live GNN that retrains on `Shipment.delay_days` history. Skill: `/rai-predictive-modeling` + `/rai-predictive-training`. Reference templates with end-to-end GNN training: `templates/v1/fraud-detection`, `templates/v1/retail_planning`. The downstream stages (rules + prescriptive) wouldn't change — they read `DelayPrediction.predicted_delay_prob` either way. - ---- - -## Agent prompt sequence — recreate this template skill-by-skill - -Each row is a single agent prompt. Skills are loaded in order; each writes properties the next stage reads. 
- -| # | Skill | Prompt | What it produces | -|---|-------|--------|------------------| -| 1 | `/rai-build-starter-ontology` | "Build a starter ontology for a supply-chain dataset with 7 CSVs: site, business, operation, sku, demand, shipment, delay_prediction. Use Site/Business/Operation/SKU/Demand/Shipment/DelayPrediction as concepts. Render the result as an ASCII concept-relationship diagram." | Concepts, properties, relationships matching `supply_chain_resilience.py` lines 57–250 | -| 2 | `/rai-discovery` | "Given this ontology, what questions can each reasoner family answer? Group by graph / rules / prescriptive." | A reasoner-routing plan that covers Stages 0–3 below | -| 3 | `/rai-graph-analysis` | "Build a directed Business graph from Shipment.supplier->Shipment.customer. Run upstream reachability from every Business with a HIGH-priority demand. List the suppliers each high-priority customer transitively depends on." | Stage 0 — `Business.is_high_priority_customer`, blast-radius dependency map | -| 4 | `/rai-graph-analysis` | "Build an undirected Site graph from Operation rows where op_type == 'SHIP'. Compute weakly-connected components, then eigenvector centrality (filter to FACTORY/DC). Normalize and write the score back as Site.centrality." | Stage 1 — `Site.centrality` | -| 5 | `/rai-rules-authoring` | "Define three derived relationships on Business: is_unreliable (reliability_score < 0.80), has_high_delay_risk (any DelayPrediction for Q1-2025 with predicted_delay_prob > 0.15), and is_watch_level (union of the two). Also flag Demand.is_escalated for HIGH priority. Print the avoid (both flags) / watch (one flag) / reliable classification." | Stage 2 — risk flags, escalation flag | -| 6 | `/rai-prescriptive-problem-formulation` | "Formulate a minimum-cost network flow on Operation.x_flow with Demand.x_unmet slack. Constraint: inbound flow at customer site for demanded SKU + slack >= quantity. Hard-block operations sourced from avoid-tier businesses. 
Objective: transport cost + RISK_SURCHARGE * watch-supplier flow + CENTRALITY_WEIGHT * Σ flow · Site.centrality + UNMET_PENALTY * Σ unmet. Describe the formulation before solving — decision variables, constraints, objective, tunables." | Stage 3 formulation | -| 7 | `/rai-prescriptive-solver-management` | "Solve with HiGHS, time limit 120s. Report status, objective value, count of active flows, and total unmet demand." | Baseline solve — OPTIMAL, $1,865, 8 flows, 0 unmet | -| 8 | `/rai-prescriptive-results-interpretation` | "Re-solve two scenarios side-by-side: (a) top-centrality site offline (S004 TechAssembly), (b) all watch-level suppliers downgraded to avoid. Show cost delta vs baseline as a table. Explain why the deltas are asymmetric." | Scenario table + interpretation (S004 offline = +88.5%, watch->avoid = +0%) | - ---- - -## Data Reference - -- **Source data**: bundled CSVs in `../data/` — 31 sites across APAC / AMERICAS / EMEA, 31 businesses (6 suppliers, 6 component manufacturers, 2 manufacturers, 8 warehouses, 9 buyers), 9 SKUs (raw materials -> components -> finished goods ProPhone X1 / ProTab T1), 70 operations (SHIP + TRANSFER), 20 demand orders (9 HIGH, 5 MEDIUM, 6 LOW), 262 historical shipments (37 late), 36 quarterly delay predictions (4 quarters × 9 suppliers). To run against your own Snowflake schema instead, swap `read_csv(...)` for typed `model.Table(...)` loads against the equivalent table set. -- **Ontology**: defined inline in `../supply_chain_resilience.py` (lines 57–250) — 7 concepts plus the derived `Business.ships_to` and `Operation.source_business` relationships. -- **Stages**: implemented in `../supply_chain_resilience.py` as a single combined script with stage banners (`STAGE 0` through `STAGE 3` plus `SCENARIO ANALYSIS`). - ---- - -## Adapting this recipe to a new domain +## Workflow -The chain pattern transfers cleanly. 
To rebuild for a different problem: +| # | Skill + prompt | Expected output | +|---|----------------|-----------------| +| 0 | `/rai-discovery` "We need a risk-adjusted routing plan. What's our exposure to each supplier, which sites are bottlenecks, which suppliers are unreliable, and what does the minimum-cost flow look like once those risks are priced in?" | Reasoner-routing plan covering Stages 0–3 (graph, rules, prescriptive). | +| 1 | `/rai-graph-analysis` "If a key supplier goes offline, which downstream buyers and finished products are at risk? For each HIGH-priority customer, list the suppliers it transitively depends on through the shipment graph, with their reliability scores." | `Business.is_high_priority_customer` (2 buyers: B008, B009); shared 6-supplier upstream cone. | +| 2 | `/rai-graph-analysis` "Which sites are the most influential hubs in the supply network — sites that connect to other influential sites, not just sites with many direct connections? Persist the centrality score back to each site so the optimizer can use it as a bottleneck weight." | `Site.centrality` normalized [0,1]: S004=1.000, S006=0.776, S003=0.735; 2 weakly-connected components. | +| 3 | `/rai-rules-authoring` "Rate each supplier's delivery reliability. Flag any with reliability score below 0.80 as unreliable, any with a Q1 delay prediction above 0.15 as high-delay-risk, and call them 'watch-level' if either fires. Suppliers with **both** flags are 'avoid' (hard-blocked downstream); suppliers with **either** flag are 'watch' (surcharged)." | `is_unreliable` (1: B017), `has_high_delay_risk` (2: B003, B017), `is_watch_level` (2), `Demand.is_escalated` (9). | +| 4 | `/rai-prescriptive-problem-formulation` "Solve a minimum-cost flow that fulfills all open demand orders at minimum total transport cost. Hard-block 'avoid' suppliers, surcharge 'watch' suppliers $5/unit, weight bottleneck sites by their centrality, and penalize unmet demand at $100/unit." 
| Min-cost flow LP on `Operation.x_flow` + `Demand.x_unmet`; objective = transport + risk surcharge + centrality weight + unmet penalty. | +| 5 | `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Re-solve with the highest-centrality site offline, and again with watch-level suppliers downgraded to avoid. What's the cost delta in each, and why are they asymmetric?" | Baseline OPTIMAL $1,865 / 8 flows / 0 unmet; S004 offline +88.5%; watch->avoid +0.0% (B003 already off optimal lanes). |
-1. Re-run `rai-discovery` on the new business question — does it actually need all four reasoner families, or is one or two sufficient? Reachability + prescriptive alone is a viable shortcut if you don't need a centrality-based bottleneck weight or a per-entity risk classifier. -2. Strip the demo ontology to the concepts the new chain needs (lean is better for type inference and solver compile time). The load-bearing concepts here are the node concept of each graph (`Business`, `Site`), the entity that carries risk flags (`Business`), and the entity that carries decision variables (`Operation`, `Demand`). -3. Stage 0 (reachability) is *optional context* — it scopes the conversation around the disruption table without changing any constraint. Keep it when stakeholders need to see the dependency cone; drop it when the optimizer's hard blocks already make exposure obvious. -4. Stages 1–3 are the load-bearing chain: graph centrality writes a per-site bottleneck weight, rules write per-supplier risk flags, and prescriptive consumes both — centrality as an objective coefficient, the avoid set as a hard block, the watch set as a surcharge. The scenario re-solves are cheap once the formulation exists. -5.
Keep the validation checks at every stage: assert the source-business derivation populates, the centrality top-N looks plausible, the avoid/watch sets are non-empty (or non-empty by design), the baseline objective is not zero, and the disruption deltas have the sign you expected. +## Data -The shape this template demonstrates — *each reasoner writes a property the next reasoner reads* — is what makes the chain accretive rather than serial. The agent skills are how you reliably author each link. +Bundled CSVs in `../data/`: 31 sites (APAC/AMERICAS/EMEA), 31 businesses, 9 SKUs, 70 operations, 20 demand orders, 262 shipments (37 late), 36 quarterly delay predictions. Combined script with stage banners: `../supply_chain_resilience.py`. diff --git a/v1/telco_network_recovery/references/runbook.md b/v1/telco_network_recovery/references/runbook.md index 8cb676e..39b28ac 100644 --- a/v1/telco_network_recovery/references/runbook.md +++ b/v1/telco_network_recovery/references/runbook.md @@ -1,21 +1,8 @@ # Runbook: Telco WEST Recovery — Multi-Reasoner Walkthrough -Walk-through of the chained-reasoner pattern this template is built on. One realistic business thread — **WEST region recovery** — traced across all five RAI reasoner families, each stage writing properties back to the same ontology that downstream stages consume. +A regional telco is bleeding $791K/quarter from WEST while every other region grows. No single reasoner can answer where to spend a $5M recovery budget: descriptive scopes the crisis, rules flag broken towers, graph weights them by social blast radius, predictive forecasts forward demand, and prescriptive composes all three signals into the upgrade plan. Each stage writes derived properties back to the same ontology that downstream stages consume. 
-The template's combined script (`telco_network_recovery.py`) implements the predictive, rules, graph, and prescriptive stages directly; this runbook frames them with a descriptive Stage 1 diagnosis and a Stage 6 interpretation, so a non-OR reader can follow the full reasoning thread end-to-end. - ---- - -## How to read this runbook - -This runbook serves two audiences: - -- **Reading top-to-bottom**: the narrative + ASCII visualizations show what the chain produces stage-by-stage, with the same business framing the stakeholder would see. -- **Per-stage skill blocks**: the boxed `Skill / Prompt` callout at the start of each stage is the recipe — load that RAI agent skill, give it that prompt against the bundled demo data in `../data/`, and the agent will reproduce the stage. - ---- - -## TL;DR — the chain in one screen +## The chain ``` WEST is bleeding $791K/quarter from a network operations crisis. @@ -48,450 +35,18 @@ across all 15 critical towers, prioritized by social blast radius. ───────────────────────────────────────────────────────────────── ``` -A single-reasoner approach can't answer this. Descriptive alone tells the story but no plan. Rules alone identifies broken towers but not which matter most. Graph alone ranks subscribers but doesn't decide. Predictive alone forecasts but doesn't act. Prescriptive alone has no way to weigh "important" without rules + graph upstream. - ---- - -## Step 0 — Scope the question with `rai-discovery` - -> `/rai-discovery` "WEST is missing revenue while every other region grows. What questions do we need to answer to figure out where to spend $5M to fix it?" - -Discovery classifies the question by reasoner family and tells you which downstream skills to load: - -| Sub-question | Reasoner | Skill | -|---|---|---| -| Where is the revenue bleed coming from, and is it retention or operational? | Descriptive | `rai-querying` | -| Which WEST towers are technically broken enough to be in scope for upgrade? 
| Rules | `rai-rules-authoring` | -| Whose service depends on each broken tower — what is the social blast radius? | Graph (PageRank) | `rai-graph-analysis` | -| Is regional demand growing or contracting at those sites over the forward horizon? | Predictive (GNN) | _no public skill yet — see `v1/subscriber_retention/` and `v1/demand_forecasting/` as worked-example references_ | -| Given the cost cap and install-crew budget, which tier should each critical tower receive? | Prescriptive (MIP) | `rai-prescriptive-problem-formulation` | -| Which constraint is binding, and what would change if we relaxed it? | Prescriptive (post-solve) | `rai-prescriptive-results-interpretation` | - -Discovery's output is a *plan*, not code. Everything that follows materializes that plan. - ---- - -## Setup - -See the template's main `README.md` for installation, RAI connection setup, and how to run the script. The narrative below follows the actual stage outputs of `telco_network_recovery.py`. - ---- - -## Stage 1 — Descriptive: diagnose WEST - -> `/rai-querying` "Where are we missing revenue targets? Which 10 cell towers have the worst average packet loss over 2024, and which region has the worst Q4 network availability?" - -``` -Q4 2024 — Daily KPIs by region - - avg avail % avg churn rate avg daily revenue - ────────────── ──────────────── ────────────────── - WEST 94.60 ███ 0.0256 █████████ $72,558 ███████ - CENTRAL 99.52 ████████ 0.0046 █ $101,517 ██████████ - SOUTH 99.53 ████████ 0.0049 █ $102,030 ██████████ - EAST 99.55 ████████ 0.0049 █ $103,840 ██████████ - NORTH 99.56 ████████ 0.0050 █ $103,810 ██████████ - NORTHEAST 99.56 ████████ 0.0050 █ $99,569 ██████████ - SOUTHEAST 99.58 ████████ 0.0050 █ $100,429 ██████████ - NORTHWEST 99.54 ████████ 0.0049 █ $100,995 ██████████ - SOUTHWEST 99.55 ████████ 0.0049 █ $101,716 ██████████ - ▲ - WEST is 5× worse on availability AND churn, │ - ~30% lower on daily revenue. 
│ - │ - Q3-Q4 monthly revenue gaps stack to $791K: ──────┘ - - Sep 2024 forecast $1.40M → actual $1.04M ── −25.9% (−$362K) - Oct 2024 forecast $0.94M → actual $0.70M ── −25.1% (−$236K) - Nov 2024 forecast $0.88M → actual $0.68M ── −22.0% (−$193K) - - WEST tower fleet (81 towers) - - ACTIVE ████████████████████████ 49 - DEGRADED ███████ 15 ← 15 critical_restore - MAINTENANCE ████████ 17 - - Top performance offenders (NetworkPerformance, all WEST DEGRADED): - TWR-0015 ── 8.87% loss 190ms (424 measurements) - TWR-0014 ── 8.75% loss 189ms (273 measurements) - TWR-0010 ── 8.73% loss 188ms (429 measurements) - (… all 15 cluster 8.1-8.9% loss / 185-200ms latency) - ──────────────────────────────────────── - Healthy WEST towers sit at <1% loss / ~30ms latency. - - Retention angle? Zero subs in collections, zero with churn_risk ≥ 0.5. - High-risk subscribers concentrate in SOUTH/NORTH/CENTRAL, not WEST. - This is a NETWORK-side crisis, not subscriber retention. -``` - -Subscriber.churn_risk_score is a static feature that hasn't caught up to WEST's empirical churn (TimeSeriesMetric.churn_rate of 2.6% is 5x other regions). The thread focuses on infrastructure recovery. - ---- - -## Stage 2 — Rules: flag critical_restore towers - -> `/rai-rules-authoring` "Flag CellTowers as 'critical-restore' if region is WEST AND status is DEGRADED AND avg equipment health is below 0.85, OR if avg packet loss > 5% with health below 0.85." 
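Stripped of the ontology machinery, the two branches of that rule are boolean conjunctions over per-tower aggregates. A standalone sketch (thresholds from the prompt; the first tower dict mirrors TWR-0010's aggregates, but both records are illustrative, not the bundled data loader):

```python
# Plain-Python sketch of the two critical-restore branches described above.
# This is an illustration, not the template's RAI rule code.
def is_critical_restore(tower):
    in_west = tower["region"] == "WEST"
    unhealthy = tower["avg_health_score"] < 0.85  # equipment health, 0-1 scale
    branch1 = in_west and tower["status"] == "DEGRADED" and unhealthy
    branch2 = in_west and tower["avg_packet_loss"] > 5.0 and unhealthy
    return branch1 or branch2

degraded = {"region": "WEST", "status": "DEGRADED",
            "avg_health_score": 0.48, "avg_packet_loss": 8.73}
healthy = {"region": "WEST", "status": "ACTIVE",
           "avg_health_score": 0.95, "avg_packet_loss": 0.8}
print(is_critical_restore(degraded))  # True  (branch 1; branch 2 also fires)
print(is_critical_restore(healthy))   # False (neither branch fires)
```

The ontology version that follows wires the same thresholds through `model.where(...).define(...)`, so `is_critical_restore` becomes a queryable property downstream stages can read.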
- -**Properties added to the ontology** (via `model.define(...)`): -- `CellTower.avg_packet_loss` (Float) — `aggs.avg(NetworkPerformance.packet_loss_pct).per(CellTower)` -- `CellTower.avg_latency_ms` (Float) -- `CellTower.avg_error_rate` (Float) -- `CellTower.avg_health_score` (Float) — across attached equipment via two-hop join - -**Rule** — `CellTower.is_critical_restore` (unary `Relationship`): - -```python -# Branch 1: WEST + DEGRADED + degraded equipment health -m.where( - CellTower.region == "WEST", - CellTower.status == "DEGRADED", - CellTower.avg_health_score < 0.85, # health is 0-1 scale -).define(CellTower.is_critical_restore()) - -# Branch 2: WEST + poor performance + degraded health (catches ACTIVE-but-failing) -m.where( - CellTower.region == "WEST", - CellTower.avg_packet_loss > 5.0, - CellTower.avg_health_score < 0.85, -).define(CellTower.is_critical_restore()) -``` - -``` -RULE FIRES → 15 towers flagged is_critical_restore (all WEST DEGRADED) - - TWR-0010 ── health 0.48 loss 8.73% cap_gbps 18 ███ - TWR-0015 ── health 0.60 loss 8.87% cap_gbps 60 ██████ - TWR-0009 ── health 0.62 loss 8.49% cap_gbps 17 ███ - TWR-0012 ── health 0.63 loss 8.59% cap_gbps 84 █████████ - TWR-0008 ── health 0.64 loss 8.51% cap_gbps 71 ████████ - TWR-0014 ── health 0.66 loss 8.75% cap_gbps 36 █████ - TWR-0002 ── health 0.66 loss 8.56% cap_gbps 17 ███ - TWR-0001 ── health 0.66 loss 8.56% cap_gbps 31 █████ - TWR-0011 ── health 0.66 loss 8.68% cap_gbps 61 ███████ - TWR-0005 ── health 0.68 loss 8.12% cap_gbps 94 ██████████ - TWR-0003 ── health 0.69 loss 8.45% cap_gbps 43 █████ - TWR-0007 ── health 0.72 loss 8.54% cap_gbps 17 ███ - TWR-0013 ── health 0.77 loss 8.62% cap_gbps 94 ██████████ - TWR-0006 ── health 0.78 loss 8.51% cap_gbps 37 █████ - TWR-0004 ── health 0.81 loss 8.54% cap_gbps 64 ███████ - -✓ is_critical_restore written back to CellTower (15 rows) -✓ avg_packet_loss / avg_latency_ms / avg_error_rate / avg_health_score - written back to all CellTowers (250 rows) -``` - 
-Branch 2 didn't fire — none of WEST's ACTIVE towers fall below health 0.85. The 15 flagged are exactly the WEST DEGRADED set, all with packet loss 8.1–8.9% and latency 185–200ms — sharp gap from the rest of WEST. - ---- - -## Stage 3 — Graph: subscriber influence + tower blast radius - -> `/rai-graph-analysis` "Who are our most socially influential subscribers based on call patterns? For each critical-restore tower, count the distinct subscribers whose calls route through it and rank by total PageRank influence — that's the blast radius if it fails." - -**Construction** — Pattern 3 (`edge_concept`): -- Node concept: `Subscriber` (1,200 nodes) -- Edge concept: `CallDetailRecord`, with `caller`→`callee` (directed) -- Aggregator: `"sum"` (collapse parallel calls between the same pair) - -**Algorithm:** `pagerank()` (default for directed graphs). - -``` -PageRank — top 10 social influencers (of 1,200 subscribers) - - SUB-CON-00900 CONSUMER $3,793 LTV ████████████ 0.002963 - SUB-CON-00723 CONSUMER $3,049 LTV ████████████ 0.002956 - SUB-CON-00262 CONSUMER $3,764 LTV ███████████ 0.002790 - SUB-CON-00274 CONSUMER $2,850 LTV ███████████ 0.002695 - SUB-ENT-0038 ENTERPRISE $283,233 LTV ██████████ 0.002637 ★ - SUB-CON-00705 CONSUMER $765 LTV ██████████ 0.002599 - SUB-CON-00393 CONSUMER $3,219 LTV ██████████ 0.002581 - SUB-ENT-0001 ENTERPRISE $393,340 LTV ██████████ 0.002575 ★ - SUB-CON-01066 CONSUMER $1,146 LTV ██████████ 0.002570 - SUB-CON-00762 CONSUMER $307 LTV ██████████ 0.002525 - - ★ Top enterprise accounts also rank — heavy inbound call traffic. - PageRank captures structural influence independent of LTV. 
- -Per-critical-tower blast radius (sorted by weighted_impact) - - TWR-0014 61 subs ████████████ 0.0502 ← largest social footprint - TWR-0008 56 subs ██████████ 0.0430 - TWR-0011 48 subs ██████████ 0.0428 - TWR-0012 50 subs █████████ 0.0394 - TWR-0003 43 subs █████████ 0.0393 - TWR-0013 46 subs █████████ 0.0379 - TWR-0004 46 subs █████████ 0.0378 - TWR-0010 48 subs █████████ 0.0375 - TWR-0015 45 subs ████████ 0.0361 - TWR-0002 46 subs ████████ 0.0331 - TWR-0007 44 subs ████████ 0.0330 - TWR-0005 45 subs ████████ 0.0330 - TWR-0009 44 subs ████████ 0.0330 - TWR-0001 41 subs ████████ 0.0322 - TWR-0006 41 subs ████████ 0.0316 - - ────────────────────────────────────────────────────────────────── - 404 distinct subscribers (33% of the 1,200-sub base) route at least - one call through a critical WEST tower. TWR-0014's failure ripples - to 61 subs whose combined social influence is highest. - ────────────────────────────────────────────────────────────────── - -✓ Subscriber.influence_score written back to all 1,200 subscribers -✓ CellTower.impact_count + weighted_impact written back to CellTower -``` - ---- - -## Stage 4 — Predictive: forecast WEST capacity demand - -> `/rai-predictive-modeling` + `/rai-predictive-training` "Predict next-quarter subscriber-growth-rate per region using TimeSeriesMetric history (subscriber_growth_rate, churn_rate, marketing_spend, network_availability_pct, total_calls). Bind each region's forecast back to its towers as a demand multiplier." - -**Method:** GNN node regression on `TimeSeriesMetric` (composite key `metric_date` + `region`). Target: `subscriber_growth_rate`. Features: the other 12 daily KPIs + 3 lag features (`prev_day_growth`, `prev_week_growth`, `growth_7d_mean`) + `region` as a category. Graph: same-region 1-day-lag temporal edges. Train < 2024-11-01 (includes the Sep–Oct WEST decline onset); validate on Nov 2024; test on Dec 2024. 
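The three lag features named above can be computed without the template's pandas pipeline; a pure-Python sketch over one region's daily series (the values are illustrative, not the bundled TimeSeriesMetric data):

```python
from statistics import mean

# One region's daily subscriber-growth series, oldest first (illustrative).
history = [0.006, 0.007, 0.005, 0.008, 0.007, 0.006, 0.009, 0.007]

def lag_features(series, i):
    """prev-day, prev-week, and trailing 7-day-mean features for day i."""
    prev_day = series[i - 1] if i >= 1 else None
    prev_week = series[i - 7] if i >= 7 else None
    window = series[max(0, i - 7):i]          # strictly before day i
    mean_7d = mean(window) if window else None
    return prev_day, prev_week, mean_7d

prev_day, prev_week, mean_7d = lag_features(history, 7)
print(prev_day, prev_week, round(mean_7d, 4))  # 0.009 0.006 0.0069
```

The same shifted lookups, applied per region, are what the `LAG()` window functions mentioned below would produce on a Snowflake copy of the table.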
- -``` -Per-region GNN-predicted subscriber-growth-rate (Dec 2024 test horizon) - - CENTRAL ───── +0.0075 ████████ ▲ - EAST ───── +0.0073 ████████ │ - NORTH ───── +0.0071 ████████ │ 8 regions cluster - NORTHEAST ───── +0.0070 ████████ │ +0.59 to +0.75%/day - NORTHWEST ───── +0.0067 ████████ │ (mean predicted growth) - SOUTH ───── +0.0065 ████████ │ - SOUTHEAST ───── +0.0063 ███████ │ - SOUTHWEST ───── +0.0059 ███████ ▼ - ▲ - │ - WEST ───── −0.0071 ← anomaly: contracting - multiplier 0.993× - - ────────────────────────────────────────────────────────────────── - WEST projection: 0.7% demand decline over the test horizon. - Stage 5 picks up this multiplier as the 3rd objective coefficient. - ────────────────────────────────────────────────────────────────── - -✓ CellTower.projected_demand_growth written back to all 15 critical towers - (uniform 0.992871 — regional, not per-tower) -``` - -**Stage 5 objective with the predictive term:** - -``` -objective = sum( selected[t,tier] * - capacity_increase_gbps[t,tier] * - weighted_impact[t] * - projected_demand_growth[t] ) # ← Stage 4 contribution -``` - -**Snowflake setup for the GNN:** the template's main script computes lag features (prev-day, prev-week, 7-day mean) and same-region 1-day-lag temporal edges in pandas before loading, so no extra Snowflake DDL is required. To run on your own Snowflake schema instead of the bundled CSV, the equivalent SQL would be a typed copy of the time-series table plus per-region `LAG()` window functions. - -**Caveats:** -- The GNN was tuned for a single 80-epoch run with seed-42 reproducibility; production deployment would expand to a multi-seed average + a held-out holdout window. -- The WEST projection partially encodes the same network-degradation pattern Stages 2/3 flagged ("things have gotten worse and we expect them to keep getting worse if we don't act"). For an independent baseline, train on a pre-degradation slice (H1 2024 only) and compare. 
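Structurally, the Stage 5 objective shown above is a pick-at-most-one-tier-per-tower selection under a budget cap. Its shape can be seen in a brute-force toy before reaching for a MIP solver (two towers, made-up capacities, costs, and impacts; none of these numbers come from the template):

```python
from itertools import product

# Toy tier-selection instance: per-tower {tier: (capacity_gbps, cost)}.
# All numbers are invented for illustration; the real model has 15 towers,
# 45 binaries, and an install-weeks constraint on top of the budget.
options = {"T1": {"BRONZE": (3, 100), "SILVER": (5, 180), "GOLD": (9, 300)},
           "T2": {"BRONZE": (3, 90), "SILVER": (4, 150), "GOLD": (8, 280)}}
impact = {"T1": 0.05, "T2": 0.03}   # stands in for weighted_impact
growth = 0.993                      # stands in for projected_demand_growth
BUDGET = 400

best = (None, -1.0)
for choice in product([None, "BRONZE", "SILVER", "GOLD"], repeat=2):
    cost = value = 0.0
    for tower, tier in zip(options, choice):  # at most one tier per tower
        if tier is None:
            continue
        cap, tier_cost = options[tower][tier]
        cost += tier_cost
        value += cap * impact[tower] * growth  # Stage 5 objective term
    if cost <= BUDGET and value > best[1]:
        best = (choice, value)

print(best[0])  # ('GOLD', 'BRONZE'): the cheaper tier lands on the low-impact tower
```

Enumeration stops scaling almost immediately (4^15 combinations for the real instance), which is why the template hands the formulation to Gurobi.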
-
----
-## Stage 5 — Prescriptive: tower upgrade selection MIP
-
-> `/rai-prescriptive-problem-formulation` "Recover WEST capacity within $5M and 200 install-weeks, prioritizing towers by social blast radius and forward-looking demand. From TowerUpgradeOption, pick at most one upgrade tier (BRONZE/SILVER/GOLD) per critical-restore tower, maximizing Σ capacity_increase × weighted_impact × projected_demand_growth."
-
-```
-FORMULATION
-
-  Decision variable
-    TowerUpgradeOption.selected (binary)
-    45 binaries = 15 critical-restore towers × {BRONZE, SILVER, GOLD}
-
-  Constraints
-    1. At-most-one tier per tower   sum(selected).per(CellTower) ≤ 1
-    2. Total cost                   Σ selected · cost ≤ $5,000,000
-    3. Total install_weeks          Σ selected · install_weeks ≤ 200
-
-  Objective (maximize)
-    Σ selected · capacity_increase_gbps · weighted_impact · projected_demand_growth
-      └────── Stage 2 (rules) ────┘└── Stage 3 ──┘└── Stage 4 ──┘
-
-──────────────────────────────────────────────────────────────────────
-SOLVE (Gurobi)  →  OPTIMAL   15 upgrades selected   122 Gbps   $4,956,843
-──────────────────────────────────────────────────────────────────────
-
-Tower-tier assignment (sorted by weighted_impact)
-
-  TWR-0014 ── GOLD    +6 Gbps   $350,864   wgt 0.0502  ████████████
-  TWR-0008 ── GOLD   +10 Gbps   $416,455   wgt 0.0430  ██████████
-  TWR-0011 ── GOLD    +9 Gbps   $481,914   wgt 0.0428  ██████████
-  TWR-0012 ── GOLD    +8 Gbps   $445,825   wgt 0.0394  █████████
-  TWR-0003 ── GOLD   +11 Gbps   $360,785   wgt 0.0393  █████████
-  TWR-0013 ── GOLD    +9 Gbps   $273,831   wgt 0.0379  █████████
-  TWR-0004 ── GOLD    +9 Gbps   $275,353   wgt 0.0378  █████████
-  TWR-0010 ── GOLD   +12 Gbps   $332,694   wgt 0.0375  █████████
-  TWR-0015 ── GOLD   +11 Gbps   $438,932   wgt 0.0361  ████████
-  TWR-0002 ── GOLD   +11 Gbps   $420,363   wgt 0.0331  ████████
-  TWR-0007 ── GOLD    +9 Gbps   $416,640   wgt 0.0330  ████████
-  TWR-0005 ── SILVER  +3 Gbps   $220,435   wgt 0.0330  ████████ ⚐
-  TWR-0009 ── BRONZE  +3 Gbps    $97,784   wgt 0.0330  ████████ ⚐
-  TWR-0001 ── GOLD    +6 Gbps   $274,561   wgt 0.0322  ████████
-  TWR-0006 ── SILVER  +5 Gbps   $150,407   wgt 0.0316  ████████ ⚐
-
-  ⚐ Lowest weighted_impact towers — solver buys cheaper tiers
-    to free budget for the higher-impact GOLDs.
-
-Budget gauge
-  Cost        ████████████████████████████████████████ $4,956,843 / $5,000,000 ── BINDING
-  Install-wks █████████████████████████████████        164 / 200 (slack: 36)
-
-Headline metrics
-  Capacity restored:  122 Gbps
-  Tier mix:           12 GOLD · 2 SILVER · 1 BRONZE
-  Towers covered:     15 of 15 (no triage tradeoff)
-  Subs serviced:      404 distinct (33% of all 1,200)
-  Objective without Stage 4 (Σ capacity × weighted_impact): 4.6024
-  Objective with Stage 4 (× 0.992871 uniform multiplier):   4.5696
-
-✓ TowerUpgradeOption.selected written back — the optimization output
-  is now a queryable property of the model.
-```
-
-(Full decision matrix: `outputs/stage5_solution.csv`.)
-
-### Reading the solve
-
-- **GOLD dominates** (12/15) — for towers with high social blast radius, GOLD's 6–12 Gbps uplift outweighs its higher cost.
-- **Budget is binding** ($4.96M / $5M) — relaxing to $6M would let TWR-0009 jump from BRONZE to GOLD ($481K → +9 Gbps) and lift the objective meaningfully.
-- **Install-weeks are not binding** (164/200) — schedule is the looser constraint; budget holds back the plan.
-- **Stage 4's uniform multiplier doesn't shift tiers** — the forecast says WEST is contracting (-0.7%), so every upgrade is slightly less valuable in absolute terms, but relative tower priority is unchanged. **A non-uniform forecast would be the more revealing test of the chain's value** — if some WEST towers sat in growth pockets and others in decline, the tier mix would shift accordingly.
-
----
-## Stage 6 — Interpretation
-
-> `/rai-prescriptive-results-interpretation` "Summarize the plan: total cost, capacity restored, tier mix, towers covered. Which constraint is binding, and what would relaxing it by 10-20% unlock?"
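At 45 binaries the Stage 5 instance needs a real MIP solver, but the selection logic is small enough to sanity-check by brute force first. A minimal sketch on a hypothetical three-tower instance (made-up capacities, costs, weights, and caps, not the template's data):

```python
from itertools import product

# Hypothetical mini-instance: (capacity_gbps, cost_usd, install_weeks) per tier.
options = {
    "TWR-A": {"BRONZE": (3, 100_000, 4), "SILVER": (5, 180_000, 6), "GOLD": (9, 320_000, 9)},
    "TWR-B": {"BRONZE": (2, 90_000, 4), "SILVER": (4, 160_000, 6), "GOLD": (8, 300_000, 9)},
    "TWR-C": {"BRONZE": (3, 110_000, 4), "SILVER": (6, 200_000, 6), "GOLD": (10, 340_000, 9)},
}
weighted_impact = {"TWR-A": 0.050, "TWR-B": 0.043, "TWR-C": 0.038}
growth = 0.993                      # uniform demand multiplier, as in Stage 4
BUDGET, WEEKS = 700_000, 20

towers = list(options)
best, best_plan = -1.0, None
# None encodes "no upgrade", which gives the at-most-one-tier-per-tower rule.
for tiers in product([None, "BRONZE", "SILVER", "GOLD"], repeat=len(towers)):
    picks = [(t, tier) for t, tier in zip(towers, tiers) if tier is not None]
    cost = sum(options[t][tier][1] for t, tier in picks)
    weeks = sum(options[t][tier][2] for t, tier in picks)
    if cost > BUDGET or weeks > WEEKS:
        continue                    # violates budget or install-weeks cap
    obj = sum(options[t][tier][0] * weighted_impact[t] * growth for t, tier in picks)
    if obj > best:
        best, best_plan = obj, dict(picks)

print(best_plan)  # → {'TWR-A': 'GOLD', 'TWR-C': 'GOLD'} under these caps
```

The three sums are exactly the MIP's feasibility and objective logic; only the enumeration is replaced by branch-and-bound at real scale.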
-
-```
-THE PLAN, IN BUSINESS TERMS
-
-  • 122 Gbps of network capacity restored across all 15 critical towers
-    within the $5M capex budget.
-  • Every WEST DEGRADED tower gets an upgrade — no triage tradeoff.
-  • Service-affected subscribers drop from 404 to ~0 over the install
-    schedule (164 crew-weeks; 4-month rollout at 2 crews of 5).
-  • Budget binding — if CFO can flex to $6M, promote TWR-0009 to GOLD
-    for +9 Gbps marginal lift.
-
-ONTOLOGY ENRICHMENT — what each stage contributed back
-
-  Stage 2 (rules)         CellTower.is_critical_restore     [15]
-                          CellTower.avg_packet_loss         [250]
-                          CellTower.avg_latency_ms          [250]
-                          CellTower.avg_error_rate          [250]
-                          CellTower.avg_health_score        [250]
-
-  Stage 3 (graph)         Subscriber.influence_score        [1,200]
-                          CellTower.impact_count            [120]
-                          CellTower.weighted_impact         [120]
-
-  Stage 4 (predictive)    CellTower.projected_demand_growth [15]
-
-  Stage 5 (prescriptive)  TowerUpgradeOption.selected       [45]
-
-  ──────────────────────────────────────────────────────────────────
-  Each stage reads what the previous stage wrote.
-  Re-running any downstream stage automatically picks up enrichments.
-  No glue code, no DataFrame round-trip — same ontology throughout.
-  ──────────────────────────────────────────────────────────────────
-```
-
----
-## The chain — accretive ontology enrichment
-
-```
-THE WEST RECOVERY CHAIN
-
-  STAGE 1  DESCRIPTIVE
-  "Where is the bleed coming from?"
-  reads:  RevenueForecast, TimeSeriesMetric, NetworkPerformance, CellTower
-  writes: (situational summary — no ontology mutation)
-      │
-      ▼
-  STAGE 2  RULES
-  "Which towers are critical to restore?"
-  reads:  NetworkPerformance, EquipmentHealth, NetworkEquipment, CellTower
-  writes: CellTower.is_critical_restore ── 15 towers flagged
-          CellTower.avg_packet_loss / latency_ms / error_rate / health_score
-      │
-      ▼
-  STAGE 3  GRAPH (PageRank)
-  "Whose service depends on these towers — and who is socially central?"
-  reads:  CallDetailRecord (caller→callee), CDR.routed_through(CellTower)
-  writes: Subscriber.influence_score ── per subscriber
-          CellTower.impact_count     ── distinct subs served
-          CellTower.weighted_impact  ── Σ subscriber influence
-      │
-      ▼
-  STAGE 4  PREDICTIVE (GNN node regression)
-  "What does the forecast say about future demand?"
-  reads:  TimeSeriesMetric.subscriber_growth_rate × 365d × 9 regions
-          + 12 daily KPIs + 3 lag features + same-region temporal edges
-  writes: CellTower.projected_demand_growth ── per critical tower
-      │
-      ▼
-  STAGE 5  PRESCRIPTIVE (Gurobi MIP)
-  "What's the optimal $5M tier-selection plan?"
-  reads:  CellTower.is_critical_restore     ──► decision-variable scope
-          CellTower.weighted_impact         ──► objective coefficient
-          CellTower.projected_demand_growth ──► objective coefficient
-          TowerUpgradeOption.cost / capacity_increase / install_weeks
-  writes: TowerUpgradeOption.selected ── 15 upgrades chosen
-      │
-      ▼
-  Actionable plan,
-  grounded end-to-end in the same ontology.
-
-  ──────────────────────────────────────────────────────────────────
-  No glue. No DataFrame ping-pong. No re-derivation per-reasoner.
-  Five reasoners, one ontology, one accretive thread.
-  ──────────────────────────────────────────────────────────────────
-```
-
----
-## Why the chain matters (vs. any single stage)
-
-| Stage alone | What it tells you | What it doesn't |
-|---|---|---|
-| Descriptive | "WEST is broken" | Which towers, how to fix |
-| Rules alone | "These 15 towers are critical" | Which matter most; what to do |
-| Graph alone | "These subscribers are influential" | Which towers serve them |
-| Predictive alone | "WEST demand is contracting" | Where to spend the recovery budget |
-| Prescriptive alone | (won't run — no flagged set, no impact weights, no forecast) | Everything — the rest of the pipeline is missing |
-
-| Combined | Output |
-|---|---|
-| Descriptive → Rules | Crisis scoped + critical-tower set flagged |
-| + Graph | Each flagged tower scored by social blast radius |
-| + Predictive | Forward-looking demand multiplier per tower |
-| + Prescriptive | $5M plan, 122 Gbps, all 15 covered, prioritized by social impact |
-
-**Multi-reasoner chaining grounded in (and contributing to) the ontology.**
-
----
-## Adapting this recipe to a new domain
-
-The chain pattern transfers cleanly. To rebuild for a different problem:
-
-1. Re-run `rai-discovery` on the new business question — does it actually need all 5 reasoner families, or are one or two sufficient?
-2. Strip the demo ontology to the concepts the new chain needs (lean is better for type inference and solver compile time).
-3. Stage 1 (descriptive triage) is *optional but high-leverage*: it scopes the problem and rules out a misdiagnosis (e.g., is this a retention crisis or a network crisis?) before any rule, graph, GNN, or solver runs.
-4. Stages 2–5 are the load-bearing chain: rules write the flag that scopes graph aggregations and the solver's decision variables; graph writes the per-entity impact weight that becomes a solver objective coefficient; predictive writes the forward-looking multiplier that becomes the second objective coefficient; prescriptive composes both upstream signals into the final plan.
-5. Keep the validation checks at every stage: assert flagged-set size, PageRank top-N looks plausible, the GNN forecast separates the anomalous segment from the rest, the solve status is OPTIMAL, the objective is not zero, and at least one constraint is binding (otherwise you're under-constrained).
-
-The shape this template demonstrates — *each reasoner writes a property the next reasoner reads* — is what makes the chain accretive rather than serial. The agent skills are how you reliably author each link.
+## Workflow
 
----
+| # | Skill + prompt | Expected output |
+|---|----------------|-----------------|
+| 0 | `/rai-discovery` "WEST is missing revenue while every other region grows. What questions do we need to answer to figure out where to spend $5M to fix it?" | Routes sub-questions to descriptive (revenue diagnosis), rules (critical-tower flag), graph (PageRank blast radius), predictive (demand forecast), prescriptive (tier-selection MIP + post-solve interpretation). |
+| 1 | `/rai-querying` "Where are we missing revenue targets? Which 10 cell towers have the worst average packet loss over 2024, and which region has the worst Q4 network availability?" | WEST avail 94.6 vs 99.5 elsewhere; Q3-Q4 revenue −22% to −26% ($791K gap); 15 of 81 WEST towers DEGRADED at 8.1–8.9% packet loss; zero high-risk subs — operational, not retention. |
+| 2 | `/rai-rules-authoring` "Flag CellTowers as 'critical-restore' if region is WEST AND status is DEGRADED AND avg equipment health is below 0.85, OR if avg packet loss > 5% with health below 0.85." | `CellTower.is_critical_restore` fires on 15 WEST DEGRADED towers; 4 derived health metrics (`avg_packet_loss`, `avg_latency_ms`, `avg_error_rate`, `avg_health_score`) written to all 250 towers. |
+| 3 | `/rai-graph-analysis` "Who are our most socially influential subscribers based on call patterns? For each critical-restore tower, count the distinct subscribers whose calls route through it and rank by total PageRank influence — that's the blast radius if it fails." | `Subscriber.influence_score` (PageRank) on all 1,200 subs; `CellTower.weighted_impact` on 15 critical towers; 404 distinct subs (33% of base) route through a critical tower; TWR-0014 has the largest footprint (61 subs, 0.0502). |
+| 4 | `/rai-predictive-modeling` + `/rai-predictive-training` "Predict next-quarter subscriber-growth-rate per region using TimeSeriesMetric history. Bind each region's forecast back to its towers as a demand multiplier." | GNN node regression on 365d × 9 regions with 1-day-lag temporal edges; WEST multiplier 0.993× (contracting ~0.7%); 8 other regions +0.59 to +0.75%/day; written to `CellTower.projected_demand_growth` for 15 critical towers. |
+| 5 | `/rai-prescriptive-problem-formulation` "Recover WEST capacity within $5M and 200 install-weeks, prioritizing towers by social blast radius and forward-looking demand. From TowerUpgradeOption, pick at most one upgrade tier (BRONZE/SILVER/GOLD) per critical-restore tower, maximizing Σ capacity_increase × weighted_impact × projected_demand_growth." | OPTIMAL · 12 GOLD / 2 SILVER / 1 BRONZE · 122 Gbps restored · $4.96M of $5M (binding) · 164 of 200 install-weeks (slack) · all 15 towers covered. |
+| 6 | `/rai-prescriptive-results-interpretation` "Summarize the plan: total cost, capacity restored, tier mix, towers covered. Which constraint is binding, and what would relaxing it by 10-20% unlock?" | Budget binds at $4.96M/$5M; flexing to $6M would promote TWR-0009 BRONZE→GOLD (+9 Gbps); install-weeks have 36-week slack; 404 service-affected subs drop to ~0 over the 4-month rollout. |
 
-## Data Reference
+## Data
 
-- **Source data**: bundled CSVs in `../data/` (the main template ships ~1.2 MB of synthetic-but-realistic telco data — 250 cell towers, 1,200 subscribers, 6,000 CDRs, 3,285 daily KPI rows across 9 regions).
-- **Ontology**: the template's main script uses a focused 7-concept subset of a broader 18-concept telco knowledge graph (PostalArea, Subscriber, Contract, BillingEvent, CellTower, NetworkEquipment, EquipmentHealth, NetworkEvent, CallDetailRecord, SupplierOrder, Campaign, PromotionRedemption, RevenueForecast, NetworkPerformance, SupportTicket, TimeSeriesMetric, TowerUpgradeOption, Part) — sufficient for the five-stage chain.
-- **Stages**: implemented in `../telco_network_recovery.py` as a single combined script with stage banners.
+Bundled CSVs in `../data/`: 250 cell towers (15 WEST DEGRADED), 1,200 subscribers, 6,000 directed CDRs, ~5,000 NetworkPerformance measurements, 544 NetworkEquipment + EquipmentHealth rows, 360 TowerUpgradeOptions (3 tiers × 120 in-scope towers), 3,285 daily KPI rows (365 days × 9 regions). All stages run end-to-end via `../telco_network_recovery.py`.

From 95277d58820b97e23bbba63c15d58682e5bf4772 Mon Sep 17 00:00:00 2001
From: cafzal
Date: Wed, 6 May 2026 09:28:58 -0700
Subject: [PATCH 06/22] Convert workflow tables to bullets for easier copy

Bullet format gives each prompt its own line so users can triple-click
to select and copy. Expected output renders as a paragraph under each
bullet. No content changes, just structural.
---
 v1/energy_grid_planning/references/runbook.md | 32 ++++++++++++----
 v1/machine_maintenance/references/runbook.md  | 32 ++++++++++++----
 v1/portfolio_balancing/references/runbook.md  | 32 ++++++++++++----
 .../references/runbook.md                     | 32 ++++++++++++----
 .../references/runbook.md                     | 37 ++++++++++++++-----
 5 files changed, 124 insertions(+), 41 deletions(-)

diff --git a/v1/energy_grid_planning/references/runbook.md b/v1/energy_grid_planning/references/runbook.md
index 87242eb..20443d5 100644
--- a/v1/energy_grid_planning/references/runbook.md
+++ b/v1/energy_grid_planning/references/runbook.md
@@ -38,14 +38,30 @@ $300M unlocks 5 DCs (1,500 MW, $264M net value) including xAI Colossus.
 ## Workflow
 
-| # | Skill + prompt | Expected output |
-|---|----------------|-----------------|
-| 0 | `/rai-discovery` "We have 10 hyperscaler interconnection requests against a 12-substation grid. Which to approve, which substation upgrades to fund, at what budget level?" | Plan routing sub-questions to predictive, graph, rules, and prescriptive reasoners. |
-| 1 | `/rai-predictive-modeling` + `/rai-predictive-training` "Can we forecast substation load growth over the next 36 months based on historical demand, planned generator additions, and the data center request pipeline? Bind each substation's predicted peak load back to the ontology so the rules engine and optimizer can read it." | `Substation.predicted_load` for all 12; DFW breaches at 1,700 MW vs 1,600 MW cap at 24 months (+54.6%). |
-| 2 | `/rai-graph-analysis` "Which substations are most critical to power flow based on grid topology? Use centrality on the transmission graph, then flag the top 3 as structurally critical and persist the scores back to the ontology." | 1 component, 3 Louvain communities; DFW, Houston, San Antonio flagged `is_structurally_critical`; 7 of 10 DC requests on critical nodes. |
-| 3 | `/rai-rules-authoring` "Screen each data center request against three criteria: (1) substation must have enough capacity after predicted load, (2) if 100% low-carbon required, region must have 25%+ renewable, (3) substation shouldn't be most structurally critical. Which requests pass all three?" | `fails_capacity` / `fails_structural` / `fails_low_carbon` + `is_compliant`; 2 pass (Crusoe, Oracle), 8 flagged. |
-| 4 | `/rai-prescriptive-problem-formulation` "Decide which data center requests to approve and which substation upgrades to fund at $200M, $300M, $400M, $500M, and $600M investment levels. Maximize annual revenue. A request can only be approved if its substation has enough capacity after upgrades." | OPTIMAL MIP across 5 `InvestmentLevel` values in one solve; `x_approve` and `x_upgrade` written back per level. |
-| 5 | `/rai-prescriptive-results-interpretation` "Which data centers get approved, which upgrades are selected, and where's the biggest return on investment at each budget level?" | Pareto frontier with knee at $300M (5 DCs, 1,500 MW, $264M net); marginal $995K/$M at knee, declining to $400K/$M by $600M; Google + Lambda never approved (DFW full). |
+- `/rai-discovery` "We have 10 hyperscaler interconnection requests against a 12-substation grid. Which to approve, which substation upgrades to fund, at what budget level?"
+
+  Plan routing sub-questions to predictive, graph, rules, and prescriptive reasoners.
+
+- `/rai-predictive-modeling` + `/rai-predictive-training` "Can we forecast substation load growth over the next 36 months based on historical demand, planned generator additions, and the data center request pipeline? Bind each substation's predicted peak load back to the ontology so the rules engine and optimizer can read it."
+
+  `Substation.predicted_load` for all 12; DFW breaches at 1,700 MW vs 1,600 MW cap at 24 months (+54.6%).
+
+- `/rai-graph-analysis` "Which substations are most critical to power flow based on grid topology? Use centrality on the transmission graph, then flag the top 3 as structurally critical and persist the scores back to the ontology."
+
+  1 component, 3 Louvain communities; DFW, Houston, San Antonio flagged `is_structurally_critical`; 7 of 10 DC requests on critical nodes.
+
+- `/rai-rules-authoring` "Screen each data center request against three criteria: (1) substation must have enough capacity after predicted load, (2) if 100% low-carbon required, region must have 25%+ renewable, (3) substation shouldn't be most structurally critical. Which requests pass all three?"
+
+  `fails_capacity` / `fails_structural` / `fails_low_carbon` + `is_compliant`; 2 pass (Crusoe, Oracle), 8 flagged.
+
+- `/rai-prescriptive-problem-formulation` "Decide which data center requests to approve and which substation upgrades to fund at $200M, $300M, $400M, $500M, and $600M investment levels. Maximize annual revenue. A request can only be approved if its substation has enough capacity after upgrades."
+
+  OPTIMAL MIP across 5 `InvestmentLevel` values in one solve; `x_approve` and `x_upgrade` written back per level.
+
+- `/rai-prescriptive-results-interpretation` "Which data centers get approved, which upgrades are selected, and where's the biggest return on investment at each budget level?"
+
+  Pareto frontier with knee at $300M (5 DCs, 1,500 MW, $264M net); marginal $995K/$M at knee, declining to $400K/$M by $600M; Google + Lambda never approved (DFW full).
+
 
 ## Data
 
diff --git a/v1/machine_maintenance/references/runbook.md b/v1/machine_maintenance/references/runbook.md
index d4efa03..c059acd 100644
--- a/v1/machine_maintenance/references/runbook.md
+++ b/v1/machine_maintenance/references/runbook.md
@@ -41,14 +41,30 @@ all 3 Turbine techs sit in one city, a $3,200 fix away from resolved.
 
 ## Workflow
 
-| # | Skill + prompt | Expected output |
-|---|----------------|-----------------|
-| 0 | `/rai-discovery` "We need to schedule preventive maintenance for 30 machines across 3 plants. Where does OEE alone mislead us, and what structural risks won't a pure optimizer surface?" | Plan routing sub-questions to querying, graph, rules, prescriptive, and resilience skills. |
-| 1 | `/rai-querying` "What's the OEE by plant? Which machines have the most sensor anomalies, and which are most likely to fail by the end of the planning horizon?" | Plant_C 79.8% > Plant_A 68.2% > Plant_B 61.4%; 7 of 9 anomalies at Plant_A; `MachinePeriod.predicted_fp` written for 120 rows. |
-| 2 | `/rai-graph-analysis` "Which machines share qualified technicians, and which are bottlenecks in the qualification network? Compute centrality and write it back to each machine so the optimizer can weight critical machines." | 30 machines → 1 connected component; Pumps tie at top betweenness (24.0 raw, 1.0 normalized); `Machine.betweenness` stored. |
-| 3 | `/rai-rules-authoring` "Rate each machine's risk: chronic if >8 downtime events, high-risk if failure prob >0.3 AND criticality 4+, plus overdue for maintenance. All three flags = Critical, two = Elevated, otherwise Standard." | 6 overdue, 1 high-risk, 3 chronic; M013 (Pump, Plant_A) = Critical; M016 (Turbine, Plant_A) = Elevated. |
-| 4 | `/rai-prescriptive-problem-formulation` "Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and Turbines need an on-site qualified technician. Minimize expected failure cost weighted by criticality and centrality, plus labor and travel." | 120 `x_maintain` + 120 `x_vulnerable` + ~250 `x_assigned` binaries; 5 constraint families; failure cost uses `predicted_fp × criticality × (1 + 2.0 × betweenness)`. |
-| 5 | `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Solve with HiGHS, then for each machine type check whether all qualified techs sit in one location and recommend the cheapest cross-training fix." | OPTIMAL · 20 jobs · $605,241; Turbine concentrated in Houston_TX (67% of jobs travel); cross-train T006 (Chicago_IL, Senior) for $3,200 / 5 weeks. |
+- `/rai-discovery` "We need to schedule preventive maintenance for 30 machines across 3 plants. Where does OEE alone mislead us, and what structural risks won't a pure optimizer surface?"
+
+  Plan routing sub-questions to querying, graph, rules, prescriptive, and resilience skills.
+
+- `/rai-querying` "What's the OEE by plant? Which machines have the most sensor anomalies, and which are most likely to fail by the end of the planning horizon?"
+
+  Plant_C 79.8% > Plant_A 68.2% > Plant_B 61.4%; 7 of 9 anomalies at Plant_A; `MachinePeriod.predicted_fp` written for 120 rows.
+
+- `/rai-graph-analysis` "Which machines share qualified technicians, and which are bottlenecks in the qualification network? Compute centrality and write it back to each machine so the optimizer can weight critical machines."
+
+  30 machines → 1 connected component; Pumps tie at top betweenness (24.0 raw, 1.0 normalized); `Machine.betweenness` stored.
+
+- `/rai-rules-authoring` "Rate each machine's risk: chronic if >8 downtime events, high-risk if failure prob >0.3 AND criticality 4+, plus overdue for maintenance. All three flags = Critical, two = Elevated, otherwise Standard."
+
+  6 overdue, 1 high-risk, 3 chronic; M013 (Pump, Plant_A) = Critical; M016 (Turbine, Plant_A) = Elevated.
+
+- `/rai-prescriptive-problem-formulation` "Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and Turbines need an on-site qualified technician. Minimize expected failure cost weighted by criticality and centrality, plus labor and travel."
+
+  120 `x_maintain` + 120 `x_vulnerable` + ~250 `x_assigned` binaries; 5 constraint families; failure cost uses `predicted_fp × criticality × (1 + 2.0 × betweenness)`.
+
+- `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Solve with HiGHS, then for each machine type check whether all qualified techs sit in one location and recommend the cheapest cross-training fix."
+
+  OPTIMAL · 20 jobs · $605,241; Turbine concentrated in Houston_TX (67% of jobs travel); cross-train T006 (Chicago_IL, Senior) for $3,200 / 5 weeks.
+
 
 ## Data
 
diff --git a/v1/portfolio_balancing/references/runbook.md b/v1/portfolio_balancing/references/runbook.md
index 4a89259..636145f 100644
--- a/v1/portfolio_balancing/references/runbook.md
+++ b/v1/portfolio_balancing/references/runbook.md
@@ -37,14 +37,30 @@ base at every lambda — without the cluster collapse, the gap would grow.
 
 ## Workflow
 
-| # | Skill + prompt | Expected output |
-|---|----------------|-----------------|
-| 0 | `/rai-discovery` "Our 8-stock book breaks compliance and concentrates risk. Rebuild it under Markowitz mean-variance with caps, deduplicate redundant bets via correlation clustering, and stress-test under crisis. What questions does each reasoner family handle?" | Plan: rules for compliance flags, graph for correlation clustering + representatives, prescriptive QP indexed by Scenario, stress as regime-swap re-solve. |
-| 1 | `/rai-rules-authoring` "Flag any holding worth more than 15% of its account, any sector worth more than 30% of the account, and any user with a risk score above 0.8 and more than five flagged transactions." | 4 holdings flagged (AAPL/MSFT on Account 1, JNJ/PFE on Account 4); 2 (account, sector) pairs flagged (Account 1 Tech 34.0%, Account 4 Healthcare 32.2%); 2 users flagged (Alice Chen 0.85, Eve Taylor 0.92). |
-| 2 | `/rai-graph-analysis` "Cluster stocks by correlation — anything above 0.3 absolute is a redundant bet. Pick one representative per cluster (highest Sharpe ratio) and force the rest to zero in optimization." | 4 edges (|rho| >= 0.3), 5 Louvain clusters, intra +0.683 vs inter +0.131. 5 representatives picked: PFE, GOOGL, JPM, PG, XOM. AAPL/MSFT/JNJ flagged `is_non_representative`. |
-| 3 | `/rai-prescriptive-problem-formulation` "Build a Markowitz mean-variance frontier across 6 scenarios = 3 budgets x 2 regimes. Position cap 30% of budget, sector cap 30%, non-representatives forced to zero. Anchor with min-risk and max-return, then sweep 5 epsilon points across the return range." | 48 decision vars (8 stocks x 6 scenarios), 5 constraint families. Return-rate range [0.0634, 0.0840]. 7 solves x 6 scenarios = 42 `LOCALLY_SOLVED` portfolios via Ipopt. |
-| 4 | `/rai-prescriptive-results-interpretation` "For each scenario, list the seven-point Pareto frontier and find the knee — where does the marginal risk per unit return jump the most?" | base_500 frontier: returns 32.43 -> 40.28, risk 1160 -> 1742. Marginal `delta_risk/delta_return` jumps ~3x at eps_1 (knee). Same shape across all 6 scenarios — risk scales as budget^2, rate-form frontier is budget-independent. |
-| 5 | `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Stress-test the frontier under crisis: shrink correlations toward all-ones with weight 0.7 on base covariance + 0.3 on outer-product. How much volatility expansion at each frontier point?" | Crisis vol +28-30% above base at every lambda (budget 500: min_risk 34.06 -> 43.74, eps_1 34.30 -> 44.54 peak). Gap peaks mid-frontier, narrows to +25.2% at eps_5 — the cluster-collapse payoff. |
+- `/rai-discovery` "Our 8-stock book breaks compliance and concentrates risk. Rebuild it under Markowitz mean-variance with caps, deduplicate redundant bets via correlation clustering, and stress-test under crisis. What questions does each reasoner family handle?"
+
+  Plan: rules for compliance flags, graph for correlation clustering + representatives, prescriptive QP indexed by Scenario, stress as regime-swap re-solve.
+
+- `/rai-rules-authoring` "Flag any holding worth more than 15% of its account, any sector worth more than 30% of the account, and any user with a risk score above 0.8 and more than five flagged transactions."
+
+  4 holdings flagged (AAPL/MSFT on Account 1, JNJ/PFE on Account 4); 2 (account, sector) pairs flagged (Account 1 Tech 34.0%, Account 4 Healthcare 32.2%); 2 users flagged (Alice Chen 0.85, Eve Taylor 0.92).
+
+- `/rai-graph-analysis` "Cluster stocks by correlation — anything above 0.3 absolute is a redundant bet. Pick one representative per cluster (highest Sharpe ratio) and force the rest to zero in optimization."
+
+  4 edges (|rho| >= 0.3), 5 Louvain clusters, intra +0.683 vs inter +0.131. 5 representatives picked: PFE, GOOGL, JPM, PG, XOM. AAPL/MSFT/JNJ flagged `is_non_representative`.
+
+- `/rai-prescriptive-problem-formulation` "Build a Markowitz mean-variance frontier across 6 scenarios = 3 budgets x 2 regimes. Position cap 30% of budget, sector cap 30%, non-representatives forced to zero. Anchor with min-risk and max-return, then sweep 5 epsilon points across the return range."
+
+  48 decision vars (8 stocks x 6 scenarios), 5 constraint families. Return-rate range [0.0634, 0.0840]. 7 solves x 6 scenarios = 42 `LOCALLY_SOLVED` portfolios via Ipopt.
+
+- `/rai-prescriptive-results-interpretation` "For each scenario, list the seven-point Pareto frontier and find the knee — where does the marginal risk per unit return jump the most?"
+
+  base_500 frontier: returns 32.43 -> 40.28, risk 1160 -> 1742. Marginal `delta_risk/delta_return` jumps ~3x at eps_1 (knee). Same shape across all 6 scenarios — risk scales as budget^2, rate-form frontier is budget-independent.
+
+- `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Stress-test the frontier under crisis: shrink correlations toward all-ones with weight 0.7 on base covariance + 0.3 on outer-product. How much volatility expansion at each frontier point?"
+
+  Crisis vol +28-30% above base at every lambda (budget 500: min_risk 34.06 -> 43.74, eps_1 34.30 -> 44.54 peak). Gap peaks mid-frontier, narrows to +25.2% at eps_5 — the cluster-collapse payoff.
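The crisis-regime construction named in the stress-test prompt, a 0.7/0.3 blend of the base covariance with its all-correlations-one limit, can be sketched in a few lines of numpy. Toy numbers only, not the template's estimated covariance:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical base covariance for 4 assets: symmetric positive definite.
A = rng.normal(size=(4, 4))
base_cov = A @ A.T + 4 * np.eye(4)
vols = np.sqrt(np.diag(base_cov))

# Crisis regime: pull correlations toward 1. With unit correlations the
# covariance is exactly the outer product of the per-asset volatilities.
crisis_cov = 0.7 * base_cov + 0.3 * np.outer(vols, vols)

w = np.full(4, 0.25)  # equal-weight toy portfolio
base_vol = float(np.sqrt(w @ base_cov @ w))
crisis_vol = float(np.sqrt(w @ crisis_cov @ w))
print(f"vol expansion: {crisis_vol / base_vol - 1:+.1%}")
```

For long-only weights the crisis volatility can never fall below the base volatility, since raising correlations toward 1 only removes diversification; that is why the re-solve shows an expansion at every frontier point.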
+
 
 ## Data
 
diff --git a/v1/supply_chain_resilience/references/runbook.md b/v1/supply_chain_resilience/references/runbook.md
index c7a68ad..331bd3c 100644
--- a/v1/supply_chain_resilience/references/runbook.md
+++ b/v1/supply_chain_resilience/references/runbook.md
@@ -35,14 +35,30 @@ watch->avoid downgrade = +0.0% (optimizer already routed around it).
 
 ## Workflow
 
-| # | Skill + prompt | Expected output |
-|---|----------------|-----------------|
-| 0 | `/rai-discovery` "We need a risk-adjusted routing plan. What's our exposure to each supplier, which sites are bottlenecks, which suppliers are unreliable, and what does the minimum-cost flow look like once those risks are priced in?" | Reasoner-routing plan covering Stages 0–3 (graph, rules, prescriptive). |
-| 1 | `/rai-graph-analysis` "If a key supplier goes offline, which downstream buyers and finished products are at risk? For each HIGH-priority customer, list the suppliers it transitively depends on through the shipment graph, with their reliability scores." | `Business.is_high_priority_customer` (2 buyers: B008, B009); shared 6-supplier upstream cone. |
-| 2 | `/rai-graph-analysis` "Which sites are the most influential hubs in the supply network — sites that connect to other influential sites, not just sites with many direct connections? Persist the centrality score back to each site so the optimizer can use it as a bottleneck weight." | `Site.centrality` normalized [0,1]: S004=1.000, S006=0.776, S003=0.735; 2 weakly-connected components. |
-| 3 | `/rai-rules-authoring` "Rate each supplier's delivery reliability. Flag any with reliability score below 0.80 as unreliable, any with a Q1 delay prediction above 0.15 as high-delay-risk, and call them 'watch-level' if either fires. Suppliers with **both** flags are 'avoid' (hard-blocked downstream); suppliers with **either** flag are 'watch' (surcharged)." | `is_unreliable` (1: B017), `has_high_delay_risk` (2: B003, B017), `is_watch_level` (2), `Demand.is_escalated` (9). |
-| 4 | `/rai-prescriptive-problem-formulation` "Solve a minimum-cost flow that fulfills all open demand orders at minimum total transport cost. Hard-block 'avoid' suppliers, surcharge 'watch' suppliers $5/unit, weight bottleneck sites by their centrality, and penalize unmet demand at $100/unit." | MILP on `Operation.x_flow` + `Demand.x_unmet`; objective = transport + risk surcharge + centrality weight + unmet penalty. |
-| 5 | `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Re-solve with the highest-centrality site offline, and again with watch-level suppliers downgraded to avoid. What's the cost delta in each, and why are they asymmetric?" | Baseline OPTIMAL $1,865 / 8 flows / 0 unmet; S004 offline +88.5%; watch->avoid +0.0% (B003 already off optimal lanes). |
+- `/rai-discovery` "We need a risk-adjusted routing plan. What's our exposure to each supplier, which sites are bottlenecks, which suppliers are unreliable, and what does the minimum-cost flow look like once those risks are priced in?"
+
+  Reasoner-routing plan covering Stages 0–3 (graph, rules, prescriptive).
+
+- `/rai-graph-analysis` "If a key supplier goes offline, which downstream buyers and finished products are at risk? For each HIGH-priority customer, list the suppliers it transitively depends on through the shipment graph, with their reliability scores."
+
+  `Business.is_high_priority_customer` (2 buyers: B008, B009); shared 6-supplier upstream cone.
+
+- `/rai-graph-analysis` "Which sites are the most influential hubs in the supply network — sites that connect to other influential sites, not just sites with many direct connections? Persist the centrality score back to each site so the optimizer can use it as a bottleneck weight."
+
+  `Site.centrality` normalized [0,1]: S004=1.000, S006=0.776, S003=0.735; 2 weakly-connected components.
+
+- `/rai-rules-authoring` "Rate each supplier's delivery reliability. Flag any with reliability score below 0.80 as unreliable, any with a Q1 delay prediction above 0.15 as high-delay-risk, and call them 'watch-level' if either fires. Suppliers with **both** flags are 'avoid' (hard-blocked downstream); suppliers with **either** flag are 'watch' (surcharged)."
+
+  `is_unreliable` (1: B017), `has_high_delay_risk` (2: B003, B017), `is_watch_level` (2), `Demand.is_escalated` (9).
+
+- `/rai-prescriptive-problem-formulation` "Solve a minimum-cost flow that fulfills all open demand orders at minimum total transport cost. Hard-block 'avoid' suppliers, surcharge 'watch' suppliers $5/unit, weight bottleneck sites by their centrality, and penalize unmet demand at $100/unit."
+
+  MILP on `Operation.x_flow` + `Demand.x_unmet`; objective = transport + risk surcharge + centrality weight + unmet penalty.
+
+- `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Re-solve with the highest-centrality site offline, and again with watch-level suppliers downgraded to avoid. What's the cost delta in each, and why are they asymmetric?"
+
+  Baseline OPTIMAL $1,865 / 8 flows / 0 unmet; S004 offline +88.5%; watch->avoid +0.0% (B003 already off optimal lanes).
+
 
 ## Data
 
diff --git a/v1/telco_network_recovery/references/runbook.md b/v1/telco_network_recovery/references/runbook.md
index 39b28ac..cf88bc2 100644
--- a/v1/telco_network_recovery/references/runbook.md
+++ b/v1/telco_network_recovery/references/runbook.md
@@ -37,15 +37,34 @@ across all 15 critical towers, prioritized by social blast radius.
 
 ## Workflow
 
-| # | Skill + prompt | Expected output |
-|---|----------------|-----------------|
-| 0 | `/rai-discovery` "WEST is missing revenue while every other region grows. What questions do we need to answer to figure out where to spend $5M to fix it?" | Routes sub-questions to descriptive (revenue diagnosis), rules (critical-tower flag), graph (PageRank blast radius), predictive (demand forecast), prescriptive (tier-selection MIP + post-solve interpretation). |
-| 1 | `/rai-querying` "Where are we missing revenue targets? Which 10 cell towers have the worst average packet loss over 2024, and which region has the worst Q4 network availability?" | WEST avail 94.6 vs 99.5 elsewhere; Q3-Q4 revenue −22% to −26% ($791K gap); 15 of 81 WEST towers DEGRADED at 8.1–8.9% packet loss; zero high-risk subs — operational, not retention. |
-| 2 | `/rai-rules-authoring` "Flag CellTowers as 'critical-restore' if region is WEST AND status is DEGRADED AND avg equipment health is below 0.85, OR if avg packet loss > 5% with health below 0.85." | `CellTower.is_critical_restore` fires on 15 WEST DEGRADED towers; 4 derived health metrics (`avg_packet_loss`, `avg_latency_ms`, `avg_error_rate`, `avg_health_score`) written to all 250 towers. |
-| 3 | `/rai-graph-analysis` "Who are our most socially influential subscribers based on call patterns? For each critical-restore tower, count the distinct subscribers whose calls route through it and rank by total PageRank influence — that's the blast radius if it fails." | `Subscriber.influence_score` (PageRank) on all 1,200 subs; `CellTower.weighted_impact` on 15 critical towers; 404 distinct subs (33% of base) route through a critical tower; TWR-0014 has the largest footprint (61 subs, 0.0502). |
-| 4 | `/rai-predictive-modeling` + `/rai-predictive-training` "Predict next-quarter subscriber-growth-rate per region using TimeSeriesMetric history. Bind each region's forecast back to its towers as a demand multiplier." | GNN node regression on 365d × 9 regions with 1-day-lag temporal edges; WEST multiplier 0.993× (contracting ~0.7%); 8 other regions +0.59 to +0.75%/day; written to `CellTower.projected_demand_growth` for 15 critical towers. |
-| 5 | `/rai-prescriptive-problem-formulation` "Recover WEST capacity within $5M and 200 install-weeks, prioritizing towers by social blast radius and forward-looking demand. From TowerUpgradeOption, pick at most one upgrade tier (BRONZE/SILVER/GOLD) per critical-restore tower, maximizing Σ capacity_increase × weighted_impact × projected_demand_growth." | OPTIMAL · 12 GOLD / 2 SILVER / 1 BRONZE · 122 Gbps restored · $4.96M of $5M (binding) · 164 of 200 install-weeks (slack) · all 15 towers covered. |
-| 6 | `/rai-prescriptive-results-interpretation` "Summarize the plan: total cost, capacity restored, tier mix, towers covered. Which constraint is binding, and what would relaxing it by 10-20% unlock?" | Budget binds at $4.96M/$5M; flexing to $6M would promote TWR-0009 BRONZE→GOLD (+9 Gbps); install-weeks have 36-week slack; 404 service-affected subs drop to ~0 over the 4-month rollout. |
+- `/rai-discovery` "WEST is missing revenue while every other region grows. What questions do we need to answer to figure out where to spend $5M to fix it?"
+
+  Routes sub-questions to descriptive (revenue diagnosis), rules (critical-tower flag), graph (PageRank blast radius), predictive (demand forecast), prescriptive (tier-selection MIP + post-solve interpretation).
+
+- `/rai-querying` "Where are we missing revenue targets? Which 10 cell towers have the worst average packet loss over 2024, and which region has the worst Q4 network availability?"
+
+  WEST avail 94.6 vs 99.5 elsewhere; Q3-Q4 revenue −22% to −26% ($791K gap); 15 of 81 WEST towers DEGRADED at 8.1–8.9% packet loss; zero high-risk subs — operational, not retention.
+
+- `/rai-rules-authoring` "Flag CellTowers as 'critical-restore' if region is WEST AND status is DEGRADED AND avg equipment health is below 0.85, OR if avg packet loss > 5% with health below 0.85."
+ + `CellTower.is_critical_restore` fires on 15 WEST DEGRADED towers; 4 derived health metrics (`avg_packet_loss`, `avg_latency_ms`, `avg_error_rate`, `avg_health_score`) written to all 250 towers. + +- `/rai-graph-analysis` "Who are our most socially influential subscribers based on call patterns? For each critical-restore tower, count the distinct subscribers whose calls route through it and rank by total PageRank influence — that's the blast radius if it fails." + + `Subscriber.influence_score` (PageRank) on all 1,200 subs; `CellTower.weighted_impact` on 15 critical towers; 404 distinct subs (33% of base) route through a critical tower; TWR-0014 has the largest footprint (61 subs, 0.0502). + +- `/rai-predictive-modeling` + `/rai-predictive-training` "Predict next-quarter subscriber-growth-rate per region using TimeSeriesMetric history. Bind each region's forecast back to its towers as a demand multiplier." + + GNN node regression on 365d × 9 regions with 1-day-lag temporal edges; WEST multiplier 0.993× (contracting ~0.7%); 8 other regions +0.59 to +0.75%/day; written to `CellTower.projected_demand_growth` for 15 critical towers. + +- `/rai-prescriptive-problem-formulation` "Recover WEST capacity within $5M and 200 install-weeks, prioritizing towers by social blast radius and forward-looking demand. From TowerUpgradeOption, pick at most one upgrade tier (BRONZE/SILVER/GOLD) per critical-restore tower, maximizing Σ capacity_increase × weighted_impact × projected_demand_growth." + + OPTIMAL · 12 GOLD / 2 SILVER / 1 BRONZE · 122 Gbps restored · $4.96M of $5M (binding) · 164 of 200 install-weeks (slack) · all 15 towers covered. + +- `/rai-prescriptive-results-interpretation` "Summarize the plan: total cost, capacity restored, tier mix, towers covered. Which constraint is binding, and what would relaxing it by 10-20% unlock?" 
+ + Budget binds at $4.96M/$5M; flexing to $6M would promote TWR-0009 BRONZE→GOLD (+9 Gbps); install-weeks have 36-week slack; 404 service-affected subs drop to ~0 over the 4-month rollout. + ## Data From 818a4174d3a021d324792a4e533ff011851c8fe5 Mon Sep 17 00:00:00 2001 From: cafzal Date: Wed, 6 May 2026 09:30:54 -0700 Subject: [PATCH 07/22] Reformat workflow steps with descriptive headers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Each step is now `### N. <description>` followed by two bullets: - Prompt: `<prompt>` (in code formatting, no quotes — easy to triple-click and copy as a single agent invocation) - Response: `<response>` Same content, clearer structure for skim + copy. --- v1/energy_grid_planning/references/runbook.md | 31 +++++++++------- v1/machine_maintenance/references/runbook.md | 31 +++++++++------- v1/portfolio_balancing/references/runbook.md | 31 +++++++++------- .../references/runbook.md | 31 +++++++++------- .../references/runbook.md | 36 +++++++++++-------- 5 files changed, 93 insertions(+), 67 deletions(-) diff --git a/v1/energy_grid_planning/references/runbook.md b/v1/energy_grid_planning/references/runbook.md index 20443d5..31984f5 100644 --- a/v1/energy_grid_planning/references/runbook.md +++ b/v1/energy_grid_planning/references/runbook.md @@ -38,30 +38,35 @@ $300M unlocks 5 DCs (1,500 MW, $264M net value) including xAI Colossus. ## Workflow -- `/rai-discovery` "We have 10 hyperscaler interconnection requests against a 12-substation grid. Which to approve, which substation upgrades to fund, at what budget level?" +### 0. Discovery - Plan routing sub-questions to predictive, graph, rules, and prescriptive reasoners. +- Prompt: `/rai-discovery We have 10 hyperscaler interconnection requests against a 12-substation grid. Which to approve, which substation upgrades to fund, at what budget level?` +- Response: Plan routing sub-questions to predictive, graph, rules, and prescriptive reasoners.
-- `/rai-predictive-modeling` + `/rai-predictive-training` "Can we forecast substation load growth over the next 36 months based on historical demand, planned generator additions, and the data center request pipeline? Bind each substation's predicted peak load back to the ontology so the rules engine and optimizer can read it." +### 1. Forecast substation load - `Substation.predicted_load` for all 12; DFW breaches at 1,700 MW vs 1,600 MW cap at 24 months (+54.6%). +- Prompt: `/rai-predictive-modeling + /rai-predictive-training Can we forecast substation load growth over the next 36 months based on historical demand, planned generator additions, and the data center request pipeline? Bind each substation's predicted peak load back to the ontology so the rules engine and optimizer can read it.` +- Response: `Substation.predicted_load` for all 12; DFW breaches at 1,700 MW vs 1,600 MW cap at 24 months (+54.6%). -- `/rai-graph-analysis` "Which substations are most critical to power flow based on grid topology? Use centrality on the transmission graph, then flag the top 3 as structurally critical and persist the scores back to the ontology." +### 2. Find structural bottlenecks - 1 component, 3 Louvain communities; DFW, Houston, San Antonio flagged `is_structurally_critical`; 7 of 10 DC requests on critical nodes. +- Prompt: `/rai-graph-analysis Which substations are most critical to power flow based on grid topology? Use centrality on the transmission graph, then flag the top 3 as structurally critical and persist the scores back to the ontology.` +- Response: 1 component, 3 Louvain communities; DFW, Houston, San Antonio flagged `is_structurally_critical`; 7 of 10 DC requests on critical nodes. -- `/rai-rules-authoring` "Screen each data center request against three criteria: (1) substation must have enough capacity after predicted load, (2) if 100% low-carbon required, region must have 25%+ renewable, (3) substation shouldn't be most structurally critical. 
Which requests pass all three?" +### 3. Screen DC requests - `fails_capacity` / `fails_structural` / `fails_low_carbon` + `is_compliant`; 2 pass (Crusoe, Oracle), 8 flagged. +- Prompt: `/rai-rules-authoring Screen each data center request against three criteria: (1) substation must have enough capacity after predicted load, (2) if 100% low-carbon required, region must have 25%+ renewable, (3) substation shouldn't be most structurally critical. Which requests pass all three?` +- Response: `fails_capacity` / `fails_structural` / `fails_low_carbon` + `is_compliant`; 2 pass (Crusoe, Oracle), 8 flagged. -- `/rai-prescriptive-problem-formulation` "Decide which data center requests to approve and which substation upgrades to fund at $200M, $300M, $400M, $500M, and $600M investment levels. Maximize annual revenue. A request can only be approved if its substation has enough capacity after upgrades." +### 4. Approve DCs and fund upgrades - OPTIMAL MIP across 5 `InvestmentLevel` values in one solve; `x_approve` and `x_upgrade` written back per level. +- Prompt: `/rai-prescriptive-problem-formulation Decide which data center requests to approve and which substation upgrades to fund at $200M, $300M, $400M, $500M, and $600M investment levels. Maximize annual revenue. A request can only be approved if its substation has enough capacity after upgrades.` +- Response: OPTIMAL MIP across 5 `InvestmentLevel` values in one solve; `x_approve` and `x_upgrade` written back per level. -- `/rai-prescriptive-results-interpretation` "Which data centers get approved, which upgrades are selected, and where's the biggest return on investment at each budget level?" - - Pareto frontier with knee at $300M (5 DCs, 1,500 MW, $264M net); marginal $995K/$M at knee, declining to $400K/$M by $600M; Google + Lambda never approved (DFW full). +### 5. 
Read the frontier +- Prompt: `/rai-prescriptive-results-interpretation Which data centers get approved, which upgrades are selected, and where's the biggest return on investment at each budget level?` +- Response: Pareto frontier with knee at $300M (5 DCs, 1,500 MW, $264M net); marginal $995K/$M at knee, declining to $400K/$M by $600M; Google + Lambda never approved (DFW full). ## Data diff --git a/v1/machine_maintenance/references/runbook.md b/v1/machine_maintenance/references/runbook.md index c059acd..b9a52c8 100644 --- a/v1/machine_maintenance/references/runbook.md +++ b/v1/machine_maintenance/references/runbook.md @@ -41,30 +41,35 @@ all 3 Turbine techs sit in one city, a $3,200 fix away from resolved. ## Workflow -- `/rai-discovery` "We need to schedule preventive maintenance for 30 machines across 3 plants. Where does OEE alone mislead us, and what structural risks won't a pure optimizer surface?" +### 0. Discovery - Plan routing sub-questions to querying, graph, rules, prescriptive, and resilience skills. +- Prompt: `/rai-discovery We need to schedule preventive maintenance for 30 machines across 3 plants. Where does OEE alone mislead us, and what structural risks won't a pure optimizer surface?` +- Response: Plan routing sub-questions to querying, graph, rules, prescriptive, and resilience skills. -- `/rai-querying` "What's the OEE by plant? Which machines have the most sensor anomalies, and which are most likely to fail by the end of the planning horizon?" +### 1. Diagnose plant operations - Plant_C 79.8% > Plant_A 68.2% > Plant_B 61.4%; 7 of 9 anomalies at Plant_A; `MachinePeriod.predicted_fp` written for 120 rows. +- Prompt: `/rai-querying What's the OEE by plant? Which machines have the most sensor anomalies, and which are most likely to fail by the end of the planning horizon?` +- Response: Plant_C 79.8% > Plant_A 68.2% > Plant_B 61.4%; 7 of 9 anomalies at Plant_A; `MachinePeriod.predicted_fp` written for 120 rows. 
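The OEE figure that step 1 reports is the standard availability × performance × quality product, averaged per plant. A toy roll-up with invented row values (not the template's MachinePeriod data):

```python
# Hypothetical machine-period rows (values invented for illustration).
rows = [
    {"plant": "Plant_A", "availability": 0.85, "performance": 0.90, "quality": 0.92},
    {"plant": "Plant_A", "availability": 0.78, "performance": 0.88, "quality": 0.95},
    {"plant": "Plant_B", "availability": 0.70, "performance": 0.85, "quality": 0.93},
]

def oee(r):
    # OEE for one row is the product of its three component rates.
    return r["availability"] * r["performance"] * r["quality"]

# Average OEE per plant, the roll-up the querying step reports.
by_plant = {}
for r in rows:
    by_plant.setdefault(r["plant"], []).append(oee(r))
avg_oee = {p: sum(v) / len(v) for p, v in by_plant.items()}
```

Against a Snowflake-backed schema the same roll-up would be a `GROUP BY plant` over the machine-period table.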
-- `/rai-graph-analysis` "Which machines share qualified technicians, and which are bottlenecks in the qualification network? Compute centrality and write it back to each machine so the optimizer can weight critical machines." +### 2. Find scheduling bottlenecks - 30 machines → 1 connected component; Pumps tie at top betweenness (24.0 raw, 1.0 normalized); `Machine.betweenness` stored. +- Prompt: `/rai-graph-analysis Which machines share qualified technicians, and which are bottlenecks in the qualification network? Compute centrality and write it back to each machine so the optimizer can weight critical machines.` +- Response: 30 machines → 1 connected component; Pumps tie at top betweenness (24.0 raw, 1.0 normalized); `Machine.betweenness` stored. -- `/rai-rules-authoring` "Rate each machine's risk: chronic if >8 downtime events, high-risk if failure prob >0.3 AND criticality 4+, plus overdue for maintenance. All three flags = Critical, two = Elevated, otherwise Standard." +### 3. Classify machine risk - 6 overdue, 1 high-risk, 3 chronic; M013 (Pump, Plant_A) = Critical; M016 (Turbine, Plant_A) = Elevated. +- Prompt: `/rai-rules-authoring Rate each machine's risk: chronic if >8 downtime events, high-risk if failure prob >0.3 AND criticality 4+, plus overdue for maintenance. All three flags = Critical, two = Elevated, otherwise Standard.` +- Response: 6 overdue, 1 high-risk, 3 chronic; M013 (Pump, Plant_A) = Critical; M016 (Turbine, Plant_A) = Elevated. -- `/rai-prescriptive-problem-formulation` "Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and Turbines need an on-site qualified technician. Minimize expected failure cost weighted by criticality and centrality, plus labor and travel." +### 4. 
Schedule maintenance - 120 `x_maintain` + 120 `x_vulnerable` + ~250 `x_assigned` binaries; 5 constraint families; failure cost uses `predicted_fp × criticality × (1 + 2.0 × betweenness)`. +- Prompt: `/rai-prescriptive-problem-formulation Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and Turbines need an on-site qualified technician. Minimize expected failure cost weighted by criticality and centrality, plus labor and travel.` +- Response: 120 `x_maintain` + 120 `x_vulnerable` + ~250 `x_assigned` binaries; 5 constraint families; failure cost uses `predicted_fp × criticality × (1 + 2.0 × betweenness)`. -- `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Solve with HiGHS, then for each machine type check whether all qualified techs sit in one location and recommend the cheapest cross-training fix." - - OPTIMAL · 20 jobs · $605,241; Turbine concentrated in Houston_TX (67% of jobs travel); cross-train T006 (Chicago_IL, Senior) for $3,200 / 5 weeks. +### 5. Stress-test concentration +- Prompt: `/rai-prescriptive-solver-management + /rai-prescriptive-results-interpretation Solve with HiGHS, then for each machine type check whether all qualified techs sit in one location and recommend the cheapest cross-training fix.` +- Response: OPTIMAL · 20 jobs · $605,241; Turbine concentrated in Houston_TX (67% of jobs travel); cross-train T006 (Chicago_IL, Senior) for $3,200 / 5 weeks. ## Data diff --git a/v1/portfolio_balancing/references/runbook.md b/v1/portfolio_balancing/references/runbook.md index 636145f..efea678 100644 --- a/v1/portfolio_balancing/references/runbook.md +++ b/v1/portfolio_balancing/references/runbook.md @@ -37,30 +37,35 @@ base at every lambda — without the cluster collapse, the gap would grow. ## Workflow -- `/rai-discovery` "Our 8-stock book breaks compliance and concentrates risk. 
Rebuild it under Markowitz mean-variance with caps, deduplicate redundant bets via correlation clustering, and stress-test under crisis. What questions does each reasoner family handle?" +### 0. Discovery - Plan: rules for compliance flags, graph for correlation clustering + representatives, prescriptive QP indexed by Scenario, stress as regime-swap re-solve. +- Prompt: `/rai-discovery Our 8-stock book breaks compliance and concentrates risk. Rebuild it under Markowitz mean-variance with caps, deduplicate redundant bets via correlation clustering, and stress-test under crisis. What questions does each reasoner family handle?` +- Response: Plan: rules for compliance flags, graph for correlation clustering + representatives, prescriptive QP indexed by Scenario, stress as regime-swap re-solve. -- `/rai-rules-authoring` "Flag any holding worth more than 15% of its account, any sector worth more than 30% of the account, and any user with a risk score above 0.8 and more than five flagged transactions." +### 1. Compliance scan - 4 holdings flagged (AAPL/MSFT on Account 1, JNJ/PFE on Account 4); 2 (account, sector) pairs flagged (Account 1 Tech 34.0%, Account 4 Healthcare 32.2%); 2 users flagged (Alice Chen 0.85, Eve Taylor 0.92). +- Prompt: `/rai-rules-authoring Flag any holding worth more than 15% of its account, any sector worth more than 30% of the account, and any user with a risk score above 0.8 and more than five flagged transactions.` +- Response: 4 holdings flagged (AAPL/MSFT on Account 1, JNJ/PFE on Account 4); 2 (account, sector) pairs flagged (Account 1 Tech 34.0%, Account 4 Healthcare 32.2%); 2 users flagged (Alice Chen 0.85, Eve Taylor 0.92). -- `/rai-graph-analysis` "Cluster stocks by correlation — anything above 0.3 absolute is a redundant bet. Pick one representative per cluster (highest Sharpe ratio) and force the rest to zero in optimization." +### 2. Cluster correlated bets - 4 edges (|rho| >= 0.3), 5 Louvain clusters, intra +0.683 vs inter +0.131. 
5 representatives picked: PFE, GOOGL, JPM, PG, XOM. AAPL/MSFT/JNJ flagged `is_non_representative`. +- Prompt: `/rai-graph-analysis Cluster stocks by correlation — anything above 0.3 absolute is a redundant bet. Pick one representative per cluster (highest Sharpe ratio) and force the rest to zero in optimization.` +- Response: 4 edges (|rho| >= 0.3), 5 Louvain clusters, intra +0.683 vs inter +0.131. 5 representatives picked: PFE, GOOGL, JPM, PG, XOM. AAPL/MSFT/JNJ flagged `is_non_representative`. -- `/rai-prescriptive-problem-formulation` "Build a Markowitz mean-variance frontier across 6 scenarios = 3 budgets x 2 regimes. Position cap 30% of budget, sector cap 30%, non-representatives forced to zero. Anchor with min-risk and max-return, then sweep 5 epsilon points across the return range." +### 3. Solve mean-variance frontier - 48 decision vars (8 stocks x 6 scenarios), 5 constraint families. Return-rate range [0.0634, 0.0840]. 7 solves x 6 scenarios = 42 `LOCALLY_SOLVED` portfolios via Ipopt. +- Prompt: `/rai-prescriptive-problem-formulation Build a Markowitz mean-variance frontier across 6 scenarios = 3 budgets x 2 regimes. Position cap 30% of budget, sector cap 30%, non-representatives forced to zero. Anchor with min-risk and max-return, then sweep 5 epsilon points across the return range.` +- Response: 48 decision vars (8 stocks x 6 scenarios), 5 constraint families. Return-rate range [0.0634, 0.0840]. 7 solves x 6 scenarios = 42 `LOCALLY_SOLVED` portfolios via Ipopt. -- `/rai-prescriptive-results-interpretation` "For each scenario, list the seven-point Pareto frontier and find the knee — where does the marginal risk per unit return jump the most?" +### 4. Read the frontier - base_500 frontier: returns 32.43 -> 40.28, risk 1160 -> 1742. Marginal `delta_risk/delta_return` jumps ~3x at eps_1 (knee). Same shape across all 6 scenarios — risk scales as budget^2, rate-form frontier is budget-independent. 
+- Prompt: `/rai-prescriptive-results-interpretation For each scenario, list the seven-point Pareto frontier and find the knee — where does the marginal risk per unit return jump the most?` +- Response: base_500 frontier: returns 32.43 -> 40.28, risk 1160 -> 1742. Marginal `delta_risk/delta_return` jumps ~3x at eps_1 (knee). Same shape across all 6 scenarios — risk scales as budget^2, rate-form frontier is budget-independent. -- `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Stress-test the frontier under crisis: shrink correlations toward all-ones with weight 0.7 on base covariance + 0.3 on outer-product. How much volatility expansion at each frontier point?" - - Crisis vol +28-30% above base at every lambda (budget 500: min_risk 34.06 -> 43.74, eps_1 34.30 -> 44.54 peak). Gap peaks mid-frontier, narrows to +25.2% at eps_5 — the cluster-collapse payoff. +### 5. Stress under crisis +- Prompt: `/rai-prescriptive-solver-management + /rai-prescriptive-results-interpretation Stress-test the frontier under crisis: shrink correlations toward all-ones with weight 0.7 on base covariance + 0.3 on outer-product. How much volatility expansion at each frontier point?` +- Response: Crisis vol +28-30% above base at every lambda (budget 500: min_risk 34.06 -> 43.74, eps_1 34.30 -> 44.54 peak). Gap peaks mid-frontier, narrows to +25.2% at eps_5 — the cluster-collapse payoff. ## Data diff --git a/v1/supply_chain_resilience/references/runbook.md b/v1/supply_chain_resilience/references/runbook.md index 331bd3c..39b7713 100644 --- a/v1/supply_chain_resilience/references/runbook.md +++ b/v1/supply_chain_resilience/references/runbook.md @@ -35,30 +35,35 @@ watch->avoid downgrade = +0.0% (optimizer already routed around it). ## Workflow -- `/rai-discovery` "We need a risk-adjusted routing plan. 
What's our exposure to each supplier, which sites are bottlenecks, which suppliers are unreliable, and what does the minimum-cost flow look like once those risks are priced in?" +### 0. Discovery - Reasoner-routing plan covering Stages 0–3 (graph, rules, prescriptive). +- Prompt: `/rai-discovery We need a risk-adjusted routing plan. What's our exposure to each supplier, which sites are bottlenecks, which suppliers are unreliable, and what does the minimum-cost flow look like once those risks are priced in?` +- Response: Reasoner-routing plan covering Stages 0–3 (graph, rules, prescriptive). -- `/rai-graph-analysis` "If a key supplier goes offline, which downstream buyers and finished products are at risk? For each HIGH-priority customer, list the suppliers it transitively depends on through the shipment graph, with their reliability scores." +### 1. Map upstream supplier exposure - `Business.is_high_priority_customer` (2 buyers: B008, B009); shared 6-supplier upstream cone. +- Prompt: `/rai-graph-analysis If a key supplier goes offline, which downstream buyers and finished products are at risk? For each HIGH-priority customer, list the suppliers it transitively depends on through the shipment graph, with their reliability scores.` +- Response: `Business.is_high_priority_customer` (2 buyers: B008, B009); shared 6-supplier upstream cone. -- `/rai-graph-analysis` "Which sites are the most influential hubs in the supply network — sites that connect to other influential sites, not just sites with many direct connections? Persist the centrality score back to each site so the optimizer can use it as a bottleneck weight." +### 2. Rank network hubs - `Site.centrality` normalized [0,1]: S004=1.000, S006=0.776, S003=0.735; 2 weakly-connected components. +- Prompt: `/rai-graph-analysis Which sites are the most influential hubs in the supply network — sites that connect to other influential sites, not just sites with many direct connections? 
Persist the centrality score back to each site so the optimizer can use it as a bottleneck weight.` +- Response: `Site.centrality` normalized [0,1]: S004=1.000, S006=0.776, S003=0.735; 2 weakly-connected components. -- `/rai-rules-authoring` "Rate each supplier's delivery reliability. Flag any with reliability score below 0.80 as unreliable, any with a Q1 delay prediction above 0.15 as high-delay-risk, and call them 'watch-level' if either fires. Suppliers with **both** flags are 'avoid' (hard-blocked downstream); suppliers with **either** flag are 'watch' (surcharged)." +### 3. Classify supplier reliability - `is_unreliable` (1: B017), `has_high_delay_risk` (2: B003, B017), `is_watch_level` (2), `Demand.is_escalated` (9). +- Prompt: `/rai-rules-authoring Rate each supplier's delivery reliability. Flag any with reliability score below 0.80 as unreliable, any with a Q1 delay prediction above 0.15 as high-delay-risk, and call them 'watch-level' if either fires. Suppliers with **both** flags are 'avoid' (hard-blocked downstream); suppliers with **either** flag are 'watch' (surcharged).` +- Response: `is_unreliable` (1: B017), `has_high_delay_risk` (2: B003, B017), `is_watch_level` (2), `Demand.is_escalated` (9). -- `/rai-prescriptive-problem-formulation` "Solve a minimum-cost flow that fulfills all open demand orders at minimum total transport cost. Hard-block 'avoid' suppliers, surcharge 'watch' suppliers $5/unit, weight bottleneck sites by their centrality, and penalize unmet demand at $100/unit." +### 4. Solve risk-adjusted flow - MILP on `Operation.x_flow` + `Demand.x_unmet`; objective = transport + risk surcharge + centrality weight + unmet penalty. +- Prompt: `/rai-prescriptive-problem-formulation Solve a minimum-cost flow that fulfills all open demand orders at minimum total transport cost. 
Hard-block 'avoid' suppliers, surcharge 'watch' suppliers $5/unit, weight bottleneck sites by their centrality, and penalize unmet demand at $100/unit.` +- Response: MILP on `Operation.x_flow` + `Demand.x_unmet`; objective = transport + risk surcharge + centrality weight + unmet penalty. -- `/rai-prescriptive-solver-management` + `/rai-prescriptive-results-interpretation` "Re-solve with the highest-centrality site offline, and again with watch-level suppliers downgraded to avoid. What's the cost delta in each, and why are they asymmetric?" - - Baseline OPTIMAL $1,865 / 8 flows / 0 unmet; S004 offline +88.5%; watch->avoid +0.0% (B003 already off optimal lanes). +### 5. Quantify disruption scenarios +- Prompt: `/rai-prescriptive-solver-management + /rai-prescriptive-results-interpretation Re-solve with the highest-centrality site offline, and again with watch-level suppliers downgraded to avoid. What's the cost delta in each, and why are they asymmetric?` +- Response: Baseline OPTIMAL $1,865 / 8 flows / 0 unmet; S004 offline +88.5%; watch->avoid +0.0% (B003 already off optimal lanes). ## Data diff --git a/v1/telco_network_recovery/references/runbook.md b/v1/telco_network_recovery/references/runbook.md index cf88bc2..9eefd0e 100644 --- a/v1/telco_network_recovery/references/runbook.md +++ b/v1/telco_network_recovery/references/runbook.md @@ -37,34 +37,40 @@ across all 15 critical towers, prioritized by social blast radius. ## Workflow -- `/rai-discovery` "WEST is missing revenue while every other region grows. What questions do we need to answer to figure out where to spend $5M to fix it?" +### 0. Discovery - Routes sub-questions to descriptive (revenue diagnosis), rules (critical-tower flag), graph (PageRank blast radius), predictive (demand forecast), prescriptive (tier-selection MIP + post-solve interpretation). +- Prompt: `/rai-discovery WEST is missing revenue while every other region grows. 
What questions do we need to answer to figure out where to spend $5M to fix it?` +- Response: Routes sub-questions to descriptive (revenue diagnosis), rules (critical-tower flag), graph (PageRank blast radius), predictive (demand forecast), prescriptive (tier-selection MIP + post-solve interpretation). -- `/rai-querying` "Where are we missing revenue targets? Which 10 cell towers have the worst average packet loss over 2024, and which region has the worst Q4 network availability?" +### 1. Diagnose WEST - WEST avail 94.6 vs 99.5 elsewhere; Q3-Q4 revenue −22% to −26% ($791K gap); 15 of 81 WEST towers DEGRADED at 8.1–8.9% packet loss; zero high-risk subs — operational, not retention. +- Prompt: `/rai-querying Where are we missing revenue targets? Which 10 cell towers have the worst average packet loss over 2024, and which region has the worst Q4 network availability?` +- Response: WEST avail 94.6 vs 99.5 elsewhere; Q3-Q4 revenue −22% to −26% ($791K gap); 15 of 81 WEST towers DEGRADED at 8.1–8.9% packet loss; zero high-risk subs — operational, not retention. -- `/rai-rules-authoring` "Flag CellTowers as 'critical-restore' if region is WEST AND status is DEGRADED AND avg equipment health is below 0.85, OR if avg packet loss > 5% with health below 0.85." +### 2. Flag critical-restore towers - `CellTower.is_critical_restore` fires on 15 WEST DEGRADED towers; 4 derived health metrics (`avg_packet_loss`, `avg_latency_ms`, `avg_error_rate`, `avg_health_score`) written to all 250 towers. +- Prompt: `/rai-rules-authoring Flag CellTowers as 'critical-restore' if region is WEST AND status is DEGRADED AND avg equipment health is below 0.85, OR if avg packet loss > 5% with health below 0.85.` +- Response: `CellTower.is_critical_restore` fires on 15 WEST DEGRADED towers; 4 derived health metrics (`avg_packet_loss`, `avg_latency_ms`, `avg_error_rate`, `avg_health_score`) written to all 250 towers. 
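That critical-restore rule is a two-branch boolean over the derived tower metrics. A plain-Python sketch of the predicate; the two tower rows are hypothetical, with packet loss expressed in percent to match the 5% threshold in the prompt:

```python
def is_critical_restore(tower):
    # Branch 1: WEST region + DEGRADED status + unhealthy equipment.
    west_degraded = (tower["region"] == "WEST"
                     and tower["status"] == "DEGRADED"
                     and tower["avg_health_score"] < 0.85)
    # Branch 2: lossy (> 5% packet loss) + unhealthy equipment, any region.
    lossy_unhealthy = (tower["avg_packet_loss"] > 5.0
                       and tower["avg_health_score"] < 0.85)
    return west_degraded or lossy_unhealthy

# Hypothetical tower rows (field names follow the derived metrics above).
towers = [
    {"id": "TWR-0014", "region": "WEST", "status": "DEGRADED",
     "avg_health_score": 0.71, "avg_packet_loss": 8.3},
    {"id": "TWR-0101", "region": "EAST", "status": "ACTIVE",
     "avg_health_score": 0.97, "avg_packet_loss": 0.4},
]
flagged = [t["id"] for t in towers if is_critical_restore(t)]
```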
-- `/rai-graph-analysis` "Who are our most socially influential subscribers based on call patterns? For each critical-restore tower, count the distinct subscribers whose calls route through it and rank by total PageRank influence — that's the blast radius if it fails." +### 3. Score subscriber blast radius - `Subscriber.influence_score` (PageRank) on all 1,200 subs; `CellTower.weighted_impact` on 15 critical towers; 404 distinct subs (33% of base) route through a critical tower; TWR-0014 has the largest footprint (61 subs, 0.0502). +- Prompt: `/rai-graph-analysis Who are our most socially influential subscribers based on call patterns? For each critical-restore tower, count the distinct subscribers whose calls route through it and rank by total PageRank influence — that's the blast radius if it fails.` +- Response: `Subscriber.influence_score` (PageRank) on all 1,200 subs; `CellTower.weighted_impact` on 15 critical towers; 404 distinct subs (33% of base) route through a critical tower; TWR-0014 has the largest footprint (61 subs, 0.0502). -- `/rai-predictive-modeling` + `/rai-predictive-training` "Predict next-quarter subscriber-growth-rate per region using TimeSeriesMetric history. Bind each region's forecast back to its towers as a demand multiplier." +### 4. Forecast regional demand - GNN node regression on 365d × 9 regions with 1-day-lag temporal edges; WEST multiplier 0.993× (contracting ~0.7%); 8 other regions +0.59 to +0.75%/day; written to `CellTower.projected_demand_growth` for 15 critical towers. +- Prompt: `/rai-predictive-modeling + /rai-predictive-training Predict next-quarter subscriber-growth-rate per region using TimeSeriesMetric history. Bind each region's forecast back to its towers as a demand multiplier.` +- Response: GNN node regression on 365d × 9 regions with 1-day-lag temporal edges; WEST multiplier 0.993× (contracting ~0.7%); 8 other regions +0.59 to +0.75%/day; written to `CellTower.projected_demand_growth` for 15 critical towers. 
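Stage 3's blast radius chains two small computations: PageRank over the subscriber call graph, then a per-tower sum of subscriber scores through a routes-through mapping. A stdlib power-iteration sketch; the four subscribers, call edges, and tower assignments are all invented:

```python
# Hypothetical call graph and tower routing (not the bundled CSVs).
calls = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A"), ("D", "C")]
routes_through = {"A": "TWR-0014", "B": "TWR-0014", "C": "TWR-0002", "D": "TWR-0002"}

nodes = sorted({n for edge in calls for n in edge})
out = {n: [t for s, t in calls if s == n] for n in nodes}

# PageRank by power iteration with damping 0.85.
d = 0.85
pr = {n: 1.0 / len(nodes) for n in nodes}
for _ in range(50):
    nxt = {n: (1 - d) / len(nodes) for n in nodes}
    for n in nodes:
        targets = out[n] or nodes          # dangling nodes link everywhere
        share = d * pr[n] / len(targets)
        for t in targets:
            nxt[t] += share
    pr = nxt

# Blast radius: total subscriber influence routed through each tower.
impact = {}
for sub, twr in routes_through.items():
    impact[twr] = impact.get(twr, 0.0) + pr[sub]
```

Ranking `impact` descending reproduces the TWR-0014-style "largest footprint" ordering on real data.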
-- `/rai-prescriptive-problem-formulation` "Recover WEST capacity within $5M and 200 install-weeks, prioritizing towers by social blast radius and forward-looking demand. From TowerUpgradeOption, pick at most one upgrade tier (BRONZE/SILVER/GOLD) per critical-restore tower, maximizing Σ capacity_increase × weighted_impact × projected_demand_growth." +### 5. Optimize tier selection - OPTIMAL · 12 GOLD / 2 SILVER / 1 BRONZE · 122 Gbps restored · $4.96M of $5M (binding) · 164 of 200 install-weeks (slack) · all 15 towers covered. +- Prompt: `/rai-prescriptive-problem-formulation Recover WEST capacity within $5M and 200 install-weeks, prioritizing towers by social blast radius and forward-looking demand. From TowerUpgradeOption, pick at most one upgrade tier (BRONZE/SILVER/GOLD) per critical-restore tower, maximizing Σ capacity_increase × weighted_impact × projected_demand_growth.` +- Response: OPTIMAL · 12 GOLD / 2 SILVER / 1 BRONZE · 122 Gbps restored · $4.96M of $5M (binding) · 164 of 200 install-weeks (slack) · all 15 towers covered. -- `/rai-prescriptive-results-interpretation` "Summarize the plan: total cost, capacity restored, tier mix, towers covered. Which constraint is binding, and what would relaxing it by 10-20% unlock?" - - Budget binds at $4.96M/$5M; flexing to $6M would promote TWR-0009 BRONZE→GOLD (+9 Gbps); install-weeks have 36-week slack; 404 service-affected subs drop to ~0 over the 4-month rollout. +### 6. Interpret the plan +- Prompt: `/rai-prescriptive-results-interpretation Summarize the plan: total cost, capacity restored, tier mix, towers covered. Which constraint is binding, and what would relaxing it by 10-20% unlock?` +- Response: Budget binds at $4.96M/$5M; flexing to $6M would promote TWR-0009 BRONZE→GOLD (+9 Gbps); install-weeks have 36-week slack; 404 service-affected subs drop to ~0 over the 4-month rollout. 
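The tier-selection model the last two stages exercise is small enough to sanity-check outside the solver. A minimal stdlib sketch under toy numbers (tower IDs reused for flavor; none of the costs, capacities, or caps below come from the bundled CSVs): enumerate at most one tier per tower, drop any plan that breaks the budget or install-week caps, and keep the plan with the highest blast-radius- and demand-weighted capacity.

```python
from itertools import product

# Toy instance: illustrative numbers only, not the template's bundled CSVs.
# Tier -> (cost $, install-weeks, capacity Gbps restored).
TIERS = {"BRONZE": (120_000, 4, 3.0), "SILVER": (260_000, 8, 6.0), "GOLD": (420_000, 12, 9.0)}
TOWERS = {  # tower -> (weighted_impact, projected_demand_growth)
    "TWR-0014": (0.0502, 0.9998),
    "TWR-0021": (0.0310, 0.9998),
    "TWR-0009": (0.0188, 0.9998),
}
BUDGET, WEEKS = 900_000, 24

best = None
for plan in product([None, "BRONZE", "SILVER", "GOLD"], repeat=len(TOWERS)):
    cost = sum(TIERS[t][0] for t in plan if t)
    weeks = sum(TIERS[t][1] for t in plan if t)
    if cost > BUDGET or weeks > WEEKS:
        continue  # infeasible: budget or install-week cap broken
    # Objective: capacity restored, weighted by blast radius x demand multiplier.
    value = sum(TIERS[t][2] * w * g
                for t, (w, g) in zip(plan, TOWERS.values()) if t)
    if best is None or value > best[0]:
        best = (value, plan, cost, weeks)

value, plan, cost, weeks = best
print(dict(zip(TOWERS, plan)), cost, weeks)
```

The real formulation hands the same objective and caps to a MIP solver; brute force only works here because the toy instance has 4^3 candidate plans.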
 ## Data

From 317960d8aecefff5867117d0b0d93f01bae4650c Mon Sep 17 00:00:00 2001
From: cafzal
Date: Wed, 6 May 2026 09:36:39 -0700
Subject: [PATCH 08/22] Audit-pass fixes against template scripts and READMEs

- telco: reorder to summit-demo workflow (descriptive -> rules -> graph -> predictive -> prescriptive -> interpret); fix WEST multiplier 0.993x -> 0.9998x; fix other-region growth range to +0.45-0.91%/day; clarify projected_demand_growth is written to all 250 towers via region join, not just 15
- energy: drop fabricated 36-month forecast horizon -> 24-month; rewrite Stage 3 low-carbon prompt to describe the actual rule (per-DC requirement vs zero-emission share) instead of a fabricated 25%/100% threshold
- machine_maintenance: drop unsupported "Turbines need on-site qualified technician" hard constraint -- script penalizes travel cost, doesn't enforce co-location; add parts_cost factor to failure cost formula
- portfolio: tighten crisis vol-gap range to actual 25-30% (peak +29.8% at eps_1, low +25.2% at eps_5)
- supply_chain: no factual changes (verified clean)
---
 v1/energy_grid_planning/references/runbook.md   |  4 ++--
 v1/machine_maintenance/references/runbook.md    |  4 ++--
 v1/portfolio_balancing/references/runbook.md    |  9 +++++----
 v1/telco_network_recovery/references/runbook.md | 10 +++++-----
 4 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/v1/energy_grid_planning/references/runbook.md b/v1/energy_grid_planning/references/runbook.md
index 31984f5..49d9653 100644
--- a/v1/energy_grid_planning/references/runbook.md
+++ b/v1/energy_grid_planning/references/runbook.md
@@ -45,7 +45,7 @@ $300M unlocks 5 DCs (1,500 MW, $264M net value) including xAI Colossus.
 
 ### 1. Forecast substation load
 
-- Prompt: `/rai-predictive-modeling + /rai-predictive-training Can we forecast substation load growth over the next 36 months based on historical demand, planned generator additions, and the data center request pipeline? Bind each substation's predicted peak load back to the ontology so the rules engine and optimizer can read it.`
+- Prompt: `/rai-predictive-modeling + /rai-predictive-training Can we forecast substation load growth over the next 24 months based on historical demand, planned generator additions, and the data center request pipeline? Bind each substation's predicted peak load back to the ontology so the rules engine and optimizer can read it.`
 - Response: `Substation.predicted_load` for all 12; DFW breaches at 1,700 MW vs 1,600 MW cap at 24 months (+54.6%).
 
 ### 2. Find structural bottlenecks
@@ -55,7 +55,7 @@
 
 ### 3. Screen DC requests
 
-- Prompt: `/rai-rules-authoring Screen each data center request against three criteria: (1) substation must have enough capacity after predicted load, (2) if 100% low-carbon required, region must have 25%+ renewable, (3) substation shouldn't be most structurally critical. Which requests pass all three?`
+- Prompt: `/rai-rules-authoring Screen each data center request against three criteria: (1) substation must have enough capacity after predicted load, (2) substation's low-carbon (renewable + nuclear) generation share must meet the DC's low-carbon requirement, (3) substation shouldn't be one of the top-3 structurally critical. Which requests pass all three?`
 - Response: `fails_capacity` / `fails_structural` / `fails_low_carbon` + `is_compliant`; 2 pass (Crusoe, Oracle), 8 flagged.
 
 ### 4. Approve DCs and fund upgrades
diff --git a/v1/machine_maintenance/references/runbook.md b/v1/machine_maintenance/references/runbook.md
index b9a52c8..85f7293 100644
--- a/v1/machine_maintenance/references/runbook.md
+++ b/v1/machine_maintenance/references/runbook.md
@@ -63,8 +63,8 @@ all 3 Turbine techs sit in one city, a $3,200 fix away from resolved.
 
 ### 4. Schedule maintenance
 
-- Prompt: `/rai-prescriptive-problem-formulation Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and Turbines need an on-site qualified technician. Minimize expected failure cost weighted by criticality and centrality, plus labor and travel.`
-- Response: 120 `x_maintain` + 120 `x_vulnerable` + ~250 `x_assigned` binaries; 5 constraint families; failure cost uses `predicted_fp × criticality × (1 + 2.0 × betweenness)`.
+- Prompt: `/rai-prescriptive-problem-formulation Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and each maintained machine needs exactly one qualified technician. Minimize expected failure cost weighted by criticality and centrality, plus labor and travel (penalized when the technician is not co-located).`
+- Response: 120 `x_maintain` + 120 `x_vulnerable` + ~250 `x_assigned` binaries; 5 constraint families; failure cost uses `predicted_fp × parts_cost × criticality × (1 + 2.0 × betweenness)`.
 
 ### 5. Stress-test concentration
diff --git a/v1/portfolio_balancing/references/runbook.md b/v1/portfolio_balancing/references/runbook.md
index efea678..5937cec 100644
--- a/v1/portfolio_balancing/references/runbook.md
+++ b/v1/portfolio_balancing/references/runbook.md
@@ -29,9 +29,10 @@ base at every lambda — without the cluster collapse, the gap would grow.
            frontier per scenario. Knee at eps_1.
   ─────────────────────────────────────────────────────────────────
   STAGE 4  Stress      ──► Stock.regime_covar (PSD-preserving)
-                           Crisis vol 25-30% above base at every
-                           lambda. Gap peaks mid-frontier, narrows
-                           toward the concentrated end.
+                           Crisis vol 25-30% above base across
+                           the frontier. Gap peaks mid-frontier
+                           (eps_1..eps_2 at +29.8%), narrows
+                           toward the concentrated end (+25.2%).
   ─────────────────────────────────────────────────────────────────
 ```
@@ -65,7 +66,7 @@
 
 ### 5. Stress under crisis
 
 - Prompt: `/rai-prescriptive-solver-management + /rai-prescriptive-results-interpretation Stress-test the frontier under crisis: shrink correlations toward all-ones with weight 0.7 on base covariance + 0.3 on outer-product. How much volatility expansion at each frontier point?`
-- Response: Crisis vol +28-30% above base at every lambda (budget 500: min_risk 34.06 -> 43.74, eps_1 34.30 -> 44.54 peak). Gap peaks mid-frontier, narrows to +25.2% at eps_5 — the cluster-collapse payoff.
+- Response: Crisis vol 25-30% above base across the frontier (budget 500: min_risk 34.06 -> 43.74 at +28.4%, eps_1 34.30 -> 44.54 at +29.8% peak). Gap peaks mid-frontier (eps_1..eps_2 at +29.8%), narrows to +25.2% at eps_5 — the cluster-collapse payoff.
 
 ## Data
diff --git a/v1/telco_network_recovery/references/runbook.md b/v1/telco_network_recovery/references/runbook.md
index 9eefd0e..33b1edd 100644
--- a/v1/telco_network_recovery/references/runbook.md
+++ b/v1/telco_network_recovery/references/runbook.md
@@ -1,6 +1,6 @@
 # Runbook: Telco WEST Recovery — Multi-Reasoner Walkthrough
 
-A regional telco is bleeding $791K/quarter from WEST while every other region grows. No single reasoner can answer where to spend a $5M recovery budget: descriptive scopes the crisis, rules flag broken towers, graph weights them by social blast radius, predictive forecasts forward demand, and prescriptive composes all three signals into the upgrade plan. Each stage writes derived properties back to the same ontology that downstream stages consume.
+A regional telco is bleeding $791K/quarter from WEST while every other region grows. No single reasoner can answer where to spend a $5M recovery budget: descriptive scopes the crisis, rules flag broken towers, graph weights them by social blast radius, predictive forecasts forward demand, and prescriptive composes all four signals into the upgrade plan. Each stage writes derived properties back to the same ontology that downstream stages consume.
 
 ## The chain
 
@@ -24,9 +24,9 @@ across all 15 critical towers, prioritized by social blast radius.
            404 distinct subs (33% of base) route calls through a critical tower.
   ─────────────────────────────────────────────────────────────────
-  STAGE 4  Predictive  ──► CellTower.projected_demand_growth (15)
-           (GNN)           WEST: 0.993× ── shrinking ~0.7%/yr
-                           while 8 other regions sit at +0.59 to +0.75%/day.
+  STAGE 4  Predictive  ──► CellTower.projected_demand_growth (250)
+           (GNN)           WEST: 0.9998× ── flat/slightly contracting
+                           while 8 other regions sit at +0.45 to +0.91%/day.
   ─────────────────────────────────────────────────────────────────
   STAGE 5  Prescriptive ──► TowerUpgradeOption.selected (15)
            OPTIMAL · 12 GOLD · 2 SILVER · 1 BRONZE
@@ -60,7 +60,7 @@ across all 15 critical towers, prioritized by social blast radius.
 
 ### 4. Forecast regional demand
 
 - Prompt: `/rai-predictive-modeling + /rai-predictive-training Predict next-quarter subscriber-growth-rate per region using TimeSeriesMetric history. Bind each region's forecast back to its towers as a demand multiplier.`
-- Response: GNN node regression on 365d × 9 regions with 1-day-lag temporal edges; WEST multiplier 0.993× (contracting ~0.7%); 8 other regions +0.59 to +0.75%/day; written to `CellTower.projected_demand_growth` for 15 critical towers.
+- Response: GNN node regression on 365d × 9 regions with 1-day-lag temporal edges; WEST multiplier 0.9998× (flat/slightly contracting); 8 other regions +0.45 to +0.91%/day; written to `CellTower.projected_demand_growth` for all 250 towers via region join.
 
 ### 5. Optimize tier selection
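The portfolio hunk above pins down the crisis construction: blend the base covariance with the outer product of volatilities (the correlation-one covariance) at 0.7/0.3. A stdlib sketch with a toy 3-asset covariance (illustrative numbers, not the template's 8-stock data) shows why the blend preserves PSD-ness and variances while portfolio vol expands:

```python
import math

def crisis_covariance(cov, w_base=0.7):
    """Blend the base covariance with the outer product of volatilities,
    i.e. the covariance the book would have if every correlation were 1.0.
    Diagonals are preserved exactly (0.7*v + 0.3*v = v); off-diagonals
    move toward rho = 1. Both terms are PSD, so the convex blend is PSD."""
    n = len(cov)
    s = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[w_base * cov[i][j] + (1 - w_base) * s[i] * s[j] for j in range(n)]
            for i in range(n)]

def portfolio_vol(w, cov):
    return math.sqrt(sum(wi * wj * cov[i][j]
                         for i, wi in enumerate(w)
                         for j, wj in enumerate(w)))

# Toy 3-asset covariance, illustrative only.
cov = [[0.0400, 0.0060, 0.0020],
       [0.0060, 0.0900, 0.0120],
       [0.0020, 0.0120, 0.0625]]
weights = [0.4, 0.3, 0.3]
base = portfolio_vol(weights, cov)
crisis = portfolio_vol(weights, crisis_covariance(cov))
print(f"base {base:.4f}  crisis {crisis:.4f}  expansion +{crisis / base - 1:.1%}")
```

The expansion size depends on how diversified the weights are, which is exactly why the runbook's gap peaks mid-frontier and narrows toward the concentrated end.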
From ec8ae9a032e45706c4c7c7fc84613a80610f5598 Mon Sep 17 00:00:00 2001
From: cafzal
Date: Wed, 6 May 2026 09:41:47 -0700
Subject: [PATCH 09/22] Strip mechanical bits from prompts
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Prompts should describe what the user wants, not how the agent should do it. The skill (with the agent) handles solver choice, formula construction, and implementation details.

- telco Stage 5: drop the explicit Σ formula
- portfolio Stage 2: 'force the rest to zero' -> 'only invest in those'
- portfolio Stage 3: drop 'anchor / sweep / forced to zero' agent scaffolding; show 7 frontier points instead
- supply chain Stage 3: rephrase as 'find the minimum-cost shipping plan' (no 'Solve a ... LP'), 'don't ship from avoid suppliers', 'prefer non-bottleneck sites'
- machine_maintenance Stage 5: drop 'Solve with HiGHS' (mechanical)
- machine_maintenance Stage 4: simplify cost-formula language
---
 v1/machine_maintenance/references/runbook.md     | 4 ++--
 v1/portfolio_balancing/references/runbook.md     | 4 ++--
 v1/supply_chain_resilience/references/runbook.md | 2 +-
 v1/telco_network_recovery/references/runbook.md  | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/v1/machine_maintenance/references/runbook.md b/v1/machine_maintenance/references/runbook.md
index 85f7293..6a26b52 100644
--- a/v1/machine_maintenance/references/runbook.md
+++ b/v1/machine_maintenance/references/runbook.md
@@ -63,12 +63,12 @@ all 3 Turbine techs sit in one city, a $3,200 fix away from resolved.
 
 ### 4. Schedule maintenance
 
-- Prompt: `/rai-prescriptive-problem-formulation Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and each maintained machine needs exactly one qualified technician. Minimize expected failure cost weighted by criticality and centrality, plus labor and travel (penalized when the technician is not co-located).`
+- Prompt: `/rai-prescriptive-problem-formulation Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and each maintained machine needs a qualified technician. Minimize expected failure cost (weighted by criticality and centrality) plus labor and travel.`
 - Response: 120 `x_maintain` + 120 `x_vulnerable` + ~250 `x_assigned` binaries; 5 constraint families; failure cost uses `predicted_fp × parts_cost × criticality × (1 + 2.0 × betweenness)`.
 
 ### 5. Stress-test concentration
 
-- Prompt: `/rai-prescriptive-solver-management + /rai-prescriptive-results-interpretation Solve with HiGHS, then for each machine type check whether all qualified techs sit in one location and recommend the cheapest cross-training fix.`
+- Prompt: `/rai-prescriptive-solver-management + /rai-prescriptive-results-interpretation For each machine type, check whether all qualified technicians sit in one location and recommend the cheapest cross-training fix.`
 - Response: OPTIMAL · 20 jobs · $605,241; Turbine concentrated in Houston_TX (67% of jobs travel); cross-train T006 (Chicago_IL, Senior) for $3,200 / 5 weeks.
 
 ## Data
diff --git a/v1/portfolio_balancing/references/runbook.md b/v1/portfolio_balancing/references/runbook.md
index 5937cec..85293ba 100644
--- a/v1/portfolio_balancing/references/runbook.md
+++ b/v1/portfolio_balancing/references/runbook.md
@@ -50,12 +50,12 @@ base at every lambda — without the cluster collapse, the gap would grow.
 
 ### 2. Cluster correlated bets
 
-- Prompt: `/rai-graph-analysis Cluster stocks by correlation — anything above 0.3 absolute is a redundant bet. Pick one representative per cluster (highest Sharpe ratio) and force the rest to zero in optimization.`
+- Prompt: `/rai-graph-analysis Cluster stocks by correlation — anything above 0.3 absolute is a redundant bet. Pick one representative per cluster (highest Sharpe ratio) and only invest in those.`
 - Response: 4 edges (|rho| >= 0.3), 5 Louvain clusters, intra +0.683 vs inter +0.131. 5 representatives picked: PFE, GOOGL, JPM, PG, XOM. AAPL/MSFT/JNJ flagged `is_non_representative`.
 
 ### 3. Solve mean-variance frontier
 
-- Prompt: `/rai-prescriptive-problem-formulation Build a Markowitz mean-variance frontier across 6 scenarios = 3 budgets x 2 regimes. Position cap 30% of budget, sector cap 30%, non-representatives forced to zero. Anchor with min-risk and max-return, then sweep 5 epsilon points across the return range.`
+- Prompt: `/rai-prescriptive-problem-formulation Build a Markowitz mean-variance frontier across 6 scenarios = 3 budgets x 2 regimes. Position cap 30% of budget, sector cap 30%, only invest in cluster representatives. Show 7 points per frontier.`
 - Response: 48 decision vars (8 stocks x 6 scenarios), 5 constraint families. Return-rate range [0.0634, 0.0840]. 7 solves x 6 scenarios = 42 `LOCALLY_SOLVED` portfolios via Ipopt.
 
 ### 4. Read the frontier
diff --git a/v1/supply_chain_resilience/references/runbook.md b/v1/supply_chain_resilience/references/runbook.md
index 39b7713..82bd56c 100644
--- a/v1/supply_chain_resilience/references/runbook.md
+++ b/v1/supply_chain_resilience/references/runbook.md
@@ -57,7 +57,7 @@
 
 ### 4. Solve risk-adjusted flow
 
-- Prompt: `/rai-prescriptive-problem-formulation Solve a minimum-cost flow that fulfills all open demand orders at minimum total transport cost. Hard-block 'avoid' suppliers, surcharge 'watch' suppliers $5/unit, weight bottleneck sites by their centrality, and penalize unmet demand at $100/unit.`
+- Prompt: `/rai-prescriptive-problem-formulation Find the minimum-cost shipping plan that fulfills all open demand. Don't ship from 'avoid' suppliers at all, add a $5/unit surcharge for 'watch' suppliers, prefer non-bottleneck sites, and charge $100/unit for any unmet demand.`
 - Response: MILP on `Operation.x_flow` + `Demand.x_unmet`; objective = transport + risk surcharge + centrality weight + unmet penalty.
 
 ### 5. Quantify disruption scenarios
diff --git a/v1/telco_network_recovery/references/runbook.md b/v1/telco_network_recovery/references/runbook.md
index 33b1edd..9288476 100644
--- a/v1/telco_network_recovery/references/runbook.md
+++ b/v1/telco_network_recovery/references/runbook.md
@@ -64,7 +64,7 @@ across all 15 critical towers, prioritized by social blast radius.
 
 ### 5. Optimize tier selection
 
-- Prompt: `/rai-prescriptive-problem-formulation Recover WEST capacity within $5M and 200 install-weeks, prioritizing towers by social blast radius and forward-looking demand. From TowerUpgradeOption, pick at most one upgrade tier (BRONZE/SILVER/GOLD) per critical-restore tower, maximizing Σ capacity_increase × weighted_impact × projected_demand_growth.`
+- Prompt: `/rai-prescriptive-problem-formulation Recover WEST capacity within $5M and 200 install-weeks. For each critical-restore tower, pick at most one upgrade tier (BRONZE, SILVER, or GOLD) to maximize capacity restored, weighted by each tower's blast radius and the regional demand forecast.`
 - Response: OPTIMAL · 12 GOLD / 2 SILVER / 1 BRONZE · 122 Gbps restored · $4.96M of $5M (binding) · 164 of 200 install-weeks (slack) · all 15 towers covered.
 
 ### 6. Interpret the plan
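The reworded supply-chain prompt above prices four things: transport, a $5/unit watch surcharge, a centrality-based bottleneck weight, and a $100/unit unmet penalty. A greedy stdlib stand-in for the MILP (toy lanes; the $10/unit-per-centrality-point weight is an assumption, since the runbook only fixes the surcharge and the unmet penalty):

```python
# Toy stand-in for the Stage 4 objective (supply chain). The $5/unit watch
# surcharge and $100/unit unmet penalty come from the runbook; the
# centrality weight and all lane data are invented for illustration.
WATCH_SURCHARGE = 5.0
UNMET_PENALTY = 100.0
CENTRALITY_WEIGHT = 10.0  # assumption: linear $/unit weight on origin centrality

def lane_cost(transport, status, centrality):
    """Effective $/unit on a lane; None means the lane is hard-blocked."""
    if status == "avoid":
        return None  # never ship from 'avoid' suppliers
    cost = transport + CENTRALITY_WEIGHT * centrality
    if status == "watch":
        cost += WATCH_SURCHARGE
    return cost

lanes = [  # (supplier, transport $/unit, risk status, origin centrality, capacity)
    ("B003", 3.0, "watch", 0.735, 40),
    ("B017", 2.0, "avoid", 0.776, 50),
    ("B020", 4.0, "ok", 0.100, 30),
]
demand = 60

priced = []
for supplier, transport, status, centrality, cap in lanes:
    cost = lane_cost(transport, status, centrality)
    if cost is not None:
        priced.append((cost, supplier, cap))
priced.sort()  # greedy: cheapest effective lane first

plan, remaining = [], demand
for cost, supplier, cap in priced:
    take = min(cap, remaining)
    if take:
        plan.append((supplier, take, cost))
        remaining -= take
total = sum(q * c for _, q, c in plan) + UNMET_PENALTY * remaining
print(plan, "unmet:", remaining, "total:", round(total, 2))
```

The real model solves flows jointly (a greedy pass can misallocate shared capacity), but the cost structure is the same one the MILP's objective prices.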
From 26833b4080579bac5e7d6e668378cba58dec11b3 Mon Sep 17 00:00:00 2001
From: cafzal
Date: Wed, 6 May 2026 09:55:22 -0700
Subject: [PATCH 10/22] Add ontology-build step at the front of each runbook

Step 1 is now /rai-build-starter-ontology against the bundled CSVs. Discovery, the chain stages, and interpretation shift to 2..N. Reflects that users start with the demo data and need the ontology materialized before any reasoner skill can run.
---
 v1/energy_grid_planning/references/runbook.md | 17 +++++++++++------
 v1/machine_maintenance/references/runbook.md  | 17 +++++++++++------
 v1/portfolio_balancing/references/runbook.md  | 17 +++++++++++------
 .../references/runbook.md                     | 17 +++++++++++------
 .../references/runbook.md                     | 19 ++++++++++++-------
 5 files changed, 56 insertions(+), 31 deletions(-)

diff --git a/v1/energy_grid_planning/references/runbook.md b/v1/energy_grid_planning/references/runbook.md
index 49d9653..d4615f5 100644
--- a/v1/energy_grid_planning/references/runbook.md
+++ b/v1/energy_grid_planning/references/runbook.md
@@ -38,32 +38,37 @@ $300M unlocks 5 DCs (1,500 MW, $264M net value) including xAI Colossus.
 
 ## Workflow
 
-### 0. Discovery
+### 1. Build ontology
+
+- Prompt: `/rai-build-starter-ontology Build an ontology for grid infrastructure planning from the CSVs in ../data/ covering substations, generators, transmission lines, demand forecasts, data center requests, and substation upgrades.`
+- Response: Concepts: `Substation`, `Generator`, `TransmissionLine`, `LoadZone`, `DemandPeriod`, `RenewableProfile`, `MaintenanceWindow`, `Customer`, `DataCenterRequest`, `SubstationUpgrade`, `DemandForecast`, `LoadHistory`, `DCAnnouncement` — bound to the bundled CSVs (12 substations, 10 DC requests, 18 transmission lines).
+
+### 2. Discovery
 
 - Prompt: `/rai-discovery We have 10 hyperscaler interconnection requests against a 12-substation grid. Which to approve, which substation upgrades to fund, at what budget level?`
 - Response: Plan routing sub-questions to predictive, graph, rules, and prescriptive reasoners.
 
-### 1. Forecast substation load
+### 3. Forecast substation load
 
 - Prompt: `/rai-predictive-modeling + /rai-predictive-training Can we forecast substation load growth over the next 24 months based on historical demand, planned generator additions, and the data center request pipeline? Bind each substation's predicted peak load back to the ontology so the rules engine and optimizer can read it.`
 - Response: `Substation.predicted_load` for all 12; DFW breaches at 1,700 MW vs 1,600 MW cap at 24 months (+54.6%).
 
-### 2. Find structural bottlenecks
+### 4. Find structural bottlenecks
 
 - Prompt: `/rai-graph-analysis Which substations are most critical to power flow based on grid topology? Use centrality on the transmission graph, then flag the top 3 as structurally critical and persist the scores back to the ontology.`
 - Response: 1 component, 3 Louvain communities; DFW, Houston, San Antonio flagged `is_structurally_critical`; 7 of 10 DC requests on critical nodes.
 
-### 3. Screen DC requests
+### 5. Screen DC requests
 
 - Prompt: `/rai-rules-authoring Screen each data center request against three criteria: (1) substation must have enough capacity after predicted load, (2) substation's low-carbon (renewable + nuclear) generation share must meet the DC's low-carbon requirement, (3) substation shouldn't be one of the top-3 structurally critical. Which requests pass all three?`
 - Response: `fails_capacity` / `fails_structural` / `fails_low_carbon` + `is_compliant`; 2 pass (Crusoe, Oracle), 8 flagged.
 
-### 4. Approve DCs and fund upgrades
+### 6. Approve DCs and fund upgrades
 
 - Prompt: `/rai-prescriptive-problem-formulation Decide which data center requests to approve and which substation upgrades to fund at $200M, $300M, $400M, $500M, and $600M investment levels. Maximize annual revenue. A request can only be approved if its substation has enough capacity after upgrades.`
 - Response: OPTIMAL MIP across 5 `InvestmentLevel` values in one solve; `x_approve` and `x_upgrade` written back per level.
 
-### 5. Read the frontier
+### 7. Read the frontier
 
 - Prompt: `/rai-prescriptive-results-interpretation Which data centers get approved, which upgrades are selected, and where's the biggest return on investment at each budget level?`
 - Response: Pareto frontier with knee at $300M (5 DCs, 1,500 MW, $264M net); marginal $995K/$M at knee, declining to $400K/$M by $600M; Google + Lambda never approved (DFW full).
diff --git a/v1/machine_maintenance/references/runbook.md b/v1/machine_maintenance/references/runbook.md
index 6a26b52..a48efa9 100644
--- a/v1/machine_maintenance/references/runbook.md
+++ b/v1/machine_maintenance/references/runbook.md
@@ -41,32 +41,37 @@ all 3 Turbine techs sit in one city, a $3,200 fix away from resolved.
 
 ## Workflow
 
-### 0. Discovery
+### 1. Build ontology
+
+- Prompt: `/rai-build-starter-ontology Build a manufacturing maintenance ontology from the CSVs in ../data/ covering machines, technicians, qualifications, periods, sensor readings, failure predictions, downtime events, and production runs.`
+- Response: Concepts: `Machine`, `Technician`, `Qualification`, `Period`, `MachinePeriod`, `TechnicianPeriod`, `TechnicianMachinePeriod`, `Sensor`, `SensorReading`, `FailurePrediction`, `DowntimeEvent`, `ProductionRun`, `PartsInventory`, `TrainingOption` — bound to the bundled CSVs (30 machines × 3 plants, 10 technicians, 4 periods).
+
+### 2. Discovery
 
 - Prompt: `/rai-discovery We need to schedule preventive maintenance for 30 machines across 3 plants. Where does OEE alone mislead us, and what structural risks won't a pure optimizer surface?`
 - Response: Plan routing sub-questions to querying, graph, rules, prescriptive, and resilience skills.
 
-### 1. Diagnose plant operations
+### 3. Diagnose plant operations
 
 - Prompt: `/rai-querying What's the OEE by plant? Which machines have the most sensor anomalies, and which are most likely to fail by the end of the planning horizon?`
 - Response: Plant_C 79.8% > Plant_A 68.2% > Plant_B 61.4%; 7 of 9 anomalies at Plant_A; `MachinePeriod.predicted_fp` written for 120 rows.
 
-### 2. Find scheduling bottlenecks
+### 4. Find scheduling bottlenecks
 
 - Prompt: `/rai-graph-analysis Which machines share qualified technicians, and which are bottlenecks in the qualification network? Compute centrality and write it back to each machine so the optimizer can weight critical machines.`
 - Response: 30 machines → 1 connected component; Pumps tie at top betweenness (24.0 raw, 1.0 normalized); `Machine.betweenness` stored.
 
-### 3. Classify machine risk
+### 5. Classify machine risk
 
 - Prompt: `/rai-rules-authoring Rate each machine's risk: chronic if >8 downtime events, high-risk if failure prob >0.3 AND criticality 4+, plus overdue for maintenance. All three flags = Critical, two = Elevated, otherwise Standard.`
 - Response: 6 overdue, 1 high-risk, 3 chronic; M013 (Pump, Plant_A) = Critical; M016 (Turbine, Plant_A) = Elevated.
 
-### 4. Schedule maintenance
+### 6. Schedule maintenance
 
 - Prompt: `/rai-prescriptive-problem-formulation Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and each maintained machine needs a qualified technician. Minimize expected failure cost (weighted by criticality and centrality) plus labor and travel.`
 - Response: 120 `x_maintain` + 120 `x_vulnerable` + ~250 `x_assigned` binaries; 5 constraint families; failure cost uses `predicted_fp × parts_cost × criticality × (1 + 2.0 × betweenness)`.
 
-### 5. Stress-test concentration
+### 7. Stress-test concentration
 
 - Prompt: `/rai-prescriptive-solver-management + /rai-prescriptive-results-interpretation For each machine type, check whether all qualified technicians sit in one location and recommend the cheapest cross-training fix.`
 - Response: OPTIMAL · 20 jobs · $605,241; Turbine concentrated in Houston_TX (67% of jobs travel); cross-train T006 (Chicago_IL, Senior) for $3,200 / 5 weeks.
diff --git a/v1/portfolio_balancing/references/runbook.md b/v1/portfolio_balancing/references/runbook.md
index 85293ba..cc108ff 100644
--- a/v1/portfolio_balancing/references/runbook.md
+++ b/v1/portfolio_balancing/references/runbook.md
@@ -38,32 +38,37 @@ base at every lambda — without the cluster collapse, the gap would grow.
 
 ## Workflow
 
-### 0. Discovery
+### 1. Build ontology
+
+- Prompt: `/rai-build-starter-ontology Build a portfolio ontology from the CSVs in ../data/ covering stocks, sectors, the covariance matrix, accounts, holdings, users, and transactions.`
+- Response: Concepts: `Stock`, `Sector`, `StockPair` (compound id on stock_i / stock_j with covariance), `User`, `Account`, `Holding`, `Transaction`, plus the `Scenario` Concept used by Stage 3 — bound to the bundled CSVs (8 stocks, 64 covariance entries, 6 scenarios).
+
+### 2. Discovery
 
 - Prompt: `/rai-discovery Our 8-stock book breaks compliance and concentrates risk. Rebuild it under Markowitz mean-variance with caps, deduplicate redundant bets via correlation clustering, and stress-test under crisis. What questions does each reasoner family handle?`
 - Response: Plan: rules for compliance flags, graph for correlation clustering + representatives, prescriptive QP indexed by Scenario, stress as regime-swap re-solve.
 
-### 1. Compliance scan
+### 3. Compliance scan
 
 - Prompt: `/rai-rules-authoring Flag any holding worth more than 15% of its account, any sector worth more than 30% of the account, and any user with a risk score above 0.8 and more than five flagged transactions.`
 - Response: 4 holdings flagged (AAPL/MSFT on Account 1, JNJ/PFE on Account 4); 2 (account, sector) pairs flagged (Account 1 Tech 34.0%, Account 4 Healthcare 32.2%); 2 users flagged (Alice Chen 0.85, Eve Taylor 0.92).
 
-### 2. Cluster correlated bets
+### 4. Cluster correlated bets
 
 - Prompt: `/rai-graph-analysis Cluster stocks by correlation — anything above 0.3 absolute is a redundant bet. Pick one representative per cluster (highest Sharpe ratio) and only invest in those.`
 - Response: 4 edges (|rho| >= 0.3), 5 Louvain clusters, intra +0.683 vs inter +0.131. 5 representatives picked: PFE, GOOGL, JPM, PG, XOM. AAPL/MSFT/JNJ flagged `is_non_representative`.
 
-### 3. Solve mean-variance frontier
+### 5. Solve mean-variance frontier
 
 - Prompt: `/rai-prescriptive-problem-formulation Build a Markowitz mean-variance frontier across 6 scenarios = 3 budgets x 2 regimes. Position cap 30% of budget, sector cap 30%, only invest in cluster representatives. Show 7 points per frontier.`
 - Response: 48 decision vars (8 stocks x 6 scenarios), 5 constraint families. Return-rate range [0.0634, 0.0840]. 7 solves x 6 scenarios = 42 `LOCALLY_SOLVED` portfolios via Ipopt.
 
-### 4. Read the frontier
+### 6. Read the frontier
 
 - Prompt: `/rai-prescriptive-results-interpretation For each scenario, list the seven-point Pareto frontier and find the knee — where does the marginal risk per unit return jump the most?`
 - Response: base_500 frontier: returns 32.43 -> 40.28, risk 1160 -> 1742. Marginal `delta_risk/delta_return` jumps ~3x at eps_1 (knee). Same shape across all 6 scenarios — risk scales as budget^2, rate-form frontier is budget-independent.
 
-### 5. Stress under crisis
+### 7. Stress under crisis
 
 - Prompt: `/rai-prescriptive-solver-management + /rai-prescriptive-results-interpretation Stress-test the frontier under crisis: shrink correlations toward all-ones with weight 0.7 on base covariance + 0.3 on outer-product. How much volatility expansion at each frontier point?`
 - Response: Crisis vol 25-30% above base across the frontier (budget 500: min_risk 34.06 -> 43.74 at +28.4%, eps_1 34.30 -> 44.54 at +29.8% peak). Gap peaks mid-frontier (eps_1..eps_2 at +29.8%), narrows to +25.2% at eps_5 — the cluster-collapse payoff.
diff --git a/v1/supply_chain_resilience/references/runbook.md b/v1/supply_chain_resilience/references/runbook.md
index 82bd56c..85f8901 100644
--- a/v1/supply_chain_resilience/references/runbook.md
+++ b/v1/supply_chain_resilience/references/runbook.md
@@ -35,32 +35,37 @@ watch->avoid downgrade = +0.0% (optimizer already routed around it).
 
 ## Workflow
 
-### 0. Discovery
+### 1. Build ontology
+
+- Prompt: `/rai-build-starter-ontology Build a supply chain ontology from the CSVs in ../data/ covering sites, businesses, SKUs, shipping operations, demand orders, historical shipments, and quarterly delay predictions.`
+- Response: Concepts: `Site`, `Business`, `StockKeepingUnit`, `Operation`, `Demand`, `Shipment`, `Inventory`, `BillOfMaterial`, `DelayPrediction` — bound to the bundled CSVs (31 sites, 31 businesses, 9 SKUs, 262 shipments).
+
+### 2. Discovery
 
 - Prompt: `/rai-discovery We need a risk-adjusted routing plan. What's our exposure to each supplier, which sites are bottlenecks, which suppliers are unreliable, and what does the minimum-cost flow look like once those risks are priced in?`
 - Response: Reasoner-routing plan covering Stages 0–3 (graph, rules, prescriptive).
 
-### 1. Map upstream supplier exposure
+### 3. Map upstream supplier exposure
 
 - Prompt: `/rai-graph-analysis If a key supplier goes offline, which downstream buyers and finished products are at risk? For each HIGH-priority customer, list the suppliers it transitively depends on through the shipment graph, with their reliability scores.`
 - Response: `Business.is_high_priority_customer` (2 buyers: B008, B009); shared 6-supplier upstream cone.
 
-### 2. Rank network hubs
+### 4. Rank network hubs
 
 - Prompt: `/rai-graph-analysis Which sites are the most influential hubs in the supply network — sites that connect to other influential sites, not just sites with many direct connections? Persist the centrality score back to each site so the optimizer can use it as a bottleneck weight.`
 - Response: `Site.centrality` normalized [0,1]: S004=1.000, S006=0.776, S003=0.735; 2 weakly-connected components.
 
-### 3. Classify supplier reliability
+### 5. Classify supplier reliability
 
 - Prompt: `/rai-rules-authoring Rate each supplier's delivery reliability. Flag any with reliability score below 0.80 as unreliable, any with a Q1 delay prediction above 0.15 as high-delay-risk, and call them 'watch-level' if either fires. Suppliers with **both** flags are 'avoid' (hard-blocked downstream); suppliers with **either** flag are 'watch' (surcharged).`
 - Response: `is_unreliable` (1: B017), `has_high_delay_risk` (2: B003, B017), `is_watch_level` (2), `Demand.is_escalated` (9).
 
-### 4. Solve risk-adjusted flow
+### 6. Solve risk-adjusted flow
 
 - Prompt: `/rai-prescriptive-problem-formulation Find the minimum-cost shipping plan that fulfills all open demand. Don't ship from 'avoid' suppliers at all, add a $5/unit surcharge for 'watch' suppliers, prefer non-bottleneck sites, and charge $100/unit for any unmet demand.`
 - Response: MILP on `Operation.x_flow` + `Demand.x_unmet`; objective = transport + risk surcharge + centrality weight + unmet penalty.
 
-### 5. Quantify disruption scenarios
+### 7. Quantify disruption scenarios
 
 - Prompt: `/rai-prescriptive-solver-management + /rai-prescriptive-results-interpretation Re-solve with the highest-centrality site offline, and again with watch-level suppliers downgraded to avoid. What's the cost delta in each, and why are they asymmetric?`
 - Response: Baseline OPTIMAL $1,865 / 8 flows / 0 unmet; S004 offline +88.5%; watch->avoid +0.0% (B003 already off optimal lanes).
diff --git a/v1/telco_network_recovery/references/runbook.md b/v1/telco_network_recovery/references/runbook.md
index 9288476..9504ffa 100644
--- a/v1/telco_network_recovery/references/runbook.md
+++ b/v1/telco_network_recovery/references/runbook.md
@@ -37,37 +37,42 @@ across all 15 critical towers, prioritized by social blast radius.
 
 ## Workflow
 
-### 0. Discovery
+### 1. Build ontology
+
+- Prompt: `/rai-build-starter-ontology Build a telco network ontology from the CSVs in ../data/ covering cell towers, subscribers, call records, equipment health, network performance, time-series KPIs, and tower upgrade options.`
+- Response: Concepts: `CellTower`, `Subscriber`, `Contract`, `BillingEvent`, `CallDetailRecord`, `NetworkEquipment`, `EquipmentHealth`, `NetworkEvent`, `NetworkPerformance`, `TimeSeriesMetric`, `RevenueForecast`, `TowerUpgradeOption`, `Part`, `SupplierOrder`, `PostalArea`, `Campaign`, `PromotionRedemption` — bound to the bundled CSVs.
+
+### 2. Discovery
 
 - Prompt: `/rai-discovery WEST is missing revenue while every other region grows. What questions do we need to answer to figure out where to spend $5M to fix it?`
 - Response: Routes sub-questions to descriptive (revenue diagnosis), rules (critical-tower flag), graph (PageRank blast radius), predictive (demand forecast), prescriptive (tier-selection MIP + post-solve interpretation).
 
-### 1. Diagnose WEST
+### 3. Diagnose WEST
 
 - Prompt: `/rai-querying Where are we missing revenue targets? Which 10 cell towers have the worst average packet loss over 2024, and which region has the worst Q4 network availability?`
 - Response: WEST avail 94.6 vs 99.5 elsewhere; Q3-Q4 revenue −22% to −26% ($791K gap); 15 of 81 WEST towers DEGRADED at 8.1–8.9% packet loss; zero high-risk subs — operational, not retention.
 
-### 2. Flag critical-restore towers
+### 4. Flag critical-restore towers
 
 - Prompt: `/rai-rules-authoring Flag CellTowers as 'critical-restore' if region is WEST AND status is DEGRADED AND avg equipment health is below 0.85, OR if avg packet loss > 5% with health below 0.85.`
 - Response: `CellTower.is_critical_restore` fires on 15 WEST DEGRADED towers; 4 derived health metrics (`avg_packet_loss`, `avg_latency_ms`, `avg_error_rate`, `avg_health_score`) written to all 250 towers.
 
-### 3. Score subscriber blast radius
+### 5. Score subscriber blast radius
 
 - Prompt: `/rai-graph-analysis Who are our most socially influential subscribers based on call patterns? For each critical-restore tower, count the distinct subscribers whose calls route through it and rank by total PageRank influence — that's the blast radius if it fails.`
 - Response: `Subscriber.influence_score` (PageRank) on all 1,200 subs; `CellTower.weighted_impact` on 15 critical towers; 404 distinct subs (33% of base) route through a critical tower; TWR-0014 has the largest footprint (61 subs, 0.0502).
 
-### 4. Forecast regional demand
+### 6. Forecast regional demand
 
 - Prompt: `/rai-predictive-modeling + /rai-predictive-training Predict next-quarter subscriber-growth-rate per region using TimeSeriesMetric history. Bind each region's forecast back to its towers as a demand multiplier.`
 - Response: GNN node regression on 365d × 9 regions with 1-day-lag temporal edges; WEST multiplier 0.9998× (flat/slightly contracting); 8 other regions +0.45 to +0.91%/day; written to `CellTower.projected_demand_growth` for all 250 towers via region join.
 
-### 5. Optimize tier selection
+### 7.
Optimize tier selection - Prompt: `/rai-prescriptive-problem-formulation Recover WEST capacity within $5M and 200 install-weeks. For each critical-restore tower, pick at most one upgrade tier (BRONZE, SILVER, or GOLD) to maximize capacity restored, weighted by each tower's blast radius and the regional demand forecast.` - Response: OPTIMAL · 12 GOLD / 2 SILVER / 1 BRONZE · 122 Gbps restored · $4.96M of $5M (binding) · 164 of 200 install-weeks (slack) · all 15 towers covered. -### 6. Interpret the plan +### 8. Interpret the plan - Prompt: `/rai-prescriptive-results-interpretation Summarize the plan: total cost, capacity restored, tier mix, towers covered. Which constraint is binding, and what would relaxing it by 10-20% unlock?` - Response: Budget binds at $4.96M/$5M; flexing to $6M would promote TWR-0009 BRONZE→GOLD (+9 Gbps); install-weeks have 36-week slack; 404 service-affected subs drop to ~0 over the 4-month rollout. From c9984fff0b7451ba9e3bb3d9397bb5da7eb4c3f5 Mon Sep 17 00:00:00 2001 From: cafzal Date: Wed, 6 May 2026 10:03:54 -0700 Subject: [PATCH 11/22] Audit-pass corrections + rename Discovery step MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 5-runbook audit against template scripts and READMEs. Telco: fix concept list to the 9 the script defines (drop Contract/BillingEvent/etc. that are not in the script, add RegionMetric and TemporalEdge); sharpen Steps 2,3,4,6,7,8 prompts and responses; correct TWR-0009 BRONZE->GOLD delta to +5 Gbps (BRONZE=3, GOLD=8). Energy: sharpen Stage 4 graph prompt to ask for WCC + Louvain + centrality (script computes all three); reword "structurally constrained bottleneck" to clarify DFW is the binding capacity bottleneck specifically. 
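For reference, the Stage 4 graph pass this paragraph describes (WCC + Louvain communities + a combined centrality rank, top 3 flagged structurally critical) can be sketched outside the RAI stack with plain networkx. This is a toy illustration only: a made-up 6-substation grid stands in for the template's 12-substation data, and direct networkx calls stand in for `/rai-graph-analysis`:

```python
import networkx as nx
from networkx.algorithms import community

# Toy grid: 6 substations with hypothetical edges (NOT the template's data).
edges = [
    ("DFW", "Houston"), ("DFW", "Austin"), ("DFW", "Midland"),
    ("Houston", "SanAntonio"), ("Austin", "SanAntonio"),
    ("Midland", "Odessa"),
]
G = nx.Graph(edges)

# Connectivity check (WCC on an undirected graph = connected components).
n_components = nx.number_connected_components(G)

# Regional structure via Louvain community detection.
communities = community.louvain_communities(G, seed=42)

# Combined centrality rank over betweenness / degree / eigenvector.
metrics = [
    nx.betweenness_centrality(G),
    nx.degree_centrality(G),
    nx.eigenvector_centrality(G, max_iter=1000),
]

def rank(scores):
    # Map node -> rank position (0 = most central on this metric).
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {node: i for i, node in enumerate(ordered)}

# Sum the per-metric ranks; the 3 lowest totals are "structurally critical".
combined = {n: sum(rank(m)[n] for m in metrics) for n in G}
structurally_critical = sorted(combined, key=combined.get)[:3]
```

In the template the equivalent scores are persisted back to the ontology (`Substation.betweenness`, `Substation.grid_community`, `Substation.is_structurally_critical`) so the downstream rules and prescriptive stages can consume them; here they just live in local dicts.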
Supply chain: fix concept list to actual 7 (drop StockKeepingUnit /Inventory/BillOfMaterial that aren't Concepts, rename to SKU); sharpen Step 2 discovery prompt + response to enumerate the 5 chained reasoning steps; name the 6 SUPPLIER-typed upstream nodes in Step 3. Machine maintenance: fix concept list (drop TrainingOption — used as DataFrame, not Concept; add CertificationExpiry); correct x_assigned binary count from ~250 to 384 (96 qualified pairs x 4 periods). Portfolio: drop fictitious StockPair Concept (script uses binary property Stock.covar(Stock,Stock) instead); add Regime to the Stage 5 Concept callout; enumerate the 6 actual constraint families. Step rename: '### 2. Discovery' -> '### 2. Discover reasoner questions' across all five runbooks for clearer step labelling. --- v1/energy_grid_planning/references/runbook.md | 8 ++-- v1/machine_maintenance/references/runbook.md | 8 ++-- v1/portfolio_balancing/references/runbook.md | 6 +-- .../references/runbook.md | 10 ++--- .../references/runbook.md | 38 +++++++++---------- 5 files changed, 35 insertions(+), 35 deletions(-) diff --git a/v1/energy_grid_planning/references/runbook.md b/v1/energy_grid_planning/references/runbook.md index d4615f5..9a5d5ba 100644 --- a/v1/energy_grid_planning/references/runbook.md +++ b/v1/energy_grid_planning/references/runbook.md @@ -6,7 +6,7 @@ ERCOT processes 10 hyperscaler interconnection requests (2,930 MW) against a 12- ``` ERCOT has 10 hyperscaler interconnection requests totalling 2,930 MW -on a 12-substation grid with one structurally constrained bottleneck (DFW). +on a 12-substation grid where DFW is the binding capacity bottleneck. The chain produces a Pareto frontier across 5 budget levels — the knee at $300M unlocks 5 DCs (1,500 MW, $264M net value) including xAI Colossus. @@ -43,7 +43,7 @@ $300M unlocks 5 DCs (1,500 MW, $264M net value) including xAI Colossus. 
- Prompt: `/rai-build-starter-ontology Build an ontology for grid infrastructure planning from the CSVs in ../data/ covering substations, generators, transmission lines, demand forecasts, data center requests, and substation upgrades.` - Response: Concepts: `Substation`, `Generator`, `TransmissionLine`, `LoadZone`, `DemandPeriod`, `RenewableProfile`, `MaintenanceWindow`, `Customer`, `DataCenterRequest`, `SubstationUpgrade`, `DemandForecast`, `LoadHistory`, `DCAnnouncement` — bound to the bundled CSVs (12 substations, 10 DC requests, 18 transmission lines). -### 2. Discovery +### 2. Discover reasoner questions - Prompt: `/rai-discovery We have 10 hyperscaler interconnection requests against a 12-substation grid. Which to approve, which substation upgrades to fund, at what budget level?` - Response: Plan routing sub-questions to predictive, graph, rules, and prescriptive reasoners. @@ -55,8 +55,8 @@ $300M unlocks 5 DCs (1,500 MW, $264M net value) including xAI Colossus. ### 4. Find structural bottlenecks -- Prompt: `/rai-graph-analysis Which substations are most critical to power flow based on grid topology? Use centrality on the transmission graph, then flag the top 3 as structurally critical and persist the scores back to the ontology.` -- Response: 1 component, 3 Louvain communities; DFW, Houston, San Antonio flagged `is_structurally_critical`; 7 of 10 DC requests on critical nodes. +- Prompt: `/rai-graph-analysis Which substations are most critical to power flow based on grid topology? Check connectivity (WCC), regional structure (Louvain communities), and centrality (betweenness/degree/eigenvector); then flag the top 3 by combined centrality rank as structurally critical and persist the scores back to the ontology.` +- Response: 1 connected component, 3 Louvain communities (North Texas, West Texas, Gulf Coast); DFW, Houston, San Antonio flagged `is_structurally_critical`; 7 of 10 DC requests target critical nodes. ### 5. 
Screen DC requests diff --git a/v1/machine_maintenance/references/runbook.md b/v1/machine_maintenance/references/runbook.md index a48efa9..b42c9f4 100644 --- a/v1/machine_maintenance/references/runbook.md +++ b/v1/machine_maintenance/references/runbook.md @@ -43,10 +43,10 @@ all 3 Turbine techs sit in one city, a $3,200 fix away from resolved. ### 1. Build ontology -- Prompt: `/rai-build-starter-ontology Build a manufacturing maintenance ontology from the CSVs in ../data/ covering machines, technicians, qualifications, periods, sensor readings, failure predictions, downtime events, and production runs.` -- Response: Concepts: `Machine`, `Technician`, `Qualification`, `Period`, `MachinePeriod`, `TechnicianPeriod`, `TechnicianMachinePeriod`, `Sensor`, `SensorReading`, `FailurePrediction`, `DowntimeEvent`, `ProductionRun`, `PartsInventory`, `TrainingOption` — bound to the bundled CSVs (30 machines × 3 plants, 10 technicians, 4 periods). +- Prompt: `/rai-build-starter-ontology Build a manufacturing maintenance ontology from the CSVs in ../data/ covering machines, technicians, qualifications, periods, sensor readings, failure predictions, downtime events, production runs, parts inventory, and certification expiry.` +- Response: Concepts: `Machine`, `Technician`, `Qualification`, `Period`, `MachinePeriod`, `TechnicianPeriod`, `TechnicianMachinePeriod`, `Sensor`, `SensorReading`, `FailurePrediction`, `DowntimeEvent`, `ProductionRun`, `PartsInventory`, `CertificationExpiry` — bound to the bundled CSVs (30 machines × 3 plants, 10 technicians, 4 periods). `training_options.csv` is loaded as a DataFrame (read in Stage 4), not modeled as a concept. -### 2. Discovery +### 2. Discover reasoner questions - Prompt: `/rai-discovery We need to schedule preventive maintenance for 30 machines across 3 plants. 
Where does OEE alone mislead us, and what structural risks won't a pure optimizer surface?` - Response: Plan routing sub-questions to querying, graph, rules, prescriptive, and resilience skills. @@ -69,7 +69,7 @@ all 3 Turbine techs sit in one city, a $3,200 fix away from resolved. ### 6. Schedule maintenance - Prompt: `/rai-prescriptive-problem-formulation Schedule preventive maintenance for all 30 machines across 4 periods, capped at 5 jobs per period. Every overdue machine gets maintained by period 2, and each maintained machine needs a qualified technician. Minimize expected failure cost (weighted by criticality and centrality) plus labor and travel.` -- Response: 120 `x_maintain` + 120 `x_vulnerable` + ~250 `x_assigned` binaries; 5 constraint families; failure cost uses `predicted_fp × parts_cost × criticality × (1 + 2.0 × betweenness)`. +- Response: 120 `x_maintain` + 120 `x_vulnerable` + 384 `x_assigned` binaries (96 qualified tech×machine pairs × 4 periods); 5 constraint families (cumulative coverage, assignment-maintenance linkage, technician hours, parts/bay capacity, overdue deadline); failure cost uses `x_vulnerable × predicted_fp × parts_cost × criticality × (1 + 2.0 × betweenness)`. ### 7. Stress-test concentration diff --git a/v1/portfolio_balancing/references/runbook.md b/v1/portfolio_balancing/references/runbook.md index cc108ff..13df481 100644 --- a/v1/portfolio_balancing/references/runbook.md +++ b/v1/portfolio_balancing/references/runbook.md @@ -41,9 +41,9 @@ base at every lambda — without the cluster collapse, the gap would grow. ### 1. 
Build ontology - Prompt: `/rai-build-starter-ontology Build a portfolio ontology from the CSVs in ../data/ covering stocks, sectors, the covariance matrix, accounts, holdings, users, and transactions.` -- Response: Concepts: `Stock`, `Sector`, `StockPair` (compound id on stock_i / stock_j with covariance), `User`, `Account`, `Holding`, `Transaction`, plus the `Scenario` Concept used by Stage 3 — bound to the bundled CSVs (8 stocks, 64 covariance entries, 6 scenarios). +- Response: Concepts: `Stock` (with binary `Stock.covar(Stock, Stock)` property carrying covariance), `Sector`, `User`, `Account`, `Holding`, `Transaction` — bound to the bundled CSVs (8 stocks, 64 covariance entries). Stage 3 adds the `Regime` and `Scenario` Concepts (2 regimes x 3 budgets = 6 scenarios). -### 2. Discovery +### 2. Discover reasoner questions - Prompt: `/rai-discovery Our 8-stock book breaks compliance and concentrates risk. Rebuild it under Markowitz mean-variance with caps, deduplicate redundant bets via correlation clustering, and stress-test under crisis. What questions does each reasoner family handle?` - Response: Plan: rules for compliance flags, graph for correlation clustering + representatives, prescriptive QP indexed by Scenario, stress as regime-swap re-solve. @@ -61,7 +61,7 @@ base at every lambda — without the cluster collapse, the gap would grow. ### 5. Solve mean-variance frontier - Prompt: `/rai-prescriptive-problem-formulation Build a Markowitz mean-variance frontier across 6 scenarios = 3 budgets x 2 regimes. Position cap 30% of budget, sector cap 30%, only invest in cluster representatives. Show 7 points per frontier.` -- Response: 48 decision vars (8 stocks x 6 scenarios), 5 constraint families. Return-rate range [0.0634, 0.0840]. 7 solves x 6 scenarios = 42 `LOCALLY_SOLVED` portfolios via Ipopt. +- Response: 48 decision vars (`Stock.x_quantity`, 8 stocks x 6 scenarios; non-reps forced to 0). 
Constraint families: non-negativity, budget equality (sum = budget per scenario), position cap (30%), sector cap (30%), non-representative = 0, plus epsilon return-rate floor on sweep solves. Return-rate range [0.0634, 0.0840]. 7 solves x 6 scenarios = 42 `LOCALLY_SOLVED` portfolios via Ipopt. ### 6. Read the frontier diff --git a/v1/supply_chain_resilience/references/runbook.md b/v1/supply_chain_resilience/references/runbook.md index 85f8901..66e583d 100644 --- a/v1/supply_chain_resilience/references/runbook.md +++ b/v1/supply_chain_resilience/references/runbook.md @@ -38,17 +38,17 @@ watch->avoid downgrade = +0.0% (optimizer already routed around it). ### 1. Build ontology - Prompt: `/rai-build-starter-ontology Build a supply chain ontology from the CSVs in ../data/ covering sites, businesses, SKUs, shipping operations, demand orders, historical shipments, and quarterly delay predictions.` -- Response: Concepts: `Site`, `Business`, `StockKeepingUnit`, `Operation`, `Demand`, `Shipment`, `Inventory`, `BillOfMaterial`, `DelayPrediction` — bound to the bundled CSVs (31 sites, 31 businesses, 9 SKUs, 262 shipments). +- Response: Concepts: `Site`, `Business`, `SKU`, `Operation`, `Demand`, `Shipment`, `DelayPrediction` — bound to the bundled CSVs (31 sites, 31 businesses, 9 SKUs, 70 operations, 20 demands, 262 shipments, 36 delay predictions). -### 2. Discovery +### 2. Discover reasoner questions -- Prompt: `/rai-discovery We need a risk-adjusted routing plan. What's our exposure to each supplier, which sites are bottlenecks, which suppliers are unreliable, and what does the minimum-cost flow look like once those risks are priced in?` -- Response: Reasoner-routing plan covering Stages 0–3 (graph, rules, prescriptive). +- Prompt: `/rai-discovery We need a risk-adjusted routing plan. 
What's our exposure to each supplier, which sites are bottlenecks, which suppliers are unreliable, what does the minimum-cost flow look like once those risks are priced in, and how robust is that plan to disruptions?` +- Response: Reasoner-routing plan: (1) Graph reachability for upstream supplier exposure, (2) Graph centrality for hub identification, (3) Rules for supplier reliability classification, (4) Prescriptive MILP for risk-adjusted flow, (5) Scenario re-solves for disruption quantification. ### 3. Map upstream supplier exposure - Prompt: `/rai-graph-analysis If a key supplier goes offline, which downstream buyers and finished products are at risk? For each HIGH-priority customer, list the suppliers it transitively depends on through the shipment graph, with their reliability scores.` -- Response: `Business.is_high_priority_customer` (2 buyers: B008, B009); shared 6-supplier upstream cone. +- Response: `Business.is_high_priority_customer` (2 buyers: B008 MegaCorp Enterprise, B009 TechGiant Inc); each transitively depends on the same 6 SUPPLIER-typed upstream nodes (B015, B016, B017, B018, B019, B020). ### 4. Rank network hubs diff --git a/v1/telco_network_recovery/references/runbook.md b/v1/telco_network_recovery/references/runbook.md index 9504ffa..02e85b9 100644 --- a/v1/telco_network_recovery/references/runbook.md +++ b/v1/telco_network_recovery/references/runbook.md @@ -1,19 +1,19 @@ # Runbook: Telco WEST Recovery — Multi-Reasoner Walkthrough -A regional telco is bleeding $791K/quarter from WEST while every other region grows. No single reasoner can answer where to spend a $5M recovery budget: descriptive scopes the crisis, rules flag broken towers, graph weights them by social blast radius, predictive forecasts forward demand, and prescriptive composes all four signals into the upgrade plan. Each stage writes derived properties back to the same ontology that downstream stages consume. 
+WEST revenue collapsed ~29% in Q4 2024 (a ~$2.7M shortfall vs the other-regions average) while every other region held flat or grew. No single reasoner can answer where to spend a $5M recovery budget: descriptive scopes the crisis, rules flag broken towers, graph weights them by social blast radius, predictive forecasts forward demand, and prescriptive composes all four signals into the upgrade plan. Each stage writes derived properties back to the same ontology that downstream stages consume. ## The chain ``` -WEST is bleeding $791K/quarter from a network operations crisis. +WEST Q4 revenue is down ~29% (~$2.7M gap vs the other-regions avg). The chain produces a $5M plan that recovers 122 Gbps capacity across all 15 critical towers, prioritized by social blast radius. ───────────────────────────────────────────────────────────────── - STAGE 1 Descriptive ──► WEST: Q3-Q4 revenue −22% to −26%, + STAGE 1 Descriptive ──► WEST: Q4 revenue −29% vs H1 baseline, avail 94.6 vs 99.5, 15 of 81 DEGRADED. - Retention angle? No — 0 high-risk - subs; this is operational. + Retention angle? No — this is + operational, not subscriber churn. ───────────────────────────────────────────────────────────────── STAGE 2 Rules ──► CellTower.is_critical_restore (15) 4 derived health metrics + a compound @@ -39,23 +39,23 @@ across all 15 critical towers, prioritized by social blast radius. ### 1. Build ontology -- Prompt: `/rai-build-starter-ontology Build a telco network ontology from the CSVs in ../data/ covering cell towers, subscribers, call records, equipment health, network performance, time-series KPIs, and tower upgrade options.` -- Response: Concepts: `CellTower`, `Subscriber`, `Contract`, `BillingEvent`, `CallDetailRecord`, `NetworkEquipment`, `EquipmentHealth`, `NetworkEvent`, `NetworkPerformance`, `TimeSeriesMetric`, `RevenueForecast`, `TowerUpgradeOption`, `Part`, `SupplierOrder`, `PostalArea`, `Campaign`, `PromotionRedemption` — bound to the bundled CSVs. 
+- Prompt: `/rai-build-starter-ontology Build a telco network ontology from the eight CSVs in ../data/: cell_towers, network_equipment, equipment_health, network_performance, subscribers, call_detail_records, tower_upgrade_options, time_series_metrics. The time-series file has one row per (date, region) — model that as a composite-key concept and add a same-region 1-day-lag edge concept to support temporal GNN message passing downstream.` +- Response: Concepts: `CellTower`, `NetworkEquipment`, `EquipmentHealth`, `NetworkPerformance`, `Subscriber`, `CallDetailRecord` (edge concept: caller → callee, routed_through tower), `TowerUpgradeOption` (composite key tower_id+tier), `RegionMetric` (composite key metric_date+region), `TemporalEdge` (composite key src_date+src_region+dst_date+dst_region) — all bound to the bundled CSVs. -### 2. Discovery +### 2. Discover reasoner questions -- Prompt: `/rai-discovery WEST is missing revenue while every other region grows. What questions do we need to answer to figure out where to spend $5M to fix it?` -- Response: Routes sub-questions to descriptive (revenue diagnosis), rules (critical-tower flag), graph (PageRank blast radius), predictive (demand forecast), prescriptive (tier-selection MIP + post-solve interpretation). +- Prompt: `/rai-discovery WEST is missing revenue targets while every other region grows. We have a $5M capex budget and 200 install crew-weeks to allocate to tower upgrades. 
Which RAI reasoners do we need, in what order, to land on a defensible upgrade plan grounded in the available data (towers, subscribers, calls, equipment health, performance, daily KPIs, and tiered upgrade options)?`
+- Response: Plans the 5-reasoner chain on the shared ontology — descriptive (`/rai-querying`) to scope the WEST crisis and rule out a retention angle; rules (`/rai-rules-authoring`) to flag critical-restore towers; graph (`/rai-graph-analysis`) to score subscriber influence and aggregate per-tower blast radius; predictive (`/rai-predictive-modeling` + `/rai-predictive-training`) to forecast per-region growth and bind it as a per-tower demand multiplier; prescriptive (`/rai-prescriptive-problem-formulation` + `/rai-prescriptive-results-interpretation`) to compose all three signals into the tier-selection MIP and explain the binding constraint.

### 3. Diagnose WEST

-- Prompt: `/rai-querying Where are we missing revenue targets? Which 10 cell towers have the worst average packet loss over 2024, and which region has the worst Q4 network availability?`
-- Response: WEST avail 94.6 vs 99.5 elsewhere; Q3-Q4 revenue −22% to −26% ($791K gap); 15 of 81 WEST towers DEGRADED at 8.1–8.9% packet loss; zero high-risk subs — operational, not retention.
+- Prompt: `/rai-querying Compare quarterly DAILY_REVENUE_USD by region. Which region has the worst Q4 2024 network availability? Show the WEST cell tower fleet broken down by status, and the average packet loss for the DEGRADED ones.`
+- Response: WEST Q4 avail 94.6% vs 99.5% in every other region; WEST Q4 revenue $6.6M vs ~$9.0–9.5M everywhere else (≈$2.7M Q4 deficit, −29% vs WEST's own H1 baseline); 81 WEST towers split into 49 ACTIVE / 17 MAINTENANCE / 15 DEGRADED, with per-tower average packet loss across the DEGRADED set ranging from 7.6% to 10.3% (median ~8.2%). Subscriber-churn signals stay flat — this is an operational network failure, not retention.

### 4. 
Flag critical-restore towers -- Prompt: `/rai-rules-authoring Flag CellTowers as 'critical-restore' if region is WEST AND status is DEGRADED AND avg equipment health is below 0.85, OR if avg packet loss > 5% with health below 0.85.` -- Response: `CellTower.is_critical_restore` fires on 15 WEST DEGRADED towers; 4 derived health metrics (`avg_packet_loss`, `avg_latency_ms`, `avg_error_rate`, `avg_health_score`) written to all 250 towers. +- Prompt: `/rai-rules-authoring First derive per-tower averages for packet loss, latency, error rate (from NetworkPerformance) and average equipment health (via NetworkEquipment → EquipmentHealth). Then flag CellTower.is_critical_restore on either of two branches: (1) region == WEST AND status == DEGRADED AND avg_health_score < 0.85, OR (2) region == WEST AND avg_packet_loss > 5% AND avg_health_score < 0.85 (catches ACTIVE-but-failing).` +- Response: 4 derived health properties (`avg_packet_loss`, `avg_latency_ms`, `avg_error_rate`, `avg_health_score`) computed for all 250 towers via `aggs.avg(...).per(CellTower)`. The two-branch `CellTower.is_critical_restore` relationship fires on 15 towers — all 15 are WEST + DEGRADED + health < 0.85, so Branch 1 alone produces the same set, but Branch 2 is kept as a guard against ACTIVE-but-failing failure modes. ### 5. Score subscriber blast radius @@ -64,18 +64,18 @@ across all 15 critical towers, prioritized by social blast radius. ### 6. Forecast regional demand -- Prompt: `/rai-predictive-modeling + /rai-predictive-training Predict next-quarter subscriber-growth-rate per region using TimeSeriesMetric history. Bind each region's forecast back to its towers as a demand multiplier.` -- Response: GNN node regression on 365d × 9 regions with 1-day-lag temporal edges; WEST multiplier 0.9998× (flat/slightly contracting); 8 other regions +0.45 to +0.91%/day; written to `CellTower.projected_demand_growth` for all 250 towers via region join. 
+- Prompt: `/rai-predictive-modeling + /rai-predictive-training Train a regression GNN on RegionMetric (one row per date+region) to predict next-quarter SUBSCRIBER_GROWTH_RATE per region. Use TemporalEdge (same-region 1-day lag) for message passing, region as a category feature, and lag features (prev-day, prev-week, 7-day mean) as continuous inputs. Train < 2024-11-01, validate on Nov, test on Dec. Mean each region's Dec predictions, convert to 1+x multiplier, and bind back to CellTower.projected_demand_growth via region.` +- Response: GNN node regression on 365d × 9 regions with same-region 1-day-lag temporal edges; per-region mean of the Dec test predictions yields WEST multiplier ≈0.9998× (flat/slightly contracting) while the 8 other regions sit at +0.45% to +0.91%/day. The multiplier is loaded into a `RegionGrowth` concept and joined to `CellTower.projected_demand_growth` via region — populating all 250 towers (CellTower covers 5 regions; the other 4 RegionMetric regions are forecast but have no towers to bind to). ### 7. Optimize tier selection -- Prompt: `/rai-prescriptive-problem-formulation Recover WEST capacity within $5M and 200 install-weeks. For each critical-restore tower, pick at most one upgrade tier (BRONZE, SILVER, or GOLD) to maximize capacity restored, weighted by each tower's blast radius and the regional demand forecast.` -- Response: OPTIMAL · 12 GOLD / 2 SILVER / 1 BRONZE · 122 Gbps restored · $4.96M of $5M (binding) · 164 of 200 install-weeks (slack) · all 15 towers covered. +- Prompt: `/rai-prescriptive-problem-formulation Build a tower-upgrade MIP scoped to options where TowerUpgradeOption.for_tower(CellTower) AND CellTower.is_critical_restore(). Decision variable TowerUpgradeOption.selected is binary, keyed by (tower_id, tier). Constraints: at most one tier per tower, total cost ≤ $5M, total install_weeks ≤ 200. 
Maximize sum(selected · capacity_increase_gbps · CellTower.weighted_impact · CellTower.projected_demand_growth) — three coefficients, one from each upstream stage.`
+- Response: Status OPTIMAL with all 15 critical towers covered (one tier each). Tier mix: 12 GOLD / 2 SILVER / 1 BRONZE. Total capacity restored 122 Gbps. Total cost $4,956,843 of the $5M budget (binding). Total install crew-weeks 164 of 200 (slack). The tier mix skews toward GOLD because GOLD's cost per restored Gbps stays competitive once each Gbps is weighted by `weighted_impact` and `projected_demand_growth` in the objective.

### 8. Interpret the plan

- Prompt: `/rai-prescriptive-results-interpretation Summarize the plan: total cost, capacity restored, tier mix, towers covered. Which constraint is binding, and what would relaxing it by 10-20% unlock?`
-- Response: Budget binds at $4.96M/$5M; flexing to $6M would promote TWR-0009 BRONZE→GOLD (+9 Gbps); install-weeks have 36-week slack; 404 service-affected subs drop to ~0 over the 4-month rollout.
+- Response: Budget binds at $4.96M/$5M (only $43K of headroom); flexing the budget to $6M unlocks the TWR-0009 BRONZE→GOLD swap (+5 Gbps for ~$395K incremental cost). Install-weeks have 36 weeks of slack (164/200) so crew capacity is not the bottleneck. All 15 critical towers are covered, so the 404 service-affected subscribers identified by the graph stage are addressed within the rollout window.

## Data

From c20345871c6b5e999363966f71c0143268623791 Mon Sep 17 00:00:00 2001
From: cafzal
Date: Wed, 6 May 2026 10:45:20 -0700
Subject: [PATCH 12/22] Move runbook.md to template root

Runbooks live alongside the template script, README, and data/ directory now (was under references/). Updated relative paths inside each runbook from ../data/ -> data/ and ../