diff --git a/.ai b/.ai deleted file mode 160000 index 3efad83e..00000000 --- a/.ai +++ /dev/null @@ -1 +0,0 @@ -Subproject commit 3efad83e7b63bfefda429796b15e84b7cd64b806 diff --git a/.ai-repo/.framework-sync-sha b/.ai-repo/.framework-sync-sha deleted file mode 100644 index 9b1b53f7..00000000 --- a/.ai-repo/.framework-sync-sha +++ /dev/null @@ -1 +0,0 @@ -3efad83e7b63bfefda429796b15e84b7cd64b806 diff --git a/.ai-repo/README.md b/.ai-repo/README.md deleted file mode 100644 index 8187d146..00000000 --- a/.ai-repo/README.md +++ /dev/null @@ -1,42 +0,0 @@ -# Project-Specific AI Configuration - -This directory holds AI framework extensions specific to **this repository**. -The shared framework lives in `.ai/` (submodule). This directory is repo-local. - -## Structure - -``` -.ai-repo/ -├── config/ ← structured repo-owned artifact layout config -├── skills/ ← project-specific skill checklists -├── rules/ ← project-specific conventions and constraints -└── README.md ← you are here -``` - -## Config - -The canonical repo-owned artifact layout lives in `.ai-repo/config/artifact-layout.json`. -It defines the effective roadmap path, epic spec filename, milestone spec path template, -tracking doc path template, completed epic archive path, and naming patterns for this repo. - -Generated assistant surfaces should mirror the resolved values from this file; they should not be the source of truth for layout. - -## Skills - -Add a `.md` file in `skills/` with a checklist format (see `.ai/skills/` for examples). -Project skills are automatically picked up by `bash .ai/setup.sh` and distributed -to platform adapters (Copilot, Claude, Codex). 
- -Examples of project-specific skills: -- `deploy-to-azure.md` — deployment runbook for this project -- `run-pipeline.md` — how to trigger the CI/CD pipeline -- `data-export.md` — how to export fixture data from production - -## Rules - -Add a `.md` file in `rules/` for project-specific conventions: -- `tech-stack.md` — "Use xUnit, NSubstitute, .NET 9" -- `naming.md` — "Services use I{Name} interface pattern" -- `testing.md` — "All tests use [Theory] for parameterized cases" - -These are referenced by platform adapters so the AI reads them automatically. diff --git a/.ai-repo/config/artifact-layout.json b/.ai-repo/config/artifact-layout.json deleted file mode 100644 index 3341ee0c..00000000 --- a/.ai-repo/config/artifact-layout.json +++ /dev/null @@ -1,10 +0,0 @@ -{ - "roadmapPath": "ROADMAP.md", - "epicRootPath": "work/epics/", - "epicSpecFileName": "spec.md", - "milestoneSpecPathTemplate": "work/epics//.md", - "trackingDocPathTemplate": "work/epics//-tracking.md", - "completedEpicPathTemplate": "work/epics/completed//", - "epicIdPattern": "E-{NN}", - "milestoneIdPattern": "m-E{NN}--" -} \ No newline at end of file diff --git a/.ai-repo/config/commit.json b/.ai-repo/config/commit.json deleted file mode 100644 index 6aaad37c..00000000 --- a/.ai-repo/config/commit.json +++ /dev/null @@ -1,7 +0,0 @@ -{ - "coAuthors": { - "claude": "Claude Opus 4.7 (1M context) ", - "copilot": "GitHub Copilot ", - "codex": "OpenAI Codex " - } -} diff --git a/.ai-repo/rules/project.md b/.ai-repo/rules/project.md deleted file mode 100644 index 2520bd8f..00000000 --- a/.ai-repo/rules/project.md +++ /dev/null @@ -1,126 +0,0 @@ -# FlowTime Project Rules - -Project-specific conventions for the FlowTime mono-repo (Engine + Sim + UI). - ---- - -## Tooling - -- Prefer precise edits; stick to established patterns and avoid broad refactors without context. -- Use `rg`/`fd` for searches. 
- -## Project Layout - -- `src/FlowTime.Core`, `src/FlowTime.API`, `src/FlowTime.Cli`, `src/FlowTime.Contracts`, `src/FlowTime.Adapters.Synthetic` — Engine surface. -- `src/FlowTime.Sim.Core`, `src/FlowTime.Sim.Service`, `src/FlowTime.Sim.Cli` — Simulation surface. -- `src/FlowTime.UI`, `src/FlowTime.UI.Tests` — Blazor WebAssembly UI. -- `tests/` mirrors project names (e.g., `tests/FlowTime.Core.Tests`, `tests/FlowTime.Sim.Tests`, `tests/FlowTime.Api.Tests`). -- `docs/` — Engine/shared documentation. `docs-sim/` is archived — ignore unless explicitly requested. -- `work/` — AI framework housekeeping: epics, epic-local milestone specs/tracking docs, gaps, decisions. - -## Workflow Artifact Layout - -- The canonical artifact layout for this repo is defined in `.ai-repo/config/artifact-layout.json`. -- Older `*-log.md` files are historical and may remain until the related epic/docs are actively migrated. -- `work/milestones/` is a compatibility stub only. Do not create active specs or logs there. -- `ROADMAP.md` is the framework roadmap path. -- `work/epics/epic-roadmap.md` can remain as a supplemental epic index/sequencing document while it is still useful. - -## Milestone Status Sync - -- Milestone start and wrap must reconcile status across all repo-owned status surfaces in one pass: milestone spec, milestone tracking doc, epic milestone table (`work/epics//spec.md`), `ROADMAP.md`, `work/epics/epic-roadmap.md` when it mentions the epic, and `CLAUDE.md` current work. -- Do not leave an earlier milestone marked `in-progress` or `pending` once a later milestone in the same epic has started on a continuation branch. -- Treat status-surface drift as a workflow bug, not optional housekeeping. - -## Coding Conventions - -- .NET 9 / C# 13 with implicit usings and nullable enabled. -- Private fields **must use camelCase without a leading underscore** (`private readonly string dataDirectory;`). 
-- Follow existing patterns before introducing new abstractions; check for shared contracts in `FlowTime.Contracts`. -- Keep CLI ↔ API behaviour aligned; update relevant `.http` examples and docs when changing endpoints. -- Use invariant culture for parsing/formatting; keep tests deterministic. -- JSON payloads and schemas use camelCase — do not introduce snake_case fields. -- Never reintroduce deprecated schema fields (e.g., `binMinutes`); current schema uses `{ bins, binSize, binUnit }`. -- When inserting inline code containing `|` inside Markdown tables, escape the pipe as `\|`. - -## Branching & Versioning - -- Epic integration branches use `epic/E-{NN}-`. Every numbered epic gets an integration branch; milestone branches branch from it and merge back into it. -- Milestone branches use `milestone/`. -- Feature branches use `feature/-/` when a milestone needs parallel work. -- Single-surface quick changes can branch from `main` and PR directly back to `main` when no milestone integration branch is needed. -- Conventional commits: `feat(api):`, `fix(sim):`, `docs:`, etc. -- Commit messages: conventional prefix, no icons/emoji; subject + short bullet body capturing the milestone and key work/tests touched. -- Version format `..[-pre]`; milestone completions typically bump minor (e.g., `0.6.0 → 0.7.0`). -- Release notes in `docs/releases/` with milestone-based naming (e.g., `SIM-M2.7-v0.6.0.md`). - -## Build & Run - -- `dotnet build FlowTime.sln` / `dotnet test FlowTime.sln` -- VS Code tasks: `build`, `test`, `start-api`, `stop-api`, `start-sim-api`, `stop-sim-api`, `start-ui`, `stop-ui` -- Engine API: `dotnet run --project src/FlowTime.API` → port 8081 -- Sim API: `dotnet run --project src/FlowTime.Sim.Service` with `ASPNETCORE_URLS=http://0.0.0.0:8090` -- UI: `dotnet run --project src/FlowTime.UI` -- Default ports: 8081 (Engine API), 8090 (Sim API), 5219/7047 (UI), 8091 (Sim diagnostics), 5091 (Engine dev profile) -- Build and test before handing work back. 
- -## Devcontainer Port Safety - -- **Never blindly kill all processes on port 8081** — the devcontainer port-forwarder listens there; killing it destroys the session. -- To free port 8081, filter by process name: only kill `dotnet` processes. -- Use the `kill-port-8081` VS Code task — it filters safely. -- Verify processes before killing: `lsof -ti:PORT`, `ps aux | grep`. Use `pkill -f "ProcessName"` or `lsof -ti:PORT | xargs -r kill -TERM`. Never `kill `. -- Send SIGTERM first, wait, then SIGKILL only if still alive. Never start with `kill -9`. - -## Testing - -- Unit tests: fast and deterministic; no network or file-system side effects. -- API tests: use `WebApplicationFactory`; prefer real dependencies over mocks. -- Sim tests: `tests/FlowTime.Sim.Tests` — covers CLI, template parsing, provenance, service behaviours. -- Integration tests: `tests/FlowTime.Integration.Tests` for cross-surface scenarios. - -### UI testing (hard rule) - -- **UI work must be eval'd end-to-end in a real browser.** Every milestone that ships new or changed UI (Blazor or Svelte) must include Playwright tests that drive the feature in a real browser and verify the rendered outcome. Type checks and unit tests on pure helpers are necessary but not sufficient — they do not catch broken event handlers, state leaks, reactive glitches, or CSS-driven breakage. The user experience is the contract; a passing test must prove the user experience works. -- **Playwright infrastructure lives at `tests/ui/`** with config at `tests/ui/playwright.config.ts`, specs under `tests/ui/specs/`, and helpers under `tests/ui/helpers/`. Add new specs alongside existing ones. -- **Graceful skip when infrastructure is down.** If the API or dev server isn't running, the spec should skip with a clear message rather than fail. Follow the existing pattern used by the Rust engine integration tests (health probe → skip on unavailable). -- **Svelte UI runs on port 5173** (vite dev). **Blazor UI runs on port 5219**. 
Override `baseURL` per-spec if needed. -- **Vitest covers pure logic.** Svelte/TS pure functions (helpers, store derivations, protocol encoding) should have vitest unit tests in `ui/src/**/*.test.ts`. These run fast and guard the foundation; Playwright guards the integration. -- **Cover the critical paths.** For each user-facing surface: (1) page loads and renders expected initial state, (2) at least one user interaction drives a visible change, (3) reset/undo/error-recovery paths behave correctly, (4) key latency or correctness metrics that the UI exposes actually display correct values. - -## Truth Discipline - -### Precedence (highest to lowest) -1. **Code + passing tests** define live truth. -2. **`work/decisions.md`** defines approved direction. -3. **Epic specs and epic-local milestone specs** under `work/epics/` define implementation target, within their scope. -4. **Architecture docs** (`docs/`) summarize and connect the above — they never outrank code or decisions. -5. **Historical and exploration docs** are context only — never implementation authority. - -If code, decisions.md, and an architecture doc disagree, do not choose arbitrarily. Report the mismatch and ask. - -### Truth classes -- **`docs/`** — current ground truth. If it's in `docs/`, it describes what IS (shipped, provable by code/tests). -- **`work/epics/`** — decided-next and exploration. Proposals, specs, architectural direction for future work. -- **`docs/archive/`**, **`docs/releases/`** — historical. What WAS. Do not use for current state. -- **`docs/notes/`** — exploration only. Brainstorming, research, ideas. Never treat as implementation authority. - -### Guards -- Do not describe a target contract in present tense unless it is live. -- Do not let one file simultaneously act as current reference and historical archive. -- Do not restate a canonical contract in many places from memory — point to the owning doc. 
-- Do not let adapter/UI projection become the only place where semantics exist. -- Do not keep "temporary" compatibility shims without explicit deletion criteria. -- When a milestone explicitly owns a bridge or cleanup seam, do not preserve the bridge helper past that milestone as a tolerated coexistence state. Treat the surviving helper as incomplete work. -- Do not reconstruct semantic or analytical identity in adapters or clients from `kind`, `logicalType`, file stems, or similar heuristics when compiled/runtime facts can own that truth. -- When a runtime boundary changes, prefer forward-only regeneration of runs, fixtures, and approved outputs over compatibility readers that recover missing facts. -- Do not keep both a bridge abstraction and its compiled replacement once the replacement milestone is active unless the spec explicitly allows a coexistence window. -- "API stability" does not mean "keep old functions around." When a function has no production callers after a refactor, delete it and its tests in the same change — do not retain it as a dead alternative entry point under the banner of keeping the existing surface stable. -- Do not treat aspirational docs as implementation authority. - -## Documentation - -- Engine + shared docs in `docs/`; use Mermaid for diagrams (not ASCII art). -- Keep docs/schemas/templates aligned when touching contracts or schemas. -- Repository language: English. No time or effort estimates in docs or plans. -- Treat sibling checkouts as read-only references unless the user instructs otherwise. diff --git a/.ai-repo/scratch/.gitignore b/.ai-repo/scratch/.gitignore deleted file mode 100644 index 3727f436..00000000 --- a/.ai-repo/scratch/.gitignore +++ /dev/null @@ -1,3 +0,0 @@ -# Scratch dir — contents never tracked (see .ai/paths.md → Scratch Dir). 
-* -!.gitignore diff --git a/.claude/settings.json b/.claude/settings.json index 5e348272..dca38b0d 100644 --- a/.claude/settings.json +++ b/.claude/settings.json @@ -1,23 +1,6 @@ { - "statusLine": { - "type": "command", - "command": "bash .claude/statusline.sh" - }, - "hooks": { - "SessionStart": [ - { - "matcher": "startup", - "hooks": [ - { - "type": "command", - "command": "bash .ai/tools/scratch-audit.sh" - }, - { - "type": "command", - "command": "[ -x .ai-repo/bin/wf-graph ] && .ai-repo/bin/wf-graph report --graph work/graph.yaml --status 2>/dev/null || true" - } - ] - } - ] + "enabledPlugins": { + "aiwf-extensions@ai-workflow-rituals": true, + "wf-rituals@ai-workflow-rituals": true } } diff --git a/.claude/skills/dead-code-audit/SKILL.md b/.claude/skills/dead-code-audit/SKILL.md new file mode 100644 index 00000000..2e5cebcd --- /dev/null +++ b/.claude/skills/dead-code-audit/SKILL.md @@ -0,0 +1,263 @@ +--- +name: dead-code-audit +description: Recipe-driven dead-code report at milestone close. Two paths — `bootstrap` (no recipe found; detect stacks, ask the human to pick a tool, write `.claude/skills/dead-code-audit/recipes/dead-code-.md`, exit) and `audit` (one or more recipes found; for each, run the recipe's `toolCmd` over the milestone change-set, apply judgement, sweep for tool-blind findings, emit a per-stack section to `work/dead-code-report.md`). Soft signal — never mutates code, never fails the build, always exits 0 from the audit path. Invoked from `wrap-milestone` as a non-blocking step. +--- + +# Skill: Dead Code Audit + +End-of-milestone soft-signal pass that surfaces unused symbols, orphan fixtures, ADRs describing reverted decisions, helpers retained "for stability" with no callers, and deprecated aliases. KISS v0: **recipe-driven, tool-grounded, LLM-orchestrated**. + +The skill is generic; per-stack audit profiles ("recipes") describe how to audit one stack — which tool to invoke, which paths to exclude, which blind spots the LLM should watch. 
Recipes are first-class and live at `.claude/skills/dead-code-audit/recipes/dead-code-*.md`. A repo can have one or many recipes; polyglot repos run them sequentially in one audit. + +**Soft-signal contract:** never mutates code. Never fails the build. Always exits 0 from the audit path. Findings turn into `work/gaps.md` entries by hand if real — auto-gap-filing is out of scope for v0. + +## When to Use + +- Wrap-milestone step (automatic, non-blocking). +- On demand to scan the current milestone change-set for dead code. +- First-time setup in a repo that has no recipes yet — the bootstrap conversation produces them. + +## Two paths, one entry point + +``` +human invokes /-dead-code-audit + │ + ▼ +skill checks .claude/skills/dead-code-audit/recipes/dead-code-*.md + │ + ├─ none found ──▶ BOOTSTRAP path (write recipes, exit; no audit) + │ + └─ one or more ──▶ AUDIT path (run each recipe sequentially, write report) +``` + +The two paths are intentionally separated. Bootstrap is a conversation that produces the recipe as its primary deliverable — humans review the recipe before any audit runs against it. After bootstrap exits, the human re-invokes the skill to perform the first audit. + +## Path A — Bootstrap (no recipes found) + +Triggered when `.claude/skills/dead-code-audit/recipes/dead-code-*.md` matches zero files. + +### 1. Detect stacks present in the repo + +Scan for stack signals at known paths. Treat each hit as evidence of one stack: + +| Signal | Stack | +|---|---| +| `*.csproj`, `*.sln` | `.NET` | +| `pyproject.toml`, `setup.py`, `requirements*.txt` | Python | +| `Cargo.toml` | Rust | +| `package.json` (with TypeScript signal: `tsconfig.json`) | TypeScript | +| `package.json` (no `tsconfig.json`) | JavaScript | +| `go.mod` | Go | +| `pom.xml`, `build.gradle*` | Java/JVM | +| `Gemfile` | Ruby | +| `composer.json` | PHP | + +Add stacks the team explicitly mentions even if signal is absent (e.g. an external `.proto` schema repo). + +### 2. 
Present detected stacks; ask which to audit + +Show the list. Ask: "Which stacks should this repo audit for dead code? (one or many)". Capture the human's answer. + +### 3. For each chosen stack, propose 2–3 tool options + +Each option is one short sentence with the **tradeoff that matters for picking** — install cost, scope (cross-project vs. file-local), strictness, false-positive rate. Don't recommend a winner; the human picks. + +Reference table for common stacks (extend as needed): + +| Stack | Option A | Option B | Option C | +|---|---|---|---| +| .NET | **Roslynator** — runs in `dotnet build`; misses cross-project unused publics; zero install cost. | **JetBrains InspectCode** (free CLI) — strongest cross-project unused-public-symbol detection; ~100 MB to install. | **`dotnet format analyzers`** — IDE0051/IDE0052 only; weakest of the three; zero install cost. | +| Python | **vulture** — fast, low-noise; misses dynamic dispatch. | **pyflakes** (via flake8) — already in many toolchains; only file-local unused. | **deadcode** (newer) — broader sweep, more false positives. | +| Rust | **`cargo udeps`** — unused crate dependencies; nightly-only. | **clippy `dead_code` lint** — built-in; file-local. | **`cargo machete`** — unused dependencies; stable toolchain. | +| TypeScript | **knip** — broadest (unused files, exports, deps). | **ts-prune** — unused exports; tsconfig-aware. | **eslint `no-unused-vars` + `import/no-unused-modules`** — already in any TS toolchain; weak. | +| Go | **`staticcheck`** (U1000) — unused identifiers; standard. | **`deadcode`** (golang.org/x) — reachability analysis; main-package-aware. | **`go vet`** — minimal; stdlib only. | +| Java | **PMD `UnusedPrivate*`** — fast; file-local. | **IntelliJ Inspect (CLI)** — cross-project; heavy install. | **SpotBugs** — broader bug surface; partial overlap. | + +For stacks not in the table, propose 2–3 options based on what the language ecosystem actually offers — don't invent tools. + +### 4. 
Capture the human's choice for each stack + +One choice per stack. Confirm before writing. + +### 5. Write the recipe(s) + +For each chosen stack, write `.claude/skills/dead-code-audit/recipes/dead-code-.md` using the recipe shape below. Populate frontmatter with the human's choice; populate the body with stack-specific blind-spot hints (the table in step 3 of the audit path lists the standard blind-spot families to seed). + +Create the `.claude/skills/dead-code-audit/recipes/` directory if absent. + +### 6. Exit + +Print: `Recipes written. Re-run /-dead-code-audit to perform the audit.` — and stop. **Do not audit on the same invocation.** The two-step shape is deliberate: it keeps the bootstrap conversation reviewable and recovers cleanly if the wrong stack got picked. + +## Path B — Audit (one or more recipes found) + +For each recipe in `.claude/skills/dead-code-audit/recipes/dead-code-*.md`, run the four steps below sequentially. Polyglot repos produce one report with multiple per-stack sections. + +### 1. Resolve the milestone change-set + +Compute the file list touched on the milestone branch since it diverged from its base (epic branch, integration branch, or `main` — whichever the wrap path uses). Use `git diff --name-only ...HEAD`. If invoked outside a wrap context with no obvious base, default to `git diff --name-only main...HEAD` and note the scope assumption in the report header. + +### 2. Filter to recipe-relevant files + +Apply the recipe's `fileExts` filter, then exclude any path matching the recipe's `excludePaths`. If the filtered set is empty, write a per-stack section noting "no files in scope this milestone" and move on to the next recipe. + +### 3. Invoke the recipe's `toolCmd`; capture output + +Run `toolCmd` exactly as written in the recipe frontmatter. Capture stdout, stderr, and exit code. The tool may emit XML, JSON, plain text, or SARIF — read the recipe body for any format hints. 
**Tool failure is a finding, not a wrap blocker** — emit a per-stack section noting the failure (with the captured stderr) and continue to the next recipe. + +### 4. Apply judgement and sweep for tool-blind findings + +Read the tool output alongside the recipe body's blind-spot hints. Produce four classes of finding per recipe: + +- **confirmed-dead-suspects** — `file:line` + reason. Tool flagged it; the change-set evidence supports the flag (no live caller, no DI registration, no fixture reference, no public-surface rationale). +- **tool-flagged-but-live** — tool flagged it; grep / structural read showed a live caller. Cite the caller (``) so the next reviewer can verify in seconds. +- **intentional-public-surface** — tool flagged it; the symbol is part of a documented public surface (cross-repo consumer, exported API, schema-derived contract). Cite the surface and rationale. +- **needs-judgement** — the LLM cannot decide. Flag for human triage with the specific question that needs answering. + +Then sweep for findings the tools structurally cannot see. Standard blind-spot families (recipe body should expand these per stack): + +- **Orphan fixtures** — fixture files with no test referencing them. +- **Stale ADRs** — decision records describing reverted decisions (search for symbols / paths cited in the ADR; if absent in the change-set's HEAD tree, flag). +- **Helpers retained "for stability"** — exported helpers with zero callers in the change-set's HEAD tree, with a comment or commit history indicating "kept for compat" / "kept for stability." +- **Deprecated aliases** — symbols tolerated by parsers / emitters but not flagged by the tool because they parse cleanly. +- **Schema fields with no consumers** — schema / DTO / proto fields that no producer or consumer code references. + +### 5. Emit the per-stack section to `work/dead-code-report.md` + +Overwrite `work/dead-code-report.md` on each run. The report has one header section, then one section per recipe. 
Each recipe section has the four finding classes plus the blind-spot sweep results. See [Output format](#output-format) below. + +### 6. Exit 0 + +Always. The report is the output; downstream tooling reads it. + +## Recipe shape + +`.claude/skills/dead-code-audit/recipes/dead-code-.md`. One recipe per stack. ~20–40 lines. + +**Required frontmatter:** + +| Field | Type | Meaning | +|---|---|---| +| `name` | string | Stack name (matches the filename suffix). Used in report section headers. | +| `fileExts` | array of strings | File extensions to include (e.g. `[.cs]`, `[.ts, .tsx]`). | +| `tool` | string | The binary or command to invoke (display name, e.g. `roslynator`, `knip`). | +| `toolCmd` | string | The actual invocation, including args and output capture (e.g. `"roslynator analyze FlowTime.sln --severity-level info --output /tmp/roslynator.xml"`). | + +**Optional frontmatter:** + +| Field | Type | Meaning | +|---|---|---| +| `excludePaths` | array of strings | Glob-ish path prefixes to exclude (e.g. `[obj/, bin/, "*.g.cs"]`). | + +**Body (free-text):** hints used as LLM prompt context. Recommended sections: + +- **Things to look out for in this stack** — DI tricks, runtime-resolved dispatch, source generators, reflection, fixture-discovery conventions, codegen. +- **Public surface notes** — which directories or types are part of an external contract; cross-repo consumers; exported APIs. +- **Tool-specific notes** — high-signal codes / rules to weight; noisy codes to suppress. + +## Worked example — `.claude/skills/dead-code-audit/recipes/dead-code-dotnet.md` + +This is the canonical reference recipe. First-time bootstrap can use it as a template. 
+ +~~~markdown +--- +name: dotnet +fileExts: [.cs] +excludePaths: [obj/, bin/, "*.g.cs", .claude/worktrees/] +tool: roslynator +toolCmd: "roslynator analyze FlowTime.sln --severity-level info --output /tmp/roslynator.xml" +--- + +# Dead-code recipe: .NET (Roslynator) + +## Things to look out for in this stack +- DI registrations using string keys or type-by-name (`AddSingleton`, `AddScoped`) +- xUnit `[Theory]` / `[MemberData]` / `[ClassData]` discovery — callers are runtime-resolved +- Source generators producing callers (look for `[GeneratedCode]` consumers) +- Reflection-based instantiation (`Activator.CreateInstance`, JSON polymorphic deserialization) + +## Public surface notes +`FlowTime.Contracts` is consumed by sibling repo `flowtime-sim-vnext` — treat its public types as live unless cross-repo grep confirms no callers. + +## Tool-specific notes +Roslynator's `RCS1213` (unused private member) and `RCS1170` (read-only auto property) are the highest-signal codes. Suppress `RCS1163` (unused parameter) — too noisy on event handlers and DI-injected dependencies. +~~~ + +## Output format — `work/dead-code-report.md` + +Overwritten each audit run. Section order is fixed; per-recipe sections appear in alphabetical order of recipe `name`. + +```markdown +# Dead-code Audit — {YYYY-MM-DD} +**Scope:** milestone change-set since `` ( files) +**Recipes:** dotnet, typescript +**Tool exits:** dotnet (ok), typescript (ok) + +## Recipe: dotnet + +### Confirmed-dead suspects +- `src/Foo/Bar.cs:42` — `private bool _legacyFlag` set but never read; introduced 2025-12 for compat with retired migration path. +- `src/Foo/Helpers/RetryShim.cs:1-87` — public class flagged by Roslynator RCS1213; no callers in change-set HEAD tree; CHANGELOG never released a public retry API. + +### Tool-flagged-but-live +- `src/Pack/PackLoader.cs:118` `LoadPack(string)` — flagged unused; live caller at `tests/Pack/PackLoaderTests.cs:34` via xUnit `[Theory]` data source. 
+ +### Intentional public surface +- `src/Contracts/IPackEntry.cs:12-40` — `IPackEntry` flagged as unreferenced; consumed by sibling repo `flowtime-sim-vnext` (see recipe public-surface notes). + +### Needs judgement +- `src/Migration/V03Mapper.cs` — referenced only by V02→V03 migration test fixtures. Question for human: V02 retired in M-PACK-A-01; is V03Mapper still needed? + +### Blind-spot sweep +- **Orphan fixtures:** `tests/fixtures/legacy-pack-v01.json` — no test references it. +- **Stale ADRs:** `docs/decisions/ADR-0023.md` cites `Foo.LegacyResolver` which no longer exists in the tree. +- **Schema fields with no consumers:** `Pack.schema.cs:Manifest.deprecatedHint` — no producer or consumer. + +## Recipe: typescript +… +``` + +Section omitted when empty (e.g. no `Tool-flagged-but-live` items). The report's Markdown shape is stable — downstream tooling can grep for `### Confirmed-dead suspects`. + +## Integration with `wrap-milestone` + +`wrap-milestone` invokes this skill as a non-blocking step (see `wrap-milestone.md`). The integration shape: + +- **Bootstrap state** (no recipes): wrap surfaces a one-liner — *"dead-code audit not configured — run `/-dead-code-audit` to set up recipes"* — and continues. The wrap is not blocked. +- **Audit produced a report**: wrap adds a one-line link from the milestone tracking doc to `work/dead-code-report.md`. The wrap is not blocked. +- **Audit produced no report** (e.g. all recipes had empty filter sets): wrap notes "dead-code audit: no files in scope" and continues. + +The skill never produces a wrap gate. Findings flow through human review at PR time. + +## What's not in v0 (explicitly deferred) + +Listed here so consumers don't expect them. Each is filed as an enhancement-class follow-up if real evidence demands it. + +- **Mechanical-vs-semantic two-pass design.** v0 is one orchestrated pass per recipe. +- **`full` mode (solution-wide audit).** v0 is scoped to the milestone change-set only. 
+- **CI integration.** CI hygiene is a separate per-repo concern, not part of this skill. +- **Auto-gap-filing into `work/gaps.md`.** Humans file gaps if needed. +- **Cost caps, sampling, file-density ranking.** No `max-files` / `max-tokens` enforcement. +- **`wrap-epic` integration.** v0 hooks into `wrap-milestone` only. +- **Per-recipe parallelism.** Recipes run sequentially. +- **Auto-detection of stacks at audit time.** Stack detection happens at bootstrap. Re-run bootstrap if the repo's stack mix changes. +- **Multi-language pre-bundled starter recipes.** Bootstrap conversation is the v0 onboarding. +- **Per-project recipes within a stack.** A monorepo with three .NET projects gets one `dead-code-dotnet.md`; per-project exclusion goes in `excludePaths`. + +## Anti-patterns + +- Auditing on the same invocation that wrote the recipe — bootstrap exits before auditing on purpose. Re-run. +- Running with no recipe and a forced `--audit` flag — the skill takes no flags; bootstrap is the only thing that happens when no recipes exist. +- Hand-editing `work/dead-code-report.md` — it's overwritten on every audit run. Findings worth keeping go to `work/gaps.md`. +- Treating the report as a build gate — it's a soft signal. The whole point. +- Adding a recipe that names a tool the repo doesn't have on PATH — the audit step will fail and emit a `tool-failed` section instead of producing useful findings. Install the tool, or amend the recipe to name an installed alternative. + +## Invocation + +``` +/-dead-code-audit +``` + +(Replace `` with the framework skill prefix — `wf` by default.) + +No arguments. The skill picks bootstrap or audit based on the presence of recipes. 
diff --git a/.ai-repo/recipes/dead-code-dotnet.md b/.claude/skills/dead-code-audit/recipes/dead-code-dotnet.md similarity index 100% rename from .ai-repo/recipes/dead-code-dotnet.md rename to .claude/skills/dead-code-audit/recipes/dead-code-dotnet.md diff --git a/.ai-repo/recipes/dead-code-rust.md b/.claude/skills/dead-code-audit/recipes/dead-code-rust.md similarity index 100% rename from .ai-repo/recipes/dead-code-rust.md rename to .claude/skills/dead-code-audit/recipes/dead-code-rust.md diff --git a/.ai-repo/recipes/dead-code-typescript.md b/.claude/skills/dead-code-audit/recipes/dead-code-typescript.md similarity index 100% rename from .ai-repo/recipes/dead-code-typescript.md rename to .claude/skills/dead-code-audit/recipes/dead-code-typescript.md diff --git a/.ai-repo/skills/devcontainer.md b/.claude/skills/devcontainer/SKILL.md similarity index 97% rename from .ai-repo/skills/devcontainer.md rename to .claude/skills/devcontainer/SKILL.md index 3bcc34b4..65910da9 100644 --- a/.ai-repo/skills/devcontainer.md +++ b/.claude/skills/devcontainer/SKILL.md @@ -1,3 +1,8 @@ +--- +name: devcontainer +description: "> **This is archived content, not an active framework skill.** The active skill at `.ai/skills/devcontainer.md` is a 40-line generic core. If your repo relies on the detail below (Dockerfile templates" +--- + # Long-form devcontainer reference (archived) > **This is archived content, not an active framework skill.** The active skill at `.ai/skills/devcontainer.md` is a 40-line generic core. If your repo relies on the detail below (Dockerfile templates, stack-specific blocks, worktree topology), copy the relevant sections into your own `.ai-repo/skills/devcontainer.md` — that file overrides the framework skill for this repo. 
@@ -293,4 +298,4 @@ Do NOT put the worktree inside the main workspace — it duplicates the entire r | Tool caches | `/home/vscode/.cache/` | safe (costs re-download time) | | Build artifacts | `_build/`, `node_modules/`, `.venv/` | safe (costs rebuild time) | | Runtime data | `runtime/data/` (or project equivalent) | **not safe** — this is durable local state | -| Git worktrees | `/workspaces/worktrees/` | **not safe** — contains work in progress | +| Git worktrees | `/workspaces/worktrees/` | **not safe** — contains work in progress | \ No newline at end of file diff --git a/.ai-repo/skills/ui-debug.md b/.claude/skills/ui-debug/SKILL.md similarity index 64% rename from .ai-repo/skills/ui-debug.md rename to .claude/skills/ui-debug/SKILL.md index 097259af..6cbebf92 100644 --- a/.ai-repo/skills/ui-debug.md +++ b/.claude/skills/ui-debug/SKILL.md @@ -1,3 +1,8 @@ +--- +name: ui-debug +description: "Diagnose UI issues quickly and reproducibly." +--- + # Skill: ui-debug Diagnose UI issues quickly and reproducibly. @@ -10,4 +15,4 @@ Diagnose UI issues quickly and reproducibly. 
- [ ] Check for an existing Playwright test — run or update it before manual fixes - [ ] Prefer Playwright tests for deterministic reproduction - [ ] If updating snapshots, note why and keep diffs minimal -- [ ] Avoid external network calls; mock or stub where needed +- [ ] Avoid external network calls; mock or stub where needed \ No newline at end of file diff --git a/.devcontainer/devcontainer-lock.json b/.devcontainer/devcontainer-lock.json new file mode 100644 index 00000000..0227e8ee --- /dev/null +++ b/.devcontainer/devcontainer-lock.json @@ -0,0 +1,14 @@ +{ + "features": { + "ghcr.io/devcontainers/features/github-cli:1": { + "version": "1.1.0", + "resolved": "ghcr.io/devcontainers/features/github-cli@sha256:d22f50b70ed75339b4eed1ba9ecde3a1791f90e88d37936517e3bace0bbad671", + "integrity": "sha256:d22f50b70ed75339b4eed1ba9ecde3a1791f90e88d37936517e3bace0bbad671" + }, + "ghcr.io/devcontainers/features/node:1": { + "version": "1.7.1", + "resolved": "ghcr.io/devcontainers/features/node@sha256:8c0de46939b61958041700ee89e3493f3b2e4131a06dc46b4d9423427d06e5f6", + "integrity": "sha256:8c0de46939b61958041700ee89e3493f3b2e4131a06dc46b4d9423427d06e5f6" + } + } +} \ No newline at end of file diff --git a/.devcontainer/devcontainer.json b/.devcontainer/devcontainer.json index ef320808..05eaa041 100644 --- a/.devcontainer/devcontainer.json +++ b/.devcontainer/devcontainer.json @@ -92,6 +92,6 @@ }, "remoteEnv": { "NODE_OPTIONS": "--dns-result-order=ipv4first", - "PATH": "${containerEnv:HOME}/.cargo/bin:${containerEnv:HOME}/.dotnet/tools:${containerEnv:HOME}/.local/bin:${containerEnv:PATH}" + "PATH": "/home/vscode/.cargo/bin:/home/vscode/.dotnet/tools:/home/vscode/.local/bin:/home/vscode/go/bin:/usr/local/go/bin:${containerEnv:PATH}" } } diff --git a/.devcontainer/init.sh b/.devcontainer/init.sh index 187e85bc..27c5cbf1 100755 --- a/.devcontainer/init.sh +++ b/.devcontainer/init.sh @@ -37,6 +37,30 @@ if ! 
command -v roslynator >/dev/null 2>&1; then export PATH="$HOME/.dotnet/tools:$PATH" fi +# Install Go (avoids the devcontainer Go feature, which fails on the .NET base +# image's stale yarn apt source — NO_PUBKEY 62D54FD4003F6525) +if ! command -v go >/dev/null 2>&1; then + echo "Installing Go..." + GO_VERSION=1.22.10 + curl -fsSL "https://go.dev/dl/go${GO_VERSION}.linux-amd64.tar.gz" \ + | sudo tar -C /usr/local -xz + export PATH="/usr/local/go/bin:$PATH" +fi + +# Install aiwf (AI workflow framework v3, branch-tip pin during PoC). +# `go install` rejects branch names containing slashes, so resolve the branch +# tip to a commit SHA via git ls-remote and install that. +if ! command -v aiwf >/dev/null 2>&1; then + echo "Installing aiwf..." + export PATH="$HOME/go/bin:/usr/local/go/bin:$PATH" + aiwf_sha=$(git ls-remote https://github.com/23min/ai-workflow-v2.git refs/heads/poc/aiwf-v3 | awk '{print $1}') + if [ -z "$aiwf_sha" ]; then + echo "Failed to resolve aiwf branch tip" >&2 + exit 1 + fi + go install "github.com/23min/ai-workflow-v2/tools/cmd/aiwf@${aiwf_sha}" +fi + echo "Restoring solution..." dotnet restore >/dev/null diff --git a/.gitignore b/.gitignore index 27554eda..490c1cbb 100644 --- a/.gitignore +++ b/.gitignore @@ -45,24 +45,19 @@ apis/ .claude/*.lock .claude/worktrees/ -# Generated assistant adapter surfaces (rebuild locally with `bash .ai/sync.sh`) +# Generated assistant adapter surfaces (legacy v1 paths; aiwf v3 manages +# .claude/skills/aiwf-*/ via the aiwf binary and ships role agents through +# the aiwf-extensions plugin cache rather than committed paths). # CLAUDE.md is intentionally tracked as shared workspace context. .github/copilot-instructions.md .github/skills/ .claude/agents/ -.claude/skills/ .claude/rules/ai-framework.md .codex/ # Git worktrees (Claude Code agent isolation worktrees, etc.) 
— never commit .claude/worktrees/ -# wf-graph scan input — regenerated by `wf-graph scan`; only graph.yaml is the canonical artefact -work/graph-scan.json - -# wf-graph advisory lock file — flock target left after each mutation; not a sentinel -work/graph.yaml.lock - # Rider/JetBrains .idea/ *.sln.iml @@ -100,11 +95,7 @@ coverage/ local.settings.json .scratch/ -# Generated assistant adapter surfaces (rebuild locally with `bash .ai/sync.sh`) -.github/copilot-instructions.md -.github/skills/ -.claude/rules/ai-framework.md -.claude/statusline.sh -.codex/instructions.md -.ai-repo/.framework-sync-sha -.ai-repo/bin/ + +# aiwf: materialized skill adapters (regenerated by aiwf update) +.claude/skills/.aiwf-owned +.claude/skills/aiwf-*/ diff --git a/.gitmodules b/.gitmodules index cf34cc55..647a0a1f 100644 --- a/.gitmodules +++ b/.gitmodules @@ -1,6 +1,3 @@ -[submodule ".ai"] - path = .ai - url = https://github.com/23min/ai-workflow [submodule "lib/dag-map"] path = lib/dag-map url = https://github.com/23min/dag-map.git diff --git a/CLAUDE.md b/CLAUDE.md index 285c7f2f..54f46dcf 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,113 +1,66 @@ # CLAUDE.md - -This project uses the AI Framework v2 at `.ai/`. Follow its agents, skills, and rules. +This project uses **aiwf v3** (Go binary at `~/go/bin/aiwf`) plus the **ai-workflow-rituals** plugin marketplace (Claude Code plugins `aiwf-extensions` + `wf-rituals`). The planning kernel lives in 6 entity kinds: epic, milestone, ADR, gap, decision, contract — each with a closed status set, stable ids that survive rename/reallocate, and `git log` as the audit trail. -Despite the filename, this file is shared workspace context for both Claude and GitHub Copilot in this repo. -Keep it assistant-neutral. -Generated assistant adapter files under `.github/` and `.claude/` are local build outputs by default. -Keep their source of truth in `.ai/` and `.ai-repo/`, and regenerate them locally with `bash .ai/sync.sh`. 
+Despite the filename, this file is shared workspace context for both Claude and GitHub Copilot in this repo. Keep it assistant-neutral. ## Session Start -At the start of every session, pick up accumulated context: +For project state, run `aiwf status` (one-screen snapshot of in-flight work, open decisions, gaps, recent activity). `STATUS.md` is the same data auto-rendered. Recently-shipped epic context lives in each epic's `wrap.md`. -- `work/decisions.md` — shared decision log across all agents -- `work/agent-history/.md` — your role's accumulated learnings (read only the file matching the active agent) -- `work/gaps.md` — deferred work items -- `## Current Work` section below — active epic + milestone +After `/compact` or a fresh session, this file is re-available via the system prompt. Say **"refresh context"** to re-read everything. -After `/compact` or a fresh session, this file is re-available via the system prompt. If rules or project config changed mid-session, say **"refresh context"** to trigger a full re-read (see generated adapter files for the full refresh checklist). - -## Hard Rules (summary — full rules in `.ai/rules.md`) +## Hard Rules - **NEVER commit or push without explicit human approval** — "continue" / "ok" do not count - **TDD by default** for logic, API, and data code — red → green → refactor -- **Branch coverage (hard rule)** — every reachable conditional branch needs a test before declaring done; perform a line-by-line audit before the commit-approval prompt, not after the human asks -- **Identify the agent first** — read the agent file, adopt its role, follow its skill +- **TDD phase tracking** — logic-bearing ACs in `draft` and `in_progress` milestones carry a `tdd_phase: red|green|refactor` field, advanced via `aiwf promote M-NNN/AC-N --phase
`. Non-logic ACs (doc-only, gap-closure, full-suite gates, branch-coverage audit, process discipline) omit the field. Every red-tagged AC must reach green before the milestone wraps. +- **Branch coverage** — every reachable conditional branch needs a test before declaring done; perform a line-by-line audit before the commit-approval prompt - **Branch discipline** — do NOT commit milestone work directly to `main` -- **Update CLAUDE.md Current Work** after starting or wrapping a milestone -- Conventional Commits format: `feat:`, `fix:`, `chore:`, `docs:`, `test:`, `refactor:` +- Conventional Commits format: `feat(api):`, `fix(sim):`, `chore:`, `docs:`, `test:`, `refactor:` — no icons/emoji; subject + short bullet body capturing the milestone and key work/tests touched ## Agent Routing -| Intent | Agent | Read first | -|--------|-------|------------| -| build, implement, code, start, fix, patch | **builder** | `.claude/agents/builder.md` + relevant skill | -| plan, design, scope, epic, architecture | **planner** | `.claude/agents/planner.md` + relevant skill | -| review, check, validate, wrap, finish | **reviewer** | `.claude/agents/reviewer.md` + relevant skill | -| release, deploy, tag, publish | **deployer** | `.claude/agents/deployer.md` + relevant skill | - -## Framework Sources - -| Source | Purpose | -|--------|---------| -| `.ai/rules.md` | Full rules and enforcement levels | -| `.ai/paths.md` | Artifact layout defaults | -| `.ai/agents/` | Agent source definitions (generated into `.claude/agents/` — read those at invocation time) | -| `.ai/skills/` | Skill source workflows (generated into `.claude/skills/` and `.github/skills/`) | -| `.ai/templates/` | Document templates | -| `.ai-repo/rules/` | **Project-specific rules** — read before starting work | -| `.ai-repo/config/` | Artifact layout overrides | -| `.ai-repo/skills/` | Project-specific skills | - -## Resolved Artifact Layout - -These values are resolved from framework defaults in .ai/paths.md and repo 
overrides in .ai-repo/config/artifact-layout.json. - -| Field | Value | Purpose | -|-------|-------|---------| -| `roadmapPath` | `ROADMAP.md` | High-level roadmap path | -| `epicRootPath` | `work/epics/` | Root directory containing epic folders | -| `epicSpecFileName` | `spec.md` | Default epic spec filename inside each epic folder | -| `milestoneSpecPathTemplate` | `work/epics//.md` | Milestone spec location template | -| `trackingDocPathTemplate` | `work/epics//-tracking.md` | Milestone tracking doc location template | -| `completedEpicPathTemplate` | `work/epics/completed//` | Completed epic archive template | -| `epicIdPattern` | `E-{NN}` | Epic ID naming pattern | -| `milestoneIdPattern` | `m-E{NN}--` | Milestone ID naming pattern | -| `adrPath` | `docs/decisions/` | Architecture Decision Records folder (filename `NNNN-.md`) | -| `adrTemplatePath` | `.ai/templates/adr.md` | Repo-neutral ADR template used by `architect` and `wrap-epic` | -| `frameworkSkillPrefix` | `wf` | Prefix for framework skill slash-commands (e.g. `/wf-patch`) | -| `repoSkillPrefix` | `` | Prefix for repo-specific skill slash-commands (e.g. `/wf-li-app-legibility`) | -| `scratchPath` | `.ai-repo/scratch/` | Per-work-unit scratch dir; cleaned up at wrap. | -| `scratchAuditThresholdMb` | `100` | Size threshold (MB) for SessionStart scratch-audit warning. | -| `contractSurfaces` | `not configured` | Opt-in: enables `doc-lint`'s uncovered-contract-surface check. See `.ai/paths.md`. | - -## Project-Specific Rules - -# FlowTime Project Rules - -Project-specific conventions for the FlowTime mono-repo (Engine + Sim + UI). - ---- - -## Tooling - -- Prefer precise edits; stick to established patterns and avoid broad refactors without context. -- Use `rg`/`fd` for searches. +Role agents ship via the `aiwf-extensions` plugin (loaded into Claude Code from the plugin cache). 
+ +| Intent | Agent | Drives | +|--------|-------|--------| +| build, implement, code, start, fix, patch | **builder** | `aiwfx-start-milestone` → `wf-tdd-cycle` → `aiwfx-wrap-milestone`; `wf-patch` for one-offs; `aiwfx-record-decision` mid-implementation | +| plan, design, scope, epic, architecture | **planner** | `aiwfx-plan-epic`, `aiwfx-plan-milestones`, `aiwfx-record-decision` | +| review, check, validate, wrap, finish | **reviewer** | `wf-review-code`, `wf-doc-lint`, `aiwfx-record-decision` | +| release, deploy, tag, publish | **deployer** | `aiwfx-release`; `wf-patch` for hotfixes between wrap and tag; `aiwfx-record-decision` | + +`aiwfx-wrap-epic` is shipped by the plugin but unclaimed by any agent's skill list; in practice the reviewer drives it (its description includes "verifies milestone or epic wrap"). After a milestone wrap, builder/reviewer should also invoke the repo-private `dead-code-audit` skill (the upstream `aiwfx-wrap-milestone` skill does not chain it). ## Project Layout +**Source tree** + - `src/FlowTime.Core`, `src/FlowTime.API`, `src/FlowTime.Cli`, `src/FlowTime.Contracts`, `src/FlowTime.Adapters.Synthetic` — Engine surface. - `src/FlowTime.Sim.Core`, `src/FlowTime.Sim.Service`, `src/FlowTime.Sim.Cli` — Simulation surface. - `src/FlowTime.UI`, `src/FlowTime.UI.Tests` — Blazor WebAssembly UI. -- `tests/` mirrors project names (e.g., `tests/FlowTime.Core.Tests`, `tests/FlowTime.Sim.Tests`, `tests/FlowTime.Api.Tests`). +- `ui/` — SvelteKit + shadcn-svelte frontend. +- `tests/` mirrors project names (e.g., `tests/FlowTime.Core.Tests`, `tests/FlowTime.Sim.Tests`, `tests/FlowTime.Api.Tests`); UI Playwright at `tests/ui/`. - `docs/` — Engine/shared documentation. `docs-sim/` is archived — ignore unless explicitly requested. -- `work/` — AI framework housekeeping: epics, epic-local milestone specs/tracking docs, gaps, decisions. 
-## Workflow Artifact Layout +**Planning tree** (canonical layout defined by `aiwf.yaml`; per-kind frontmatter via `aiwf schema`) -- The canonical artifact layout for this repo is defined in `.ai-repo/config/artifact-layout.json`. -- Older `*-log.md` files are historical and may remain until the related epic/docs are actively migrated. -- `work/milestones/` is a compatibility stub only. Do not create active specs or logs there. -- `ROADMAP.md` is the framework roadmap path. -- `work/epics/epic-roadmap.md` can remain as a supplemental epic index/sequencing document while it is still useful. +| Path | Purpose | +|------|---------| +| `aiwf.yaml` | aiwf consumer config | +| `work/epics/E-NN-/` | epic dirs (active and done), with `epic.md` + `M-NNN-.md` milestones | +| `work/decisions/D-NNN-.md` | per-decision entities | +| `work/gaps/G-NNN-.md` | per-gap entities | +| `work/contracts/C-NNN-/` | contract entities | +| `docs/adr/ADR-NNNN-.md` | ADRs | +| `work/archived-epics//` | pre-aiwf historical epics (no E-NN id; out of aiwf's walked roots) | +| `.claude/skills/aiwf-*/` | gitignored, materialized by `aiwf init` / `aiwf update` | -## Milestone Status Sync +`aiwf` verbs: `init`, `update`, `upgrade`, `add `, `promote`, `cancel`, `rename`, `reallocate`, `move`, `check`, `history`, `status`, `show `, `render roadmap`, `doctor`, `import`, `schema`, `template`, `whoami`, `authorize`, `contract verify|bind|unbind|recipes|recipe show|recipe install|recipe remove`. Run `aiwf help` for the full list. Don't edit entity frontmatter status by hand — use `aiwf promote` so the FSM check + commit trailer happen. Use `aiwf promote/cancel --audit-only --reason "..."` to backfill an audit trail for a state already reached via a manual commit (empty-diff audit commit, no FSM transition). 
-- Milestone start and wrap must reconcile status across all repo-owned status surfaces in one pass: milestone spec, milestone tracking doc, epic milestone table (`work/epics//spec.md`), `ROADMAP.md`, `work/epics/epic-roadmap.md` when it mentions the epic, and `CLAUDE.md` current work. -- Do not leave an earlier milestone marked `in-progress` or `pending` once a later milestone in the same epic has started on a continuation branch. -- Treat status-surface drift as a workflow bug, not optional housekeeping. +Provenance: human verbs need no extra flags. Non-human actors (ai/..., bot/...) must pass `--principal human/` and operate inside an active `aiwf authorize --to ` scope; the kernel adds `aiwf-principal:`, `aiwf-on-behalf-of:`, and `aiwf-authorized-by:` trailers automatically. `aiwf authorize` is human-only. + +Tracking docs (per the `aiwfx-track` skill) are advisory free-form markdown alongside a milestone spec; not aiwf entities, not validated. Older `*-log.md` / `*-tracking.md` files in `work/archived-epics/` are pre-aiwf residue. ## Coding Conventions @@ -119,6 +72,7 @@ Project-specific conventions for the FlowTime mono-repo (Engine + Sim + UI). - JSON payloads and schemas use camelCase — do not introduce snake_case fields. - Never reintroduce deprecated schema fields (e.g., `binMinutes`); current schema uses `{ bins, binSize, binUnit }`. - When inserting inline code containing `|` inside Markdown tables, escape the pipe as `\|`. +- Prefer precise edits; stick to established patterns and avoid broad refactors without context. Use `rg`/`fd` for searches. ## Branching & Versioning @@ -126,8 +80,6 @@ Project-specific conventions for the FlowTime mono-repo (Engine + Sim + UI). - Milestone branches use `milestone/`. - Feature branches use `feature/-/` when a milestone needs parallel work. - Single-surface quick changes can branch from `main` and PR directly back to `main` when no milestone integration branch is needed. 
-- Conventional commits: `feat(api):`, `fix(sim):`, `docs:`, etc. -- Commit messages: conventional prefix, no icons/emoji; subject + short bullet body capturing the milestone and key work/tests touched. - Version format `..[-pre]`; milestone completions typically bump minor (e.g., `0.6.0 → 0.7.0`). - Release notes in `docs/releases/` with milestone-based naming (e.g., `SIM-M2.7-v0.6.0.md`). @@ -137,8 +89,9 @@ Project-specific conventions for the FlowTime mono-repo (Engine + Sim + UI). - VS Code tasks: `build`, `test`, `start-api`, `stop-api`, `start-sim-api`, `stop-sim-api`, `start-ui`, `stop-ui` - Engine API: `dotnet run --project src/FlowTime.API` → port 8081 - Sim API: `dotnet run --project src/FlowTime.Sim.Service` with `ASPNETCORE_URLS=http://0.0.0.0:8090` -- UI: `dotnet run --project src/FlowTime.UI` -- Default ports: 8081 (Engine API), 8090 (Sim API), 5219/7047 (UI), 8091 (Sim diagnostics), 5091 (Engine dev profile) +- Blazor UI: `dotnet run --project src/FlowTime.UI` → port 5219 +- Svelte UI: `cd ui && npm run dev` → port 5173 +- Default ports: 8081 (Engine API), 8090 (Sim API), 5173 (Svelte), 5219/7047 (Blazor), 8091 (Sim diagnostics), 5091 (Engine dev profile) - Build and test before handing work back. ## Devcontainer Port Safety @@ -169,12 +122,12 @@ Project-specific conventions for the FlowTime mono-repo (Engine + Sim + UI). ### Precedence (highest to lowest) 1. **Code + passing tests** define live truth. -2. **`work/decisions.md`** defines approved direction. +2. **Decision entities** (`work/decisions/D-NNN-*.md`) and **ADRs** (`docs/adr/ADR-*.md`) define approved direction. 3. **Epic specs and epic-local milestone specs** under `work/epics/` define implementation target, within their scope. 4. **Architecture docs** (`docs/`) summarize and connect the above — they never outrank code or decisions. 5. **Historical and exploration docs** are context only — never implementation authority. 
-If code, decisions.md, and an architecture doc disagree, do not choose arbitrarily. Report the mismatch and ask. +If code, decisions, and an architecture doc disagree, do not choose arbitrarily. Report the mismatch and ask. ### Truth classes - **`docs/`** — current ground truth. If it's in `docs/`, it describes what IS (shipped, provable by code/tests). @@ -201,68 +154,3 @@ If code, decisions.md, and an architecture doc disagree, do not choose arbitrari - Keep docs/schemas/templates aligned when touching contracts or schemas. - Repository language: English. No time or effort estimates in docs or plans. - Treat sibling checkouts as read-only references unless the user instructs otherwise. - - -## Current Work - - -**Active focus:** none — E-21 Svelte Workbench & Analysis Surfaces closed and merged to main 2026-05-01 (wrap artefact: `work/epics/completed/E-21-svelte-workbench-and-analysis/wrap.md`). Awaiting next-epic decision. - -**Open question:** what next? Two engine-side gaps filed during E-21 dogfooding (see `work/gaps.md`) lean toward sequencing a testing-rigor / engine-investigation milestone before E-22 Model Fit. Alternatively: E-15 Telemetry Ingestion is the long-pole for the client-telemetry vision; E-22 depends on E-15 + Telemetry Loop & Parity per D-2026-04-15-032 Option A. - -> **Note:** the catalog below is a historical trail kept manually; convention prefers a narrow narrative-only Current Work section. The catalog exceeds the 15-line guideline — slated for trim during a future cleanup pass; not in scope for this milestone. - -- **E-17** Interactive What-If Mode — **completed and merged to main (2026-04-12).** Archived to `work/epics/completed/E-17-interactive-what-if-mode/`. - - 6 milestones. WebSocket bridge → parameter panel → topology heatmap → warnings → edge heatmap → time scrubber. 200 vitest + 26 Playwright E2E. 
-- **E-18** Time Machine (`work/epics/E-18-headless-pipeline-and-optimization/spec.md`) — **in-progress** — foundation + analysis layer delivered; Fit + Chunked + SDK carried forward as **E-22**. - - Headless engine: parameterized evaluation → streaming protocol → pipeline component. - - **m-E18-01** (complete): Parameterized evaluation — ParamTable, evaluate_with_params, compile-once eval-many. - - **m-E18-02** (complete): Engine session + streaming protocol — persistent process, MessagePack over stdin/stdout. - - **m-E18-07** (complete): `FlowTime.TimeMachine` project created; `FlowTime.Generator` deleted (Path B, no coexistence window). - - **m-E18-06** (complete): Tiered validation — `TimeMachineValidator` (schema/compile/analyse), `POST /v1/validate`, Rust `validate_schema` session command. - - **m-E18-08** (complete): `ITelemetrySource` interface + `CanonicalBundleSource` + `FileCsvSource`. 23 tests. - - **m-E18-09** (complete): Parameter sweep — `SweepSpec`/`SweepRunner`/`ConstNodePatcher`, `IModelEvaluator`/`RustModelEvaluator`, `POST /v1/sweep`. 35 tests. - - **m-E18-10** (complete): Sensitivity analysis — `ConstNodeReader`, `SensitivitySpec`/`SensitivityRunner` (central difference), `POST /v1/sensitivity`. 39 tests. - - **m-E18-11** (complete): Goal seeking — `GoalSeekSpec`/`GoalSeeker` (bisection), `POST /v1/goal-seek`. 33 tests. - - **m-E18-12** (complete): Optimization — `OptimizeSpec`/`Optimizer` (Nelder-Mead, N params), `POST /v1/optimize`. 29 unit + 10 API tests. - - **m-E18-13** (complete, merged to epic 2026-04-15): `SessionModelEvaluator` — persistent `flowtime-engine session` subprocess, MessagePack over stdin/stdout, compile-once/eval-many. `RustEngine:UseSession` config switch (default true); `RustModelEvaluator` retained as fallback. DI lifetime moved Singleton → Scoped. 44 new tests (32 unit + 8 integration + 4 API DI) — every reachable branch covered. 
- - **m-E18-14** (complete, merged to epic 2026-04-15): .NET Time Machine CLI — `flowtime validate/sweep/sensitivity/goal-seek/optimize` as pipeable JSON-over-stdio commands byte-compatible with `/v1/` endpoints. `--no-session` selects `RustModelEvaluator`; `--engine`/`FLOWTIME_RUST_BINARY`/default path resolution; exit code contract (0/1/2/3); `CliJsonIO` + `CliCommonArgs` + `CliEngineSetup` + `AnalysisCliRunner` shared helpers. 72 CLI unit + 10 integration tests — every reachable branch covered with one documented platform-edge gap. Full suite 1,702 passed / 9 skipped. - - **Gap analysis:** `work/epics/E-18-headless-pipeline-and-optimization/e18-gap-analysis.md` - - **Active delivery sequence (decided 2026-04-15, Option A):** - 1. ~~**m-E18-13 SessionModelEvaluator**~~ — complete and merged to epic branch. - 2. ~~**m-E18-14 .NET Time Machine CLI**~~ — complete and merged to epic branch. - 3. ~~**UI parity fork**~~ — now **E-21 Svelte Workbench & Analysis Surfaces** (in-progress). Svelte becomes platform for new telemetry/fit/discovery surfaces; Blazor → maintenance mode. - 4. **E-15 Telemetry Ingestion** — Gold Builder → Graph Builder → first dataset path. - 5. **Telemetry Loop & Parity** — parity harness (prerequisite for trustworthy fit). - 6. **E-22 Model Fit + Chunked Evaluation + Pipeline SDK** — carries the remaining E-18 scope forward as a dedicated epic. Fit, chunked evaluation, and the `FlowTime.Pipeline` embeddable SDK. - - **Aspirational (see ROADMAP.md Cloud Deployment section):** Azure-native deployment — batch Functions, event-driven, long-running Container Apps service — with cloud `ITelemetrySource` adapters (ADX, Blob, Event Hubs), Blob artifact sink, OTEL/App Insights. Not scheduled; marker so near-term work stays compatible. - - **Deferred (tracked in `work/gaps.md`):** optimization constraints, Monte Carlo, `FlowTime.Telemetry.*` adapters. 
- - **Architecture:** `docs/architecture/headless-engine-architecture.md` — four-layer design; `docs/architecture/time-machine-analysis-modes.md` — sweep/sensitivity/goal-seek/optimize. -- **E-20** Matrix Engine — **completed and merged to main (2026-04-10).** Archived to `work/epics/completed/E-20-matrix-engine/`. - - 10 milestones. 172 Rust tests + 1,332 .NET tests. E-17/E-18 unblocked. -- **E-10** Engine Correctness — **completed and merged to main (2026-04-09).** Archived to `work/epics/completed/E-10-engine-correctness-and-analytics/`. -- **E-16** Formula-First Core Purification — **completed.** Archived to `work/epics/completed/E-16-formula-first-core-purification/`. -- **E-19** Surface Alignment & Compatibility Cleanup — **completed and merged to main (2026-04-08).** Archived to `work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/`. - - Deferred (tracked in `work/gaps.md`): `POST /v1/run` and `POST /v1/graph` Engine route deletion per D-2026-04-08-029 (test-infrastructure migration needed first). -- **E-24** Schema Alignment — **completed and merged to main (2026-04-25).** Archived to `work/epics/completed/E-24-schema-alignment/`. - - 5 milestones. Unified the post-substitution model: one C# type (`ModelDto`+`ProvenanceDto` in `FlowTime.Contracts`), one schema (`docs/schemas/model.schema.yaml`), one validator. `SimModelArtifact` + 6 satellites deleted; Sim emits the unified type directly; Engine parses it directly. Mirrored `ParseScalar` `ScalarStyle.Plain` guard + sibling `QuotedAmbiguousStringEmitter` round-trip pair. Canary `Survey_Templates_For_Warnings` promoted to hard `val-err == 0` build-time gate. 5 ADRs ratified. Closure logged in `D-2026-04-25-038`; E-23 unblocked. -- **E-23** Model Validation Consolidation — **completed and merged to main (2026-04-26).** Archived to `work/epics/completed/E-23-model-validation-consolidation/`. - - 3 milestones. 
m-E23-01: 94 rules audited; 16 schema-add edits + 5-arm `oneOf` schema restructure + silent-error fallback; 12 named adjunct methods on `ModelSchemaValidator`; 32-test negative-case regression catalogue. m-E23-02: 3 production call sites + 28 test calls migrated to `ModelSchemaValidator.Validate`; `TimeMachineValidator` redundant-delegation block removed; real `ProvenanceService.StripProvenance` round-trip bug fixed via YamlStream surgical removal preserving scalar styles; +16 tests including a watertight `/v1/run`-level integration regression. m-E23-03: `ModelValidator.cs` deleted; `ValidationResult` relocated to its own file. **`ModelSchemaValidator.Validate` is now the single model-YAML validator in the codebase.** Final suite 1862 / 0 / 9. Stashed input material from the pre-pivot `milestone/m-E23-01-schema-alignment` branch + `stash@{0}` is now obsolete — schema-alignment work was absorbed by E-24 m-E24-03 / m-E24-05; safe to discard. - - **Unblocks:** m-E21-07 Validation Surface (Svelte) — consumes the consolidated `ModelSchemaValidator` once E-21 resumes. -- **E-21** Svelte Workbench & Analysis Surfaces (`work/epics/completed/E-21-svelte-workbench-and-analysis/spec.md`) — **resumed (2026-04-26)** — paused 2026-04-23 to run E-24 then E-23; both closed. m-E21-06 Heatmap View merged into `epic/E-21-svelte-workbench-and-analysis` 2026-04-26 (backfill of the missing wrap-time merge); main caught up onto the epic branch in the same pass. Reentry point is m-E21-07 Validation Surface — consumes the consolidated `ModelSchemaValidator` that E-23 delivers. - - **m-E21-01** (complete, merged to epic): Workbench Foundation — density system, dag-map events (library), workbench panel with click-to-pin node cards. 217 vitest + 293 dag-map tests. - - **m-E21-02** (complete, merged to epic): Metric Selector & Edge Cards — metric chip bar, edge click-to-pin, edge cards, class filter, custom TimelineScrubber, dark-mode/viz-palette fixes. 323 vitest + 293 dag-map = 616 tests. 
- - **m-E21-03** (complete, merged to epic 2026-04-17; ultrareview follow-ups 2026-04-20): Sweep & Sensitivity Surfaces — `/analysis` route with tabbed surfaces, sweep config + results, sensitivity bar chart. 433 vitest + 293 dag-map = 726 tests; 8 Playwright specs. D-2026-04-17-033 ratifies the `GET /v1/runs/{runId}/model` backend carve-out. - - **m-E21-04** (complete, merged to epic 2026-04-22 in commit `8c4898f`; **scope split 2026-04-21** — Optimize moved to m-E21-05): Goal Seek Surface — goal-seek panel on `/analysis`, shared `AnalysisResultCard` + `ConvergenceChart` components (pure-SVG with geometry siblings), `interval-bar-geometry` for search-interval visualization, new `flowtime.goalSeek(...)` client method. Backend `trace` on both `/v1/goal-seek` and `/v1/optimize` per **D-2026-04-21-034** (backend landed in commit `29ac3e9`; optimize trace ready for m-E21-05). 482 vitest (+49 new) + 293 dag-map = 775 tests; 8 Playwright passing / 1 pre-existing env flake. Full branch-coverage audit (backend + UI) in tracking doc. - - **m-E21-05** (complete, merged to epic 2026-04-22 in commit `a94fc66`): Optimize Surface — live `/v1/optimize` wired to the `/analysis` Optimize tab (N-param Nelder-Mead under bounds). Consumes shared `AnalysisResultCard` + `ConvergenceChart` + `interval-bar-geometry` from m-E21-04; adds `flowtime.optimize(...)` client method; sibling `optimize-helpers.ts` module for form validation. Per-param result table with separate range-bar column (via `intervalMarkerGeometry`) showing where each optimized value landed inside its bound. Backend trace landed in m-E21-04 commit `29ac3e9` — no backend work this milestone. 520 vitest (+19 new) · 11/11 optimize Playwright specs green against live Rust engine · 1 pre-existing sweep env flake (unchanged from main). Full branch-coverage audit in tracking doc. 
- - **m-E21-06** (complete, completed 2026-04-24 in commit `5dddb5d` on branch `milestone/m-E21-06-heatmap-view`): Heatmap View — nodes-x-bins grid as sibling of topology under `/time-travel/topology` with typed `` (inline views, no registry per ADR-m-E21-06-01), shared view-state store (`view-state.svelte.ts`), shared full-window 99p-clipped color-scale normalization (topology straight-swapped from per-bin per ADR-m-E21-06-02 — no escape hatch), shared-toolbar `[ Operational | Full ]` node-mode toggle reaching Blazor parity (AC15, `mode` param on `getStateWindow`). 15/15 ACs landed; zero backend work (`state_window` sufficed). Multiple mid-flight spec amendments (all dated and captured): pinned-first row-float removed (pin glyph sole indicator); column highlight reduced from full-column outline to top-bar marker; persistent-selection SVG overlay keyed to `viewState.selectedCell` (survives window blur) + workbench-card title cross-link; auto-fit `CELL_W` with fractional pixels + 4 px right-margin + root-layout `min-w-0` plumbing so SVG never exceeds container (iteration log v1→v5 in tracking doc); `fitWidth` toggle persisted in shared store. Dead-code cleanup landed the new framework guard (2026-04-23): `buildMetricMapForDef` + `buildMetricMapForDefFiltered` + `pinnedIds` on `HeatmapGridInput` deleted once they had no production callers — guard mirrored into `.ai-repo/rules/project.md` + `CLAUDE.md`. New chrome tokens `--ft-pin` (red pin glyph) + `--ft-highlight` (turquoise highlight/selection/card-title). **770 ui-vitest passing across 32 suites** (net +269 from the 501 baseline) · 16 Playwright specs on `svelte-heatmap.spec.ts` (13 AC14 + #12b/#12c/#12d/#12e), 11 pass / 2 graceful-skip on fixtures without class metadata · `.NET` green on single-test re-run (one pre-existing timing flake in a file untouched by this milestone). 
Four gaps filed (topology keyboard/ARIA posture, data-viz palette color-blind validation, heatmap sliding-window scrubber, bidirectional card↔view reverse cross-link). Full branch-coverage audit in tracking doc. - - 8 milestones (was 7 before split): workbench foundation → metric selector + edge cards → sweep/sensitivity → goal-seek → optimize → heatmap view → validation surface → polish. - - Absorbs E-11 M5/M7/M8 under workbench paradigm. Svelte is the platform for new surfaces; Blazor is maintenance-only. -- **E-11** Svelte UI — paused after M6; absorbed into E-21 - - M1-M4 + M6 done. M5 (Inspector) → E-21 workbench. M7 (Dashboard) → deferred. M8 (Polish) → E-21 m-E21-08. -- **E-12–E-15:** planned, not started. E-15 is on critical path for client-telemetry vision. -- **E-22** Time Machine: Model Fit & Chunked Evaluation (`work/epics/E-22-model-fit-chunked-evaluation/spec.md`) — **planning** - - Carries the remaining E-18 scope forward: model fit against real telemetry (`POST /v1/fit`), chunked evaluation (Rust `chunk_step` protocol + `POST /v1/chunked-eval`), and the `FlowTime.Pipeline` embeddable SDK wrapper. - - **Planned milestones (3):** m-E22-01 Model Fit, m-E22-02 Chunked Evaluation, m-E22-03 FlowTime.Pipeline SDK. - - **Depends on:** E-15 Telemetry Ingestion, Telemetry Loop & Parity. Sequenced after both per D-2026-04-15-032 Option A. - - **Out of scope (tracked in `work/gaps.md`):** optimization constraints, Monte Carlo, `FlowTime.Telemetry.*` direct-source adapters. \ No newline at end of file diff --git a/ROADMAP.md b/ROADMAP.md index d491923f..d8cdcb53 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -1,320 +1,209 @@ -# FlowTime Roadmap — Updated 2026-04-07 +# Roadmap -This roadmap reflects the current state of FlowTime Engine + Sim and the strategic direction established during the E-16 planning cycle. Architecture **epics** and milestone docs provide the implementation detail (see `work/epics/epic-roadmap.md`). 
+## E-10 — Engine Correctness & Analytical Primitives (done) -## Scope & Assumptions -- Engine remains responsible for deterministic execution, artifact generation, and `/state` APIs (see `docs/flowtime-engine-charter.md`). -- FlowTime.Sim owns template authoring, stochastic inputs, and template catalog endpoints. -- Product-level scope is summarized in `docs/flowtime-charter.md`. -- The engine deep review (`docs/architecture/reviews/engine-deep-review-2026-03.md`) is the primary input for correctness priorities. +### Goal -## Thesis: Pure Engine, Then Power +Fix known correctness bugs, harden engineering quality, and build the analytical primitives layer that enables downstream epics (Path Analysis, Anomaly Detection, Scenario Overlays, UI Analytical Views) to deliver their full value. -FlowTime is a **spreadsheet for flow dynamics** — a deterministic graph of pure transforms over named time series. Queueing theory made executable. +| Milestone | Title | Status | +|---|---|---| +| M-054 | Engineering Foundation (Phase 1) | done | +| M-055 | Phase 2 — Documentation Honesty | done | +| M-056 | Phase 3a — Cycle Time & Flow Efficiency | done | +| M-057 | Phase 3a.1 — Analytical Projection Hardening | done | +| M-058 | Phase 3b — WIP Limits | done | +| M-059 | Phase 3c — Variability Preservation (Cv + Kingman) | done | +| M-060 | Phase 3d — Constraint Enforcement | done | -The strategic arc is three phases: +## E-11 — Svelte UI — Parallel Frontend Track (done) -1. **Make it pure (E-16).** The engine's analytical identity and semantic meaning are still reconstructed late from strings in the API and UI. E-16 moves all of that into the compiled Core: typed references, compiled analytical descriptors, pure evaluation. After E-16, the engine is an honest formula evaluator — the compiler owns meaning, evaluation is pure, consumers read facts. +### Goal -2. 
**Make it interactive (E-17).** Once the engine is pure and evaluation takes microseconds, live what-if becomes possible. Change a parameter, see every metric update instantly. The spreadsheet comes alive. This needs runtime parameter identification, server-side sessions, and a push channel to the UI. +Build a SvelteKit + shadcn-svelte application in parallel with the Blazor WebAssembly frontend, delivering a polished, modern UI for demos and future evaluation while keeping the existing .NET backend APIs untouched. -3. **Make it programmable (E-18).** Once the engine is a callable pure function, embed it in pipelines. Parameter sweeps, optimization loops, model fitting against real telemetry, sensitivity analysis, digital twin architectures. FlowTime becomes an instrument, not just a simulator. +| Milestone | Title | Status | +|---|---|---| +| M-061 | Project Scaffold & Shell | done | +| M-062 | Run Orchestration | done | -This arc describes the product capability ladder, not strict implementation order. In implementation, the shared runtime parameter foundation lands first in the E-18 Time Machine foundation and is then consumed by E-17's session/push UX. E-16 completes first, then E-10 resumes, then E-12-E-15 build on the analytical layer. +## E-12 — Dependency Constraints & Shared Resources (done) -## Delivered (Completed Epics) +### Goal -9 epics completed. See `work/epics/completed/` for full specs. +Model downstream dependencies (databases, caches, external APIs, shared services) as **constraints** that can limit throughput and introduce hidden backlog/latency. Preserve FlowTime’s minimal basis (arrivals/served/queue depth) while making coupling and bottlenecks visible. -- **Time Travel V1** — `/state`, `/state_window` APIs, telemetry capture/bundles, DLQ/backlog semantics. -- **Evaluation Integrity** — Compile-to-DAG contract, centralized model compiler. -- **Edge Time Bins** — Per-edge throughput/attempt volumes, conservation checks, UI overlays. 
-- **Classes & Routing** — Multi-class flows with class-aware routing and visualization. -- **Service With Buffer** — First-class `serviceWithBuffer` node type replacing legacy backlog. -- **MCP Modeling & Analysis** — Draft/validate/run/inspect loop, data intake, profile fitting, storage. -- **Engine Semantics Layer** — Stable `/state`, `/state_window`, `/graph` contracts. -- **UI Performance** — Input/paint/data lane separation, eliminated main-thread stalls. -- **Package Updates** — .NET 9 dependencies and MudBlazor updated (M-11.01, M-11.02). +| Milestone | Title | Status | +|---|---|---| +| M-063 | Dependency Constraints Foundations | done | +| M-064 | Dependency Constraints (Attached to Services) | done | +| M-065 | MCP Dependency Pattern Enforcement | done | -## E-10 — Engine Correctness & Analytical Primitives (completed) +## E-13 — Path Analysis & Subgraph Queries (proposed) -**Epic:** `work/epics/completed/E-10-engine-correctness-and-analytics/spec.md` -**Status:** Complete — all 8 milestones delivered +_No milestones yet._ -The engine deep review found 3 P0 correctness bugs, engineering debt, documentation drift, and a missing analytical layer. All phases delivered: Phases 0-2 (bugs, engineering, docs), Phase 3 analytical primitives (cycle time, projection hardening, constraint enforcement, variability, WIP limits with overflow routing and SHIFT-based backpressure feedback). +## E-14 — Visualizations (Chart Gallery / Demo Lab) (cancelled) -## E-16 — Formula-First Core Purification (completed) +_No milestones yet._ -**Epic:** `work/epics/completed/E-16-formula-first-core-purification/spec.md` | **Status:** completed (`m-E16-06` completed on `milestone/m-E16-06-analytical-contract-and-consumer-purification`) +## E-15 — Telemetry Ingestion, Topology Inference, and Canonical Bundles (proposed) -The architecture gate is complete. Semantic meaning and analytical truth are now compiled into Core once and consumed as facts everywhere else. 
+### Goal -Six milestones in sequence: -1. **m-E16-01** — Compiled Semantic References (typed refs replace raw string parsing, Parallelism typing) — completed on `milestone/m-E16-01-compiled-semantic-references` -2. **m-E16-02** — Class Truth Boundary (real by-class data vs wildcard fallback made explicit) — completed on `milestone/m-E16-01-compiled-semantic-references` -3. **m-E16-03** — Runtime Analytical Descriptor (absorbs AnalyticalCapabilities, compiled by compiler not resolved from strings) — completed on `milestone/m-E16-01-compiled-semantic-references` -4. **m-E16-04** — Core Analytical Evaluation (all analytical math moves to Core including flowLatencyMs graph propagation) — completed on `milestone/m-E16-01-compiled-semantic-references` -5. **m-E16-05** — Warning Facts & Primitive Cleanup (backlog/stationarity/overload warnings move to Core analyzers) — completed on `milestone/m-E16-05-analytical-warning-facts-and-primitive-cleanup` -6. **m-E16-06** — Contract & Consumer Purification (publish facts in API, delete IsServiceLike/Classify heuristics from UI) — completed on `milestone/m-E16-06-analytical-contract-and-consumer-purification` +Build the pipeline that takes real-world data — event logs, traces, sensor feeds — and produces the two things FlowTime needs: a `/graph` topology and Gold-format time-binned series. This epic owns ingestion, topology inference, validation, and bundle assembly. -Key decisions: D-2026-04-03-005 (flowLatencyMs to Core), D-2026-04-03-006 (descriptor absorbs AnalyticalCapabilities), D-2026-04-03-007 (Parallelism typing in E-16). See `work/decisions.md`. +_No milestones yet._ -Migration is forward-only. Existing runs, fixtures, and approved snapshots are regenerated, not compatibility-layered. 
+## E-16 — Formula-First Core Purification (done) -## E-11 — Svelte UI (Parallel Frontend Track) +### Goal -**Epic:** `work/epics/E-11-svelte-ui/spec.md` | **Status:** paused after M6; absorbed into E-21 (M1-M4 + M6 done; M5 → E-21 workbench, M7 deferred, M8 → E-21 m-E21-08) +Purify FlowTime's execution boundary so semantic meaning and analytical truth are compiled into Core once and consumed as facts everywhere else. This epic turns the existing "spreadsheet for flows" mental model into an enforceable architecture: parser/compiler resolve references, the core evaluates pure vector formulas, and adapters and clients stop reconstructing domain meaning from strings. -Build a parallel SvelteKit + shadcn-svelte UI surface for demos and future evaluation while keeping the Blazor UI supported and in sync. Independent of engine work — both UIs consume existing APIs with zero backend changes. +| Milestone | Title | Status | +|---|---|---| +| M-012 | Compiled Semantic References | done | +| M-013 | Class Truth Boundary | done | +| M-014 | Runtime Analytical Descriptor | done | +| M-015 | Core Analytical Evaluation | done | +| M-016 | Analytical Warning Facts and Primitive Cleanup | done | +| M-017 | Analytical Contract and Consumer Purification | done | -Superseded on 2026-04-15 (fork decision): Svelte becomes the platform for all new surfaces and Blazor enters maintenance mode. Remaining work moved to **E-21 — Svelte Workbench & Analysis Surfaces** below. +## E-17 — Interactive What-If Mode (done) -## E-24 — Schema Alignment (completed) +### Goal -**Epic:** `work/epics/completed/E-24-schema-alignment/spec.md` | **Status:** completed — all five milestones merged to main (2026-04-25) +Enable live, interactive recalculation in FlowTime — change a parameter and see results update instantly across the entire model, like a spreadsheet. -Unified FlowTime's post-substitution model representation. 
One C# type (`ModelDto` + `ProvenanceDto` in `FlowTime.Contracts`), one YAML schema (`docs/schemas/model.schema.yaml` rewritten against the unified type), one validator. `SimModelArtifact` and its six satellites deleted; Sim emits the unified type directly; Engine parses it directly. `Template` (authoring-time) stays distinct. Forward-only — no migration of stored bundles. camelCase throughout. The `TemplateWarningSurveyTests.Survey_Templates_For_Warnings` canary is now a hard `val-err == 0` build-time gate at `ValidationTier.Analyse` across all twelve shipped templates. +| Milestone | Title | Status | +|---|---|---| +| M-018 | WebSocket Engine Bridge | done | +| M-019 | Svelte Parameter Panel | done | +| M-020 | Live Topology and Charts | done | +| M-021 | Warnings Surface | done | +| M-022 | Edge Heatmap | done | +| M-023 | Time Scrubber | done | -**Five milestones:** m-E24-01 Inventory & Design Decisions (doc-only) → m-E24-02 Unify Model Type (`SimModelArtifact` + 6 satellites deleted; YamlDotNet 17.0.1) → m-E24-03 Schema Unification (schema rewritten top-to-bottom; nested 7-field camelCase provenance; consumer citations on every property) → m-E24-04 Parser/Validator Scalar-Style Fix (mirrored `ParseScalar` `ScalarStyle.Plain` guard in both validators + sibling `QuotedAmbiguousStringEmitter` for round-trip symmetry) → m-E24-05 Canary Green + Hard Assertion (regression-catching verified end-to-end). +## E-18 — Time Machine (done) -**ADRs:** ADR-E-24-01 Unify the post-substitution model type · ADR-E-24-02 Forward-only regeneration · ADR-E-24-03 Schema declares only consumed fields · ADR-E-24-04 `ScalarStyle.Plain` gates `ParseScalar` coercion · ADR-E-24-05 `QuotedAmbiguousStringEmitter` round-trip symmetry. +### Goal -**Decisions:** `D-2026-04-24-036` (E-23 paused, E-24 created) · `D-2026-04-24-037` (Option E ratified; 5-milestone plan) · `D-2026-04-25-038` (E-24 closed; E-23 ready to resume). 
+Make FlowTime usable as a pure callable function — embeddable in pipelines, optimization loops, model discovery workflows, and digital twin architectures. The **Time Machine** (`FlowTime.TimeMachine`) is a new first-class execution component that scripts, UIs, MCP servers, and AI agents can drive programmatically. It owns compile, tiered validation, evaluate, reevaluate with parameter overrides, and canonical artifact write. -**Unblocks:** E-23 m-E23-02 (call-site migration) and m-E23-03 (`ModelValidator` delete) become byte-trivial mechanical cleanup; m-E21-07 Validation Surface eventually; E-15 Telemetry Ingestion `nodes[].source` forward contract. +FlowTime's execution component is an abstract machine in the BEAM / JVM sense: instructions (the compiled graph), state (the time grid plus accumulating series), deterministic topological stepping through time. "Time Machine" also aligns with the existing Blazor "Time Travel" UI feature — the Time Travel UI navigates runs the Time Machine produces — and the reevaluation semantics (rewind a compiled model, run it forward with different parameters) are literally time travel. -## E-23 — Model Validation Consolidation (completed and merged to main 2026-04-26) +| Milestone | Title | Status | +|---|---|---| +| M-001 | Parameterized Evaluation | done | +| M-002 | Engine Session + Streaming Protocol | done | +| M-003 | Tiered Validation | done | +| M-004 | Generator Extraction → TimeMachine | done | +| M-005 | ITelemetrySource Contract | done | +| M-006 | Parameter Sweep | done | +| M-007 | Sensitivity Analysis | done | +| M-008 | Goal Seeking | done | +| M-009 | Multi-parameter Optimization | done | +| M-010 | SessionModelEvaluator | done | +| M-011 | .NET Time Machine CLI | done | -**Epic:** `work/epics/completed/E-23-model-validation-consolidation/spec.md` | **Status:** completed and merged to main 2026-04-26; archived. 
Rescoped 2026-04-26 — E-24 Schema Alignment closed (all five milestones landed on `epic/E-24-schema-alignment`). E-23's spirit reframed: make `model.schema.yaml` the only declarative source of structural truth and `ModelSchemaValidator` the only runtime evaluator — eliminate every "embedded schema" outside the canonical schema (`ModelValidator.cs` hand-rolled rules, parser tolerations, silent emission defaults, post-parse orchestration checks). E-24 fixed type + schema-document embedment; E-23 closed the rule-evaluator embedment. **`ModelSchemaValidator.Validate` is now the single model-YAML validator in the codebase.** +## E-19 — Surface Alignment & Compatibility Cleanup (done) -Mini-epic (3 milestones). Collapses the codebase's two silently-disagreeing model validators to one: `ModelValidator` is **deleted**, `ModelSchemaValidator` is the single schema-driven entry point. Directly enforces the 2026-04-23 Truth Discipline guard *"'API stability' does not mean 'keep old functions around.'"* +### Goal -**Milestone slate (3 milestones):** -- m-E23-01 Rule-Coverage Audit — **status: completed (2026-04-26).** 94 rules audited across `ModelValidator.cs`, `ModelParser.cs`, `SimModelBuilder.cs`, `ModelCompiler.cs`, `RunOrchestrationService.cs`, and DTO/`ModelService` shims. Per-rule disposition assigned. **16 schema-add edits** landed on `model.schema.yaml` (with `# rule from : — added m-E23-01` citations). **Schema restructure** to a 5-arm `oneOf` at `nodes[].items` closes the JsonEverything `not`-keyword silent-error class structurally. **Validator silent-error fallback** (`SynthesizePathOnlyError`) closes the residual blind spot. 
**12 named adjunct methods** on `ModelSchemaValidator` (`ValidateNodeIdUniqueness`, `ValidateOutputSeriesReferences`, `ValidateExpressionNodeReferences`, `ValidateConstNodeValueCount`, `ValidatePmfArrayLengths`, `ValidatePmfValueUniqueness`, `ValidatePmfProbabilitySum`, `ValidateSelfShiftRequiresInitialCondition`, `ValidateTopologySeriesReferences`, `ValidateWipOverflowTarget`, `ValidateWipOverflowAcyclic`, `ValidateDateTimeFormats`) for cross-reference / cross-array rules JSON Schema draft-07 cannot express. **Mode-specific simulation rules** (`grid.start` non-empty + `topology.nodes` non-empty) stay at the orchestration layer per parser-justified disposition. **Negative-case canary catalogue** (26 tests in `RuleCoverageRegressionTests.cs` + 6 silent-error regression tests) locks coverage in. Canary `Survey_Templates_For_Warnings` stays green at `val-err == 0`. Full suite **1,846 / 0 / 9**. AC1-AC9 closed. -- m-E23-02 Call-Site Migration — **status: completed (2026-04-26).** 3 production sites (`POST /v1/run`, Engine CLI, `TimeMachineValidator` tier-1) + 28 test calls across 4 files migrated to `ModelSchemaValidator.Validate`. `TimeMachineValidator`'s redundant dual-validator block removed. Error-phrasing audit recorded across 6 representative invalid-model fixtures + UI/CLI consumer scan (no regex-parse consumers). Scope-expansion close-out: real `ProvenanceService.StripProvenance` round-trip bug fixed (Dictionary-based round-trip → YamlStream surgical removal preserving scalar styles); 2 stale test fixtures dropped legacy `generator:` field; 1 test flip from `OK` → `BadRequest` for the now-arrived "future" of malformed-provenance rejection. **+16 net new tests** (10 strip branch-coverage + 2 integration regression at the `/v1/run` API surface + 4 strip sub-case). Suite **1862 / 0 / 9** vs the m-E23-01 baseline of 1846 / 0 / 9. Both canaries green. `ModelValidator.cs` left on disk as single-revert safety net (deletion in m-E23-03). 
AC1-AC8 closed; AC9 (latency delta) deferred — optional. -- m-E23-03 Delete `ModelValidator` — **status: completed (2026-04-26).** `src/FlowTime.Core/Models/ModelValidator.cs` deleted via `git rm`. `ValidationResult` (14 lines) relocated to `src/FlowTime.Core/Models/ValidationResult.cs`, namespace stays `FlowTime.Core`, no API change. AC3 grep clean: 7 hits remain, all explanatory comments documenting m-E23-01 / m-E23-02 migration history; zero live references. Build: 0 errors, 1 pre-existing xUnit-analyzer warning unrelated. Full suite **1862 / 0 / 9** — identical to m-E23-02 tip. Both canaries green. Epic-folder archive lands when E-23 merges to main. +Tighten the remaining non-analytical legacy and compatibility surfaces after E-16 so FlowTime exposes current Engine/Sim contracts consistently across first-party UI, Sim, docs, schemas, and examples without carrying stale fallback layers or stripping supported Blazor capability. -**Out of scope (firm):** Sim's emission shape (E-24 territory; E-23 only revisits if the audit shows an unwritten emission rule), Blazor/Svelte UI code, active validation UI (lives in m-E21-07 after E-21 resumes), new validator features, `Template`-layer validation (`TemplateSchemaValidator` stays distinct for pre-substitution authoring templates). +| Milestone | Title | Status | +|---|---|---| +| M-024 | Supported Surface Inventory, Boundary ADR & Exit Criteria | done | +| M-025 | Sim Authoring & Runtime Boundary Cleanup | done | +| M-026 | Schema, Template & Example Retirement | done | +| M-027 | Blazor Support Alignment | done | -**Stashed input material:** branch `milestone/m-E23-01-schema-alignment` + `stash@{0}` hold pre-pivot m-E23-01 work. Most absorbed by E-24; should be retired when E-23 resumes — the rule audit starts fresh from post-E-24 `main`. +## E-20 — Matrix Engine (done) -**Dependencies:** E-24 Schema Alignment (cleared 2026-04-25). 
After E-23 lands, m-E21-07 Validation Surface resumes with a single consolidated validator to render. +### Goal -## E-21 — Svelte Workbench & Analysis Surfaces (all milestones complete — ready for epic wrap) +Replace the C# object-graph evaluation engine with a Rust-based column-store + evaluation-plan engine. The new engine reads the same YAML model files, produces identical output artifacts, and ships as a standalone CLI binary (`flowtime-engine`). This is the foundation for E-17 (Interactive What-If) and E-18 (Time Machine). -**Epic:** `work/epics/completed/E-21-svelte-workbench-and-analysis/spec.md` | **Status:** all 8 milestones complete (m-E21-08 Polish completed 2026-04-28). Ready for epic wrap and merge to main. +| Milestone | Title | Status | +|---|---|---| +| M-028 | Scaffold, Types, and Parsers | done | +| M-029 | Compiler and Core Evaluator | done | +| M-030 | Topology and Sequential Ops | done | +| M-031 | Routing and Constraints | done | +| M-032 | Derived Metrics and Analysis | done | +| M-033 | Artifacts, CLI, and Integration | done | +| M-034 | .NET Subprocess Bridge | done | +| M-035 | Full Parity Harness | done | +| M-036 | Per-Class Decomposition and Edge Series | done | +| M-037 | Artifact Sink Parity | done | -Transform the Svelte UI from a Blazor-parallel clone into the primary platform for expert flow analysis and Time Machine surfaces. Workbench paradigm: topology as navigation + click-to-pin inspection panel; `/analysis` route with tabbed Time Machine surfaces (sweep, sensitivity, goal-seek, optimize); heatmap view; validation surface; compact density with calm chrome + vivid data-viz palette. +## E-21 — Svelte Workbench & Analysis Surfaces (done) -**Depends on:** E-11 (M1-M4 + M6), E-17, E-18 analysis endpoints. 
+### Goal -**Completed milestones:** -- m-E21-01: Workbench Foundation — density tokens, dag-map `bindEvents`/`selected` (library), click-to-pin node cards (merged 2026-04-17) -- m-E21-02: Metric Selector & Edge Cards — metric chip bar, edge cards, class filter, custom TimelineScrubber (merged 2026-04-17) -- m-E21-03: Sweep & Sensitivity Surfaces — `/analysis` route with tabs, sweep config + results, sensitivity bar chart (merged 2026-04-17; ultrareview follow-ups 2026-04-20) -- m-E21-04: Goal Seek Surface — goal-seek panel on `/analysis`, shared `AnalysisResultCard` + `ConvergenceChart` components, additive `trace` on `/v1/goal-seek` and `/v1/optimize` per D-2026-04-21-034 (completed 2026-04-22) -- m-E21-05: Optimize Surface — live `/v1/optimize` wired to the `/analysis` Optimize tab, N-param Nelder-Mead under bounds, per-param result table with range bars, new `flowtime.optimize(...)` client, sibling `optimize-helpers.ts` module (completed 2026-04-22) -- m-E21-06: Heatmap View — nodes-x-bins grid as sibling of topology under `/time-travel/topology`, typed `` (inline views, no registry per ADR-m-E21-06-01), shared view-state store, shared full-window 99p-clipped color-scale normalization (topology straight-swaps from per-bin per ADR-m-E21-06-02), shared-toolbar `[ Operational | Full ]` node-mode toggle reaching Blazor parity. 15/15 ACs; 770 ui-vitest (+269) across 32 suites; 16 Playwright specs on `svelte-heatmap.spec.ts`; zero backend work (completed 2026-04-24) +Transform the Svelte UI from a Blazor-parallel clone into the primary platform for expert flow analysis and Time Machine surfaces, using a workbench paradigm (topology as navigation + inspection panel) instead of the Blazor overlay approach. 
-- m-E21-07: Validation Surface (Svelte) — consumer-side type widening on `state_window` warnings; validation panel as left column inside workbench panel; topology node + edge warning indicators; workbench-card warning surfaces; bidirectional cross-link via shared view-state store; Playwright real-bytes fixture (`FLOWTIME_E2E_TEST_RUNS=1` + sandboxed `data/test-runs/`) covering the AC1 wire-format round-trip, mocked specs covering UI-behaviour edge cases. New chrome tokens `--ft-warn` / `--ft-err` / `--ft-info`. Zero backend work. 897 ui-vitest passing across the suite; 9/9 Playwright in `svelte-validation.spec.ts`; svelte-check 413/2 baseline unchanged (completed 2026-04-28) +| Milestone | Title | Status | +|---|---|---| +| M-038 | Workbench Foundation | done | +| M-039 | Metric Selector & Edge Cards | done | +| M-040 | Sweep & Sensitivity Surfaces | done | +| M-041 | Goal Seek Surface | done | +| M-042 | Optimize Surface | done | +| M-043 | Heatmap View | done | +| M-044 | Validation Surface (Svelte) | done | +| M-045 | Visual Polish & Dark Mode QA | done | -- m-E21-08: Visual Polish & Dark Mode QA — topology keyboard + ARIA retrofit (a11y bar parity with heatmap), full bidirectional cross-link (node `.node-selected` stroke + new `selectedEdge` field on view-state with `.edge-pinned` rename + new `.edge-selected` chrome), dark-mode audit (zero token-resolution gaps), loading skeletons on `/time-travel/topology` + `/analysis` tabs, transitions rule documented + 160 ms cross-fades, elevation token audit (zero remediation; chrome already canonical), validation-panel cosmetic collapse, heatmap-side absence assertion. 919 ui-vitest (+22 from baseline); svelte-check 413/2 unchanged; 6 new Playwright specs. New chrome token `--ft-focus`. Color-blind validation + pattern encoding deferred to a follow-up by user decision 2026-04-28 (completed 2026-04-28) +## E-22 — Time Machine — Model Fit & Chunked Evaluation (proposed) -**Remaining:** none. E-21 ready for epic wrap. 
+### Goal -## E-19 — Surface Alignment & Compatibility Cleanup (completed) +Close out the remaining Time Machine analysis modes — **model fitting** against real telemetry and **chunked evaluation** for feedback simulation — and crystallize the resulting surface as a clean embeddable **`FlowTime.Pipeline` SDK**. These are the last two analysis modes in the E-18 Time Machine architecture; delivering them completes the "FlowTime as a callable function" arc. -**Epic:** `work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/spec.md` | **Status:** completed — all four milestones merged to main (2026-04-08) +_No milestones yet._ -After E-16 purifies analytical truth, FlowTime still carries broader non-analytical compatibility debt across first-party UI, Sim, docs, examples, and schema surfaces. E-19 removes stale fallback layers and clarifies supported surfaces in a forward-only cut while keeping Blazor current as a supported parallel UI. +## E-23 — Model Validation Consolidation (done) -This cleanup lane also draws the boundary between today's Sim authoring/orchestration residue and the future E-18 Time Machine foundation, so the current Sim path does not harden into the default programmable contract. +### Goal -This epic starts immediately after E-16 as a cleanup lane, but it does not replace E-10 Phase 3 resume. Runtime/schema/doc cleanup can run in parallel with resumed analytical work, and Blazor alignment runs alongside the E-11 Svelte track rather than behind a replacement cutoff. +Make `docs/schemas/model.schema.yaml` the **only declarative source of structural truth** about the post-substitution model, and `ModelSchemaValidator` the **only runtime evaluator**. Eliminate every "embedded schema" — every place outside the canonical schema where model rules are re-encoded. After E-23 closes: -## Near-Term Epics +- One schema. Declared in `model.schema.yaml`. +- One validator. 
`ModelSchemaValidator.Validate`, with named adjuncts (alongside `ValidateClassReferences`) for any rule JSON Schema draft-07 cannot express. +- Zero parallel imperative validators. `ModelValidator.cs` is deleted. +- Every rule has exactly one canonical home. No silent rules in parsers, emitters, or post-parse orchestration paths. -These depend on the analytical primitives from E-10 Phase 3 (except E-15 which is independent): +| Milestone | Title | Status | +|---|---|---| +| M-046 | Rule-Coverage Audit | done | +| M-047 | Call-Site Migration | done | +| M-048 | Delete `ModelValidator` | done | -1. **E-12 — Dependency Constraints & Shared Resources** (`work/epics/E-12-dependency-constraints/`) - - Runtime constraint enforcement (depends on Phase 3 p3d). M-10.01/02 complete. M-10.03 deferred. +## E-24 — Schema Alignment (done) -2. **E-13 — Path Analysis & Subgraph Queries** (`work/epics/E-13-path-analysis/`) - - Path-level queries, bottleneck attribution, dominant routes, path pain. +### Goal -3. **E-14 — Visualizations** (`work/epics/E-14-visualizations/`) - - Absorbed into UI Analytical Views epic. See `work/epics/ui-analytical-views/spec.md`. +Unify FlowTime's post-substitution model representation. One C# type. One YAML schema. One validator. `SimModelArtifact` is **deleted**. Sim builds the unified model type directly; the Engine accepts and parses the same type. Every field has exactly one declaration site. `TemplateWarningSurveyTests` reports `val-err=0` across all twelve templates at `ValidationTier.Analyse`, promoted to a hard build-time assertion. `ModelValidator` deletion (E-23) then becomes a mechanical cleanup. -4. **E-15 — Telemetry Ingestion, Topology Inference + Canonical Bundles** (`work/epics/E-15-telemetry-ingestion/`) - - Gold Builder + Graph Builder + bundle assembly. Independent of Phase 3. 
+| Milestone | Title | Status | +|---|---|---| +| M-049 | Inventory and Design Decisions | done | +| M-050 | Unify Model Type | done | +| M-051 | Schema Unification | done | +| M-052 | Parser/Validator Scalar-Style Fix | done | +| M-053 | Canary Green and Hard Assertion | done | -## Bridge Work (recommended before advanced leverage) +## E-25 — Engine Truth Gate — Edge-Flow Authority + Golden-Output Canary (proposed) -These are the lowest-risk leverage layers after purification. They make the pure engine more useful without forcing live sessions, streaming state, or optimization frameworks too early. +### Goal -1. **Scenario Overlays & What-If Runs** (`work/epics/overlays/overlays.md`) - - Deterministic derived runs created from a baseline via validated input patches. Recommended after p3c + p3b so variability- and WIP-aware experiments have a clean execution path. +Resolve the engine-correctness investigation surfaced during E-21 dogfooding (G-032 + G-033) and lock down testing rigor before further engine evolution. Concretely: make a defensible design call on edge-flow authority (expr nodes vs. topology edge weights), align engine + shipped templates so the conservation invariant is clean, and promote the lightweight `Survey_Templates_For_Warnings` baseline canary into a strict per-template **golden-output** canary that compares numeric series + warning sets at a sanctioned baseline. -2. **Telemetry Loop & Parity** (`work/epics/telemetry-loop-parity/spec.md`) - - Automated parity harness between baseline synthetic runs and telemetry replay runs. Recommended immediately after the first E-15 dataset path and before model fitting, optimization, or anomaly automation. 
+| Milestone | Title | Status | +|---|---|---| +| M-066 | Edge-Flow Authority Decision | draft | +| M-067 | Engine + Template Alignment | draft | +| M-068 | Golden-Output Canary | draft | -## E-20 — Matrix Engine (complete) - -**Epic:** `work/epics/E-20-matrix-engine/spec.md` | **Status:** complete (m-E20-01–10 all complete) - -Replace the C# object-graph evaluation with a Rust column-store + evaluation-plan engine. All series live in one flat `f64[series_count × bins]` matrix. The evaluation plan is an ordered list of ops (pure functions on columns). Ships as a standalone CLI binary (`flowtime-engine eval/validate/plan`). The .NET API calls the Rust binary as a subprocess. - -Three-layer architecture (D-2026-04-10-031): engine core (pure function) → artifact sink (mandatory, pluggable persistence) → consumer adapters (per-surface formatting). All 10 milestones complete. The Rust engine replaces `RunArtifactWriter`. E-17/E-18 are unblocked. - -**Depends on:** E-10 (complete), E-16 (complete) - -## E-17 — Interactive What-If Mode (complete) - -**Epic:** `work/epics/completed/E-17-interactive-what-if-mode/spec.md` | **Status:** complete | **Merged:** 2026-04-12 - -Live interactive recalculation: change a parameter, see every metric update instantly (<50ms). 6 milestones: WebSocket bridge → parameter panel → topology heatmap → warnings surface → edge heatmap → time scrubber. Advanced demo models (SaaS API, e-commerce pipeline). 200 vitest + 26 Playwright E2E. - -**Depends on:** E-20 - -## E-18 — Time Machine (in-progress) - -**Epic:** `work/epics/E-18-headless-pipeline-and-optimization/spec.md` | **Status:** in-progress (branch `epic/E-18-time-machine`) -**Gap analysis:** `work/epics/E-18-headless-pipeline-and-optimization/e18-gap-analysis.md` - -FlowTime as a callable pure function in pipelines, optimization loops, model fitting, digital twin architectures. 
- -**Depends on:** E-20 (complete) - -**Completed milestones:** -- m-E18-01: Parameterized evaluation (Rust) — ParamTable, evaluate_with_params, compile-once eval-many -- m-E18-02: Engine session + streaming protocol (Rust) — persistent process, MessagePack over stdin/stdout -- m-E18-06: Tiered validation — `TimeMachineValidator` (schema/compile/analyse), `POST /v1/validate`, Rust `validate_schema` -- m-E18-07: `FlowTime.TimeMachine` project created; `FlowTime.Generator` deleted (Path B) -- m-E18-08: `ITelemetrySource` interface + `CanonicalBundleSource` + `FileCsvSource` -- m-E18-09: Parameter sweep — `SweepSpec`/`SweepRunner`/`ConstNodePatcher`, `IModelEvaluator`, `POST /v1/sweep` -- m-E18-10: Sensitivity analysis — `ConstNodeReader`, `SensitivityRunner` (central difference), `POST /v1/sensitivity` -- m-E18-11: Goal seeking — `GoalSeeker` (bisection), `POST /v1/goal-seek` *(added; not in original spec)* -- m-E18-12: Optimization — `Optimizer` (Nelder-Mead, N params), `POST /v1/optimize` -- m-E18-13: SessionModelEvaluator — compile-once persistent-subprocess bridge using m-E18-02 session protocol; `RustEngine:UseSession` config switch (default true); `RustModelEvaluator` retained as fallback -- m-E18-14: .NET Time Machine CLI — `flowtime validate/sweep/sensitivity/goal-seek/optimize` as pipeable JSON-over-stdio commands, byte-compatible with `/v1/` endpoints; `--no-session` fallback - -**Active delivery sequence (decided 2026-04-15):** - -1. **UI parity fork** — Svelte UI becomes the platform for new telemetry/fit/discovery surfaces. Blazor enters maintenance mode at current functionality. Parallel track with E-15 below. -2. **E-15 Telemetry Ingestion** — Gold Builder (raw → canonical bundle) → Graph Builder (telemetry → inferred topology) → first dataset path. Critical path for the client-telemetry vision. -3. **Telemetry Loop & Parity** — parity harness validates synthetic-vs-replay drift bounds. Required before fit results are trustworthy. -4. 
**E-22 Model Fit + Chunked Evaluation** — carries forward the remaining E-18 scope (`FitSpec`/`FitRunner`/`POST /v1/fit` and chunked stateful session protocol) plus the `FlowTime.Pipeline` embeddable SDK wrapper. Completes the discovery pipeline and crystallizes the embeddable surface. See epic: `work/epics/E-22-model-fit-chunked-evaluation/spec.md`. - -**Deferred with no owner milestone (tracked in `work/gaps.md`):** -- Optimization constraints (penalty method on `OptimizeSpec`) -- Monte Carlo (sampling layer on `IModelEvaluator`) -- `FlowTime.Telemetry.*` adapter projects (Prometheus, OTEL, BPI) — direct-source `ITelemetrySource` implementations that bypass the E-15 Gold Builder pipeline for specific live sources; narrow bypasses, not part of E-15 scope - -## E-22 — Time Machine: Model Fit & Chunked Evaluation (planning) - -**Epic:** `work/epics/E-22-model-fit-chunked-evaluation/spec.md` | **Status:** planning - -Closes out the remaining Time Machine analysis modes from E-18: model fitting against real telemetry, chunked evaluation for feedback simulation, and the `FlowTime.Pipeline` embeddable SDK wrapper. Completes the "FlowTime as a callable function" arc. - -**Depends on:** E-15 Telemetry Ingestion (first dataset path), Telemetry Loop & Parity (validated drift bounds). Sequenced after both per D-2026-04-15-032 Option A. - -**Planned milestones:** -- m-E22-01 Model Fit — `FitSpec`/`FitRunner`/`POST /v1/fit` composing `ITelemetrySource` + `Optimizer`; `flowtime fit` CLI -- m-E22-02 Chunked Evaluation — Rust `chunk_step` session command; `POST /v1/chunked-eval`; external-controller integration -- m-E22-03 `FlowTime.Pipeline` SDK — embeddable library wrapping all analysis modes; existing API/CLI callers rewritten to dogfood the SDK - -**Out of scope (tracked as gaps):** optimization constraints, Monte Carlo, `FlowTime.Telemetry.*` direct-source adapters, tiered validation parity across UIs/MCP. 
- -## UI Paradigm Epics (draft — unnumbered until sequenced) - -See `work/epics/ui-workbench/reference/ui-paradigm.md` for the architectural proposal. - -- **UI Workbench & Topology Refinement** — Strip topology to structure + one color dimension, workbench panel for inspection. -- **UI Analytical Views** — Purpose-built views: heatmap, decomposition, comparison, flow balance. Absorbs E-14. -- **UI Question-Driven Interface** — Structured query panel for analytical questions with provenanced answers. - -## Mid-Term / Aspirational - -| Epic | Key Dependency | Notes | -|------|---------------|-------| -| **Anomaly & Pathology Detection** | Phase 3 + path/parity basics | Needs analytical primitives plus basic path context and telemetry parity before real-data automation | -| **UI Layout Motors** | dag-map spike | Pluggable layout engines behind stable contract | -| **Ptolemy-Inspired Semantics** | — | Conceptual guardrails for engine evolution | -| **Streaming & Subsystems** | Stable engine semantics | Long-term exploratory | -| **Cloud Deployment & Data Pipeline Integration** | E-15 + m-E18-14 CLI | Azure-native shape: Functions / Container Apps / ACI jobs. See below. | - -### Cloud Deployment & Data Pipeline Integration (aspirational) - -A natural deployment target for FlowTime is an Azure-hosted data pipeline where client -telemetry lands in ADX or Blob, and FlowTime runs batch or event-driven analysis against it. -This section captures the aspirational shape so that current architectural decisions stay -compatible with it — without yet committing to implementation. - -**Three deployment shapes anticipated:** - -1. **Scheduled batch.** Timer-triggered Azure Function (or Container Apps job) queries ADX, - loads canonical series via `ITelemetrySource`, runs FlowTime.TimeMachine fit / sweep / - sensitivity, writes results back to ADX or Blob. -2. 
**Event-driven.** Event Grid / Service Bus triggers a Function on a new telemetry window; - FlowTime evaluates; results push to a dashboard or downstream system. -3. **Long-running interactive service.** Container App hosting the existing ASP.NET API for - Svelte UI what-if exploration. Separate process from the batch pipeline. - -**What the current architecture already gets right:** -- Rust engine as a standalone binary — language-neutral, callable any way -- `IModelEvaluator` seam — swap subprocess for HTTP client, FFI, or WASM without changing analysis code -- `ITelemetrySource` seam — cloud adapters (ADX, Blob, Event Hubs) are additive -- Analysis modes as a library (`FlowTime.TimeMachine`) — callable from any .NET host, not tied to the API server -- Three-layer engine architecture (D-2026-04-10-031) — engine / sink / consumer separation supports Blob sinks - -**What we expect to add when Azure becomes concrete:** -- **Pipeline-grade .NET CLI (m-E18-14 will start this).** Stdin JSON in / stdout JSON out. Azure Functions custom-handler-compatible. Self-contained binary deployable to ACI. -- **Cloud `ITelemetrySource` adapters.** `AdxTelemetrySource`, `BlobTelemetrySource`, `EventHubsTelemetrySource`. Additive to the existing interface. -- **Blob-backed artifact sink.** Parallel implementation of the filesystem sink under the same directory contract. -- **OTEL / App Insights integration.** Structured spans around evaluator calls, sweeps, fits — long-running operations need observability. -- **Key Vault secrets integration.** ADX connection strings, SAS tokens via standard Azure identity patterns. - -**Note on per-eval vs. session evaluator:** Both paths have a legitimate deployment shape. -`SessionModelEvaluator` (persistent subprocess, compile-once) fits Container Apps jobs and -long-running services where startup cost is amortized over many evaluations. 
-`RustModelEvaluator` (stateless subprocess per eval) fits Azure Functions where each invocation -is short-lived and process isolation is a feature. Both implementations are retained. - -**Status:** Not scheduled. Marker section so that the .NET CLI, ITelemetrySource, artifact sink, -and observability work stay shaped for these scenarios as they land. Concrete Azure work begins -only when a specific client deployment target is chosen. - -## Dependency Graph - -``` -E-10 (done) + E-16 (done) + E-19 (done) + E-20 (done) + E-17 (done) - | - +--→ E-18 Time Machine (in-progress) - | m-E18-13 SessionModelEvaluator ← done - | m-E18-14 .NET Time Machine CLI ← done - | (later) m-E18-XX Model Fit ← blocked on E-15 + Telemetry Loop & Parity - | (later) Chunked evaluation ← after discovery pipeline works end-to-end - | - +--→ UI parity fork ← NEXT - | Svelte UI: platform for new surfaces (telemetry, fit, discovery) - | Blazor UI: maintenance mode, frozen at current functionality - | - +--→ E-15 Telemetry Ingestion (critical path for client-telemetry vision) - | Gold Builder → Graph Builder → first dataset path - | +--→ Telemetry Loop & Parity - | +--→ E-18 Model Fit (completes discovery pipeline) - | - +--→ E-12 Dependency Constraints (engine feature — after discovery pipeline) - +--→ E-13 Path Analysis (engine feature — after discovery pipeline) - +--→ Scenario Overlays (parameter override as plan operation) - +--→ Anomaly Detection (after path/parity basics) -``` - -## References -- `docs/architecture/reviews/engine-deep-review-2026-03.md` — Full engine deep review -- `docs/architecture/reviews/engine-review-findings.md` — Initial review findings -- `docs/architecture/reviews/review-sequenced-plan-2026-03.md` — Sequenced plan (historical rationale) -- `work/epics/epic-roadmap.md` — Architecture epics with links to specs -- `work/decisions.md` — Architectural decisions (dated D-2026-… identifiers) -- `docs/architecture/whitepaper.md` — Engine vision + future primitives -- 
`docs/flowtime-engine-charter.md` — Engine remit and non-goals diff --git a/STATUS.md b/STATUS.md new file mode 100644 index 00000000..eb727240 --- /dev/null +++ b/STATUS.md @@ -0,0 +1,94 @@ +# aiwf status — 2026-05-03 + +_171 entities · 0 errors · 0 warnings_ + +## In flight + +_(no active epics)_ + +## Roadmap + +### E-13 — Path Analysis & Subgraph Queries _(proposed)_ + +_(no milestones)_ + +### E-15 — Telemetry Ingestion, Topology Inference, and Canonical Bundles _(proposed)_ + +_(no milestones)_ + +### E-22 — Time Machine — Model Fit & Chunked Evaluation _(proposed)_ + +_(no milestones)_ + +### E-25 — Engine Truth Gate — Edge-Flow Authority + Golden-Output Canary _(proposed)_ + +- **M-066** — Edge-Flow Authority Decision _(draft)_ — ACs 0/9 met (9 open) +- **M-067** — Engine + Template Alignment _(draft)_ — ACs 0/13 met (13 open) +- **M-068** — Golden-Output Canary _(draft)_ — ACs 0/14 met (14 open) + +```mermaid +flowchart LR + E_25["E-25
Engine Truth Gate — Edge-Flow Authority + Golden-Output Canary"]:::epic_proposed + M_066["M-066 (0/9)
Edge-Flow Authority Decision"]:::ms_draft + E_25 --> M_066 + M_067["M-067 (0/13)
Engine + Template Alignment"]:::ms_draft + E_25 --> M_067 + M_068["M-068 (0/14)
Golden-Output Canary"]:::ms_draft + E_25 --> M_068 + classDef epic_active fill:#d6eaff,stroke:#1a73e8,color:#000 + classDef epic_proposed fill:#f4f4f4,stroke:#888,color:#000 + classDef ms_done fill:#d8f5d8,stroke:#2a8a2a,color:#000 + classDef ms_in_progress fill:#fff3c4,stroke:#caa400,color:#000 + classDef ms_draft fill:#f4f4f4,stroke:#888,color:#000 + classDef ms_cancelled fill:#fbeaea,stroke:#c33,color:#000 +``` + +## Open decisions + +_(none)_ + +## Open gaps + +| ID | Title | Discovered in | +|----|-------|---------------| +| G-001 | Path Analysis / Path Filters | | +| G-002 | Summary Helpers (Edge/Path Analytics) | | +| G-003 | Dependency Constraint Enforcement (Deferred M-10.03) | | +| G-004 | dag-map Layout Quality (Svelte UI) | | +| G-005 | dag-map Features Needed for Svelte UI M5+ | | +| G-006 | Svelte UI: SVG Performance at Scale | | +| G-007 | Client-Side Route Derivation for layoutFlow | | +| G-008 | Router Convergence Guard (Deferred from Phase 1) | | +| G-009 | Parallelism \`object?\` Typing (Deferred from Phase 1) | | +| G-010 | Legacy / Compatibility Surface Cleanup | | +| G-011 | Continuous Prediction / Crystal Ball Usage Pattern | | +| G-012 | Streaming Epic Not Formalized | | +| G-013 | E-18 Model Calibration Needs Crystal Ball Design Input | | +| G-014 | Deferred deletion: Engine \`POST /v1/run\` and \`POST /v1/graph\` | | +| G-016 | Rust Engine Parity — Evaluation Core Gaps | | +| G-017 | E-18 Optimization Constraints (no owner milestone) | | +| G-018 | \`IModelEvaluator\` Series-Key Shape Divergence | | +| G-019 | Sim-generated model shape vs. 
Rust engine compiler expectations | | +| G-020 | Ultrareview findings on \`epic/E-21-svelte-workbench-and-analysis\` (2026-04-20) | | +| G-022 | Heatmap view — deferred enhancements (m-E21-06 Q&A, 2026-04-23) | | +| G-023 | Topology DAG has no keyboard nav or ARIA structure (m-E21-06 AC12 homework) | | +| G-024 | Data-viz palette not validated for color-blindness (m-E21-06 AC12 homework) | | +| G-025 | Bidirectional card ↔ view selection (reverse cross-link) | | +| G-026 | Heatmap sliding-window scrubber (Blazor-parity zoom-and-pan) | | +| G-032 | \`transportation-basic\` regressed: \`edge_flow_mismatch_incoming\` × 3 after E-24 unification | | +| G-033 | Tests are too weak: surveyed-output-only canaries cannot detect drift; need deterministic golden-output assertions | | + +## Warnings + +_(none)_ + +## Recent activity + +| Date | Actor | Verb | Detail | +|------|-------|------|--------| +| 2026-05-02 | human/peter | render-roadmap | aiwf render roadmap | +| 2026-05-02 | human/peter | add | aiwf add milestone M-068 'Golden-Output Canary' | +| 2026-05-02 | human/peter | add | aiwf add milestone M-067 'Engine + Template Alignment' | +| 2026-05-02 | human/peter | add | aiwf add milestone M-066 'Edge-Flow Authority Decision' | +| 2026-05-02 | human/peter | rename | aiwf rename E-25 slug -> engine-truth-gate | + diff --git a/ai_old/README.md b/ai_old/README.md deleted file mode 100644 index c7fdc2c1..00000000 --- a/ai_old/README.md +++ /dev/null @@ -1,227 +0,0 @@ -# AI-Assisted Development Framework - -A portable, self-contained framework for AI-assisted software development using personas, skills, and structured workflows. - -## Purpose - -This framework defines: -- **Agents**: Role-based personas that guide AI behavior -- **Skills**: Reusable workflows for common development tasks -- **Instructions**: Global guardrails that apply to every session - -The framework is generic and can be adapted to any project using milestone-driven development. 
- ---- - -## Quick Start - -### Starting a Session - -Always begin with the **session-start** skill to establish context and choose your working mode. - -**Trigger phrases:** -- "Start a session" -- "Begin work" -- "Let's start" -- "Initialize context" - -The session start will ask you to choose: -1. **Agent role** - Which persona should guide the work? -2. **Task type** - What are you trying to accomplish? -3. **Context** - What epic, milestone, or feature? - -### Common Workflows - -**Planning a new epic:** -``` -session-start → architect (epic-refine) → documenter (milestone-draft) -``` - -**Implementing a milestone:** -``` -session-start → implementer (milestone-start) → tester (red-green-refactor) → documenter (milestone-wrap) -``` - -**Fixing a bug:** -``` -session-start → implementer → red-green-refactor → code-review -``` - -**Preparing a release:** -``` -session-start → documenter (milestone-wrap + release) -``` - ---- - -## Structure - -### `/agents/` -Role-based personas that define focus areas and typical skill usage: -- **architect**: Design decisions, system boundaries, epic planning -- **implementer**: Coding with minimal risk and clear intent -- **tester**: Test planning, TDD workflow, regression safety -- **documenter**: Documentation quality, consistency, release notes -- **deployer**: Infrastructure, packaging, release execution - -### `/skills/` -Reusable workflows organized by lifecycle stage: - -**Epic Lifecycle:** -- `epic-refine`: Clarify scope and decisions before milestone planning -- `epic-start`: Initialize context when beginning epic work -- `epic-wrap`: Close out completed epic and sync documentation - -**Milestone Lifecycle:** -- `milestone-draft`: Create milestone specifications -- `milestone-start`: Begin implementation work -- `milestone-wrap`: Complete milestone and update docs - -**Development Workflows:** -- `session-start`: Initialize working session and choose role -- `red-green-refactor`: TDD cycle (write failing test → make 
it pass → improve) -- `code-review`: Review changes for correctness and regressions -- `branching`: Apply milestone-driven branching strategy -- `ui-debug`: Diagnose UI issues with deterministic tests - -**Product & Release:** -- `roadmap`: Maintain epic lifecycle and planning documents -- `release`: Execute release ceremony -- `deployment`: Follow deployment procedures -- `gap-triage`: Handle discovered gaps during work - -### `/instructions/` -Global guardrails that apply regardless of agent or skill: -- **ALWAYS_DO.md**: Core rules for every session - ---- - -## When to Use Which Agent - -### Starting Work -**Use session-start first** - it will help you choose the right agent and task. - -### By Task Type -- **"I need to plan a new feature/epic"** → architect + epic-refine -- **"Let's implement milestone X"** → implementer + milestone-start -- **"Write tests for..."** → tester + red-green-refactor -- **"Update the documentation"** → documenter -- **"Prepare a release"** → documenter + release -- **"Debug the UI"** → implementer + ui-debug -- **"Deploy to production"** → deployer + deployment - -### By Question Type -- **"How should we approach..."** → architect -- **"How do I test..."** → tester -- **"Where should this be documented?"** → documenter -- **"Should this be in scope?"** → architect + gap-triage - ---- - -## Skill Trigger Phrases - -Each skill has specific trigger phrases. Here's a quick reference: - -### Epic Work -- **epic-refine**: "refine the epic", "clarify epic scope", "what are we building?" 
-- **epic-start**: "start epic X", "begin epic work", "initialize epic" -- **epic-wrap**: "close the epic", "epic is complete", "archive milestones" - -### Milestone Work -- **milestone-draft**: "create milestone spec", "draft milestone", "plan milestone" -- **milestone-start**: "start milestone", "begin M-XX.XX", "implement milestone" -- **milestone-wrap**: "complete milestone", "close milestone", "wrap up" - -### Development -- **session-start**: "start session", "begin work", "initialize" -- **red-green-refactor**: "write tests", "TDD", "test-driven", "implement feature" -- **code-review**: "review changes", "check the code", "look for issues" -- **branching**: "create branch", "branch strategy", "git workflow" - -### Documentation & Release -- **roadmap**: "update roadmap", "plan epics", "epic lifecycle" -- **release**: "release", "tag version", "publish" -- **gap-triage**: "found a gap", "missing requirement", "unexpected issue" - ---- - -## Core Principles - -### 1. Milestone-Driven Development -Work is organized into: -- **Epics**: Coherent architectural or product themes -- **Milestones**: Concrete, scoped deliverables within epics -- **Features**: Individual changes within milestones - -### 2. Test-Driven Development (TDD) -Always follow RED → GREEN → REFACTOR: -1. **RED**: Write failing tests first -2. **GREEN**: Implement minimum code to pass -3. **REFACTOR**: Improve structure with tests passing - -### 3. Documentation as Code -- Milestone specs are authoritative and living documents -- Tracking documents record progress during implementation -- Release notes capture what shipped and why - -### 4. No Time Estimates -Never include effort estimates, timelines, or target dates. Focus on: -- Clear requirements and acceptance criteria -- Dependency sequences -- Scope boundaries (in/out of scope) - -### 5. 
Incremental Progress -- Keep changes small and focused -- Build and test before handoff -- Document decisions as you go - ---- - -## Customization - -This framework is designed to be portable. To adapt it to your project: - -1. **Update agent responsibilities** to match your team structure -2. **Modify skills** to match your workflows and tooling -3. **Adjust ALWAYS_DO** for project-specific conventions -4. **Keep the structure** - the agent/skill separation is key - -### Project-Specific Hooks -Look for these markers in skills that may need customization: -- File paths and directory structures -- Build and test commands -- Documentation locations -- Branch naming conventions -- Version number formats - ---- - -## Anti-Patterns - -### ❌ DON'T -- Skip session-start and jump into coding -- Include time or effort estimates -- Write code without tests -- Merge to main without review -- Skip documentation updates -- Use vague requirements ("make it better") - -### ✅ DO -- Start every session with session-start -- Write tests first (RED → GREEN → REFACTOR) -- Keep milestone specs stable -- Update tracking docs frequently -- Make small, focused changes -- Document decisions inline - ---- - -## Getting Help - -If you're unsure which agent or skill to use: -1. Run **session-start** - it will guide you -2. Review the "When to Use" sections above -3. Check individual skill files for detailed guidance -4. Ask: "What should I do to [accomplish goal]?" - -The framework is designed to guide you through context gathering and decision-making. diff --git a/ai_old/agents/architect.md b/ai_old/agents/architect.md deleted file mode 100644 index afd6a0cf..00000000 --- a/ai_old/agents/architect.md +++ /dev/null @@ -1,17 +0,0 @@ -# Agent: architect - -Focus: design decisions, system boundaries, and alignment with architecture docs. - -Responsibilities: -- Clarify goals, constraints, and tradeoffs. -- Propose alternatives and record rationale. 
-- Keep epic docs in `docs/architecture/` aligned with decisions. -- Ensure milestones map to epic scope and roadmap sequencing. - -Typical skills: -- epic-refine -- epic-start -- milestone-draft -- branching (for epic integration) -- roadmap -- gap-triage diff --git a/ai_old/agents/deployer.md b/ai_old/agents/deployer.md deleted file mode 100644 index e0117b18..00000000 --- a/ai_old/agents/deployer.md +++ /dev/null @@ -1,12 +0,0 @@ -# Agent: deployer - -Focus: infrastructure, packaging, and release execution. - -Responsibilities: -- Follow deployment runbooks when present. -- Validate versioning, tagging, and release artifacts. -- Record deployment outcomes. - -Typical skills: -- deployment -- release diff --git a/ai_old/agents/documenter.md b/ai_old/agents/documenter.md deleted file mode 100644 index 2ce04d03..00000000 --- a/ai_old/agents/documenter.md +++ /dev/null @@ -1,14 +0,0 @@ -# Agent: documenter - -Focus: documentation quality, consistency, and release notes. - -Responsibilities: -- Draft and update milestone specs and tracking docs. -- Keep roadmap, charters, and reference docs aligned. -- Produce release notes and verify doc cross-links. - -Typical skills: -- milestone-draft -- milestone-wrap -- release -- gap-triage diff --git a/ai_old/agents/implementer.md b/ai_old/agents/implementer.md deleted file mode 100644 index 24431267..00000000 --- a/ai_old/agents/implementer.md +++ /dev/null @@ -1,13 +0,0 @@ -# Agent: implementer - -Focus: coding changes with minimal risk and clear intent. - -Responsibilities: -- Implement milestone requirements with precise edits. -- Follow existing patterns and contracts. -- Keep related docs/tests in sync. 
- -Typical skills: -- milestone-start -- red-green-refactor -- code-review (self-check) diff --git a/ai_old/agents/tester.md b/ai_old/agents/tester.md deleted file mode 100644 index ae64486e..00000000 --- a/ai_old/agents/tester.md +++ /dev/null @@ -1,12 +0,0 @@ -# Agent: tester - -Focus: test planning, TDD workflow, and regression safety. - -Responsibilities: -- Define RED/GREEN/REFACTOR steps. -- Choose deterministic, minimal test coverage for the change. -- Record test results in tracking docs. - -Typical skills: -- red-green-refactor -- code-review (test-centric) diff --git a/ai_old/instructions/ALWAYS_DO.md b/ai_old/instructions/ALWAYS_DO.md deleted file mode 100644 index 5c8a5edc..00000000 --- a/ai_old/instructions/ALWAYS_DO.md +++ /dev/null @@ -1,21 +0,0 @@ -# ALWAYS_DO - -Purpose: global guardrails that apply to every session, regardless of role. - -## Core guardrails -- Follow `.github/copilot-instructions.md` and relevant docs under `docs/development/`. -- No time or effort estimates in docs or plans. -- Use `rg`/`fd` for searches; avoid destructive git. -- Keep docs/schemas/templates aligned when touching contracts. -- Tests must be deterministic; avoid external network calls. -- Use Mermaid for diagrams; avoid ASCII art boxes. -- Prefer minimal, precise edits; avoid broad refactors without context. - -## Session hygiene -- Confirm epic context before milestone execution; if missing, run `epic-refine`/`epic-start`. -- Use TDD: RED -> GREEN -> REFACTOR; list tests first in tracking docs. -- Keep milestone specs stable; use tracking docs for progress updates. - -## Build/test -- Build and test before handoff when asked to implement changes. -- If full test suite is too slow, run per-project tests and record results. 
diff --git a/ai_old/skills/branching.md b/ai_old/skills/branching.md deleted file mode 100644 index 2a5b7cbe..00000000 --- a/ai_old/skills/branching.md +++ /dev/null @@ -1,13 +0,0 @@ -# Skill: branching - -Purpose: apply the milestone-driven branching strategy. - -Use when: -- Starting epic or milestone work. - -Reference: -- docs/development/branching-strategy.md - -Notes: -- Use epic integration branches when main only advances at epic completion. -- Use milestone branches for multi-surface work. diff --git a/ai_old/skills/code-review.md b/ai_old/skills/code-review.md deleted file mode 100644 index 857904d4..00000000 --- a/ai_old/skills/code-review.md +++ /dev/null @@ -1,13 +0,0 @@ -# Skill: code-review - -Purpose: review changes for correctness, regressions, and missing tests. - -Use when: -- Asked for a review or before wrapping a milestone. - -Checklist: -- Behavior: regressions, edge cases, error handling. -- Tests: coverage for new logic, deterministic behavior. -- Contracts: schema/API compatibility preserved. -- Docs: updated where behavior or contracts changed. -- Style: matches existing patterns; no broad refactors. diff --git a/ai_old/skills/deployment.md b/ai_old/skills/deployment.md deleted file mode 100644 index f08c6c79..00000000 --- a/ai_old/skills/deployment.md +++ /dev/null @@ -1,9 +0,0 @@ -# Skill: deployment - -Purpose: placeholder for deployment procedure. - -Use when: -- Deployment runbook is defined. - -Status: -- Placeholder. Add steps once deployment process is documented. 
diff --git a/ai_old/skills/epic-refine.md b/ai_old/skills/epic-refine.md deleted file mode 100644 index 264c222d..00000000 --- a/ai_old/skills/epic-refine.md +++ /dev/null @@ -1,61 +0,0 @@ -# Skill: epic-refine - -**Trigger phrases:** "refine the epic", "clarify epic scope", "what are we building?", "define epic boundaries", "epic planning session" - -## Purpose - -Run a human-in-the-loop preflight to confirm epic scope, decisions, and constraints before any milestone specs are drafted. This is the critical first step that prevents scope creep and misalignment. - -## Use When -- A new epic is starting and you need shared context and decisions up front -- The epic is specified but still ambiguous or has open questions -- Stakeholders need to align on goals and boundaries -- You're unsure what to build or why - -## Inputs -- Epic name and slug (short identifier, e.g., "classes", "ui-perf", "service-buffer") -- High-level goal or problem statement -- Known constraints (platform, data, dependencies, teams) -- Known dependencies (other epics, milestones, external systems) - -## Process - -### 1. Create Epic Structure -- Create epic folder: `docs/architecture//` -- Initialize `README.md` in the epic folder -- Add entry to roadmap document - -### 2. Review Existing Context -Check if related documentation exists: -- Epic folder and README -- Roadmap entries -- Related architecture docs -- Existing milestones or features - -### 3. Run Structured Q&A -Capture answers to all questions below. Don't skip any—they prevent costly rework later. - -Structured Q&A (capture answers verbatim): -- Goal: what is the epic trying to enable or fix? -- Success criteria: what observable outcomes signal success? -- In scope: what must be included to claim completion? -- Out of scope: what is explicitly excluded? -- Dependencies: what milestones, systems, or teams must align first? -- Data/contracts: any schema or API changes required? 
-- Surfaces: which products are affected (API, UI, CLI, Sim, Core)? -- Risks: what could derail or invalidate the plan? -- Testing: how will we validate correctness and regressions? -- Observability: what metrics, logs, or diagnostics are needed? -- Security: any auth, privacy, or access changes? -- Rollout: phased delivery, migration, or compatibility steps? -- Documentation: which docs must change at epic or milestone completion? - -Outputs: -- Epic summary (one paragraph) and decisions list. -- Open questions list with owners. -- Milestone outline (IDs optional but preferred) with rough sequencing. -- A clear handoff to milestone-draft for each planned milestone. - -Notes: -- Do not start coding or implementation planning here. -- Use milestone-draft once the epic decisions are confirmed. diff --git a/ai_old/skills/epic-start.md b/ai_old/skills/epic-start.md deleted file mode 100644 index a5f67cab..00000000 --- a/ai_old/skills/epic-start.md +++ /dev/null @@ -1,66 +0,0 @@ -# Skill: epic-start - -**Trigger phrases:** "start epic", "begin epic work", "initialize epic", "resume epic", "epic context" - -## Purpose - -Initialize or confirm epic context before milestone work begins. This ensures you have the necessary background and setup to proceed with implementation. - -## Use When -- Starting work on a new epic -- Returning to an epic after time away and context is unclear -- Switching between epics -- Team members joining ongoing epic work - -## Inputs -- Epic name and slug -- Target branch strategy (epic integration or direct to mainline) - -## Process - -### 1. Locate Epic Documentation -- Find epic folder: `docs/architecture//` -- Read epic README for scope and goals -- Check roadmap for epic status and sequencing - -### 2. 
Review Core Context -Gather understanding of: -- **Scope**: What's included and excluded -- **Goals**: What success looks like -- **Milestones**: Planned breakdown of work -- **Dependencies**: What must exist first -- **Status**: Where the epic stands now - -### 3. Confirm or Create Epic Branch - -**Branch naming:** `epic/` - -Example: `epic/ui-perf`, `epic/classes`, `epic/service-buffer` - -**When to use epic branches:** -- Mainline (main) only advances when epics complete -- Multiple milestones in the epic need to integrate -- Epic spans significant time or complexity - -**Commands:** -```bash -git checkout main -git pull -git checkout -b epic/ -git push -u origin epic/ -``` - -### 4. Summarize Epic Context -Produce a brief summary: -- Epic goal and value -- Planned milestones (in order) -- Current status (which milestones are done/in-progress/planned) -- Next milestone to work on - -### 5. Hand Off to Milestone Work -Explicitly transition: "Ready to start milestone [ID]. Use milestone-start to begin." - -## Outputs -- Epic context summary -- Confirmed branch plan -- Named milestone to start or continue diff --git a/ai_old/skills/epic-wrap.md b/ai_old/skills/epic-wrap.md deleted file mode 100644 index 1ddfdfd7..00000000 --- a/ai_old/skills/epic-wrap.md +++ /dev/null @@ -1,20 +0,0 @@ -# Skill: epic-wrap - -Purpose: close out an epic, sync docs, and decide PR vs direct merge. - -Use when: -- All milestones in the epic are complete. - -Process: -1) Verify each milestone status is ✅ Complete. -2) Move milestone specs to docs/milestones/completed/ as a batch. -3) Update: - - docs/architecture/epic-roadmap.md - - docs/ROADMAP.md - - docs/flowtime-charter.md and docs/flowtime-engine-charter.md (if impacted) -4) Ask whether the epic should merge via PR or directly to main. -5) Ensure release ceremony steps are ready or completed. - -Outputs: -- Epic docs and roadmap updated. -- Milestones archived together. 
diff --git a/ai_old/skills/gap-triage.md b/ai_old/skills/gap-triage.md deleted file mode 100644 index 984ace74..00000000 --- a/ai_old/skills/gap-triage.md +++ /dev/null @@ -1,25 +0,0 @@ -# Skill: gap-triage - -Purpose: record gaps discovered during work and decide whether to include now or defer. - -Use when: -- A missing requirement, design hole, or unexpected limitation is found. - -Inputs: -- Gap description -- Where it was found (file, test, doc) -- Impact and risk - -Process: -1) Record the gap in the backlog section of docs/ROADMAP.md (or the agreed backlog doc). -2) Classify: scope gap, design gap, implementation gap, or documentation gap. -3) Decide disposition with the user: - - Include in current milestone (update scope + tests) - - Defer to a specific milestone (preferred) - - Defer to a future epic (if broader) -4) If deferred, record the target epic/milestone and rationale. -5) If included now, update the milestone spec and tracking doc. - -Outputs: -- Gap recorded with owner and target. -- Milestone scope updated if the gap is pulled in. diff --git a/ai_old/skills/milestone-draft.md b/ai_old/skills/milestone-draft.md deleted file mode 100644 index cdcc1c33..00000000 --- a/ai_old/skills/milestone-draft.md +++ /dev/null @@ -1,54 +0,0 @@ -# Skill: milestone-draft - -**Trigger phrases:** "create milestone spec", "draft milestone", "plan milestone", "write milestone document" - -## Purpose - -Draft milestone specification documents after epic-refine is complete. The milestone spec is the authoritative plan for implementation—it must be clear, complete, and actionable. 
- -## Use When -- The epic is refined and you're ready to create milestone specs -- A milestone doesn't exist yet -- An existing milestone needs complete rewrite for clarity - -## Inputs -- Epic slug and context -- Milestone ID and title -- Dependencies (what must complete first) -- Target statement (one-sentence goal) -- Decisions or constraints from epic-refine - -## Guardrails - -**ALWAYS:** -- ✅ Write testable acceptance criteria -- ✅ Define clear scope boundaries (in/out) -- ✅ Use Mermaid for diagrams -- ✅ Include comprehensive test plan -- ✅ Make it implementation-ready - -**NEVER:** -- ❌ Include time or effort estimates -- ❌ Use vague requirements -- ❌ Skip test plan -- ❌ Use ASCII art for diagrams -- ❌ Leave scope ambiguous - -Process: -1) Ensure epic context exists. If missing, run epic-refine first. -2) Create or update the milestone spec under docs/milestones/ (filename may include a descriptive suffix). -3) Populate required sections with testable acceptance criteria and explicit scope boundaries. -4) Include a TDD-ready implementation plan that calls out RED -> GREEN -> REFACTOR. -5) Add or update references in: - - docs/architecture/epic-roadmap.md (epic status and milestone list) - - docs/ROADMAP.md (high-level status) - - docs/architecture//README.md (if it lists milestones) - -Outputs: -- A complete milestone spec in docs/milestones/. -- Clear dependencies and success criteria. -- TDD-ready implementation phases and test plan. - -Notes: -- Do not start implementation in this skill. -- Use milestone-start when the milestone is ready to begin. diff --git a/ai_old/skills/milestone-start.md b/ai_old/skills/milestone-start.md deleted file mode 100644 index 7cf2610a..00000000 --- a/ai_old/skills/milestone-start.md +++ /dev/null @@ -1,101 +0,0 @@ -# Skill: milestone-start - -**Trigger phrases:** "start milestone", "begin milestone work", "implement milestone", "work on M-XX.XX" - -## Purpose - -Start or resume work on an existing milestone. 
This bridges planning and implementation. - -## Use When -- Milestone spec exists and is marked 📋 Planned or 🔄 In Progress -- You're ready to begin coding -- Resuming milestone work after interruption - -## Inputs -- Milestone ID (e.g., M-02.10, SIM-M-03.00, UI-M-02.09) -- Epic slug (if part of an epic) - -## Preflight Check - -**Before starting:** -- ✅ Milestone spec exists and is complete -- ✅ Dependencies are satisfied -- ✅ Epic context is clear (run epic-start if needed) -- ✅ Build and tests currently pass - -## Process - -### 1. Open Milestone Spec -- Location: `docs/milestones/.md` -- Read: target, requirements, acceptance criteria -- Understand: phases, test plan, file impacts - -### 2. Create Tracking Document - -**First time starting milestone:** -- Create `docs/milestones/tracking/-tracking.md` -- Copy from tracking template -- Populate: milestone ID, start date, TDD plan - -**Tracking template structure:** -```markdown -# [Milestone ID] Tracking - -**Status:** 🔄 In Progress -**Started:** [date] -**Branch:** [branch-name] - -## Progress Summary -[Brief status update] - -## Phases -### Phase 1: [Name] -- [ ] Task 1 -- [ ] Task 2 - -### Phase 2: [Name] -- [ ] Task 1 - -## Test Results -[Record test outcomes] - -## Decisions & Notes -[Document key choices] -``` - -### 3. Create or Confirm Branch - -**Branch naming conventions:** -- Feature in milestone: `feature/-mX/` -- Milestone integration: `milestone/mX` -- Epic integration: `epic/` - -**Example commands:** -```bash -# Feature branch from milestone -git checkout milestone/m2 -git checkout -b feature/api-m2/add-endpoint - -# Or direct from main for simple changes -git checkout main -git pull -git checkout -b feature/api-m2/add-endpoint -``` - -### 4. Plan TDD Approach - -List tests BEFORE writing implementation: -1. What behavior needs testing? -2. What are the edge cases? -3. What could break? -4. How will you know it works? - -Update tracking doc with test plan. - -### 5. 
Begin RED → GREEN → REFACTOR - -Transition to red-green-refactor skill for implementation cycle. - -Outputs: -- Active tracking doc. -- Branch ready for implementation. diff --git a/ai_old/skills/milestone-wrap.md b/ai_old/skills/milestone-wrap.md deleted file mode 100644 index d533f763..00000000 --- a/ai_old/skills/milestone-wrap.md +++ /dev/null @@ -1,123 +0,0 @@ -# Skill: milestone-wrap - -**Trigger phrases:** "complete milestone", "close milestone", "wrap up", "milestone done", "finish milestone" - -## Purpose - -Complete a milestone without merging to main. This ensures all documentation and tracking artifacts are finalized before the milestone is considered done. - -## Use When -- All acceptance criteria are met -- All tests pass -- Implementation is complete -- Ready to hand off for review or next milestone - -## Process - -### 1. Verify Completion - -**Check acceptance criteria:** -- [ ] All functional requirements implemented -- [ ] All tests written and passing -- [ ] No known regressions -- [ ] Edge cases handled -- [ ] Error handling complete - -**Run full test suite:** -```bash -# Build first -dotnet build - -# Then test -dotnet test --nologo -``` - -### 2. Update Milestone Status - -In the milestone spec (`docs/milestones/.md`): -- Change status to ✅ Complete -- Add completion date -- Note any deferred scope - -### 3. Finalize Tracking Document - -In tracking doc (`docs/milestones/tracking/-tracking.md`): -- Mark all phases complete -- Record final test results -- Summarize what was delivered -- Document any decisions or tradeoffs -- List any follow-up work or gaps - -### 4. 
Update Related Documentation - -**Documentation sweep checklist:** -- [ ] Roadmap updated with milestone status -- [ ] Epic status reflects milestone completion -- [ ] Release notes drafted for changes -- [ ] Architecture docs updated if design changed -- [ ] Reference docs updated if capabilities changed -- [ ] Charters updated if product scope changed -- [ ] API docs updated if contracts changed -- [ ] Schema docs updated if formats changed - -**Common docs to check:** -- `docs/ROADMAP.md` - high-level status -- `docs/architecture/epic-roadmap.md` - epic progress -- Epic README in `docs/architecture//` -- `docs/releases/` - add release note -- Reference docs in `docs/reference/` -- Concept docs in `docs/concepts/` - -### 5. Stay on Branch - -**Do NOT merge to main yet.** - -The milestone branch stays open for: -- Review and approval -- Integration testing -- Next milestone branching -- Epic integration (if using epic branches) - -Next milestone can branch from this one if sequential. - -### 6. 
Prepare Handoff - -Create a summary for handoff: -```markdown -## Milestone [ID] Complete - -**What shipped:** -- [Key deliverable 1] -- [Key deliverable 2] - -**Tests:** -- All passing ✅ -- Coverage: [summary] - -**Documentation:** -- [Updated doc 1] -- [Updated doc 2] - -**Next steps:** -- Review and merge -- Or: start next milestone [ID] -``` - -## Outputs - -- ✅ Milestone status updated -- ✅ Tracking doc finalized -- ✅ Related docs updated -- ✅ Tests passing -- ✅ Branch ready for review -- ✅ Handoff summary prepared - -## Notes - -**Milestone completion ≠ epic completion** -- Milestones stay in `docs/milestones/` until epic wraps -- Archive to `docs/milestones/completed/` only when epic closes - -**If last milestone in epic:** -- Proceed to epic-wrap next -- That's when you merge to main diff --git a/ai_old/skills/red-green-refactor.md b/ai_old/skills/red-green-refactor.md deleted file mode 100644 index 0b528ea6..00000000 --- a/ai_old/skills/red-green-refactor.md +++ /dev/null @@ -1,12 +0,0 @@ -# Skill: red-green-refactor - -Purpose: enforce TDD cadence and test-first workflow. - -Use when: -- Implementing milestone tasks. - -Process: -1) RED: write failing tests first (record in tracking doc). -2) GREEN: implement minimum change to pass tests. -3) REFACTOR: improve structure with tests still passing. -4) Repeat per task; keep tests deterministic. diff --git a/ai_old/skills/release.md b/ai_old/skills/release.md deleted file mode 100644 index f576b98f..00000000 --- a/ai_old/skills/release.md +++ /dev/null @@ -1,12 +0,0 @@ -# Skill: release - -Purpose: execute the release ceremony after merging to main. - -Use when: -- A milestone or significant feature has merged to main. - -Reference: -- docs/development/release-ceremony.md - -Outputs: -- Version bump, release notes, tag, and pushed changes. 
diff --git a/ai_old/skills/roadmap.md b/ai_old/skills/roadmap.md deleted file mode 100644 index 643e3c56..00000000 --- a/ai_old/skills/roadmap.md +++ /dev/null @@ -1,35 +0,0 @@ -# Skill: roadmap - -Purpose: maintain the lifecycle of epics between the high-level roadmap and epic architecture docs. - -Use when: -- Proposing new epics. -- Promoting an epic from idea to planned/active. -- Closing an epic. - -Policy: -- docs/ROADMAP.md is the authoritative list of proposed and active epics. -- docs/architecture/epic-roadmap.md is the authoritative list of epics with docs. - -Status flow: -- Proposed: only in docs/ROADMAP.md. -- Planned: epic folder exists with README.md. -- Active: milestones underway. -- Complete: epic wrapped and milestones archived. - -Process: -1) Proposed epic - - Add to docs/ROADMAP.md with intent and rough ordering. - - No epic folder required. -2) Promote to Planned/Active - - Run epic-refine. - - Create docs/architecture//README.md. - - Add to docs/architecture/epic-roadmap.md. - - Update docs/ROADMAP.md status. -3) Complete - - Run epic-wrap. - - Update docs/ROADMAP.md and docs/architecture/epic-roadmap.md. - -Outputs: -- Roadmap and epic roadmap in sync. -- Clear epic status at each stage. diff --git a/ai_old/skills/session-start.md b/ai_old/skills/session-start.md deleted file mode 100644 index dd56f099..00000000 --- a/ai_old/skills/session-start.md +++ /dev/null @@ -1,233 +0,0 @@ -# Skill: session-start - -**Trigger phrases:** "start session", "begin work", "let's start", "initialize", "new session" - -## Purpose - -Initialize a working session by establishing context, choosing the appropriate agent role, and defining the task scope. This is the entry point for all AI-assisted work. 
- -## Why Start Here - -Starting with session-start ensures: -- Clear understanding of goals and context -- Appropriate agent persona is selected -- Correct skills are activated -- Guardrails are established -- Progress can be tracked effectively - -## When to Use - -**Always use this at the beginning of:** -- A new work session -- Switching between different types of work -- Returning to work after time away -- Collaborating with AI on an unfamiliar task - -**Skip this only if:** -- Continuing active work in the same context -- Making trivial edits or quick fixes -- Already in an established session - -## Process - -### Step 1: Understand the Goal - -Ask the user to describe their goal in one of these categories: - -**Epic Work:** -- "I want to plan a new feature/capability" → leads to epic-refine -- "I need to start work on an epic" → leads to epic-start -- "An epic is complete and needs closure" → leads to epic-wrap - -**Milestone Work:** -- "I want to create a milestone specification" → leads to milestone-draft -- "I want to implement milestone X" → leads to milestone-start -- "Milestone is done, need to wrap it up" → leads to milestone-wrap - -**Development Tasks:** -- "I want to implement a feature" → leads to red-green-refactor -- "I need to write tests" → leads to red-green-refactor -- "I want to fix a bug" → leads to red-green-refactor -- "I need to review code" → leads to code-review -- "UI is not working correctly" → leads to ui-debug - -**Documentation & Planning:** -- "I need to update documentation" → documenter agent -- "I want to plan the roadmap" → leads to roadmap skill -- "Found a gap or missing requirement" → leads to gap-triage - -**Release & Deployment:** -- "Ready to release" → leads to release -- "Need to deploy" → leads to deployment - -**Other:** -- "I'm not sure what to do" → continue with guided questions -- "Help me understand the codebase" → exploration mode - -### Step 2: Select Agent Persona - -Based on the goal, recommend the 
appropriate agent: - -| Goal Type | Primary Agent | Why | -|-----------|---------------|-----| -| Epic planning, design decisions | **architect** | Focuses on scope, boundaries, and tradeoffs | -| Coding, bug fixes, features | **implementer** | Focuses on precise, safe code changes | -| Writing tests, TDD | **tester** | Focuses on test planning and coverage | -| Documentation, release notes | **documenter** | Focuses on clarity and consistency | -| Deployment, infrastructure | **deployer** | Focuses on release mechanics | - -**Explain the choice:** Tell the user why this agent is appropriate and what to expect. - -### Step 3: Gather Context - -Collect the minimum necessary context: - -**For epic work:** -- Epic name/slug -- High-level goal or problem -- Known constraints -- Related documentation (if any) - -**For milestone work:** -- Milestone ID (e.g., M-02.10, SIM-M-03.00) -- Epic it belongs to (if known) -- Current status (planned, in-progress, complete) -- Location of spec and tracking docs - -**For development tasks:** -- What needs to change? -- Which files or components? -- Are there existing tests? -- What's the expected behavior? - -**For bugs:** -- Steps to reproduce -- Expected vs actual behavior -- Error messages or logs -- Which components are affected? - -### Step 4: Confirm Guardrails - -Remind the user (and yourself) of core principles: - -**Always:** -- Follow TDD: write tests first (RED → GREEN → REFACTOR) -- Build and test before considering work complete -- Keep changes small and focused -- Document decisions inline -- Update relevant docs when behavior changes - -**Never:** -- Include time or effort estimates -- Skip tests -- Make broad refactors without context -- Commit without running tests -- Merge to main without review - -### Step 5: Establish Branch Context - -If this is implementation work, confirm the branch strategy: - -**Ask:** -- Are you on the correct branch for this work? -- Is this a new feature, milestone, or bug fix? 
-- Should we create a new branch? - -**Branch naming conventions:** -- `epic/` - epic integration branch -- `milestone/mX` - milestone integration branch -- `feature/-mX/` - feature work branch -- `fix/` - bug fix branch - -### Step 6: Create or Resume Tracking - -For milestone work: -- Check if tracking doc exists -- If starting fresh, create from tracking template -- If resuming, read current status and last completed phase -- Update tracking doc with session start time and goals - -### Step 7: Hand Off to Skill - -Based on the gathered context, explicitly hand off to the appropriate skill: - -**Example handoffs:** -- "Starting epic-refine for [epic name]..." -- "Beginning milestone-start for M-02.10..." -- "Entering red-green-refactor cycle for [feature]..." -- "Running code-review on recent changes..." - -## Outputs - -Session-start produces: -1. **Chosen agent persona** with rationale -2. **Identified skill** to execute next -3. **Gathered context** sufficient to begin work -4. **Confirmed guardrails** for the session -5. **Active tracking** (for milestone work) -6. **Branch confirmation** (for implementation work) - -## Template Prompt - -When a user requests to start a session, use this template: - -``` -Welcome! Let's start your session. - -**What would you like to accomplish?** -(Examples: plan an epic, implement a milestone, fix a bug, write tests, update docs, prepare release) - -[Wait for response] - -Based on that goal, I recommend the **[agent name]** persona. -This agent focuses on [agent's focus area] and typically uses these skills: -- [skill 1] -- [skill 2] - -**Context check:** -- [Ask relevant context questions based on goal type] - -**Guardrails reminder:** -- ✅ Tests first (RED → GREEN → REFACTOR) -- ✅ Build and test before handoff -- ✅ Small, focused changes -- ❌ No time estimates -- ❌ No untested code - -**Ready to begin?** I'll now [describe which skill will run next]. 
-``` - -## Notes - -- Session-start is **not optional** for complex work - it prevents costly context-switching -- For quick edits or clarifications, you can skip session-start -- The chosen agent is a **guide**, not a constraint - switch agents if the work changes -- Update tracking docs frequently during implementation -- End sessions cleanly: summarize what was done, what's next, and any open questions - -## Common Session Paths - -**Path 1: New Epic** -``` -session-start → architect + epic-refine → [gap clarification] → documenter + milestone-draft -``` - -**Path 2: Implement Milestone** -``` -session-start → implementer + milestone-start → [create tracking] → tester + red-green-refactor → [tests pass] → documenter + milestone-wrap -``` - -**Path 3: Bug Fix** -``` -session-start → implementer → [identify root cause] → tester + red-green-refactor → code-review -``` - -**Path 4: Documentation Update** -``` -session-start → documenter → [identify affected docs] → [update docs] → code-review -``` - -**Path 5: Release** -``` -session-start → documenter + milestone-wrap → [verify completeness] → release -``` diff --git a/ai_old/skills/ui-debug.md b/ai_old/skills/ui-debug.md deleted file mode 100644 index ec826391..00000000 --- a/ai_old/skills/ui-debug.md +++ /dev/null @@ -1,15 +0,0 @@ -# Skill: ui-debug - -Purpose: diagnose UI issues quickly and reproducibly. - -Use when: -- UI behavior is incorrect, flaky, or unclear. - -Process: -1) Reproduce with the smallest possible scenario. -2) Prefer Playwright tests for deterministic reproduction. -3) If updating snapshots, note why and keep diffs minimal. -4) Avoid external network calls; mock or stub where needed. - -Notes: -- If a Playwright test exists, run or update it before manual fixes. 
diff --git a/aiwf.yaml b/aiwf.yaml new file mode 100644 index 00000000..a95b6fe1 --- /dev/null +++ b/aiwf.yaml @@ -0,0 +1 @@ +aiwf_version: v0.1.1 diff --git a/current_status.md b/current_status.md deleted file mode 100644 index 4eef81d9..00000000 --- a/current_status.md +++ /dev/null @@ -1,226 +0,0 @@ -# FlowTime Current Status — 2026-04-13 - -This file captures session context for handoff. It reflects the state at end of -the m-E18-12 implementation session on the `epic/E-18-time-machine` branch. - ---- - -## What just happened (m-E18-12) - -Implemented multi-parameter optimization using the Nelder-Mead simplex algorithm. -All acceptance criteria from the milestone spec are met. 100% branch coverage -confirmed, including all five non-obvious algorithm paths: - -| Path | How covered | -|------|-------------| -| Pre-loop convergence (iterations=0) | Bowl1D with tiny range [49,51], tolerance=0.1 | -| Reflection → accept (normal) | Bowl1D and QuadraticEvaluator (existing tests) | -| Expansion accepted | QuadraticEvaluator at known minimum | -| Expansion rejected → accept reflection | AbsEvaluator(target=90), range [0,200] | -| Outside contraction fail → shrink (line 116) | StepEvaluator(peak=50, valley=45), iter 1 | -| Inside contraction fail → shrink (line 128) | StepEvaluator(peak=50, valley=45), iter 2 | - -**Test counts (final):** -- `OptimizeSpec`: 17 unit tests (validation: missing fields, mismatched ranges, etc.) 
-- `Optimizer`: 12 unit tests (algorithm correctness + all branch coverage) -- `OptimizeEndpointsTests`: 10 API tests (9 × 400 validation paths + 1 × 503 when engine disabled) -- Total `FlowTime.TimeMachine.Tests`: 192 tests -- Total `FlowTime.Api.Tests`: 260 tests - -**New types delivered:** -- `OptimizeSpec` — paramIds, metricSeriesId, objective, searchRanges, tolerance, maxIterations -- `OptimizeResult` — Converged, Iterations, ParamValues, AchievedMetricMean -- `OptimizeObjective` — enum: Minimize / Maximize (internally always minimizes; Maximize negates metric) -- `SearchRange` — record: Lo, Hi (with validation Lo < Hi) -- `Optimizer` — Nelder-Mead simplex; injectable IModelEvaluator; IAsyncDisposable -- `POST /v1/optimize` endpoint - ---- - -## Nothing committed yet - -All of the following are staged/untracked but NOT committed: - -**m-E18-12 implementation:** -- `src/FlowTime.TimeMachine/Sweep/OptimizeSpec.cs` -- `src/FlowTime.TimeMachine/Sweep/OptimizeResult.cs` -- `src/FlowTime.TimeMachine/Sweep/OptimizeObjective.cs` -- `src/FlowTime.TimeMachine/Sweep/SearchRange.cs` -- `src/FlowTime.TimeMachine/Sweep/Optimizer.cs` -- `src/FlowTime.API/Endpoints/OptimizeEndpoints.cs` -- `src/FlowTime.API/Program.cs` (endpoint registration) -- `tests/FlowTime.TimeMachine.Tests/Sweep/OptimizeSpecTests.cs` -- `tests/FlowTime.TimeMachine.Tests/Sweep/OptimizerTests.cs` -- `tests/FlowTime.Api.Tests/OptimizeEndpointsTests.cs` -- `work/epics/E-18-headless-pipeline-and-optimization/m-E18-12-optimization.md` - -**New documentation:** -- `docs/notes/ui-optimization-explorer-vision.md` — aspirational optimization UI vision -- `docs/notes/model-discovery-path.md` — three-stage path from raw data to fitted model -- `work/epics/E-18-headless-pipeline-and-optimization/e18-gap-analysis.md` — thorough gap analysis - -**Status surface updates:** -- `CLAUDE.md` — m-E18-12 added as complete with accurate test counts + gap analysis reference -- `ROADMAP.md` — E-18 section rewritten with 
accurate delivered/gaps/deferred split -- `work/epics/epic-roadmap.md` — same updates -- `docs/architecture/time-machine-analysis-modes.md` — optimization mode added - -**Separate commit (not E-18 specific):** -- `README.md` — rewrite (commit separately from milestone work) - -**Proposed commit message for m-E18-12:** -``` -feat(sweep): m-E18-12 Multi-parameter Optimization — Nelder-Mead simplex - -- Optimizer (Nelder-Mead, N params): all branches covered — expand/contract/ - shrink, pre-loop convergence, expansion rejection, both contraction-fail - paths to shrink -- OptimizeSpec/OptimizeResult/OptimizeObjective/SearchRange types -- POST /v1/optimize; 503 when engine not enabled -- 29 unit tests (OptimizeSpec ×17, Optimizer ×12); 10 API tests -- docs/architecture/time-machine-analysis-modes.md updated -- e18-gap-analysis.md: thorough gap analysis against spec - -Co-Authored-By: Claude Sonnet 4.6 -``` - ---- - -## E-18 gap analysis summary - -**Status: in-progress.** Analysis complete at -`work/epics/E-18-headless-pipeline-and-optimization/e18-gap-analysis.md`. - -### Delivered (9 milestones) - -| ID | Title | -|----|-------| -| m-E18-01 | Parameterized Evaluation (Rust) — ParamTable, evaluate_with_params | -| m-E18-02 | Engine Session + Streaming Protocol (Rust) — persistent process, MessagePack | -| m-E18-06 | Tiered Validation — TimeMachineValidator, POST /v1/validate | -| m-E18-07 | Generator → TimeMachine rename; FlowTime.Generator deleted | -| m-E18-08 | ITelemetrySource + CanonicalBundleSource + FileCsvSource | -| m-E18-09 | Parameter Sweep — SweepSpec/SweepRunner, POST /v1/sweep | -| m-E18-10 | Sensitivity Analysis — SensitivityRunner, POST /v1/sensitivity | -| m-E18-11 | Goal Seeking — GoalSeeker (bisection), POST /v1/goal-seek *(not in original spec — added)* | -| m-E18-12 | Optimization — Optimizer (Nelder-Mead), POST /v1/optimize | - -### Remaining work - -**Buildable now (unblocked):** - -1. 
**SessionModelEvaluator** — Compile-once bridge using the m-E18-02 session protocol. - The `IModelEvaluator` seam already exists. `RustModelEvaluator` spawns - `flowtime-engine eval` once per evaluation point (compile + eval each time). - `SessionModelEvaluator` would compile once on first call, then send `eval` with - parameter overrides for each subsequent point — eliminating per-point compile overhead. - **Design:** First call sends `compile` to session → receives ParamTable (list of param IDs). - Subsequent calls use `ConstNodeReader` to read param values from the patched YAML, then - send `eval {paramId: value}` overrides. No changes to SweepRunner, SensitivityRunner, - GoalSeeker, or Optimizer. Registered in DI to replace RustModelEvaluator. - **User confirmed: "ok (1) seems very valuable and I want to do that."** - -2. **.NET Time Machine CLI** — Add validate/sweep/sensitivity/goal-seek/optimize commands - to `FlowTime.Cli`. `cat model.yaml | flowtime validate/sweep/...` surface. - Mechanical: call TimeMachineValidator/SweepRunner/etc. with JSON I/O. - **User said: "I guess we need to do (3) in this milestone."** - (Interpreted as a follow-on milestone m-E18-14 after m-E18-13 SessionModelEvaluator.) - -3. **Optimization constraints** — `ConstraintSpec` + penalty method inside Nelder-Mead loop. - Explicitly deferred from m-E18-12. No owner milestone yet. - Documented in gap analysis. Should be added to `work/gaps.md`. - -**Blocked on prerequisites:** - -4. **Model fitting** — `FitSpec`/`FitRunner`/`POST /v1/fit` composing ITelemetrySource + Optimizer - to minimize residual against observed telemetry. Infrastructure exists; endpoint not assembled. - Hard prerequisite: Telemetry Loop & Parity epic (not started). - **Confirmed belongs to E-18** — the computation is E-18's analysis mode; E-15 owns the data - ingestion and Telemetry Loop & Parity owns the validation harness. 
- -**Explicitly deferred:** - -- Chunked evaluation — needs stateful chunk-step session command in Rust engine -- Monte Carlo — sampling layer on top of IModelEvaluator; not started -- FlowTime.Pipeline SDK project — after fitting stabilizes -- FlowTime.Telemetry.* adapters (Prometheus, OTEL, BPI) — direct-source `ITelemetrySource` implementations that bypass E-15's Gold Builder pipeline for specific live sources; not part of E-15 scope - ---- - -## Key architectural insight (session-based evaluator) - -The m-E18-02 session protocol gives compile-once/eval-many performance, but the .NET -analysis layer doesn't use it yet. `RustModelEvaluator` calls `flowtime-engine eval` -(stateless subprocess) which compiles the model on every evaluation point. For large -sweeps (100+ points) this is ~100–500ms of compile overhead per point. - -The `IModelEvaluator` interface is the injection seam: - -```csharp -public interface IModelEvaluator -{ - Task> EvaluateAsync( - string modelYaml, CancellationToken cancellationToken = default); -} -``` - -The interface receives already-patched YAML (after `ConstNodePatcher` applies overrides). -`SessionModelEvaluator` extracts param values from the patched YAML using `ConstNodeReader`, -then sends them as `eval` overrides to the persistent session. No interface change needed. - -Session lifetime: one session per `SessionModelEvaluator` instance. The evaluator implements -`IAsyncDisposable` to clean up the session process. DI registration should be scoped or -transient depending on whether sessions should be shared across requests. - ---- - -## What's next (planned order) - -1. **Commit pending m-E18-12 work** (feat commit above) -2. **Commit README.md separately** (separate concern) -3. **m-E18-13: SessionModelEvaluator** — compile-once bridge -4. **m-E18-14: .NET Time Machine CLI commands** -5. **work/gaps.md entry for optimization constraints** -6. 
**Late E-18: Model fitting** — after Telemetry Loop & Parity + E-15 M1 - ---- - -## New docs created (aspirational) - -**`docs/notes/ui-optimization-explorer-vision.md`** — Documents the aspired optimization -UI that would complement E-17's manual what-if: -- Sweep view: chart metric vs. parameter across range -- Sensitivity panel: ranked bar chart of ∂metric/∂param at current baseline -- Goal-seek widget: enter target, pick lever, bisection returns suggested value + "apply" -- Optimization panel: multi-param search, simplex trajectory, before/after topology heatmap - -Key distinction: what-if = human optimizer (you turn the dials); optimization UI = machine -optimizer (you set the goal, the engine finds the answer). - -**`docs/notes/model-discovery-path.md`** — Documents the three-stage model discovery path: -1. Gold Builder (E-15, not started): raw data → canonical telemetry bundles -2. Graph Builder (E-15, not started): topology inference + human curation -3. Parameter fitting (E-18 + Telemetry Loop & Parity): minimize residual - -Also documents the relationship to process mining: process mining tools (ProM, Disco, -Celonis) work with event logs and produce process graphs + aggregate statistics. FlowTime -consumes those aggregate statistics as input. They answer different questions. - ---- - -## Hard rule established this session - -**100% branch coverage before claiming milestone complete.** - -Every algorithm path — especially failure-mode paths (failed contraction, shrink, early -convergence) — must be explicitly traced to a test case. "It converges in the happy path" -is not sufficient. This applies to all logic, API, and data code in TDD workflow. - ---- - -## Current branch - -`epic/E-18-time-machine` - -All work is on this branch. Nothing merged to main yet for E-18. -E-17, E-20, E-10, E-16, E-19 are all merged to main and archived. 
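The branch-coverage table earlier in this status file names the Nelder-Mead paths: pre-loop convergence, reflection, expansion (accepted or rejected), outside and inside contraction, and shrink. As a reading aid, the following is a textbook sketch of those steps in Rust. It is not the repository's `Optimizer`: the function name, the signature, the coefficients (1, 2, 0.5, 0.5), the initial-simplex construction, and the convergence test on the spread of function values are all assumptions made for illustration.

```rust
/// Textbook Nelder-Mead minimizer (hypothetical; not the FlowTime `Optimizer`).
/// Returns (best point, best value, iterations used).
pub fn nelder_mead<F: Fn(&[f64]) -> f64>(
    f: F,
    start: &[f64],
    step: f64,
    tolerance: f64,
    max_iterations: usize,
) -> (Vec<f64>, f64, usize) {
    let n = start.len();
    assert!(n > 0, "at least one parameter is required");

    // Initial simplex: the start point plus one vertex offset along each axis.
    let mut simplex: Vec<(Vec<f64>, f64)> = Vec::with_capacity(n + 1);
    simplex.push((start.to_vec(), f(start)));
    for i in 0..n {
        let mut p = start.to_vec();
        p[i] += step;
        let v = f(&p);
        simplex.push((p, v));
    }

    let mut iterations = 0;
    loop {
        // Order vertices best (index 0) to worst (index n).
        simplex.sort_by(|a, b| a.1.total_cmp(&b.1));
        // The convergence check runs before any step, so iterations can be 0.
        let spread = simplex[n].1 - simplex[0].1;
        if spread <= tolerance || iterations >= max_iterations {
            break;
        }
        iterations += 1;

        // Centroid of every vertex except the worst.
        let mut centroid = vec![0.0; n];
        for (p, _) in &simplex[..n] {
            for j in 0..n {
                centroid[j] += p[j] / n as f64;
            }
        }
        let worst = simplex[n].clone();
        // Point on the line through the centroid and the worst vertex:
        // t = -1 reflects, t = -2 expands, t = -0.5 / 0.5 contracts.
        let along = |t: f64| -> Vec<f64> {
            (0..n)
                .map(|j| centroid[j] + t * (worst.0[j] - centroid[j]))
                .collect()
        };

        let reflected = along(-1.0);
        let fr = f(&reflected);
        if fr < simplex[0].1 {
            // Reflection beat the best vertex: try expanding further.
            let expanded = along(-2.0);
            let fe = f(&expanded);
            simplex[n] = if fe < fr { (expanded, fe) } else { (reflected, fr) };
        } else if fr < simplex[n - 1].1 {
            // Reflection is better than the second-worst vertex: accept it.
            simplex[n] = (reflected, fr);
        } else {
            // Outside contraction if the reflection improved on the worst,
            // inside contraction otherwise.
            let t = if fr < worst.1 { -0.5 } else { 0.5 };
            let contracted = along(t);
            let fc = f(&contracted);
            if fc < fr.min(worst.1) {
                simplex[n] = (contracted, fc);
            } else {
                // Contraction failed: shrink every vertex toward the best one.
                let best = simplex[0].0.clone();
                for k in 1..=n {
                    let p: Vec<f64> = (0..n)
                        .map(|j| best[j] + 0.5 * (simplex[k].0[j] - best[j]))
                        .collect();
                    let v = f(&p);
                    simplex[k] = (p, v);
                }
            }
        }
    }
    let (point, value) = simplex.swap_remove(0);
    (point, value, iterations)
}
```

To maximize a metric with a routine like this, pass the negated metric, which matches the convention this status file records for `OptimizeObjective`. Clamping candidates to a search range is left out of the sketch.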
diff --git a/docs/notes/ffi-vs-subprocess-engine-boundary.md b/docs/notes/ffi-vs-subprocess-engine-boundary.md new file mode 100644 index 00000000..b0e15058 --- /dev/null +++ b/docs/notes/ffi-vs-subprocess-engine-boundary.md @@ -0,0 +1,45 @@ +# FFI vs subprocess: the engine boundary + +> **Status:** exploration. Not scheduled. Captured here to keep the option open and named when triggers fire. + +The current `IModelEvaluator` seam has two implementations — `RustModelEvaluator` (fresh subprocess per eval) and `SessionModelEvaluator` (persistent `flowtime-engine session` subprocess via MessagePack over stdio). Both share one property: the engine is a separate OS process, with IPC on every call. + +For Studio, this boundary gets expensive: + +- **Per-session memory multiplies.** “One Rust process per session” with ~50 concurrent sessions = 50 process address spaces. The alternative — a pooled subprocess with session affinity — is real complexity that does not serve users. +- **Session IR wants direct access to engine state.** Node identity, per-node cache, and generation counters live in Rust (per M-046). The .NET session service projects patches onto that state via IPC round-trips. A shared-memory boundary would let the service hand the evaluator `&mut Graph` or an `Arc` directly. +- **Cold start matters for short-lived embedders.** `FlowTime.Pipeline` SDK callers pay subprocess startup per invocation unless they keep a session open; an in-process library pays zero. +- **Container profile.** On cloud-agnostic hosting (Hetzner, plain Docker), collapsing engine + service into one process shrinks image size and PID footprint. + +## Alternative shape + +Rust engine compiled as a **cdylib** (or `staticlib` for .NET AOT scenarios), loaded into the hosting process via FFI: + +- .NET host → P/Invoke or a managed wrapper around a stable C ABI. 
+- Rust host → same crate linked directly (no FFI needed if the service is also Rust — see [`session-service-language-choice.md`](session-service-language-choice.md)). +- Python/Node/Go embedders → same C ABI via their native-interop story. + +Keeps the language-boundary benefit (engine correctness isolated from service churn) without the process boundary. + +## When to revisit + +Worth it if any of these bite: + +- Per-session memory at `N=50` concurrent sessions exceeds budget. +- Cold-start latency on short-lived pipeline invocations shows up in traces. +- The subprocess pool-with-affinity logic grows into a distinct subsystem rather than a small helper. + +Not worth it if: + +- Sessions are few and long-lived (the subprocess boundary amortizes). +- The managed wrapper surface area (marshalling complex types, lifetime of native handles, allocator mismatches) exceeds the subprocess complexity it replaces. + +## Watchpoints for current planning + +- Keep `IModelEvaluator` as the seam; do not let subprocess assumptions leak past it. +- Before committing to “one Rust process per session” in any future session-service milestone, prototype an FFI-based evaluator alongside the subprocess one and measure memory + cold start at realistic `N`. +- A language reconsideration for the service layer (see [`session-service-language-choice.md`](session-service-language-choice.md)) interacts with this decision: if the service ends up in Rust, FFI collapses to a direct link and the IPC discussion ends. + +## Provenance + +Originated as gap G-030 (2026-04-26); content moved to this note 2026-05-02 because it reads as design exploration rather than deferred engineering work. The gap was closed `wontfix` with a pointer here. 
diff --git a/docs/notes/session-service-language-choice.md b/docs/notes/session-service-language-choice.md new file mode 100644 index 00000000..2db1a72b --- /dev/null +++ b/docs/notes/session-service-language-choice.md @@ -0,0 +1,30 @@ +# Session service language choice — Rust vs .NET + +> **Status:** open question, not scheduled. Worth deciding before any session-service milestone scope lands so the implementation substrate is settled. + +The current stack is Rust engine + .NET session layer. The .NET choice is historical — it predates the Rust engine and predates Studio's scope expansion. Under cloud-agnostic deployment (Docker on Hetzner or similar) and with the Pipeline SDK's “embeddable into .NET only” framing challenged, the first-principles calculus shifts. + +## First-principles arguments + +- **Pipeline SDK embedding reach.** A Rust crate + C ABI reaches .NET (P/Invoke), Python (ctypes / PyO3), Node (NAPI), Go (cgo), and any other runtime. A .NET assembly reaches .NET only. Broader embedder reach is strictly better for “FlowTime as a callable function” positioning. +- **Container profile on cloud-agnostic hosting.** Rust: ~10 MB static binary, sub-second cold start. .NET AOT-trimmed: ~80 MB, multi-second cold start. On managed Azure the gap is absorbed; on plain Docker hosting it is visible. +- **Collapsing the engine/service divide.** If the service is Rust, the “engine as subprocess” discussion ends — the session service owns the engine state directly via a linked crate. (See [`ffi-vs-subprocess-engine-boundary.md`](ffi-vs-subprocess-engine-boundary.md).) +- **Studio's .NET scope is mostly net-new.** Session IR, patch vocabulary, WS multiplex, snapshot/resume do not exist today. The rewrite cost is “build Studio in Rust from scratch” rather than “port a mature .NET layer to Rust.” The existing analysis runners (SweepRunner, Nelder-Mead, etc., ~1,600 LOC) are near-pure math and port cleanly. + +## Resolution path + +Prototype spike: + +1. 
`axum` or `actix-web` service exposing the current `/v1/sweep` shape, wrapping the Rust engine as a linked crate (no subprocess). +2. Measure: cold start, steady-state memory at N=50 sessions, end-to-end sweep latency vs .NET + subprocess baseline, container image size. +3. Re-evaluate with data. If Rust wins decisively on memory + deployment profile and the async-Rust complexity for WS fan-out is tractable, the natural landing zone is whatever future Studio-style epic owns the session service — that is mostly new scope anyway. + +## Watchpoints for current planning + +- Do not treat “.NET session service” as settled in any current architecture doc. Mark it as an assumption pending spike. +- Keep the `IModelEvaluator` seam language-neutral in spirit — it is currently .NET-flavored but the pattern (compile-once-eval-many with parameter patches) ports directly. +- Do not expand the .NET layer with anything that would be load-bearing to keep on .NET specifically. Structural patch handling, session IR, and WS multiplex should be written as if any language could host them. + +## Provenance + +Originated as gap G-031 (2026-04-26); content moved to this note 2026-05-02 because it reads as design exploration rather than deferred engineering work. The gap was closed `wontfix` with a pointer here. diff --git a/src/FlowTime.TimeMachine/Orchestration/RunOrchestrationService.cs b/src/FlowTime.TimeMachine/Orchestration/RunOrchestrationService.cs index 484a7699..9848c7d3 100644 --- a/src/FlowTime.TimeMachine/Orchestration/RunOrchestrationService.cs +++ b/src/FlowTime.TimeMachine/Orchestration/RunOrchestrationService.cs @@ -132,7 +132,10 @@ public async Task CreateRunAsync(RunOrchestrationReques ? effectiveRequest.RunId : (effectiveRequest.DeterministicRunId ? inputContext.DeterministicRunId : null); - if (!string.IsNullOrWhiteSpace(targetRunId)) + // G-034: dryRun is authoritative. 
Skip the reuse short-circuit for + // dry-runs so a pre-existing deterministic run never silently demotes + // a dryRun:true request to a "reused real run" response with no plan. + if (!effectiveRequest.DryRun && !string.IsNullOrWhiteSpace(targetRunId)) { var reuseOutcome = await TryReuseExistingRunAsync(targetRunId!, outputRoot, effectiveRequest.OverwriteExisting, cancellationToken).ConfigureAwait(false); if (reuseOutcome is not null) diff --git a/tests/FlowTime.TimeMachine.Tests/RunOrchestrationServiceTests.cs b/tests/FlowTime.TimeMachine.Tests/RunOrchestrationServiceTests.cs index 2a8c21b1..91024724 100644 --- a/tests/FlowTime.TimeMachine.Tests/RunOrchestrationServiceTests.cs +++ b/tests/FlowTime.TimeMachine.Tests/RunOrchestrationServiceTests.cs @@ -376,6 +376,48 @@ public async Task CreateRunAsync_DeterministicSimulation_ReusesExistingBundle() Assert.True(second.WasReused); } + // Regression for G-034 — Sim orchestration silently demoted dryRun:true + // to a real run whenever a deterministic run with the same id already + // existed on disk. The reuse short-circuit must not run for dry-runs; + // dryRun is authoritative and must always produce a plan. 
+ [Fact] + public async Task CreateRunAsync_DryRun_DoesNotReuseExistingDeterministicRun() + { + using var temp = new TempDirectory(); + var templatesDir = Path.Combine(temp.Path, "templates"); + Directory.CreateDirectory(templatesDir); + await File.WriteAllTextAsync(Path.Combine(templatesDir, "sim-order.yaml"), simulationTemplate); + + var templateService = new TemplateService(templatesDir, NullLogger.Instance); + var bundleBuilder = new TelemetryBundleBuilder(); + var orchestration = new RunOrchestrationService(templateService, bundleBuilder, NullLogger.Instance); + + var realRequest = new RunOrchestrationRequest + { + TemplateId = "sim-order", + Mode = "simulation", + Parameters = new Dictionary(), + OutputRoot = Path.Combine(temp.Path, "runs"), + DeterministicRunId = true, + DryRun = false + }; + + // First land a real run so the deterministic run id exists on disk. + var realOutcome = await orchestration.CreateRunAsync(realRequest); + Assert.False(realOutcome.IsDryRun); + Assert.NotNull(realOutcome.Result); + + // Now request a dry-run with the same inputs. dryRun must be honored + // even though a real run with the same deterministic id already + // exists; the response must carry a plan, not a reused result. + var dryRunRequest = realRequest with { DryRun = true }; + var dryOutcome = await orchestration.CreateRunAsync(dryRunRequest); + + Assert.True(dryOutcome.IsDryRun); + Assert.NotNull(dryOutcome.Plan); + Assert.Null(dryOutcome.Result); + } + [Fact] public async Task CreateRunAsync_DeterministicSimulation_OverwriteRegeneratesBundle() { diff --git a/tests/ui/specs/svelte-run-orchestration.spec.ts b/tests/ui/specs/svelte-run-orchestration.spec.ts new file mode 100644 index 00000000..d2025192 --- /dev/null +++ b/tests/ui/specs/svelte-run-orchestration.spec.ts @@ -0,0 +1,191 @@ +import { test, expect } from '@playwright/test'; + +// M-062 verification spec — exercises the /run page against live Sim API. 
+// Graceful skip when Sim API (8090) or Svelte dev server (5173) are unavailable. +// +// AC-1: cards in responsive grid with title, version badge, domain icon +// AC-2: search filter +// AC-3: select card → config panel with reuse mode + RNG seed + Advanced section +// AC-4: execute run → success result +// AC-5: preview/dry-run → plan +// AC-6: loading + empty states render +// AC-7: no raw JSON param field by default + +const SVELTE_URL = 'http://localhost:5173'; +const SIM_URL = 'http://localhost:8090'; + +async function infraUp(): Promise<{ sim: boolean; svelte: boolean }> { + const probe = async (url: string) => { + try { + const res = await fetch(url, { signal: AbortSignal.timeout(1500) }); + return res.ok; + } catch { + return false; + } + }; + const [sim, svelte] = await Promise.all([ + probe(`${SIM_URL}/healthz`), + probe(`${SVELTE_URL}/`) + ]); + return { sim, svelte }; +} + +test.describe('M-062 Run Orchestration (/run)', () => { + test.beforeEach(async ({}, testInfo) => { + const infra = await infraUp(); + if (!infra.sim || !infra.svelte) { + testInfo.skip(); + } + }); + + test('AC-1: templates render as cards in a grid with title + version badge + icon', async ({ + page + }) => { + await page.goto(`${SVELTE_URL}/run`); + // Wait for at least one card. + + +{/if} +``` + +The `bins` value comes from the `CompileResult.bins` field — store it as `let bins = $state(0)` and populate in `compileModel`. + +### Derived store updates + +```typescript +const metricMap = $derived.by(() => { + if (!engineGraph) return new Map(); + const bin = selectedBin !== null ? selectedBin : undefined; + return normalizeMetricMap(buildMetricMap(engineGraph, series, bin)); +}); + +const edgeMetricMap = $derived.by(() => { + if (!engineGraph) return new Map(); + const bin = selectedBin !== null ? selectedBin : undefined; + return normalizeMetricMap(buildEdgeMetricMap(engineGraph, series, bin)); +}); +``` + +No layout change — only metrics change. 
The `$effect` that applies `.has-warning` classes already runs after metric/graph updates, so warning badges remain correct at all scrubber positions. + +## Out of Scope + +- Play/pause animation (auto-advancing the scrubber through bins) — future polish. +- Highlighting the selected bin in the chart axes/ticks. +- Per-bin warning filtering (show warnings only in the selected bin). +- Synchronized scrubber across multiple chart panels. +- Keyboard navigation of the scrubber (native range input handles this). + +## Success Indicator + +Load `queue-with-wip` model. Drag the bin scrubber from bin 0 to bin 3. The topology nodes and edges shift colors as the queue builds up across time bins. All charts show a vertical crosshair at bin 3. Click "Mean" — crosshairs vanish, heatmap reverts to mean colors. The whole interaction is seamless: no eval, no network, just derived state recomputing. + +## Key References + +- `ui/src/lib/api/topology-metrics.ts` — `buildMetricMap`, `buildEdgeMetricMap` (to extend) +- `ui/src/lib/components/chart-geometry.ts` — add `crosshairX` +- `ui/src/lib/components/chart.svelte` — add `crosshairBin` prop +- `ui/src/routes/what-if/+page.svelte` — add `selectedBin` state, scrubber UI, wire `crosshairBin` +- `work/epics/E-17-interactive-what-if-mode/m-E17-05-edge-heatmap.md` — prior milestone diff --git a/work/epics/completed/E-17-interactive-what-if-mode/spec.md b/work/epics/E-17-interactive-what-if-mode/epic.md similarity index 77% rename from work/epics/completed/E-17-interactive-what-if-mode/spec.md rename to work/epics/E-17-interactive-what-if-mode/epic.md index a6244c0a..16e70b5a 100644 --- a/work/epics/completed/E-17-interactive-what-if-mode/spec.md +++ b/work/epics/E-17-interactive-what-if-mode/epic.md @@ -1,7 +1,9 @@ -# Epic: Interactive What-If Mode +--- +id: E-17 +title: Interactive What-If Mode +status: done +--- -**ID:** E-17 -**Status:** complete **Completed:** 2026-04-12 **Branch merged:** `epic/E-17-interactive-what-if-mode` → 
`main` @@ -55,12 +57,12 @@ The circuit simulator analogy: SPICE compiles a netlist once, then allows parame | ID | Title | Status | Summary | |----|-------|--------|---------| -| m-E17-01 | WebSocket Engine Bridge | complete | .NET WebSocket proxy over persistent Rust `flowtime-engine session` subprocess; MessagePack compile/eval/get_series round-trip | -| m-E17-02 | Svelte Parameter Panel | complete | SvelteKit `/what-if` page with live-bound sliders, example model picker, series mini-charts, latency badge | -| m-E17-03 | Live Topology + Charts | complete | Dag-map topology graph with heatmap, per-series charts with hover tooltips, layout stability across tweaks | -| m-E17-04 | Warnings Surface | complete | Engine warnings flow through session protocol into banner, details panel, and topology node badges; capacity-constrained example model drives the demo loop | -| m-E17-05 | Edge Heatmap | complete | Color topology edges by their throughput series mean; wires up the already-present `edgeMetrics` prop in dag-map-view | -| m-E17-06 | Time Scrubber | complete | Bin-position slider switches heatmap (nodes + edges) from mean to per-bin value; vertical crosshair on all charts | +| M-018 | WebSocket Engine Bridge | complete | .NET WebSocket proxy over persistent Rust `flowtime-engine session` subprocess; MessagePack compile/eval/get_series round-trip | +| M-019 | Svelte Parameter Panel | complete | SvelteKit `/what-if` page with live-bound sliders, example model picker, series mini-charts, latency badge | +| M-020 | Live Topology + Charts | complete | Dag-map topology graph with heatmap, per-series charts with hover tooltips, layout stability across tweaks | +| M-021 | Warnings Surface | complete | Engine warnings flow through session protocol into banner, details panel, and topology node badges; capacity-constrained example model drives the demo loop | +| M-022 | Edge Heatmap | complete | Color topology edges by their throughput series mean; wires up the 
already-present `edgeMetrics` prop in dag-map-view | +| M-023 | Time Scrubber | complete | Bin-position slider switches heatmap (nodes + edges) from mean to per-bin value; vertical crosshair on all charts | **Final polish (post-m-E17-06):** Advanced demo models (SaaS API platform, e-commerce order pipeline with chained throughput); edge color semantic fixed to destination node load; zero-anchored heatmap normalization; sidebar model picker; warnings as non-shifting overlay with bezier connectors and pulsing animation. 200 vitest. diff --git a/work/epics/E-18-headless-pipeline-and-optimization/M-001-parameterized-evaluation.md b/work/epics/E-18-headless-pipeline-and-optimization/M-001-parameterized-evaluation.md new file mode 100644 index 00000000..d293be85 --- /dev/null +++ b/work/epics/E-18-headless-pipeline-and-optimization/M-001-parameterized-evaluation.md @@ -0,0 +1,140 @@ +--- +id: M-001 +title: Parameterized Evaluation +status: done +parent: E-18 +acs: + - id: AC-1 + title: 'AC-1: ParamTable struct' + status: met + - id: AC-2 + title: 'AC-2: Compiler populates ParamTable' + status: met + - id: AC-3 + title: 'AC-3: evaluate_with_params function' + status: met + - id: AC-4 + title: 'AC-4: Equivalence' + status: met + - id: AC-5 + title: 'AC-5: Full post-eval pipeline' + status: met + - id: AC-6 + title: 'AC-6: Parameter override affects downstream' + status: met + - id: AC-7 + title: 'AC-7: Class arrival rate override' + status: met + - id: AC-8 + title: 'AC-8: WIP limit override' + status: met + - id: AC-9 + title: 'AC-9: Parameter schema extraction' + status: met + - id: AC-10 + title: 'AC-10: Compile-once, eval-many pattern' + status: met +--- + +## Goal + +The Rust engine can compile a model once and re-evaluate it many times with different parameter values without recompiling. This is the critical primitive that every downstream use case builds on — interactive what-if, parameter sweeps, optimization, sensitivity analysis. 
The Plan becomes a reusable program; parameters are its inputs. + +## Context + +The current `compile(model) → Plan` bakes all constants into `Op::Const { out, values }` at compile time. To change an arrival rate from 10 to 15, you must recompile the entire model. Compilation is O(nodes) with topological sorting, expression parsing, and constraint resolution — unnecessary work when only a scalar value changed. + +After this milestone, the Plan carries a `ParamTable` that lists every user-visible constant. `evaluate_with_params(plan, overrides)` writes overrides into the state matrix before the eval loop, then runs the same bin-major evaluation. The Plan is immutable and shareable; only the parameter values change. + +### Where constants come from in the compiler + +The compiler creates `Op::Const` from seven sources: + +| Source | Example | Parameter? | +|--------|---------|-----------| +| `kind: const` node values | `values: [10, 20, 30]` | Yes — primary user input | +| Traffic arrival `ratePerBin` | `ratePerBin: 20` | Yes — class arrival rate | +| PMF expected value | `pmf: { values, probabilities }` | Yes — derived from PMF definition | +| WIP limit scalar | `wipLimit: 50` | Yes — topology constraint | +| Queue initial condition | `initialCondition: { queueDepth: 5 }` | Yes — initial state | +| Expression literal | `8` in `MIN(arrivals, 8)` | Yes — inline constant in formula | +| Compiler-generated temps | Internal proportional alloc, router weight columns | No — derived, not user-visible | + +The distinction: a parameter is a constant that traces back to a user-authored value in the model YAML. Compiler-generated intermediate constants (temp columns, normalized weights) are NOT parameters. + +## Acceptance criteria + +### AC-1 — AC-1: ParamTable struct + +**AC-1: ParamTable struct.** `Plan` gains a `params: ParamTable` field. 
`ParamTable` contains a `Vec` where each entry has: +- `id: String` — stable identifier matching the model YAML source (e.g., `"arrivals"` for a const node, `"arrivals.Order"` for a traffic class rate, `"Queue.wipLimit"` for a topology WIP limit) +- `column: usize` — the column index in the state matrix this parameter fills +- `default: ParamValue` — original value from the model (`Scalar(f64)` for uniform, `Vector(Vec<f64>)` for per-bin) +- `kind: ParamKind` — `ConstNode`, `ArrivalRate`, `WipLimit`, `InitialCondition`, `ExprLiteral` +### AC-2 — AC-2: Compiler populates ParamTable + +**AC-2: Compiler populates ParamTable.** The compiler registers parameters for: +- Every `kind: const` node (id = node id, value from `values` field) +- Every `traffic.arrivals` entry with `ratePerBin` (id = `"{nodeId}.{classId}"`) +- Every topology node with scalar `wipLimit` (id = `"{topoNodeId}.wipLimit"`) +- Every topology node with `initialCondition.queueDepth` (id = `"{topoNodeId}.init"`) +- Expression literals are NOT parameters (they're inline formula constants, not model inputs) +### AC-3 — AC-3: evaluate_with_params function + +**AC-3: `evaluate_with_params` function.** New public function: +```rust +pub fn evaluate_with_params(plan: &Plan, overrides: &[(String, ParamValue)]) -> Vec +``` +- Applies overrides to matching param IDs before the eval loop +- `Scalar(v)` fills all bins with `v`; `Vector(vs)` writes per-bin values +- Unmatched override IDs are ignored (forward-compatible) +- Unknown param IDs do not cause errors +- Returns the filled state matrix (same shape as `evaluate`) +### AC-4 — AC-4: Equivalence + +**AC-4: Equivalence.** `evaluate_with_params(plan, &[])` (no overrides) produces identical results to `evaluate(plan)`. A Rust test asserts bitwise equality. +### AC-5 — AC-5: Full post-eval pipeline + +**AC-5: Full post-eval pipeline.** `eval_model` is refactored to accept optional overrides.
When overrides are provided, it calls `evaluate_with_params` instead of `evaluate`, then runs the same post-eval pipeline: class decomposition normalization, proportional allocation propagation, edge series computation, analysis warnings. A new public entry point: +```rust +pub fn eval_model_with_params( +model: &ModelDefinition, +overrides: &[(String, ParamValue)] +) -> Result +``` +### AC-6 — AC-6: Parameter override affects downstream + +**AC-6: Parameter override affects downstream.** Overriding a const node's value propagates through all downstream expressions, queue recurrences, per-class decomposition, and edge series. Test: override `arrivals` from 10 to 20 → verify `served`, `queue_depth`, per-class series, and edge flow all change correctly. +### AC-7 — AC-7: Class arrival rate override + +**AC-7: Class arrival rate override.** Overriding a class arrival rate (e.g., `"arrivals.Order"` from 6 to 12) changes the class fraction and propagates through normalization and downstream decomposition. Test: change one class rate, verify normalization invariant still holds. +### AC-8 — AC-8: WIP limit override + +**AC-8: WIP limit override.** Overriding `"{topoNodeId}.wipLimit"` changes the queue's WIP limit and affects overflow. Test: lower WIP limit → verify overflow increases. +### AC-9 — AC-9: Parameter schema extraction + +**AC-9: Parameter schema extraction.** New public function: +```rust +pub fn extract_params(plan: &Plan) -> &ParamTable +``` +Returns the plan's parameter table. Clients use this to discover what can be tweaked, with IDs, kinds, and defaults. This is what the UI will use to auto-generate controls. +### AC-10 — AC-10: Compile-once, eval-many pattern + +**AC-10: Compile-once, eval-many pattern.** Demonstrate the pattern with a Rust test that compiles once, evaluates 10 times with different arrival rates, and verifies each result is independent (no state leakage between evaluations). 
Measure that subsequent evals are faster than the first (no recompilation). +## Out of Scope + +- Session management or persistent process (M-002) +- Streaming protocol or MessagePack framing (M-002) +- CLI interface changes (M-002) +- UI parameter controls (M-019) +- Parameter bounds, display names, or template metadata enrichment (future — the parameter table carries IDs and defaults only) +- Expression literal parameterization (inline `8` in `MIN(arrivals, 8)` stays baked — parameterizing expression constants requires expression-tree rewriting, which is a different problem) +- Structural model changes (adding/removing nodes requires recompilation — by design) + +## Key References + +- `engine/core/src/plan.rs` — Plan struct, Op enum, ColumnMap +- `engine/core/src/eval.rs` — `evaluate()` function, bin-major loop +- `engine/core/src/compiler.rs` — `compile()`, `eval_model()`, all `Op::Const` emission sites +- `docs/architecture/headless-engine-architecture.md` — overall architecture +- `work/epics/E-18-headless-pipeline-and-optimization/milestone-plan-v2.md` — milestone sequence diff --git a/work/epics/E-18-headless-pipeline-and-optimization/M-002-engine-session-streaming-protocol.md b/work/epics/E-18-headless-pipeline-and-optimization/M-002-engine-session-streaming-protocol.md new file mode 100644 index 00000000..cefd923a --- /dev/null +++ b/work/epics/E-18-headless-pipeline-and-optimization/M-002-engine-session-streaming-protocol.md @@ -0,0 +1,164 @@ +--- +id: M-002 +title: Engine Session + Streaming Protocol +status: done +parent: E-18 +depends_on: + - M-001 +acs: + - id: AC-1 + title: 'AC-1: session CLI command' + status: met + - id: AC-2 + title: 'AC-2: Length-prefixed MessagePack framing' + status: met + - id: AC-3 + title: 'AC-3: compile command' + status: met + - id: AC-4 + title: 'AC-4: eval command' + status: met + - id: AC-5 + title: 'AC-5: get_params command' + status: met + - id: AC-6 + title: 'AC-6: get_series command' + status: met + - id: AC-7 + 
title: 'AC-7: Error handling' + status: met + - id: AC-8 + title: 'AC-8: Session state' + status: met + - id: AC-9 + title: 'AC-9: Performance' + status: met + - id: AC-10 + title: 'AC-10: Integration test' + status: met +--- + +## Goal + +The Rust engine runs as a persistent process that accepts commands and streams results. `flowtime-engine session` reads length-prefixed MessagePack messages from stdin, holds a compiled Plan in memory, and writes responses to stdout. This is the headless pipeline component — the same protocol works over stdin/stdout (CLI pipes) and WebSocket (UI, via M-018 proxy). + +## Context + +After M-001, the engine can compile once and evaluate many times with different parameters via `evaluate_with_params(plan, overrides)`. But every invocation is still a batch subprocess: spawn → parse YAML → compile → evaluate → write files → exit. The overhead of process spawn + file I/O dominates latency (100-500ms). For interactive use, we need a persistent process that holds the compiled Plan and responds to parameter changes in microseconds. + +The session is a stateful loop: + +``` +stdin → [compile] → hold Plan → [eval overrides] → stdout + → [eval overrides] → stdout + → [eval overrides] → stdout + → EOF → exit +``` + +### Why MessagePack + +- **Binary f64 arrays.** A 1,000-bin series is 8KB as binary vs ~8KB+ as JSON text (with formatting overhead and parse cost). MessagePack encodes `Vec` as a binary ext type — zero parsing, memcpy-fast. +- **Length-prefixed framing.** 4-byte big-endian length prefix before each message. No newline ambiguity, no incomplete-line bugs. +- **Cross-language.** Native libraries: Rust (`rmp-serde`), JavaScript (`@msgpack/msgpack`), C# (`MessagePack-CSharp`), Python (`msgpack`). +- **Pipe-friendly.** Works over stdin/stdout for CLI composition, over WebSocket for UI. 
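The length-prefixed framing described above can be sketched with the standard library alone. This is a minimal illustration, not the engine's actual `protocol.rs`: the function names `write_frame` and `read_frame` are hypothetical, and the payload is treated as opaque bytes (in the real protocol it would presumably be a MessagePack document produced by a library such as `rmp-serde`).

```rust
use std::io::{self, Read, Write};

/// Write one frame: a 4-byte big-endian length prefix, then the payload bytes.
/// (Hypothetical helper; the name is not taken from the engine source.)
fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "payload does not fit in a 4-byte length prefix",
        ));
    }
    let len = payload.len() as u32;
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)?;
    w.flush()
}

/// Read one frame. Returns Ok(None) on a clean EOF at a frame boundary,
/// and an UnexpectedEof error if the stream ends inside a frame.
fn read_frame<R: Read>(r: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut len_buf = [0u8; 4];
    let mut filled = 0;
    while filled < len_buf.len() {
        let n = match r.read(&mut len_buf[filled..]) {
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        if n == 0 {
            if filled == 0 {
                return Ok(None); // EOF before any prefix byte: session ended cleanly
            }
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "stream ended inside the length prefix",
            ));
        }
        filled += n;
    }
    let len = u32::from_be_bytes(len_buf) as usize;
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok(Some(payload))
}
```

Distinguishing a clean EOF from a truncated frame is one way a reader could support the "exit cleanly on stdin EOF" behavior. A production reader would likely also cap the accepted length before allocating.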
+ +## Acceptance criteria + +### AC-1 — AC-1: session CLI command + +**AC-1: `session` CLI command.** `flowtime-engine session` enters a persistent loop reading from stdin and writing to stdout. No file arguments required. Exits cleanly on stdin EOF or SIGTERM. +### AC-2 — AC-2: Length-prefixed MessagePack framing + +**AC-2: Length-prefixed MessagePack framing.** Each message is `[4-byte big-endian length][MessagePack payload]`. Both requests (stdin) and responses (stdout) use this framing. Stderr is reserved for human-readable log messages (not protocol). +### AC-3 — AC-3: compile command + +**AC-3: `compile` command.** Request: `{ method: "compile", params: { yaml: "" } }`. Response: `{ result: { params: [{ id, kind, default }], series: [{ id, bins, values }], bins, grid } }`. Compiles the model, holds the Plan in session state, evaluates with defaults, returns the parameter schema and initial series. +### AC-4 — AC-4: eval command + +**AC-4: `eval` command.** Request: `{ method: "eval", params: { overrides: { "arrivals": 15.0, "Queue.wipLimit": 30.0 } } }`. Response: `{ result: { series: { "arrivals": , "served": , ... }, elapsed_us } }`. Re-evaluates with overrides, returns updated series. Must not recompile. Series values are MessagePack binary arrays (not JSON text arrays). +### AC-5 — AC-5: get_params command + +**AC-5: `get_params` command.** Request: `{ method: "get_params" }`. Response: `{ result: { params: [{ id, kind, default }] } }`. Returns the current parameter table from the compiled Plan. +### AC-6 — AC-6: get_series command + +**AC-6: `get_series` command.** Request: `{ method: "get_series", params: { names: ["arrivals", "served"] } }`. Response: `{ result: { series: { "arrivals": , "served": } } }`. Returns specific series from the current evaluation state. If no names provided, returns all non-internal series. +### AC-7 — AC-7: Error handling + +**AC-7: Error handling.** Invalid requests return `{ error: { code, message } }`. 
Specific errors: `not_compiled` (eval before compile), `compile_error` (bad YAML), `unknown_method`. The session continues after errors — it does not exit. +### AC-8 — AC-8: Session state + +**AC-8: Session state.** The session holds: compiled Plan, current parameter overrides, current state matrix (from most recent eval). `compile` replaces the entire session state. `eval` updates overrides and state. Multiple `eval` calls are independent (no accumulation). +### AC-9 — AC-9: Performance + +**AC-9: Performance.** For a model with 8 bins and ~10 series, `eval` with scalar overrides completes in under 1ms (excluding I/O). A Rust benchmark test evaluates 1,000 times in a loop and asserts total < 1 second. +### AC-10 — AC-10: Integration test + +**AC-10: Integration test.** A Rust integration test spawns `flowtime-engine session` as a subprocess, sends compile + eval + eval (with different overrides) + get_params via the MessagePack protocol over stdin/stdout, and verifies all responses are correct. +## Technical Notes + +### Dependencies to add + +- `rmp-serde` (MessagePack serialization for Rust) — workspace dependency +- `serde` derive on request/response types + +### Module structure + +- `engine/core/src/session.rs` — Session struct, state management, command dispatch +- `engine/core/src/protocol.rs` — Request/Response types, MessagePack framing (read/write) +- `engine/cli/src/main.rs` — `cmd_session()` entry point + +### Message envelope + +```rust +#[derive(Serialize, Deserialize)] +struct Request { + method: String, + #[serde(default)] + params: serde_json::Value, // flexible params per method +} + +#[derive(Serialize)] +struct Response { + #[serde(skip_serializing_if = "Option::is_none")] + result: Option, + #[serde(skip_serializing_if = "Option::is_none")] + error: Option, +} +``` + +Note: We use `serde_json::Value` as the flexible inner type even though the wire format is MessagePack. 
MessagePack and JSON share the same data model (maps, arrays, strings, numbers, bools, null). `rmp-serde` serializes/deserializes `serde_json::Value` correctly. + +### Series encoding + +Series data (`Vec<f64>`) serializes naturally as MessagePack arrays of floats. For very large series, a future optimization could use MessagePack binary ext type for raw f64 bytes, but the standard array encoding is correct and sufficient for this milestone. + +### Post-eval pipeline + +After `evaluate_with_params`, the session must also run: +- Class decomposition normalization + proportional allocation +- Edge series computation +- Analysis warnings + +This means the session calls the same post-eval pipeline as `eval_model_with_params`. The simplest approach: the session stores the compiled Plan and the ModelDefinition, and each `eval` call runs `eval_model_with_params` reusing the model but with the new overrides. + +For the compile-once optimization (skip recompilation), a future milestone can cache the Plan separately. For now, recompiling per eval is acceptable if latency is under the AC-9 target.
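The override semantics each session `eval` relies on (a scalar fills every bin, a vector writes per-bin values, unknown IDs are ignored, and nothing accumulates between calls) can be sketched as follows. This is an illustration under stated assumptions, not the engine's code: `ParamDef`, `apply_params`, and `fill_column` are hypothetical names, the state matrix is assumed to be a flat bin-major `Vec<f64>`, and a vector shorter than the grid is assumed to leave the remaining bins at their defaults.

```rust
#[derive(Clone, Debug, PartialEq)]
enum ParamValue {
    Scalar(f64),
    Vector(Vec<f64>),
}

/// Hypothetical parameter-table entry: an id, the state-matrix column it
/// fills, and the default value taken from the model.
struct ParamDef {
    id: String,
    column: usize,
    default: ParamValue,
}

/// Write one parameter value into its column of a bin-major matrix
/// (assumed layout: state[bin * columns + column]).
fn fill_column(state: &mut [f64], columns: usize, column: usize, value: &ParamValue) {
    for (bin, row) in state.chunks_mut(columns).enumerate() {
        match value {
            ParamValue::Scalar(v) => row[column] = *v,
            ParamValue::Vector(vs) => {
                // Assumption: bins past the end of the vector keep their prior value.
                if let Some(v) = vs.get(bin) {
                    row[column] = *v;
                }
            }
        }
    }
}

/// Build a fresh state matrix from the defaults, then apply the overrides.
/// Starting from defaults on every call means evaluations do not accumulate.
fn apply_params(
    params: &[ParamDef],
    overrides: &[(String, ParamValue)],
    bins: usize,
    columns: usize,
) -> Vec<f64> {
    assert!(params.iter().all(|p| p.column < columns));
    let mut state = vec![0.0; bins * columns];
    for p in params {
        fill_column(&mut state, columns, p.column, &p.default);
    }
    for (id, value) in overrides {
        // Override IDs that match no parameter are skipped without error.
        if let Some(p) = params.iter().find(|p| p.id == *id) {
            fill_column(&mut state, columns, p.column, value);
        }
    }
    state
}
```

Because the plan-side data (`params`) is only borrowed immutably, the same table could be shared across many evaluations, which is the property the compile-once, eval-many pattern depends on.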
+ +## Out of Scope + +- WebSocket transport (M-018) +- .NET bridge for session mode (M-018) +- UI parameter controls (M-019) +- Parameter sweep batch mode (m-E18-03) +- Request IDs / multiplexing (single-client, sequential for now) +- Authentication or access control +- TLS/encryption + +## Key References + +- `engine/core/src/compiler.rs` — `compile()`, `eval_model_with_params()` +- `engine/core/src/plan.rs` — `ParamTable`, `ParamValue` +- `engine/core/src/eval.rs` — `evaluate_with_params()` +- `engine/cli/src/main.rs` — existing CLI command dispatch +- `docs/architecture/headless-engine-architecture.md` — protocol design +- [rmp-serde crate](https://crates.io/crates/rmp-serde) — MessagePack for Rust +- [MessagePack spec](https://msgpack.org/) — wire format diff --git a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-06-tiered-validation.md b/work/epics/E-18-headless-pipeline-and-optimization/M-003-tiered-validation.md similarity index 53% rename from work/epics/E-18-headless-pipeline-and-optimization/m-E18-06-tiered-validation.md rename to work/epics/E-18-headless-pipeline-and-optimization/M-003-tiered-validation.md index 1e298c64..99a3646f 100644 --- a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-06-tiered-validation.md +++ b/work/epics/E-18-headless-pipeline-and-optimization/M-003-tiered-validation.md @@ -1,8 +1,37 @@ -# m-E18-06 — Tiered Validation - -**Epic:** E-18 Time Machine -**Branch:** `milestone/m-E18-06-tiered-validation` -**Status:** complete +--- +id: M-003 +title: Tiered Validation +status: done +parent: E-18 +acs: + - id: AC-1 + title: TimeMachineValidator.Validate(yaml, ValidationTier.Schema) returns + status: met + - id: AC-2 + title: TimeMachineValidator.Validate(yaml, ValidationTier.Compile) catches + status: met + - id: AC-3 + title: TimeMachineValidator.Validate(yaml, ValidationTier.Analyse) returns + status: met + - id: AC-4 + title: POST /v1/validate responds 200 with { isValid, tier, errors, warnings + status: met + - 
id: AC-5 + title: Invalid tier value → 400 Bad Request + status: met + - id: AC-6 + title: Empty/null yaml → 400 Bad Request + status: met + - id: AC-7 + title: Rust session validate_schema returns { is_valid, errors } without + status: met + - id: AC-8 + title: rg "FlowTime\.Generator" src/ tests/ still zero (no regressions) + status: met + - id: AC-9 + title: dotnet test FlowTime.sln all green; Rust cargo test all green + status: met +--- ## Goal @@ -90,14 +119,30 @@ response (invalid): { result: { is_valid: false, errors: ["..."] } } Tier 2 (compile) is already served by the existing `compile` command, which returns `error: { code: "compile_error", ... }` on failure. -## Acceptance Criteria - -- [x] `TimeMachineValidator.Validate(yaml, ValidationTier.Schema)` returns errors for invalid YAML -- [x] `TimeMachineValidator.Validate(yaml, ValidationTier.Compile)` catches structural errors (bad node refs, bad expressions) -- [x] `TimeMachineValidator.Validate(yaml, ValidationTier.Analyse)` returns warnings from invariant analyzer -- [x] `POST /v1/validate` responds 200 with `{ isValid, tier, errors, warnings }` for all three tiers -- [x] Invalid tier value → 400 Bad Request -- [x] Empty/null yaml → 400 Bad Request -- [x] Rust session `validate_schema` returns `{ is_valid, errors }` without full compile -- [x] `rg "FlowTime\.Generator" src/ tests/` still zero (no regressions) -- [x] `dotnet test FlowTime.sln` all green; Rust `cargo test` all green +## Acceptance criteria + +### AC-1 — TimeMachineValidator.Validate(yaml, ValidationTier.Schema) returns + +`TimeMachineValidator.Validate(yaml, ValidationTier.Schema)` returns errors for invalid YAML +### AC-2 — TimeMachineValidator.Validate(yaml, ValidationTier.Compile) catches + +`TimeMachineValidator.Validate(yaml, ValidationTier.Compile)` catches structural errors (bad node refs, bad expressions) +### AC-3 — TimeMachineValidator.Validate(yaml, ValidationTier.Analyse) returns + +`TimeMachineValidator.Validate(yaml, 
ValidationTier.Analyse)` returns warnings from invariant analyzer +### AC-4 — POST /v1/validate responds 200 with { isValid, tier, errors, warnings + +`POST /v1/validate` responds 200 with `{ isValid, tier, errors, warnings }` for all three tiers +### AC-5 — Invalid tier value → 400 Bad Request + +### AC-6 — Empty/null yaml → 400 Bad Request + +### AC-7 — Rust session validate_schema returns { is_valid, errors } without + +Rust session `validate_schema` returns `{ is_valid, errors }` without full compile +### AC-8 — rg "FlowTime\.Generator" src/ tests/ still zero (no regressions) + +`rg "FlowTime\.Generator" src/ tests/` still zero (no regressions) +### AC-9 — dotnet test FlowTime.sln all green; Rust cargo test all green + +`dotnet test FlowTime.sln` all green; Rust `cargo test` all green diff --git a/work/epics/E-18-headless-pipeline-and-optimization/M-004-generator-extraction-timemachine.md b/work/epics/E-18-headless-pipeline-and-optimization/M-004-generator-extraction-timemachine.md new file mode 100644 index 00000000..4a64d9c5 --- /dev/null +++ b/work/epics/E-18-headless-pipeline-and-optimization/M-004-generator-extraction-timemachine.md @@ -0,0 +1,76 @@ +--- +id: M-004 +title: Generator Extraction → TimeMachine +status: done +parent: E-18 +acs: + - id: AC-1 + title: src/FlowTime.TimeMachine/ exists; src/FlowTime.Generator/ is gone + status: met + - id: AC-2 + title: tests/FlowTime.TimeMachine.Tests/ exists + status: met + - id: AC-3 + title: dotnet build FlowTime.sln succeeds with zero errors + status: met + - id: AC-4 + title: dotnet test FlowTime.sln passes with the same test count + status: met + - id: AC-5 + title: rg "FlowTime\.Generator" src/ tests/ --include=".cs" + status: met + - id: AC-6 + title: Solution file contains TimeMachine entry; Generator entry is absent + status: met +--- + +## Goal + +Rename `FlowTime.Generator` → `FlowTime.TimeMachine`. 
Move all classes, update all +references in consumers (src + tests), remove `FlowTime.Generator` from the solution. +Pure structural refactor — no behavior change, all tests green, no coexistence window +(per D-032 Path B). + +## Scope + +**In scope:** +- Create `src/FlowTime.TimeMachine/FlowTime.TimeMachine.csproj` with identical dependencies +- Move all Generator source files; update `FlowTime.Generator.*` namespaces → `FlowTime.TimeMachine.*` +- Rename `tests/FlowTime.Generator.Tests/` → `tests/FlowTime.TimeMachine.Tests/`; update its csproj +- Update project references in: FlowTime.Cli, FlowTime.Sim.Service, FlowTime.API, FlowTime.Api.Tests, FlowTime.Cli.Tests, FlowTime.Integration.Tests +- Update `using FlowTime.Generator.*` → `using FlowTime.TimeMachine.*` across all source files +- Register TimeMachine in FlowTime.sln; remove Generator entry +- Delete `src/FlowTime.Generator/` entirely + +**Out of scope:** +- Tiered validation (M-003) +- Any behavior changes whatsoever + +## Acceptance criteria + +### AC-1 — src/FlowTime.TimeMachine/ exists; src/FlowTime.Generator/ is gone + +`src/FlowTime.TimeMachine/` exists; `src/FlowTime.Generator/` is gone +### AC-2 — tests/FlowTime.TimeMachine.Tests/ exists + +`tests/FlowTime.TimeMachine.Tests/` exists; `tests/FlowTime.Generator.Tests/` is gone +### AC-3 — dotnet build FlowTime.sln succeeds with zero errors + +`dotnet build FlowTime.sln` succeeds with zero errors +### AC-4 — dotnet test FlowTime.sln passes with the same test count + +`dotnet test FlowTime.sln` passes with the same test count +### AC-5 — rg "FlowTime\.Generator" src/ tests/ --include=".cs" + +`rg "FlowTime\.Generator" src/ tests/ --include="*.cs" --include="*.csproj"` returns zero matches +### AC-6 — Solution file contains TimeMachine entry; Generator entry is absent +## Namespace Mapping + +| Old | New | +|-----|-----| +| `FlowTime.Generator` | `FlowTime.TimeMachine` | +| `FlowTime.Generator.Artifacts` | `FlowTime.TimeMachine.Artifacts` | +| 
`FlowTime.Generator.Capture` | `FlowTime.TimeMachine.Capture` | +| `FlowTime.Generator.Models` | `FlowTime.TimeMachine.Models` | +| `FlowTime.Generator.Orchestration` | `FlowTime.TimeMachine.Orchestration` | +| `FlowTime.Generator.Processing` | `FlowTime.TimeMachine.Processing` | diff --git a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-08-telemetry-source-contract.md b/work/epics/E-18-headless-pipeline-and-optimization/M-005-itelemetrysource-contract.md similarity index 62% rename from work/epics/E-18-headless-pipeline-and-optimization/m-E18-08-telemetry-source-contract.md rename to work/epics/E-18-headless-pipeline-and-optimization/M-005-itelemetrysource-contract.md index 46553545..f30fc8dd 100644 --- a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-08-telemetry-source-contract.md +++ b/work/epics/E-18-headless-pipeline-and-optimization/M-005-itelemetrysource-contract.md @@ -1,14 +1,40 @@ -# m-E18-08 — ITelemetrySource Contract - -**Epic:** E-18 Time Machine -**Branch:** `milestone/m-E18-08-telemetry-source-contract` -**Status:** complete +--- +id: M-005 +title: ITelemetrySource Contract +status: done +parent: E-18 +acs: + - id: AC-1 + title: ITelemetrySource interface exists in FlowTime.TimeMachine.Telemetry + status: met + - id: AC-2 + title: TelemetryData carries Grid + Series + optional Provenance + status: met + - id: AC-3 + title: CanonicalBundleSource.ReadAsync reads a bundle directory and returns + status: met + - id: AC-4 + title: FileCsvSource.ReadAsync reads a single CSV and returns the series + status: met + - id: AC-5 + title: Both implementations compile and have passing unit tests (23 tests + status: met + - id: AC-6 + title: ITelemetrySink is not introduced (explicitly documented as deferred) + status: met + - id: AC-7 + title: rg "FlowTime\.Generator" src/ tests/ still zero (no regressions) + status: met + - id: AC-8 + title: dotnet test FlowTime.sln all green (72 TimeMachine tests, 0 failures) + status: met +--- ## Goal 
Define `ITelemetrySource` as the formal input contract for the Time Machine's external data surface, with two concrete implementations from day one. Satisfies the deferred portion of -the spec's m-E18-01b scope (the tiered-validation half shipped as m-E18-06; this delivers +the spec's m-E18-01b scope (the tiered-validation half shipped as M-003; this delivers the source-contract half). ## Scope @@ -33,7 +59,7 @@ implementation. - Unit tests: `tests/FlowTime.TimeMachine.Tests/Telemetry/` **Out of scope:** -- `ITelemetrySink` — explicitly deferred per D-2026-04-07-020 +- `ITelemetrySink` — explicitly deferred per D-033 - Real-world format adapters (Prometheus, OTEL, BPI) — m-E18 telemetry adapters milestone - Time Machine `Evaluate` / `Reevaluate` consuming the source — separate milestone - HTTP endpoint changes @@ -106,13 +132,29 @@ public sealed class FileCsvSource : ITelemetrySource } ``` -## Acceptance Criteria +## Acceptance criteria + +### AC-1 — ITelemetrySource interface exists in FlowTime.TimeMachine.Telemetry + +`ITelemetrySource` interface exists in `FlowTime.TimeMachine.Telemetry` +### AC-2 — TelemetryData carries Grid + Series + optional Provenance + +`TelemetryData` carries Grid + Series + optional Provenance +### AC-3 — CanonicalBundleSource.ReadAsync reads a bundle directory and returns + +`CanonicalBundleSource.ReadAsync` reads a bundle directory and returns correct series values +### AC-4 — FileCsvSource.ReadAsync reads a single CSV and returns the series + +`FileCsvSource.ReadAsync` reads a single CSV and returns the series under the specified ID +### AC-5 — Both implementations compile and have passing unit tests (23 tests + +Both implementations compile and have passing unit tests (23 tests across 2 suites) +### AC-6 — ITelemetrySink is not introduced (explicitly documented as deferred) + +`ITelemetrySink` is **not** introduced (explicitly documented as deferred) +### AC-7 — rg "FlowTime\.Generator" src/ tests/ still zero (no regressions) + +`rg 
"FlowTime\.Generator" src/ tests/` still zero (no regressions) +### AC-8 — dotnet test FlowTime.sln all green (72 TimeMachine tests, 0 failures) -- [x] `ITelemetrySource` interface exists in `FlowTime.TimeMachine.Telemetry` -- [x] `TelemetryData` carries Grid + Series + optional Provenance -- [x] `CanonicalBundleSource.ReadAsync` reads a bundle directory and returns correct series values -- [x] `FileCsvSource.ReadAsync` reads a single CSV and returns the series under the specified ID -- [x] Both implementations compile and have passing unit tests (23 tests across 2 suites) -- [x] `ITelemetrySink` is **not** introduced (explicitly documented as deferred) -- [x] `rg "FlowTime\.Generator" src/ tests/` still zero (no regressions) -- [x] `dotnet test FlowTime.sln` all green (72 TimeMachine tests, 0 failures) +`dotnet test FlowTime.sln` all green (72 TimeMachine tests, 0 failures) diff --git a/work/epics/E-18-headless-pipeline-and-optimization/M-006-parameter-sweep.md b/work/epics/E-18-headless-pipeline-and-optimization/M-006-parameter-sweep.md new file mode 100644 index 00000000..dd3330ea --- /dev/null +++ b/work/epics/E-18-headless-pipeline-and-optimization/M-006-parameter-sweep.md @@ -0,0 +1,150 @@ +--- +id: M-006 +title: Parameter Sweep +status: done +parent: E-18 +acs: + - id: AC-1 + title: IModelEvaluator interface exists in FlowTime.TimeMachine.Sweep + status: met + - id: AC-2 + title: SweepSpec validates + status: met + - id: AC-3 + title: ConstNodePatcher.Patch correctly replaces const node values; returns + status: met + - id: AC-4 + title: SweepRunner.RunAsync returns one SweepPoint per input value, with + status: met + - id: AC-5 + title: SweepRunner respects CaptureSeriesIds filter (null = all series) + status: met + - id: AC-6 + title: SweepRunner respects CancellationToken between evaluation points + status: met + - id: AC-7 + title: RustModelEvaluator wraps RustEngineRunner and maps series list to + status: met + - id: AC-8 + title: POST /v1/sweep returns 
400 for missing yaml / paramId / empty values + status: met + - id: AC-9 + title: POST /v1/sweep returns 503 when Rust engine not enabled + status: met + - id: AC-10 + title: Unit tests pass + status: met + - id: AC-11 + title: 'API validation tests pass: 7 tests (6×400, 1×503)' + status: met + - id: AC-12 + title: dotnet test FlowTime.sln all green (105 TimeMachine, 235 API + status: met +--- + +## Goal + +Implement parameter sweep as a first-class Time Machine operation: given a model YAML, a +const-node ID, and an array of values, evaluate the model once per value and return a +structured table of (param_value → series outputs). + +Builds on: +- M-001 `evaluate_with_params` in the Rust engine (compile-once foundation) +- M-004 `FlowTime.TimeMachine` project (host for the sweep domain model) +- M-005 `ITelemetrySource` (pattern for injectable evaluation contracts) + +## Scope + +**`FlowTime.TimeMachine.Sweep` namespace** — in `src/FlowTime.TimeMachine/Sweep/`: +- `IModelEvaluator` — injectable evaluation contract; decouples SweepRunner from the Rust binary in tests +- `SweepSpec` — validated input: ModelYaml, ParamId, Values[], optional CaptureSeriesIds +- `SweepPoint` — single evaluation result: ParamValue + Series dictionary +- `SweepResult` — full sweep result: ParamId + SweepPoint[] +- `ConstNodePatcher` — internal YAML DOM manipulation; patches a named const node's values array +- `SweepRunner` — orchestrates N evaluations via injected `IModelEvaluator` +- `RustModelEvaluator : IModelEvaluator` — wraps `RustEngineRunner`, maps series list to dictionary + +**`POST /v1/sweep`** — in `src/FlowTime.API/Endpoints/SweepEndpoints.cs`: +- Request: `{ yaml, paramId, values: [double...], captureSeriesIds?: [string...] 
}` +- Response (200): `{ paramId, points: [{ paramValue, series: { seriesId: double[] } }] }` +- 400: missing yaml / paramId / values +- 503: engine not enabled (RustEngine:Enabled=false) + +**In scope:** +- `src/FlowTime.TimeMachine/Sweep/IModelEvaluator.cs` +- `src/FlowTime.TimeMachine/Sweep/SweepSpec.cs` +- `src/FlowTime.TimeMachine/Sweep/SweepResult.cs` +- `src/FlowTime.TimeMachine/Sweep/ConstNodePatcher.cs` +- `src/FlowTime.TimeMachine/Sweep/SweepRunner.cs` +- `src/FlowTime.TimeMachine/Sweep/RustModelEvaluator.cs` +- `src/FlowTime.API/Endpoints/SweepEndpoints.cs` +- DI registration in `Program.cs` +- Unit tests: `tests/FlowTime.TimeMachine.Tests/Sweep/` +- API tests: `tests/FlowTime.Api.Tests/SweepEndpointsTests.cs` + +**Out of scope:** +- Sensitivity analysis (numerical gradient) — follow-on +- Multi-parameter sweeps (grid sweeps) — follow-on +- Session-based compile-once optimization — follow-on (each sweep point uses subprocess eval) +- Optimization / fitting — M-007+ +- Sweep result persistence / artifact writing — follow-on + +## Design Notes + +### Implementation approach + +Each sweep point calls `RustEngineRunner.EvaluateAsync(patchedYaml)` independently (one +subprocess per point). The YAML is patched in-memory before each call via `ConstNodePatcher`, +which uses YamlDotNet's representation model to substitute the const node's values array. + +This deliberately trades compile-once efficiency for implementation simplicity: the Rust +session protocol requires a MessagePack NuGet dependency not yet in the tree, while the +subprocess approach reuses existing infrastructure with no new dependencies. + +The `IModelEvaluator` abstraction isolates this choice from `SweepRunner`, so a future +session-based evaluator can be dropped in without changing the sweep domain model or tests. 
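The per-point flow described in these design notes (patch the const node in memory, evaluate the patched model, collect the series) can be sketched roughly as below. This is an illustrative Rust sketch under stated assumptions, not the C# implementation: `run_sweep`, the `patch` closure, and the `evaluate` closure are hypothetical stand-ins for `SweepRunner.RunAsync`, `ConstNodePatcher.Patch`, and `IModelEvaluator`, and cancellation is left out.

```rust
use std::collections::HashMap;

/// One evaluated sweep point (mirrors the `SweepPoint` shape described above;
/// field names are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct SweepPoint {
    pub param_value: f64,
    pub series: HashMap<String, Vec<f64>>,
}

/// Evaluate the model once per value.
///
/// `patch` is a placeholder for `ConstNodePatcher.Patch` and `evaluate` is a
/// placeholder for the injected `IModelEvaluator`. A `capture` of `None`
/// keeps every series; `Some(ids)` keeps only the listed series IDs.
pub fn run_sweep<P, E>(
    model_yaml: &str,
    values: &[f64],
    capture: Option<&[&str]>,
    patch: P,
    evaluate: E,
) -> Result<Vec<SweepPoint>, String>
where
    P: Fn(&str, f64) -> String,
    E: Fn(&str) -> Result<HashMap<String, Vec<f64>>, String>,
{
    let mut points = Vec::with_capacity(values.len());
    for &value in values {
        // Each point is independent: patch the original model, then evaluate it.
        let patched = patch(model_yaml, value);
        let mut series = evaluate(patched.as_str())?;
        if let Some(ids) = capture {
            series.retain(|id, _| ids.contains(&id.as_str()));
        }
        points.push(SweepPoint { param_value: value, series });
    }
    Ok(points)
}
```

Because the evaluator is passed in rather than constructed inside the loop, a session-based evaluator could replace the subprocess-per-point one without touching the loop itself, which is the property the notes above attribute to the `IModelEvaluator` seam.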
+ +### ConstNodePatcher behaviour + +- Finds the first `nodes` entry where `id == nodeId` AND `kind == "const"` +- Replaces its `values` sequence with `[value, value, ..., value]` (same bin count) +- Returns the original YAML unchanged if the node is not found or is not a const node +- Uses `InvariantCulture` formatting for decimal precision + +## Acceptance criteria + +### AC-1 — IModelEvaluator interface exists in FlowTime.TimeMachine.Sweep + +`IModelEvaluator` interface exists in `FlowTime.TimeMachine.Sweep` +### AC-2 — SweepSpec validates + +`SweepSpec` validates: non-null/whitespace ModelYaml, non-null/whitespace ParamId, non-null/non-empty Values +### AC-3 — ConstNodePatcher.Patch correctly replaces const node values; returns + +`ConstNodePatcher.Patch` correctly replaces const node values; returns original YAML for unknown/non-const nodes +### AC-4 — SweepRunner.RunAsync returns one SweepPoint per input value, with + +`SweepRunner.RunAsync` returns one `SweepPoint` per input value, with correct ParamValue and Series +### AC-5 — SweepRunner respects CaptureSeriesIds filter (null = all series) + +`SweepRunner` respects `CaptureSeriesIds` filter (null = all series) +### AC-6 — SweepRunner respects CancellationToken between evaluation points + +`SweepRunner` respects `CancellationToken` between evaluation points +### AC-7 — RustModelEvaluator wraps RustEngineRunner and maps series list to + +`RustModelEvaluator` wraps `RustEngineRunner` and maps series list to dictionary +### AC-8 — POST /v1/sweep returns 400 for missing yaml / paramId / empty values + +`POST /v1/sweep` returns 400 for missing yaml / paramId / empty values +### AC-9 — POST /v1/sweep returns 503 when Rust engine not enabled + +`POST /v1/sweep` returns 503 when Rust engine not enabled +### AC-10 — Unit tests pass + +Unit tests pass: 28 sweep unit tests (SweepSpec ×9, ConstNodePatcher ×7, SweepRunner ×12) +### AC-11 — API validation tests pass: 7 tests (6×400, 1×503) + +### AC-12 — dotnet test 
FlowTime.sln all green (105 TimeMachine, 235 API + +`dotnet test FlowTime.sln` all green (105 TimeMachine, 235 API — pre-existing integration failures unrelated) diff --git a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-10-sensitivity-analysis.md b/work/epics/E-18-headless-pipeline-and-optimization/M-007-sensitivity-analysis.md similarity index 51% rename from work/epics/E-18-headless-pipeline-and-optimization/m-E18-10-sensitivity-analysis.md rename to work/epics/E-18-headless-pipeline-and-optimization/M-007-sensitivity-analysis.md index 60352ac5..8cc1a113 100644 --- a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-10-sensitivity-analysis.md +++ b/work/epics/E-18-headless-pipeline-and-optimization/M-007-sensitivity-analysis.md @@ -1,8 +1,49 @@ -# m-E18-10 — Sensitivity Analysis - -**Epic:** E-18 Time Machine -**Branch:** `epic/E-18-time-machine` -**Status:** complete +--- +id: M-007 +title: Sensitivity Analysis +status: done +parent: E-18 +acs: + - id: AC-1 + title: ConstNodeReader.ReadValue(yaml, nodeId) returns the first-bin value + status: met + - id: AC-2 + title: SensitivitySpec validates + status: met + - id: AC-3 + title: SensitivityRunner.RunAsync returns one SensitivityPoint per found + status: met + - id: AC-4 + title: Gradient computed correctly via central difference + status: met + - id: AC-5 + title: Zero-base param produces Gradient = 0.0 (no crash) + status: met + - id: AC-6 + title: Unknown param ID silently skipped (omitted from result) + status: met + - id: AC-7 + title: Missing metric series throws InvalidOperationException + status: met + - id: AC-8 + title: SensitivityRunner respects CancellationToken + status: met + - id: AC-9 + title: POST /v1/sensitivity returns 400 for missing yaml / paramIds / + status: met + - id: AC-10 + title: POST /v1/sensitivity returns 503 when Rust engine not enabled + status: met + - id: AC-11 + title: Unit tests pass + status: met + - id: AC-12 + title: 'API tests pass: 7 tests (6×400, 
1×503)' + status: met + - id: AC-13 + title: dotnet test FlowTime.sln all green (137 TimeMachine, 242 API) + status: met +--- ## Goal @@ -12,13 +53,13 @@ parameter using a central-difference approximation. Answers "which parameter has impact on this metric?" Builds on: -- m-E18-09 `SweepRunner` + `ConstNodePatcher` — two-point sweep per parameter reuses the +- M-006 `SweepRunner` + `ConstNodePatcher` — two-point sweep per parameter reuses the sweep infrastructure directly - `ConstNodePatcher` — YAML DOM manipulation already in place ## Scope -**`FlowTime.TimeMachine.Sweep` namespace** (extending m-E18-09's namespace): +**`FlowTime.TimeMachine.Sweep` namespace** (extending M-006's namespace): - `ConstNodeReader` — companion to `ConstNodePatcher`; reads the current scalar value of a named const node's first bin. Returns `null` if the node is not found or not a const node. - `SensitivitySpec` — validated input: ModelYaml, ParamIds[], MetricSeriesId, Perturbation (default 5%) @@ -47,7 +88,7 @@ Builds on: - Multi-metric sensitivity — single metric per call - Distribution-based sensitivity (Morris method, Sobol indices) — follow-on - Forward-difference vs central-difference choice — central difference only -- Optimization / fitting — m-E18-11+ +- Optimization / fitting — M-008+ ## Design Notes @@ -80,21 +121,43 @@ skipped params by comparing `spec.ParamIds.Length` vs `result.Points.Length`. injected `IModelEvaluator`. Tests pass a `SweepRunner(fakeEvaluator)` — no additional test doubles needed. 
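As a rough illustration of the central-difference ranking this milestone describes, the sketch below estimates one gradient per parameter and sorts by absolute value. It is written in Rust for illustration only, and several details are assumptions rather than the shipped C# code: the step is taken as `base * perturbation`, the names `central_difference`, `rank_sensitivities`, and the fields of `SensitivityPoint` are hypothetical, and `metric_mean` stands in for "patch the const node, evaluate, average the metric series".

```rust
/// One ranked result (field names are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct SensitivityPoint {
    pub param_id: String,
    pub base_value: f64,
    pub gradient: f64,
}

/// Central-difference estimate of d(metric mean)/d(param) around `base`.
/// Assumes a relative step: `step = base * perturbation`.
pub fn central_difference<F>(base: f64, perturbation: f64, metric_mean: F) -> f64
where
    F: Fn(f64) -> f64,
{
    let step = base * perturbation;
    if step == 0.0 {
        // A zero base has no relative perturbation; report 0.0 rather than divide by zero.
        return 0.0;
    }
    (metric_mean(base + step) - metric_mean(base - step)) / (2.0 * step)
}

/// Rank parameters by |gradient| descending. A base of `None` models an
/// unknown or non-const parameter and is silently skipped.
pub fn rank_sensitivities<F>(
    params: &[(&str, Option<f64>)],
    perturbation: f64,
    metric_mean: F,
) -> Vec<SensitivityPoint>
where
    F: Fn(&str, f64) -> f64,
{
    let mut points: Vec<SensitivityPoint> = params
        .iter()
        .filter_map(|&(id, base)| {
            let base = base?;
            let gradient = central_difference(base, perturbation, |v| metric_mean(id, v));
            Some(SensitivityPoint {
                param_id: id.to_string(),
                base_value: base,
                gradient,
            })
        })
        .collect();
    points.sort_by(|a, b| b.gradient.abs().total_cmp(&a.gradient.abs()));
    points
}
```

A caller can detect skipped parameters the same way the notes above suggest, by comparing the number of requested IDs with the number of returned points.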
-## Acceptance Criteria - -- [x] `ConstNodeReader.ReadValue(yaml, nodeId)` returns the first-bin value for known const - nodes; returns `null` for unknown nodes, non-const nodes, and missing `nodes` section -- [x] `SensitivitySpec` validates: non-null/whitespace ModelYaml, non-null/non-empty ParamIds, - non-null/whitespace MetricSeriesId, Perturbation in (0, 1) exclusive -- [x] `SensitivityRunner.RunAsync` returns one `SensitivityPoint` per found param, sorted by - `|Gradient|` descending -- [x] Gradient computed correctly via central difference -- [x] Zero-base param produces Gradient = 0.0 (no crash) -- [x] Unknown param ID silently skipped (omitted from result) -- [x] Missing metric series throws `InvalidOperationException` -- [x] `SensitivityRunner` respects `CancellationToken` -- [x] `POST /v1/sensitivity` returns 400 for missing yaml / paramIds / metricSeriesId -- [x] `POST /v1/sensitivity` returns 503 when Rust engine not enabled -- [x] Unit tests pass: 32 tests (ConstNodeReader ×8, SensitivitySpec ×12, SensitivityRunner ×12) -- [x] API tests pass: 7 tests (6×400, 1×503) -- [x] `dotnet test FlowTime.sln` all green (137 TimeMachine, 242 API) +## Acceptance criteria + +### AC-1 — ConstNodeReader.ReadValue(yaml, nodeId) returns the first-bin value + +`ConstNodeReader.ReadValue(yaml, nodeId)` returns the first-bin value for known const +nodes; returns `null` for unknown nodes, non-const nodes, and missing `nodes` section +### AC-2 — SensitivitySpec validates + +`SensitivitySpec` validates: non-null/whitespace ModelYaml, non-null/non-empty ParamIds, +non-null/whitespace MetricSeriesId, Perturbation in (0, 1) exclusive +### AC-3 — SensitivityRunner.RunAsync returns one SensitivityPoint per found + +`SensitivityRunner.RunAsync` returns one `SensitivityPoint` per found param, sorted by +`|Gradient|` descending +### AC-4 — Gradient computed correctly via central difference + +### AC-5 — Zero-base param produces Gradient = 0.0 (no crash) + +### AC-6 — Unknown param ID 
silently skipped (omitted from result) + +### AC-7 — Missing metric series throws InvalidOperationException + +Missing metric series throws `InvalidOperationException` +### AC-8 — SensitivityRunner respects CancellationToken + +`SensitivityRunner` respects `CancellationToken` +### AC-9 — POST /v1/sensitivity returns 400 for missing yaml / paramIds / + +`POST /v1/sensitivity` returns 400 for missing yaml / paramIds / metricSeriesId +### AC-10 — POST /v1/sensitivity returns 503 when Rust engine not enabled + +`POST /v1/sensitivity` returns 503 when Rust engine not enabled +### AC-11 — Unit tests pass + +Unit tests pass: 32 tests (ConstNodeReader ×8, SensitivitySpec ×12, SensitivityRunner ×12) +### AC-12 — API tests pass: 7 tests (6×400, 1×503) + +### AC-13 — dotnet test FlowTime.sln all green (137 TimeMachine, 242 API) + +`dotnet test FlowTime.sln` all green (137 TimeMachine, 242 API) diff --git a/work/epics/E-18-headless-pipeline-and-optimization/M-008-goal-seeking.md b/work/epics/E-18-headless-pipeline-and-optimization/M-008-goal-seeking.md new file mode 100644 index 00000000..af62ef76 --- /dev/null +++ b/work/epics/E-18-headless-pipeline-and-optimization/M-008-goal-seeking.md @@ -0,0 +1,126 @@ +--- +id: M-008 +title: Goal Seeking +status: done +parent: E-18 +acs: + - id: AC-1 + title: GoalSeekSpec validates + status: met + - id: AC-2 + title: GoalSeeker.SeekAsync converges on a linear model to within tolerance + status: met + - id: AC-3 + title: GoalSeeker returns Converged=false when target is not bracketed + status: met + - id: AC-4 + title: GoalSeeker returns Converged=false (best guess) when max iterations + status: met + - id: AC-5 + title: GoalSeeker respects CancellationToken + status: met + - id: AC-6 + title: POST /v1/goal-seek returns 400 for missing/invalid required fields + status: met + - id: AC-7 + title: POST /v1/goal-seek returns 503 when engine not enabled + status: met + - id: AC-8 + title: 'Unit tests pass: 26 tests (GoalSeekSpec ×14, GoalSeeker 
×12)' + status: met + - id: AC-9 + title: 'API tests pass: 8 tests (7×400, 1×503)' + status: met + - id: AC-10 + title: dotnet test FlowTime.sln all green (163 TimeMachine, 250 API) + status: met +--- + +## Goal + +Add 1D goal seeking: given a model YAML, a const-node parameter, a metric series, and a +target value, find the parameter value that drives the metric mean to the target via bisection. +Answers "what arrival rate gives 80% utilization?" without a full parameter sweep. + +Builds on: +- M-006 `SweepRunner` + `ConstNodePatcher` / `ConstNodeReader` (M-007) +- Same `IModelEvaluator` seam + +## Scope + +**`FlowTime.TimeMachine.Sweep` namespace:** +- `GoalSeekSpec` — validated input: ModelYaml, ParamId, MetricSeriesId, Target, SearchLo, + SearchHi, Tolerance (default 1e-6), MaxIterations (default 50) +- `GoalSeekResult` — output: ParamValue, AchievedMetricMean, Converged, Iterations +- `GoalSeeker` — bisection over `SweepRunner`; handles non-bracketed case gracefully + +**`POST /v1/goal-seek`** — in `src/FlowTime.API/Endpoints/GoalSeekEndpoints.cs` +- Request: `{ yaml, paramId, metricSeriesId, target, searchLo, searchHi, tolerance?, maxIterations? 
}` +- Response (200): `{ paramValue, achievedMetricMean, converged, iterations }` +- 400: missing/invalid required fields (searchLo ≥ searchHi is invalid) +- 503: engine not enabled + +**In scope:** +- `src/FlowTime.TimeMachine/Sweep/GoalSeekSpec.cs` +- `src/FlowTime.TimeMachine/Sweep/GoalSeekResult.cs` +- `src/FlowTime.TimeMachine/Sweep/GoalSeeker.cs` +- `src/FlowTime.API/Endpoints/GoalSeekEndpoints.cs` +- DI registration in `Program.cs` +- Unit tests: `tests/FlowTime.TimeMachine.Tests/Sweep/` +- API tests: `tests/FlowTime.Api.Tests/GoalSeekEndpointsTests.cs` +- Architecture doc: `docs/architecture/time-machine-analysis-modes.md` (written alongside) + +**Out of scope:** +- Multi-dimensional optimization (Nelder-Mead) — M-009+ +- Constraint handling beyond the `[searchLo, searchHi]` range +- Non-monotonic functions (bisection is undefined; `Converged=false` returned) + +## Algorithm + +Bisection on the metric mean: + +``` +1. Evaluate at searchLo → meanLo = mean(metric at searchLo) +2. Evaluate at searchHi → meanHi = mean(metric at searchHi) +3. If target not in [min(meanLo,meanHi), max(meanLo,meanHi)]: + return best endpoint, Converged=false +4. While iterations < maxIterations: + mid = (lo + hi) / 2 + midMean = mean(metric at mid) + if |midMean - target| < tolerance: return mid, Converged=true + if (midMean - target) same sign as (meanLo - target): lo = mid, meanLo = midMean + else: hi = mid, meanHi = midMean +5. 
Return mid, Converged=false (max iterations reached) +``` + +## Acceptance criteria + +### AC-1 — GoalSeekSpec validates + +`GoalSeekSpec` validates: non-null/whitespace ModelYaml/ParamId/MetricSeriesId; +SearchLo < SearchHi; Tolerance > 0; MaxIterations ≥ 1 +### AC-2 — GoalSeeker.SeekAsync converges on a linear model to within tolerance + +`GoalSeeker.SeekAsync` converges on a linear model to within tolerance +### AC-3 — GoalSeeker returns Converged=false when target is not bracketed + +`GoalSeeker` returns `Converged=false` when target is not bracketed +### AC-4 — GoalSeeker returns Converged=false (best guess) when max iterations + +`GoalSeeker` returns `Converged=false` (best guess) when max iterations exhausted +### AC-5 — GoalSeeker respects CancellationToken + +`GoalSeeker` respects `CancellationToken` +### AC-6 — POST /v1/goal-seek returns 400 for missing/invalid required fields + +`POST /v1/goal-seek` returns 400 for missing/invalid required fields +### AC-7 — POST /v1/goal-seek returns 503 when engine not enabled + +`POST /v1/goal-seek` returns 503 when engine not enabled +### AC-8 — Unit tests pass: 26 tests (GoalSeekSpec ×14, GoalSeeker ×12) + +### AC-9 — API tests pass: 8 tests (7×400, 1×503) + +### AC-10 — dotnet test FlowTime.sln all green (163 TimeMachine, 250 API) + +`dotnet test FlowTime.sln` all green (163 TimeMachine, 250 API) diff --git a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-12-optimization.md b/work/epics/E-18-headless-pipeline-and-optimization/M-009-multi-parameter-optimization.md similarity index 59% rename from work/epics/E-18-headless-pipeline-and-optimization/m-E18-12-optimization.md rename to work/epics/E-18-headless-pipeline-and-optimization/M-009-multi-parameter-optimization.md index c4668106..63c08c84 100644 --- a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-12-optimization.md +++ b/work/epics/E-18-headless-pipeline-and-optimization/M-009-multi-parameter-optimization.md @@ -1,8 +1,43 @@ -# m-E18-12 — 
Multi-parameter Optimization - -**Epic:** E-18 Time Machine -**Branch:** `epic/E-18-time-machine` -**Status:** complete +--- +id: M-009 +title: Multi-parameter Optimization +status: done +parent: E-18 +acs: + - id: AC-1 + title: OptimizeSpec validates + status: met + - id: AC-2 + title: Optimizer.OptimizeAsync converges on a 1D bowl function to within + status: met + - id: AC-3 + title: Optimizer.OptimizeAsync converges on a 2D bowl function to within + status: met + - id: AC-4 + title: Optimizer.OptimizeAsync supports Maximize objective (maximizes a + status: met + - id: AC-5 + title: Optimizer returns Converged=false when MaxIterations exhausted before + status: met + - id: AC-6 + title: Optimizer respects CancellationToken + status: met + - id: AC-7 + title: POST /v1/optimize returns 400 for missing/invalid required fields + status: met + - id: AC-8 + title: POST /v1/optimize returns 503 when engine not enabled + status: met + - id: AC-9 + title: 'Unit tests pass: 29 tests (OptimizeSpec ×17, Optimizer ×12)' + status: met + - id: AC-10 + title: 'API tests pass: 10 tests (9×400, 1×503)' + status: met + - id: AC-11 + title: dotnet test FlowTime.sln all green (192 TimeMachine, 260 API) + status: met +--- ## Goal @@ -15,9 +50,9 @@ Answers "what combination of arrival rate and capacity minimizes queue depth?" w multi-dimensional grid search. Builds on: -- `IModelEvaluator` seam (m-E18-09) -- `ConstNodePatcher` for multi-parameter YAML mutation (m-E18-09) -- `ConstNodeReader` (m-E18-10) — used in tests to read patched values +- `IModelEvaluator` seam (M-006) +- `ConstNodePatcher` for multi-parameter YAML mutation (M-006) +- `ConstNodeReader` (M-007) — used in tests to read patched values ## Scope @@ -96,18 +131,38 @@ Objective f(v) = metricMean(v) for Minimize `Optimizer` takes `IModelEvaluator` directly (not `SweepRunner`) and applies `ConstNodePatcher.Patch` sequentially for all parameters before each evaluation. 
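The sequential patching described above can be pictured as a fold over the parameter list, with the objective sign handled separately. The Rust sketch below is illustrative only: `patch_all`, `patch_one`, `Objective`, and `objective_value` are hypothetical names, `patch_one` stands in for `ConstNodePatcher.Patch`, and negating the metric mean for Maximize is an assumption about how a minimizer would be reused, not something this diff confirms.

```rust
/// Apply a single-parameter patch once per (id, value) pair, threading the
/// model text through each call so later patches see earlier ones.
/// `patch_one` is a placeholder for `ConstNodePatcher.Patch`.
pub fn patch_all<P>(model_yaml: &str, params: &[(&str, f64)], patch_one: P) -> String
where
    P: Fn(&str, &str, f64) -> String,
{
    params
        .iter()
        .fold(model_yaml.to_string(), |yaml, &(id, value)| {
            patch_one(yaml.as_str(), id, value)
        })
}

/// Direction of the search (illustrative).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Objective {
    Minimize,
    Maximize,
}

/// Value handed to a minimizing search. Assumption: Maximize is handled by
/// negating the metric mean so the same minimizer can be reused.
pub fn objective_value(objective: Objective, metric_mean: f64) -> f64 {
    match objective {
        Objective::Minimize => metric_mean,
        Objective::Maximize => -metric_mean,
    }
}
```

Each candidate vertex would then be evaluated as "patch all parameters, evaluate once, take the metric mean, apply `objective_value`".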
-## Acceptance Criteria - -- [x] `OptimizeSpec` validates: non-null/whitespace ModelYaml/MetricSeriesId; non-null/non-empty - ParamIds; non-null SearchRanges with an entry for every ParamId (Lo < Hi for each); - Tolerance > 0; MaxIterations ≥ 1 -- [x] `Optimizer.OptimizeAsync` converges on a 1D bowl function to within tolerance -- [x] `Optimizer.OptimizeAsync` converges on a 2D bowl function to within tolerance -- [x] `Optimizer.OptimizeAsync` supports Maximize objective (maximizes a linear metric) -- [x] `Optimizer` returns `Converged=false` when MaxIterations exhausted before convergence -- [x] `Optimizer` respects `CancellationToken` -- [x] `POST /v1/optimize` returns 400 for missing/invalid required fields -- [x] `POST /v1/optimize` returns 503 when engine not enabled -- [x] Unit tests pass: 29 tests (OptimizeSpec ×17, Optimizer ×12) -- [x] API tests pass: 10 tests (9×400, 1×503) -- [x] `dotnet test FlowTime.sln` all green (192 TimeMachine, 260 API) +## Acceptance criteria + +### AC-1 — OptimizeSpec validates + +`OptimizeSpec` validates: non-null/whitespace ModelYaml/MetricSeriesId; non-null/non-empty +ParamIds; non-null SearchRanges with an entry for every ParamId (Lo < Hi for each); +Tolerance > 0; MaxIterations ≥ 1 +### AC-2 — Optimizer.OptimizeAsync converges on a 1D bowl function to within + +`Optimizer.OptimizeAsync` converges on a 1D bowl function to within tolerance +### AC-3 — Optimizer.OptimizeAsync converges on a 2D bowl function to within + +`Optimizer.OptimizeAsync` converges on a 2D bowl function to within tolerance +### AC-4 — Optimizer.OptimizeAsync supports Maximize objective (maximizes a + +`Optimizer.OptimizeAsync` supports Maximize objective (maximizes a linear metric) +### AC-5 — Optimizer returns Converged=false when MaxIterations exhausted before + +`Optimizer` returns `Converged=false` when MaxIterations exhausted before convergence +### AC-6 — Optimizer respects CancellationToken + +`Optimizer` respects `CancellationToken` +### AC-7 — 
POST /v1/optimize returns 400 for missing/invalid required fields + +`POST /v1/optimize` returns 400 for missing/invalid required fields +### AC-8 — POST /v1/optimize returns 503 when engine not enabled + +`POST /v1/optimize` returns 503 when engine not enabled +### AC-9 — Unit tests pass: 29 tests (OptimizeSpec ×17, Optimizer ×12) + +### AC-10 — API tests pass: 10 tests (9×400, 1×503) + +### AC-11 — dotnet test FlowTime.sln all green (192 TimeMachine, 260 API) + +`dotnet test FlowTime.sln` all green (192 TimeMachine, 260 API) diff --git a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-13-session-model-evaluator.md b/work/epics/E-18-headless-pipeline-and-optimization/M-010-sessionmodelevaluator.md similarity index 60% rename from work/epics/E-18-headless-pipeline-and-optimization/m-E18-13-session-model-evaluator.md rename to work/epics/E-18-headless-pipeline-and-optimization/M-010-sessionmodelevaluator.md index 872cd0d4..1529bad1 100644 --- a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-13-session-model-evaluator.md +++ b/work/epics/E-18-headless-pipeline-and-optimization/M-010-sessionmodelevaluator.md @@ -1,8 +1,70 @@ -# m-E18-13 — SessionModelEvaluator - -**Epic:** E-18 Time Machine -**Branch:** `milestone/m-E18-13-session-model-evaluator` (from `epic/E-18-time-machine`) -**Status:** complete — merged to epic branch 2026-04-15 +--- +id: M-010 +title: SessionModelEvaluator +status: done +parent: E-18 +acs: + - id: AC-1 + title: SessionModelEvaluator exists, implements IModelEvaluator and + status: met + - id: AC-2 + title: Constructor validates engine path (non-null, non-whitespace) + status: met + - id: AC-3 + title: First EvaluateAsync call spawns the subprocess exactly once + status: met + - id: AC-4 + title: First call sends compile; subsequent calls send eval with overrides + status: met + - id: AC-5 + title: Returned series dictionary uses case-insensitive keys (matches + status: met + - id: AC-6 + title: Error responses (error key 
present) raise InvalidOperationException + status: met + - id: AC-7 + title: DisposeAsync closes stdin, waits for exit, kills the process tree on + status: met + - id: AC-8 + title: DisposeAsync is idempotent (safe to call multiple times) + status: met + - id: AC-9 + title: Calling EvaluateAsync after DisposeAsync throws + status: met + - id: AC-10 + title: CancellationToken is observed during I/O + status: met + - id: AC-11 + title: Concurrent EvaluateAsync calls on one instance are serialized (no + status: met + - id: AC-12 + title: DI + status: met + - id: AC-13 + title: DI + status: met + - id: AC-14 + title: RustModelEvaluator.cs retained as fallback; covered by an API test + status: met + - id: AC-15 + title: Unit tests pass + status: met + - id: AC-16 + title: Integration tests pass with the Rust binary present + status: met + - id: AC-17 + title: API DI tests pass + status: met + - id: AC-18 + title: dotnet build FlowTime.sln green + status: met + - id: AC-19 + title: dotnet test FlowTime.sln all green (1,620 passed / 9 skipped) + status: met + - id: AC-20 + title: docs/architecture/time-machine-analysis-modes.md updated + status: met +--- ## Goal @@ -12,7 +74,7 @@ subprocess that re-parses YAML and re-compiles the Plan. For a sweep of 200 poin 200 compiles; for an optimization run it can be 100–1000 compiles. Each spawn is ~100–500 ms of pure compile overhead. -`SessionModelEvaluator` uses the m-E18-02 session protocol (MessagePack over stdin/stdout): +`SessionModelEvaluator` uses the M-002 session protocol (MessagePack over stdin/stdout): compile once on the first call, then send `eval` with parameter overrides for every subsequent call. The expected speedup for large batches is ~10–50×. @@ -121,43 +183,81 @@ are `Dictionary` navigated by key. Matching the Rust protocol: Errors arrive as `{ error: { code, message } }` with no `result` key. The evaluator raises `InvalidOperationException` with the error code + message. 
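The error handling described above can be sketched as follows. This is an illustrative Python version, not the .NET code: `EngineError` stands in for `InvalidOperationException`, the response is assumed to be an already-decoded map, and the placeholder text used when `code` or `message` is missing is an assumption.

```python
from typing import Any, Mapping


class EngineError(RuntimeError):
    """Stand-in for the InvalidOperationException raised by the evaluator."""


def extract_result(response: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the `result` payload, or raise with the engine's code and message."""
    error = response.get("error")
    if error is not None:
        # Tolerate a malformed error object: fall back to placeholders
        # instead of raising a KeyError that would hide the engine error.
        fields = error if isinstance(error, Mapping) else {}
        code = fields.get("code", "unknown")
        message = fields.get("message", "no message")
        raise EngineError(f"engine error {code}: {message}")
    result = response.get("result")
    if not isinstance(result, Mapping):
        # Neither a usable result nor an error: treat as protocol corruption.
        raise EngineError("response has neither a valid `result` nor an `error`")
    return result
```

Checking `error` before `result` means an error response is never mistaken for success, even if a corrupted frame happened to carry both keys.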
-## Acceptance Criteria - -- [x] `SessionModelEvaluator` exists, implements `IModelEvaluator` and `IAsyncDisposable` -- [x] Constructor validates engine path (non-null, non-whitespace) -- [x] First `EvaluateAsync` call spawns the subprocess exactly once; subsequent calls reuse it -- [x] First call sends `compile`; subsequent calls send `eval` with overrides extracted via `ConstNodeReader` -- [x] Returned series dictionary uses case-insensitive keys (matches `RustModelEvaluator`) -- [x] Error responses (`error` key present) raise `InvalidOperationException` with code + message -- [x] `DisposeAsync` closes stdin, waits for exit, kills the process tree on timeout -- [x] `DisposeAsync` is idempotent (safe to call multiple times) -- [x] Calling `EvaluateAsync` after `DisposeAsync` throws `ObjectDisposedException` -- [x] `CancellationToken` is observed during I/O -- [x] Concurrent `EvaluateAsync` calls on one instance are serialized (no interleaved frames) -- [x] DI: `IModelEvaluator`, `SweepRunner`, `SensitivityRunner`, `GoalSeeker`, `Optimizer` all registered as `Scoped` -- [x] DI: `RustEngine:UseSession` config (default `true`) selects `SessionModelEvaluator`; `false` selects `RustModelEvaluator` -- [x] `RustModelEvaluator.cs` retained as fallback; covered by an API test that flips the config switch -- [x] Unit tests pass: 32 tests total - - 6 constructor + disposal (SessionModelEvaluatorTests) - - 3 BuildOverrides (empty / all-found / some-missing) - - 5 ExtractResult (success / error-with-code-msg / error-missing-subfields / neither / malformed-result) - - 4 ExtractParamIds (missing-key / not-array / valid / malformed-items) - - 6 ExtractSeries (missing-key / not-dict / valid / case-insensitive / non-string-key / non-array-value) - - 1 WriteFrameAsync (length-prefixed MessagePack) - - 5 ReadFrameAsync (valid / zero / negative / excessive / truncated) - - 2 ReadExactAsync (full-read / EOF-mid-read) -- [x] Integration tests pass with the Rust binary present: 8 tests 
(SessionModelEvaluatorIntegrationTests) - - [x] Compile-once / eval-many returns correct series after parameter override - - [x] Parity on numeric values against per-eval path (keys differ by design — documented in `work/gaps.md`) - - [x] `SweepRunner` drives `SessionModelEvaluator` end-to-end over a 5-point sweep - - [x] Session subprocess does not leak after disposal - - [x] Invalid model raises `InvalidOperationException` with engine error code - - [x] Concurrent calls on one instance are serialized -- [x] API DI tests pass: 4 tests (ModelEvaluatorRegistrationTests — default/true/false/scope lifetime) -- [x] `dotnet build FlowTime.sln` green -- [x] `dotnet test FlowTime.sln` all green (1,620 passed / 9 skipped) -- [x] `docs/architecture/time-machine-analysis-modes.md` updated — now documents both evaluator paths, config switch, and scoped lifetime +## Acceptance criteria + +### AC-1 — SessionModelEvaluator exists, implements IModelEvaluator and + +`SessionModelEvaluator` exists, implements `IModelEvaluator` and `IAsyncDisposable` +### AC-2 — Constructor validates engine path (non-null, non-whitespace) + +### AC-3 — First EvaluateAsync call spawns the subprocess exactly once + +First `EvaluateAsync` call spawns the subprocess exactly once; subsequent calls reuse it +### AC-4 — First call sends compile; subsequent calls send eval with overrides + +First call sends `compile`; subsequent calls send `eval` with overrides extracted via `ConstNodeReader` +### AC-5 — Returned series dictionary uses case-insensitive keys (matches + +Returned series dictionary uses case-insensitive keys (matches `RustModelEvaluator`) +### AC-6 — Error responses (error key present) raise InvalidOperationException + +Error responses (`error` key present) raise `InvalidOperationException` with code + message +### AC-7 — DisposeAsync closes stdin, waits for exit, kills the process tree on + +`DisposeAsync` closes stdin, waits for exit, kills the process tree on timeout +### AC-8 — 
DisposeAsync is idempotent (safe to call multiple times) + +`DisposeAsync` is idempotent (safe to call multiple times) +### AC-9 — Calling EvaluateAsync after DisposeAsync throws + +Calling `EvaluateAsync` after `DisposeAsync` throws `ObjectDisposedException` +### AC-10 — CancellationToken is observed during I/O + +`CancellationToken` is observed during I/O +### AC-11 — Concurrent EvaluateAsync calls on one instance are serialized (no + +Concurrent `EvaluateAsync` calls on one instance are serialized (no interleaved frames) +### AC-12 — DI + +DI: `IModelEvaluator`, `SweepRunner`, `SensitivityRunner`, `GoalSeeker`, `Optimizer` all registered as `Scoped` +### AC-13 — DI + +DI: `RustEngine:UseSession` config (default `true`) selects `SessionModelEvaluator`; `false` selects `RustModelEvaluator` +### AC-14 — RustModelEvaluator.cs retained as fallback; covered by an API test + +`RustModelEvaluator.cs` retained as fallback; covered by an API test that flips the config switch +### AC-15 — Unit tests pass + +Unit tests pass: 32 tests total +- 6 constructor + disposal (SessionModelEvaluatorTests) +- 3 BuildOverrides (empty / all-found / some-missing) +- 5 ExtractResult (success / error-with-code-msg / error-missing-subfields / neither / malformed-result) +- 4 ExtractParamIds (missing-key / not-array / valid / malformed-items) +- 6 ExtractSeries (missing-key / not-dict / valid / case-insensitive / non-string-key / non-array-value) +- 1 WriteFrameAsync (length-prefixed MessagePack) +- 5 ReadFrameAsync (valid / zero / negative / excessive / truncated) +- 2 ReadExactAsync (full-read / EOF-mid-read) +### AC-16 — Integration tests pass with the Rust binary present + +Integration tests pass with the Rust binary present: 8 tests (SessionModelEvaluatorIntegrationTests) +- [x] Compile-once / eval-many returns correct series after parameter override +- [x] Parity on numeric values against per-eval path (keys differ by design — documented in `work/gaps.md`) +- [x] `SweepRunner` drives 
`SessionModelEvaluator` end-to-end over a 5-point sweep +- [x] Session subprocess does not leak after disposal +- [x] Invalid model raises `InvalidOperationException` with engine error code +- [x] Concurrent calls on one instance are serialized +### AC-17 — API DI tests pass + +API DI tests pass: 4 tests (ModelEvaluatorRegistrationTests — default/true/false/scope lifetime) +### AC-18 — dotnet build FlowTime.sln green + +`dotnet build FlowTime.sln` green +### AC-19 — dotnet test FlowTime.sln all green (1,620 passed / 9 skipped) + +`dotnet test FlowTime.sln` all green (1,620 passed / 9 skipped) +### AC-20 — docs/architecture/time-machine-analysis-modes.md updated +`docs/architecture/time-machine-analysis-modes.md` updated — now documents both evaluator paths, config switch, and scoped lifetime ## Coverage notes **Covered:** every reachable branch in the production implementation — 44 dedicated tests (32 unit + 8 integration + 4 DI). The unit tests deliberately exercise every parsing helper with hand-crafted protocol payloads that the real Rust engine would not produce (missing fields, malformed types, non-string keys, out-of-range frame lengths), because those are defense-in-depth paths against protocol corruption and must not fail silently. 
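The frame-length checks mentioned above can be sketched as follows. This is a hedged Python illustration, not the production code: the 4-byte big-endian signed length prefix, the 64 MiB cap, and the choice to reject zero-length frames are all assumptions, since the milestone does not state the exact wire layout.

```python
import io
import struct
from typing import BinaryIO

MAX_FRAME_BYTES = 64 * 1024 * 1024  # assumed cap; the real limit is not stated


def read_exact(stream: BinaryIO, count: int) -> bytes:
    """Read exactly `count` bytes, or raise EOFError if the stream ends early."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError(f"stream ended with {remaining} of {count} bytes missing")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_frame(stream: BinaryIO) -> bytes:
    """Read one length-prefixed frame and return its still-encoded payload."""
    # Assumption: 4-byte big-endian signed length prefix.
    (length,) = struct.unpack(">i", read_exact(stream, 4))
    if length <= 0 or length > MAX_FRAME_BYTES:
        raise ValueError(f"invalid frame length {length}")
    return read_exact(stream, length)


def write_frame(stream: BinaryIO, payload: bytes) -> None:
    """Write the length prefix followed by the payload."""
    stream.write(struct.pack(">i", len(payload)) + payload)


# Round trip with the MessagePack encoding of {"ok": true}.
buffer = io.BytesIO()
write_frame(buffer, b"\x81\xa2ok\xc3")
buffer.seek(0)
assert read_frame(buffer) == b"\x81\xa2ok\xc3"
```

Validating the length before allocating the payload buffer is what keeps a corrupted prefix from turning into a huge allocation or a silent short read.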
@@ -177,10 +277,10 @@ These six branches remain in the code as defense-in-depth and would be removed o ## Dependencies -- m-E18-02 (engine session protocol) — delivered -- m-E18-08 (ITelemetrySource) — independent -- m-E18-09 (`IModelEvaluator` seam, `ConstNodePatcher`) — delivered -- m-E18-10 (`ConstNodeReader`) — delivered +- M-002 (engine session protocol) — delivered +- M-005 (ITelemetrySource) — independent +- M-006 (`IModelEvaluator` seam, `ConstNodePatcher`) — delivered +- M-007 (`ConstNodeReader`) — delivered - `MessagePack` 3.1.4 — already in integration tests, add to TimeMachine ## Risks / notes diff --git a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-14-timemachine-cli.md b/work/epics/E-18-headless-pipeline-and-optimization/M-011-net-time-machine-cli.md similarity index 60% rename from work/epics/E-18-headless-pipeline-and-optimization/m-E18-14-timemachine-cli.md rename to work/epics/E-18-headless-pipeline-and-optimization/M-011-net-time-machine-cli.md index 810f59aa..f9a23ca6 100644 --- a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-14-timemachine-cli.md +++ b/work/epics/E-18-headless-pipeline-and-optimization/M-011-net-time-machine-cli.md @@ -1,8 +1,64 @@ -# m-E18-14 — .NET Time Machine CLI - -**Epic:** E-18 Time Machine -**Branch:** `milestone/m-E18-14-timemachine-cli` (from `epic/E-18-time-machine`) -**Status:** complete — merged to epic branch 2026-04-15 +--- +id: M-011 +title: .NET Time Machine CLI +status: done +parent: E-18 +acs: + - id: AC-1 + title: Five CLI commands (validate, sweep, sensitivity, goal-seek, optimize) + status: met + - id: AC-2 + title: Each command parses --spec / stdin, --output / stdout, --no-session + status: met + - id: AC-3 + title: validate reads YAML (not JSON) via --model / stdin; outputs + status: met + - id: AC-4 + title: Each analysis command reads its matching *Spec as JSON and writes its + status: met + - id: AC-5 + title: CliEngineSetup helper resolves binary path via --engine → + status: 
met + - id: AC-6 + title: CliEngineSetup constructs SessionModelEvaluator by default + status: met + - id: AC-7 + title: CliJsonIO helper reads JSON from stdin-or-file and writes JSON to + status: met + - id: AC-8 + title: Exit codes follow the 0/1/2/3 contract (success / analysis-failed / + status: met + - id: AC-9 + title: Missing engine binary produces exit 2 with a readable stderr message + status: met + - id: AC-10 + title: Invalid JSON produces exit 2 with a stderr message; no partial stdout + status: met + - id: AC-11 + title: --help on any command prints command-specific usage and exits 0 + status: met + - id: AC-12 + title: Unit tests pass + status: met + - id: AC-13 + title: Integration tests pass with the Rust binary present + status: met + - id: AC-14 + title: Every reachable path in the new command classes and helpers is + status: met + - id: AC-15 + title: docs/architecture/time-machine-analysis-modes.md + status: met + - id: AC-16 + title: Program.cs PrintUsage updated with the five new commands + status: met + - id: AC-17 + title: dotnet build FlowTime.sln green + status: met + - id: AC-18 + title: dotnet test FlowTime.sln all green + status: met +--- ## Goal @@ -169,41 +225,74 @@ cat optimize-spec.json | flowtime optimize The JSON path is pipeline-native: compose with `jq`, store specs as fixtures, invoke from Azure Functions custom handlers, share spec files with the API. 
-## Acceptance Criteria - -- [x] Five CLI commands (`validate`, `sweep`, `sensitivity`, `goal-seek`, `optimize`) wired into `Program.cs` router -- [x] Each command parses `--spec` / stdin, `--output` / stdout, `--no-session`, `--engine`, `--help` -- [x] `validate` reads YAML (not JSON) via `--model` / stdin; outputs `ValidationResult` as JSON -- [x] Each analysis command reads its matching `*Spec` as JSON and writes its matching result as JSON, byte-compatible with the corresponding `/v1/` endpoint -- [x] `CliEngineSetup` helper resolves binary path via `--engine` → `FLOWTIME_RUST_BINARY` → solution-relative default → `$PATH` -- [x] `CliEngineSetup` constructs `SessionModelEvaluator` by default; `--no-session` selects `RustModelEvaluator` -- [x] `CliJsonIO` helper reads JSON from stdin-or-file and writes JSON to stdout-or-file with camelCase / web defaults matching the API; `JsonStringEnumConverter` added so `objective: "minimize"` etc. deserialize correctly -- [x] Exit codes follow the 0/1/2/3 contract (success / analysis-failed / input-error / engine-error) -- [x] Missing engine binary produces exit 2 with a readable stderr message -- [x] Invalid JSON produces exit 2 with a stderr message; no partial stdout -- [x] `--help` on any command prints command-specific usage and exits 0 -- [x] Unit tests pass: 72 new CLI unit tests - - 15 CliJsonIO (read/write, file/stdin, camelCase, null literal, errors) - - 14 CliCommonArgs (all flag variants, missing values, unknown flag, positional, dash-as-positional) - - 8 CliEngineSetup (path precedence, evaluator selection, disposal idempotency) - - 13 ValidateCommand (help, arg errors, tier, valid/invalid YAML, output) - - 18 AnalysisCommandTests (help for each of 4 commands, shared error paths, IsOnPath, BarePath) - - 4 deferred (covered by integration tests instead — see below) -- [x] Integration tests pass with the Rust binary present: 10 tests (TimeMachineCliIntegrationTests) - - [x] `flowtime validate` with valid and 
invalid YAML - - [x] `flowtime sweep` end-to-end producing correct series (arrivals=10,20,30 → served=5,10,15) - - [x] `flowtime sensitivity` end-to-end (∂served/∂arrivals = 0.5) - - [x] `flowtime goal-seek` end-to-end (target served=25 → arrivals≈50) - - [x] `flowtime optimize` converging on a `MAX(x-7,7-x)` bowl around arrivals=14 - - [x] Session vs. per-eval flag (`--no-session`) both work - - [x] Output to file (`-o`) matches output to stdout - - [x] Engine compile error (unknown function) produces exit 3 -- [x] Every reachable path in the new command classes and helpers is covered (line-by-line audited) -- [x] `docs/architecture/time-machine-analysis-modes.md` — new "CLI surface" section documents the five commands, JSON I/O contract, exit codes, evaluator selection, engine resolution, and pipeline composition example -- [x] `Program.cs` `PrintUsage` updated with the five new commands -- [x] `dotnet build FlowTime.sln` green -- [x] `dotnet test FlowTime.sln` all green — 1,702 passed / 9 skipped +## Acceptance criteria + +### AC-1 — Five CLI commands (validate, sweep, sensitivity, goal-seek, optimize) + +Five CLI commands (`validate`, `sweep`, `sensitivity`, `goal-seek`, `optimize`) wired into `Program.cs` router +### AC-2 — Each command parses --spec / stdin, --output / stdout, --no-session + +Each command parses `--spec` / stdin, `--output` / stdout, `--no-session`, `--engine`, `--help` +### AC-3 — validate reads YAML (not JSON) via --model / stdin; outputs + +`validate` reads YAML (not JSON) via `--model` / stdin; outputs `ValidationResult` as JSON +### AC-4 — Each analysis command reads its matching *Spec as JSON and writes its + +Each analysis command reads its matching `*Spec` as JSON and writes its matching result as JSON, byte-compatible with the corresponding `/v1/` endpoint +### AC-5 — CliEngineSetup helper resolves binary path via --engine → + +`CliEngineSetup` helper resolves binary path via `--engine` → `FLOWTIME_RUST_BINARY` → solution-relative 
default → `$PATH` +### AC-6 — CliEngineSetup constructs SessionModelEvaluator by default + +`CliEngineSetup` constructs `SessionModelEvaluator` by default; `--no-session` selects `RustModelEvaluator` +### AC-7 — CliJsonIO helper reads JSON from stdin-or-file and writes JSON to + +`CliJsonIO` helper reads JSON from stdin-or-file and writes JSON to stdout-or-file with camelCase / web defaults matching the API; `JsonStringEnumConverter` added so `objective: "minimize"` etc. deserialize correctly +### AC-8 — Exit codes follow the 0/1/2/3 contract (success / analysis-failed / + +Exit codes follow the 0/1/2/3 contract (success / analysis-failed / input-error / engine-error) +### AC-9 — Missing engine binary produces exit 2 with a readable stderr message + +### AC-10 — Invalid JSON produces exit 2 with a stderr message; no partial stdout + +### AC-11 — --help on any command prints command-specific usage and exits 0 + +`--help` on any command prints command-specific usage and exits 0 +### AC-12 — Unit tests pass + +Unit tests pass: 72 new CLI unit tests +- 15 CliJsonIO (read/write, file/stdin, camelCase, null literal, errors) +- 14 CliCommonArgs (all flag variants, missing values, unknown flag, positional, dash-as-positional) +- 8 CliEngineSetup (path precedence, evaluator selection, disposal idempotency) +- 13 ValidateCommand (help, arg errors, tier, valid/invalid YAML, output) +- 18 AnalysisCommandTests (help for each of 4 commands, shared error paths, IsOnPath, BarePath) +- 4 deferred (covered by integration tests instead — see below) +### AC-13 — Integration tests pass with the Rust binary present + +Integration tests pass with the Rust binary present: 10 tests (TimeMachineCliIntegrationTests) +- [x] `flowtime validate` with valid and invalid YAML +- [x] `flowtime sweep` end-to-end producing correct series (arrivals=10,20,30 → served=5,10,15) +- [x] `flowtime sensitivity` end-to-end (∂served/∂arrivals = 0.5) +- [x] `flowtime goal-seek` end-to-end (target served=25 → 
arrivals≈50) +- [x] `flowtime optimize` converging on a `MAX(x-7,7-x)` bowl around arrivals=14 +- [x] Session vs. per-eval flag (`--no-session`) both work +- [x] Output to file (`-o`) matches output to stdout +- [x] Engine compile error (unknown function) produces exit 3 +### AC-14 — Every reachable path in the new command classes and helpers is + +Every reachable path in the new command classes and helpers is covered (line-by-line audited) +### AC-15 — docs/architecture/time-machine-analysis-modes.md + +`docs/architecture/time-machine-analysis-modes.md` — new "CLI surface" section documents the five commands, JSON I/O contract, exit codes, evaluator selection, engine resolution, and pipeline composition example +### AC-16 — Program.cs PrintUsage updated with the five new commands + +`Program.cs` `PrintUsage` updated with the five new commands +### AC-17 — dotnet build FlowTime.sln green + +`dotnet build FlowTime.sln` green +### AC-18 — dotnet test FlowTime.sln all green +`dotnet test FlowTime.sln` all green — 1,702 passed / 9 skipped ## Coverage notes **Covered:** every reachable branch in the command classes, helpers, and the `AnalysisCliRunner` shared path. The 89 CLI unit tests explicitly exercise: @@ -230,12 +319,12 @@ This is a single acceptable gap for platform-edge behavior. 
## Dependencies -- m-E18-06 (TimeMachineValidator) — delivered -- m-E18-09 (`IModelEvaluator`, `SweepRunner`, `ConstNodePatcher`) — delivered -- m-E18-10 (`SensitivityRunner`, `ConstNodeReader`) — delivered -- m-E18-11 (`GoalSeeker`) — delivered -- m-E18-12 (`Optimizer`, `OptimizeSpec`) — delivered -- m-E18-13 (`SessionModelEvaluator`, evaluator config switch) — delivered +- M-003 (TimeMachineValidator) — delivered +- M-006 (`IModelEvaluator`, `SweepRunner`, `ConstNodePatcher`) — delivered +- M-007 (`SensitivityRunner`, `ConstNodeReader`) — delivered +- M-008 (`GoalSeeker`) — delivered +- M-009 (`Optimizer`, `OptimizeSpec`) — delivered +- M-010 (`SessionModelEvaluator`, evaluator config switch) — delivered ## Risks / notes @@ -243,7 +332,7 @@ This is a single acceptable gap for platform-edge behavior. commands should follow the same convention; don't introduce a parsing library in this milestone. A future cleanup epic can migrate all commands to `System.CommandLine`. - **Test isolation.** Integration tests spawn the Rust engine subprocess per test. Same - skip-if-missing pattern as m-E18-13 integration tests. + skip-if-missing pattern as M-010 integration tests. - **Stdin handling in tests.** Tests should not actually redirect Console.In — use the `--spec ` flag with a temp file for test inputs. Reserve stdin testing for a single smoke test that sets `Console.SetIn`. 
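The 0/1/2/3 exit-code contract from the acceptance criteria can be sketched as follows. This is an illustrative Python outline, not the `FlowTime.Cli` implementation: `run_command`, `AnalysisFailed`, and `EngineFailure` are hypothetical names, and which conditions count as "analysis-failed" is an assumption.

```python
import json
from enum import IntEnum
from typing import Any, Callable, TextIO


class ExitCode(IntEnum):
    SUCCESS = 0
    ANALYSIS_FAILED = 1
    INPUT_ERROR = 2
    ENGINE_ERROR = 3


class AnalysisFailed(Exception):
    """Hypothetical: the analysis ran but did not produce a usable answer."""


class EngineFailure(Exception):
    """Hypothetical: the engine rejected the model or failed while evaluating."""


def run_command(
    spec_text: str,
    analyze: Callable[[Any], Any],
    out: TextIO,
    err: TextIO,
) -> int:
    """Parse the spec, run the analysis, and map each outcome to an exit code."""
    try:
        spec = json.loads(spec_text)
    except json.JSONDecodeError as exc:
        print(f"invalid spec JSON: {exc}", file=err)
        return ExitCode.INPUT_ERROR
    try:
        result = analyze(spec)
    except AnalysisFailed as exc:
        print(f"analysis failed: {exc}", file=err)
        return ExitCode.ANALYSIS_FAILED
    except EngineFailure as exc:
        print(f"engine error: {exc}", file=err)
        return ExitCode.ENGINE_ERROR
    # Serialize fully before writing so an error never leaves partial stdout.
    text = json.dumps(result)
    out.write(text + "\n")
    return ExitCode.SUCCESS
```

Keeping diagnostics on stderr and results on stdout is what lets the commands compose with tools such as `jq` without the caller having to filter error text out of the JSON stream.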
diff --git a/work/epics/E-18-headless-pipeline-and-optimization/e18-gap-analysis.md b/work/epics/E-18-headless-pipeline-and-optimization/e18-gap-analysis.md deleted file mode 100644 index f945a346..00000000 --- a/work/epics/E-18-headless-pipeline-and-optimization/e18-gap-analysis.md +++ /dev/null @@ -1,145 +0,0 @@ -# E-18 Time Machine — Gap Analysis - -**Date:** 2026-04-13 -**Branch:** `epic/E-18-time-machine` -**Against:** `spec.md` (original) and `milestone-plan-v2.md` (Rust-engine replan) - ---- - -## Delivered milestones - -| ID | Title | Status | Notes | -|----|-------|--------|-------| -| m-E18-01 | Parameterized Evaluation (Rust) | ✅ complete | ParamTable, evaluate_with_params, compile-once eval-many in Rust | -| m-E18-02 | Engine Session + Streaming Protocol (Rust) | ✅ complete | Persistent process, MessagePack over stdin/stdout, session holds compiled plan | -| m-E18-06 | Tiered Validation | ✅ complete | TimeMachineValidator (schema/compile/analyse), POST /v1/validate, Rust validate_schema session command | -| m-E18-07 | Generator → TimeMachine | ✅ complete | Rename + restructure; FlowTime.Generator deleted; rg confirms zero residual references | -| m-E18-08 | ITelemetrySource Contract | ✅ complete | Interface + CanonicalBundleSource + FileCsvSource; 23 tests | -| m-E18-09 | Parameter Sweep (.NET) | ✅ complete | SweepSpec/SweepRunner/ConstNodePatcher, IModelEvaluator/RustModelEvaluator, POST /v1/sweep; 35 tests | -| m-E18-10 | Sensitivity Analysis (.NET) | ✅ complete | ConstNodeReader, SensitivitySpec/SensitivityRunner (central difference), POST /v1/sensitivity; 39 tests | -| m-E18-11 | Goal Seeking (.NET) | ✅ complete | GoalSeekSpec/GoalSeeker (bisection), POST /v1/goal-seek; 33 tests. 
**Not in original spec or v2 plan — added as intermediate mode.** | -| m-E18-12 | Optimization (.NET) | ✅ complete | OptimizeSpec/Optimizer (Nelder-Mead, N params), POST /v1/optimize; 29 unit + 10 API tests | - ---- - -## Spec success criteria — status per criterion - -| Criterion | Status | Detail | -|-----------|--------|--------| -| Time Machine callable as `.NET` CLI pipeline: `cat model.yaml \| flowtime evaluate` | ❌ not built | `FlowTime.Cli` has no validate/sweep/optimize/sensitivity commands. The Rust CLI (`flowtime-engine eval/validate/plan`) exists but is a lower-level tool, not the Time Machine pipeline CLI the spec describes. | -| Tiered validation via SDK, CLI, and **sidecar** | ⚠️ partial | SDK ✅ (`TimeMachineValidator`). Sidecar ✅ (Rust `validate_schema` session command). CLI ❌ (no .NET CLI surface). | -| Parameter sweeps without recompilation per evaluation | ⚠️ partial | The Rust engine session (m-E18-02) enables compile-once/eval-many. But `RustModelEvaluator` spawns `flowtime-engine eval` **once per point** — a fresh compile + eval per evaluation. The session-based evaluator that would use m-E18-02's protocol was not built. The `IModelEvaluator` seam exists for future substitution. | -| Optimization finds parameter values satisfying objective + **constraints** | ⚠️ partial | Nelder-Mead optimizer delivered (m-E18-12). Constraints (`max(utilization) < 0.8` etc.) not implemented — explicitly deferred in m-E18-12. | -| **Model fitting** against parity-validated real telemetry | ❌ not built | Hard prerequisite is Telemetry Loop & Parity epic, which is not started. `ITelemetrySource` and `Optimizer` (the inner loop) exist but `FitSpec`/`FitRunner`/`POST /v1/fit` are not assembled. | -| **Chunked evaluation** for feedback simulation | ❌ not built | Explicitly deferred in spec ("only after the stateful execution seam exists"). 
| -| FlowTime.Generator deleted; rg returns zero matches | ✅ done | m-E18-07 | -| Canonical run directory preserved, clear-text | ✅ done | Unchanged. | -| Canonical bundle + CanonicalBundleSource round-trip consistent | ⚠️ partial | ITelemetrySource/CanonicalBundleSource delivered (m-E18-08). Round-trip parity not validated — that is owned by the Telemetry Loop & Parity epic (not started). | -| ITelemetrySource with ≥2 implementations | ✅ done | CanonicalBundleSource + FileCsvSource (m-E18-08). | -| ITelemetrySink not introduced on speculation | ✅ done | Correctly deferred. | -| No external format code inside FlowTime.TimeMachine | ✅ done | Confirmed by `rg -i "prometheus\|opentelemetry\|otel" src/FlowTime.TimeMachine/` → zero matches. | -| POST /telemetry/captures still works after Generator deletion | ✅ done | TelemetryCaptureEndpoints.cs wired to TimeMachine; endpoint exists and is tested. | - ---- - -## Unbuilt scope from spec and v2 plan - -### 1. m-E18-03 — Rust-native sweep (v2 plan) - -The v2 milestone plan included `m-E18-03: Parameter Sweep` as a Rust-native `flowtime-engine sweep` batch mode: N evaluations via compile-once/eval-many without any .NET layer. This was never built. Instead, m-E18-09 built a .NET sweep that calls `flowtime-engine eval` as a fresh subprocess per point. - -**Impact:** For small sweeps (≤ 20 points) the overhead is acceptable. For large sweeps (100+ points) each spawn is ~100–500ms of compile overhead. The `IModelEvaluator` interface makes a future `SessionModelEvaluator` a drop-in replacement without changing `SweepRunner`, `SensitivityRunner`, `GoalSeeker`, or `Optimizer`. - -**Remaining work:** Implement `SessionModelEvaluator : IModelEvaluator` that connects to the Rust engine session protocol (m-E18-02), compiles once, and calls `eval` with parameter overrides. No analysis class changes required. - -### 2. Session-based evaluator (.NET → Rust session bridge) - -The Rust engine session (m-E18-02) is built and working. 
The .NET analysis layer does not use it — `RustModelEvaluator` calls `flowtime-engine eval` (stateless, one compile per point) rather than `flowtime-engine session` (stateful, compile once). The compile-once/eval-many performance benefit of m-E18-01 and m-E18-02 is therefore not yet accessible from the .NET analysis modes. - -**Remaining work:** `SessionModelEvaluator : IModelEvaluator` — wraps a persistent engine session process, sends `compile` once, sends `eval` with YAML patch per evaluation point. - -### 3. Model fitting (m-E18-04 / m-E18-05 in v2) - -Requires: (a) `FitSpec` / `FitRunner` / `POST /v1/fit` that wires `ITelemetrySource` + `Optimizer` with residual as objective, and (b) Telemetry Loop & Parity epic as hard prerequisite for validated results. - -**Remaining work:** Blocked on Telemetry Loop & Parity. The infrastructure (`ITelemetrySource`, `Optimizer`) exists but the composition does not. - -### 4. Optimization constraints - -The spec describes `--constraint "max(node.queue.utilization) < 0.8"`. `OptimizeSpec` has no constraint field. Explicitly deferred in m-E18-12. - -**Remaining work:** `ConstraintSpec` + constraint evaluation inside the Nelder-Mead loop (penalty method or projection). Future milestone. - -### 5. Monte Carlo - -Mode 5 in the spec. Not in v2 milestone plan but in original epic scope. Requires sampling parameters from distributions, evaluating N times, characterizing output distribution (mean, variance, percentiles). Would compose `IModelEvaluator` with a sampling layer. - -**Remaining work:** Not started. Lower priority than fitting. - -### 6. Chunked evaluation - -Mode 6 in the spec. Explicitly deferred: "only after a dedicated streaming/stateful execution seam exists." The Rust engine session (m-E18-02) is the seam, but the chunk-step protocol on top of it is not designed. - -**Remaining work:** Deferred. Needs a stateful chunk-step session command in the Rust engine. - -### 7. 
FlowTime.Pipeline embeddable SDK project - -The spec architecture shows a new `FlowTime.Pipeline` project as a thin wrapper over the Time Machine exposing `Sweep`, `Optimize`, `Fit`, `Sensitivity`, `MonteCarlo`, `ChunkedEvaluate` as a clean programmatic API. Not built. The current surface is `FlowTime.TimeMachine.Sweep.*` directly, which works but is not the clean SDK layer the spec describes. - -**Remaining work:** Depends on whether the analysis modes stabilize first. Low priority until Fit is delivered. - -### 8. FlowTime.Telemetry.* adapter projects - -Prometheus, OpenTelemetry, BPI event log `ITelemetrySource` implementations. Not built. - -These are **direct-source bypasses** of the E-15 Gold Builder pipeline: they implement `ITelemetrySource` directly against a live source (e.g., PromQL queries, OTEL collector, BPI event log) without writing a canonical bundle to disk. They are alternative entry points, not part of E-15 scope. E-15 provides the general batch path (raw → Gold → `CanonicalBundleSource` → `ITelemetrySource`); adapters are narrower shortcuts for clients already on specific telemetry stacks. Either path feeds the same `ITelemetrySource` interface, so Fit/optimization code is source-agnostic. - -**Remaining work:** Deferred. Build when a concrete client need surfaces (e.g., a Prometheus-native customer). Until then, E-15 Gold Builder covers the general case. - -### 9. .NET Time Machine CLI - -The spec success criterion: `cat model.yaml | flowtime evaluate --params '{"parallelism":4}' | jq ...`. `FlowTime.Cli` has no Time Machine commands. The Rust CLI (`flowtime-engine eval/validate/plan`) works at a lower level. No `dotnet flowtime validate/sweep/optimize` commands exist. - -**Remaining work:** Add Time Machine commands to `FlowTime.Cli` that call `TimeMachineValidator`, `SweepRunner`, etc. with JSON I/O. Relatively mechanical once the analysis modes are stable. 
- ---- - -## What was added that is NOT in the spec - -**m-E18-11 Goal Seeking** — bisection-based 1D parameter-to-target-metric search. Not in original spec or v2 plan. Added as a natural intermediate capability between sensitivity (which tells you gradient) and optimization (which searches N-D space). Correctly fills a gap in the analysis ladder. - ---- - -## Verdict - -E-18 is **in-progress**. The analysis layer is substantially complete: sweep, sensitivity, goal-seek, and optimization are all built, tested to 100% branch coverage, and exposed as API endpoints. The Rust foundation (parameterized evaluation + session protocol) is solid. - -The remaining work splits into three buckets: - -**Blocked on prerequisites:** -- Model fitting — blocked on Telemetry Loop & Parity - -**Buildable now (independent work):** -- Session-based evaluator (`SessionModelEvaluator`) — unlocks compile-once performance for all analysis modes -- .NET Time Machine CLI commands -- Optimization constraints - -**Explicitly deferred by spec:** -- Chunked evaluation -- Monte Carlo -- FlowTime.Pipeline SDK project -- FlowTime.Telemetry.* adapters (direct-source bypasses of E-15's Gold Builder path) - ---- - -## Recommended next milestones - -| Priority | Milestone | Unblocked? 
| -|----------|-----------|------------| -| High | `SessionModelEvaluator` — session-based evaluator bridging m-E18-02 to the .NET analysis layer | Yes | -| High | Model fitting (`FitSpec`/`FitRunner`/`POST /v1/fit`) | No — blocked on Telemetry Loop & Parity | -| Medium | .NET CLI commands for validate/sweep/sensitivity/goal-seek/optimize | Yes | -| Medium | Optimization constraints (penalty method) | Yes | -| Low | Monte Carlo | Yes | -| Deferred | Chunked evaluation | No — needs stateful session design | -| Deferred | FlowTime.Pipeline SDK | After fitting stabilizes | diff --git a/work/epics/E-18-headless-pipeline-and-optimization/spec.md b/work/epics/E-18-headless-pipeline-and-optimization/epic.md similarity index 88% rename from work/epics/E-18-headless-pipeline-and-optimization/spec.md rename to work/epics/E-18-headless-pipeline-and-optimization/epic.md index fbf6d697..99177bf0 100644 --- a/work/epics/E-18-headless-pipeline-and-optimization/spec.md +++ b/work/epics/E-18-headless-pipeline-and-optimization/epic.md @@ -1,7 +1,8 @@ -# Epic: E-18 Time Machine - -**ID:** E-18 -**Status:** in-progress — 11 of 13 planned milestones delivered (m-E18-01/02/06/07/08/09/10/11/12/13/14); Model Fit (m-E18-XX) blocked on E-15 + Telemetry Loop & Parity; Chunked Evaluation (m-E18-05) deferred per D-2026-04-15-032. +--- +id: E-18 +title: Time Machine +status: done +--- > **Naming note.** This epic was originally filed as "Headless Pipeline and Optimization." The component is now named `FlowTime.TimeMachine` (the Time Machine). The directory path `work/epics/E-18-headless-pipeline-and-optimization/` is preserved for historical stability; cross-doc references use that path. The decision is recorded in `work/decisions.md` and in `work/epics/E-19-surface-alignment-and-compatibility-cleanup/m-E19-01-supported-surface-inventory.md` (A6 + shared framing). 
@@ -44,7 +45,7 @@ Every advanced use case is a composition of "call the evaluation function with d - **Time Machine CLI mode** — FlowTime as a pipeline-friendly command: model + params in, results out (JSON/CSV) - **Shared runtime parameter foundation** — compiled parameter identities, override points, reevaluation API, and optional enrichment contract for template-authored parameter metadata reused by E-17 - **Iteration protocol** — keep compiled graph alive, accept new parameter sets per iteration without recompile -- **Tiered validation as a first-class operation** — schema / compile / analyse tiers callable from the same Time Machine surface as compile/evaluate/reevaluate. Client-agnostic: Sim UI, Blazor UI, Svelte UI, MCP servers, external AI agents, tests, and CI are all first-class callers on equal footing. Detailed requirement below in *Tiered validation (required scope)*. Originates from E-19 m-E19-01 decision A6. +- **Tiered validation as a first-class operation** — schema / compile / analyse tiers callable from the same Time Machine surface as compile/evaluate/reevaluate. Client-agnostic: Sim UI, Blazor UI, Svelte UI, MCP servers, external AI agents, tests, and CI are all first-class callers on equal footing. Detailed requirement below in *Tiered validation (required scope)*. Originates from E-19 M-024 decision A6. - **Parameter sweep** — evaluate over a grid of parameter values - **Optimization** — find parameter values that minimize/maximize an objective subject to constraints - **Model fitting** — given observed telemetry, calibrate model parameters to match (system identification) @@ -72,7 +73,7 @@ To minimize risk, execute this epic in three layers: ## Tiered validation (required scope) -**Origin.** This requirement was decided in E-19 m-E19-01 as decision A6. 
E-19 retires the existing `POST /api/v1/drafts/validate` endpoint (Sim-private, mislabeled, unused by any UI, only exercised by tests) and records a hard dependency that E-18 must expose validation as a first-class, client-agnostic Time Machine operation alongside compile, evaluate, reevaluate, parameter override, and artifact write. +**Origin.** This requirement was decided in E-19 M-024 as decision A6. E-19 retires the existing `POST /api/v1/drafts/validate` endpoint (Sim-private, mislabeled, unused by any UI, only exercised by tests) and records a hard dependency that E-18 must expose validation as a first-class, client-agnostic Time Machine operation alongside compile, evaluate, reevaluate, parameter override, and artifact write. **Principle.** Validation — answering *"is this YAML a correct FlowTime model?"* — is a first-class, client-agnostic operation. `FlowTime.Core` owns the authoritative answer. The Time Machine surfaces it. No single client is privileged as the validation host. Sim UI, Blazor UI, Svelte UI, MCP servers, external AI agents, tests, and CI are all first-class callers on equal footing. @@ -220,7 +221,7 @@ This matches today's capture-directory flow, generalised: the adapter can produc - **m-E18-01a** (Path B core cut) extracts the concrete canonical bundle writer and concrete `CanonicalBundleSource` reader from Generator into the Time Machine, alongside the execution-pipeline extraction. No `ITelemetrySource` interface yet — the reader is a concrete class. This is enough to enable the telemetry loop end-to-end over the canonical bundle format using today's existing capture and replay code, just rehosted. - **m-E18-01b** (Tiered Validation & Telemetry Source Contract) introduces `ITelemetrySource` as the formal interface. Lifts `CanonicalBundleSource` to implement it. Adds `FileCsvSource` as a second implementation. Tiered validation lands in this milestone. 
-- **m-E18-06** (reshaped from "Telemetry I/O" to "Telemetry Ingestion Source Adapters") delivers source-only adapters for real-world formats. Depends on the `ITelemetrySource` contract from 01b. No Time Machine changes. No sinks. Specific formats (Prometheus, OTEL, BPI event logs, GTFS, …) chosen when the milestone is scheduled, not now. +- **M-003** (reshaped from "Telemetry I/O" to "Telemetry Ingestion Source Adapters") delivers source-only adapters for real-world formats. Depends on the `ITelemetrySource` contract from 01b. No Time Machine changes. No sinks. Specific formats (Prometheus, OTEL, BPI event logs, GTFS, …) chosen when the milestone is scheduled, not now. - **m-E18-04** (Optimization & Fitting) depends on the Telemetry Loop & Parity epic being complete — optimization against real telemetry requires measured drift bounds. This is a hard prerequisite, not a soft one. ### Non-goals for E-18 @@ -298,7 +299,7 @@ Core and Time Machine are strictly layered. The dependency direction is **Time M ## Generator migration (Path B: extraction and deletion) -**Origin.** This is recorded as decision D-2026-04-07-019 and referenced from E-19 m-E19-01's shared framing (item 3). The forward fate of `FlowTime.Generator` was previously left implicit. +**Origin.** This is recorded as decision D-032 and referenced from E-19 M-024's shared framing (item 3). The forward fate of `FlowTime.Generator` was previously left implicit. **Current state (pre-E-18).** `FlowTime.Generator` is the shared orchestration layer used by both `FlowTime.Sim.Service` and `FlowTime.API`. It owns `RunOrchestrationService`, `RunArtifactWriter`, deterministic run ID logic, RNG seeding, and dry-run/plan mode. Sim.Service does not reference `FlowTime.Core` directly — only via Generator. API references both Core and Generator. 
@@ -359,29 +360,29 @@ Plan v2 (2026-04-10): once the Rust engine (E-20) became the evaluation path, th | ID | Title | Status | Summary | |----|-------|--------|---------| -| m-E18-01 | Parameterized Evaluation (Rust) | **complete** (merged to main 2026-04-10) | `ParamTable` in Plan. Compiler extracts tweakable parameters from const nodes, traffic arrivals, WIP limits. `evaluate_with_params(plan, overrides)` pure function. Parameter metadata (id, kind, default, bounds). Foundation for everything that follows. | -| m-E18-02 | Engine Session + Streaming Protocol (Rust) | **complete** (merged to main 2026-04-10) | `flowtime-engine session` persistent CLI mode. Length-prefixed MessagePack over stdin/stdout. Commands: `compile`, `eval`, `patch`, `get_params`, `get_series`, `validate_schema`. Session holds compiled Plan + current state. | -| m-E18-06 | Tiered Validation | **complete** (merged to main) | `TimeMachineValidator` (schema / compile / analyse tiers); `POST /v1/validate`; Rust `validate_schema` session command. Satisfies E-19 m-E19-01 A6 (D-2026-04-07-017). | -| m-E18-07 | FlowTime.TimeMachine Extraction (Path B) | **complete** (merged to main) | `FlowTime.TimeMachine` project created; `FlowTime.Generator` deleted outright. Path B, no coexistence window. Per D-2026-04-07-019. | -| m-E18-08 | Telemetry Source Contract | **complete** (merged to main) | `ITelemetrySource` interface + `CanonicalBundleSource` + `FileCsvSource`. 23 tests. `ITelemetrySink` explicitly **not** introduced — see D-2026-04-07-020. | -| m-E18-09 | Parameter Sweep | **complete** (merged to main) | `SweepSpec`/`SweepRunner`/`ConstNodePatcher`; `IModelEvaluator` / `RustModelEvaluator`; `POST /v1/sweep`. 35 tests. | -| m-E18-10 | Sensitivity Analysis | **complete** (merged to main) | `ConstNodeReader`; `SensitivitySpec`/`SensitivityRunner` (central difference); `POST /v1/sensitivity`. 39 tests. 
| -| m-E18-11 | Goal Seeking | **complete** (merged to main) | `GoalSeekSpec`/`GoalSeeker` (bisection); `POST /v1/goal-seek`. 33 tests. (Added 2026-04; not in original plan.) | -| m-E18-12 | Multi-parameter Optimization | **complete** (merged to main) | `OptimizeSpec`/`Optimizer` (Nelder-Mead, N parameters); `POST /v1/optimize`. 29 unit + 10 API tests. | -| m-E18-13 | SessionModelEvaluator | **complete** (merged to epic 2026-04-15) | Persistent `flowtime-engine session` subprocess; MessagePack over stdin/stdout; compile-once/eval-many. `RustEngine:UseSession` config switch (default true); `RustModelEvaluator` retained as fallback. 44 new tests. | -| m-E18-14 | .NET Time Machine CLI | **complete** (merged to epic 2026-04-15) | `flowtime validate/sweep/sensitivity/goal-seek/optimize` as pipeable JSON-over-stdio commands byte-compatible with `/v1/` endpoints. `--no-session` fallback. 72 CLI unit + 10 integration tests. | +| M-001 | Parameterized Evaluation (Rust) | **complete** (merged to main 2026-04-10) | `ParamTable` in Plan. Compiler extracts tweakable parameters from const nodes, traffic arrivals, WIP limits. `evaluate_with_params(plan, overrides)` pure function. Parameter metadata (id, kind, default, bounds). Foundation for everything that follows. | +| M-002 | Engine Session + Streaming Protocol (Rust) | **complete** (merged to main 2026-04-10) | `flowtime-engine session` persistent CLI mode. Length-prefixed MessagePack over stdin/stdout. Commands: `compile`, `eval`, `patch`, `get_params`, `get_series`, `validate_schema`. Session holds compiled Plan + current state. | +| M-003 | Tiered Validation | **complete** (merged to main) | `TimeMachineValidator` (schema / compile / analyse tiers); `POST /v1/validate`; Rust `validate_schema` session command. Satisfies E-19 M-024 A6 (D-030). | +| M-004 | FlowTime.TimeMachine Extraction (Path B) | **complete** (merged to main) | `FlowTime.TimeMachine` project created; `FlowTime.Generator` deleted outright. 
Path B, no coexistence window. Per D-032. | +| M-005 | Telemetry Source Contract | **complete** (merged to main) | `ITelemetrySource` interface + `CanonicalBundleSource` + `FileCsvSource`. 23 tests. `ITelemetrySink` explicitly **not** introduced — see D-033. | +| M-006 | Parameter Sweep | **complete** (merged to main) | `SweepSpec`/`SweepRunner`/`ConstNodePatcher`; `IModelEvaluator` / `RustModelEvaluator`; `POST /v1/sweep`. 35 tests. | +| M-007 | Sensitivity Analysis | **complete** (merged to main) | `ConstNodeReader`; `SensitivitySpec`/`SensitivityRunner` (central difference); `POST /v1/sensitivity`. 39 tests. | +| M-008 | Goal Seeking | **complete** (merged to main) | `GoalSeekSpec`/`GoalSeeker` (bisection); `POST /v1/goal-seek`. 33 tests. (Added 2026-04; not in original plan.) | +| M-009 | Multi-parameter Optimization | **complete** (merged to main) | `OptimizeSpec`/`Optimizer` (Nelder-Mead, N parameters); `POST /v1/optimize`. 29 unit + 10 API tests. | +| M-010 | SessionModelEvaluator | **complete** (merged to epic 2026-04-15) | Persistent `flowtime-engine session` subprocess; MessagePack over stdin/stdout; compile-once/eval-many. `RustEngine:UseSession` config switch (default true); `RustModelEvaluator` retained as fallback. 44 new tests. | +| M-011 | .NET Time Machine CLI | **complete** (merged to epic 2026-04-15) | `flowtime validate/sweep/sensitivity/goal-seek/optimize` as pipeable JSON-over-stdio commands byte-compatible with `/v1/` endpoints. `--no-session` fallback. 72 CLI unit + 10 integration tests. | | m-E18-XX | Model Fit | **planned** — blocked on E-15 + Telemetry Loop & Parity | `FitSpec`/`FitRunner`/`POST /v1/fit` composing `ITelemetrySource` + `Optimizer`. Infrastructure exists; assembly requires telemetry ingestion (E-15) and parity harness first. | -| m-E18-05 | Chunked Evaluation (Mode 6) | **deferred** — after discovery pipeline works end-to-end | Bin-chunk evaluation for feedback simulation with external controllers. 
Requires a real stateful execution seam. Sequenced after Model Fit per Option A (D-2026-04-15-032). | +| m-E18-05 | Chunked Evaluation (Mode 6) | **deferred** — after discovery pipeline works end-to-end | Bin-chunk evaluation for feedback simulation with external controllers. Requires a real stateful execution seam. Sequenced after Model Fit per Option A (D-045). | ### Deferred from v1 (not on current critical path) -These v1 milestones were superseded or deferred when the Rust engine became the evaluation path. Some have since been re-admitted under different IDs (m-E18-06, m-E18-07, m-E18-08 above). +These v1 milestones were superseded or deferred when the Rust engine became the evaluation path. Some have since been re-admitted under different IDs (M-003, M-004, M-005 above). -- **m-E18-01a** Generator extraction — superseded by **m-E18-07** (same outcome, different entry point). -- **m-E18-01b** Tiered validation & telemetry source contract — split across **m-E18-06** (validation) and **m-E18-08** (telemetry source contract). -- **m-E18-01c** Runtime parameter foundation — replaced by **m-E18-01** (Rust-native, not C#). -- **m-E18-04** Optimization & Fitting as a single milestone — split into **m-E18-11** (goal seek), **m-E18-12** (N-parameter optimize), and **m-E18-XX** (model fit). -- **Telemetry Ingestion Source Adapters** (v1 m-E18-06 idea) — moved to **E-15** scope; not an E-18 milestone. +- **m-E18-01a** Generator extraction — superseded by **M-004** (same outcome, different entry point). +- **m-E18-01b** Tiered validation & telemetry source contract — split across **M-003** (validation) and **M-005** (telemetry source contract). +- **m-E18-01c** Runtime parameter foundation — replaced by **M-001** (Rust-native, not C#). +- **m-E18-04** Optimization & Fitting as a single milestone — split into **M-008** (goal seek), **M-009** (N-parameter optimize), and **m-E18-XX** (model fit). 
+- **Telemetry Ingestion Source Adapters** (v1 M-003 idea) — moved to **E-15** scope; not an E-18 milestone. ## Risks & Open Questions @@ -396,7 +397,7 @@ These v1 milestones were superseded or deferred when the Rust engine became the ## Dependencies - **E-16 Formula-First Core Purification** — must complete first. Provides the pure compiled engine that the Time Machine hosts. -- **E-19 m-E19-01 Supported Surface Inventory** — provides the A6 tiered-validation requirement, the Path B Generator extraction commitment (D-2026-04-07-019), the telemetry-as-adapter framing (D-2026-04-07-020), and the Time Machine naming decision (D-2026-04-07-018) that this epic builds on. +- **E-19 M-024 Supported Surface Inventory** — provides the A6 tiered-validation requirement, the Path B Generator extraction commitment (D-032), the telemetry-as-adapter framing (D-033), and the Time Machine naming decision (D-031) that this epic builds on. - **E-15 Telemetry Ingestion** — provides the canonical bundle schema (`docs/schemas/telemetry-manifest.schema.json`) that this epic's `CanonicalBundleSource` and canonical bundle writer must conform to. The schema already exists; this is an alignment dependency rather than a sequencing dependency. - **E-17 Interactive What-If Mode** consumes the shared runtime parameter foundation built here; it should not duplicate the runtime parameter model or reevaluation API. - **Telemetry Loop & Parity** (`work/epics/telemetry-loop-parity/spec.md`, currently unnumbered) — **hard prerequisite for m-E18-04 (Optimization & Fitting)**. Optimization and fitting against real telemetry require measured drift bounds, which only the parity harness can provide. Soft dependency for m-E18-01a through m-E18-03 (those milestones can ship without parity automation, but the loop's existence shapes the contract design in 01b). 
diff --git a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-01-parameterized-evaluation-tracking.md b/work/epics/E-18-headless-pipeline-and-optimization/m-E18-01-parameterized-evaluation-tracking.md
deleted file mode 100644
index fb87945a..00000000
--- a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-01-parameterized-evaluation-tracking.md
+++ /dev/null
@@ -1,44 +0,0 @@
-# Tracking: m-E18-01 Parameterized Evaluation
-
-**Milestone:** m-E18-01
-**Epic:** E-18 Time Machine
-**Status:** complete — merged to main 2026-04-10
-**Branch:** `milestone/m-E18-01-parameterized-evaluation`
-**Started:** 2026-04-10
-**Completed:** 2026-04-10
-
-## Progress
-
-| AC | Description | Status |
-|----|-------------|--------|
-| AC-1 | ParamTable struct in Plan | pending |
-| AC-2 | Compiler populates ParamTable | pending |
-| AC-3 | evaluate_with_params function | pending |
-| AC-4 | Equivalence (no overrides = defaults) | pending |
-| AC-5 | Full post-eval pipeline with overrides | pending |
-| AC-6 | Parameter override affects downstream | pending |
-| AC-7 | Class arrival rate override | pending |
-| AC-8 | WIP limit override | pending |
-| AC-9 | Parameter schema extraction | pending |
-| AC-10 | Compile-once eval-many pattern | pending |
-
-## Implementation Phases
-
-### Phase 1: ParamTable + evaluate_with_params (AC-1, AC-3, AC-4)
-- Add ParamTable, ParamEntry, ParamValue, ParamKind to plan.rs
-- Add evaluate_with_params to eval.rs
-- Verify equivalence: no overrides = same as evaluate
-
-### Phase 2: Compiler populates params (AC-2, AC-9)
-- Register const nodes, arrival rates, WIP limits, initial conditions
-- extract_params accessor
-
-### Phase 3: Full pipeline + override propagation (AC-5, AC-6, AC-7, AC-8)
-- eval_model_with_params entry point
-- Override propagation through class decomposition, edges, analysis
-- Class rate override + normalization invariant
-- WIP limit override + overflow
-
-### Phase 4: Compile-once eval-many (AC-10)
-- Multi-eval independence test
-- Performance verification (no recompile overhead)
diff --git a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-01-parameterized-evaluation.md b/work/epics/E-18-headless-pipeline-and-optimization/m-E18-01-parameterized-evaluation.md
deleted file mode 100644
index 7a807bf5..00000000
--- a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-01-parameterized-evaluation.md
+++ /dev/null
@@ -1,100 +0,0 @@
-# Milestone: Parameterized Evaluation
-
-**ID:** m-E18-01
-**Epic:** E-18 Time Machine
-**Status:** complete — merged to main 2026-04-10
-**Branch:** `milestone/m-E18-01-parameterized-evaluation` (off `main`)
-**Depends on:** E-20 (complete)
-
-## Goal
-
-The Rust engine can compile a model once and re-evaluate it many times with different parameter values without recompiling. This is the critical primitive that every downstream use case builds on — interactive what-if, parameter sweeps, optimization, sensitivity analysis. The Plan becomes a reusable program; parameters are its inputs.
-
-## Context
-
-The current `compile(model) → Plan` bakes all constants into `Op::Const { out, values }` at compile time. To change an arrival rate from 10 to 15, you must recompile the entire model. Compilation is O(nodes) with topological sorting, expression parsing, and constraint resolution — unnecessary work when only a scalar value changed.
-
-After this milestone, the Plan carries a `ParamTable` that lists every user-visible constant. `evaluate_with_params(plan, overrides)` writes overrides into the state matrix before the eval loop, then runs the same bin-major evaluation. The Plan is immutable and shareable; only the parameter values change.
-
-### Where constants come from in the compiler
-
-The compiler creates `Op::Const` from seven sources:
-
-| Source | Example | Parameter? |
-|--------|---------|-----------|
-| `kind: const` node values | `values: [10, 20, 30]` | Yes — primary user input |
-| Traffic arrival `ratePerBin` | `ratePerBin: 20` | Yes — class arrival rate |
-| PMF expected value | `pmf: { values, probabilities }` | Yes — derived from PMF definition |
-| WIP limit scalar | `wipLimit: 50` | Yes — topology constraint |
-| Queue initial condition | `initialCondition: { queueDepth: 5 }` | Yes — initial state |
-| Expression literal | `8` in `MIN(arrivals, 8)` | Yes — inline constant in formula |
-| Compiler-generated temps | Internal proportional alloc, router weight columns | No — derived, not user-visible |
-
-The distinction: a parameter is a constant that traces back to a user-authored value in the model YAML. Compiler-generated intermediate constants (temp columns, normalized weights) are NOT parameters.
-
-## Acceptance Criteria
-
-1. **AC-1: ParamTable struct.** `Plan` gains a `params: ParamTable` field. `ParamTable` contains a `Vec` where each entry has:
-   - `id: String` — stable identifier matching the model YAML source (e.g., `"arrivals"` for a const node, `"arrivals.Order"` for a traffic class rate, `"Queue.wipLimit"` for a topology WIP limit)
-   - `column: usize` — the column index in the state matrix this parameter fills
-   - `default: ParamValue` — original value from the model (`Scalar(f64)` for uniform, `Vector(Vec)` for per-bin)
-   - `kind: ParamKind` — `ConstNode`, `ArrivalRate`, `WipLimit`, `InitialCondition`, `ExprLiteral`
-
-2. **AC-2: Compiler populates ParamTable.** The compiler registers parameters for:
-   - Every `kind: const` node (id = node id, value from `values` field)
-   - Every `traffic.arrivals` entry with `ratePerBin` (id = `"{nodeId}.{classId}"`)
-   - Every topology node with scalar `wipLimit` (id = `"{topoNodeId}.wipLimit"`)
-   - Every topology node with `initialCondition.queueDepth` (id = `"{topoNodeId}.init"`)
-   - Expression literals are NOT parameters (they're inline formula constants, not model inputs)
-
-3. **AC-3: `evaluate_with_params` function.** New public function:
-   ```rust
-   pub fn evaluate_with_params(plan: &Plan, overrides: &[(String, ParamValue)]) -> Vec
-   ```
-   - Applies overrides to matching param IDs before the eval loop
-   - `Scalar(v)` fills all bins with `v`; `Vector(vs)` writes per-bin values
-   - Unmatched override IDs are ignored (forward-compatible)
-   - Unknown param IDs do not cause errors
-   - Returns the filled state matrix (same shape as `evaluate`)
-
-4. **AC-4: Equivalence.** `evaluate_with_params(plan, &[])` (no overrides) produces identical results to `evaluate(plan)`. A Rust test asserts bitwise equality.
-
-5. **AC-5: Full post-eval pipeline.** `eval_model` is refactored to accept optional overrides. When overrides are provided, it calls `evaluate_with_params` instead of `evaluate`, then runs the same post-eval pipeline: class decomposition normalization, proportional allocation propagation, edge series computation, analysis warnings. A new public entry point:
-   ```rust
-   pub fn eval_model_with_params(
-       model: &ModelDefinition,
-       overrides: &[(String, ParamValue)]
-   ) -> Result
-   ```
-
-6. **AC-6: Parameter override affects downstream.** Overriding a const node's value propagates through all downstream expressions, queue recurrences, per-class decomposition, and edge series. Test: override `arrivals` from 10 to 20 → verify `served`, `queue_depth`, per-class series, and edge flow all change correctly.
-
-7. **AC-7: Class arrival rate override.** Overriding a class arrival rate (e.g., `"arrivals.Order"` from 6 to 12) changes the class fraction and propagates through normalization and downstream decomposition. Test: change one class rate, verify normalization invariant still holds.
-
-8. **AC-8: WIP limit override.** Overriding `"{topoNodeId}.wipLimit"` changes the queue's WIP limit and affects overflow. Test: lower WIP limit → verify overflow increases.
-
-9. **AC-9: Parameter schema extraction.** New public function:
-   ```rust
-   pub fn extract_params(plan: &Plan) -> &ParamTable
-   ```
-   Returns the plan's parameter table. Clients use this to discover what can be tweaked, with IDs, kinds, and defaults. This is what the UI will use to auto-generate controls.
-
-10. **AC-10: Compile-once, eval-many pattern.** Demonstrate the pattern with a Rust test that compiles once, evaluates 10 times with different arrival rates, and verifies each result is independent (no state leakage between evaluations). Measure that subsequent evals are faster than the first (no recompilation).
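A minimal, self-contained sketch of the override semantics described in AC-3, AC-4, and AC-10 follows. It is not the engine's code: the three-column `Plan`, the hard-coded `served = min(arrivals, capacity)` step, and the helpers `write_column`, `toy_plan`, and `sweep_arrivals` are hypothetical stand-ins, and keeping the default for bins that a short `Vector` does not cover is an assumption the criteria do not state.

```rust
/// Value supplied for one parameter: one scalar for every bin, or per-bin values.
#[derive(Clone, Debug, PartialEq)]
pub enum ParamValue {
    Scalar(f64),
    Vector(Vec<f64>),
}

/// One tweakable constant: stable id, the state-matrix column it fills, its default.
pub struct ParamEntry {
    pub id: String,
    pub column: usize,
    pub default: ParamValue,
}

/// Toy stand-in for the compiled Plan (hypothetical layout).
/// Columns: 0 = arrivals, 1 = capacity, 2 = served.
pub struct Plan {
    pub bins: usize,
    pub columns: usize,
    pub params: Vec<ParamEntry>,
}

/// Write a parameter value into one column of the row-major state matrix.
fn write_column(state: &mut [f64], plan: &Plan, column: usize, value: &ParamValue) {
    match value {
        ParamValue::Scalar(v) => {
            for bin in 0..plan.bins {
                state[bin * plan.columns + column] = *v;
            }
        }
        // Assumption: bins beyond the vector's length keep their previous value.
        ParamValue::Vector(vs) => {
            for (bin, v) in vs.iter().take(plan.bins).enumerate() {
                state[bin * plan.columns + column] = *v;
            }
        }
    }
}

pub fn evaluate_with_params(plan: &Plan, overrides: &[(String, ParamValue)]) -> Vec<f64> {
    // A fresh matrix per call keeps evaluations independent of each other.
    let mut state = vec![0.0; plan.bins * plan.columns];
    for entry in &plan.params {
        write_column(&mut state, plan, entry.column, &entry.default);
    }
    for (id, value) in overrides {
        // Ids that match no parameter are skipped rather than treated as errors.
        if let Some(entry) = plan.params.iter().find(|e| e.id == *id) {
            write_column(&mut state, plan, entry.column, value);
        }
    }
    // Stand-in for the bin-major op loop: served = min(arrivals, capacity).
    for bin in 0..plan.bins {
        let row = bin * plan.columns;
        state[row + 2] = state[row].min(state[row + 1]);
    }
    state
}

/// Hypothetical plan: arrivals default 10, capacity default 8, four bins.
pub fn toy_plan() -> Plan {
    Plan {
        bins: 4,
        columns: 3,
        params: vec![
            ParamEntry { id: "arrivals".to_string(), column: 0, default: ParamValue::Scalar(10.0) },
            ParamEntry { id: "capacity".to_string(), column: 1, default: ParamValue::Scalar(8.0) },
        ],
    }
}

/// Compile-once, eval-many: the same Plan is reused for every arrival rate.
pub fn sweep_arrivals(plan: &Plan, rates: &[f64]) -> Vec<Vec<f64>> {
    rates
        .iter()
        .map(|r| evaluate_with_params(plan, &[("arrivals".to_string(), ParamValue::Scalar(*r))]))
        .collect()
}
```

Calling `sweep_arrivals(&toy_plan(), &[2.0, 20.0])` reuses one plan for both evaluations. Because the plan is only borrowed immutably and each call allocates its own state matrix, no result can leak into the next one.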
-
-## Out of Scope
-
-- Session management or persistent process (m-E18-02)
-- Streaming protocol or MessagePack framing (m-E18-02)
-- CLI interface changes (m-E18-02)
-- UI parameter controls (m-E17-02)
-- Parameter bounds, display names, or template metadata enrichment (future — the parameter table carries IDs and defaults only)
-- Expression literal parameterization (inline `8` in `MIN(arrivals, 8)` stays baked — parameterizing expression constants requires expression-tree rewriting, which is a different problem)
-- Structural model changes (adding/removing nodes requires recompilation — by design)
-
-## Key References
-
-- `engine/core/src/plan.rs` — Plan struct, Op enum, ColumnMap
-- `engine/core/src/eval.rs` — `evaluate()` function, bin-major loop
-- `engine/core/src/compiler.rs` — `compile()`, `eval_model()`, all `Op::Const` emission sites
-- `docs/architecture/headless-engine-architecture.md` — overall architecture
-- `work/epics/E-18-headless-pipeline-and-optimization/milestone-plan-v2.md` — milestone sequence
diff --git a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-02-engine-session-protocol-tracking.md b/work/epics/E-18-headless-pipeline-and-optimization/m-E18-02-engine-session-protocol-tracking.md
deleted file mode 100644
index 16279d9b..00000000
--- a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-02-engine-session-protocol-tracking.md
+++ /dev/null
@@ -1,40 +0,0 @@
-# Tracking: m-E18-02 Engine Session + Streaming Protocol
-
-**Milestone:** m-E18-02
-**Epic:** E-18 Time Machine
-**Status:** complete — merged to main 2026-04-10
-**Branch:** `milestone/m-E18-02-engine-session-protocol`
-**Started:** 2026-04-10
-**Completed:** 2026-04-10
-
-## Progress
-
-| AC | Description | Status |
-|----|-------------|--------|
-| AC-1 | `session` CLI command | pending |
-| AC-2 | Length-prefixed MessagePack framing | pending |
-| AC-3 | `compile` command | pending |
-| AC-4 | `eval` command | pending |
-| AC-5 | `get_params` command | pending |
-| AC-6 | `get_series` command | pending |
-| AC-7 | Error handling | pending |
-| AC-8 | Session state | pending |
-| AC-9 | Performance (<1ms eval) | pending |
-| AC-10 | Integration test (subprocess) | pending |
-
-## Implementation Phases
-
-### Phase 1: Protocol types + framing (AC-2)
-- Add rmp-serde dependency
-- protocol.rs: Request/Response types, read_message/write_message with length-prefix framing
-- Unit tests for round-trip serialize/deserialize
-
-### Phase 2: Session struct + commands (AC-1, AC-3, AC-4, AC-5, AC-6, AC-7, AC-8)
-- session.rs: Session state (model, plan, state matrix, overrides)
-- Command dispatch: compile, eval, get_params, get_series
-- Error handling: not_compiled, compile_error, unknown_method
-- cmd_session() in main.rs
-
-### Phase 3: Integration test + performance (AC-9, AC-10)
-- Spawn subprocess, send compile + eval sequence via MessagePack
-- Performance benchmark: 1,000 evals < 1s
diff --git a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-02-engine-session-protocol.md b/work/epics/E-18-headless-pipeline-and-optimization/m-E18-02-engine-session-protocol.md
deleted file mode 100644
index eb353c55..00000000
--- a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-02-engine-session-protocol.md
+++ /dev/null
@@ -1,122 +0,0 @@
-# Milestone: Engine Session + Streaming Protocol
-
-**ID:** m-E18-02
-**Epic:** E-18 Time Machine
-**Status:** complete — merged to main 2026-04-10
-**Branch:** `milestone/m-E18-02-engine-session-protocol` (off `main`)
-**Depends on:** m-E18-01 (parameterized evaluation)
-
-## Goal
-
-The Rust engine runs as a persistent process that accepts commands and streams results. `flowtime-engine session` reads length-prefixed MessagePack messages from stdin, holds a compiled Plan in memory, and writes responses to stdout. This is the headless pipeline component — the same protocol works over stdin/stdout (CLI pipes) and WebSocket (UI, via m-E17-01 proxy).
- -## Context - -After m-E18-01, the engine can compile once and evaluate many times with different parameters via `evaluate_with_params(plan, overrides)`. But every invocation is still a batch subprocess: spawn → parse YAML → compile → evaluate → write files → exit. The overhead of process spawn + file I/O dominates latency (100-500ms). For interactive use, we need a persistent process that holds the compiled Plan and responds to parameter changes in microseconds. - -The session is a stateful loop: - -``` -stdin → [compile] → hold Plan → [eval overrides] → stdout - → [eval overrides] → stdout - → [eval overrides] → stdout - → EOF → exit -``` - -### Why MessagePack - -- **Binary f64 arrays.** A 1,000-bin series is 8KB as binary vs ~8KB+ as JSON text (with formatting overhead and parse cost). MessagePack encodes `Vec` as a binary ext type — zero parsing, memcpy-fast. -- **Length-prefixed framing.** 4-byte big-endian length prefix before each message. No newline ambiguity, no incomplete-line bugs. -- **Cross-language.** Native libraries: Rust (`rmp-serde`), JavaScript (`@msgpack/msgpack`), C# (`MessagePack-CSharp`), Python (`msgpack`). -- **Pipe-friendly.** Works over stdin/stdout for CLI composition, over WebSocket for UI. - -## Acceptance Criteria - -1. **AC-1: `session` CLI command.** `flowtime-engine session` enters a persistent loop reading from stdin and writing to stdout. No file arguments required. Exits cleanly on stdin EOF or SIGTERM. - -2. **AC-2: Length-prefixed MessagePack framing.** Each message is `[4-byte big-endian length][MessagePack payload]`. Both requests (stdin) and responses (stdout) use this framing. Stderr is reserved for human-readable log messages (not protocol). - -3. **AC-3: `compile` command.** Request: `{ method: "compile", params: { yaml: "" } }`. Response: `{ result: { params: [{ id, kind, default }], series: [{ id, bins, values }], bins, grid } }`. 
Compiles the model, holds the Plan in session state, evaluates with defaults, returns the parameter schema and initial series. - -4. **AC-4: `eval` command.** Request: `{ method: "eval", params: { overrides: { "arrivals": 15.0, "Queue.wipLimit": 30.0 } } }`. Response: `{ result: { series: { "arrivals": , "served": , ... }, elapsed_us } }`. Re-evaluates with overrides, returns updated series. Must not recompile. Series values are MessagePack binary arrays (not JSON text arrays). - -5. **AC-5: `get_params` command.** Request: `{ method: "get_params" }`. Response: `{ result: { params: [{ id, kind, default }] } }`. Returns the current parameter table from the compiled Plan. - -6. **AC-6: `get_series` command.** Request: `{ method: "get_series", params: { names: ["arrivals", "served"] } }`. Response: `{ result: { series: { "arrivals": , "served": } } }`. Returns specific series from the current evaluation state. If no names provided, returns all non-internal series. - -7. **AC-7: Error handling.** Invalid requests return `{ error: { code, message } }`. Specific errors: `not_compiled` (eval before compile), `compile_error` (bad YAML), `unknown_method`. The session continues after errors — it does not exit. - -8. **AC-8: Session state.** The session holds: compiled Plan, current parameter overrides, current state matrix (from most recent eval). `compile` replaces the entire session state. `eval` updates overrides and state. Multiple `eval` calls are independent (no accumulation). - -9. **AC-9: Performance.** For a model with 8 bins and ~10 series, `eval` with scalar overrides completes in under 1ms (excluding I/O). A Rust benchmark test evaluates 1,000 times in a loop and asserts total < 1 second. - -10. **AC-10: Integration test.** A Rust integration test spawns `flowtime-engine session` as a subprocess, sends compile + eval + eval (with different overrides) + get_params via the MessagePack protocol over stdin/stdout, and verifies all responses are correct. 
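The framing in AC-2 can be sketched with the standard library alone. This is a minimal illustration, not the milestone's actual `protocol.rs`: the function names `write_frame` and `read_frame` are hypothetical, and the payload is treated as opaque bytes that some MessagePack encoder (for example `rmp-serde`) has already produced.

```rust
use std::io::{self, Read, Write};

/// Write one frame: a 4-byte big-endian length, then the payload bytes.
/// (Hypothetical helper name; the payload is assumed to be MessagePack already.)
fn write_frame<W: Write>(out: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
    }
    out.write_all(&(payload.len() as u32).to_be_bytes())?;
    out.write_all(payload)?;
    out.flush()
}

/// Read one frame. Returns Ok(None) on a clean EOF before any prefix byte,
/// which is the point where a session loop would exit.
fn read_frame<R: Read>(input: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut prefix = [0u8; 4];
    let mut filled = 0;
    while filled < prefix.len() {
        match input.read(&mut prefix[filled..]) {
            Ok(0) if filled == 0 => return Ok(None),
            Ok(0) => {
                return Err(io::Error::new(
                    io::ErrorKind::UnexpectedEof,
                    "truncated length prefix",
                ))
            }
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    let len = u32::from_be_bytes(prefix) as usize;
    let mut payload = vec![0u8; len];
    // A short read here means the peer closed mid-message; read_exact reports it.
    input.read_exact(&mut payload)?;
    Ok(Some(payload))
}
```

Because both helpers are generic over `Read` and `Write`, the same code can be exercised against an in-memory buffer in unit tests and against locked stdin/stdout in the session loop. Distinguishing a clean EOF from a truncated prefix lets the loop exit quietly in the first case and report an error in the second.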
- -## Technical Notes - -### Dependencies to add - -- `rmp-serde` (MessagePack serialization for Rust) — workspace dependency -- `serde` derive on request/response types - -### Module structure - -- `engine/core/src/session.rs` — Session struct, state management, command dispatch -- `engine/core/src/protocol.rs` — Request/Response types, MessagePack framing (read/write) -- `engine/cli/src/main.rs` — `cmd_session()` entry point - -### Message envelope - -```rust -#[derive(Serialize, Deserialize)] -struct Request { - method: String, - #[serde(default)] - params: serde_json::Value, // flexible params per method -} - -#[derive(Serialize)] -struct Response { - #[serde(skip_serializing_if = "Option::is_none")] - result: Option, - #[serde(skip_serializing_if = "Option::is_none")] - error: Option, -} -``` - -Note: We use `serde_json::Value` as the flexible inner type even though the wire format is MessagePack. MessagePack and JSON share the same data model (maps, arrays, strings, numbers, bools, null). `rmp-serde` serializes/deserializes `serde_json::Value` correctly. - -### Series encoding - -Series data (`Vec`) serializes naturally as MessagePack arrays of floats. For very large series, a future optimization could use MessagePack binary ext type for raw f64 bytes, but the standard array encoding is correct and sufficient for this milestone. - -### Post-eval pipeline - -After `evaluate_with_params`, the session must also run: -- Class decomposition normalization + proportional allocation -- Edge series computation -- Analysis warnings - -This means the session calls the same post-eval pipeline as `eval_model_with_params`. The simplest approach: the session stores the compiled Plan and the ModelDefinition, and each `eval` call runs `eval_model_with_params` reusing the model but with the new overrides. - -For the compile-once optimization (skip recompilation), a future milestone can cache the Plan separately. 
For now, recompiling per eval is acceptable if latency is under the AC-9 target. - -## Out of Scope - -- WebSocket transport (m-E17-01) -- .NET bridge for session mode (m-E17-01) -- UI parameter controls (m-E17-02) -- Parameter sweep batch mode (m-E18-03) -- Request IDs / multiplexing (single-client, sequential for now) -- Authentication or access control -- TLS/encryption - -## Key References - -- `engine/core/src/compiler.rs` — `compile()`, `eval_model_with_params()` -- `engine/core/src/plan.rs` — `ParamTable`, `ParamValue` -- `engine/core/src/eval.rs` — `evaluate_with_params()` -- `engine/cli/src/main.rs` — existing CLI command dispatch -- `docs/architecture/headless-engine-architecture.md` — protocol design -- [rmp-serde crate](https://crates.io/crates/rmp-serde) — MessagePack for Rust -- [MessagePack spec](https://msgpack.org/) — wire format diff --git a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-07-generator-extraction.md b/work/epics/E-18-headless-pipeline-and-optimization/m-E18-07-generator-extraction.md deleted file mode 100644 index 65d0d98d..00000000 --- a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-07-generator-extraction.md +++ /dev/null @@ -1,47 +0,0 @@ -# m-E18-07 — Generator Extraction → TimeMachine - -**Epic:** E-18 Time Machine -**Branch:** `milestone/m-E18-07-generator-extraction` -**Status:** complete - -## Goal - -Rename `FlowTime.Generator` → `FlowTime.TimeMachine`. Move all classes, update all -references in consumers (src + tests), remove `FlowTime.Generator` from the solution. -Pure structural refactor — no behavior change, all tests green, no coexistence window -(per D-2026-04-07-019 Path B). 
- -## Scope - -**In scope:** -- Create `src/FlowTime.TimeMachine/FlowTime.TimeMachine.csproj` with identical dependencies -- Move all Generator source files; update `FlowTime.Generator.*` namespaces → `FlowTime.TimeMachine.*` -- Rename `tests/FlowTime.Generator.Tests/` → `tests/FlowTime.TimeMachine.Tests/`; update its csproj -- Update project references in: FlowTime.Cli, FlowTime.Sim.Service, FlowTime.API, FlowTime.Api.Tests, FlowTime.Cli.Tests, FlowTime.Integration.Tests -- Update `using FlowTime.Generator.*` → `using FlowTime.TimeMachine.*` across all source files -- Register TimeMachine in FlowTime.sln; remove Generator entry -- Delete `src/FlowTime.Generator/` entirely - -**Out of scope:** -- Tiered validation (m-E18-06) -- Any behavior changes whatsoever - -## Acceptance Criteria - -- [x] `src/FlowTime.TimeMachine/` exists; `src/FlowTime.Generator/` is gone -- [x] `tests/FlowTime.TimeMachine.Tests/` exists; `tests/FlowTime.Generator.Tests/` is gone -- [x] `dotnet build FlowTime.sln` succeeds with zero errors -- [x] `dotnet test FlowTime.sln` passes with the same test count -- [x] `rg "FlowTime\.Generator" src/ tests/ --glob "*.cs" --glob "*.csproj"` returns zero matches -- [x] Solution file contains TimeMachine entry; Generator entry is absent - -## Namespace Mapping - -| Old | New | -|-----|-----| -| `FlowTime.Generator` | `FlowTime.TimeMachine` | -| `FlowTime.Generator.Artifacts` | `FlowTime.TimeMachine.Artifacts` | -| `FlowTime.Generator.Capture` | `FlowTime.TimeMachine.Capture` | -| `FlowTime.Generator.Models` | `FlowTime.TimeMachine.Models` | -| `FlowTime.Generator.Orchestration` | `FlowTime.TimeMachine.Orchestration` | -| `FlowTime.Generator.Processing` | `FlowTime.TimeMachine.Processing` | diff --git a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-09-parameter-sweep.md b/work/epics/E-18-headless-pipeline-and-optimization/m-E18-09-parameter-sweep.md deleted file mode 100644 index eff9177a..00000000 --- 
a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-09-parameter-sweep.md +++ /dev/null @@ -1,89 +0,0 @@ -# m-E18-09 — Parameter Sweep - -**Epic:** E-18 Time Machine -**Branch:** `epic/E-18-time-machine` (milestone work on continuation branch) -**Status:** complete - -## Goal - -Implement parameter sweep as a first-class Time Machine operation: given a model YAML, a -const-node ID, and an array of values, evaluate the model once per value and return a -structured table of (param_value → series outputs). - -Builds on: -- m-E18-01 `evaluate_with_params` in the Rust engine (compile-once foundation) -- m-E18-07 `FlowTime.TimeMachine` project (host for the sweep domain model) -- m-E18-08 `ITelemetrySource` (pattern for injectable evaluation contracts) - -## Scope - -**`FlowTime.TimeMachine.Sweep` namespace** — in `src/FlowTime.TimeMachine/Sweep/`: -- `IModelEvaluator` — injectable evaluation contract; decouples SweepRunner from the Rust binary in tests -- `SweepSpec` — validated input: ModelYaml, ParamId, Values[], optional CaptureSeriesIds -- `SweepPoint` — single evaluation result: ParamValue + Series dictionary -- `SweepResult` — full sweep result: ParamId + SweepPoint[] -- `ConstNodePatcher` — internal YAML DOM manipulation; patches a named const node's values array -- `SweepRunner` — orchestrates N evaluations via injected `IModelEvaluator` -- `RustModelEvaluator : IModelEvaluator` — wraps `RustEngineRunner`, maps series list to dictionary - -**`POST /v1/sweep`** — in `src/FlowTime.API/Endpoints/SweepEndpoints.cs`: -- Request: `{ yaml, paramId, values: [double...], captureSeriesIds?: [string...] 
}` -- Response (200): `{ paramId, points: [{ paramValue, series: { seriesId: double[] } }] }` -- 400: missing yaml / paramId / values -- 503: engine not enabled (RustEngine:Enabled=false) - -**In scope:** -- `src/FlowTime.TimeMachine/Sweep/IModelEvaluator.cs` -- `src/FlowTime.TimeMachine/Sweep/SweepSpec.cs` -- `src/FlowTime.TimeMachine/Sweep/SweepResult.cs` -- `src/FlowTime.TimeMachine/Sweep/ConstNodePatcher.cs` -- `src/FlowTime.TimeMachine/Sweep/SweepRunner.cs` -- `src/FlowTime.TimeMachine/Sweep/RustModelEvaluator.cs` -- `src/FlowTime.API/Endpoints/SweepEndpoints.cs` -- DI registration in `Program.cs` -- Unit tests: `tests/FlowTime.TimeMachine.Tests/Sweep/` -- API tests: `tests/FlowTime.Api.Tests/SweepEndpointsTests.cs` - -**Out of scope:** -- Sensitivity analysis (numerical gradient) — follow-on -- Multi-parameter sweeps (grid sweeps) — follow-on -- Session-based compile-once optimization — follow-on (each sweep point uses subprocess eval) -- Optimization / fitting — m-E18-10+ -- Sweep result persistence / artifact writing — follow-on - -## Design Notes - -### Implementation approach - -Each sweep point calls `RustEngineRunner.EvaluateAsync(patchedYaml)` independently (one -subprocess per point). The YAML is patched in-memory before each call via `ConstNodePatcher`, -which uses YamlDotNet's representation model to substitute the const node's values array. - -This deliberately trades compile-once efficiency for implementation simplicity: the Rust -session protocol requires a MessagePack NuGet dependency not yet in the tree, while the -subprocess approach reuses existing infrastructure with no new dependencies. - -The `IModelEvaluator` abstraction isolates this choice from `SweepRunner`, so a future -session-based evaluator can be dropped in without changing the sweep domain model or tests. 
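The evaluator seam described above can be illustrated in a few lines. The real types are C# (`IModelEvaluator`, `SweepRunner`, `SweepPoint`); the Rust below is only a language-neutral sketch of the same shape. The names `ModelEvaluator`, `run_sweep`, and `EchoEvaluator`, and the closure standing in for `ConstNodePatcher`, are all hypothetical.

```rust
use std::collections::HashMap;

type Series = HashMap<String, Vec<f64>>;

/// Hypothetical analogue of the C# `IModelEvaluator` contract:
/// model text in, named series out.
trait ModelEvaluator {
    fn evaluate(&self, model_yaml: &str) -> Result<Series, String>;
}

/// One sweep result: the parameter value and the series it produced.
struct SweepPoint {
    param_value: f64,
    series: Series,
}

/// Evaluate the model once per value. The runner only sees the trait, so a
/// subprocess-backed or session-backed evaluator can be swapped in freely.
/// `patch` stands in for the YAML patching step (an assumption for this sketch).
fn run_sweep<E: ModelEvaluator>(
    evaluator: &E,
    model_yaml: &str,
    values: &[f64],
    patch: impl Fn(&str, f64) -> String,
) -> Result<Vec<SweepPoint>, String> {
    let mut points = Vec::with_capacity(values.len());
    for &value in values {
        let patched = patch(model_yaml, value);
        let series = evaluator.evaluate(&patched)?;
        points.push(SweepPoint { param_value: value, series });
    }
    Ok(points)
}

/// Test double: treats the "model" as a bare number and echoes it as a
/// constant four-bin series, so the runner can be tested without an engine.
struct EchoEvaluator;

impl ModelEvaluator for EchoEvaluator {
    fn evaluate(&self, model_yaml: &str) -> Result<Series, String> {
        let rate: f64 = model_yaml
            .trim()
            .parse()
            .map_err(|e| format!("bad model: {}", e))?;
        let mut series = Series::new();
        series.insert("arrivals".to_string(), vec![rate; 4]);
        Ok(series)
    }
}
```

The point of the seam is that `run_sweep` never learns how evaluation happens. The test double exercises the loop with no subprocess, which is the same property the C# unit tests rely on.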
- -### ConstNodePatcher behaviour - -- Finds the first `nodes` entry where `id == nodeId` AND `kind == "const"` -- Replaces its `values` sequence with `[value, value, ..., value]` (same bin count) -- Returns the original YAML unchanged if the node is not found or is not a const node -- Uses `InvariantCulture` formatting for decimal precision - -## Acceptance Criteria - -- [x] `IModelEvaluator` interface exists in `FlowTime.TimeMachine.Sweep` -- [x] `SweepSpec` validates: non-null/whitespace ModelYaml, non-null/whitespace ParamId, non-null/non-empty Values -- [x] `ConstNodePatcher.Patch` correctly replaces const node values; returns original YAML for unknown/non-const nodes -- [x] `SweepRunner.RunAsync` returns one `SweepPoint` per input value, with correct ParamValue and Series -- [x] `SweepRunner` respects `CaptureSeriesIds` filter (null = all series) -- [x] `SweepRunner` respects `CancellationToken` between evaluation points -- [x] `RustModelEvaluator` wraps `RustEngineRunner` and maps series list to dictionary -- [x] `POST /v1/sweep` returns 400 for missing yaml / paramId / empty values -- [x] `POST /v1/sweep` returns 503 when Rust engine not enabled -- [x] Unit tests pass: 28 sweep unit tests (SweepSpec ×9, ConstNodePatcher ×7, SweepRunner ×12) -- [x] API validation tests pass: 7 tests (6×400, 1×503) -- [x] `dotnet test FlowTime.sln` all green (105 TimeMachine, 235 API — pre-existing integration failures unrelated) diff --git a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-11-goal-seeking.md b/work/epics/E-18-headless-pipeline-and-optimization/m-E18-11-goal-seeking.md deleted file mode 100644 index 83b3c82a..00000000 --- a/work/epics/E-18-headless-pipeline-and-optimization/m-E18-11-goal-seeking.md +++ /dev/null @@ -1,76 +0,0 @@ -# m-E18-11 — Goal Seeking - -**Epic:** E-18 Time Machine -**Branch:** `epic/E-18-time-machine` -**Status:** complete - -## Goal - -Add 1D goal seeking: given a model YAML, a const-node parameter, a metric series, and a 
-target value, find the parameter value that drives the metric mean to the target via bisection. -Answers "what arrival rate gives 80% utilization?" without a full parameter sweep. - -Builds on: -- m-E18-09 `SweepRunner` + `ConstNodePatcher` / `ConstNodeReader` (m-E18-10) -- Same `IModelEvaluator` seam - -## Scope - -**`FlowTime.TimeMachine.Sweep` namespace:** -- `GoalSeekSpec` — validated input: ModelYaml, ParamId, MetricSeriesId, Target, SearchLo, - SearchHi, Tolerance (default 1e-6), MaxIterations (default 50) -- `GoalSeekResult` — output: ParamValue, AchievedMetricMean, Converged, Iterations -- `GoalSeeker` — bisection over `SweepRunner`; handles non-bracketed case gracefully - -**`POST /v1/goal-seek`** — in `src/FlowTime.API/Endpoints/GoalSeekEndpoints.cs` -- Request: `{ yaml, paramId, metricSeriesId, target, searchLo, searchHi, tolerance?, maxIterations? }` -- Response (200): `{ paramValue, achievedMetricMean, converged, iterations }` -- 400: missing/invalid required fields (searchLo ≥ searchHi is invalid) -- 503: engine not enabled - -**In scope:** -- `src/FlowTime.TimeMachine/Sweep/GoalSeekSpec.cs` -- `src/FlowTime.TimeMachine/Sweep/GoalSeekResult.cs` -- `src/FlowTime.TimeMachine/Sweep/GoalSeeker.cs` -- `src/FlowTime.API/Endpoints/GoalSeekEndpoints.cs` -- DI registration in `Program.cs` -- Unit tests: `tests/FlowTime.TimeMachine.Tests/Sweep/` -- API tests: `tests/FlowTime.Api.Tests/GoalSeekEndpointsTests.cs` -- Architecture doc: `docs/architecture/time-machine-analysis-modes.md` (written alongside) - -**Out of scope:** -- Multi-dimensional optimization (Nelder-Mead) — m-E18-12+ -- Constraint handling beyond the `[searchLo, searchHi]` range -- Non-monotonic functions (bisection is undefined; `Converged=false` returned) - -## Algorithm - -Bisection on the metric mean: - -``` -1. Evaluate at searchLo → meanLo = mean(metric at searchLo) -2. Evaluate at searchHi → meanHi = mean(metric at searchHi) -3. 
If target not in [min(meanLo,meanHi), max(meanLo,meanHi)]: - return best endpoint, Converged=false -4. While iterations < maxIterations: - mid = (lo + hi) / 2 - midMean = mean(metric at mid) - if |midMean - target| < tolerance: return mid, Converged=true - if (midMean - target) same sign as (meanLo - target): lo = mid, meanLo = midMean - else: hi = mid, meanHi = midMean -5. Return mid, Converged=false (max iterations reached) -``` - -## Acceptance Criteria - -- [x] `GoalSeekSpec` validates: non-null/whitespace ModelYaml/ParamId/MetricSeriesId; - SearchLo < SearchHi; Tolerance > 0; MaxIterations ≥ 1 -- [x] `GoalSeeker.SeekAsync` converges on a linear model to within tolerance -- [x] `GoalSeeker` returns `Converged=false` when target is not bracketed -- [x] `GoalSeeker` returns `Converged=false` (best guess) when max iterations exhausted -- [x] `GoalSeeker` respects `CancellationToken` -- [x] `POST /v1/goal-seek` returns 400 for missing/invalid required fields -- [x] `POST /v1/goal-seek` returns 503 when engine not enabled -- [x] Unit tests pass: 26 tests (GoalSeekSpec ×14, GoalSeeker ×12) -- [x] API tests pass: 8 tests (7×400, 1×503) -- [x] `dotnet test FlowTime.sln` all green (163 TimeMachine, 250 API) diff --git a/work/epics/E-18-headless-pipeline-and-optimization/milestone-plan-v2.md b/work/epics/E-18-headless-pipeline-and-optimization/milestone-plan-v2.md deleted file mode 100644 index d9d8a79f..00000000 --- a/work/epics/E-18-headless-pipeline-and-optimization/milestone-plan-v2.md +++ /dev/null @@ -1,134 +0,0 @@ -# E-18 / E-17 Milestone Plan v2 — Rust Engine Reality - -**Date:** 2026-04-10 -**Context:** E-20 complete. The Rust engine is the evaluation path. The original E-18 milestone plan assumed C# Core was the engine and front-loaded Generator extraction + C# refactoring. With the Rust engine, the foundation layer is Rust-native. - -**User goal:** Parameterized evaluation → streaming engine as pipeline component → Svelte UI as streaming client. No shortcuts. 
No backward-compatibility tax. - -## What changes from the original plan - -| Original milestone | What happens | -|---|---| -| m-E18-01a (Generator extraction) | **Deferred.** Not on the critical path for interactive/headless. Generator stays alive for now; the Rust engine is the new execution path alongside it. | -| m-E18-01b (ITelemetrySource, tiered validation) | **Deferred.** Telemetry contracts not needed for parameter tweaking. Tiered validation moves to a later milestone. | -| m-E18-01c (Runtime parameter foundation) | **Becomes m-E18-01.** The foundation, but in Rust, not C#. | -| m-E18-02 (CLI/sidecar) | **Becomes m-E18-02.** Session mode with streaming protocol. | -| m-E18-03+ (sweep, optimize, fit) | **Renumbered.** Sweep is m-E18-03. Others follow. | - -## Milestone Sequence - -### E-18: Headless Engine Foundation (Rust) - -| # | ID | Title | Depends on | Summary | -|---|---|---|---|---| -| 1 | m-E18-01 | Parameterized Evaluation | E-20 (done) | ParamTable in Plan. Compiler extracts tweakable parameters from const nodes, traffic arrivals, WIP limits. `evaluate_with_params(plan, overrides)` pure function. Parameter metadata (id, kind, default, bounds). | -| 2 | m-E18-02 | Engine Session + Streaming Protocol | m-E18-01 | `flowtime-engine session` persistent CLI mode. Length-prefixed MessagePack over stdin/stdout. Commands: `compile`, `eval`, `patch`, `get_params`, `get_series`. Session holds compiled Plan + current state. | -| 3 | m-E18-03 | Parameter Sweep | m-E18-01 | `flowtime-engine sweep` batch mode. Scenario grid definition (JSON). N evaluations without recompile. Tabular output. | - -### E-17: Interactive What-If (Bridge + Svelte UI) - -| # | ID | Title | Depends on | Summary | -|---|---|---|---|---| -| 4 | m-E17-01 | WebSocket Engine Bridge | m-E18-02 | .NET API manages persistent engine session process. WebSocket endpoint proxies client ↔ engine session protocol. Svelte UI connects via WebSocket. 
| -| 5 | m-E17-02 | Svelte Parameter Panel | m-E17-01 | Auto-generated parameter controls (sliders, numeric inputs) from parameter schema. WebSocket send on change. Svelte stores for reactive series data. | -| 6 | m-E17-03 | Live Topology & Charts | m-E17-02, E-11 M3 | Topology heatmap + time-series charts reactively update when series store changes. Value-only updates (no graph re-layout). | - -### Later E-18 milestones (not immediate) - -| # | ID | Title | Depends on | Summary | -|---|---|---|---|---| -| 7 | m-E18-04 | Sensitivity Analysis | m-E18-03 | Numerical gradient: perturb each parameter, measure output change. | -| 8 | m-E18-05 | Optimization & Fitting | m-E18-03, Telemetry Loop | Objective-based optimization + model fitting against observed data. | -| 9 | m-E18-06 | Tiered Validation | m-E18-02 | Schema/compile/analyze tiers via session protocol. Client-agnostic. | -| 10 | m-E18-07 | Generator Extraction | m-E18-02 | Migrate FlowTime.Generator → FlowTime.TimeMachine. Delete Generator. | - -## Detailed milestone descriptions - -### m-E18-01: Parameterized Evaluation - -**The critical primitive.** Everything else builds on this. - -The Plan currently bakes constants into `Op::Const { out, values }` at compile time. This milestone separates "model structure" from "tweakable values." - -**Acceptance criteria:** -1. `ParamTable` struct in Plan: lists all user-visible parameters with id, column, default value, kind (scalar/vector/rate). -2. Compiler extracts parameters from: const nodes, traffic arrival rates, WIP limits, initial conditions. -3. `evaluate_with_params(plan, overrides: &[(id, ParamValue)])` — re-evaluates with parameter overrides without recompilation. -4. Parameters have metadata: id (matches model YAML node/field), display name, kind, default value. -5. Round-trip: `evaluate(plan)` produces identical results to `evaluate_with_params(plan, &[])` (no overrides = defaults). -6. Normalization invariant still holds after parameter override. -7. 
Post-eval class decomposition and edge series recomputed with overridden values. - -### m-E18-02: Engine Session + Streaming Protocol - -**The pipeline component.** A persistent Rust process that holds a compiled Plan and streams results. - -**Acceptance criteria:** -1. `flowtime-engine session` CLI mode: reads commands from stdin, writes responses to stdout. -2. Length-prefixed MessagePack framing (4-byte big-endian length + payload). -3. Commands: `compile` (YAML → parameter schema + initial series), `eval` (overrides → updated series), `get_params` (→ current parameter values), `get_series` (names → series data). -4. Session holds: compiled Plan, current parameter values, current state matrix. -5. `eval` with overrides returns updated series within 50ms for typical models (no recompilation, no file I/O). -6. Series data encoded as binary f64 arrays in MessagePack ext type (not JSON text). -7. Error responses with structured error codes. -8. Graceful shutdown on stdin EOF or SIGTERM. -9. Rust integration tests: spawn session process, send compile + eval + patch sequence, verify results. - -### m-E17-01: WebSocket Engine Bridge - -**The bridge.** Connects web clients to the engine session. - -**Acceptance criteria:** -1. .NET API WebSocket endpoint: `ws://localhost:8081/v1/engine/session`. -2. On WebSocket connect: spawn Rust engine session subprocess, pipe WebSocket frames ↔ engine stdin/stdout. -3. MessagePack frames pass through transparently (API is a dumb proxy). -4. Session lifetime = WebSocket lifetime. Engine process killed on disconnect. -5. Multiple concurrent sessions supported (one engine process per WebSocket). -6. Health check: session responds to ping within 100ms. - -### m-E17-02: Svelte Parameter Panel - -**The UI.** Parameter controls auto-generated from engine schema. - -**Acceptance criteria:** -1. WebSocket connection to engine session on page load. -2. `compile` sent with model YAML → receive parameter schema. -3. 
Parameter panel renders controls: slider for scalar params, numeric input for all. -4. Control change → `eval` with override → receive updated series. -5. Svelte writable stores for each series. Charts/topology bind to stores. -6. Debounced updates: slider drag sends eval at most every 50ms. -7. Loading state while eval in flight. - -### m-E17-03: Live Topology & Charts - -**The visualization.** Graphs and charts react to series store changes. - -**Acceptance criteria:** -1. Topology heatmap colors update when series values change (no re-layout). -2. Time-series line charts re-render on store update. -3. Chart axes auto-scale to new data range. -4. Latency from slider drag to visual update < 200ms end-to-end. - -## E-11 dependency - -m-E17-03 depends on E-11's topology visualization (M3) and timeline (M4) being functional. E-11 is paused at M6 with M1-M4 + M6 done. The Svelte UI already has: -- SvelteKit scaffold + shadcn-svelte (M1) -- API client + page routes (M2) -- Topology canvas via dag-map (M3) -- Timeline visualization (M4) -- Run orchestration page (M6) - -What's missing for E-17: WebSocket infrastructure, parameter panel, reactive data binding. These are new features, not E-11 backlog. - -## Protocol detail: why MessagePack over stdin/stdout - -**vs. JSONL:** Binary f64 arrays avoid text encoding overhead. Length-prefixed framing eliminates incomplete-line ambiguity. Self-describing like JSON but 2-4x more compact for series data. - -**vs. gRPC:** gRPC requires HTTP/2, codegen infrastructure, and doesn't compose with Unix pipes. MessagePack over stdin/stdout works with `cat`, `tee`, `jq` (with msgpack2json adapters), and any language with a MessagePack library. - -**vs. custom binary:** MessagePack is a well-specified, widely-implemented format. No custom parser needed. Libraries exist for Rust (rmp-serde), JavaScript (@msgpack/msgpack), C# (MessagePack-CSharp), Python (msgpack). 
- -**Transport flexibility:** Same MessagePack protocol works over: -- stdin/stdout (CLI pipelines) -- WebSocket (browser UI, via .NET proxy) -- TCP socket (future service mode) diff --git a/work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/m-E19-01-supported-surface-inventory.md b/work/epics/E-19-surface-alignment-compatibility-cleanup/M-024-supported-surface-inventory-boundary-adr-exit-criteria.md similarity index 77% rename from work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/m-E19-01-supported-surface-inventory.md rename to work/epics/E-19-surface-alignment-compatibility-cleanup/M-024-supported-surface-inventory-boundary-adr-exit-criteria.md index 8f4cd013..80b774bf 100644 --- a/work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/m-E19-01-supported-surface-inventory.md +++ b/work/epics/E-19-surface-alignment-compatibility-cleanup/M-024-supported-surface-inventory-boundary-adr-exit-criteria.md @@ -1,12 +1,44 @@ -# Milestone: Supported Surface Inventory, Boundary ADR & Exit Criteria - -**ID:** m-E19-01-supported-surface-inventory -**Epic:** Surface Alignment & Compatibility Cleanup (E-19) -**Status:** completed +--- +id: M-024 +title: Supported Surface Inventory, Boundary ADR & Exit Criteria +status: done +parent: E-19 +acs: + - id: AC-1 + title: Boundary ADR extended + status: met + - id: AC-2 + title: Supported-surfaces matrix published + status: met + - id: AC-3 + title: Exhaustive inventory table populated + status: met + - id: AC-4 + title: A1–A6 decisions are cited, not reinvented, in the inventory + status: met + - id: AC-5 + title: work/decisions.md updated + status: met + - id: AC-6 + title: E-18 epic spec updated with the validation requirement and Time + status: met + - id: AC-7 + title: CLAUDE.md "Current Work" section updated + status: met + - id: AC-8 + title: Epic status surfaces reconciled + status: met + - id: AC-9 + title: Tracking doc maintained + status: met + - id: AC-10 + title: No 
code deletion in this milestone + status: met +--- ## Goal -Produce the authoritative inventory, boundary ADR, and retention/deletion decisions that govern the rest of E-19. When this milestone closes, every first-party compatibility and legacy seam outside E-16's analytical boundary has an explicit classification (supported / transitional / delete / archive), an owning downstream milestone, and a grep guard specification. No code is deleted in this milestone; decisions are locked so m-E19-02, m-E19-03, and m-E19-04 can execute forward-only without re-litigating scope. +Produce the authoritative inventory, boundary ADR, and retention/deletion decisions that govern the rest of E-19. When this milestone closes, every first-party compatibility and legacy seam outside E-16's analytical boundary has an explicit classification (supported / transitional / delete / archive), an owning downstream milestone, and a grep guard specification. No code is deleted in this milestone; decisions are locked so M-025, M-026, and M-027 can execute forward-only without re-litigating scope. ## Context @@ -28,13 +60,13 @@ These are recorded here as the authoritative source. `work/decisions.md` gets co 1. **No project renames in E-19.** `FlowTime.Core`, `FlowTime.Generator`, `FlowTime.API`, and `FlowTime.Sim.*` keep their names. The boundary ADR documents what each actually does. 2. **`FlowTime.Core` is the evaluation engine.** Pure library. `ModelCompiler.Compile` + `ModelParser.ParseModel` + `Graph.Evaluate`, plus the authoritative validators (`ModelSchemaValidator`, `ModelValidator`) and the invariant analyzers. No HTTP, no orchestration, no storage, no client awareness. E-19 does not touch it. **Forward (post-E-18): Core remains pure and unchanged in these invariants.** The Time Machine depends on Core, never the reverse. 
Core is the library of deterministic operations (the "instruction set"); the Time Machine is the hosted component that composes those operations into first-class callable services. Nothing new gets added to Core that would reintroduce HTTP, orchestration, or client awareness. When the Time Machine needs a capability, it either composes existing Core primitives or — if Core is genuinely missing a pure computational primitive — the primitive is added to Core as a pure library function, not to the Time Machine as a parallel implementation. -3. **`FlowTime.Generator` is today's shared orchestration layer** between `Sim.Service` and `API`. `RunOrchestrationService`, `RunArtifactWriter`, deterministic run ID logic, RNG seeding, and dry-run/plan mode all live here. **During E-19, Generator is unchanged — name, structure, and responsibilities all stay the same.** **Generator's forward fate is decided and scoped to E-18: Path B — extraction and deletion.** Most of Generator's current responsibilities (compile, evaluate, artifact write, run IDs, RNG seeding, dry-run) overlap the Time Machine's scope and cannot coexist with it. In E-18, Generator's execution-pipeline responsibilities are **extracted** into the new `FlowTime.TimeMachine` project, and `FlowTime.Generator` is **deleted** in the same milestone. No "Generator and Time Machine coexist in parallel" window is permitted — this matches the no-coexistence discipline established in E-16. The specific tier 3 analyser binding (`TemplateInvariantAnalyzer` currently in `FlowTime.Sim.Core.Analysis`) is also subject to the E-18 extraction: the invariant rules belong conceptually in Core, with the Time Machine composing them into the tier 3 validation surface. See decision D-2026-04-07-019. +3. **`FlowTime.Generator` is today's shared orchestration layer** between `Sim.Service` and `API`. `RunOrchestrationService`, `RunArtifactWriter`, deterministic run ID logic, RNG seeding, and dry-run/plan mode all live here. 
**During E-19, Generator is unchanged — name, structure, and responsibilities all stay the same.** **Generator's forward fate is decided and scoped to E-18: Path B — extraction and deletion.** Most of Generator's current responsibilities (compile, evaluate, artifact write, run IDs, RNG seeding, dry-run) overlap the Time Machine's scope and cannot coexist with it. In E-18, Generator's execution-pipeline responsibilities are **extracted** into the new `FlowTime.TimeMachine` project, and `FlowTime.Generator` is **deleted** in the same milestone. No "Generator and Time Machine coexist in parallel" window is permitted — this matches the no-coexistence discipline established in E-16. The specific tier 3 analyser binding (`TemplateInvariantAnalyzer` currently in `FlowTime.Sim.Core.Analysis`) is also subject to the E-18 extraction: the invariant rules belong conceptually in Core, with the Time Machine composing them into the tier 3 validation surface. See decision D-032. 4. **`FlowTime.API` is the query/operator surface over canonical run artifacts.** It reads canonical run artifacts and exposes the current read/query and operator endpoints. It does not execute models, and when an obsolete API write path is retired E-19 deletes it outright instead of preserving a 410 or advisory tombstone. 5. **`FlowTime.Sim.Service` hosts authoring and, transitionally, execution.** Templates, parameter UX, provenance are permanent Sim responsibilities. Execution hosting is transitional — it exists in Sim only because no other HTTP host is wired to `FlowTime.Core` today. 6. **The Time Machine (`FlowTime.TimeMachine`) is owned by E-18 and is a new separate component.** Responsibilities: compile, tiered validation (schema / compile / analyse), evaluate, reevaluate, parameter override with stable runtime parameter identity, artifact write. Surfaces: in-process SDK, CLI, optional sidecar protocol. Not analytical primitives, not template authoring, not query/analysis of past runs. 
The Time Machine does not live inside Sim or API. **Dependency direction: Time Machine → Core, never reverse.** The Time Machine composes Core's pure operations (`ModelSchemaValidator`, `ModelCompiler`, `ModelParser`, `Graph.Evaluate`, invariant analyzers) into first-class callable services with consistent request/response shapes. It never reimplements what Core already does. In the BEAM/JVM framing: Core is the instruction set and execution kernel as a pure library; the Time Machine is the hosted machine that loads programs (compiled graphs), drives them, exposes iteration and reevaluation protocols, and presents a client-agnostic API. Naming rationale: FlowTime's execution component is an abstract machine in the BEAM/JVM sense — instructions (compiled graph), state (time grid plus accumulating series), deterministic stepping through time. "Time Machine" also aligns with the existing Blazor "Time Travel" UI feature that navigates runs the Time Machine produces, and the reevaluation semantics (rewind a compiled model, run it forward with different parameters) are literally time travel. 7. **When the Time Machine ships, Sim's orchestration endpoints are deleted by default.** If a temporary facade is kept at all, it must be justified by a concrete technical migration constraint, documented in the owning E-18 milestone, and treated as a short-lived bridge rather than a supported steady state. That migration is E-18's job. E-19 records the commitment so no new non-UI callers land on Sim orchestration in the meantime. 8. **The Time Machine serves all clients on equal footing.** Sim UI, Blazor UI, Svelte UI, MCP servers, external AI agents, tests, and CI are all first-class callers of Time Machine operations. No client is privileged. In particular, validation (A6) is a client-agnostic operation; MCP servers and AI agents generating candidate models need the same validation contract that UIs need for editor-time feedback. -9. 
**Telemetry is an adapter concern outside the Time Machine, with one exception.** The Time Machine itself does not contain external-telemetry-format-specific code (no Prometheus, no OTEL, no BPI event log parsing). External-format ingestion lives in adapter projects under `FlowTime.Telemetry.*`. The exception: writing the **canonical bundle** format (defined by E-15's schema, today produced by `TelemetryBundleBuilder` in Generator) is a Time Machine core capability, not a pluggable adapter, because it serves the **telemetry loop** that is fundamental to FlowTime's bootstrap, self-consistency, and AI-iteration use cases. The canonical run directory (`data/runs//model/`, `series/`, `run.json`) and the canonical bundle (`model.yaml`, `manifest.json`, `series/`, CSV) are **two distinct artifacts with different purposes** — runs are the in-place clear-text debugging surface, bundles are the portable interchange format — and both are preserved by Path B. The bundle format may evolve independently of the run directory format. **`ITelemetrySource` is introduced** by E-18 m-E18-01b (after the Path B extraction cut in m-E18-01a creates the concrete `CanonicalBundleSource`), with multiple implementations once 01b ships (`CanonicalBundleSource`, `FileCsvSource`, plus future Prometheus/OTEL/event-log adapters under `FlowTime.Telemetry.*` delivered by m-E18-06). **`ITelemetrySink` is explicitly deferred** until a second sink format exists; canonical bundle writing is a concrete Time Machine capability, not behind an interface. 
The **telemetry loop** (capture → bundle → replay → parity, established vocabulary from `work/epics/telemetry-loop-parity/spec.md`) is a first-class use case with three primary purposes: **specification/bootstrap** (generate target telemetry from a model to define what the real system must emit), **self-consistency testing** (round-trip verification of capture+replay correctness), and **AI iteration / model fitting** (compare model-generated telemetry to real observed telemetry, adjust model, iterate). Path B extracts both Generator's execution code (into the Time Machine) and Generator's telemetry-generation code (`TelemetryBundleBuilder`, `TelemetryCapture`, `CaptureManifestWriter`, `RunArtifactReader`) into the canonical bundle writer and `CanonicalBundleSource`. Existing public surfaces — `POST /telemetry/captures` API and `flowtime telemetry capture` CLI — are re-wired to the new home without changing their contracts. The parity harness itself, drift tolerance rules, and CI gating are not E-18's responsibility; they are owned by the Telemetry Loop & Parity epic. +9. **Telemetry is an adapter concern outside the Time Machine, with one exception.** The Time Machine itself does not contain external-telemetry-format-specific code (no Prometheus, no OTEL, no BPI event log parsing). External-format ingestion lives in adapter projects under `FlowTime.Telemetry.*`. The exception: writing the **canonical bundle** format (defined by E-15's schema, today produced by `TelemetryBundleBuilder` in Generator) is a Time Machine core capability, not a pluggable adapter, because it serves the **telemetry loop** that is fundamental to FlowTime's bootstrap, self-consistency, and AI-iteration use cases. 
The canonical run directory (`data/runs//model/`, `series/`, `run.json`) and the canonical bundle (`model.yaml`, `manifest.json`, `series/`, CSV) are **two distinct artifacts with different purposes** — runs are the in-place clear-text debugging surface, bundles are the portable interchange format — and both are preserved by Path B. The bundle format may evolve independently of the run directory format. **`ITelemetrySource` is introduced** by E-18 m-E18-01b (after the Path B extraction cut in m-E18-01a creates the concrete `CanonicalBundleSource`), with multiple implementations once 01b ships (`CanonicalBundleSource`, `FileCsvSource`, plus future Prometheus/OTEL/event-log adapters under `FlowTime.Telemetry.*` delivered by M-003). **`ITelemetrySink` is explicitly deferred** until a second sink format exists; canonical bundle writing is a concrete Time Machine capability, not behind an interface. The **telemetry loop** (capture → bundle → replay → parity, established vocabulary from `work/epics/telemetry-loop-parity/spec.md`) is a first-class use case with three primary purposes: **specification/bootstrap** (generate target telemetry from a model to define what the real system must emit), **self-consistency testing** (round-trip verification of capture+replay correctness), and **AI iteration / model fitting** (compare model-generated telemetry to real observed telemetry, adjust model, iterate). Path B extracts both Generator's execution code (into the Time Machine) and Generator's telemetry-generation code (`TelemetryBundleBuilder`, `TelemetryCapture`, `CaptureManifestWriter`, `RunArtifactReader`) into the canonical bundle writer and `CanonicalBundleSource`. Existing public surfaces — `POST /telemetry/captures` API and `flowtime telemetry capture` CLI — are re-wired to the new home without changing their contracts. The parity harness itself, drift tolerance rules, and CI gating are not E-18's responsibility; they are owned by the Telemetry Loop & Parity epic. 
### A1 — Sim orchestration endpoints @@ -51,7 +83,7 @@ These are recorded here as the authoritative source. `work/decisions.md` gets co **Retire stored drafts entirely.** No UI exercises `/api/v1/drafts` CRUD today; the only callers are `DraftEndpointsTests.cs`. Active Blazor and Svelte run flows use `/api/v1/orchestration/runs`; retaining `/api/v1/drafts/run` is only about the inline-source "run this YAML right now" surface, not the default UI orchestration path. -Deletion scope (executed by m-E19-02): +Deletion scope (executed by M-025): - `/api/v1/drafts` CRUD endpoints: GET, PUT, POST create, DELETE, list - `StorageKind.Draft` and `data/storage/drafts/` directory - `draftId` resolution branches in `/api/v1/drafts/validate`, `/api/v1/drafts/generate`, `/api/v1/drafts/run` @@ -65,7 +97,7 @@ If real model versioning is wanted later, it must be designed against compiled-g **Delete the ZIP/bundle archive layer.** Sim's post-hoc ZIP write to `data/storage/runs/` has no production reader; `bundleRef` is consumed only by `RunOrchestrationTests.cs` exercising Engine bundle import. -Deletion scope (executed by m-E19-02): +Deletion scope (executed by M-025): - `StorageKind.Run` bundle ZIP writes in `RunOrchestrationService.CreateSimulationRunAsync` - `BundleRef` / `StorageRef` return values on `RunCreateResponse` - `data/storage/runs/` directory and backend write path for run bundles @@ -77,7 +109,7 @@ Deletion scope (executed by m-E19-02): **Delete bundle-import branches.** Only `RunOrchestrationTests.cs` exercises them; no UI, CLI, background job, or production workflow depends on Sim-exports-bundle → Engine-imports-bundle. The "loop" is designed but never wired. 
-Deletion scope (executed by m-E19-02): +Deletion scope (executed by M-025): - `bundlePath`, `bundleArchiveBase64`, and `BundleRef` branches in `RunOrchestrationEndpoints.cs` `POST /v1/runs` - `ExtractArchiveAsync` support helpers if unused after deletion - Bundle-import tests in `RunOrchestrationTests.cs` (forward-only deletion) @@ -90,7 +122,7 @@ If cross-environment run transfer is needed later, it comes back as an E-18 conc **Delete entirely.** `data/catalogs/` is empty. `TemplateServiceImplementations.GetCatalogsAsync` calls `GetMockCatalogsAsync` in both demo and API modes. `TemplateRunner.razor` hardcodes `CatalogId = "default", // No longer using catalogs`. No UI creates or selects a catalog. No tests assert meaningful catalog behavior. -Deletion scope (executed by m-E19-02): +Deletion scope (executed by M-025): - `/api/v1/catalogs` endpoints (GET, PUT, POST validate) in Sim.Service - `CatalogService`, `ICatalogService`, mock catalog service implementations - `CatalogPicker.razor` (Blazor) and any Svelte catalog selector @@ -102,7 +134,7 @@ If catalogs ever come back, redesign from scratch against a real use case. ### A6 — Validation as a first-class, client-agnostic operation -**Retire the current `POST /api/v1/drafts/validate` endpoint in m-E19-02. Preserve every library piece a future validation operation composes. Record a hard E-18 dependency: the Time Machine must expose tiered validation as a first-class, client-agnostic operation alongside compile, evaluate, reevaluate, parameter override, and artifact write.** +**Retire the current `POST /api/v1/drafts/validate` endpoint in M-025. Preserve every library piece a future validation operation composes. 
Record a hard E-18 dependency: the Time Machine must expose tiered validation as a first-class, client-agnostic operation alongside compile, evaluate, reevaluate, parameter override, and artifact write.** **Principle (recorded in the boundary ADR):** Validation — answering "is this YAML a correct FlowTime model?" — is a first-class, client-agnostic operation. `FlowTime.Core` owns the authoritative answer via `ModelSchemaValidator`, `ModelCompiler`, `ModelParser`, and `InvariantAnalyzer`. Sim UI, Blazor UI, Svelte UI, MCP servers, and external AI agents are all legitimate callers of validation, on equal footing. No single client — including Sim — is a privileged host for the validation operation. @@ -112,7 +144,7 @@ If catalogs ever come back, redesign from scratch against a real use case. - `FlowTime.Core` already exposes cheaper validators that nothing calls: `ModelSchemaValidator` (`src/FlowTime.Core/Models/ModelSchemaValidator.cs:21`) for pure schema checking, and `ModelValidator` (`src/FlowTime.Core/Models/ModelValidator.cs:21`) for schemaVersion/grid/structure checks. Both return `ValidationResult`. - `TemplateInvariantAnalyzer` is the right *implementation* of the heaviest validation tier, but it lives behind one mislabeled HTTP endpoint on Sim, which is not the right *home* for a client-agnostic operation. -**Deletion scope (executed by m-E19-02):** +**Deletion scope (executed by M-025):** - `POST /api/v1/drafts/validate` endpoint handler in `src/FlowTime.Sim.Service/Program.cs:540-615` - Endpoint-specific tests (forward-only — the inline/draft-source validation path through this endpoint is unused) @@ -149,47 +181,54 @@ This is not optional for E-18. Validation and compile-only are natural siblings The inventory table (AC 3) will naturally show asymmetry: Blazor rows will include "deprecated, scheduled for removal" entries that have no Svelte counterpart. That is expected, not a gap. -## Acceptance Criteria - -1. 
**Boundary ADR extended.** `docs/architecture/template-draft-model-run-bundle-boundary.md` contains a new "Responsibility Clarification" section (Core = evaluation library, Generator = orchestrator, API = query/operator surface over canonical runs, Sim = authoring + transitional execution host, Time Machine = new E-18 component) and three Mermaid sequence diagrams labelled **Current**, **Transitional (end of E-19)**, and **Target (post-E-18)**. The Target diagram shows the Time Machine as a distinct participant with both UI and AI/MCP clients as equal callers of tiered validation and execution operations. Diagrams correctly distinguish canonical run directory (`data/runs//`) from bundle ZIP (`data/storage/runs/`). The ADR also records the A6 principle that validation is a first-class client-agnostic operation owned by Core and surfaced through the Time Machine. - -2. **Supported-surfaces matrix published.** New file `docs/architecture/supported-surfaces.md` exists and contains the exhaustive inventory table (see AC 3), the Blazor/Svelte support policy verbatim from this spec, and the shared framing (no renames, Time Machine ownership by E-18, Core purity, API identity, Sim responsibilities, no privileged validation client). - -3. 
**Exhaustive inventory table populated.** A single table in `supported-surfaces.md` covers every in-scope surface element with these columns: +## Acceptance criteria - | Surface | Element | Current status | Decision | Target state | Owning milestone | Grep guard | +### AC-1 — Boundary ADR extended - Populated by systematic sweep of: - - Every route in `src/FlowTime.API/Endpoints/*.cs` - - Every route in `src/FlowTime.Sim.Service/Program.cs` and `src/FlowTime.Sim.Service/Extensions/*EndpointExtensions.cs` - - Every HTTP call site in `src/FlowTime.UI/Services/*` and the Svelte UI equivalents (`ui/src/lib/api/*` or current path) - - Every public DTO in `src/FlowTime.Contracts` - - Every JSON/YAML schema file tracked in the repo - - Every template under the active Sim template directory - - Every example under `docs/examples/` (or equivalent current-surface example location) - - Every `docs/` page that documents a contract on a current surface +**Boundary ADR extended.** `docs/architecture/template-draft-model-run-bundle-boundary.md` contains a new "Responsibility Clarification" section (Core = evaluation library, Generator = orchestrator, API = query/operator surface over canonical runs, Sim = authoring + transitional execution host, Time Machine = new E-18 component) and three Mermaid sequence diagrams labelled **Current**, **Transitional (end of E-19)**, and **Target (post-E-18)**. The Target diagram shows the Time Machine as a distinct participant with both UI and AI/MCP clients as equal callers of tiered validation and execution operations. Diagrams correctly distinguish canonical run directory (`data/runs//`) from bundle ZIP (`data/storage/runs/`). The ADR also records the A6 principle that validation is a first-class client-agnostic operation owned by Core and surfaced through the Time Machine. 
+### AC-2 — Supported-surfaces matrix published - Every row with `Decision = delete` or `Decision = archive` has an owning downstream milestone (m-E19-02, m-E19-03, or m-E19-04) and a grep guard specification. Every row with `Decision = supported` has a one-line rationale. Rows where the decision is still unclear are listed as explicit open questions at the bottom of the document, not silently marked supported. +**Supported-surfaces matrix published.** New file `docs/architecture/supported-surfaces.md` exists and contains the exhaustive inventory table (see AC 3), the Blazor/Svelte support policy verbatim from this spec, and the shared framing (no renames, Time Machine ownership by E-18, Core purity, API identity, Sim responsibilities, no privileged validation client). +### AC-3 — Exhaustive inventory table populated -4. **A1–A6 decisions are cited, not reinvented, in the inventory.** Every orchestration-endpoint, draft, bundle, import, catalog, and validation row in the inventory links to the corresponding decision section of this spec (or to `work/decisions.md` entries derived from it) rather than reargued inline. 
+**Exhaustive inventory table populated.** A single table in `supported-surfaces.md` covers every in-scope surface element with these columns: +| Surface | Element | Current status | Decision | Target state | Owning milestone | Grep guard | +Populated by systematic sweep of: +- Every route in `src/FlowTime.API/Endpoints/*.cs` +- Every route in `src/FlowTime.Sim.Service/Program.cs` and `src/FlowTime.Sim.Service/Extensions/*EndpointExtensions.cs` +- Every HTTP call site in `src/FlowTime.UI/Services/*` and the Svelte UI equivalents (`ui/src/lib/api/*` or current path) +- Every public DTO in `src/FlowTime.Contracts` +- Every JSON/YAML schema file tracked in the repo +- Every template under the active Sim template directory +- Every example under `docs/examples/` (or equivalent current-surface example location) +- Every `docs/` page that documents a contract on a current surface +Every row with `Decision = delete` or `Decision = archive` has an owning downstream milestone (M-025, M-026, or M-027) and a grep guard specification. Every row with `Decision = supported` has a one-line rationale. Rows where the decision is still unclear are listed as explicit open questions at the bottom of the document, not silently marked supported. +### AC-4 — A1–A6 decisions are cited, not reinvented, in the inventory -5. **`work/decisions.md` updated.** Short entries exist for: the shared framing (no renames, Time Machine ownership by E-18), A1, A2, A3, A4, A5, A6, the Time Machine naming decision, and the Blazor/Svelte support policy. Each entry points at this milestone spec and/or the supported-surfaces doc for detail. +**A1–A6 decisions are cited, not reinvented, in the inventory.** Every orchestration-endpoint, draft, bundle, import, catalog, and validation row in the inventory links to the corresponding decision section of this spec (or to `work/decisions.md` entries derived from it) rather than reargued inline. +### AC-5 — work/decisions.md updated -6. 
**E-18 epic spec updated with the validation requirement and Time Machine naming.** `work/epics/E-18-headless-pipeline-and-optimization/spec.md` (directory path preserved for historical stability) is updated in content to title the epic `E-18 Time Machine`, gains an explicit scope item for tiered validation (schema / compile / analyse) as a first-class operation alongside compile/evaluate/reevaluate/parameter-override/artifact-write with the client list (Sim UI, Blazor UI, Svelte UI, MCP servers, external AI agents, tests, CI) and the "no privileged client" principle, and has body references from "Headless" / `FlowTime.Headless` updated to "Time Machine" / `FlowTime.TimeMachine`. The same wrap pass also syncs `ROADMAP.md`, `work/epics/epic-roadmap.md`, and `CLAUDE.md` to the new naming and m-E19-01 status. +**`work/decisions.md` updated.** Short entries exist for: the shared framing (no renames, Time Machine ownership by E-18), A1, A2, A3, A4, A5, A6, the Time Machine naming decision, and the Blazor/Svelte support policy. Each entry points at this milestone spec and/or the supported-surfaces doc for detail. +### AC-6 — E-18 epic spec updated with the validation requirement and Time -7. **`CLAUDE.md` "Current Work" section updated.** E-19 status reflects that m-E19-01 is complete (when the milestone closes) and names m-E19-02 as the next milestone, consistent with the status-sync discipline in the repo's project rules. 
+**E-18 epic spec updated with the validation requirement and Time Machine naming.** `work/epics/E-18-headless-pipeline-and-optimization/spec.md` (directory path preserved for historical stability) is updated in content to title the epic `E-18 Time Machine`, gains an explicit scope item for tiered validation (schema / compile / analyse) as a first-class operation alongside compile/evaluate/reevaluate/parameter-override/artifact-write with the client list (Sim UI, Blazor UI, Svelte UI, MCP servers, external AI agents, tests, CI) and the "no privileged client" principle, and has body references from "Headless" / `FlowTime.Headless` updated to "Time Machine" / `FlowTime.TimeMachine`. The same wrap pass also syncs `ROADMAP.md`, `work/epics/epic-roadmap.md`, and `CLAUDE.md` to the new naming and M-024 status. +### AC-7 — CLAUDE.md "Current Work" section updated -8. **Epic status surfaces reconciled.** `work/epics/E-19-surface-alignment-and-compatibility-cleanup/spec.md` milestone table, `ROADMAP.md`, and `work/epics/epic-roadmap.md` all reflect m-E19-01 status in a single pass at wrap time. +**`CLAUDE.md` "Current Work" section updated.** E-19 status reflects that M-024 is complete (when the milestone closes) and names M-025 as the next milestone, consistent with the status-sync discipline in the repo's project rules. +### AC-8 — Epic status surfaces reconciled -9. **Tracking doc maintained.** `work/epics/E-19-surface-alignment-and-compatibility-cleanup/m-E19-01-supported-surface-inventory-tracking.md` exists and is updated after each AC is satisfied. +**Epic status surfaces reconciled.** `work/epics/E-19-surface-alignment-and-compatibility-cleanup/spec.md` milestone table, `ROADMAP.md`, and `work/epics/epic-roadmap.md` all reflect M-024 status in a single pass at wrap time. +### AC-9 — Tracking doc maintained -10. 
**No code deletion in this milestone.** The inventory names what will be deleted and in which downstream milestone, but no endpoint, DTO, UI client, schema, template, example, or doc is deleted as part of m-E19-01 itself. If the sweep discovers something obviously and trivially dead that cannot wait, it is logged in `work/gaps.md` with a target milestone rather than removed here. +**Tracking doc maintained.** `work/epics/E-19-surface-alignment-and-compatibility-cleanup/m-E19-01-supported-surface-inventory-tracking.md` exists and is updated after each AC is satisfied. +### AC-10 — No code deletion in this milestone +**No code deletion in this milestone.** The inventory names what will be deleted and in which downstream milestone, but no endpoint, DTO, UI client, schema, template, example, or doc is deleted as part of M-024 itself. If the sweep discovers something obviously and trivially dead that cannot wait, it is logged in `work/gaps.md` with a target milestone rather than removed here. ## Guards / DO NOT -- **DO NOT** delete any code in this milestone. m-E19-01 is a decision and documentation milestone. Every deletion is an AC of a downstream milestone with an explicit grep guard. +- **DO NOT** delete any code in this milestone. M-024 is a decision and documentation milestone. Every deletion is an AC of a downstream milestone with an explicit grep guard. - **DO NOT** rename `FlowTime.Core`, `FlowTime.Generator`, `FlowTime.API`, or `FlowTime.Sim.*`. The shared framing explicitly disallows renames in E-19. -- **DO NOT** design the Time Machine component in this milestone. The Time Machine is E-18's responsibility. m-E19-01 only records the commitment, the sunset hook, and the tiered-validation scope requirement (A6). +- **DO NOT** design the Time Machine component in this milestone. The Time Machine is E-18's responsibility. M-024 only records the commitment, the sunset hook, and the tiered-validation scope requirement (A6). 
- **DO NOT** treat the current Sim orchestration path as the future Time Machine contract, in diagrams, ADR text, matrix entries, or decision records. - **DO NOT** privilege any client (Sim UI, Blazor UI, Svelte UI, MCP, AI agent) in the Time Machine validation or execution contract. The "no privileged client" principle is load-bearing for the AI/MCP use case (A6) and must survive into E-18 design. - **DO NOT** mark an inventory row `supported` to avoid making a decision. If a row cannot be decided now, it goes into the explicit open-questions section with a named owner, not silently into `supported`. @@ -403,14 +442,14 @@ This milestone produces documents and decisions, not code. "Tests" are artifact- - **Inventory completeness check:** A short script or manual checklist confirms every endpoint in `src/FlowTime.API/Endpoints/` and `src/FlowTime.Sim.Service/` has a row. Every `src/FlowTime.Contracts` public DTO has a row. Every tracked schema file has a row. - **Decision-link check:** Every inventory row with `Decision in {delete, archive}` has a non-empty `Owning milestone` and `Grep guard` cell. Every row with `Decision = supported` has a non-empty rationale. - **`work/decisions.md` consistency check:** Entries for A1–A5, shared framing, and Blazor/Svelte policy exist and point at this spec or `supported-surfaces.md`. -- **Status-sync check:** `CLAUDE.md`, `ROADMAP.md`, `work/epics/epic-roadmap.md`, and the E-19 epic spec milestone table all reflect the same m-E19-01 status at wrap time. +- **Status-sync check:** `CLAUDE.md`, `ROADMAP.md`, `work/epics/epic-roadmap.md`, and the E-19 epic spec milestone table all reflect the same M-024 status at wrap time. - **Grep guard baselines (specification only, not enforcement):** For each delete-decision row, a candidate `rg` pattern is specified. The patterns are not asserted by this milestone — enforcement is the downstream milestone's AC — but they exist so the downstream milestone inherits them directly. 
## Out of Scope -- All code deletion. Executed by m-E19-02 (Sim authoring & runtime boundary), m-E19-03 (schema/template/example retirement), and m-E19-04 (Blazor support alignment). +- All code deletion. Executed by M-025 (Sim authoring & runtime boundary), M-026 (schema/template/example retirement), and M-027 (Blazor support alignment). - Any change to `FlowTime.Core`, `FlowTime.Generator`, or the canonical run directory layout at `data/runs//`. -- Designing, implementing, or scoping the Time Machine component beyond recording requirements. The Time Machine is owned by E-18. m-E19-01's only E-18-touching actions are (a) updating the epic title and body references to the new name and (b) appending the tiered-validation scope requirement. +- Designing, implementing, or scoping the Time Machine component beyond recording requirements. The Time Machine is owned by E-18. M-024's only E-18-touching actions are (a) updating the epic title and body references to the new name and (b) appending the tiered-validation scope requirement. - Renaming any existing project or namespace. `FlowTime.TimeMachine` is a new component added by E-18, not a rename of an existing one. `FlowTime.Core`, `FlowTime.Generator`, `FlowTime.API`, and `FlowTime.Sim.*` keep their names. - Analytical-series work, warning fact ownership, consumer fact publication, or by-class purity. All owned by E-16 (complete). - E-10 Phase 3 analytical primitives (`p3d`, `p3c`, `p3b`). @@ -422,7 +461,7 @@ This milestone produces documents and decisions, not code. "Tests" are artifact- - **E-16** complete. Analytical truth boundary purified, consumer facts published. This milestone depends on E-16 being the authoritative owner of everything analytical so the E-19 scope line is clean. - **Boundary ADR seed** already landed in commit `ef644d1` (`docs(work): define E19 surface boundary and ADR`). This milestone extends that document, it does not create it. 
-- No dependency on E-10 Phase 3, E-11 Svelte UI buildout, or E-18 Time Machine execution. m-E19-01 runs as a parallel cleanup-planning lane. E-19's references to the Time Machine are forward commitments; E-19 does not wait for E-18 to ship. +- No dependency on E-10 Phase 3, E-11 Svelte UI buildout, or E-18 Time Machine execution. M-024 runs as a parallel cleanup-planning lane. E-19's references to the Time Machine are forward commitments; E-19 does not wait for E-18 to ship. ## References diff --git a/work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/m-E19-02-sim-authoring-and-runtime-boundary-cleanup.md b/work/epics/E-19-surface-alignment-compatibility-cleanup/M-025-sim-authoring-runtime-boundary-cleanup.md similarity index 74% rename from work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/m-E19-02-sim-authoring-and-runtime-boundary-cleanup.md rename to work/epics/E-19-surface-alignment-compatibility-cleanup/M-025-sim-authoring-runtime-boundary-cleanup.md index 5b4afb0b..fcfdc692 100644 --- a/work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/m-E19-02-sim-authoring-and-runtime-boundary-cleanup.md +++ b/work/epics/E-19-surface-alignment-compatibility-cleanup/M-025-sim-authoring-runtime-boundary-cleanup.md @@ -1,32 +1,63 @@ -# Milestone: Sim Authoring & Runtime Boundary Cleanup - -**ID:** m-E19-02-sim-authoring-and-runtime-boundary-cleanup -**Epic:** [Surface Alignment & Compatibility Cleanup (E-19)](./spec.md) -**Status:** completed -**Branch:** `milestone/m-E19-02-sim-authoring-and-runtime-boundary-cleanup` (off `epic/E-19`) +--- +id: M-025 +title: Sim Authoring & Runtime Boundary Cleanup +status: done +parent: E-19 +acs: + - id: AC-1 + title: Stored drafts retired (A2) + status: met + - id: AC-2 + title: /api/v1/drafts/run narrowed to inline-source only (A1, A2) + status: met + - id: AC-3 + title: Sim-only /api/v1/drafts/validate deleted (A6) + status: met + - id: AC-4 + title: Sim-side ZIP archive layer 
deleted (A3) + status: met + - id: AC-5 + title: Engine POST /v1/runs deleted outright (A4) + status: met + - id: AC-6 + title: Engine debug route deleted (scope narrowed during implementation) + status: met + - id: AC-7 + title: Catalogs retired entirely (A5) + status: met + - id: AC-8 + title: Public contracts cleanup consolidated + status: met + - id: AC-9 + title: Build, tests, and grep guards green + status: met + - id: AC-10 + title: Status surfaces reconciled at wrap time + status: met +--- ## Goal -Execute the runtime deletions locked by [m-E19-01](./m-E19-01-supported-surface-inventory.md) A1–A6: remove stored drafts, the Sim ZIP archive layer, Engine bundle-import and dead direct-eval routes, runtime catalogs, and the Sim-only `POST /api/v1/drafts/validate` wrapper. Narrow `POST /api/v1/drafts/run` to inline-source only. When this milestone closes, Sim authoring surfaces expose only the explicitly supported paths and Engine exposes only the canonical query/operator surface over `data/runs//`. +Execute the runtime deletions locked by [M-024](./M-024.md) A1–A6: remove stored drafts, the Sim ZIP archive layer, Engine bundle-import and dead direct-eval routes, runtime catalogs, and the Sim-only `POST /api/v1/drafts/validate` wrapper. Narrow `POST /api/v1/drafts/run` to inline-source only. When this milestone closes, Sim authoring surfaces expose only the explicitly supported paths and Engine exposes only the canonical query/operator surface over `data/runs//`. ## Context -m-E19-01 published the supported-surface matrix in [docs/architecture/supported-surfaces.md](../../../docs/architecture/supported-surfaces.md) and locked retention/deletion decisions A1–A6 in the milestone spec. No code changed in m-E19-01 — every deletion was assigned an owning downstream milestone and a grep guard. This milestone is the first deletion pass and executes every row whose `Owning milestone` column is `m-E19-02`. 
+M-024 published the supported-surface matrix in [docs/architecture/supported-surfaces.md](../../../docs/architecture/supported-surfaces.md) and locked retention/deletion decisions A1–A6 in the milestone spec. No code changed in M-024 — every deletion was assigned an owning downstream milestone and a grep guard. This milestone is the first deletion pass and executes every row whose `Owning milestone` column is `m-E19-02`. -Scope boundaries inherited from m-E19-01: +Scope boundaries inherited from M-024: - `FlowTime.Core`, `FlowTime.Generator`, `FlowTime.API`, and `FlowTime.Sim.*` are **not renamed** and their high-level responsibilities do not change in E-19. Generator stays frozen; its Path B extraction belongs to E-18 m-E18-01a. - The canonical run directory under `data/runs//` is unchanged. - Analytical surfaces purified by E-16 are out of scope. -- Blazor stale-wrapper cleanup (beyond the catalog selector, which is coupled to A5) belongs to m-E19-04. -- Schema, template, example, and current-doc cleanup belong to m-E19-03. +- Blazor stale-wrapper cleanup (beyond the catalog selector, which is coupled to A5) belongs to M-027. +- Schema, template, example, and current-doc cleanup belong to M-026. - `FlowTime.TimeMachine` is not introduced here — that is E-18 m-E18-01a. The default execution path for first-party UIs during and after this milestone remains `POST /api/v1/orchestration/runs` on `FlowTime.Sim.Service` per A1. Sunsetting that endpoint is an E-18 decision, not this milestone's. -## Acceptance Criteria +## Acceptance criteria -### AC1 — Stored drafts retired (A2) +### AC-1 — Stored drafts retired (A2) Forward-only deletion of the stored-draft product surface. @@ -49,19 +80,18 @@ Forward-only deletion of the stored-draft product surface. 
- `StorageKind.Draft` - `data/storage/drafts` -### AC2 — `/api/v1/drafts/run` narrowed to inline-source only (A1, A2) +### AC-2 — /api/v1/drafts/run narrowed to inline-source only (A1, A2) `POST /api/v1/drafts/run` at [src/FlowTime.Sim.Service/Program.cs:675](../../../src/FlowTime.Sim.Service/Program.cs) remains a live route but only accepts `DraftSource.type == "inline"`. Any `draftId` resolution branch is removed. - No request shape accepts `draftId` on this endpoint after the milestone. - Inline-source tests in `DraftEndpointsTests.cs` survive and are the only tests left covering this route. -- Documentation for this endpoint (in `docs/reference/contracts.md` and elsewhere) is updated by m-E19-03 — this milestone only removes the code branch. +- Documentation for this endpoint (in `docs/reference/contracts.md` and elsewhere) is updated by M-026 — this milestone only removes the code branch. **Grep guard:** No `draftId` reference remains in the `/api/v1/drafts/run` handler or its request shape. +### AC-3 — Sim-only /api/v1/drafts/validate deleted (A6) -### AC3 — Sim-only `/api/v1/drafts/validate` deleted (A6) - -`POST /api/v1/drafts/validate` at [src/FlowTime.Sim.Service/Program.cs:540](../../../src/FlowTime.Sim.Service/Program.cs) is removed along with its endpoint-specific tests. The library pieces that back it remain untouched (they become the tier 1/2/3 ingredients the future Time Machine composes per [D-2026-04-07-017](../../decisions.md)): +`POST /api/v1/drafts/validate` at [src/FlowTime.Sim.Service/Program.cs:540](../../../src/FlowTime.Sim.Service/Program.cs) is removed along with its endpoint-specific tests. The library pieces that back it remain untouched (they become the tier 1/2/3 ingredients the future Time Machine composes per [D-030](../../decisions.md)): **Preserved unchanged:** @@ -73,8 +103,7 @@ Forward-only deletion of the stored-draft product surface. 
- `FlowTime.Sim.Core.Analysis.InvariantAnalyzer` **Grep guard:** No `/api/v1/drafts/validate` route literal or `drafts/validate` handler remains in `src/` or `tests/`. - -### AC4 — Sim-side ZIP archive layer deleted (A3) +### AC-4 — Sim-side ZIP archive layer deleted (A3) Remove the post-hoc run-bundle archive path that writes ZIPs to `data/storage/runs/` and the `BundleRef` / `StorageRef` return values that surface them. @@ -89,7 +118,7 @@ Remove the post-hoc run-bundle archive path that writes ZIPs to `data/storage/ru **Grep guards:** No `StorageKind.Run`, `BundleRef`, `StorageRef`, or `data/storage/runs` reference remains in `src/` or `tests/` on the current surface. -### AC5 — Engine `POST /v1/runs` deleted outright (A4) +### AC-5 — Engine POST /v1/runs deleted outright (A4) `POST /v1/runs` in [src/FlowTime.API/Endpoints/RunOrchestrationEndpoints.cs:19](../../../src/FlowTime.API/Endpoints/RunOrchestrationEndpoints.cs) is removed entirely. No 410-style rejection stub remains. The read endpoints `GET /v1/runs` (line 20) and `GET /v1/runs/{runId}` (line 21) stay — they are the canonical run discovery/detail contract. @@ -104,10 +133,9 @@ Remove the post-hoc run-bundle archive path that writes ZIPs to `data/storage/ru **Preserve:** `GET /v1/runs` and `GET /v1/runs/{runId}` — they are the canonical run query surface consumed by the Svelte UI and operator workflows. **Grep guards:** `MapPost("/runs", HandleCreateRunAsync)`, `BundlePath`, `BundleArchiveBase64`, and `BundleRef` return zero matches in `src/` and `tests/` on the current API surface. +### AC-6 — Engine debug route deleted (scope narrowed during implementation) -### AC6 — Engine debug route deleted (scope narrowed during implementation) - -The m-E19-01 audit originally scheduled three Engine routes for deletion in this milestone: `GET /v1/debug/scan-directory/{dirName}`, `POST /v1/run`, and `POST /v1/graph`. 
During implementation, discovery showed that `POST /v1/run` is used by 50+ test call sites across the Engine Provenance, Parity, and Legacy test suites as the primary run-creation mechanism, and `POST /v1/graph` is used by `Legacy/ApiIntegrationTests.cs`. The matrix entry claim that these routes are "not used by current first-party UIs" is technically correct but underweighted the test-infrastructure coupling. Deleting them in this milestone would either regress ~50 tests of Engine-side runtime provenance coverage (forward-only test deletion — unacceptable) or pull substantial test-migration work that is out of scope for a runtime-cleanup milestone. +The M-024 audit originally scheduled three Engine routes for deletion in this milestone: `GET /v1/debug/scan-directory/{dirName}`, `POST /v1/run`, and `POST /v1/graph`. During implementation, discovery showed that `POST /v1/run` is used by 50+ test call sites across the Engine Provenance, Parity, and Legacy test suites as the primary run-creation mechanism, and `POST /v1/graph` is used by `Legacy/ApiIntegrationTests.cs`. The matrix entry claim that these routes are "not used by current first-party UIs" is technically correct but underweighted the test-infrastructure coupling. Deleting them in this milestone would either regress ~50 tests of Engine-side runtime provenance coverage (forward-only test deletion — unacceptable) or pull substantial test-migration work that is out of scope for a runtime-cleanup milestone. **Scope for this milestone:** @@ -115,11 +143,11 @@ The m-E19-01 audit originally scheduled three Engine routes for deletion in this **Deferred (tracked in [work/gaps.md](../../gaps.md)):** -- `POST /v1/run` and `POST /v1/graph` deletion is deferred out of m-E19-02. 
Retained as a transitional test-infrastructure surface until the Provenance/Parity/Legacy test suites are migrated to an alternative run-creation path (either a test-only in-process adapter over `Graph.Evaluate` / `RunOrchestrationService` or the supported Sim orchestration endpoint with template fixtures). A decisions.md entry (D-2026-04-08-029) records the scope change and the reason. +- `POST /v1/run` and `POST /v1/graph` deletion is deferred out of M-025. Retained as a transitional test-infrastructure surface until the Provenance/Parity/Legacy test suites are migrated to an alternative run-creation path (either a test-only in-process adapter over `Graph.Evaluate` / `RunOrchestrationService` or the supported Sim orchestration endpoint with template fixtures). A decisions.md entry (D-042) records the scope change and the reason. **Grep guard (narrowed):** No `"/v1/debug/scan-directory"` literal remains in runtime code. -### AC7 — Catalogs retired entirely (A5) +### AC-7 — Catalogs retired entirely (A5) Catalog surfaces are zombie residue with no supported first-party caller. Delete them atomically across runtime and the Blazor catalog selector (the one UI site coupled to this server deletion). @@ -135,7 +163,7 @@ Catalog surfaces are zombie residue with no supported first-party caller. Delete **Grep guards:** `/api/v1/catalogs`, `CatalogService`, `ICatalogService`, `CatalogPicker`, and the literal `CatalogId = "default"` return zero matches in `src/` and `tests/` on the current surface. 
-### AC8 — Public contracts cleanup consolidated +### AC-8 — Public contracts cleanup consolidated All public contract changes forced by AC1–AC7 above land in [src/FlowTime.Contracts/](../../../src/FlowTime.Contracts/) in a single consistent pass: @@ -145,21 +173,21 @@ All public contract changes forced by AC1–AC7 above land in [src/FlowTime.Cont `StorageBackendOptions`, `IStorageBackend`, `StorageWriteRequest`, `StorageWriteResult`, `StorageReadResult`, `StorageListRequest`, and `StorageItemSummary` remain on the public surface — they still serve surviving storage needs (for example, series storage referenced in the supported-surfaces matrix). This milestone removes only the draft/run kinds, not the underlying storage abstraction. -### AC9 — Build, tests, and grep guards green +### AC-9 — Build, tests, and grep guards green - `dotnet build FlowTime.sln` is green with no new warnings introduced by this milestone. - `dotnet test FlowTime.sln` is green across all test projects (deleted tests for deleted code are acceptable; failing tests or reduced coverage for surviving code is not). - Every grep guard from AC1–AC7 is asserted by a simple repo-root script or CI check that `rg` returns zero matches in `src/` and `tests/` for the deleted symbols. The check can be a single shell script runnable locally; it does not need to become a full CI pipeline step in this milestone but it must exist and be documented. -### AC10 — Status surfaces reconciled at wrap time +### AC-10 — Status surfaces reconciled at wrap time At milestone wrap: -- [work/epics/E-19-surface-alignment-and-compatibility-cleanup/spec.md](./spec.md) milestone table marks m-E19-02 complete and m-E19-03 next. +- [work/epics/E-19-surface-alignment-and-compatibility-cleanup/spec.md](./spec.md) milestone table marks M-025 complete and M-026 next. - [ROADMAP.md](../../../ROADMAP.md) and [work/epics/epic-roadmap.md](../../epic-roadmap.md) reflect the same status. 
-- [CLAUDE.md](../../../CLAUDE.md) Current Work section names m-E19-02 complete and m-E19-03 next. -- The tracking doc [m-E19-02-sim-authoring-and-runtime-boundary-cleanup-tracking.md](./m-E19-02-sim-authoring-and-runtime-boundary-cleanup-tracking.md) records every AC checked, the final test count, and the grep guard results. -- `work/decisions.md` does **not** need new entries — this milestone executes decisions A1–A6 already recorded under D-2026-04-07-023 through D-2026-04-07-028. If an implementation judgment call surfaces that m-E19-01 did not anticipate, it is logged in `work/gaps.md` or as a new D-entry at wrap time. +- [CLAUDE.md](../../../CLAUDE.md) Current Work section names M-025 complete and M-026 next. +- The tracking doc [M-025.md](./M-025.md) records every AC checked, the final test count, and the grep guard results. +- `work/decisions.md` does **not** need new entries — this milestone executes decisions A1–A6 already recorded under D-036 through D-041. If an implementation judgment call surfaces that M-024 did not anticipate, it is logged in `work/gaps.md` or as a new D-entry at wrap time. ## Technical Notes @@ -204,17 +232,17 @@ Forward-only deletion, not migration: - `/api/v1/templates/*` authoring surface — supported. No changes. - `/api/v1/drafts/generate` and `/api/v1/drafts/map-profile` — supported authoring surfaces. No changes beyond removing `draftId` resolution. - `/api/v1/series/*`, `/api/v1/profiles/*`, `/api/v1/models/*` — supported Sim authoring/data-intake surfaces. No changes. -- Blazor stale-wrapper cleanup beyond `CatalogPicker.razor` — that is m-E19-04's job. -- Schema files under `docs/schemas/` — m-E19-03 owns any deprecated-schema removal. -- Template files under `templates/` — m-E19-03 owns any deprecated-template removal. -- Example files under `examples/` — m-E19-03 owns schema-compatibility example retirement. +- Blazor stale-wrapper cleanup beyond `CatalogPicker.razor` — that is M-027's job. 
+- Schema files under `docs/schemas/` — M-026 owns any deprecated-schema removal. +- Template files under `templates/` — M-026 owns any deprecated-template removal. +- Example files under `examples/` — M-026 owns schema-compatibility example retirement. ## Out of Scope - Introducing or referencing `FlowTime.TimeMachine`. That component is new in E-18 m-E18-01a and does not exist yet. - Any Path B extraction of `FlowTime.Generator`. Generator is frozen. -- Schema, template, example, or docs cleanup (m-E19-03 owns those). -- Blazor stale compatibility wrappers outside the catalog picker (m-E19-04). +- Schema, template, example, or docs cleanup (M-026 owns those). +- Blazor stale compatibility wrappers outside the catalog picker (M-027). - Replacing the deleted validation endpoint with a tiered validation API on Sim — that is explicitly an E-18 m-E18-01b deliverable per A6. - Reintroducing any deleted surface as a "temporary compatibility shim." - Refactoring `RunOrchestrationService` or `IStorageBackend` beyond removing the deleted code paths. @@ -223,24 +251,24 @@ Forward-only deletion, not migration: ## Guards / DO NOT -- **DO NOT** preserve a 410-style rejection stub or advisory tombstone for any deleted route. Forward-only deletion per shared framing in [m-E19-01 § Shared Framing](./m-E19-01-supported-surface-inventory.md#shared-framing). +- **DO NOT** preserve a 410-style rejection stub or advisory tombstone for any deleted route. Forward-only deletion per shared framing in [M-024 § Shared Framing](./M-024.md#shared-framing). - **DO NOT** design or stub anything under `FlowTime.TimeMachine` or any `Headless` namespace. The Time Machine is E-18 m-E18-01a. - **DO NOT** extend the `POST /api/v1/orchestration/runs` surface. It stays as-is; sunsetting is an E-18 decision. - **DO NOT** add new compatibility wrappers, feature flags, or configuration toggles to keep deleted behaviour reachable in any environment. 
-- **DO NOT** widen the scope into schema/template/example cleanup. Those are m-E19-03. +- **DO NOT** widen the scope into schema/template/example cleanup. Those are M-026. - **DO NOT** touch the canonical run directory layout at `data/runs//`. The bundle archive layer is separate. - **DO NOT** re-home `TemplateInvariantAnalyzer` into `FlowTime.Core` in this milestone. That is an E-18 m-E18-01b concern. - **DO NOT** leave partially deleted symbols behind. Every grep guard must pass at wrap time. ## Dependencies -- [m-E19-01 Supported Surface Inventory, Boundary ADR & Exit Criteria](./m-E19-01-supported-surface-inventory.md) — locks A1–A6 decisions and the boundary ADR this milestone executes against. +- [M-024 Supported Surface Inventory, Boundary ADR & Exit Criteria](./M-024.md) — locks A1–A6 decisions and the boundary ADR this milestone executes against. - [docs/architecture/supported-surfaces.md](../../../docs/architecture/supported-surfaces.md) — authoritative row-by-row ownership for deletions. - [docs/architecture/template-draft-model-run-bundle-boundary.md](../../../docs/architecture/template-draft-model-run-bundle-boundary.md) — current/transitional/target diagrams that deletions must not contradict. 
## References - [E-19 epic spec](./spec.md) -- [m-E19-01 spec](./m-E19-01-supported-surface-inventory.md) -- [work/decisions.md](../../decisions.md) — D-2026-04-07-017 (A6), D-2026-04-07-022 through D-2026-04-07-028 (shared framing and A1–A5) +- [M-024 spec](./M-024.md) +- [work/decisions.md](../../decisions.md) — D-030 (A6), D-035 through D-041 (shared framing and A1–A5) - [E-18 epic spec](../E-18-headless-pipeline-and-optimization/spec.md) — downstream dependency for validation replacement diff --git a/work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/m-E19-03-schema-template-example-retirement.md b/work/epics/E-19-surface-alignment-compatibility-cleanup/M-026-schema-template-example-retirement.md similarity index 84% rename from work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/m-E19-03-schema-template-example-retirement.md rename to work/epics/E-19-surface-alignment-compatibility-cleanup/M-026-schema-template-example-retirement.md index bc8c7ee6..17964f3f 100644 --- a/work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/m-E19-03-schema-template-example-retirement.md +++ b/work/epics/E-19-surface-alignment-compatibility-cleanup/M-026-schema-template-example-retirement.md @@ -1,9 +1,40 @@ -# Milestone: Schema, Template & Example Retirement - -**ID:** m-E19-03-schema-template-example-retirement -**Epic:** [Surface Alignment & Compatibility Cleanup (E-19)](./spec.md) -**Status:** completed -**Branch:** `milestone/m-E19-03-schema-template-example-retirement` (off `epic/E-19`) +--- +id: M-026 +title: Schema, Template & Example Retirement +status: done +parent: E-19 +acs: + - id: AC-1 + title: UI demo template generators emit current schema (schema-migration residue) + status: met + - id: AC-2 + title: UI sample fixture uses current schema + status: met + - id: AC-3 + title: CLI verbose output label uses current schema + status: met + - id: AC-4 + title: Active architecture docs use current schema in YAML 
examples + status: met + - id: AC-5 + title: Schema-migration example fixtures archived + status: met + - id: AC-6 + title: Stale template-integration spec archived + status: met + - id: AC-7 + title: Catalog-stale phrasing in active docs updated + status: met + - id: AC-8 + title: Test fixtures with stale parameter keys cleaned + status: met + - id: AC-9 + title: Grep-guard script codified + status: met + - id: AC-10 + title: Tracking doc and status surfaces reconciled + status: met +--- ## Goal @@ -11,26 +42,26 @@ Remove deprecated schema shapes, demo-template residue, schema-migration compati ## Context -[m-E19-01](./m-E19-01-supported-surface-inventory.md) inventoried active schema, template, example, and docs surfaces in the supported-surfaces matrix at [docs/architecture/supported-surfaces.md](../../../docs/architecture/supported-surfaces.md) and assigned owning milestones. Every row whose `Owning milestone` column is `m-E19-03` is executed here. +[M-024](./M-024.md) inventoried active schema, template, example, and docs surfaces in the supported-surfaces matrix at [docs/architecture/supported-surfaces.md](../../../docs/architecture/supported-surfaces.md) and assigned owning milestones. Every row whose `Owning milestone` column is `m-E19-03` is executed here. -[m-E19-02](./m-E19-02-sim-authoring-and-runtime-boundary-cleanup.md) already deleted the runtime seams (stored drafts, Sim ZIP archive layer, Engine bundle-import, runtime catalogs, `/api/v1/drafts/validate`, Engine `/v1/debug/scan-directory`) and narrowed `/api/v1/drafts/run` to inline-only. This milestone is the schema/authoring cleanup pass over the same supported-surface baseline. +[M-025](./M-025.md) already deleted the runtime seams (stored drafts, Sim ZIP archive layer, Engine bundle-import, runtime catalogs, `/api/v1/drafts/validate`, Engine `/v1/debug/scan-directory`) and narrowed `/api/v1/drafts/run` to inline-only. 
This milestone is the schema/authoring cleanup pass over the same supported-surface baseline. -Scope boundaries inherited from m-E19-01: +Scope boundaries inherited from M-024: - `FlowTime.Core`, `FlowTime.Generator`, `FlowTime.API`, and `FlowTime.Sim.*` are **not renamed** and their high-level responsibilities do not change in E-19. - Analytical surfaces purified by E-16 (notably `MetricsContracts.MetricsGrid.BinMinutes` as a retained wire-format field, the `TimeGrid.BinMinutes` computed property, `ModelValidator`'s `binMinutes` rejection gate, and `TargetSchemaValidationTests` that assert the gate) are explicitly out of scope and must remain untouched. -- Engine and Sim runtime route deletions are not re-opened here — m-E19-02 owns them. -- Blazor stale-wrapper cleanup and demo-mode policy belong to m-E19-04. -- `POST /v1/run` / `POST /v1/graph` remain deferred per [D-2026-04-08-029](../../decisions.md#d-2026-04-08-029-defer-post-v1run-and-post-v1graph-deletion-out-of-m-e19-02-ac6-scope-narrowing). Tests that deserialize `Grid { int binMinutes }` against those routes stay as-is. +- Engine and Sim runtime route deletions are not re-opened here — M-025 owns them. +- Blazor stale-wrapper cleanup and demo-mode policy belong to M-027. +- `POST /v1/run` / `POST /v1/graph` remain deferred per [D-042](../../decisions.md#d-2026-04-08-029-defer-post-v1run-and-post-v1graph-deletion-out-of-m-e19-02-ac6-scope-narrowing). Tests that deserialize `Grid { int binMinutes }` against those routes stay as-is. The distinction this milestone enforces: - **`binMinutes` as a YAML authoring schema field** — deprecated. Engine's `ModelValidator` rejects it at parse time. Current authoring schema is `binSize` + `binUnit`. Any active surface emitting `binMinutes` in an authored YAML shape is in scope for this milestone. - **`binMinutes` as the derived internal concept** (bin duration in minutes) — still live. 
`TimeGrid.BinMinutes`, `MetricsContracts.MetricsGrid.BinMinutes`, internal analytical math, and mathematical notation in architecture docs are out of scope. -## Acceptance Criteria +## Acceptance criteria -### AC1 — UI demo template generators emit current schema (schema-migration residue) +### AC-1 — UI demo template generators emit current schema (schema-migration residue) `src/FlowTime.UI/Services/TemplateServiceImplementations.cs` is the Blazor mock template service used by demo mode. It currently declares two `JsonSchemaProperty` entries keyed `"binMinutes"` and emits three demo YAML strings with `grid: binMinutes: 60` / `binMinutes: 1440`. These are active surfaces promoting the deprecated YAML authoring shape. @@ -46,11 +77,11 @@ The distinction this milestone enforces: **Preserve:** - Any `JsonIgnore`-annotated computed `BinMinutes` property used purely for UI display (e.g. in `GridInfo`, `TimeTravelMetricsGridDto`) — these are internal convenience fields, not authoring shapes. -- Demo mode itself — m-E19-03 does not retire demo mode. Blazor demo-mode policy is m-E19-04. +- Demo mode itself — M-026 does not retire demo mode. Blazor demo-mode policy is M-027. **Grep guard:** No `binMinutes` literal remains anywhere under `src/FlowTime.UI/Services/TemplateServiceImplementations.cs`. Broader `src/FlowTime.UI/` check is deferred to AC7 (grep guard script). -### AC2 — UI sample fixture uses current schema +### AC-2 — UI sample fixture uses current schema [src/FlowTime.UI/wwwroot/sample/run-example.json](../../../src/FlowTime.UI/wwwroot/sample/run-example.json) currently reads: @@ -66,7 +97,7 @@ This is a static authoring fixture shipped with the Blazor UI and is not a wire- **Grep guard:** No `binMinutes` literal remains under `src/FlowTime.UI/wwwroot/`. 
-### AC3 — CLI verbose output label uses current schema +### AC-3 — CLI verbose output label uses current schema [src/FlowTime.Cli/Program.cs:98](../../../src/FlowTime.Cli/Program.cs) currently prints: @@ -84,7 +115,7 @@ The computed `TimeGrid.BinMinutes` property itself stays — it is the live inte **Grep guard:** No `binMinutes` literal remains under `src/FlowTime.Cli/`. -### AC4 — Active architecture docs use current schema in YAML examples +### AC-4 — Active architecture docs use current schema in YAML examples Two active architecture docs contain YAML authoring examples still using the deprecated grid shape. Rewrite every YAML example; leave mathematical notation that uses `binMinutes` as the live derived concept (AC4 is about authoring shapes, not math). @@ -103,9 +134,9 @@ Two active architecture docs contain YAML authoring examples still using the dep **Grep guard:** No `binMinutes` literal remains in `docs/architecture/whitepaper.md` or `docs/architecture/retry-modeling.md` **except** lines containing the marker `m-E19-03:allow-binminutes-notation`. The marker is an HTML comment that Markdown renderers strip from display; it lets the grep-guard script allowlist legitimate derived-concept notation without depending on drift-prone line numbers. -### AC5 — Schema-migration example fixtures archived +### AC-5 — Schema-migration example fixtures archived -The three schema-migration example YAMLs under `examples/` exist solely as back-compat coverage fixtures, not as current user-facing examples. Per m-E19-01's supported-surfaces matrix (row for schema-migration compatibility examples), their decision is `archive`. +The three schema-migration example YAMLs under `examples/` exist solely as back-compat coverage fixtures, not as current user-facing examples. Per M-024's supported-surfaces matrix (row for schema-migration compatibility examples), their decision is `archive`. 
**Move (preserve git history via `git mv`):** @@ -124,9 +155,9 @@ The three schema-migration example YAMLs under `examples/` exist solely as back- **Grep guard:** No path `examples/test-old-schema.yaml`, `examples/test-no-schema.yaml`, or `examples/test-new-schema.yaml` remains referenced anywhere in `src/`, `tests/`, or active `docs/` content. Matches under `examples/archive/` and `docs/archive/` are allowed. -### AC6 — Stale template-integration spec archived +### AC-6 — Stale template-integration spec archived -[docs/ui/template-integration-spec.md](../../../docs/ui/template-integration-spec.md) is a pre-v1 UI spec that references `/api/templates/{templateId}/schema` and `/api/templates/generate` routes (pre-v1 template surface), contains `binMinutes` references, and carries its own `⚠️ SCHEMA MIGRATION IN PROGRESS` warning. Per m-E19-01's matrix (`archive/update`), move it to the archive tree: +[docs/ui/template-integration-spec.md](../../../docs/ui/template-integration-spec.md) is a pre-v1 UI spec that references `/api/templates/{templateId}/schema` and `/api/templates/generate` routes (pre-v1 template surface), contains `binMinutes` references, and carries its own `⚠️ SCHEMA MIGRATION IN PROGRESS` warning. Per M-024's matrix (`archive/update`), move it to the archive tree: **Move:** @@ -138,15 +169,15 @@ The three schema-migration example YAMLs under `examples/` exist solely as back- **Grep guard:** No active docs (outside `docs/archive/`) reference `docs/ui/template-integration-spec.md` or the pre-v1 routes `/api/templates/{id}/schema` or `/api/templates/generate`. -### AC7 — Catalog-stale phrasing in active docs updated +### AC-7 — Catalog-stale phrasing in active docs updated -m-E19-02 deleted all catalog routes, services, UI components, and DTOs per A5. Two active docs still carry leftover phrasing describing Sim as owning "template/catalog" endpoints. Rewrite the phrasing: +M-025 deleted all catalog routes, services, UI components, and DTOs per A5. 
Two active docs still carry leftover phrasing describing Sim as owning "template/catalog" endpoints. Rewrite the phrasing: **Rewrite:** - [docs/guides/UI.md:3](../../../docs/guides/UI.md) — drop `template/catalog calls` to `template calls` (the Sim API hosts template authoring, not catalogs). - [docs/reference/contracts.md:111](../../../docs/reference/contracts.md) — drop `template/catalog endpoints for model generation` to `template endpoints for model generation`. -- [docs/reference/engine-capabilities.md:30](../../../docs/reference/engine-capabilities.md) — rewrite `no catalog/export/import/registry endpoints` to drop `catalog/` for consistency with m-E19-02's catalog retirement. The statement becomes `no streaming endpoints; no export/import/registry endpoints` (or the closest natural phrasing). The line is factually true either way — this is a consistency edit, not a correction. +- [docs/reference/engine-capabilities.md:30](../../../docs/reference/engine-capabilities.md) — rewrite `no catalog/export/import/registry endpoints` to drop `catalog/` for consistency with M-025's catalog retirement. The statement becomes `no streaming endpoints; no export/import/registry endpoints` (or the closest natural phrasing). The line is factually true either way — this is a consistency edit, not a correction. **Explicitly leave alone (not in scope):** @@ -156,7 +187,7 @@ m-E19-02 deleted all catalog routes, services, UI components, and DTOs per A5. T **Grep guard:** No `template/catalog` literal remains in `docs/guides/UI.md` or `docs/reference/contracts.md`. No `catalog/export/import/registry` phrasing remains in `docs/reference/engine-capabilities.md`. 
-### AC8 — Test fixtures with stale parameter keys cleaned +### AC-8 — Test fixtures with stale parameter keys cleaned [tests/FlowTime.UI.Tests/ParameterConversionIntegrationTests.cs](../../../tests/FlowTime.UI.Tests/ParameterConversionIntegrationTests.cs) uses `["binMinutes"] = 60` as a template parameter key in three test dictionaries (lines 23, 51, 107). Active templates expose `binSize` (not `binMinutes`) — see [templates/transportation-basic.yaml:23](../../../templates/transportation-basic.yaml) — so the test key references a template parameter that does not exist. The test itself is about parameter type conversion (string arrays vs number arrays being serialized to `demandPattern` / `capacityPattern`) and does not assert anything about the grid parameter key, so a rename preserves semantic meaning. @@ -170,14 +201,14 @@ m-E19-02 deleted all catalog routes, services, UI components, and DTOs per A5. T - Any test that uses `binMinutes` as an internal local variable name (e.g. [tests/FlowTime.UI.Tests/TemplateServiceMetadataTests.cs](../../../tests/FlowTime.UI.Tests/TemplateServiceMetadataTests.cs)) — local naming, not schema. - Any test that asserts `binMinutes` is rejected by validators (e.g. [tests/FlowTime.Tests/Schema/TargetSchemaValidationTests.cs](../../../tests/FlowTime.Tests/Schema/TargetSchemaValidationTests.cs)) — legitimate invariant test. - Any test that asserts `binMinutes` does **not** appear in serialized JSON (e.g. [tests/FlowTime.UI.Tests/GridInfoSchemaTests.cs](../../../tests/FlowTime.UI.Tests/GridInfoSchemaTests.cs), `SimGridInfoSchemaTests.cs`) — legitimate invariant test. 
-- [tests/FlowTime.Tests/ApiIntegrationTests.cs:93](../../../tests/FlowTime.Tests/ApiIntegrationTests.cs), [tests/FlowTime.Api.Tests/Legacy/ApiIntegrationTests.cs:188](../../../tests/FlowTime.Api.Tests/Legacy/ApiIntegrationTests.cs) — `Grid { int binMinutes }` DTOs that deserialize the retained `MetricsGrid.BinMinutes` wire-format field from `POST /v1/run` and `POST /v1/graph`. Deferred per D-2026-04-08-029. +- [tests/FlowTime.Tests/ApiIntegrationTests.cs:93](../../../tests/FlowTime.Tests/ApiIntegrationTests.cs), [tests/FlowTime.Api.Tests/Legacy/ApiIntegrationTests.cs:188](../../../tests/FlowTime.Api.Tests/Legacy/ApiIntegrationTests.cs) — `Grid { int binMinutes }` DTOs that deserialize the retained `MetricsGrid.BinMinutes` wire-format field from `POST /v1/run` and `POST /v1/graph`. Deferred per D-042. - [tests/FlowTime.Api.Tests/StateEndpointTests.cs](../../../tests/FlowTime.Api.Tests/StateEndpointTests.cs), [tests/FlowTime.Api.Tests/StateResponseSchemaTests.cs](../../../tests/FlowTime.Api.Tests/StateResponseSchemaTests.cs) — state query tests passing the current request-shape including the retained `binMinutes` field. Active contract. - [tests/FlowTime.Api.Tests/Golden/metrics-run_metrics_fixture.json](../../../tests/FlowTime.Api.Tests/Golden/metrics-run_metrics_fixture.json) — golden fixture for the retained `MetricsGrid` response shape. - [tests/FlowTime.Core.Tests/Safety/NaNPolicyTests.cs](../../../tests/FlowTime.Core.Tests/Safety/NaNPolicyTests.cs) — internal `ComputeLatencyMinutes(binMinutes: …)` helper parameter name, not a schema reference. **Grep guard:** No `["binMinutes"]` dictionary-key literal remains in `tests/FlowTime.UI.Tests/ParameterConversionIntegrationTests.cs`. -### AC9 — Grep-guard script codified +### AC-9 — Grep-guard script codified Create `scripts/m-E19-03-grep-guards.sh` mirroring the structure of `scripts/m-E19-02-grep-guards.sh`. Every guard listed in AC1–AC8 becomes a line in the script. The script must exit 0 when all guards pass. 
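AC-9 asks for a script in which every guard is one line and the whole run exits 0 only when all guards pass. The following is a minimal sketch of that shape, not the repository's actual `scripts/m-E19-03-grep-guards.sh`. The helper names `search`, `guard`, and `run_guards` and the `--run` switch are hypothetical, and the `grep` fallback is an assumption added so the sketch also runs where `rg` is not installed. The guard strings, paths, and allowlist marker are copied from AC-1 through AC-4, AC-7, and AC-8; the path guards for AC-5 and AC-6 are left out for brevity.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a grep-guard script. It is NOT the repository's
# scripts/m-E19-03-grep-guards.sh; helper names and layout are assumptions.
set -u

# Lines carrying this marker are allowlisted (marker text taken from AC-4).
ALLOW_MARKER='m-E19-03:allow-binminutes-notation'
failures=0

# search NEEDLE PATH...: print file:line:text for every fixed-string match.
# Prefers rg; falls back to grep (an assumption, for machines without rg).
search() {
  local needle="$1"
  shift
  if command -v rg >/dev/null 2>&1; then
    rg --fixed-strings --line-number --with-filename --no-heading -- "$needle" "$@" 2>/dev/null
  else
    grep -rnHF -- "$needle" "$@" 2>/dev/null
  fi
}

# guard LABEL NEEDLE PATH...: record a failure if NEEDLE appears under any
# PATH on a line that does not carry the allowlist marker.
guard() {
  local label="$1" needle="$2" hits
  shift 2
  hits=$(search "$needle" "$@" | grep -vF -- "$ALLOW_MARKER" || true)
  if [ -n "$hits" ]; then
    printf 'FAIL %s\n%s\n' "$label" "$hits"
    failures=$((failures + 1))
  else
    printf 'ok   %s\n' "$label"
  fi
}

# One line per guard; returns non-zero if any guard failed.
run_guards() {
  guard 'AC-1 demo template generators' 'binMinutes' src/FlowTime.UI/Services/TemplateServiceImplementations.cs
  guard 'AC-2 UI sample fixture' 'binMinutes' src/FlowTime.UI/wwwroot
  guard 'AC-3 CLI verbose label' 'binMinutes' src/FlowTime.Cli
  guard 'AC-4 architecture docs' 'binMinutes' docs/architecture/whitepaper.md docs/architecture/retry-modeling.md
  guard 'AC-7 template/catalog phrasing' 'template/catalog' docs/guides/UI.md docs/reference/contracts.md
  guard 'AC-8 stale parameter key' '["binMinutes"]' tests/FlowTime.UI.Tests/ParameterConversionIntegrationTests.cs
  [ "$failures" -eq 0 ]
}

# Run only when asked, so the helpers can also be sourced and exercised alone.
if [ "${1:-}" = "--run" ]; then
  run_guards
  exit $?
fi
```

Run from the repository root with `--run`, the sketch prints one `ok` or `FAIL` line per guard and exits non-zero on any failure. Filtering on the marker rather than on line numbers follows the AC-4 rationale that line numbers drift. Note that a guard whose path does not exist passes silently in this sketch; a stricter version might treat a missing path as a failure.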
@@ -199,13 +230,13 @@ Scoped searches are limited to `src/`, `tests/`, `docs/`, `examples/`, and `temp The script runs locally and in the wrap pass. It is not wired into CI in this milestone — `scripts/m-E19-02-grep-guards.sh` remains the pattern, and CI wiring is deferred. -### AC10 — Tracking doc and status surfaces reconciled +### AC-10 — Tracking doc and status surfaces reconciled - Create `work/epics/E-19-surface-alignment-and-compatibility-cleanup/m-E19-03-schema-template-example-retirement-tracking.md` at milestone start and update it after each AC lands. Tracking doc records: per-AC file changes, grep-guard results, test counts, and deviations from the spec (if any). - Flip milestone status in a single reconciliation pass at wrap time: - This spec: `draft` → `in-progress` at start → `completed` at wrap. - [work/epics/E-19-surface-alignment-and-compatibility-cleanup/spec.md](./spec.md) milestone table: `m-E19-03` status `next` → `in-progress` → `completed`; header `Status:` line updated; `## Milestones` sequence note updated to point at `m-E19-04`. - - [ROADMAP.md](../../../ROADMAP.md) E-19 section: sync m-E19-03 completion and name `m-E19-04` as next. + - [ROADMAP.md](../../../ROADMAP.md) E-19 section: sync M-026 completion and name `m-E19-04` as next. - [work/epics/epic-roadmap.md](../epic-roadmap.md) E-19 row: same sync. - [CLAUDE.md](../../../CLAUDE.md) Current Work section: sync E-19 topology and next-step pointer. - All status-surface updates happen in a single wrap commit after the grep guards pass. @@ -222,7 +253,7 @@ ACs are grouped into five focused commits plus the wrap. Each bundle is a single 4. **Grep guard script (AC9).** Its own commit. The script must pass against the tree from commits 1–3, proving the cleanup is complete before the wrap. 5. **Wrap (AC10).** Tracking doc finalization and status-surface reconciliation in a single commit after the grep guards pass. -If any bundle surfaces a complication at implementation time (e.g. 
inbound reference to an archived file requires a cross-bundle edit), stop and present options before widening or splitting the bundle, the way m-E19-02 handled the AC6 scope narrowing. +If any bundle surfaces a complication at implementation time (e.g. inbound reference to an archived file requires a cross-bundle edit), stop and present options before widening or splitting the bundle, the way M-025 handled the AC6 scope narrowing. ### Implementation notes @@ -259,9 +290,9 @@ Explicit list of surfaces that must remain untouched by this milestone. Any acci - Rewriting the Little's Law formula in `whitepaper.md:77`. Math notation for the live derived concept stays. - Rewriting historical review docs under `docs/architecture/reviews/`. - Rewriting authoritative migration docs under `docs/schemas/model.schema.md` and `docs/schemas/model.schema.yaml`. -- Retiring Blazor demo mode itself. That is m-E19-04 territory. -- Deleting demo-mode `TemplateServiceImplementations.cs` wholesale. m-E19-03 narrows, m-E19-04 decides demo-mode policy. -- `POST /v1/run` and `POST /v1/graph` and their test fixtures — deferred per D-2026-04-08-029. +- Retiring Blazor demo mode itself. That is M-027 territory. +- Deleting demo-mode `TemplateServiceImplementations.cs` wholesale. M-026 narrows, M-027 decides demo-mode policy. +- `POST /v1/run` and `POST /v1/graph` and their test fixtures — deferred per D-042. - Removing the `binMinutes` rejection gate in `ModelValidator` or its covering test. The gate is load-bearing. - Introducing or documenting `FlowTime.TimeMachine`. That is E-18 m-E18-01a. - New template files, new examples, or new demo YAML generators. @@ -275,23 +306,23 @@ Explicit list of surfaces that must remain untouched by this milestone. Any acci - **DO NOT** rewrite the Little's Law formula on `whitepaper.md:77`. The mathematical notation is not a schema reference. - **DO NOT** delete `examples/test-old-schema.yaml`, `test-no-schema.yaml`, or `test-new-schema.yaml`. 
They are archived, not deleted — schema-transition coverage is still useful history. - **DO NOT** archive or delete `docs/schemas/model.schema.md` or `docs/schemas/model.schema.yaml`. These are authoritative migration docs. -- **DO NOT** retire Blazor demo mode or `CatalogService`-equivalent residue not already covered by m-E19-02. Demo-mode policy is m-E19-04. +- **DO NOT** retire Blazor demo mode or `CatalogService`-equivalent residue not already covered by M-025. Demo-mode policy is M-027. - **DO NOT** introduce compatibility shims, `binMinutes`-to-`binSize`/`binUnit` converters, or new helper utilities. Rewrite YAML examples in place; they are static content. -- **DO NOT** add advisory comments pointing at m-E19-04 or at deleted surfaces. Forward-only. +- **DO NOT** add advisory comments pointing at M-027 or at deleted surfaces. Forward-only. - **DO NOT** leave partially archived directories behind. If a moved file has inbound references, update the references in the same commit (or file a grep-guard failure for the wrap pass to catch). - **DO NOT** widen the milestone scope to include runtime endpoint changes, Contracts-level refactors, or cross-project deletions. Those are other milestones. - **DO NOT** commit before explicit human approval per the repo's Hard Rules. ## Dependencies -- [m-E19-01 Supported Surface Inventory, Boundary ADR & Exit Criteria](./m-E19-01-supported-surface-inventory.md) — supplies the retention/archive decisions and grep-guard taxonomy this milestone executes. -- [m-E19-02 Sim Authoring & Runtime Boundary Cleanup](./m-E19-02-sim-authoring-and-runtime-boundary-cleanup.md) — already removed the runtime seams (catalogs, drafts CRUD, bundle import) whose residue AC7 finishes cleaning up in the docs layer. +- [M-024 Supported Surface Inventory, Boundary ADR & Exit Criteria](./M-024.md) — supplies the retention/archive decisions and grep-guard taxonomy this milestone executes. 
+- [M-025 Sim Authoring & Runtime Boundary Cleanup](./M-025.md) — already removed the runtime seams (catalogs, drafts CRUD, bundle import) whose residue AC7 finishes cleaning up in the docs layer. - [docs/architecture/supported-surfaces.md](../../../docs/architecture/supported-surfaces.md) — authoritative row-by-row ownership. ## References - [E-19 epic spec](./spec.md) -- [m-E19-01 spec](./m-E19-01-supported-surface-inventory.md) -- [m-E19-02 spec](./m-E19-02-sim-authoring-and-runtime-boundary-cleanup.md) -- [work/decisions.md](../../decisions.md) — D-2026-04-07-022 (shared framing), D-2026-04-07-027 (catalogs retired), D-2026-04-08-029 (deferred `/v1/run` `/v1/graph`) -- [scripts/m-E19-02-grep-guards.sh](../../../scripts/m-E19-02-grep-guards.sh) — template for the m-E19-03 grep-guard script +- [M-024 spec](./M-024.md) +- [M-025 spec](./M-025.md) +- [work/decisions.md](../../decisions.md) — D-035 (shared framing), D-040 (catalogs retired), D-042 (deferred `/v1/run` `/v1/graph`) +- [scripts/M-025.sh](../../../scripts/M-025.sh) — template for the M-026 grep-guard script diff --git a/work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/m-E19-04-blazor-support-alignment.md b/work/epics/E-19-surface-alignment-compatibility-cleanup/M-027-blazor-support-alignment.md similarity index 85% rename from work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/m-E19-04-blazor-support-alignment.md rename to work/epics/E-19-surface-alignment-compatibility-cleanup/M-027-blazor-support-alignment.md index b9e71d5d..c845131d 100644 --- a/work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/m-E19-04-blazor-support-alignment.md +++ b/work/epics/E-19-surface-alignment-compatibility-cleanup/M-027-blazor-support-alignment.md @@ -1,9 +1,40 @@ -# Milestone: Blazor Support Alignment - -**ID:** m-E19-04-blazor-support-alignment -**Epic:** [Surface Alignment & Compatibility Cleanup (E-19)](./spec.md) -**Status:** completed (2026-04-08) 
-**Branch:** `milestone/m-E19-04-blazor-support-alignment` (off `epic/E-19`) +--- +id: M-027 +title: Blazor Support Alignment +status: done +parent: E-19 +acs: + - id: AC-1 + title: Stale RunAsync wrapper deleted (row 64) + status: met + - id: AC-2 + title: Stale GetIndexAsync and GetSeriesAsync wrappers deleted (row 65) + status: met + - id: AC-3 + title: FlowTimeSimService API-mode data generation rewired to orchestration + status: met + - id: AC-4 + title: SimResultsService run queries go through the Engine API only + status: met + - id: AC-5 + title: Dead Sim run-query URL construction removed from + status: met + - id: AC-6 + title: Supported Blazor Sim client surface confirmed aligned (row 63) + status: met + - id: AC-7 + title: Svelte client surfaces confirmed aligned (rows 66, 67) + status: met + - id: AC-8 + title: Grep-guard script codified + status: met + - id: AC-9 + title: Build, tests, and grep guards green + status: met + - id: AC-10 + title: Tracking doc and status surfaces reconciled + status: met +--- ## Goal @@ -11,20 +42,20 @@ Remove stale `FlowTime.UI` Sim-client compatibility wrappers and the broken call ## Context -[m-E19-01](./m-E19-01-supported-surface-inventory.md) published the supported-surfaces matrix in [docs/architecture/supported-surfaces.md](../../../docs/architecture/supported-surfaces.md) and assigned the Blazor HTTP call-site rows (63–65) and the Svelte alignment rows (66–67) to this milestone. [m-E19-02](./m-E19-02-sim-authoring-and-runtime-boundary-cleanup.md) deleted the Sim runtime seams those wrappers would have depended on (stored drafts CRUD, Sim ZIP archive layer, Engine bundle-import, runtime catalogs, `/api/v1/drafts/validate`, `GET /v1/debug/scan-directory`) and narrowed `/api/v1/drafts/run` to inline-only. [m-E19-03](./m-E19-03-schema-template-example-retirement.md) retired deprecated schema, template, and example residue from active surfaces. 
+[M-024](./M-024.md) published the supported-surfaces matrix in [docs/architecture/supported-surfaces.md](../../../docs/architecture/supported-surfaces.md) and assigned the Blazor HTTP call-site rows (63–65) and the Svelte alignment rows (66–67) to this milestone. [M-025](./M-025.md) deleted the Sim runtime seams those wrappers would have depended on (stored drafts CRUD, Sim ZIP archive layer, Engine bundle-import, runtime catalogs, `/api/v1/drafts/validate`, `GET /v1/debug/scan-directory`) and narrowed `/api/v1/drafts/run` to inline-only. [M-026](./M-026.md) retired deprecated schema, template, and example residue from active surfaces. This milestone is the Blazor client-layer cleanup pass over that cleaned-up baseline. Every row whose `Owning milestone` column in the matrix is `m-E19-04` is executed here. -Scope boundaries inherited from the epic and m-E19-01: +Scope boundaries inherited from the epic and M-024: - Blazor is not retired. Blazor remains a supported first-party UI for debugging, operator workflows, and as plan-B to Svelte per the Blazor/Svelte support policy in [docs/architecture/supported-surfaces.md](../../../docs/architecture/supported-surfaces.md). - Feature parity between Blazor and Svelte is not a goal. Svelte is intentionally behind Blazor. - Demo mode stays. `FlowTimeSimService.RunDemoModeSimulationAsync` and the demo-data generators in `TemplateServiceImplementations.cs` are preserved as-is. - `FlowTime.Core`, `FlowTime.Generator`, `FlowTime.API`, and `FlowTime.Sim.*` are not renamed and their high-level responsibilities do not change. - Analytical surfaces purified by E-16 are out of scope. -- Engine and Sim runtime route deletions are not re-opened. m-E19-02 owns them. -- Schema/template/example/docs retirement is not re-opened. m-E19-03 owns it. -- `POST /v1/run` and `POST /v1/graph` remain deferred per [D-2026-04-08-029](../../decisions.md#d-2026-04-08-029-defer-post-v1run-and-post-v1graph-deletion-out-of-m-e19-02-ac6-scope-narrowing). 
`TemplateRunner.razor`'s engine-eval flow at line 744 consuming `IRunClient.RunAsync` (routed via `ApiRunClient` → `IFlowTimeApiClient.RunAsync` → Engine `POST /v1/run`) stays on that deferred surface — this milestone does not touch it. +- Engine and Sim runtime route deletions are not re-opened. M-025 owns them. +- Schema/template/example/docs retirement is not re-opened. M-026 owns it. +- `POST /v1/run` and `POST /v1/graph` remain deferred per [D-042](../../decisions.md#d-2026-04-08-029-defer-post-v1run-and-post-v1graph-deletion-out-of-m-e19-02-ac6-scope-narrowing). `TemplateRunner.razor`'s engine-eval flow at line 744 consuming `IRunClient.RunAsync` (routed via `ApiRunClient` → `IFlowTimeApiClient.RunAsync` → Engine `POST /v1/run`) stays on that deferred surface — this milestone does not touch it. - `FlowTimeSimApiClientWithFallback` and `PortDiscoveryService` are legitimate dev-environment port discovery, not a compatibility shim. Their pass-through methods for deleted interface members are removed in this milestone; their port-discovery bootstrap stays. The key distinction this milestone enforces: @@ -33,9 +64,9 @@ The key distinction this milestone enforces: - **Supported Sim client call** — methods backed by live Sim routes (`HealthAsync`, `GetDetailedHealthAsync`, `GetTemplatesAsync`, `GetTemplateAsync`, `GenerateModelAsync`, `CreateRunAsync`). These are the row 63 targets. Keep, audit for drift. - **Engine API client** (`IFlowTimeApiClient`) — the correct surface for run queries (index, series, state, metrics). Rewired callers use this for every run query previously routed at the stale Sim wrappers. -## Acceptance Criteria +## Acceptance criteria -### AC1 — Stale `RunAsync` wrapper deleted (row 64) +### AC-1 — Stale RunAsync wrapper deleted (row 64) `RunAsync(string yaml, ...)` on `IFlowTimeSimApiClient` targets `POST /api/v1/run` on the Sim service, which was removed on 2025-10-01 and does not return even when Sim is reachable. 
The file itself marks the method broken with a TODO comment. @@ -48,13 +79,12 @@ The key distinction this milestone enforces: **Preserve:** -- `ApiRunClient.RunAsync` at [src/FlowTime.UI/Services/ApiRunClient.cs:14](../../../src/FlowTime.UI/Services/ApiRunClient.cs) — routes through `IFlowTimeApiClient.RunAsync` → Engine `POST /v1/run`, which is deferred per D-2026-04-08-029. +- `ApiRunClient.RunAsync` at [src/FlowTime.UI/Services/ApiRunClient.cs:14](../../../src/FlowTime.UI/Services/ApiRunClient.cs) — routes through `IFlowTimeApiClient.RunAsync` → Engine `POST /v1/run`, which is deferred per D-042. - `RunClientRouter.RunAsync` at [src/FlowTime.UI/Services/RunClientRouter.cs:24](../../../src/FlowTime.UI/Services/RunClientRouter.cs) and `SimulationRunClient.RunAsync` — these are `IRunClient` members, not `IFlowTimeSimApiClient` members, and feed the deferred Engine direct-eval path. - `FlowTimeSimApiClient.CreateRunAsync` (row 63) — the supported Sim orchestration wrapper. **Grep guard:** No declaration or use of `IFlowTimeSimApiClient.RunAsync` remains in `src/FlowTime.UI/` or `tests/FlowTime.UI.Tests/`. The literal `api/v1/run` (i.e. the Sim `/api/v1/run` path, as distinct from Sim `/api/v1/runs/...` query routes which also do not exist and are covered by AC2) must not appear anywhere in `src/FlowTime.UI/Services/FlowTimeSimApiClient.cs` or `src/FlowTime.UI/Services/FlowTimeSimApiClientWithFallback.cs`. - -### AC2 — Stale `GetIndexAsync` and `GetSeriesAsync` wrappers deleted (row 65) +### AC-2 — Stale GetIndexAsync and GetSeriesAsync wrappers deleted (row 65) `GetIndexAsync` and `GetSeriesAsync` on `IFlowTimeSimApiClient` target `GET /api/v1/runs/{runId}/index` and `GET /api/v1/runs/{runId}/series/{seriesId}` on the Sim service. Neither route exists on Sim today and both files are marked broken with TODO comments pointing at Engine API as the correct target. 
@@ -71,8 +101,7 @@ The key distinction this milestone enforces: - `SeriesIndex` type itself — still consumed by the Engine client return type. **Grep guard:** No declaration or use of `IFlowTimeSimApiClient.GetIndexAsync` or `IFlowTimeSimApiClient.GetSeriesAsync` remains in `src/FlowTime.UI/` or `tests/FlowTime.UI.Tests/`. No literal `api/v1/runs/{` followed by `/index` or `/series/` constructed against a Sim base address remains in `src/FlowTime.UI/Services/FlowTimeSimApiClient.cs` or `src/FlowTime.UI/Services/FlowTimeSimApiClientWithFallback.cs`. - -### AC3 — `FlowTimeSimService` API-mode data generation rewired to orchestration +### AC-3 — FlowTimeSimService API-mode data generation rewired to orchestration `FlowTimeSimService.RunApiModeSimulationAsync` at [src/FlowTime.UI/Services/TemplateServiceImplementations.cs:951-1008](../../../src/FlowTime.UI/Services/TemplateServiceImplementations.cs) currently: @@ -95,8 +124,7 @@ The supported replacement path is `CreateRunAsync` (row 63) on the Sim orchestra - `FlowTimeSimService.RunSimulationAsync`'s outer demo-vs-api branch — only the API-mode branch body changes. **Grep guard:** No `simClient.RunAsync(` or `simClient.GetIndexAsync(` call site remains in `src/FlowTime.UI/Services/TemplateServiceImplementations.cs`. No `/sim/runs/` URL literal remains anywhere in `src/FlowTime.UI/`. - -### AC4 — `SimResultsService` run queries go through the Engine API only +### AC-4 — SimResultsService run queries go through the Engine API only `SimResultsService.GetSimulationResultsAsync` at [src/FlowTime.UI/Services/SimResultsService.cs:38-124](../../../src/FlowTime.UI/Services/SimResultsService.cs) currently branches on `isEngineRun` (a `runId.StartsWith("run_")` check) and calls either `apiClient.GetRunIndexAsync`/`GetRunSeriesAsync` (for engine runs) or the stale `simClient.GetIndexAsync`/`GetSeriesAsync` (for non-engine runs). 
After AC3 rewires data generation onto `CreateRunAsync`, every API-mode run produces a canonical Engine-format runId, so the branch is dead. @@ -109,11 +137,10 @@ The supported replacement path is `CreateRunAsync` (row 63) on the Sim orchestra **Preserve:** - `GetDemoModeResultsAsync` and its synthetic-data generators. -- `SimResultData` result type including the `BinMinutes` computed display property (preserved per m-E19-03 spec). +- `SimResultData` result type including the `BinMinutes` computed display property (preserved per M-026 spec). **Grep guard:** No `simClient.GetIndexAsync(` or `simClient.GetSeriesAsync(` call site remains in `src/FlowTime.UI/Services/SimResultsService.cs`. `IFlowTimeSimApiClient` must no longer appear in the `SimResultsService` constructor signature. - -### AC5 — Dead Sim run-query URL construction removed from `SimulationResults.razor` +### AC-5 — Dead Sim run-query URL construction removed from [src/FlowTime.UI/Components/Templates/SimulationResults.razor:295-312](../../../src/FlowTime.UI/Components/Templates/SimulationResults.razor) constructs a download URL conditional on demo vs API mode: @@ -132,8 +159,7 @@ After AC3 rewires API-mode data generation to produce canonical Engine run IDs, - The mode-mismatch warning logic at [src/FlowTime.UI/Components/Templates/SimulationResults.razor:280-293](../../../src/FlowTime.UI/Components/Templates/SimulationResults.razor). That UX guidance still applies. **Grep guard:** No `/sim/runs/` literal remains in `src/FlowTime.UI/Components/`. No `{apiConfig.ApiVersion}/runs/{runId}/series/` literal that does not match the canonical Engine route shape remains in `src/FlowTime.UI/Components/Templates/SimulationResults.razor`. 
- -### AC6 — Supported Blazor Sim client surface confirmed aligned (row 63) +### AC-6 — Supported Blazor Sim client surface confirmed aligned (row 63) Row 63 of the supported-surfaces matrix lists `HealthAsync`, `GetDetailedHealthAsync`, `GetTemplatesAsync`, `GetTemplateAsync`, `GenerateModelAsync`, `CreateRunAsync` as the supported Blazor Sim client surface. After AC1 and AC2 complete, those are the only methods remaining on `IFlowTimeSimApiClient`. @@ -147,9 +173,9 @@ Row 63 of the supported-surfaces matrix lists `HealthAsync`, `GetDetailedHealthA **Grep guard:** `IFlowTimeSimApiClient` at [src/FlowTime.UI/Services/FlowTimeSimApiClient.cs](../../../src/FlowTime.UI/Services/FlowTimeSimApiClient.cs) exposes exactly the row 63 supported set after this milestone. No method names outside `{BaseAddress, HealthAsync, GetDetailedHealthAsync, GetTemplatesAsync, GetTemplateAsync, GenerateModelAsync, CreateRunAsync}` remain on the interface. -### AC7 — Svelte client surfaces confirmed aligned (rows 66, 67) +### AC-7 — Svelte client surfaces confirmed aligned (rows 66, 67) -Rows 66 and 67 of the matrix list the Svelte `Sim` client at [ui/src/lib/api/sim.ts](../../../ui/src/lib/api/sim.ts) and Engine client at [ui/src/lib/api/flowtime.ts](../../../ui/src/lib/api/flowtime.ts) as supported first-party surfaces. m-E19-04 is the owning milestone for their alignment audit. +Rows 66 and 67 of the matrix list the Svelte `Sim` client at [ui/src/lib/api/sim.ts](../../../ui/src/lib/api/sim.ts) and Engine client at [ui/src/lib/api/flowtime.ts](../../../ui/src/lib/api/flowtime.ts) as supported first-party surfaces. M-027 is the owning milestone for their alignment audit. **Audit:** @@ -161,7 +187,7 @@ Rows 66 and 67 of the matrix list the Svelte `Sim` client at [ui/src/lib/api/sim **Grep guard:** `ui/src/lib/api/sim.ts` must not contain literals matching `catalogs`, `drafts`, `bundle`, `bundlePath`, `bundleArchiveBase64`, or `bundleRef`. 
`ui/src/lib/api/flowtime.ts` must not contain literals matching `POST /v1/runs`, `bundlePath`, `bundleArchiveBase64`, `bundleRef`, or `/v1/debug/`. -### AC8 — Grep-guard script codified +### AC-8 — Grep-guard script codified Create `scripts/m-E19-04-grep-guards.sh` mirroring the structure of `scripts/m-E19-03-grep-guards.sh`. Every guard listed in AC1–AC7 becomes a named test in the script. The script must exit 0 when all guards pass. @@ -181,20 +207,20 @@ Create `scripts/m-E19-04-grep-guards.sh` mirroring the structure of `scripts/m-E Scoped searches are limited to `src/FlowTime.UI/`, `ui/src/lib/api/`, and `tests/FlowTime.UI.Tests/` by default. The script runs locally and in the wrap pass. CI wiring stays deferred, matching the pattern in `scripts/m-E19-02-grep-guards.sh` and `scripts/m-E19-03-grep-guards.sh`. -### AC9 — Build, tests, and grep guards green +### AC-9 — Build, tests, and grep guards green - `dotnet build FlowTime.sln` is green with no new warnings introduced by this milestone. - `dotnet test FlowTime.sln` is green across all test projects. Test deletions for deleted code are acceptable; failing tests or reduced coverage for surviving code are not. In particular, `tests/FlowTime.UI.Tests/TimeTravelDataServiceTests.cs`, `DashboardTests.cs`, and `ArtifactListRenderTests.cs` define mock implementations of an `IFlowTimeApiClient`-like interface whose method names happen to include `RunAsync` and `GetSeriesAsync` — those are Engine-client members, not Sim-client members, and must remain untouched. Only the `IFlowTimeSimApiClient` declarations and implementations are in scope. - The Svelte `ui/` project's existing `npm`/`pnpm` build (if wired) is green after the alignment audit. - `scripts/m-E19-04-grep-guards.sh` exits 0 from the repo root. 
-### AC10 — Tracking doc and status surfaces reconciled +### AC-10 — Tracking doc and status surfaces reconciled - Create `work/epics/E-19-surface-alignment-and-compatibility-cleanup/m-E19-04-blazor-support-alignment-tracking.md` at milestone start and update it after each AC lands. Tracking doc records: per-AC file changes, grep-guard results, test counts, alignment-audit findings (drift or no drift), and deviations from the spec (if any). - Flip milestone status in a single reconciliation pass at wrap time: - This spec: `draft` → `in-progress` at start → `completed` at wrap. - - [work/epics/E-19-surface-alignment-and-compatibility-cleanup/spec.md](./spec.md) milestone table: `m-E19-04` status `next` → `in-progress` → `completed`; header `Status:` line updated; epic `Success Criteria` checkboxes for "first-party clients no longer maintain duplicate endpoint, metrics, or health fallback logic" and "grep and regression audits prove targeted legacy/fallback helpers are removed or isolated" flipped to checked if m-E19-04 closes them. - - [ROADMAP.md](../../../ROADMAP.md) E-19 section: sync m-E19-04 completion. If m-E19-04 is the final E-19 milestone before epic closure, advance the E-19 section to completed and name the next epic/milestone. + - [work/epics/E-19-surface-alignment-and-compatibility-cleanup/spec.md](./spec.md) milestone table: `m-E19-04` status `next` → `in-progress` → `completed`; header `Status:` line updated; epic `Success Criteria` checkboxes for "first-party clients no longer maintain duplicate endpoint, metrics, or health fallback logic" and "grep and regression audits prove targeted legacy/fallback helpers are removed or isolated" flipped to checked if M-027 closes them. + - [ROADMAP.md](../../../ROADMAP.md) E-19 section: sync M-027 completion. If M-027 is the final E-19 milestone before epic closure, advance the E-19 section to completed and name the next epic/milestone. - [work/epics/epic-roadmap.md](../epic-roadmap.md) E-19 row: same sync. 
- [CLAUDE.md](../../../CLAUDE.md) Current Work section: sync E-19 topology and next-step pointer. - All status-surface updates happen in a single wrap commit after the grep guards pass. @@ -210,7 +236,7 @@ ACs are grouped into four focused commits plus the wrap. Each bundle is a single 3. **Bundle C — grep guard script (AC8).** Its own commit. The script must pass against the tree from Bundle A (and Bundle B if it produced changes), proving the cleanup is complete before the wrap. 4. **Wrap (AC9 + AC10).** Tracking doc finalization and status-surface reconciliation in a single commit after the grep guards and build/test pass. -If any bundle surfaces a complication at implementation time (e.g. Bundle A discovers a deeper caller chain that cannot be rewired cleanly, or AC5's `SimulationResults.razor` demo-mode download decision needs human input), stop and present options before widening or splitting the bundle, the way m-E19-02 handled the AC6 scope narrowing and m-E19-03 handled the Little's Law allowlist marker. +If any bundle surfaces a complication at implementation time (e.g. Bundle A discovers a deeper caller chain that cannot be rewired cleanly, or AC5's `SimulationResults.razor` demo-mode download decision needs human input), stop and present options before widening or splitting the bundle, the way M-025 handled the AC6 scope narrowing and M-026 handled the Little's Law allowlist marker. ### Recommended implementation sequence within Bundle A @@ -231,7 +257,7 @@ Each step should leave the build green and the test suite passing before the nex - `SimResultsService` result loading: [src/FlowTime.UI/Services/SimResultsService.cs:38-124](../../../src/FlowTime.UI/Services/SimResultsService.cs). - Download URL construction: [src/FlowTime.UI/Components/Templates/SimulationResults.razor:295-315](../../../src/FlowTime.UI/Components/Templates/SimulationResults.razor). 
- Blazor `TemplateRunner.razor` data-generation caller: [src/FlowTime.UI/Pages/TemplateRunner.razor:831](../../../src/FlowTime.UI/Pages/TemplateRunner.razor). -- Blazor `TemplateRunner.razor` engine-eval caller (preserved, deferred per D-2026-04-08-029): [src/FlowTime.UI/Pages/TemplateRunner.razor:744](../../../src/FlowTime.UI/Pages/TemplateRunner.razor). +- Blazor `TemplateRunner.razor` engine-eval caller (preserved, deferred per D-042): [src/FlowTime.UI/Pages/TemplateRunner.razor:744](../../../src/FlowTime.UI/Pages/TemplateRunner.razor). - Engine API client read methods used by the rewired callers: `IFlowTimeApiClient.GetRunIndexAsync` and `GetRunSeriesAsync` in [src/FlowTime.UI/Services/FlowTimeApiClient.cs](../../../src/FlowTime.UI/Services/FlowTimeApiClient.cs). - Svelte Sim client: [ui/src/lib/api/sim.ts](../../../ui/src/lib/api/sim.ts). - Svelte Engine client: [ui/src/lib/api/flowtime.ts](../../../ui/src/lib/api/flowtime.ts). @@ -242,7 +268,7 @@ Forward-only deletion and rewire, not migration: - Tests that exist only to exercise the deleted stale wrappers are deleted alongside the wrappers. - Tests that mock `IFlowTimeSimApiClient` for an unrelated scenario (e.g. template-metadata tests) must be updated to drop the deleted methods from the mock implementation. If the mock no longer compiles because the interface shrank, that is a desired forcing function — update the mock to the shrunken surface. -- Tests that mock `IFlowTimeApiClient` (Engine client) are out of scope. Method-name collisions with `RunAsync`/`GetSeriesAsync` on the Engine mocks are not stale-wrapper residue — Engine-side deletion is deferred per D-2026-04-08-029. +- Tests that mock `IFlowTimeApiClient` (Engine client) are out of scope. Method-name collisions with `RunAsync`/`GetSeriesAsync` on the Engine mocks are not stale-wrapper residue — Engine-side deletion is deferred per D-042. 
- No new unit tests are required by this milestone unless a rewire surfaces a regression that existing coverage did not catch. In that case, the regression test is added alongside the fix. - Grep guards (AC8) are the load-bearing regression check for this milestone. Every deleted symbol and every rewired caller path is asserted absent or present via a guard. @@ -251,16 +277,16 @@ Forward-only deletion and rewire, not migration: Explicit list of surfaces that must remain untouched by this milestone. Any accidental change to these surfaces is a milestone regression. - `IFlowTimeApiClient` and its Engine-facing implementations (`FlowTimeApiClient`, `ApiRunClient`) — they are the canonical Engine query surface. -- `IRunClient`, `RunClientRouter`, `SimulationRunClient`, `ApiRunClient` — these feed the deferred Engine direct-eval path per D-2026-04-08-029. Their `RunAsync` members are not stale wrappers. +- `IRunClient`, `RunClientRouter`, `SimulationRunClient`, `ApiRunClient` — these feed the deferred Engine direct-eval path per D-042. Their `RunAsync` members are not stale wrappers. - `FlowTimeSimApiClient.CreateRunAsync`, `HealthAsync`, `GetDetailedHealthAsync`, `GetTemplatesAsync`, `GetTemplateAsync`, `GenerateModelAsync` — row 63 supported surface. - `FlowTimeSimApiClientWithFallback` class, its `PortDiscoveryService` integration, and its bootstrap in `Program.cs` — legitimate dev-environment port discovery. Only the pass-through methods for deleted interface members are removed. - `PortDiscoveryService` and `FlowTimeSimApiOptions` — unchanged. - `FlowTimeSimService.RunSimulationAsync` outer demo-vs-api branch, `RunDemoModeSimulationAsync`, and every demo-data generator in `TemplateServiceImplementations.cs`. - `SimResultsService.GetDemoModeResultsAsync` and every demo-data generator. -- `SimResultData` result type including its `BinMinutes` computed display property (explicitly preserved by m-E19-03 as a display helper, not a schema field). 
+- `SimResultData` result type including its `BinMinutes` computed display property (explicitly preserved by M-026 as a display helper, not a schema field). - `TemplateRunner.razor` — flow analysis path at line 744 routing through `IRunClient.RunAsync` stays untouched. Only the data-generation path at line 831 is affected, and only transitively via `FlowTimeSimService` rewire. -- `Simulate.razor` — listed in m-E19-03 preserved surfaces; any incidental reads of `SimResultsService` continue to work after the rewire. -- `TemplateServiceImplementations.TemplateService` (the template metadata class distinct from `FlowTimeSimService`) — template authoring is out of scope here; m-E19-03 already retired its `binMinutes` demo residue. +- `Simulate.razor` — listed in M-026 preserved surfaces; any incidental reads of `SimResultsService` continue to work after the rewire. +- `TemplateServiceImplementations.TemplateService` (the template metadata class distinct from `FlowTimeSimService`) — template authoring is out of scope here; M-026 already retired its `binMinutes` demo residue. - `ui/src/lib/api/sim.ts` and `ui/src/lib/api/flowtime.ts` — no code change expected from the alignment audit unless drift is found. - `IFlowTimeSimApiClient` methods that are NOT in the stale-wrapper set: only AC1 and AC2 targets are removed. `CreateRunAsync` in particular must be preserved and is the replacement for the deleted `RunAsync`. @@ -270,16 +296,16 @@ Explicit list of surfaces that must remain untouched by this milestone. Any acci - Retiring Blazor demo mode. Demo mode is explicitly preserved. - Forcing feature parity between Blazor and Svelte. Feature parity is not a goal. - Adding new capability to either UI. Only alignment with current contracts is in scope. -- Deleting `POST /v1/run` or `POST /v1/graph` from Engine or their test consumers. Deferred per D-2026-04-08-029. +- Deleting `POST /v1/run` or `POST /v1/graph` from Engine or their test consumers. Deferred per D-042. 
- Touching the `IRunClient` / `ApiRunClient` / `RunClientRouter` / `SimulationRunClient` abstractions. Those feed the deferred Engine direct-eval path. - Touching `FlowTime.Core`, `FlowTime.Generator`, `FlowTime.API`, `FlowTime.Sim.*`, or any non-UI project. -- Re-opening schema, template, example, or docs retirement. m-E19-03 owns those and is complete. -- Re-opening Sim runtime route deletion. m-E19-02 owns those and is complete. +- Re-opening schema, template, example, or docs retirement. M-026 owns those and is complete. +- Re-opening Sim runtime route deletion. M-025 owns those and is complete. - Introducing or referencing `FlowTime.TimeMachine`. That component is new in E-18 m-E18-01a and does not exist yet. - Reintroducing any deleted Sim route via a Blazor-side compatibility shim. - Refactoring `FlowTimeSimService`, `SimResultsService`, or `SimulationResults.razor` beyond what the deletions and rewires require. The commit bundle stays scoped to the stale-wrapper cleanup ripple. - Performance, observability, or error-handling improvements unrelated to deletion and rewire. -- CI wiring for `scripts/m-E19-04-grep-guards.sh`. The script exists and runs locally; CI integration is deferred matching the pattern in m-E19-02 and m-E19-03. +- CI wiring for `scripts/m-E19-04-grep-guards.sh`. The script exists and runs locally; CI integration is deferred matching the pattern in M-025 and M-026. - Updating release notes, completed-epic specs, or other historical material under `docs/releases/`, `docs/archive/`, or `work/epics/completed/`. ## Guards / DO NOT @@ -298,16 +324,16 @@ Explicit list of surfaces that must remain untouched by this milestone. Any acci ## Dependencies -- [m-E19-01 Supported Surface Inventory, Boundary ADR & Exit Criteria](./m-E19-01-supported-surface-inventory.md) — supplies matrix rows 63–67 and the Blazor/Svelte support policy this milestone executes. 
-- [m-E19-02 Sim Authoring & Runtime Boundary Cleanup](./m-E19-02-sim-authoring-and-runtime-boundary-cleanup.md) — already removed the Sim runtime routes these wrappers would have depended on, so the current state is "broken wrappers" not "unused-but-working wrappers." -- [m-E19-03 Schema, Template & Example Retirement](./m-E19-03-schema-template-example-retirement.md) — already retired the deprecated `binMinutes` authoring residue from `TemplateServiceImplementations.cs` demo generators and the UI sample fixture, so the only `TemplateServiceImplementations.cs` residue left is the stale Sim client caller chain this milestone rewires. +- [M-024 Supported Surface Inventory, Boundary ADR & Exit Criteria](./M-024.md) — supplies matrix rows 63–67 and the Blazor/Svelte support policy this milestone executes. +- [M-025 Sim Authoring & Runtime Boundary Cleanup](./M-025.md) — already removed the Sim runtime routes these wrappers would have depended on, so the current state is "broken wrappers" not "unused-but-working wrappers." +- [M-026 Schema, Template & Example Retirement](./M-026.md) — already retired the deprecated `binMinutes` authoring residue from `TemplateServiceImplementations.cs` demo generators and the UI sample fixture, so the only `TemplateServiceImplementations.cs` residue left is the stale Sim client caller chain this milestone rewires. - [docs/architecture/supported-surfaces.md](../../../docs/architecture/supported-surfaces.md) — authoritative row-by-row ownership. 
## References - [E-19 epic spec](./spec.md) -- [m-E19-01 spec](./m-E19-01-supported-surface-inventory.md) — see matrix rows 63–67 and the Blazor/Svelte Support Policy section -- [m-E19-02 spec](./m-E19-02-sim-authoring-and-runtime-boundary-cleanup.md) -- [m-E19-03 spec](./m-E19-03-schema-template-example-retirement.md) -- [work/decisions.md](../../decisions.md) — D-2026-04-08-029 (deferred `/v1/run` `/v1/graph`), Blazor/Svelte support policy decision -- [scripts/m-E19-03-grep-guards.sh](../../../scripts/m-E19-03-grep-guards.sh) — template for the m-E19-04 grep-guard script +- [M-024 spec](./M-024.md) — see matrix rows 63–67 and the Blazor/Svelte Support Policy section +- [M-025 spec](./M-025.md) +- [M-026 spec](./M-026.md) +- [work/decisions.md](../../decisions.md) — D-042 (deferred `/v1/run` `/v1/graph`), Blazor/Svelte support policy decision +- [scripts/M-026.sh](../../../scripts/M-026.sh) — template for the M-027 grep-guard script diff --git a/work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/spec.md b/work/epics/E-19-surface-alignment-compatibility-cleanup/epic.md similarity index 87% rename from work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/spec.md rename to work/epics/E-19-surface-alignment-compatibility-cleanup/epic.md index 348a263f..76ec59f5 100644 --- a/work/epics/completed/E-19-surface-alignment-and-compatibility-cleanup/spec.md +++ b/work/epics/E-19-surface-alignment-compatibility-cleanup/epic.md @@ -1,7 +1,8 @@ -# Epic: Surface Alignment & Compatibility Cleanup - -**ID:** E-19 -**Status:** active — all four milestones (m-E19-01 through m-E19-04) completed; epic→main merge pending +--- +id: E-19 +title: Surface Alignment & Compatibility Cleanup +status: done +--- ## Goal @@ -82,13 +83,13 @@ The current Sim orchestration path is therefore not the default path forward for - [x] One explicit supported compatibility matrix exists for Engine, Sim, Svelte UI, Blazor UI, docs, schemas, and examples - [x] One explicit 
terminology and ownership matrix exists for template, draft, model, run, bundle, and catalog surfaces -- [x] First-party clients no longer maintain duplicate endpoint, metrics, or health fallback logic where the canonical contract exists (m-E19-04: stale `IFlowTimeSimApiClient.RunAsync`/`GetIndexAsync`/`GetSeriesAsync` deleted and callers rewired to supported `CreateRunAsync` + Engine query surface) +- [x] First-party clients no longer maintain duplicate endpoint, metrics, or health fallback logic where the canonical contract exists (M-027: stale `IFlowTimeSimApiClient.RunAsync`/`GetIndexAsync`/`GetSeriesAsync` deleted and callers rewired to supported `CreateRunAsync` + Engine query surface) - [x] Current Sim orchestration/storage/catalog residue is either explicitly supported with scope boundaries or removed from active first-party paths -- [x] Active UI/template surfaces no longer generate or promote deprecated schema shapes such as `binMinutes`-based demo templates (m-E19-03: Blazor mock template service, sample fixture, CLI verbose label, and UI test keys all rewritten to `binSize`/`binUnit`) -- [x] Legacy examples, docs, and schema references are either archived/historical or deleted; current docs present one canonical surface (m-E19-03: schema-migration example YAMLs moved to `examples/archive/`; stale `docs/ui/template-integration-spec.md` moved to `docs/archive/ui/`; active docs cleaned) +- [x] Active UI/template surfaces no longer generate or promote deprecated schema shapes such as `binMinutes`-based demo templates (M-026: Blazor mock template service, sample fixture, CLI verbose label, and UI test keys all rewritten to `binSize`/`binUnit`) +- [x] Legacy examples, docs, and schema references are either archived/historical or deleted; current docs present one canonical surface (M-026: schema-migration example YAMLs moved to `examples/archive/`; stale `docs/ui/template-integration-spec.md` moved to `docs/archive/ui/`; active docs cleaned) - [x] Blazor UI 
support policy is explicit: it remains a supported debugging/plan-B surface and consumes current Engine/Sim contracts without stale compatibility wrappers - [x] E-18 planning remains clean: no current Sim draft/catalog/bundle choreography is treated as the default programmable/Time Machine contract -- [x] Grep and regression audits prove targeted legacy/fallback helpers are removed or isolated to historical/archive surfaces only (m-E19-02: 21 guards; m-E19-03: 11 guards; m-E19-04: 11 guards, all 43 passing) +- [x] Grep and regression audits prove targeted legacy/fallback helpers are removed or isolated to historical/archive surfaces only (M-025: 21 guards; M-026: 11 guards; M-027: 11 guards, all 43 passing) ## Risks & Open Questions @@ -111,10 +112,10 @@ This epic starts immediately after E-16 as a post-purification cleanup lane. It | ID | Title | Summary | Depends On | Status | |----|-------|---------|------------|--------| -| m-E19-01-supported-surface-inventory | Supported Surface Inventory, Boundary ADR & Exit Criteria | Inventory compatibility seams, define supported vs historical surfaces, publish the terminology/ownership ADR, and pin retention/deletion gates for drafts, bundles, catalogs, and import paths. | E-16 | completed | -| m-E19-02-sim-authoring-and-runtime-boundary-cleanup | Sim Authoring & Runtime Boundary Cleanup | Narrow Sim to explicitly supported authoring/orchestration surfaces, remove transitional catalog/runtime callers, and keep Engine import/query ownership explicit. | m-E19-01 | completed | -| m-E19-03-schema-template-example-retirement | Schema, Template & Example Retirement | Remove or archive deprecated schema, demo template, and example material from active current surfaces. 
| m-E19-01 | completed | -| m-E19-04-blazor-support-alignment | Blazor Support Alignment | Remove stale `FlowTime.UI` compatibility wrappers, keep Blazor aligned with current Engine/Sim contracts, and define clear supported debugging/operator workflows alongside the parallel Svelte UI. | m-E19-02, m-E19-03 | completed | +| M-024 | Supported Surface Inventory, Boundary ADR & Exit Criteria | Inventory compatibility seams, define supported vs historical surfaces, publish the terminology/ownership ADR, and pin retention/deletion gates for drafts, bundles, catalogs, and import paths. | E-16 | completed | +| M-025 | Sim Authoring & Runtime Boundary Cleanup | Narrow Sim to explicitly supported authoring/orchestration surfaces, remove transitional catalog/runtime callers, and keep Engine import/query ownership explicit. | M-024 | completed | +| M-026 | Schema, Template & Example Retirement | Remove or archive deprecated schema, demo template, and example material from active current surfaces. | M-024 | completed | +| M-027 | Blazor Support Alignment | Remove stale `FlowTime.UI` compatibility wrappers, keep Blazor aligned with current Engine/Sim contracts, and define clear supported debugging/operator workflows alongside the parallel Svelte UI. 
| M-025, M-026 | completed | ## Candidate Retention / Decision Matrix diff --git a/work/epics/E-20-matrix-engine/M-028-scaffold-types-and-parsers.md b/work/epics/E-20-matrix-engine/M-028-scaffold-types-and-parsers.md new file mode 100644 index 00000000..2b62609a --- /dev/null +++ b/work/epics/E-20-matrix-engine/M-028-scaffold-types-and-parsers.md @@ -0,0 +1,115 @@ +--- +id: M-028 +title: Scaffold, Types, and Parsers +status: done +parent: E-20 +acs: + - id: AC-1 + title: 'AC-1: Rust workspace and crate structure' + status: met + - id: AC-2 + title: 'AC-2: Devcontainer Rust toolchain' + status: met + - id: AC-3 + title: 'AC-3: Model types with serde deserialization' + status: met + - id: AC-4 + title: 'AC-4: All existing model fixtures deserialize' + status: met + - id: AC-5 + title: 'AC-5: Expression parser' + status: met + - id: AC-6 + title: 'AC-6: Expression parser parity' + status: met + - id: AC-7 + title: 'AC-7: Reference model fixtures extracted' + status: met +--- + +## Goal + +Stand up the Rust project, port all model types with YAML deserialization, port the expression parser, and extract reference model fixtures. After this milestone, the Rust crate can parse any FlowTime model YAML and any FlowTime expression — the complete data layer with no computation. + +## Context + +No Rust code exists in the repo. The devcontainer does not have Rust installed. The C# engine defines ~22 model types in `ModelParser.cs` and a recursive-descent expression parser in `FlowTime.Expressions/`. Existing YAML model files in `examples/` and `fixtures/` serve as integration test fixtures. + +## Acceptance criteria + +### AC-1 — AC-1: Rust workspace and crate structure + +**AC-1: Rust workspace and crate structure.** A Rust workspace at `engine/` (repo root) with two crates: +- `engine/core/` — library crate (`flowtime-core`) containing model types, expression parser, and (future) compiler/evaluator. 
+- `engine/cli/` — binary crate (`flowtime-engine`) that depends on `flowtime-core`. For this milestone, it only parses a model YAML and prints a summary (node count, grid dimensions). +- `Cargo.toml` workspace at `engine/` level. +### AC-2 — AC-2: Devcontainer Rust toolchain + +**AC-2: Devcontainer Rust toolchain.** The devcontainer gains Rust support: +- `rustup` and `cargo` available on `$PATH`. +- `cargo build` and `cargo test` work from `engine/`. +- Installation via devcontainer feature or post-create script — not manual. +### AC-3 — AC-3: Model types with serde deserialization + +**AC-3: Model types with serde deserialization.** Rust structs mirroring every C# model type that participates in YAML deserialization: +- `ModelDefinition`, `GridDefinition`, `NodeDefinition`, `TopologyDefinition`, `TopologyNodeDefinition`, `TopologyNodeSemanticsDefinition`, `TopologyEdgeDefinition`, `ConstraintDefinition`, `ConstraintSemanticsDefinition`, `ClassDefinition`, `TrafficDefinition`, `ArrivalDefinition`, `ArrivalPatternDefinition`, `OutputDefinition`, `RouterDefinition`, `RouterInputsDefinition`, `RouterRouteDefinition`, `DispatchScheduleDefinition`, `PmfDefinition`, `InitialConditionDefinition`, `UiHintsDefinition`. +- All fields use camelCase JSON/YAML naming (matching existing schema). +- Optional fields are `Option<T>`. +- `serde_yaml` for deserialization. +### AC-4 — AC-4: All existing model fixtures deserialize + +**AC-4: All existing model fixtures deserialize.** Every `.yaml` model file in `examples/` and `fixtures/` (excluding `examples/archive/`) parses into the Rust `ModelDefinition` without error. Test: `cargo test` includes a parameterized test that loads each fixture. 
+### AC-5 — AC-5: Expression parser + +**AC-5: Expression parser.** Port of the C# `ExpressionParser` (recursive descent) producing equivalent AST types: +- AST: `Expr` enum with variants `Literal(f64)`, `ArrayLiteral(Vec)`, `NodeRef(String)`, `BinaryOp { op, left, right }`, `FunctionCall { name, args }`. +- `BinaryOp`: Add, Subtract, Multiply, Divide. +- Grammar: `Expression = Term (('+' | '-') Term)*`, `Term = Factor (('*' | '/') Factor)*`, `Factor = Number | Array | NodeRef | FunctionCall | '(' Expression ')'`. +- Error reporting with position. +### AC-6 — AC-6: Expression parser parity + +**AC-6: Expression parser parity.** Every expression that appears in existing model fixtures and C# test fixtures parses correctly. Additionally, the following expressions must parse and produce correct AST structure (extracted from C# tests): +- `"capacity"` → NodeRef +- `"100.0"` → Literal +- `"a + b"` → BinaryOp(Add) +- `"a * b + c"` → precedence: Add(Mul(a, b), c) +- `"(a + b) * c"` → Mul(Add(a, b), c) +- `"SHIFT(demand, 1)"` → FunctionCall("SHIFT", [NodeRef("demand"), Literal(1)]) +- `"CONV(errors, [0.0, 0.6, 0.3, 0.1])"` → FunctionCall with ArrayLiteral +- `"CLAMP(queue_depth / 50, 0, 1)"` → nested function + binary op +- `"raw_arrivals * (1 - SHIFT(pressure, 1))"` → nested binary ops with function call +- `"MIN(capacity, arrivals)"` → FunctionCall +- `"MAX(0, SHIFT(queue_depth, 1) + arrivals)"` → nested +### AC-7 — AC-7: Reference model fixtures extracted + +**AC-7: Reference model fixtures extracted.** A directory `engine/fixtures/` containing YAML model files at graduated complexity levels, copied or symlinked from existing `examples/` and `fixtures/`. 
At minimum: +- Simple: const-only model (e.g., `m0.const.yaml`) +- Expression: model with expr nodes +- Queue: model with serviceWithBuffer topology +- PMF: model with PMF nodes +- Router: model with router nodes +- Constraint: model with constraints +- Multi-class: model with class definitions +- WIP limit: model with wipLimit/wipOverflow +- These serve as the progressive parity test fixtures for M2-M6. +## Technical Notes + +- **Crate naming:** `flowtime-core` (library) and `flowtime-engine` (binary) follow Rust conventions (kebab-case crate names). +- **Workspace location:** `engine/` at repo root keeps Rust separate from the .NET solution. `Cargo.lock` lives in `engine/`. +- **serde field naming:** Use `#[serde(rename_all = "camelCase")]` on structs to match the existing YAML camelCase convention. Use `#[serde(default)]` for optional fields that have C# defaults. +- **Expression parser:** The C# parser is 371 lines. The Rust port should be similar size. Use `&str` + byte position for error reporting. +- **No computation:** This milestone deliberately excludes compilation, evaluation, and artifact writing. The types and parsers are the foundation; computation starts in M-029. +- **Sim YAML models:** Some fixtures use Sim-specific fields (`metadata.generator`, `parameters`, etc.) that are not in `ModelDefinition`. Use `#[serde(flatten)]` or ignore unknown fields with `#[serde(deny_unknown_fields)]` disabled — existing models must parse without error. 
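The AC-5 grammar and the byte-position error reporting mentioned in the notes above can be sketched as a small recursive-descent parser. This is an illustration only, not the actual port: the `f64` element type of `ArrayLiteral`, the `ParseError` shape, and every helper name (`Parser`, `peek`, `take_while`, `parse`) are assumptions, and unary minus is omitted because the stated grammar does not include it.

```rust
// Sketch only: AST variants and grammar follow AC-5. The ArrayLiteral element
// type, ParseError, and all helper names are assumptions for illustration.
#[derive(Debug, Clone, PartialEq)]
pub enum BinOp { Add, Subtract, Multiply, Divide }

#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Literal(f64),
    ArrayLiteral(Vec<f64>),
    NodeRef(String),
    BinaryOp { op: BinOp, left: Box<Expr>, right: Box<Expr> },
    FunctionCall { name: String, args: Vec<Expr> },
}

#[derive(Debug, PartialEq)]
pub struct ParseError { pub pos: usize, pub msg: String }

struct Parser<'a> { src: &'a str, pos: usize }

impl<'a> Parser<'a> {
    // Skip whitespace, then return the byte under the cursor without consuming it.
    fn peek(&mut self) -> Option<u8> {
        while self.src.as_bytes().get(self.pos).map_or(false, |b| b.is_ascii_whitespace()) {
            self.pos += 1;
        }
        self.src.as_bytes().get(self.pos).copied()
    }
    fn err<T>(&self, msg: &str) -> Result<T, ParseError> {
        Err(ParseError { pos: self.pos, msg: msg.to_string() })
    }
    fn expect(&mut self, c: u8) -> Result<(), ParseError> {
        if self.peek() == Some(c) { self.pos += 1; Ok(()) } else { self.err(&format!("expected '{}'", c as char)) }
    }
    // Consume the longest run of bytes accepted by `ok`, starting at the cursor.
    fn take_while(&mut self, ok: impl Fn(u8) -> bool) -> &'a str {
        let (src, start): (&'a str, usize) = (self.src, self.pos);
        while src.as_bytes().get(self.pos).map_or(false, |&b| ok(b)) { self.pos += 1; }
        &src[start..self.pos]
    }
    // Expression = Term (('+' | '-') Term)*
    fn expression(&mut self) -> Result<Expr, ParseError> {
        let mut left = self.term()?;
        while let Some(c @ (b'+' | b'-')) = self.peek() {
            self.pos += 1;
            let op = if c == b'+' { BinOp::Add } else { BinOp::Subtract };
            left = Expr::BinaryOp { op, left: Box::new(left), right: Box::new(self.term()?) };
        }
        Ok(left)
    }
    // Term = Factor (('*' | '/') Factor)*
    fn term(&mut self) -> Result<Expr, ParseError> {
        let mut left = self.factor()?;
        while let Some(c @ (b'*' | b'/')) = self.peek() {
            self.pos += 1;
            let op = if c == b'*' { BinOp::Multiply } else { BinOp::Divide };
            left = Expr::BinaryOp { op, left: Box::new(left), right: Box::new(self.factor()?) };
        }
        Ok(left)
    }
    fn number(&mut self) -> Result<f64, ParseError> {
        self.peek(); // skip leading whitespace so `start` points at the number
        let start = self.pos;
        let text = self.take_while(|b| b.is_ascii_digit() || b == b'.');
        text.parse::<f64>().map_err(|_| ParseError { pos: start, msg: "expected number".into() })
    }
    // Factor = Number | Array | NodeRef | FunctionCall | '(' Expression ')'
    fn factor(&mut self) -> Result<Expr, ParseError> {
        match self.peek() {
            Some(b'(') => { self.pos += 1; let e = self.expression()?; self.expect(b')')?; Ok(e) }
            Some(b'[') => {
                self.pos += 1;
                let mut items = vec![self.number()?];
                while self.peek() == Some(b',') { self.pos += 1; items.push(self.number()?); }
                self.expect(b']')?;
                Ok(Expr::ArrayLiteral(items))
            }
            Some(b) if b.is_ascii_digit() || b == b'.' => Ok(Expr::Literal(self.number()?)),
            Some(b) if b.is_ascii_alphabetic() || b == b'_' => {
                let name = self.take_while(|b| b.is_ascii_alphanumeric() || b == b'_').to_string();
                if self.peek() != Some(b'(') { return Ok(Expr::NodeRef(name)); }
                self.pos += 1;
                let mut args = Vec::new();
                if self.peek() != Some(b')') {
                    args.push(self.expression()?);
                    while self.peek() == Some(b',') { self.pos += 1; args.push(self.expression()?); }
                }
                self.expect(b')')?;
                Ok(Expr::FunctionCall { name, args })
            }
            _ => self.err("expected number, array, name, or '('"),
        }
    }
}

pub fn parse(src: &str) -> Result<Expr, ParseError> {
    let mut p = Parser { src, pos: 0 };
    let expr = p.expression()?;
    if p.peek().is_some() { return p.err("unexpected trailing input"); }
    Ok(expr)
}
```

Calling `parse("a * b + c")` on this sketch yields an `Add` whose left child is the `Multiply`, which is the precedence AC-6 asks for; a malformed input such as `"a +"` returns a `ParseError` carrying the byte offset where parsing stopped.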
+ +## Out of Scope + +- Model compilation (topo sort, column map, plan generation) — M-029 +- Evaluation (matrix ops) — M-029+ +- Model validation (schema checks, initial condition validation) — M-029+ +- Artifact writing — M-033 +- WebAssembly compilation — future +- Expression evaluation (only parsing) — M-029 + +## Dependencies + +- None (first milestone in E-20) diff --git a/work/epics/E-20-matrix-engine/M-029-compiler-and-core-evaluator.md b/work/epics/E-20-matrix-engine/M-029-compiler-and-core-evaluator.md new file mode 100644 index 00000000..abc2f507 --- /dev/null +++ b/work/epics/E-20-matrix-engine/M-029-compiler-and-core-evaluator.md @@ -0,0 +1,112 @@ +--- +id: M-029 +title: Compiler and Core Evaluator +status: done +parent: E-20 +acs: + - id: AC-1 + title: 'AC-1: ColumnMap' + status: met + - id: AC-2 + title: 'AC-2: Op enum and evaluator' + status: met + - id: AC-3 + title: 'AC-3: Expression compiler' + status: met + - id: AC-4 + title: 'AC-4: Model compiler (const + expr)' + status: met + - id: AC-5 + title: 'AC-5: End-to-end evaluation' + status: met + - id: AC-6 + title: 'AC-6: Parity with C# on simple models' + status: met + - id: AC-7 + title: 'AC-7: Plan inspection' + status: met +--- + +## Goal + +Compile simple FlowTime models (const + expr nodes, no topology) into an evaluation plan, execute the plan against a flat matrix, and produce correct series output. This is the first milestone where the Rust engine computes something — the "hello world" of the matrix model. + +## Context + +M-028 delivered the Rust workspace, model types with YAML deserialization, and the expression parser. The crate can parse any model and any expression, but cannot compile or evaluate anything. + +The C# engine evaluates models via: +1. `ModelParser.ParseNodes()` — creates `INode` instances from `NodeDefinition` +2. `Graph(nodes)` — topological sort +3. 
`Graph.Evaluate(grid)` — iterate in topo order, each node produces a `Series` + +The matrix engine replaces this with: +1. Compiler: assign column indices, emit ops from node definitions +2. Evaluator: iterate ops, execute against flat `f64[]` matrix + +## Acceptance criteria + +### AC-1 — AC-1: ColumnMap + +**AC-1: ColumnMap.** Bidirectional mapping between series names (strings) and column indices (usize). `name_to_index()` and `index_to_name()`. Constructed during compilation. +### AC-2 — AC-2: Op enum and evaluator + +**AC-2: Op enum and evaluator.** `Op` enum with variants for the element-wise operations needed by const + expr models: +- `Const { out, values }` — write constant values to a column +- `VecAdd { out, a, b }`, `VecSub`, `VecMul`, `VecDiv` — element-wise binary ops +- `ScalarMul { out, input, k }`, `ScalarAdd { out, input, k }` — scalar ops +- `VecMin { out, a, b }`, `VecMax { out, a, b }` — element-wise min/max +- `Clamp { out, val, lo, hi }` — clamp to range +- `Mod { out, a, b }` — modulo +- `Floor { out, input }`, `Ceil { out, input }`, `Round { out, input }` — rounding +- `Step { out, input, threshold }` — step function +- `Pulse { out, period, phase, amplitude }` — periodic pulse + +Evaluator function: `fn evaluate(plan: &[Op], bins: usize, series_count: usize) -> Vec<f64>` — allocates matrix, iterates ops, returns filled matrix. +### AC-3 — AC-3: Expression compiler + +**AC-3: Expression compiler.** Compile an expression AST (`Expr`) into a sequence of `Op`s given a `ColumnMap`. Each binary op and function call emits one or more ops, using temporary columns for intermediate results. Node references resolve to column indices via the ColumnMap. +### AC-4 — AC-4: Model compiler (const + expr) + +**AC-4: Model compiler (const + expr).** `fn compile(model: &ModelDefinition) -> Result<(Plan, ColumnMap), CompileError>`: +- Assigns a column index to each node's output series. +- Topological sort based on expression dependencies. 
+- Emits `Const` ops for `kind: "const"` nodes. +- Emits expression ops for `kind: "expr"` nodes. +- Returns the plan (ordered ops) and column map. +### AC-5 — AC-5: End-to-end evaluation + +**AC-5: End-to-end evaluation.** `fn eval_model(model: &ModelDefinition) -> Result` that compiles and evaluates, returning named series. Test with the `hello.yaml` fixture: +- `demand` = [10, 10, 10, 10, 10, 10, 10, 10] +- `served` = demand * 0.8 = [8, 8, 8, 8, 8, 8, 8, 8] +### AC-6 — AC-6: Parity with C# on simple models + +**AC-6: Parity with C# on simple models.** Create a parity test that evaluates a model with both the Rust engine and the C# engine (via pre-computed reference outputs) and compares series values. At minimum: +- Const-only model: all series match +- Const + expr model: expression results match (binary ops, scalar multiply) +- Nested expressions: `MIN(a, b)`, `MAX(a, b)`, `CLAMP(x, lo, hi)` +- Multiple dependent expressions (chain: a → b → c) +### AC-7 — AC-7: Plan inspection + +**AC-7: Plan inspection.** `fn format_plan(plan: &Plan, column_map: &ColumnMap) -> String` that renders a human-readable plan. The CLI `flowtime-engine plan ` command uses this. Output shows op type, column names (not just indices). +## Technical Notes + +- **Matrix layout:** Row-major `Vec<f64>` of size `series_count * bins`. Column `c` at bin `t` is at index `c * bins + t`. All bins for one series are contiguous. +- **Temporary columns:** Expression compilation may need intermediate columns (e.g., `a + b` in `(a + b) * c` needs a temp column for `a + b`). The compiler allocates these from the column map with generated names like `__temp_0`. +- **Topo sort:** Collect dependencies from expression AST (node references). Kahn's algorithm (same as C#). Reject cycles. +- **Fixture update:** `simple-const.yaml` uses legacy field `expression` instead of `expr`. Update the fixture to use `expr` so it works with the Rust model types. +- **No topology:** This milestone handles flat node lists only. 
Topology synthesis (serviceWithBuffer queue nodes) comes in M-030. +- **PMF nodes:** `kind: "pmf"` computes a constant expected value from the distribution. Can be included here as a simple op, or deferred to M-030. Include if straightforward. + +## Out of Scope + +- Topology synthesis (queue nodes, retry echo) — M-030 +- Sequential ops (QueueRecurrence, Shift, Convolve, DispatchGate) — M-030 +- Routing and constraints — M-031 +- Derived metrics — M-032 +- Artifact writing (CSVs, JSON) — M-033 +- SHIFT/feedback handling — M-030 + +## Dependencies + +- M-028 complete (model types + expression parser) diff --git a/work/epics/E-20-matrix-engine/M-030-topology-and-sequential-ops.md b/work/epics/E-20-matrix-engine/M-030-topology-and-sequential-ops.md new file mode 100644 index 00000000..9c9dff72 --- /dev/null +++ b/work/epics/E-20-matrix-engine/M-030-topology-and-sequential-ops.md @@ -0,0 +1,102 @@ +--- +id: M-030 +title: Topology and Sequential Ops +status: done +parent: E-20 +acs: + - id: AC-1 + title: 'AC-1: Sequential op variants' + status: met + - id: AC-2 + title: 'AC-2: Topology synthesis' + status: met + - id: AC-3 + title: 'AC-3: WIP limits and overflow routing' + status: met + - id: AC-4 + title: 'AC-4: SHIFT feedback' + status: met + - id: AC-5 + title: 'AC-5: Parity fixtures' + status: met + - id: AC-6 + title: 'AC-6: Existing tests unbroken' + status: met +--- + +## Goal + +Add topology synthesis and sequential operations to the Rust engine so it can evaluate models with queues, retry echo, dispatch schedules, WIP limits, overflow routing, and SHIFT-based backpressure. After this milestone, the matrix engine handles the core flow dynamics — everything except routing, constraints, derived metrics, and artifact writing. + +## Context + +M-029 delivered the compiler and evaluator for flat models (const + expr + PMF). The compiler produces a plan of element-wise ops, the evaluator executes against a flat matrix. 38 Rust tests passing. 
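The plan-of-ops evaluation against a flat matrix that M-029 delivered can be sketched roughly as follows. This is a minimal illustration, assuming only four of the AC-2 variants, concrete field types (`usize` column indices, `f64` values), and two hypothetical helpers, `at` and `column`; the real evaluator may be organized differently.

```rust
// Sketch only: a subset of the element-wise Op variants listed in M-029 AC-2.
// Field types and the `at` / `column` helpers are assumptions for illustration.
#[derive(Debug, Clone)]
pub enum Op {
    Const { out: usize, values: Vec<f64> },
    VecAdd { out: usize, a: usize, b: usize },
    ScalarMul { out: usize, input: usize, k: f64 },
    VecMin { out: usize, a: usize, b: usize },
}

/// Layout from the M-029 notes: column `c` at bin `t` lives at `c * bins + t`,
/// so all bins of one series are contiguous.
fn at(bins: usize, c: usize, t: usize) -> usize {
    c * bins + t
}

/// Allocate a zeroed `series_count * bins` matrix, run every op over every bin
/// in plan order, and return the filled matrix.
/// Assumes each `Const` carries exactly one value per bin.
pub fn evaluate(plan: &[Op], bins: usize, series_count: usize) -> Vec<f64> {
    let mut m = vec![0.0; series_count * bins];
    for op in plan {
        for t in 0..bins {
            match op {
                Op::Const { out, values } => m[at(bins, *out, t)] = values[t],
                Op::VecAdd { out, a, b } => {
                    m[at(bins, *out, t)] = m[at(bins, *a, t)] + m[at(bins, *b, t)]
                }
                Op::ScalarMul { out, input, k } => {
                    m[at(bins, *out, t)] = m[at(bins, *input, t)] * k
                }
                Op::VecMin { out, a, b } => {
                    m[at(bins, *out, t)] = m[at(bins, *a, t)].min(m[at(bins, *b, t)])
                }
            }
        }
    }
    m
}

/// Borrow one series (all bins of column `c`) out of the flat matrix.
pub fn column(m: &[f64], bins: usize, c: usize) -> &[f64] {
    &m[c * bins..(c + 1) * bins]
}
```

With a two-op plan (`Const` writing eight 10s into column 0, then `ScalarMul` with `k = 0.8` into column 1), column 1 comes out as eight values of 8 up to floating-point rounding, matching the `hello.yaml` expectation in M-029 AC-5. Because the plan is already in dependency order, each op only reads columns that earlier ops have written.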
+ +The C# engine handles topology via: +1. `ModelCompiler` — synthesizes `serviceWithBuffer` and `retryEcho` nodes from topology definitions +2. `ServiceWithBufferNode.Evaluate` — sequential queue recurrence: `Q[t] = max(0, Q[t-1] + inflow - outflow - loss)` +3. `ExprNode.EvaluateShiftFunction` — temporal shift: `out[t] = input[t - lag]` +4. `DispatchScheduleProcessor` — gates outflow to dispatch bins only +5. `WipOverflowEvaluator` — post-evaluation routing of WIP overflow to target queues +6. `Graph.EvaluateFeedbackSubgraph` — bin-by-bin evaluation for SHIFT feedback cycles + +In the matrix model, all of these become ops in the plan. Sequential ops (QueueRecurrence, Shift, Convolve) process bins in order, reading from previous bins in the same matrix — feedback falls out naturally without special handling. + +## Acceptance criteria + +### AC-1 — AC-1: Sequential op variants + +**AC-1: Sequential op variants.** Add to the `Op` enum: +- `Shift { out, input, lag }` — `out[t] = input[t - lag]` (0 for t < lag) +- `Convolve { out, input, kernel }` — causal convolution: `out[t] = Σ(k) input[t-k] * kernel[k]` +- `QueueRecurrence { out, inflow, outflow, loss, init, wip_limit, overflow_out }` — sequential queue depth with optional WIP limit and overflow tracking +- `DispatchGate { out, input, period, phase, capacity }` — gates output to dispatch bins, optionally capping at capacity +### AC-2 — AC-2: Topology synthesis + +**AC-2: Topology synthesis.** The compiler processes `model.topology.nodes` to synthesize queue and retry echo nodes (same logic as C# `ModelCompiler`): +- For each `serviceWithBuffer`/`queue`/`dlq` topology node: synthesize a `QueueRecurrence` op from `semantics.arrivals`, `semantics.served` (or capacity), `semantics.errors`, and `initialCondition.queueDepth`. +- For each topology node with `retryEcho` + `retryKernel`: synthesize a `Convolve` op. +- Queue node ID follows the C# snake_case convention (`Queue → queue_queue`). 
+- Dispatch schedule on topology node → `DispatchGate` op on the outflow before `QueueRecurrence`. +### AC-3 — AC-3: WIP limits and overflow routing + +**AC-3: WIP limits and overflow routing.** +- `QueueRecurrence` op supports optional `wip_limit` column (scalar const or series) and `overflow_out` column. +- Overflow routing: compiler resolves `wipOverflow` topology node ID to the target's inflow column, emits an additional `VecAdd` to inject overflow into the target's inflow. +- Overflow cycle validation at compile time (same as C# `ValidateNoOverflowCycles`). +### AC-4 — AC-4: SHIFT feedback + +**AC-4: SHIFT feedback.** Models with SHIFT-based cross-node feedback cycles evaluate correctly without special handling. The Shift op reads `state[input, t - lag]` which was written in a previous bin iteration since the evaluator processes ops in plan order and sequential ops process bins in order. Test: the backpressure model from E-10 p3b (queue → pressure → SHIFT → effective_arrivals → queue) produces the same stabilization pattern. +### AC-5 — AC-5: Parity fixtures + +**AC-5: Parity fixtures.** Create simulation-mode topology fixtures (with `grid` + inline `const`/`expr` nodes + topology) and verify parity with C# output: +- Simple queue: const arrivals/served → queue depth matches hand-calculated values +- Queue with WIP limit: overflow tracked correctly +- Queue with dispatch schedule: outflow gated to period bins +- Retry echo: CONV(failures, kernel) produces correct retry series +- Backpressure feedback: SHIFT-based throttle stabilizes queue +- Cascading WIP overflow: A→B→C overflow chain +### AC-6 — AC-6: Existing tests unbroken + +**AC-6: Existing tests unbroken.** All 38 existing Rust tests still pass. The compiler changes don't break const/expr compilation. +## Technical Notes + +- **Evaluation order for sequential ops:** The plan is still executed as a linear op list. Sequential ops (QueueRecurrence, Shift, Convolve) internally loop over bins. 
This is correct because the evaluator processes ops in dependency order (topo sort), and within a sequential op, each bin reads from previous bins that are already written. No special "feedback mode" needed. +- **Overflow routing without re-evaluation:** Unlike the C# `WipOverflowEvaluator` (which iterates evaluate → override → re-evaluate), the matrix compiler can emit the overflow routing as additional ops after the queue ops. The QueueRecurrence op writes overflow to `overflow_out` column; a subsequent `VecAdd` adds it to the target's inflow. For cascading (A→B→C), the compiler orders the QueueRecurrence ops so A runs before B which runs before C. No iteration needed — single-pass. +- **Topology node ID → queue column:** The compiler maintains a mapping from topology node ID to the synthesized queue column index, used for overflow routing and later for derived metrics (M-032). +- **New fixtures needed:** Existing fixtures in `engine/fixtures/` are either flat models (hello, simple-const) or telemetry models (file: references). This milestone needs simulation-mode topology fixtures with inline data. Create them as Rust test inline YAML or as new fixture files. +- **Dispatch schedule:** The C# `DispatchScheduleProcessor` zeros outflow on non-dispatch bins, then caps at capacity on dispatch bins. In the matrix model, this is a `DispatchGate` op applied to the outflow column before it reaches `QueueRecurrence`. + +## Out of Scope + +- Routing (router flow materialization) — M-031 +- Constraints (proportional allocation) — M-031 +- Multi-class flows — M-031 +- Derived metrics (utilization, latency, etc.) 
— M-032 +- Invariant analysis — M-032 +- Artifact writing — M-033 +- File-based series references (`file:*.csv`) — future (telemetry mode) + +## Dependencies + +- M-029 complete (compiler + evaluator + element-wise ops) diff --git a/work/epics/E-20-matrix-engine/M-031-routing-and-constraints.md b/work/epics/E-20-matrix-engine/M-031-routing-and-constraints.md new file mode 100644 index 00000000..55cd893c --- /dev/null +++ b/work/epics/E-20-matrix-engine/M-031-routing-and-constraints.md @@ -0,0 +1,101 @@ +--- +id: M-031 +title: Routing and Constraints +status: done +parent: E-20 +acs: + - id: AC-1 + title: 'AC-1: Router weight-based splitting' + status: met + - id: AC-2 + title: 'AC-2: Router class-based routing' + status: met + - id: AC-3 + title: 'AC-3: Constraint proportional allocation' + status: met + - id: AC-4 + title: 'AC-4: Router → Constraint evaluation order' + status: met + - id: AC-5 + title: 'AC-5: Parity fixtures' + status: met + - id: AC-6 + title: 'AC-6: Existing tests unbroken' + status: met +--- + +## Goal + +Add router flow materialization and constraint allocation to the Rust engine so it can evaluate models with routers (weight-based and class-based flow splitting) and shared-capacity constraints (proportional allocation when demand exceeds capacity). After this milestone, the matrix engine handles the full evaluation pipeline except derived metrics, invariant analysis, and artifact writing. + +## Context + +M-030 delivered topology synthesis (QueueRecurrence, Shift, Convolve, DispatchGate), WIP overflow routing, and SHIFT-based backpressure feedback. The evaluator uses bin-major evaluation. 65 Rust tests passing. + +The C# engine handles routing and constraints via: +1. `RouterFlowMaterializer.ComputeOverrides()` — distributes flows from a source to targets via class-based routing (priority 1) then weight-based routing (remaining flow, priority 2). +2. 
`ConstraintAllocator.AllocateProportional()` — when total demand exceeds capacity, allocates proportionally: `allocated[node] = capacity * (demand[node] / totalDemand)`. +3. `ConstraintAwareEvaluator` — applies router overrides first, then constraint overrides, then re-evaluates. +4. `ClassContributionBuilder` — decomposes totals into per-class series, propagates through graph. + +In the matrix model, all of these become plan ops. Router splitting is `ScalarMul` (weight fractions) or direct column copying (class routing). Constraint allocation is a new `ProportionalAlloc` op that reads multiple demand columns + capacity and writes capped output columns. Multi-class is tracked as separate columns per class. + +## Acceptance criteria + +### AC-1 — AC-1: Router weight-based splitting + +**AC-1: Router weight-based splitting.** The compiler processes `NodeDefinition.router` to split a source series across targets by weight: +- For each route: `target_arrivals += source * (weight / totalWeight)`. +- Routes without explicit weight default to 1.0. +- Multiple routes to the same target accumulate via VecAdd. +- The router's source is resolved from `router.inputs.queue` (the queue node whose outflow feeds the router) or the node's own series. +- Emitted as ScalarMul + VecAdd ops — no new Op variant needed. +### AC-2 — AC-2: Router class-based routing + +**AC-2: Router class-based routing.** Routes with a `classes` list route per-class flow to specific targets: +- The compiler resolves per-class arrival columns from `model.traffic.arrivals` (each entry has a `classId` and `nodeId`). +- Class routes extract per-class columns and sum them for the target. +- Remaining flow (after class routes) is distributed by weight among weight-only routes. +- Per-class columns use the naming convention `{nodeId}__class_{classId}`. 
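The weight-based split in AC-1 and the per-class column naming in AC-2 can be illustrated with a direct computation. This is a sketch under assumptions: `Route`, `split_by_weight`, and `class_column` are hypothetical names, the zero-total-weight behavior is a guess, and the real compiler is described as emitting `ScalarMul` + `VecAdd` ops rather than computing the series itself.

```rust
use std::collections::BTreeMap;

// Sketch only: `Route`, `split_by_weight`, and `class_column` are hypothetical
// names. The compiler in the spec emits ScalarMul + VecAdd ops instead.
pub struct Route {
    pub target: String,
    pub weight: Option<f64>, // routes without an explicit weight default to 1.0
}

/// For each route: target_arrivals += source * (weight / totalWeight).
/// Multiple routes to the same target accumulate into one series.
pub fn split_by_weight(source: &[f64], routes: &[Route]) -> BTreeMap<String, Vec<f64>> {
    let total: f64 = routes.iter().map(|r| r.weight.unwrap_or(1.0)).sum();
    let mut out: BTreeMap<String, Vec<f64>> = BTreeMap::new();
    if total <= 0.0 {
        return out; // assumption: nothing is routed when weights sum to zero
    }
    for r in routes {
        let k = r.weight.unwrap_or(1.0) / total; // the ScalarMul factor
        let arrivals = out
            .entry(r.target.clone())
            .or_insert_with(|| vec![0.0; source.len()]);
        for (slot, s) in arrivals.iter_mut().zip(source) {
            *slot += s * k; // the VecAdd accumulation
        }
    }
    out
}

/// Internal per-class column name, following `{nodeId}__class_{classId}`.
pub fn class_column(node_id: &str, class_id: &str) -> String {
    format!("{node_id}__class_{class_id}")
}
```

Splitting a source across three routes weighted 0.5, 0.3, and 0.2 gives target series that sum back to the source in every bin, which is the property the weight-based parity fixture in AC-5 checks.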
+### AC-3 — AC-3: Constraint proportional allocation + +**AC-3: Constraint proportional allocation.** New `ProportionalAlloc` op: +- Reads N demand columns + 1 capacity column. +- Per bin: if `totalDemand > capacity`, writes `capped[i] = capacity * (demand[i] / totalDemand)`. Otherwise writes demands unchanged. +- The compiler processes `topology.constraints` to emit ProportionalAlloc ops, connecting each constraint's `semantics.arrivals` (demand total) and `semantics.served` (capacity) to the constrained topology nodes via `topologyNode.constraints` lists. +- Constrained nodes' inflow columns are replaced with the capped versions. +### AC-4 — AC-4: Router → Constraint evaluation order + +**AC-4: Router → Constraint evaluation order.** The compiler emits router ops before constraint ops (matching C# `RouterAwareGraphEvaluator` → `ConstraintAwareEvaluator` order). Constraint allocation reads from router-adjusted columns. The unified topo sort orders: data nodes → router splits → constraint allocation → queue recurrence. +### AC-5 — AC-5: Parity fixtures + +**AC-5: Parity fixtures.** Create test models and verify parity with C# output: +- Weight-based router: 3 routes with weights [0.5, 0.3, 0.2], verify target arrivals sum to source +- Class-based router: 2 classes routed to different targets +- Mixed router: some routes class-based, remainder weight-based +- Simple constraint: 2 nodes sharing capacity, demand > capacity → proportional split +- Constraint below capacity: demand < capacity → no capping +- Router + constraint combined: router feeds constrained nodes +### AC-6 — AC-6: Existing tests unbroken + +**AC-6: Existing tests unbroken.** All 65 existing Rust tests still pass. +## Technical Notes + +- **Router source resolution:** A router node in the YAML has `router.inputs.queue` pointing to a queue node. The router distributes the queue's outflow (served series) across targets. Each target gets a fraction of the served flow as its arrivals. 
+- **No new Op for routing:** Weight-based routing decomposes to ScalarMul + VecAdd (existing ops). Class routing decomposes to Copy + VecAdd. The compiler emits these standard ops — the router abstraction lives in the compiler, not the evaluator. +- **New Op for constraints:** `ProportionalAlloc` is a genuinely new operation — it reads N+1 columns and writes N columns with per-bin conditional logic. This is similar in spirit to QueueRecurrence (reads multiple columns, writes with conditional logic) but operates on groups of columns. +- **Per-class column naming:** `{nodeId}__class_{classId}` (double underscore to avoid collision with user-defined node IDs). These are internal columns that may not appear in outputs. +- **Constraint topology nodes:** Each constraint in `topology.constraints` has `semantics.arrivals` (total demand reference) and `semantics.served` (capacity reference). Topology nodes reference constraints via their `constraints` list. The compiler maps constraint IDs to the topology nodes they constrain. +- **Bin-major evaluation:** ProportionalAlloc processes one bin at a time (like all ops), reading demand[t] and capacity[t] and writing capped[t]. This is compatible with the bin-major evaluator from M-030. + +## Out of Scope + +- Derived metrics (utilization, latency, etc.) 
— M-032 +- Invariant analysis — M-032 +- Artifact writing — M-033 +- File-based series references (`file:*.csv`) — future (telemetry mode) +- Per-class output series in artifacts — M-033 (artifact layer decides what to write) + +## Dependencies + +- M-030 complete (topology synthesis, sequential ops, bin-major evaluation) diff --git a/work/epics/E-20-matrix-engine/M-032-derived-metrics-and-analysis.md b/work/epics/E-20-matrix-engine/M-032-derived-metrics-and-analysis.md new file mode 100644 index 00000000..cd2f790a --- /dev/null +++ b/work/epics/E-20-matrix-engine/M-032-derived-metrics-and-analysis.md @@ -0,0 +1,104 @@ +--- +id: M-032 +title: Derived Metrics and Analysis +status: done +parent: E-20 +acs: + - id: AC-1 + title: 'AC-1: Utilization metric' + status: met + - id: AC-2 + title: 'AC-2: Cycle time components' + status: met + - id: AC-3 + title: 'AC-3: Kingman G/G/1 approximation' + status: met + - id: AC-4 + title: 'AC-4: Invariant warnings' + status: met + - id: AC-5 + title: 'AC-5: Derived metrics integration in compiler' + status: met + - id: AC-6 + title: 'AC-6: Parity tests' + status: met + - id: AC-7 + title: 'AC-7: Existing tests unbroken' + status: met +--- + +## Goal + +Add derived metric computation and invariant analysis to the Rust engine. Derived metrics (utilization, cycle time, flow efficiency, Kingman approximation) are emitted as additional plan ops on the evaluation matrix. Invariant analysis (conservation checks, warnings) runs as a post-evaluation pass over the matrix columns. After this milestone, the engine produces all analytical output — only artifact writing and CLI remain. + +## Context + +M-031 delivered routing and constraint allocation. The engine now handles the full evaluation pipeline: const/expr/PMF nodes, topology synthesis (queues, retry echo, dispatch), WIP overflow, SHIFT feedback, routers, and constraints. 78 Rust tests passing. 
+ +The C# engine computes derived metrics in `RuntimeAnalyticalEvaluator` and runs invariant checks in `InvariantAnalyzer`. In the matrix model, derived metrics are additional columns computed from evaluation output. Invariant analysis is a read-only pass that produces warnings. + +## Acceptance criteria + +### AC-1 — AC-1: Utilization metric + +**AC-1: Utilization metric.** Compute `utilization[t] = served[t] / effectiveCapacity[t]` for topology nodes with capacity semantics. Emit as a derived column per node. Returns 0 when capacity is 0. Effective capacity = base capacity × parallelism (when parallelism is defined). +### AC-2 — AC-2: Cycle time components + +**AC-2: Cycle time components.** Compute per-bin: +- `queueTimeMs[t] = (queueDepth[t] / served[t]) × binMs` (0 when served ≤ 0) +- `serviceTimeMs[t] = processingTimeMsSum[t] / servedCount[t]` (0 when servedCount ≤ 0) +- `cycleTimeMs[t] = queueTimeMs[t] + serviceTimeMs[t]` (sum of available components) +- `flowEfficiency[t] = serviceTimeMs[t] / cycleTimeMs[t]` (0 when cycleTime ≤ 0) +- `latencyMinutes[t] = queueTimeMs[t] / 60000` +- Which components are emitted depends on node category: queue-only → queueTime+latency, service-only → serviceTime, serviceWithBuffer → all. +### AC-3 — AC-3: Kingman G/G/1 approximation + +**AC-3: Kingman G/G/1 approximation.** Compute `E[Wq] ≈ (ρ/(1-ρ)) × ((Ca² + Cs²)/2) × E[S]` where: +- `ρ` = utilization (must be in (0, 1)) +- `Ca` = coefficient of variation of arrivals +- `Cs` = coefficient of variation of service +- `E[S]` = mean service time (ms) +- Returns 0 for invalid inputs (ρ ≥ 1, negative Cv, etc.) +- Cv is computed from PMF nodes (σ/μ) or as 0.0 for constant series. 
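
The Kingman formula in AC-3 can be written as a small scalar function. This is a sketch under stated assumptions: `kingman_wq` is a hypothetical name, and treating a negative mean service time as invalid is a guess at what the "etc." in the spec covers.

```rust
/// Kingman G/G/1 approximation of mean queue wait, in the unit of `mean_service_ms`.
/// E[Wq] ~= (rho / (1 - rho)) * ((ca^2 + cs^2) / 2) * E[S]
/// Returns 0.0 for inputs the spec treats as invalid.
pub fn kingman_wq(rho: f64, ca: f64, cs: f64, mean_service_ms: f64) -> f64 {
    // Utilization must lie strictly inside (0, 1).
    let rho_valid = rho > 0.0 && rho < 1.0;
    // Assumption: a negative mean service time is also invalid.
    if !rho_valid || ca < 0.0 || cs < 0.0 || mean_service_ms < 0.0 {
        return 0.0;
    }
    (rho / (1.0 - rho)) * ((ca * ca + cs * cs) / 2.0) * mean_service_ms
}
```

For ρ=0.8, Ca=1.0, Cs=0.5, E[S]=10 the terms are 4 × 0.625 × 10, so the result is 25 up to floating-point rounding.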
+### AC-4 — AC-4: Invariant warnings + +**AC-4: Invariant warnings.** Post-evaluation analysis producing a `Vec` struct: +- **Non-negativity:** Flag bins where arrivals, served, errors, or queueDepth < -ε (ε = 1e-6) +- **Conservation:** Flag bins where served > arrivals + ε (for non-queue nodes) or served > capacity + ε +- **Queue balance:** Flag bins where computed queue depth diverges from actual (|computed - actual| > ε) +- **Stationarity:** Flag when arrivals first-half vs second-half mean diverges > 25% +- Warning struct: `{ node_id, code, message, bins, severity }` +### AC-5 — AC-5: Derived metrics integration in compiler + +**AC-5: Derived metrics integration in compiler.** The compiler emits derived metric ops after topology ops, reading from queue depth, served, capacity, and other evaluation columns. A new `compile_derived_metrics` phase appends ops to the plan. The `EvalResult` includes a method to retrieve warnings. +### AC-6 — AC-6: Parity tests + +**AC-6: Parity tests.** Test models verifying: +- Utilization: served=8, capacity=10 → utilization=0.8 +- Queue time: queueDepth=10, served=5, binMs=60000 → queueTimeMs=120000 +- Kingman: ρ=0.8, Ca=1.0, Cs=0.5, E[S]=10 → E[Wq]=25 +- Conservation violation: served > arrivals detected as warning +- Stationarity: increasing arrivals flagged +### AC-7 — AC-7: Existing tests unbroken + +**AC-7: Existing tests unbroken.** All 78 existing Rust tests still pass. +## Technical Notes + +- **Derived columns naming:** `{nodeId}_utilization`, `{nodeId}_queue_time_ms`, `{nodeId}_cycle_time_ms`, `{nodeId}_flow_efficiency`, `{nodeId}_latency_min`, `{nodeId}_kingman_wq`. +- **Bin duration:** Resolved from `grid.binSize` × `grid.binUnit` → milliseconds. Needed for queue time computation. +- **Cv computation:** For PMF nodes, Cv = σ/μ computed at compile time from the PMF definition. For const nodes, Cv = 0. For expr nodes, Cv is not computed (future: sample Cv from evaluated series). 
+- **Invariant analysis is read-only.** It does not modify the matrix. It returns a list of warnings. The warnings are stored alongside the EvalResult. +- **No new Op for most derived metrics.** Utilization = VecDiv(served, capacity). Queue time = VecDiv(queueDepth, served) then ScalarMul by binMs. These compose from existing ops. Kingman may need a dedicated op or can be computed at compile time from scalar inputs. + +## Out of Scope + +- Artifact writing (CSVs, index.json, run.json) — M-033 +- CLI commands — M-033 +- Per-class derived metrics — future +- Window-level aggregation (multi-bin statistics) — future +- Edge-specific warnings (edge flow conservation) — future +- Streak detection (backlog growth, overload, age risk) — future (could add in M-032 if time permits) + +## Dependencies + +- M-031 complete (routing, constraints, full evaluation pipeline) diff --git a/work/epics/E-20-matrix-engine/M-033-artifacts-cli-and-integration.md b/work/epics/E-20-matrix-engine/M-033-artifacts-cli-and-integration.md new file mode 100644 index 00000000..43567c05 --- /dev/null +++ b/work/epics/E-20-matrix-engine/M-033-artifacts-cli-and-integration.md @@ -0,0 +1,104 @@ +--- +id: M-033 +title: Artifacts, CLI, and Integration +status: done +parent: E-20 +acs: + - id: AC-1 + title: 'AC-1: CSV series writer' + status: met + - id: AC-2 + title: 'AC-2: series/index.json' + status: met + - id: AC-3 + title: 'AC-3: run.json' + status: met + - id: AC-4 + title: 'AC-4: CLI eval --output flag' + status: met + - id: AC-5 + title: 'AC-5: CLI validate command' + status: met + - id: AC-6 + title: 'AC-6: Round-trip parity test' + status: met + - id: AC-7 + title: 'AC-7: Existing tests unbroken' + status: met +--- + +## Goal + +Add artifact writing (per-series CSVs, index.json, run.json) and complete the CLI so the Rust engine can be invoked as a standalone binary that reads a YAML model and produces a run directory with all output artifacts. 
This is the final milestone — after it, the Rust engine is a complete standalone replacement for the C# evaluation pipeline. + +## Context + +M-032 delivered derived metrics and invariant analysis. The engine now handles the full computation pipeline: parsing, compilation, evaluation, derived metrics, and warnings. 113 Rust tests passing. + +The CLI already has `parse`, `plan`, and `eval` commands. The `eval` command prints series to stdout. This milestone extends `eval` to write structured artifacts to an output directory, matching the C# engine's output format. + +## Acceptance criteria + +### AC-1 — AC-1: CSV series writer + +**AC-1: CSV series writer.** Write each named (non-temp) column as a CSV file: +- Format: `bin_index,value\n` header, then `{t},{value}\n` per bin +- File naming: `{seriesId}.csv` (using the column name from the column map) +- Written to `{output}/series/` directory +- Values formatted in invariant culture (`.` decimal separator, no thousands separator) +### AC-2 — AC-2: series/index.json + +**AC-2: series/index.json.** Write a JSON index of all output series: +- Schema: `{ "schemaVersion": 1, "grid": {...}, "series": [{id, path, points}] }` +- Grid: bins, binSize, binUnit from the model +- One entry per non-temp series, referencing its CSV path +### AC-3 — AC-3: run.json + +**AC-3: run.json.** Write run metadata: +- Schema: `{ "schemaVersion": 1, "engineVersion": "0.1.0", "grid": {...}, "warnings": [...], "series": [{id, path}] }` +- Includes evaluation warnings from invariant analysis +- Warning format: `{ "nodeId", "code", "message", "severity" }` +### AC-4 — AC-4: CLI eval --output flag + +**AC-4: CLI eval --output flag.** Extend the `eval` command: +- `flowtime-engine eval --output

` — evaluates and writes artifacts to `` +- Creates `/series/` directory structure +- Writes CSVs + index.json + run.json +- Without `--output`, prints summary to stdout (existing behavior) +- Exit code 0 on success, 1 on error +### AC-5 — AC-5: CLI validate command + +**AC-5: CLI validate command.** Add `validate` command: +- `flowtime-engine validate ` — parses, compiles, and runs analysis without artifact writing +- Prints warnings to stdout as JSON +- Exit code 0 if no errors, 1 if compilation fails +### AC-6 — AC-6: Round-trip parity test + +**AC-6: Round-trip parity test.** End-to-end test: +- Load a reference model fixture, run `eval --output`, verify the CSV contents match expected values +- Verify index.json is valid JSON with correct series count +- Verify run.json contains grid and warnings +### AC-7 — AC-7: Existing tests unbroken + +**AC-7: Existing tests unbroken.** All 113 existing Rust tests still pass. +## Technical Notes + +- **New module: `writer.rs`** in `flowtime-core` handles artifact writing. It takes an `EvalResult` + `ModelDefinition` and writes to a directory. The CLI calls this module. +- **CSV precision:** Use `{value}` default f64 formatting (full precision). The C# engine uses invariant culture which is equivalent. +- **No hashing in this milestone.** The C# engine produces SHA256 hashes for series, model, and scenario. Deferring hashing to keep scope tight — can add in a follow-up. +- **No manifest.json in this milestone.** The manifest includes RNG/provenance data that the Rust engine doesn't produce yet. +- **No per-class series output.** Per-class columns use internal naming (`__class_`) and are not written as separate artifacts yet. +- **Temp column filtering:** Columns starting with `__temp_` are internal intermediates and are not written. 
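
The CSV format and temp-column filter described above can be sketched like this. The function names are hypothetical and the sketch renders to a `String` instead of writing files, so it is not the `writer.rs` implementation. It only filters the `__temp_` prefix; how per-class `__class_` columns are excluded is left out.

```rust
/// Render one series as CSV text: a `bin_index,value` header,
/// then one `{t},{value}` row per bin.
pub fn series_to_csv(values: &[f64]) -> String {
    let mut out = String::from("bin_index,value\n");
    for (t, v) in values.iter().enumerate() {
        // Rust's default `{}` formatting for f64 is locale-independent
        // and uses `.` as the decimal separator.
        out.push_str(&format!("{},{}\n", t, v));
    }
    out
}

/// Columns whose name starts with `__temp_` are internal intermediates
/// and are not written.
pub fn is_output_column(name: &str) -> bool {
    !name.starts_with("__temp_")
}

/// Hypothetical helper: path of a series file relative to the output directory.
pub fn series_path(series_id: &str) -> String {
    format!("series/{}.csv", series_id)
}
```

A caller would iterate the column map, keep the names for which `is_output_column` is true, and write `series_to_csv` output to `series_path(name)` under the output directory.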
+ +## Out of Scope + +- SHA256 hashing of artifacts — follow-up +- manifest.json (RNG, provenance) — follow-up +- Per-class series output — follow-up +- Parquet aggregate output — future +- .NET subprocess bridge wiring — future (separate integration work) +- stdin/stdout pipeline mode — future + +## Dependencies + +- M-032 complete (derived metrics, analysis, warnings) diff --git a/work/epics/E-20-matrix-engine/M-034-net-subprocess-bridge.md b/work/epics/E-20-matrix-engine/M-034-net-subprocess-bridge.md new file mode 100644 index 00000000..6dbebcf1 --- /dev/null +++ b/work/epics/E-20-matrix-engine/M-034-net-subprocess-bridge.md @@ -0,0 +1,83 @@ +--- +id: M-034 +title: .NET Subprocess Bridge +status: done +parent: E-20 +acs: + - id: AC-1 + title: 'AC-1: SHA256 hashing in Rust writer' + status: met + - id: AC-2 + title: 'AC-2: RustEngineRunner in FlowTime.Core' + status: met + - id: AC-3 + title: 'AC-3: Configuration switch' + status: met + - id: AC-4 + title: 'AC-4: Integration tests' + status: met +--- + +## Goal + +Bridge the Rust `flowtime-engine` binary into the .NET API as a subprocess call, with SHA256 hashing for provenance (per D-043), and a configuration switch to run the Rust engine alongside (not replacing) the C# engine. + +## Context + +M-033 delivered the Rust CLI binary and artifact writer (CSVs, index.json, run.json). The E-20 spec lists the ".NET subprocess bridge" and "full parity harness" as in-scope deliverables (spec line 66). The epic was marked complete before these were built. This milestone delivers the bridge and foundational parity tests. + +D-043 established the provenance strategy: port SHA256 basics (model hash + per-series hashes in manifest.json) as part of the bridge work. Sim-specific provenance is deferred until the bridge is exercised by real Sim runs. 
+ +## Acceptance criteria + +### AC-1 — AC-1: SHA256 hashing in Rust writer + +**AC-1: SHA256 hashing in Rust writer.** The Rust artifact writer computes: +- SHA256 of the raw model YAML text (model hash) +- SHA256 of each series CSV file (series hashes) +- Writes `manifest.json` with `modelHash` and per-series `hash` fields +- Hash format: `"sha256:{hex}"` +- When no YAML text is available, `modelHash` is `null` +### AC-2 — AC-2: RustEngineRunner in FlowTime.Core + +**AC-2: RustEngineRunner in FlowTime.Core.** A subprocess bridge class that: +- Writes model YAML to a temp file (UTF-8, no BOM) +- Invokes `flowtime-engine eval --output ` +- Reads back `run.json`, `manifest.json`, and series CSVs +- Returns typed DTOs (grid, warnings, series values, manifest with hashes) +- Cleans up temp directory on both success and failure (finally block) +- Configurable process timeout (default 60s) with process tree kill on expiry +- Clean `RustEngineException` for missing binary, non-zero exit, and timeout +### AC-3 — AC-3: Configuration switch + +**AC-3: Configuration switch.** Opt-in via `appsettings.json`: +- `RustEngine:Enabled` (default `false`) +- `RustEngine:BinaryPath` (default: auto-discover from solution root) +- DI registration in `Program.cs` when enabled +- Does not replace C# evaluation path — both engines available side by side +### AC-4 — AC-4: Integration tests + +**AC-4: Integration tests.** At least one parity test plus error coverage: +- C#/Rust parity on simple const+expr model +- C#/Rust parity on topology model (serviceWithBuffer queue) +- C#/Rust parity on negative/precision values +- Empty model (0 nodes) +- Invalid YAML error handling +- Binary not found error handling +- Process timeout handling +- Temp directory cleanup on success and failure +- Manifest hash presence and determinism +## Out of Scope + +- Replacing the C# evaluation path in `POST /v1/run` +- Sim-specific provenance fields (template IDs, parameter bindings) +- Plan hashing (deferred 
to E-17/E-18) +- Per-class output support +- Full C# parity harness across all fixture models + +## Test Summary + +- 4 Rust unit tests (manifest structure, hash determinism, null hash, SHA256 correctness) +- 14 C# integration tests (see AC-4) +- All 123 existing Rust tests pass +- All 1,301 existing .NET tests pass diff --git a/work/epics/E-20-matrix-engine/M-035-full-parity-harness.md b/work/epics/E-20-matrix-engine/M-035-full-parity-harness.md new file mode 100644 index 00000000..9a11d69b --- /dev/null +++ b/work/epics/E-20-matrix-engine/M-035-full-parity-harness.md @@ -0,0 +1,79 @@ +--- +id: M-035 +title: Full Parity Harness +status: done +parent: E-20 +depends_on: + - M-034 +acs: + - id: AC-1 + title: 'AC-1: outputs: filtering in Rust compiler' + status: met + - id: AC-2 + title: 'AC-2: Parameterized parity test' + status: met + - id: AC-3 + title: 'AC-3: All non-class, non-edge fixtures pass parity' + status: met + - id: AC-4 + title: 'AC-4: Class and edge fixtures documented' + status: met + - id: AC-5 + title: 'AC-5: Parity matrix output' + status: met +--- + +## Goal + +Establish an automated parity test that runs every Rust engine fixture (21 models) through both the Rust and C# engines and compares series values. Produces a green/red matrix showing exactly which models match and which diverge, and where. This is the baseline before any engine core work. + +## Context + +M-034 delivered the .NET subprocess bridge and 3 parity tests (simple const+expr, topology queue, negative/precision values). The E-20 spec promises a "full parity harness" across reference models. Only 3 of 21 fixtures are tested today — topology, routing, constraint, PMF, and class-enabled models are untested. + +The `outputs:` filtering feature (YAML `outputs` section) is parsed by the Rust model but not used. Some models rely on it to select output series. This must be implemented for the harness to test those models correctly. 
+ +## Acceptance criteria + +### AC-1 — AC-1: outputs: filtering in Rust compiler + +**AC-1: `outputs:` filtering in Rust compiler.** When the model has an `outputs` section, the Rust engine filters its output to only include the listed series. The `as` field renames the series in the output. When no `outputs` section is present, all non-temp series are included (current behavior). +### AC-2 — AC-2: Parameterized parity test + +**AC-2: Parameterized parity test.** A single test method that: +- Iterates over all `engine/fixtures/*.yaml` files +- Evaluates each through the Rust engine (via `RustEngineRunner`) +- Evaluates each through the C# engine (`ModelService.ParseAndConvert` → `RouterAwareGraphEvaluator.Evaluate`) +- Compares shared series values bin-by-bin with configurable tolerance (default: 1e-10) +- Uses case-insensitive series matching (Rust lowercases topology node IDs) +- Reports per-fixture, per-series pass/fail with divergence details on failure +### AC-3 — AC-3: All non-class, non-edge fixtures pass parity + +**AC-3: All non-class, non-edge fixtures pass parity.** The following fixtures must produce identical series values in both engines: +- `hello.yaml`, `simple-const.yaml` — trivial models +- `complex-pmf.yaml`, `pmf.yaml` — PMF nodes +- `http-service.yaml` — expression-based service +- `topology-simple-queue.yaml`, `topology-backpressure.yaml`, `topology-cascading-overflow.yaml`, `topology-wip-limit.yaml`, `topology-dispatch.yaml`, `topology-retry-echo.yaml` — topology models +- `constraint-below-capacity.yaml`, `constraint-proportional.yaml` — constraint allocation +- `router-weight.yaml`, `router-with-constraint.yaml` — weight-based routing +- `retry-service-time.yaml` — retry kernels +- `order-system.yaml`, `microservices.yaml` — complex multi-node models +### AC-4 — AC-4: Class and edge fixtures documented + +**AC-4: Class and edge fixtures documented.** Fixtures that use classes (`class-enabled.yaml`, `router-class.yaml`, `router-mixed.yaml`) 
are tested but expected divergences are documented. The harness marks them as "known divergence — per-class decomposition not yet implemented" rather than failing the test run. +### AC-5 — AC-5: Parity matrix output + +**AC-5: Parity matrix output.** The test run produces a clear summary (in test output or a generated report) showing pass/fail status for each fixture. This becomes the baseline for M-036. +## Out of Scope + +- Per-class column decomposition (M-036) +- Edge series materialization (M-036) +- Artifact layout changes (M-037) +- Any changes to the C# engine + +## Key References + +- `engine/fixtures/` — 21 YAML fixtures +- `engine/core/src/model.rs` — `OutputDefinition` struct (parsed, not yet used) +- `tests/FlowTime.Integration.Tests/RustEngineBridgeTests.cs` — existing 14 bridge tests +- `work/gaps.md` — Rust Engine Parity section diff --git a/work/epics/E-20-matrix-engine/M-036-per-class-decomposition-and-edge-series.md b/work/epics/E-20-matrix-engine/M-036-per-class-decomposition-and-edge-series.md new file mode 100644 index 00000000..88addd0e --- /dev/null +++ b/work/epics/E-20-matrix-engine/M-036-per-class-decomposition-and-edge-series.md @@ -0,0 +1,101 @@ +--- +id: M-036 +title: Per-Class Decomposition and Edge Series +status: done +parent: E-20 +depends_on: + - M-035 +acs: + - id: AC-1 + title: 'AC-1: Class assignment map' + status: met + - id: AC-2 + title: 'AC-2: Per-class series in EvalResult' + status: met + - id: AC-3 + title: 'AC-3: Expression-tree per-class evaluation' + status: met + - id: AC-4 + title: 'AC-4: ServiceWithBuffer per-class decomposition' + status: met + - id: AC-5 + title: 'AC-5: Edge series in EvalResult' + status: met + - id: AC-6 + title: 'AC-6: Router per-class distribution' + status: met + - id: AC-7 + title: 'AC-7: Parity harness green for class fixtures' + status: met + - id: AC-8 + title: 'AC-8: Parity harness green for edge fixtures' + status: met + - id: AC-9 + title: 'AC-9: Normalization invariant' + status: met 
+--- + +## Goal + +The Rust engine core returns complete evaluation results including per-class series decomposition and per-edge metrics. After this milestone, the engine is feature-complete for evaluation — it computes everything the C# engine computes. This is the critical prerequisite for E-17 and E-18. + +## Context + +The Rust engine currently computes class-based routing internally (using `__class_` prefixed temporary columns) but does not expose per-class series in its output. The C# engine uses `ClassContributionBuilder` (1,695 lines, 4-pass algorithm) and `EdgeFlowMaterializer` (764 lines) to produce per-class and per-edge series after evaluation. + +Per D-044, the engine core must return complete results. The artifact sink (M-037) then formats and persists them. This milestone focuses on the evaluation layer only. + +### Design consideration: port vs. redesign + +The C# `ClassContributionBuilder` is a post-evaluation 4-pass algorithm that decomposes total series into per-class contributions using proportional allocation and expression-tree re-evaluation. In the Rust matrix engine, an alternative approach may be more natural: + +- **Option A (port):** Implement a post-evaluation decomposition pass similar to C#. Receives the evaluated state matrix and splits columns by class proportions. +- **Option B (plan ops):** Extend the compiler to emit per-class columns as explicit plan operations during compilation. Each class gets its own columns, evaluated in the main bin-major loop. No post-processing needed. + +Option B leverages the matrix architecture — it's "classes as columns" rather than "classes as post-processing." But it may increase matrix size for models with many classes. The spec author should evaluate both approaches during implementation and choose based on correctness and simplicity. Document the choice in the tracking doc. 
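
To make Option A concrete, here is a minimal sketch of a post-evaluation proportional split. It is not a port of `ClassContributionBuilder`: `decompose_by_class` is a hypothetical name, it handles only the proportional-allocation step, and it assumes every class series has at least as many bins as the total. Bins with no class arrivals are left at zero, which is an assumption and would break the sum-to-total property if the total is non-zero there.

```rust
/// Split a total series into per-class series using per-bin class arrival fractions.
/// `class_arrivals[c][t]` is the arrivals of class `c` in bin `t`.
pub fn decompose_by_class(total: &[f64], class_arrivals: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let mut out = vec![vec![0.0; total.len()]; class_arrivals.len()];
    for t in 0..total.len() {
        let sum: f64 = class_arrivals.iter().map(|c| c[t]).sum();
        if sum <= 0.0 {
            // Assumption: with no class arrivals in this bin, every share stays 0.
            continue;
        }
        for (c, arrivals) in class_arrivals.iter().enumerate() {
            out[c][t] = total[t] * (arrivals[t] / sum);
        }
    }
    out
}
```

Under Option B the same fractions would instead appear as explicit per-class columns produced by plan ops, so no pass like this would run after evaluation.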
+ +## Acceptance criteria + +### AC-1 — AC-1: Class assignment map + +**AC-1: Class assignment map.** The engine extracts `traffic.arrivals` entries to build a node-to-class mapping. Equivalent to `ClassAssignmentMapBuilder` (trivial — 37 lines of C#). +### AC-2 — AC-2: Per-class series in EvalResult + +**AC-2: Per-class series in EvalResult.** The `EvalResult` struct (or its successor) includes per-class series: for each node that has class contributions, the result contains `(node_id, class_id) → f64[]` series values. At minimum, arrival nodes, expression nodes, and serviceWithBuffer nodes must have per-class decomposition. +### AC-3 — AC-3: Expression-tree per-class evaluation + +**AC-3: Expression-tree per-class evaluation.** For `expr` nodes, per-class decomposition must handle the expression tree correctly: binary ops (add, sub, mul, div), scalar ops, and functions (SHIFT, CONV, MIN, MAX, CLAMP). The per-class series must sum to the total series within floating-point tolerance. +### AC-4 — AC-4: ServiceWithBuffer per-class decomposition + +**AC-4: ServiceWithBuffer per-class decomposition.** Queue nodes produce per-class queue depth, per-class served, per-class arrivals. Queue depth decomposition follows proportional allocation based on arrival class fractions. +### AC-5 — AC-5: Edge series in EvalResult + +**AC-5: Edge series in EvalResult.** The result includes per-edge metrics: `(edge_id, metric) → f64[]` where metric is one of: `flowVolume`, `attemptsVolume`, `failuresVolume`, `retryVolume`. Per-class edge decomposition: `(edge_id, metric, class_id) → f64[]`. +### AC-6 — AC-6: Router per-class distribution + +**AC-6: Router per-class distribution.** Class-based routes distribute flow to their designated targets. Weight-based routes distribute remaining (non-class-assigned) flow by weight. Router diagnostics (leakage, accuracy) are available in the result. 
+### AC-7 — AC-7: Parity harness green for class fixtures + +**AC-7: Parity harness green for class fixtures.** The 3 class-enabled fixtures (`class-enabled.yaml`, `router-class.yaml`, `router-mixed.yaml`) pass the parity harness from M-035. Known divergences are resolved. +### AC-8 — AC-8: Parity harness green for edge fixtures + +**AC-8: Parity harness green for edge fixtures.** The 6 edge-bearing fixtures pass the parity harness with edge series comparison. Edge series values match C# `EdgeFlowMaterializer` output within tolerance. +### AC-9 — AC-9: Normalization invariant + +**AC-9: Normalization invariant.** For every node, the sum of per-class series equals the total series within 1e-10 tolerance. A Rust test asserts this invariant across all class-enabled fixtures. +## Out of Scope + +- Artifact directory layout changes (M-037) +- Series ID naming convention (`{node}@{component}@{class}`) — that's a sink concern +- Per-class CSV file writing — that's a sink concern +- StateQueryService compatibility — that's a sink concern +- Sim-specific provenance (D-043 deferral still applies) + +## Key References + +- `src/FlowTime.Core/Artifacts/ClassContributionBuilder.cs` — C# reference (1,695 lines) +- `src/FlowTime.Core/Routing/EdgeFlowMaterializer.cs` — C# reference (764 lines) +- `src/FlowTime.Core/Artifacts/ClassAssignmentMapBuilder.cs` — C# reference (37 lines) +- `engine/core/src/compiler.rs` — existing class-aware routing (lines 397-533) +- `engine/fixtures/class-enabled.yaml`, `router-class.yaml`, `router-mixed.yaml` — class fixtures +- D-044 — three-layer architecture decision diff --git a/work/epics/E-20-matrix-engine/M-037-artifact-sink-parity.md b/work/epics/E-20-matrix-engine/M-037-artifact-sink-parity.md new file mode 100644 index 00000000..4a079b79 --- /dev/null +++ b/work/epics/E-20-matrix-engine/M-037-artifact-sink-parity.md @@ -0,0 +1,125 @@ +--- +id: M-037 +title: Artifact Sink Parity +status: done +parent: E-20 +depends_on: + - M-036 +acs: + - id: AC-1 + 
title: 'AC-1: model/ directory' + status: met + - id: AC-2 + title: 'AC-2: spec.yaml at run root' + status: met + - id: AC-3 + title: 'AC-3: Series ID naming convention' + status: met + - id: AC-4 + title: 'AC-4: Full series/index.json schema' + status: met + - id: AC-5 + title: 'AC-5: Full run.json schema' + status: met + - id: AC-6 + title: 'AC-6: Full manifest.json schema' + status: met + - id: AC-7 + title: 'AC-7: aggregates/ directory' + status: met + - id: AC-8 + title: 'AC-8: Deterministic run ID' + status: met + - id: AC-9 + title: 'AC-9: StateQueryService integration test' + status: met + - id: AC-10 + title: 'AC-10: Parity with C# artifact layout' + status: met +--- + +## Goal + +The Rust artifact sink produces the full directory layout that `StateQueryService` can read. After this milestone, the C# `RunArtifactWriter` is no longer needed for Rust-evaluated runs. E-17 and E-18 are unblocked. + +## Context + +The Rust engine (after M-036) returns complete evaluation results: total series, per-class series, edge series, warnings, grid info, and metadata. The current `writer.rs` produces a minimal artifact set (bare series CSVs, simple index.json, run.json, manifest.json). The C# `RunArtifactWriter` produces a much richer layout that `StateQueryService` expects: `model/` directory, normalized `spec.yaml`, per-class CSV naming, full JSON schemas with class metadata, and provenance files. + +Per D-044, the artifact sink is a separate layer from the engine core. It receives the model input and EvalResult, and persists them durably. This milestone builds the full sink as a Rust library used by both the CLI and the .NET bridge. 
+ +## Acceptance criteria + +### AC-1 — AC-1: model/ directory + +**AC-1: `model/` directory.** The sink writes: +- `model/model.yaml` — copy of the input model YAML +- `model/metadata.json` — template metadata extracted from YAML provenance section and/or passed metadata: `{ schemaVersion, templateId, templateTitle, templateVersion, mode, modelHash, source, hasTelemetrySources, telemetrySources, nodeSources, parameters }` +- `model/provenance.json` — written when provenance metadata is provided (pass-through). Omitted when absent (backward compatible). +### AC-2 — AC-2: spec.yaml at run root + +**AC-2: `spec.yaml` at run root.** Normalized model YAML with topology semantics rewritten to `file://` URIs pointing to series CSV paths. This is what `StateQueryService` reads to resolve topology node bindings. +### AC-3 — AC-3: Series ID naming convention + +**AC-3: Series ID naming convention.** Series files use the format `{nodeId}@{COMPONENT_ID}@{CLASS_ID}.csv`: +- Default (no class): `{nodeId}@{COMPONENT_ID}@DEFAULT.csv` +- Per-class: `{nodeId}@{COMPONENT_ID}@{classId}.csv` +- Edge: `edge_{edgeId}_{metric}@{COMPONENT_ID}@{classId}.csv` +- Component IDs follow C# conventions: `ARRIVALS`, `SERVED`, `QUEUE`, `ERRORS`, etc. 
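
The naming convention in AC-3 reduces to two format strings. The sketch below uses hypothetical function names, and falling back to `DEFAULT` for an edge series without a class is an assumption, since AC-3 only shows the edge form with an explicit `{classId}`.

```rust
/// File name for a node series: `{nodeId}@{COMPONENT_ID}@{CLASS_ID}.csv`.
/// A missing class maps to the `DEFAULT` class ID.
pub fn series_file_name(node_id: &str, component_id: &str, class_id: Option<&str>) -> String {
    format!("{}@{}@{}.csv", node_id, component_id, class_id.unwrap_or("DEFAULT"))
}

/// File name for an edge series: `edge_{edgeId}_{metric}@{COMPONENT_ID}@{classId}.csv`.
/// Assumption: edges without a class also use `DEFAULT`.
pub fn edge_series_file_name(
    edge_id: &str,
    metric: &str,
    component_id: &str,
    class_id: Option<&str>,
) -> String {
    format!(
        "edge_{}_{}@{}@{}.csv",
        edge_id,
        metric,
        component_id,
        class_id.unwrap_or("DEFAULT")
    )
}
```

Keeping both builders in one place would let the CSV writer and the `series/index.json` writer agree on paths without duplicating the format.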
+### AC-4 — AC-4: Full series/index.json schema + +**AC-4: Full `series/index.json` schema.** Each series entry includes: +- `id`, `kind` (flow/stock/ratio/time), `path`, `unit`, `componentId`, `class`, `classKind` (fallback/specific), `points`, `hash` +- `formats` section with aggregates table reference +- `classes` array with declared class definitions +- `classCoverage` field (full/partial/missing) +### AC-5 — AC-5: Full run.json schema + +**AC-5: Full `run.json` schema.** Includes: +- `schemaVersion`, `runId`, `engineVersion`, `source`, `inputHash` +- `grid` (bins, binSize, binUnit, timezone, align) +- `scenarioHash`, `modelHash` +- `classesCoverage` +- `warnings` array (nodeId, code, message, severity, bins) +- `series` array (id, path, unit) +- `classes` array (id, displayName, description) +### AC-6 — AC-6: Full manifest.json schema + +**AC-6: Full `manifest.json` schema.** Extends existing to include: +- `rng` section (kind, seed) +- `provenance` section (hasProvenance, modelId, templateId, inputHash) +- `classes` array +- `seriesHashes` (per-series SHA256) +- `createdUtc` timestamp +### AC-7 — AC-7: aggregates/ directory + +**AC-7: `aggregates/` directory.** Created as a placeholder (empty directory). Matches C# behavior. +### AC-8 — AC-8: Deterministic run ID + +**AC-8: Deterministic run ID.** When deterministic mode is requested, run ID is derived from `sha256(normalized_spec + seed + bias)` truncated to 16 hex chars. Matches C# `DeterministicRunNaming`. +### AC-9 — AC-9: StateQueryService integration test + +**AC-9: StateQueryService integration test.** A C# integration test that: +- Evaluates a class-enabled model through the Rust engine + sink +- Loads the produced run directory via `StateQueryService.LoadContextAsync` +- Verifies: topology resolved, per-class series loadable, provenance hash valid, warnings present +- This is the definitive proof that the Rust sink is compatible. 
+### AC-10 — AC-10: Parity with C# artifact layout + +**AC-10: Parity with C# artifact layout.** For a reference model, produce artifacts from both C# `RunArtifactWriter` and Rust sink. Compare directory structures and file contents. Document any intentional differences. +## Out of Scope + +- Replacing `RunArtifactWriter` callers (that's wiring work for when the switch happens) +- Parquet aggregates (placeholder only — future work) +- Telemetry bundle building (stays in C# `TelemetryBundleBuilder`) +- Template orchestration (stays in C# `RunOrchestrationService`) +- Storage backend abstraction (filesystem only — S3/database is future) + +## Key References + +- `src/FlowTime.Core/Artifacts/RunArtifactWriter.cs` — C# reference (1,287 lines) +- `src/FlowTime.API/Services/StateQueryService.cs` — reads artifacts back (5,195 lines) +- `src/FlowTime.Core/Artifacts/DeterministicRunNaming.cs` — run ID generation +- `engine/core/src/writer.rs` — current minimal Rust writer +- D-044 — three-layer architecture (engine core / artifact sink / consumer adapters) +- D-043 — provenance strategy diff --git a/work/epics/completed/E-20-matrix-engine/spec.md b/work/epics/E-20-matrix-engine/epic.md similarity index 87% rename from work/epics/completed/E-20-matrix-engine/spec.md rename to work/epics/E-20-matrix-engine/epic.md index f5c3a340..5ba4cafc 100644 --- a/work/epics/completed/E-20-matrix-engine/spec.md +++ b/work/epics/E-20-matrix-engine/epic.md @@ -1,7 +1,11 @@ -# Epic: Matrix Engine +--- +id: E-20 +title: Matrix Engine +status: done +--- + +> **Status note:** M-028 through M-037 all complete -**ID:** E-20 -**Status:** complete (m-E20-01–10 all complete) **Owner:** Engine ## Goal @@ -204,16 +208,16 @@ Artifacts (CSVs + index.json + run.json) | ID | Title | Summary | Status | |----|-------|---------|--------| -| m-E20-01 | Scaffold, types, and parsers | Rust crate, model types with serde, YAML deserialization, expression parser. Reference model fixtures. Devcontainer Rust toolchain.
| complete | -| m-E20-02 | Compiler and core evaluator | Column map, topo sort, plan generation for const/expr nodes. Evaluator loop + element-wise ops. First end-to-end parity on simple models. | complete | -| m-E20-03 | Topology and sequential ops | Queue synthesis, QueueRecurrence, Shift, Convolve, DispatchGate, PMF, WIP limits with overflow. Feedback subgraphs (bin-sequential). | complete | -| m-E20-04 | Routing and constraints | Router flow materialization, constraint allocation, multi-class flow distribution — all as plan ops. | complete | -| m-E20-05 | Derived metrics and analysis | Utilization, latency, cycle time, Cv, Kingman as plan ops. Invariant analysis as column arithmetic. Warnings. | complete | -| m-E20-06 | Artifacts, CLI, and integration | CSV/JSON artifact writer. CLI (eval, validate, plan). | complete | -| m-E20-07 | .NET subprocess bridge | SHA256 hashing + manifest.json. RustEngineRunner subprocess bridge. Config switch. Parity tests. | complete | -| m-E20-08 | Full parity harness | All 21 fixtures tested against C# engine. `outputs:` filtering. Green/red parity matrix. | complete | -| m-E20-09 | Per-class decomposition and edge series | Per-class columns, edge metrics, class assignment. Engine core feature-complete. | complete | -| m-E20-10 | Artifact sink parity | Full directory layout. StateQueryService compatible. RunArtifactWriter replaceable. | complete | +| M-028 | Scaffold, types, and parsers | Rust crate, model types with serde, YAML deserialization, expression parser. Reference model fixtures. Devcontainer Rust toolchain. | complete | +| M-029 | Compiler and core evaluator | Column map, topo sort, plan generation for const/expr nodes. Evaluator loop + element-wise ops. First end-to-end parity on simple models. | complete | +| M-030 | Topology and sequential ops | Queue synthesis, QueueRecurrence, Shift, Convolve, DispatchGate, PMF, WIP limits with overflow. Feedback subgraphs (bin-sequential). 
| complete | +| M-031 | Routing and constraints | Router flow materialization, constraint allocation, multi-class flow distribution — all as plan ops. | complete | +| M-032 | Derived metrics and analysis | Utilization, latency, cycle time, Cv, Kingman as plan ops. Invariant analysis as column arithmetic. Warnings. | complete | +| M-033 | Artifacts, CLI, and integration | CSV/JSON artifact writer. CLI (eval, validate, plan). | complete | +| M-034 | .NET subprocess bridge | SHA256 hashing + manifest.json. RustEngineRunner subprocess bridge. Config switch. Parity tests. | complete | +| M-035 | Full parity harness | All 21 fixtures tested against C# engine. `outputs:` filtering. Green/red parity matrix. | complete | +| M-036 | Per-class decomposition and edge series | Per-class columns, edge metrics, class assignment. Engine core feature-complete. | complete | +| M-037 | Artifact sink parity | Full directory layout. StateQueryService compatible. RunArtifactWriter replaceable. | complete | ### Milestone progression diff --git a/work/epics/E-21-svelte-workbench-analysis-surfaces/M-038-workbench-foundation.md b/work/epics/E-21-svelte-workbench-analysis-surfaces/M-038-workbench-foundation.md new file mode 100644 index 00000000..dc013040 --- /dev/null +++ b/work/epics/E-21-svelte-workbench-analysis-surfaces/M-038-workbench-foundation.md @@ -0,0 +1,187 @@ +--- +id: M-038 +title: Workbench Foundation +status: done +parent: E-21 +acs: + - id: AC-1 + title: Main content area padding reduced + status: met + - id: AC-2 + title: Sidebar narrowed + status: met + - id: AC-3 + title: Compact design tokens defined in app.css + status: met + - id: AC-4 + title: shadcn component overrides applied + status: met + - id: AC-5 + title: Existing pages still function + status: met + - id: AC-6 + title: bindEvents() exported from dag-map + status: met + - id: AC-7 + title: selected render option in dag-map + status: met + - id: AC-8 + title: dag-map tests cover events and selection + status: 
met + - id: AC-9 + title: Topology page restructured as split layout + status: met + - id: AC-10 + title: Click-to-pin interaction + status: met + - id: AC-11 + title: Node card content + status: met + - id: AC-12 + title: Timeline integration + status: met + - id: AC-13 + title: Cards dismissible + status: met + - id: AC-14 + title: Auto-pin highest-utilization node on first load + status: met + - id: AC-15 + title: Playwright test coverage + status: met + - id: AC-16 + title: Vitest coverage for new pure logic + status: met +--- + +## Goal + +Establish the compact design system, implement dag-map click/hover events in the library, and build the workbench panel with click-to-pin node inspection — the foundation that every subsequent E-21 milestone builds on. + +## Context + +The Svelte UI has topology rendering (dag-map with heatmap mode), timeline scrubbing, run orchestration, and what-if parameter manipulation. But there is no way to click a node and inspect it. The layout uses shadcn-svelte's consumer-product defaults (generous padding, large text, wide sidebar) which waste space in a data-dense workbench. + +This milestone delivers three things in sequence: +1. Compact design tokens that replace the spacious defaults +2. dag-map library events so nodes/edges are clickable +3. Workbench panel where clicked nodes show metrics and sparklines + +### Prior art + +- dag-map already emits `data-node-id` and `data-edge-from`/`data-edge-to` attributes on SVG elements (`lib/dag-map/src/render.js`) +- Topology page (`ui/src/routes/time-travel/topology/+page.svelte`) has timeline scrubbing and state API integration +- What-if page (`ui/src/routes/what-if/+page.svelte`) has real-time metric display via WebSocket +- `DagMapView.svelte` wraps dag-map with theme switching and metric mapping + +## Acceptance criteria + +### AC-1 — Main content area padding reduced + +**Main content area padding reduced.** The root layout no longer applies blanket `p-6`. 
Page-level padding is context-dependent: topology/workbench pages use minimal padding (`p-1` or `p-2`), form/config pages may use moderate padding. +### AC-2 — Sidebar narrowed + +**Sidebar narrowed.** Expanded sidebar width ≤ 208px (from 280px). Collapsed width ≤ 40px. All nav items still readable and clickable. +### AC-3 — Compact design tokens defined in app.css + +**Compact design tokens defined in `app.css`.** Two token layers: +- **Chrome tokens:** `--ft-bg`, `--ft-bg-elevated`, `--ft-border`, `--ft-text`, `--ft-text-muted`, `--ft-text-emphasis`. Calm values. Dark mode: near-black backgrounds (`hsl(220 10% 4%)`-range), subtle borders, muted gray text. Light mode: warm light backgrounds, subtle borders, dark text. +- **Data-viz tokens:** `--ft-viz-teal`, `--ft-viz-pink`, `--ft-viz-coral`, `--ft-viz-blue`, `--ft-viz-green`, `--ft-viz-amber` plus sequential/diverging scale entry points. Vivid against both dark and light backgrounds. +- **Spacing tokens:** `--ft-space-xs` (2px), `--ft-space-sm` (4px), `--ft-space-md` (6px), `--ft-space-lg` (8px), `--ft-space-xl` (12px). Tighter than the current 4/8/12/16/24px scale. +- **Border radius:** `--ft-radius` at `0.25rem` or less (from `0.5rem`). +- **Type:** working text size is `text-xs` (12px). Emphasis is `text-sm` (14px). Headers use `text-sm font-semibold` or `text-base`. +### AC-4 — shadcn component overrides applied + +**shadcn component overrides applied.** Cards, buttons, inputs, and sidebar components use the compact tokens. No component uses raw `p-4`, `p-6`, `gap-4` etc. — spacing comes from the token scale. +### AC-5 — Existing pages still function + +**Existing pages still function.** What-if page, run orchestration, topology page, health page all render correctly with the new density. Visual audit confirms no layout breakage. Vitest and Playwright suites still pass. 
+### AC-6 — bindEvents() exported from dag-map + +**`bindEvents()` exported from dag-map.** Given an SVG container element, `bindEvents(container, callbacks)` uses event delegation to fire: +- `onNodeClick(nodeId, event)` — click on any `[data-node-id]` element +- `onNodeHover(nodeId | null, event)` — mouseenter/mouseleave on node elements +- `onEdgeClick(fromId, toId, event)` — click on any `[data-edge-from]` element +- `onEdgeHover(fromId, toId | null, event)` — mouseenter/mouseleave on edge elements +- Returns a cleanup function that removes all listeners. +- Edge hit areas: edge paths are thin lines. `bindEvents` should set `pointer-events: stroke` and use a wider invisible stroke or a transparent hit-area overlay (≥ 8px clickable width) so edges are practically clickable. +### AC-7 — selected render option in dag-map + +**`selected` render option in dag-map.** `renderSVG(dag, layout, { ..., selected: Set })` draws a selection indicator (ring, outline, or highlight) on nodes whose ID is in the set. The selection visual must compose correctly with heatmap mode (heatmap fills + selection ring, not one replacing the other). +### AC-8 — dag-map tests cover events and selection + +**dag-map tests cover events and selection.** Unit tests (dag-map's existing test infrastructure) verify: `bindEvents` fires correct callbacks for node/edge clicks and hovers; `selected` set renders the selection indicator; selection composes with heatmap mode. dag-map version bumped and published (or linked via workspace protocol). +### AC-9 — Topology page restructured as split layout + +**Topology page restructured as split layout.** The topology page shows the DAG in the upper area and the workbench panel in the lower area, separated by a resizable split (drag to resize, reasonable default like 60/40 or 65/35). When no nodes are pinned, the workbench shows a minimal empty state hint ("Click a node to inspect"). 
+### AC-10 — Click-to-pin interaction + +**Click-to-pin interaction.** Clicking a node in the topology DAG pins it to the workbench. The node appears with a selection indicator in the DAG (via `selected` set) and a card in the workbench. Clicking a pinned node again unpins it (removes card, removes selection indicator). Multiple nodes can be pinned simultaneously. +### AC-11 — Node card content + +**Node card content.** Each workbench card shows: +- Node ID and kind (service, queue, dlq, source, router, etc.) +- Key metrics at the current timeline bin: utilization, queue depth, arrivals, served, errors, capacity — as available from the state API response +- Sparkline showing the selected metric over the full time window (all bins) +- Values formatted with appropriate precision (see `format.ts` utilities) +- Compact layout using the density tokens — the card should fit meaningful content in ~180-220px width +### AC-12 — Timeline integration + +**Timeline integration.** When the timeline scrubs (bin changes), all workbench card metric values update to the new bin. Sparklines show a position indicator (vertical line or dot) at the current bin. +### AC-13 — Cards dismissible + +**Cards dismissible.** Each card has a small close/unpin control. Dismissing a card removes the node from the `selected` set and the workbench. +### AC-14 — Auto-pin highest-utilization node on first load + +**Auto-pin highest-utilization node on first load.** When a run loads and state data is available, the node with the highest utilization at bin 0 is auto-pinned to the workbench so it is never empty on first view. If utilization data is unavailable, skip auto-pin (empty state is acceptable). +### AC-15 — Playwright test coverage + +**Playwright test coverage.** At least one Playwright spec covering: (a) topology loads and renders, (b) clicking a node opens a workbench card, (c) clicking the close control removes the card, (d) scrubbing the timeline updates card values. 
Specs skip gracefully if the API or dev server is unavailable. +### AC-16 — Vitest coverage for new pure logic + +**Vitest coverage for new pure logic.** Any new helper functions (metric extraction, card data shaping, sparkline data preparation) have vitest tests with branch coverage. +## Technical Notes + +### Density system approach + +- Introduce the `--ft-*` custom properties alongside the existing shadcn `--background`, `--foreground` etc. variables. The shadcn variables can initially alias the `--ft-*` tokens, keeping component library compatibility while allowing the token layer to diverge. +- The dark mode near-black should feel like the `.scratch/colors.png` reference: `hsl(220 10% 4%)` background, `hsl(220 8% 8%)` elevated, `hsl(220 6% 14%)` border. Text at `hsl(220 10% 65%)` for muted, `hsl(220 5% 85%)` for default. +- Light mode: `hsl(220 10% 97%)` background, `hsl(0 0% 100%)` elevated, `hsl(220 10% 88%)` border. Text at `hsl(220 10% 40%)` for muted, `hsl(220 10% 15%)` for default. +- Data-viz colors from `reference-palette.png` (epic folder) hue families: teal `#94E2D5`/`#2B8A8E`, pink `#F38BA8`/`#C45B4A`, coral `#EB6F92`, blue `#89B4FA`/`#3D5BA9`, green `#A6E3A1`/`#4A8C5C`, amber `#F9E2AF`/`#D4944C`. The dark-mode values are lighter/more vivid; light-mode values are darker/more saturated to maintain contrast. + +### dag-map events + +- dag-map already emits `data-node-id` on station `` groups and `data-edge-from`/`data-edge-to` on edge `` elements. `bindEvents()` delegates from the SVG container using these attributes. +- For edge hit areas: add an invisible wider stroke path (same shape, stroke-width 8-12px, `opacity: 0`, `pointer-events: stroke`) behind each visible edge path. This is a render-time addition in `render.js`, not a separate overlay. +- The `selected` visual should be a ring or glow around the node circle/rect, using a dedicated CSS class (`dag-map-selected`) so consumers can override the style. 
Default: 2px outline in the theme's `ink` color, offset by 2px. + +### Workbench panel + +- Implement the split as a CSS grid with `grid-template-rows` and a draggable splitter. No library dependency — a simple `mousedown` → `mousemove` → `mouseup` handler on a narrow divider element. Store the split ratio in `localStorage`. +- Node cards are a new Svelte component (`WorkbenchCard.svelte`). They consume the state API response that the topology page already fetches. +- Sparkline: reuse the existing `Sparkline.svelte` component. It already exists in `ui/src/lib/components/`. +- The workbench state (pinned node IDs) lives in a Svelte store so it persists across route navigations within the same session. Not persisted to `localStorage` (ephemeral per session). + +### What-if page audit + +- The what-if page (`/what-if`) uses its own layout with dag-map and parameter panels. It does NOT use the workbench. The density pass adjusts its spacing tokens but does not add workbench functionality to it. The what-if page is a standalone surface that predates the workbench paradigm. 
+ +## Out of Scope + +- Edge cards (M-039) +- Metric selector chip bar (M-039) +- Class filter (M-039) +- Analysis tab surfaces (M-040/M-041) +- Heatmap view (M-042) +- Validation/warning surfaces (M-043) +- Final visual polish and dark mode QA (M-044) +- Color palette iteration beyond the initial token values (user will bring examples for future iteration) +- dag-map layout engine changes (separate concern) +- Expert authoring surface + +## Dependencies + +- dag-map library (`lib/dag-map/`) — we own it; changes ship in this milestone +- E-18 Time Machine APIs available on port 8081 (already merged to main) +- E-17 what-if infrastructure (`/what-if` route, engine session API) — must not regress +- E-11 M6 run orchestration (`/run` route) — must not regress diff --git a/work/epics/completed/E-21-svelte-workbench-and-analysis/m-E21-02-metric-selector-edge-cards.md b/work/epics/E-21-svelte-workbench-analysis-surfaces/M-039-metric-selector-edge-cards.md similarity index 50% rename from work/epics/completed/E-21-svelte-workbench-and-analysis/m-E21-02-metric-selector-edge-cards.md rename to work/epics/E-21-svelte-workbench-analysis-surfaces/M-039-metric-selector-edge-cards.md index fad1ab03..62b2ef63 100644 --- a/work/epics/completed/E-21-svelte-workbench-and-analysis/m-E21-02-metric-selector-edge-cards.md +++ b/work/epics/E-21-svelte-workbench-analysis-surfaces/M-039-metric-selector-edge-cards.md @@ -1,8 +1,40 @@ -# Milestone: Metric Selector & Edge Cards - -**ID:** m-E21-02-metric-selector-edge-cards -**Epic:** E-21 — Svelte Workbench & Analysis Surfaces -**Status:** complete (merged to epic 2026-04-17) +--- +id: M-039 +title: Metric Selector & Edge Cards +status: done +parent: E-21 +acs: + - id: AC-1 + title: Chip bar renders below the toolbar + status: met + - id: AC-2 + title: Selecting a metric changes the topology heatmap coloring + status: met + - id: AC-3 + title: Workbench card sparklines reflect the selected metric + status: met + - id: AC-4 + title: Clicking an
edge in the topology pins it to the workbench + status: met + - id: AC-5 + title: Edge card content + status: met + - id: AC-6 + title: Edge selection indicator in DAG + status: met + - id: AC-7 + title: Class filter dropdown appears when classes exist + status: met + - id: AC-8 + title: Class filter controls topology visibility + status: met + - id: AC-9 + title: Vitest coverage for new helpers + status: met + - id: AC-10 + title: Existing Playwright specs still pass + status: met +--- ## Goal @@ -10,44 +42,46 @@ Complete the workbench as a general inspection tool by adding a metric selector ## Context -m-E21-01 delivered the workbench foundation: density system, dag-map click/hover events, and node cards with utilization-based heatmap coloring. But the topology only colors by utilization (hardcoded), there's no way to inspect edges, and class filtering doesn't exist. +M-038 delivered the workbench foundation: density system, dag-map click/hover events, and node cards with utilization-based heatmap coloring. But the topology only colors by utilization (hardcoded), there's no way to inspect edges, and class filtering doesn't exist. This milestone adds the remaining "what am I looking at?" controls that the Blazor feature bar provided (15+ toggles) — but in the simplified workbench paradigm: one metric selector, one class filter, and edge inspection via pinning. -## Acceptance Criteria - -### Metric selector (AC1-AC3) - -1. **Chip bar renders below the toolbar.** A horizontal row of metric chips: Utilization, Queue Depth, Arrivals, Served, Errors, Flow Latency. One active at a time (radio behavior). Default: Utilization. - -2. **Selecting a metric changes the topology heatmap coloring.** Each chip maps to a specific field in the state API response (`derived.utilization`, `metrics.queueDepth`, `metrics.arrivals`, `metrics.served`, `metrics.errors`, `derived.flowLatencyMs`). 
Selecting a chip re-extracts metrics from the current state data and passes them to DagMapView. Node metric labels update accordingly (e.g., "85%" for utilization, "14.5" for queue depth). - -3. **Workbench card sparklines reflect the selected metric.** When the selected metric changes, the sparkline in each pinned workbench card updates to show that metric's values over the full time window (requires fetching state window data or caching per-bin values). +## Acceptance criteria -### Edge cards (AC4-AC6) +### AC-1 — Chip bar renders below the toolbar -4. **Clicking an edge in the topology pins it to the workbench.** Uses the `bindEvents` `onEdgeClick` callback. Pinned edges appear as cards in the workbench alongside node cards. Clicking a pinned edge again unpins it. +**Chip bar renders below the toolbar.** A horizontal row of metric chips: Utilization, Queue Depth, Arrivals, Served, Errors, Flow Latency. One active at a time (radio behavior). Default: Utilization. +### AC-2 — Selecting a metric changes the topology heatmap coloring -5. **Edge card content.** Each edge workbench card shows: - - Source and target node IDs - - Flow volume at current bin (from state API edge data if available, or from node-level served/arrivals) - - Sparkline of flow volume over time (if data available) - - Compact layout matching node cards +**Selecting a metric changes the topology heatmap coloring.** Each chip maps to a specific field in the state API response (`derived.utilization`, `metrics.queueDepth`, `metrics.arrivals`, `metrics.served`, `metrics.errors`, `derived.flowLatencyMs`). Selecting a chip re-extracts metrics from the current state data and passes them to DagMapView. Node metric labels update accordingly (e.g., "85%" for utilization, "14.5" for queue depth). +### AC-3 — Workbench card sparklines reflect the selected metric -6. **Edge selection indicator in DAG.** Pinned edges get a visual highlight in the topology (e.g., brighter color, thicker stroke, or glow). 
This uses a CSS class approach because dag-map has no `selectedEdges` option; the Svelte wrapper applies the class after render.
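The pin/unpin behavior for edges (AC-4) reduces to toggling a key in a set, which the wrapper can then use to decide which edges receive the highlight class. A minimal sketch, assuming edges are keyed by their source and target IDs; `edgeKey` and `togglePinned` are hypothetical names, not part of dag-map or the workbench store.

```typescript
// Hypothetical helpers; names and the "from->to" key format are
// assumptions for this sketch, not dag-map or workbench API.
function edgeKey(fromId: string, toId: string): string {
  return `${fromId}->${toId}`;
}

// Returns a new set: the key is removed if it was pinned, added otherwise.
// The input set is left untouched so store updates stay easy to reason about.
function togglePinned(pinned: ReadonlySet<string>, key: string): Set<string> {
  const next = new Set(pinned);
  if (!next.delete(key)) {
    next.add(key);
  }
  return next;
}
```

Returning a fresh set rather than mutating in place is a design choice for this sketch; it lets a reactive store detect the change by identity.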
+### AC-7 — Class filter dropdown appears when classes exist -### Cross-cutting (AC9-AC10) +**Class filter dropdown appears when classes exist.** If the current run has per-class data (any node's state includes `byClass` entries), a dropdown/chip filter appears in the toolbar area. Lists all class IDs found in the data. Multi-select: toggle individual classes on/off. +### AC-8 — Class filter controls topology visibility -9. **Vitest coverage for new helpers.** Metric extraction by selected metric, edge data extraction, class discovery from state data — all have vitest tests. +**Class filter controls topology visibility.** When classes are filtered, the topology heatmap shows metrics for only the selected classes (using `byClass[classId]` data instead of aggregate). If no class filter is active, show aggregate (default behavior). +### AC-9 — Vitest coverage for new helpers -10. **Existing Playwright specs still pass.** The m-E21-01 workbench specs and E-17 what-if specs continue to work. +**Vitest coverage for new helpers.** Metric extraction by selected metric, edge data extraction, class discovery from state data — all have vitest tests. +### AC-10 — Existing Playwright specs still pass +**Existing Playwright specs still pass.** The M-038 workbench specs and E-17 what-if specs continue to work. ## Technical Notes - The metric selector state lives in the workbench store (or a co-located topology store) — persists across bin scrubs but resets on run change. @@ -57,15 +91,15 @@ This milestone adds the remaining "what am I looking at?" 
controls that the Blaz ## Out of Scope -- Analysis tab surfaces (m-E21-03/04) -- Heatmap view (m-E21-05) -- Validation surface (m-E21-06) +- Analysis tab surfaces (M-040/M-041) +- Heatmap view (M-042) +- Validation surface (M-043) - New dag-map layout changes - Edge metric labels on the DAG itself (edges show color only, detail in workbench) ## Dependencies -- m-E21-01 (complete) — workbench foundation, dag-map events, density system +- M-038 (complete) — workbench foundation, dag-map events, density system - FlowTime API state endpoint — already available on port 8081 ## Coverage Notes diff --git a/work/epics/E-21-svelte-workbench-analysis-surfaces/M-040-sweep-sensitivity-surfaces.md b/work/epics/E-21-svelte-workbench-analysis-surfaces/M-040-sweep-sensitivity-surfaces.md new file mode 100644 index 00000000..63fc8f02 --- /dev/null +++ b/work/epics/E-21-svelte-workbench-analysis-surfaces/M-040-sweep-sensitivity-surfaces.md @@ -0,0 +1,139 @@ +--- +id: M-040 +title: Sweep & Sensitivity Surfaces +status: done +parent: E-21 +acs: + - id: AC-1 + title: New /analysis route + status: met + - id: AC-2 + title: Run picker + status: met + - id: AC-3 + title: Tab bar + status: met + - id: AC-4 + title: Parameter selector + status: met + - id: AC-5 + title: Value range inputs + status: met + - id: AC-6 + title: Run sweep and render results + status: met + - id: AC-7 + title: Captured series filter + status: met + - id: AC-8 + title: Param multi-select + status: met + - id: AC-9 + title: Target metric picker + perturbation + status: met + - id: AC-10 + title: Run sensitivity and render results + status: met + - id: AC-11 + title: Vitest coverage for pure logic + status: met + - id: AC-12 + title: Playwright coverage + status: met +--- + +## Goal + +Deliver the first two Time Machine analysis surfaces in Svelte: parameter sweep and sensitivity analysis. These are new capabilities that Blazor never had — the headline proof that the fork delivers value.
+ +## Context + +E-18 shipped `POST /v1/sweep` and `POST /v1/sensitivity` against the Rust session engine. Until now there is no UI for either. This milestone introduces a new `/analysis` route with tabbed surfaces that let an expert: + +1. Pick a run → parse its model for sweepable parameters (const nodes) +2. Run a sweep over a chosen parameter with a range of values → see result series per point +3. Run sensitivity analysis across multiple parameters against a target metric → see ranked gradients + +The workbench paradigm established in M-038/M-039 sets the conventions: compact density, calm chrome with vivid data-viz colors, semantic tokens, `--ft-viz-*` palette, the shared `TimelineScrubber` and `Chart` components. + +### API contracts (confirmed via code) + +**POST /v1/sweep** +```json +Request: { "yaml": "...", "paramId": "arrivals", "values": [10, 15, 20], "captureSeriesIds": ["served"] } +Response: { "paramId": "arrivals", "points": [ { "paramValue": 10, "series": { "served": [8, 8, 8, 8] } }, ... ] } +``` + +**POST /v1/sensitivity** +```json +Request: { "yaml": "...", "paramIds": ["arrivals", "capacity"], "metricSeriesId": "queue.queueTimeMs", "perturbation": 0.05 } +Response: { "metricSeriesId": "queue.queueTimeMs", "points": [ { "paramId": "capacity", "baseValue": 50, "gradient": -2.35 }, ... ] } +``` + +**Parameter discovery**: clients parse the model YAML and collect nodes with `kind: const`. Their `id` is the parameter name; `values[0]` is a reasonable baseline. + +## Acceptance criteria + +### AC-1 — New /analysis route + +**New `/analysis` route.** SvelteKit page at `ui/src/routes/analysis/+page.svelte`. Accessible from the sidebar under Tools. Compact layout consistent with the workbench paradigm. +### AC-2 — Run picker + +**Run picker.** Dropdown at the top of `/analysis` to select a run. Defaults to the most recent run (same pattern as `/time-travel/topology`). Loading the model YAML for the selected run populates a param list.
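The parameter discovery rule from the Context section (collect nodes with `kind: const`, use `id` as the name and `values[0]` as the baseline) can be sketched as a pure function over an already-parsed model. This sketch assumes the parsed YAML exposes a top-level `nodes` array, which the spec does not state; `discoverConstParams` and the interface names are hypothetical.

```typescript
// Hypothetical shapes: the top-level `nodes` array is an assumption
// about the parsed model YAML, not something the spec confirms.
interface ModelNode {
  id: string;
  kind: string;
  values?: number[];
}

interface ConstParam {
  id: string;
  baseline: number | undefined; // undefined when the node has no values
}

// Collects sweepable parameters: every node whose kind is "const".
function discoverConstParams(model: { nodes?: ModelNode[] }): ConstParam[] {
  return (model.nodes ?? [])
    .filter((node) => node.kind === "const")
    .map((node) => ({ id: node.id, baseline: node.values?.[0] }));
}
```

An empty result corresponds to the "no const nodes" empty state of the parameter selector. Keeping the function independent of the YAML parser makes it straightforward to cover with vitest.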
+### AC-3 — Tab bar + +**Tab bar.** Four tabs: Sweep, Sensitivity, Goal Seek, Optimize. Only Sweep and Sensitivity are wired in this milestone; Goal Seek and Optimize render placeholder "coming in M-041" content. Tab state preserved in `localStorage` or URL query. +### AC-4 — Parameter selector + +**Parameter selector.** Dropdown listing the run's const-node parameters discovered from the model YAML. Each option shows the parameter id and its baseline value. Empty state when no const nodes exist. +### AC-5 — Value range inputs + +**Value range inputs.** Three inputs (from, to, step) compute the sweep values. A text input for "or custom (comma-separated)" supersedes from/to/step when non-empty. A live preview shows the final value list and count; disallow runs > 50 points with an inline warning (soft cap, still runnable). +### AC-6 — Run sweep and render results + +**Run sweep and render results.** A "Run sweep" button calls `POST /v1/sweep`. While running, show a spinner. On result, render: +- A line chart: x = param value, y = selected output series aggregate (mean per point) — picked via a series selector populated from response keys. +- A per-point table: param value column + one column per captured series showing aggregate (mean) with a compact sparkline of that series across bins. +- Reasonable handling for errors (API 400/503) with inline error messages. +### AC-7 — Captured series filter + +**Captured series filter.** Optional multi-select chip bar listing common series (`arrivals`, `served`, `errors`, `queue`, `utilization`, `flowLatencyMs`). Empty selection = capture all. Sends `captureSeriesIds` in the request. +### AC-8 — Param multi-select + +**Param multi-select.** Chip-bar of all discovered const params. Clicking toggles selection. Defaults to all selected. 
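The value-list rules in AC-5 and the per-point mean in AC-6 are pure logic, so they can be sketched independently of the UI. A minimal sketch under stated assumptions: `sweepValues`, `exceedsSoftCap`, and `mean` are hypothetical names, non-numeric custom entries are silently dropped, and an empty series yields `NaN`; none of these choices are specified by the milestone.

```typescript
// Hypothetical helpers for AC-5 / AC-6; names and edge-case handling
// are assumptions for this sketch.
const SOFT_CAP = 50;

// Builds the sweep value list. A non-empty custom list supersedes from/to/step.
function sweepValues(from: number, to: number, step: number, custom: string = ""): number[] {
  const trimmed = custom.trim();
  if (trimmed !== "") {
    return trimmed
      .split(",")
      .map((part) => part.trim())
      .filter((part) => part !== "")
      .map((part) => Number(part))
      .filter((value) => Number.isFinite(value)); // drop non-numeric entries
  }
  if (!(step > 0) || to < from) {
    return [];
  }
  // The small epsilon keeps the end point when (to - from) / step rounds just below an integer.
  const count = Math.floor((to - from) / step + 1e-9) + 1;
  return Array.from({ length: count }, (_, i) => from + i * step);
}

// True when the inline soft-cap warning should be shown (the run stays allowed).
function exceedsSoftCap(values: number[]): boolean {
  return values.length > SOFT_CAP;
}

// Mean of one captured series across bins; NaN for an empty series.
function mean(values: number[]): number {
  if (values.length === 0) {
    return NaN;
  }
  return values.reduce((sum, value) => sum + value, 0) / values.length;
}
```

With the sweep contract's sample response, `mean([8, 8, 8, 8])` gives the y-value for the `paramValue: 10` point of the line chart.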
+### AC-9 — Target metric picker + perturbation + +**Target metric picker + perturbation.** A text input for the target series id (common ones offered as chips: `served`, `queue`, `flowLatencyMs`, `utilization`). A slider for perturbation (default 0.05, range 0.01–0.30). +### AC-10 — Run sensitivity and render results + +**Run sensitivity and render results.** A "Run sensitivity" button calls `POST /v1/sensitivity`. On result, render a horizontal bar chart sorted by |gradient| descending, colored by sign (positive/negative), with numeric gradient labels and the base value shown per row. Empty/error states handled. +### AC-11 — Vitest coverage for pure logic + +**Vitest coverage for pure logic.** New helpers (param discovery from YAML, sweep value range generator, aggregate/mean computation) have vitest tests with explicit branch coverage including error paths. +### AC-12 — Playwright coverage + +**Playwright coverage.** New spec `svelte-analysis.spec.ts`: page loads, sweep can be configured and run against a real run, sensitivity can be configured and run. Graceful skip when infra is down. +## Technical Notes + +- **YAML parsing in browser**: `js-yaml` is already a transitive dep via other libs, but we should explicitly add it. Alternative: the API could provide a `/v1/runs/{id}/params` endpoint that returns const-node ids + baselines. For this milestone, browser-parse with `js-yaml`; if it proves fragile, promote to a server endpoint in a later milestone. +- **Chart reuse**: existing `Chart.svelte` handles multi-series line data. Sweep result chart passes `{name: paramValue, values: [aggregate]}` per captured series — a new shape. Consider a dedicated `ParamSweepChart` wrapper that transposes sweep results into Chart's format. +- **Bar chart**: no current component. Build a simple horizontal-bar SVG in `SensitivityBarChart.svelte` — pure SVG with the viz palette (coral for negative, teal for positive gradients). 
+- **Analysis state**: small store `analysis.svelte.ts` to hold current run YAML, last sweep/sensitivity results, selected tab. Session-ephemeral. +- **Loading state**: use existing `Loader2` icon from lucide; debounce run-button clicks. + +## Out of Scope + +- Goal Seek + Optimize surfaces (M-041) +- Server-side parameter discovery endpoint +- Saving/re-running past analyses (history panel) +- Exporting sweep/sensitivity results +- Constraints on optimization (already out of scope globally per gaps.md) + +## Dependencies + +- M-038/02 (complete) — workbench paradigm, chart component, density tokens +- `POST /v1/sweep`, `POST /v1/sensitivity` — available on port 8081 + +## Coverage Notes + +(Filled at wrap time.) diff --git a/work/epics/E-21-svelte-workbench-analysis-surfaces/M-041-goal-seek-surface.md b/work/epics/E-21-svelte-workbench-analysis-surfaces/M-041-goal-seek-surface.md new file mode 100644 index 00000000..fc02f1fc --- /dev/null +++ b/work/epics/E-21-svelte-workbench-analysis-surfaces/M-041-goal-seek-surface.md @@ -0,0 +1,278 @@ +--- +id: M-041 +title: Goal Seek Surface +status: done +parent: E-21 +acs: + - id: AC-1 + title: Goal-seek trace plumbed end-to-end + status: met + - id: AC-2 + title: Optimize trace plumbed end-to-end + status: met + - id: AC-3 + title: D-2026-04-21-034 appended to work/decisions.md at start-milestone time + status: met + - id: AC-4 + title: Goal Seek placeholder replaced + status: met + - id: AC-5 + title: Shared result card + shared convergence chart extracted up front + status: met + - id: AC-6 + title: Parameter selector + status: met + - id: AC-7 + title: Search interval + target + advanced inputs + status: met + - id: AC-8 + title: Run goal-seek and render results + status: met + - id: AC-9 + title: Not-bracketed and not-converged states + status: met + - id: AC-10 + title: Session form state — goal-seek + status: met + - id: AC-11 + title: Vitest coverage for pure logic + status: met + - id: AC-12 + title: Playwright 
coverage + status: met + - id: AC-13 + title: Line-by-line branch audit performed in two passes, each captured in + status: met +--- + +**Started:** 2026-04-21 +**Completed:** 2026-04-22 + +## Goal + +Wire the `/analysis` Goal Seek tab to live Time Machine so single-parameter target-seeking is usable from the Svelte workbench, and extend `/v1/goal-seek` (plus its sibling `/v1/optimize`) with the per-iteration `trace` they already compute internally but currently discard. Ship the shared convergence chart + analysis result card components here so the subsequent Optimize milestone (M-042) consumes them directly. + +## Context + +M-040 shipped the `/analysis` route shell with a four-tab bar (Sweep, Sensitivity, Goal Seek, Optimize). Goal Seek and Optimize currently render `coming in m-E21-04` placeholders. This milestone activates the Goal Seek tab and lands the two shared visualization components; the Optimize tab stays placeholder (pointing at M-042) until the follow-up milestone. + +The backend `trace` extension covers **both** `/v1/goal-seek` and `/v1/optimize` in one change because they ship under one decision (D-047). That work landed early in this milestone (commit `29ac3e9`); the optimize trace is ready for M-042 to consume without further backend work. 
+ +Shared infrastructure already in place from M-040: + +- `ui/src/routes/analysis/+page.svelte` — run/sample picker, scenario card, tab bar, active-tab persistence +- `ui/src/lib/utils/analysis-helpers.ts` — `discoverConstParams`, `ConstParam` type, numeric helpers +- `ui/src/lib/api/flowtime.ts` — `flowtime.sweep(...)`, `flowtime.sensitivity(...)` methods +- `GET /v1/runs/{runId}/model` — read-only model fetch (D-046) +- Density tokens, `--ft-viz-*` palette, `Loader2` spinner pattern, inline-error pattern +- `sensitivity-bar-geometry.ts` — template for pure-SVG geometry helpers with vitest coverage + +### Scope split note + +This milestone was originally drafted as `m-E21-04-goal-seek-optimize` covering both Goal Seek and Optimize tabs. It was split on 2026-04-21 after Phase 1 backend landed: Goal Seek remains here; Optimize moved to a new **M-042 Optimize Surface**; heatmap / validation / polish renumbered to 06 / 07 / 08. Rationale: 16 ACs across backend + shared components + two surfaces was too large, and "Phase 1 / Phase 2" sub-phasing in the tracking doc was the smell. The backend trace change on `/v1/optimize` that landed here is kept; M-042 consumes it. + +### API contracts — current and extended + +The existing endpoints already compute per-iteration state but discard it before returning. This milestone extends both response shapes with an additive `trace` field (see Decision Record below). Requests are unchanged. `POST /v1/sweep` and `POST /v1/sensitivity` are untouched. 
+ +**POST /v1/goal-seek** — `src/FlowTime.API/Endpoints/GoalSeekEndpoints.cs` + +Request (unchanged): +```json +{ + "yaml": "...", + "paramId": "capacity", + "metricSeriesId": "derived.utilization", + "target": 0.8, + "searchLo": 10, + "searchHi": 100, + "tolerance": 1e-6, + "maxIterations": 50 +} +``` + +Response (extended): +```json +{ + "paramValue": 42.187, + "achievedMetricMean": 0.7999, + "converged": true, + "iterations": 12, + "trace": [ + { "iteration": 0, "paramValue": 10, "metricMean": 0.42, "searchLo": 10, "searchHi": 100 }, + { "iteration": 0, "paramValue": 100, "metricMean": 0.95, "searchLo": 10, "searchHi": 100 }, + { "iteration": 1, "paramValue": 55, "metricMean": 0.88, "searchLo": 10, "searchHi": 55 }, + { "iteration": 2, "paramValue": 32.5,"metricMean": 0.72, "searchLo": 32.5, "searchHi": 55 } + /* ... */ + ] +} +``` + +Trace semantics: +- Two `iteration: 0` entries for the initial boundary evaluations (`searchLo`, `searchHi`), in that order. +- One entry per bisection step with `iteration: 1..N`, where the recorded `paramValue` is the midpoint evaluated at that step and `searchLo` / `searchHi` are the **post-step** bracket (after narrowing). +- `metricMean` is the unsigned mean at that `paramValue` — same value that drives the bisection decision. +- When the target is already hit at a boundary (converged in 0 iterations), the trace contains only the two boundary entries. When the target is not bracketed, the trace contains only the two boundary entries and the response reports `converged: false`, `iterations: 0`. + +400 / 503 behaviour unchanged. + +**POST /v1/optimize** — `src/FlowTime.API/Endpoints/OptimizeEndpoints.cs` + +The `trace` extension on `/v1/optimize` is owned by this milestone's backend AC (AC2) since it shares D-047 with goal-seek. The surface that consumes it — the Optimize tab — lives in M-042. **Full request/response shape is owned by M-042's spec** (`m-E21-05-optimize.md` → API contract section); do not duplicate it here. 
AC2's tests lock these trace invariants: + +- One entry per iteration (one post-sort entry before the main loop as `iteration: 0`, plus one per main-loop iteration after its post-iteration sort). +- `paramValues` is the current best vertex (`simplex[0]`), `metricMean` is its **unsigned** mean — the internal minimize-sign flip is reversed at record time for maximize runs. +- Trace length equals `iterations + 1` on every return path (pre-loop converged, main-loop converged, max-iterations exhausted). +- 0-iteration convergence yields a single `iteration: 0` entry. The per-evaluation probe log (reflection / expansion / contraction / shrink intermediate vertices) is intentionally not exposed. + +### Decision Record + +**D-047 — Additive `trace` field on `/v1/goal-seek` and `/v1/optimize`** — appended to `work/decisions.md` at start-milestone time (commit `5988f5c`). Scope covers both endpoints; implementation landed in commit `29ac3e9` of this milestone. No rewording needed for the split. + +## Acceptance criteria + +### AC-1 — Goal-seek trace plumbed end-to-end + +**Goal-seek trace plumbed end-to-end.** `GoalSeeker.SeekAsync` records the two boundary evaluations and each bisection midpoint with the post-step bracket. `GoalSeekResult` gains a `Trace` property (`IReadOnlyList`). `GoalSeekEndpoints` passes the trace through to the response. All five return paths (`Converged` at `searchLo`, `Converged` at `searchHi`, not-bracketed, tolerance hit mid-loop, max-iterations exhausted) return a trace whose shape matches the semantics above. Existing `GoalSeekEndpointsTests.cs` gains coverage for trace shape + ordering + post-step bracket invariants on each return path. +### AC-2 — Optimize trace plumbed end-to-end + +**Optimize trace plumbed end-to-end.** `Optimizer.OptimizeAsync` records the post-sort best vertex once before the main loop (as `iteration: 0`) and once per iteration thereafter. `OptimizeResult` gains a `Trace` property (`IReadOnlyList`). 
`OptimizeEndpoints` passes the trace through. Maximize runs report unsigned `metricMean` on trace entries (sign reversed internally). Existing `OptimizeEndpointsTests.cs` gains coverage for trace shape + ordering + unsigned-metric invariant on both objectives + trace-length / iterations consistency. _The consuming Optimize surface is delivered in M-042._ +### AC-3 — D-2026-04-21-034 appended to work/decisions.md at start-milestone time + +**`D-2026-04-21-034` appended to `work/decisions.md` at start-milestone time.** Body matches the draft in Context. E-21 epic spec Scope / Constraints updated in the same commit to reference the new decision alongside D-046 (read-only run-adjacent) and to list the additive compute-response change as the other explicit carve-out. +### AC-4 — Goal Seek placeholder replaced + +**Goal Seek placeholder replaced.** The `goal-seek` tab panel in `ui/src/routes/analysis/+page.svelte` renders live content (not the `coming in m-E21-04` stub). The `optimize` tab panel keeps its placeholder copy updated to reference **M-042**. `TAB_INFO` copy for Goal Seek stands as-is — "convergence info" now accurately describes what the UI renders. +### AC-5 — Shared result card + shared convergence chart extracted up front + +**Shared result card + shared convergence chart extracted up front.** `ui/src/lib/components/analysis-result-card.svelte` and `ui/src/lib/components/convergence-chart.svelte` land as reusable components in this milestone so M-042's Optimize surface can consume them without further extraction work. Required behaviours (exact prop/slot names are an implementation decision): +- **Result card** — accepts a distinct header region, a primary-value region (large monospace), and a meta region for compact key-value pairs (iterations / converged badge / tolerance / direction / target / residual as applicable per surface). 
+- **Convergence chart** — consumes a **normalized** input shape `Array<{ iteration: number; metricMean: number }>`; each caller adapts its response into that shape before passing it in. The chart does not branch on surface type. Goal Seek's bracket and (future) Optimize's `paramValues` are rendered elsewhere (interval bar, per-param table) and do not enter the chart. Required behaviours: optional horizontal reference line when a target is supplied (dashed); caller-supplied y-axis label; line colour reflects converged state (teal when converged, amber when not); the converged/final point is visually emphasized (e.g. a larger marker) relative to intermediate points. +- Geometry lives in pure `.ts` siblings with vitest coverage, mirroring `sensitivity-bar-geometry`. +### AC-6 — Parameter selector + +**Parameter selector.** Single-select dropdown listing the current model's const-node parameters (reuses `discoverConstParams`). Each option shows `{id} (base {baseline})` — same format as the Sweep tab. Empty state when no const params exist (same copy as Sweep). +### AC-7 — Search interval + target + advanced inputs + +**Search interval + target + advanced inputs.** Two numeric inputs `searchLo` and `searchHi` with inline validation (both required, `searchLo < searchHi`, defaults `0.5 × baseline` / `2 × baseline` of the selected parameter). Free-text input for `metricSeriesId` with the same chip shortcuts as Sensitivity (`served`, `queue`, `flowLatencyMs`, `utilization`). Numeric input for `target`. A collapsed "Advanced" disclosure exposes `tolerance` (default 1e-6) and `maxIterations` (default 50). All required fields must be valid before the Run button enables. +### AC-8 — Run goal-seek and render results + +**Run goal-seek and render results.** "Run goal seek" button calls `flowtime.goalSeek(...)` (new API method, response type includes `trace`). While running, show a spinner (`Loader2Icon`) and disable the button. 
On success, render: +- The shared result card (AC5) with the final `paramValue`, `achievedMetricMean`, `target`, `|achieved − target|` residual, converged badge, and iteration count. +- The shared convergence chart (AC5) plotting `metricMean` vs `iteration` as a line, with a horizontal reference line at `target`. Boundary evaluations (`iteration: 0`) are plotted as two initial points on the x-axis at position 0. The converged/final point is visually emphasized per AC5. +- A **search-interval bar** (SVG) showing the original `[searchLo, searchHi]` range with a marker at the final `paramValue`, using `intervalMarkerGeometry` from `interval-bar-geometry.ts`. This is the Goal Seek consumer that justifies landing that geometry file in this milestone; Optimize reuses it for per-param mini bars in M-042. +- 400 and 503 errors surfaced as inline messages using the existing analysis-page error pattern. +### AC-9 — Not-bracketed and not-converged states + +**Not-bracketed and not-converged states.** When the API returns `converged: false` with `iterations: 0` (target not bracketed), the result card shows an amber warning explaining that the target was not reachable within the search interval and suggests widening the bounds. The convergence chart still renders the two boundary evaluations. When `converged: false` with `iterations == maxIterations`, the card shows an amber "did not converge" badge and the chart is drawn over the full trace. +### AC-10 — Session form state — goal-seek + +**Session form state — goal-seek.** The Goal Seek form retains its last input values across tab switches within the same page session (in-memory is sufficient). Form values reset when the scenario (run / sample model) changes. Mirrors the Sweep tab behaviour. 
_Optimize session state lives in M-042._ +### AC-11 — Vitest coverage for pure logic + +**Vitest coverage for pure logic.** New helpers added to `ui/src/lib/utils/analysis-helpers.ts` (or a sibling `goal-seek-helpers.ts` if the file grows unwieldy) have vitest tests with branch coverage: +- `defaultSearchBounds(baseline)` — `0.5 × baseline` / `2 × baseline`; guards for `baseline === 0`, negative baselines, non-finite inputs. +- `validateSearchInterval({lo, hi})` — structured error for missing / non-finite / `lo >= hi`. +- `intervalMarkerGeometry({ lo, hi, value, width })` — clamping when `value ∉ [lo, hi]`, degenerate `hi === lo`, non-finite inputs. _(Shared with Optimize's per-param range bars in M-042.)_ +- `convergence-chart-geometry.ts` — operates on the **normalized** `Array<{ iteration, metricMean }>` shape defined in AC5. `convergencePath({ trace, width, height, padding, yDomain })` with tests for empty trace, single-point trace, trace with multiple entries at the same `iteration` (goal-seek boundary case: two points at `iteration: 0`), monotonic vs non-monotonic traces, flat metric (all equal), non-finite values, y-domain override vs auto-fit, target-line y-coordinate computation. +- `analysis-result-card-geometry.ts` (if needed) — whatever pure logic the card uses (badge-colour selection given `converged`, residual formatting). Skip the file if the card is pure markup with no computation worth testing. +- No mocks; no DOM. +- `validateOptimizeForm` is out of scope here; it lives in M-042. 
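
The first three helpers in the AC-11 list are small enough to sketch. The spec names them and lists the cases to guard, but not what each guard returns, so the return shapes below (`null` for "no sensible default", string error codes, `null` geometry for degenerate input) are assumptions made for illustration only:

```typescript
type BoundsSketch = { lo: number; hi: number };

// 0.5 × baseline / 2 × baseline.
// Assumption: zero or non-finite baselines have no sensible default, so return null.
function defaultSearchBounds(baseline: number): BoundsSketch | null {
  if (!Number.isFinite(baseline) || baseline === 0) return null;
  const a = 0.5 * baseline;
  const b = 2 * baseline;
  // For a negative baseline 2 × baseline is the smaller number, so order explicitly.
  return { lo: Math.min(a, b), hi: Math.max(a, b) };
}

// Returns an error code, or null when the interval is valid.
function validateSearchInterval(input: { lo: number | null; hi: number | null }): string | null {
  const { lo, hi } = input;
  if (lo === null || hi === null) return "missing";
  if (!Number.isFinite(lo) || !Number.isFinite(hi)) return "non-finite";
  if (lo >= hi) return "lo-not-below-hi";
  return null;
}

// Marker x position inside a bar of the given pixel width, clamped to the bar.
function intervalMarkerGeometry(input: {
  lo: number;
  hi: number;
  value: number;
  width: number;
}): { x: number; clamped: boolean } | null {
  const { lo, hi, value, width } = input;
  const allFinite = [lo, hi, value, width].every((v) => Number.isFinite(v));
  // Assumption: a degenerate interval (hi === lo) yields no marker at all.
  if (!allFinite || hi <= lo || width <= 0) return null;
  const t = (value - lo) / (hi - lo);
  const clampedT = Math.min(1, Math.max(0, t));
  return { x: clampedT * width, clamped: clampedT !== t };
}
```

Each guard is a separate early return, which gives every branch a single observable outcome to assert on in vitest.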
+### AC-12 — Playwright coverage + +**Playwright coverage.** Extend `tests/ui/specs/svelte-analysis.spec.ts` (preferred) or add `svelte-analysis-goal-seek.spec.ts`: +- Goal Seek happy path: page loads, param selector populates, interval defaults render, Run button disabled until form is complete, run against a real engine returns a result card with `paramValue`, `converged` badge, iterations, **and a rendered convergence chart with at least one plotted point beyond iteration 0**. +- Goal Seek not-bracketed deterministic repro — uses the tuple recorded in the tracking doc's Notes section (first bundled sample in `SAMPLE_MODELS`, its first discovered const param, `target: 1e12` unreachable). Assert the warning message + the chart rendering only the two boundary points. +- Graceful skip when Engine API (8081) or Svelte dev server (5173) is down, matching the existing probe-and-skip pattern in `svelte-analysis.spec.ts`. +- _Optimize Playwright coverage is owned by M-042._ +### AC-13 — Line-by-line branch audit performed in two passes, each captured in + +**Line-by-line branch audit** performed in two passes, each captured in the tracking doc's Coverage Notes before its respective commit-approval prompt: +- **AC13a — Backend pass (already complete, commit `29ac3e9`).** Five goal-seek return paths; pre-loop and main-loop exits in Nelder-Mead; shrink-vs-no-shrink branches. The optimize branches are audited here even though the consumer is M-042, because the implementation lives on this milestone's commits. +- **AC13b — UI pass (pending).** New frontend components, geometry helpers, form validators, and render-condition branches in the Goal Seek tab. +Both passes enumerate every reachable branch and match each to a named test (xUnit / vitest / Playwright). Unreachable / defensive-default branches are documented with rationale, following M-040's pattern. 
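
The five goal-seek return paths audited in AC13a, together with the trace semantics from the API contract section, can be illustrated with a small bisection that records a trace. This is a TypeScript sketch for reading the spec, not the C# `GoalSeeker.SeekAsync` implementation. The synchronous `evaluate` callback, applying `tolerance` to the metric residual, and returning the closer boundary in the not-bracketed case are all assumptions:

```typescript
interface TracePointSketch {
  iteration: number;
  paramValue: number;
  metricMean: number;
  searchLo: number;
  searchHi: number;
}

interface SeekResultSketch {
  paramValue: number;
  achievedMetricMean: number;
  converged: boolean;
  iterations: number;
  trace: TracePointSketch[];
}

function seekSketch(
  evaluate: (paramValue: number) => number, // stand-in for one model run
  target: number,
  lo: number,
  hi: number,
  tolerance = 1e-6,
  maxIterations = 50,
): SeekResultSketch {
  const trace: TracePointSketch[] = [];
  const fLo = evaluate(lo);
  const fHi = evaluate(hi);
  // Two iteration-0 entries, searchLo first, both carrying the original bracket.
  trace.push({ iteration: 0, paramValue: lo, metricMean: fLo, searchLo: lo, searchHi: hi });
  trace.push({ iteration: 0, paramValue: hi, metricMean: fHi, searchLo: lo, searchHi: hi });

  // Paths 1 and 2: target already hit at a boundary.
  if (Math.abs(fLo - target) <= tolerance)
    return { paramValue: lo, achievedMetricMean: fLo, converged: true, iterations: 0, trace };
  if (Math.abs(fHi - target) <= tolerance)
    return { paramValue: hi, achievedMetricMean: fHi, converged: true, iterations: 0, trace };

  // Path 3: not bracketed. Assumption: report the closer boundary.
  if ((fLo - target) * (fHi - target) > 0) {
    const loCloser = Math.abs(fLo - target) <= Math.abs(fHi - target);
    return {
      paramValue: loCloser ? lo : hi,
      achievedMetricMean: loCloser ? fLo : fHi,
      converged: false,
      iterations: 0,
      trace,
    };
  }

  let a = lo;
  let b = hi;
  let residualAtA = fLo - target;
  let mid = lo;
  let fMid = fLo;
  for (let i = 1; i <= maxIterations; i++) {
    mid = (a + b) / 2;
    fMid = evaluate(mid);
    const residual = fMid - target;
    // Narrow first, so the recorded bracket is the post-step bracket.
    if (residualAtA * residual <= 0) {
      b = mid;
    } else {
      a = mid;
      residualAtA = residual;
    }
    trace.push({ iteration: i, paramValue: mid, metricMean: fMid, searchLo: a, searchHi: b });
    // Path 4: tolerance hit mid-loop.
    if (Math.abs(residual) <= tolerance)
      return { paramValue: mid, achievedMetricMean: fMid, converged: true, iterations: i, trace };
  }
  // Path 5: max iterations exhausted.
  return { paramValue: mid, achievedMetricMean: fMid, converged: false, iterations: maxIterations, trace };
}
```

In this sketch the trace never exceeds `maxIterations + 2` entries, which matches the bound stated for goal-seek in the backend notes.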
+## Technical Notes + +### Backend + +- **`GoalSeekTracePoint` record** in `FlowTime.TimeMachine.Sweep` — `(int Iteration, double ParamValue, double MetricMean, double SearchLo, double SearchHi)`. Serializes to camelCase JSON automatically via existing endpoint serialization settings. +- **`OptimizeTracePoint` record** in `FlowTime.TimeMachine.Sweep` — `(int Iteration, IReadOnlyDictionary ParamValues, double MetricMean)`. +- **Trace buffer inside the runners** — accumulate in a `List<...>` and hand the result to `MakeResult` / `Converged` / `NotConverged` helpers. Avoid allocating per-iteration closures. +- **Max trace size** — bounded by `maxIterations + 2` for goal-seek and `maxIterations + 1` for optimize. No separate cap needed. +- **Serialization** — endpoint response records already use System.Text.Json camelCase; adding `Trace` on both response records picks up the same convention. Verify with a round-trip test. +- **.NET CLI (M-011) impact** — the `goal-seek` and `optimize` CLI subcommands pipe JSON through; the new `trace` field appears automatically. No CLI code change required; add a CLI test confirming trace is present in the JSON output. + +### Frontend + +- **API client addition** (`ui/src/lib/api/flowtime.ts`): + + ```ts + async goalSeek(body: { + yaml: string; + paramId: string; + metricSeriesId: string; + target: number; + searchLo: number; + searchHi: number; + tolerance?: number; + maxIterations?: number; + }) { + return post<{ + paramValue: number; + achievedMetricMean: number; + converged: boolean; + iterations: number; + trace: { + iteration: number; + paramValue: number; + metricMean: number; + searchLo: number; + searchHi: number; + }[]; + }>(`${API}/goal-seek`, body); + } + ``` + + The matching `flowtime.optimize(...)` method is owned by M-042 — draft preserved in that milestone's Technical Notes. + +- **`ConvergenceChart.svelte`** — pure SVG, ~80-120 lines. Consumes the normalized `Array<{ iteration, metricMean }>` shape (see AC5). 
Geometry in `convergence-chart-geometry.ts` handles y-domain computation, point projection, target-line y-coord, and multi-point-at-same-x placement (goal-seek's two `iteration: 0` entries). Single-series line for simplicity; no legend. Uses `--ft-viz-*` palette tokens; teal when `converged`, amber otherwise; dashed horizontal reference line at `target` when provided. Axis labels: `iteration` on x, `yLabel` prop (caller-supplied, e.g. "metric mean" or "queue.queueTimeMs") on y. + +- **`AnalysisResultCard.svelte`** — compact card using existing density tokens. Header (title + converged badge), primary value (large monospace), meta grid (iterations, tolerance, direction, target if present). No new shadcn components required. + +- **Interval bar + per-param range bars** — extracted into `interval-bar-geometry.ts` with vitest. Reused by Goal Seek's interval visualization (single bar) and by Optimize's per-param table (one mini bar per row) — the file lands here, M-042 reuses it. + +- **Form state** — co-located in the route component using `$state` runes. Promote to a `goal-seek-state.svelte.ts` store only if readability degrades; M-040 kept state local and that's the baseline to beat. + +- **Scenario-change reset** — when `selectedRunId` or `selectedSampleId` changes, reset the Goal Seek form. Wire into the same reactivity that already drives scenario changes in `/analysis`. + +- **Error messaging** — reuse the existing error surface pattern from M-040; do not introduce a new toast or modal system. + +- **Density / styling** — small inputs, tight gutters, 8–12 px steps. Use the analysis page's existing typography scale; no new font sizes. + +## Out of Scope + +- Optimize tab surface — separate milestone **M-042**. +- Per-evaluation probe log for optimize (raw reflection/expansion/contraction/shrink intermediate vertices). The exposed trace is per-iteration best only. +- Multi-objective / Pareto optimization (not in the engine). 
+- Constraints on optimization (deferred — tracked in `work/gaps.md`). +- History panel for past goal-seek / optimize runs. +- Exporting results (CSV, JSON download). +- Persisting form values to `localStorage` across browser sessions. +- Keyboard shortcuts beyond what the analysis page already supports. +- Server-side parameter discovery (browser-side `discoverConstParams` still owns this). +- Trace extension on `/v1/sweep` or `/v1/sensitivity` (not needed; not covered by D-047). + +## Dependencies + +- M-040 (complete) — analysis route shell, tab bar, run/sample picker, param discovery, density tokens, inline-error pattern, sensitivity bar geometry (as a template for the interval bar + convergence chart geometry). +- `POST /v1/goal-seek`, `POST /v1/optimize` — available on port 8081 against `RustEngine:Enabled=true`. Both covered by existing API tests (`GoalSeekEndpointsTests.cs`, `OptimizeEndpointsTests.cs`) that this milestone extends with trace-shape assertions. +- D-047 — appended to `work/decisions.md` at start-milestone time; covers AC3. +- Sample models bundled at `ui/src/lib/utils/sample-models.ts`. At least one sample must have const nodes, a reachable metric target (for the happy-path Playwright goal-seek), and accommodate the unreachable-target case from AC12. + +## Notes + +- **Branch name vs milestone title.** The milestone branch is `milestone/m-E21-04-goal-seek-optimize` — it keeps its original name after the split because it already carries the Phase 1 backend commit (`29ac3e9`) and is referenced across CLAUDE.md Current Work and status surfaces. The branch name is the one documented mismatch with the renamed milestone folder (`m-E21-04-goal-seek`); all other surfaces reflect the new title. + +## Coverage Notes + +See `m-E21-04-goal-seek-tracking.md` sections "Phase 1 — Branch-coverage audit" (backend) and "Coverage Notes → UI pass" (frontend) for the full line-by-line audit. 
Each reachable branch is matched to a named xUnit / vitest / Playwright test; defensive / unreachable branches are enumerated with rationale. diff --git a/work/epics/E-21-svelte-workbench-analysis-surfaces/M-042-optimize-surface.md b/work/epics/E-21-svelte-workbench-analysis-surfaces/M-042-optimize-surface.md new file mode 100644 index 00000000..9bd010c9 --- /dev/null +++ b/work/epics/E-21-svelte-workbench-analysis-surfaces/M-042-optimize-surface.md @@ -0,0 +1,217 @@ +--- +id: M-042 +title: Optimize Surface +status: done +parent: E-21 +acs: + - id: AC-1 + title: Optimize placeholder replaced + status: met + - id: AC-2 + title: Param multi-select with bounds — layout + status: met + - id: AC-3 + title: Objective metric + direction + advanced inputs + status: met + - id: AC-4 + title: Run optimize and render results + status: met + - id: AC-5 + title: Not-converged state + status: met + - id: AC-6 + title: Session form state + status: met + - id: AC-7 + title: Vitest coverage for pure logic + status: met + - id: AC-8 + title: Playwright coverage + status: met + - id: AC-9 + title: Line-by-line branch audit before the commit-approval prompt + status: met +--- + +**Created:** 2026-04-21 (split from M-041) +**Started:** 2026-04-22 +**Completed:** 2026-04-22 +## Goal + +Wire the `/analysis` Optimize tab to live `/v1/optimize` so N-parameter Nelder-Mead optimization under bounds is usable from the Svelte workbench. Consume the shared `AnalysisResultCard` + `ConvergenceChart` components delivered by M-041 and the already-landed `trace` field on the optimize response (commit `29ac3e9` of M-041's branch). Deliver a per-param result table with mini range bars so the user sees where each optimized parameter landed inside its bound. + +## Context + +This milestone was split out of the original `m-E21-04-goal-seek-optimize` on 2026-04-21 (16 ACs was too large; "Phase 1 / Phase 2" sub-phasing was the smell). 
The preconditions for Optimize to land cheaply are all complete before this milestone starts: + +- **Backend `trace` on `/v1/optimize`** — landed in M-041 commit `29ac3e9` under D-047. `OptimizeResponse` already carries `IReadOnlyList Trace` with pre-loop + per-iteration best-vertex entries; maximize runs emit unsigned `metricMean`. `OptimizeEndpointsTraceTests` + `OptimizerTests` already lock every reachable branch. +- **Shared UI components** — `ui/src/lib/components/analysis-result-card.svelte` and `ui/src/lib/components/convergence-chart.svelte` land in M-041 (goal-seek is the first consumer). Their geometry siblings (`convergence-chart-geometry.ts`, `interval-bar-geometry.ts`) also land there with vitest coverage. This milestone consumes them directly — no further component extraction is needed. +- **Analysis route shell** — tab bar, scenario picker, inline-error pattern, session form state model, and the `optimize` tab placeholder (updated in M-041 to reference this milestone) are all in place from M-040 / M-041. + +What remains is: the Optimize tab surface (param multi-select + bounds table + direction toggle + Advanced), the `flowtime.optimize(...)` API client method, the per-param result table with mini range bars, and the Playwright / vitest coverage for the optimize-specific pieces. 
+ +### API contract + +**POST /v1/optimize** — `src/FlowTime.API/Endpoints/OptimizeEndpoints.cs` + +Request (unchanged): +```json +{ + "yaml": "...", + "paramIds": ["arrivals", "capacity"], + "metricSeriesId": "queue.queueTimeMs", + "objective": "minimize", + "searchRanges": { + "arrivals": { "lo": 5, "hi": 50 }, + "capacity": { "lo": 10, "hi": 200 } + }, + "tolerance": 1e-4, + "maxIterations": 200 +} +``` + +Response (already extended in M-041 commit `29ac3e9`): +```json +{ + "paramValues": { "arrivals": 17.3, "capacity": 74.2 }, + "achievedMetricMean": 0.042, + "converged": true, + "iterations": 87, + "trace": [ + { "iteration": 0, "paramValues": { "arrivals": 27.5, "capacity": 105 }, "metricMean": 0.31 }, + { "iteration": 1, "paramValues": { "arrivals": 25.8, "capacity": 112 }, "metricMean": 0.27 }, + { "iteration": 2, "paramValues": { "arrivals": 22.4, "capacity": 98 }, "metricMean": 0.19 } + /* ... */ + ] +} +``` + +Trace semantics (as delivered by M-041's backend AC2): +- One entry per Nelder-Mead iteration, recorded **after** the per-iteration `Sort` so `paramValues` is the current best vertex (`simplex[0]`) and `metricMean` is the unsigned mean at that vertex. +- `iteration: 0` is the initial simplex's best vertex after the pre-loop sort. `iteration: 1..N` are the post-iteration bests. +- When the search converges in 0 iterations (initial simplex already satisfies tolerance), the trace contains only the `iteration: 0` entry. +- Maximize runs emit unsigned `metricMean` on the trace (same convention as `achievedMetricMean`). The internal sign-flip is not leaked. +- The per-iteration best is canonical; the raw per-evaluation log (reflection / expansion / contraction / shrink probes) is intentionally not exposed. + +400 / 503 behaviour unchanged. 
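
On the UI side, the response shape above can be typed and its trace adapted into the normalized `{ iteration, metricMean }` points that the shared convergence chart consumes. A minimal sketch; the type and function names (`OptimizeResponseSketch`, `toConvergencePoints`, `traceLengthMatches`) are hypothetical, and the `iterations + 1` length check reflects the invariant stated in M-041's spec rather than anything this snippet can guarantee:

```typescript
interface OptimizeTraceEntrySketch {
  iteration: number;
  paramValues: Record<string, number>;
  metricMean: number;
}

interface OptimizeResponseSketch {
  paramValues: Record<string, number>;
  achievedMetricMean: number;
  converged: boolean;
  iterations: number;
  trace: OptimizeTraceEntrySketch[];
}

type ConvergencePointSketch = { iteration: number; metricMean: number };

// Drop paramValues: they are shown in the per-param table, not in the chart.
function toConvergencePoints(trace: OptimizeTraceEntrySketch[]): ConvergencePointSketch[] {
  return trace.map(({ iteration, metricMean }) => ({ iteration, metricMean }));
}

// Cheap sanity check for the documented "trace length equals iterations + 1" invariant.
function traceLengthMatches(response: OptimizeResponseSketch): boolean {
  return response.trace.length === response.iterations + 1;
}
```

Keeping the adapter at the call site means the chart component itself never needs to know which analysis surface produced the points.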
+ +## Acceptance criteria + +### AC-1 — Optimize placeholder replaced + +**Optimize placeholder replaced.** The `optimize` tab panel in `ui/src/routes/analysis/+page.svelte` renders live content (not the M-041 "coming in M-042" stub). `TAB_INFO` copy for Optimize stands as-is — "convergence history" now accurately describes what the UI renders. +### AC-2 — Param multi-select with bounds — layout + +**Param multi-select with bounds — layout.** Chip-bar of all discovered const params at the top (toggle to include in the optimization; same chip styling and toggle interaction as Sensitivity). Below the chip-bar, a **compact table** with one row per selected param and columns `param id`, `baseline`, `lo`, `hi`. The `lo` / `hi` cells are inline numeric inputs with defaults `0.5 × baseline` / `2 × baseline`; the table appears only when at least one chip is active. Rationale: keeps the chip-bar a pure selector (matches Sensitivity muscle memory) and groups the bounds into one aligned grid, which reads cleanly for 1–5 params (the realistic scale for a hand-driven Nelder-Mead session). **Empty state** when no const params are discoverable on the current model: render the Sweep/Goal-Seek shape string `"No const-kind parameters in this model to optimize over."` in the same `

` wrapper used by the Sweep (line 678) and Goal Seek (line 1004) surfaces in `ui/src/routes/analysis/+page.svelte`, with the Run button disabled. **No-params-selected state**: when the chip-bar has rendered but zero chips are toggled on, the bounds table is hidden and the Run button is disabled with an inline hint ("select at least one parameter"). Inline validation: at least one param selected; for every selected param, both bounds required and `lo < hi`. +### AC-3 — Objective metric + direction + advanced inputs + +**Objective metric + direction + advanced inputs.** Free-text `metricSeriesId` with the same chip shortcuts used by Sensitivity (`served`, `queue`, `flowLatencyMs`, `utilization`). A two-option toggle for direction (`minimize` / `maximize`), defaulting to `minimize` on first render and after every scenario-change reset. A collapsed "Advanced" disclosure exposes `tolerance` (default 1e-4) and `maxIterations` (default 200). All required fields must be valid before the Run button enables. +### AC-4 — Run optimize and render results + +**Run optimize and render results.** "Run optimize" button calls `flowtime.optimize(...)` (new API method — see Technical Notes). While running, show a spinner (`Loader2Icon`) and disable the button. On success, render: +- The **shared** result card (delivered in M-041) showing the objective metric + direction, final `achievedMetricMean`, converged badge, iteration count. +- A per-param table: `paramId`, final value, `[lo, hi]` bound (printed as `[lo, hi]` in a text cell), and a **separate** column for the mini "range bar" (SVG, reuses `interval-bar-geometry.ts` from M-041) showing where the final value landed inside its bound. The range bar is its own column — do not overlay it on the `[lo, hi]` text cell, so the text stays selectable/copyable and the bar's width is not coupled to text length. +- The **shared** convergence chart (delivered in M-041) plotting `metricMean` vs `iteration` over the full trace. 
No target reference line (there is no target for optimize) — the y-axis label reflects the direction ("minimizing X" / "maximizing X"). +- 400 and 503 errors surfaced as inline messages using the existing analysis-page error pattern. +### AC-5 — Not-converged state + +**Not-converged state.** When the API returns `converged: false` with `iterations == maxIterations`, the shared result card shows an amber "did not converge" badge (same pattern as Goal Seek's max-iterations case from M-041 AC9) and the convergence chart is drawn over the full trace. When `converged: false` with `iterations == 0` (initial simplex failed to satisfy tolerance in 0 iterations — a degenerate max-iterations case), the single `iteration: 0` trace point is plotted and the amber badge still shows. The per-param table renders whatever final `paramValues` the response carries. +### AC-6 — Session form state + +**Session form state.** The Optimize form retains its last input values (selected param chips, per-param bounds, metric, direction, advanced fields) across tab switches within the same page session (in-memory is sufficient). Form values reset when the scenario (run / sample model) changes. Mirrors the Sweep + Goal Seek tab behaviour. +### AC-7 — Vitest coverage for pure logic + +**Vitest coverage for pure logic.** Optimize-specific pure helpers live in a new sibling file `ui/src/lib/utils/optimize-helpers.ts` (with `optimize-helpers.test.ts` alongside it). Do **not** pile them into `analysis-helpers.ts` — keep the optimize surface's helpers modular and scoped to the surface, mirroring how each analysis surface owns its own component files. Branch-covered tests: +- `validateOptimizeForm({ selectedParams, bounds, metricSeriesId, objective })` — per-field error map. Exercises: no params selected; missing lo or hi on any selected param; `lo >= hi`; non-finite bounds; empty metric string; invalid objective. 
+- Any per-param range-bar geometry helper extracted from `interval-bar-geometry.ts` for the table's mini bars. (If the existing `intervalMarkerGeometry` from M-041 covers this unchanged, no new tests are required beyond a call-site test.) +- Shared cross-surface helpers (e.g. `discoverConstParams`) stay in `analysis-helpers.ts`; optimize-only helpers stay in `optimize-helpers.ts`. +- No mocks; no DOM. +### AC-8 — Playwright coverage + +**Playwright coverage.** Extend `tests/ui/specs/svelte-analysis.spec.ts` (preferred) or add `svelte-analysis-optimize.spec.ts`: +- Optimize happy path: page loads, param chip-bar populates, multi-select toggles work, bounds inputs render per selected param, direction toggle works, Run button disabled until form is complete, run against a real engine returns the shared result card **with a converged badge**, a **per-param result table with one row per selected param (id, final value, `[lo, hi]` bound, and a rendered range bar)**, and a rendered convergence chart with multiple iterations plotted. Uses the deterministic tuple recorded in the tracking doc's Notes section (≥ 2 const params from a named bundled sample, bounds that reliably converge inside `maxIterations`). +- No-params-selected state: when the user opens the Optimize tab with no chips toggled on, the bounds table is hidden, the Run button is disabled, and the inline hint renders. +- Graceful skip when Engine API (8081) or Svelte dev server (5173) is down, matching the existing probe-and-skip pattern in `svelte-analysis.spec.ts`. +### AC-9 — Line-by-line branch audit before the commit-approval prompt + +**Line-by-line branch audit** before the commit-approval prompt — the new UI components / helpers only (backend audit is complete from M-041). 
Enumerate every reachable branch in `validateOptimizeForm` (and any sibling helpers in `optimize-helpers.ts`), the per-param range-bar call sites, and the Optimize tab's render conditions (happy-path / empty / no-params-selected / not-converged), matching each to a test (vitest / Playwright). Record unreachable / defensive-default branches in the tracking doc's Coverage Notes, following M-040's pattern.
+## Technical Notes
+
+- **API client addition** (`ui/src/lib/api/flowtime.ts`):
+
+  ```ts
+  async optimize(body: {
+    yaml: string;
+    paramIds: string[];
+    metricSeriesId: string;
+    objective: 'minimize' | 'maximize';
+    searchRanges: Record<string, { lo: number; hi: number }>;
+    tolerance?: number;
+    maxIterations?: number;
+  }) {
+    return post<{
+      paramValues: Record<string, number>;
+      achievedMetricMean: number;
+      converged: boolean;
+      iterations: number;
+      trace: {
+        iteration: number;
+        paramValues: Record<string, number>;
+        metricMean: number;
+      }[];
+    }>(`${API}/optimize`, body);
+  }
+  ```
+
+- **Trace adaptation.** Convert the optimize trace into the chart's normalized `Array<{ iteration, metricMean }>` shape at the call site: `trace.map(p => ({ iteration: p.iteration, metricMean: p.metricMean }))`. The chart does not branch on surface type — it receives the same normalized shape Goal Seek passes.
+
+- **Chart y-axis label.** Reflects the direction — e.g. `minimizing ${metricSeriesId}` / `maximizing ${metricSeriesId}`. No target reference line (no target for optimize). Line colour reflects converged state (teal vs amber), same as Goal Seek. Exact prop names match whatever M-041's extraction settled on.
+
+- **Per-param result table.** One row per paramId in `paramValues`. Columns (in order): id, final value (monospace, fixed precision), `[lo, hi]` bound as a text cell, mini SVG range bar as its own column (reuses `interval-bar-geometry.ts` from M-041). Keeping the range bar in a dedicated column preserves text-cell copyability and decouples the bar's rendered width from the `[lo, hi]` string length.
+ +- **Form state.** Co-located in the route component using `$state` runes, mirroring Goal Seek's pattern from M-041. If readability degrades with the multi-select + per-param bounds structure, promote to `optimize-state.svelte.ts` (sibling file, same modularization philosophy as `optimize-helpers.ts`). + +- **Helper module layout.** `ui/src/lib/utils/optimize-helpers.ts` owns optimize-specific pure helpers (form validation, trace→chart normalization, per-param table-row construction). Cross-surface helpers that were shared across Sweep / Sensitivity / Goal Seek / Optimize (e.g. `discoverConstParams`) stay in `analysis-helpers.ts`. The test file `optimize-helpers.test.ts` sits alongside the helper file; do not extend `analysis-helpers.test.ts`. + +- **Scenario-change reset.** Wire into the same reactivity that already drives scenario changes in `/analysis` (Sweep, Sensitivity, Goal Seek all do this). + +- **Error messaging.** Reuse the existing error surface pattern from M-040 / M-041; do not introduce a new toast or modal system. + +- **Density / styling.** Small inputs, tight gutters, 8–12 px steps. Use the analysis page's existing typography scale; no new font sizes. + +## Out of Scope + +- Per-evaluation probe log for optimize (raw reflection/expansion/contraction/shrink intermediate vertices). The exposed trace is per-iteration best only. +- Multi-objective / Pareto optimization (not in the engine). +- Constraints on optimization (deferred — tracked in `work/gaps.md`). +- History panel for past optimize runs. +- Exporting results (CSV, JSON download). +- Persisting form values to `localStorage` across browser sessions. +- Keyboard shortcuts beyond what the analysis page already supports. +- Server-side parameter discovery (browser-side `discoverConstParams` still owns this). +- Backend trace / endpoint changes — complete in M-041. +- Extraction of shared `AnalysisResultCard` / `ConvergenceChart` components — complete in M-041. 
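The `validateOptimizeForm` contract named in AC-7 (a per-field error map covering no params selected, missing or non-finite bounds, `lo >= hi`, an empty metric, and an invalid objective) might look roughly like the sketch below. The `OptimizeForm` and `OptimizeFormErrors` shapes and the error strings are assumptions made for illustration; the milestone only fixes the function name, its four inputs, and the cases to exercise.

```typescript
// Illustrative sketch only: these type shapes and messages are assumed,
// not taken from ui/src/lib/utils/optimize-helpers.ts.
type Bound = { lo?: number; hi?: number };

interface OptimizeForm {
  selectedParams: string[];
  bounds: Record<string, Bound>;
  metricSeriesId: string;
  objective: string;
}

interface OptimizeFormErrors {
  params?: string;
  bounds?: Record<string, string>; // keyed by param id
  metricSeriesId?: string;
  objective?: string;
}

function validateOptimizeForm(form: OptimizeForm): OptimizeFormErrors {
  const errors: OptimizeFormErrors = {};

  // At least one param chip must be toggled on.
  if (form.selectedParams.length === 0) {
    errors.params = 'select at least one parameter';
  }

  // Every selected param needs two finite bounds with lo strictly below hi.
  const boundErrors: Record<string, string> = {};
  for (const id of form.selectedParams) {
    const b = form.bounds[id];
    if (b === undefined || b.lo === undefined || b.hi === undefined) {
      boundErrors[id] = 'both bounds are required';
    } else if (!Number.isFinite(b.lo) || !Number.isFinite(b.hi)) {
      boundErrors[id] = 'bounds must be finite numbers';
    } else if (b.lo >= b.hi) {
      boundErrors[id] = 'lo must be less than hi';
    }
  }
  if (Object.keys(boundErrors).length > 0) {
    errors.bounds = boundErrors;
  }

  if (form.metricSeriesId.trim() === '') {
    errors.metricSeriesId = 'metric is required';
  }
  if (form.objective !== 'minimize' && form.objective !== 'maximize') {
    errors.objective = 'objective must be minimize or maximize';
  }
  return errors;
}
```

An empty result object means the form is valid, so the Run button's enabled state can be derived from `Object.keys(errors).length === 0` without any DOM access, which keeps the helper testable under the "no mocks; no DOM" rule.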
+ +## Dependencies + +- **M-041 (complete before this milestone starts)** — delivers shared `AnalysisResultCard`, `ConvergenceChart`, `convergence-chart-geometry.ts`, `interval-bar-geometry.ts`, and the goal-seek surface baseline. +- **Backend trace on `/v1/optimize`** — already landed in M-041 commit `29ac3e9` under D-047. No backend work required in this milestone. +- `POST /v1/optimize` — available on port 8081 against `RustEngine:Enabled=true`. +- Sample models bundled at `ui/src/lib/utils/sample-models.ts` — at least one with ≥ 2 const nodes and a metric that changes monotonically with them (so the Nelder-Mead simplex has room to move during the Playwright happy path). See the candidate tuple in Notes. + +## Notes + +- The original combined milestone `m-E21-04-goal-seek-optimize` was split on 2026-04-21 after the shared backend trace landed but before any UI work began. The split preserves the decision record (D-047 covers both endpoints) and preserves commit `29ac3e9` on the M-041 branch. + +- **Candidate Playwright happy-path tuple (to verify at milestone start):** + - Model id: `coffee-shop` (first entry in `SAMPLE_MODELS`, `ui/src/lib/utils/sample-models.ts:47`) — same sample used by the M-041 goal-seek not-bracketed Playwright case, chosen for continuity and because it ships with multiple const nodes. + - `paramIds`: the first two discoverable const params via `discoverConstParams` (expected to include `customers_per_hour`; confirm the second at milestone start). + - `searchRanges`: for each selected param, `{ lo: 0.5 × baseline, hi: 2 × baseline }` — mirrors the default-bounds rule in AC2. + - `metricSeriesId`: `served` (Sensitivity chip shortcut — verify the exact engine-emitted id at authoring time; if the actual series is namespaced, e.g. `Register.served`, update this tuple + the AC8 assertion together). + - `objective`: `minimize`; `tolerance`: `1e-4`; `maxIterations`: `200`. 
+ - Expected: `converged: true` within the iteration budget, trace length ≥ 2, `paramValues` populated for both selected params, per-param table renders one row per param with a visible range-bar marker inside `[lo, hi]`. + - **Verification gate at milestone start**: if `coffee-shop` lacks a second usable const param, or the chosen metric does not move monotonically under these bounds in the Rust engine's output, **swap the sample** (pick an alternate from `SAMPLE_MODELS` whose metric is monotonic under its default bounds, record the replacement tuple here) before writing the Playwright spec. Do **not** soften AC8 to a not-converged assertion — the converged-badge happy path is what AC8 is proving; AC5 already owns the not-converged rendering. A silently-flaky Playwright test is the failure mode to avoid. + +## Coverage Notes + +(Filled at wrap — follow M-040's structure: pure-logic tests, component rendering via Playwright, defensive / unreachable branches enumerated with rationale.) diff --git a/work/epics/E-21-svelte-workbench-analysis-surfaces/M-043-heatmap-view.md b/work/epics/E-21-svelte-workbench-analysis-surfaces/M-043-heatmap-view.md new file mode 100644 index 00000000..adacf2f6 --- /dev/null +++ b/work/epics/E-21-svelte-workbench-analysis-surfaces/M-043-heatmap-view.md @@ -0,0 +1,279 @@ +--- +id: M-043 +title: Heatmap View +status: done +parent: E-21 +acs: + - id: AC-1 + title: View switcher renders above the canvas + status: met + - id: AC-2 + title: Heatmap replaces the canvas only + status: met + - id: AC-3 + title: Heatmap grid renders correct dimensions + status: met + - id: AC-4 + title: Three cell states render correctly with disambiguating tooltips + status: met + - id: AC-5 + title: Shared color-scale normalization (full window, 99p-clipped, excluding + status: met + - id: AC-6 + title: Row sort modes implemented; sort is pin-agnostic + status: met + - id: AC-7 + title: Class filter with optional row-stability toggle + status: met + - id: AC-8 + 
title: Click-a-cell pins node and jumps scrubber + status: met + - id: AC-9 + title: Scrubber-to-column highlight (two-way coupling) + status: met + - id: AC-10 + title: Pinned row glyph in the label gutter + status: met + - id: AC-11 + title: Bin-axis labels with hover parity + status: met + - id: AC-12 + title: Accessibility baseline + status: met + - id: AC-13 + title: Shared view-state store + status: met + - id: AC-14 + title: Testing + status: met + - id: AC-15 + title: Node-mode toggle (operational / full) + status: met +--- + +**Created:** 2026-04-23 +**Started:** 2026-04-23 +**Completed:** 2026-04-24 +## Goal + +Deliver a nodes-x-bins heatmap view as a **sibling of topology** under `/time-travel/topology`, sharing the toolbar, class filter, metric selector, timeline scrubber, workbench sidebar, and pin state. Heatmap reuses the existing `GET /v1/runs/{runId}/state_window` endpoint — **zero backend changes**. Introduce a typed `ViewSwitcher` component, a shared view-state store that both views consume, and a shared full-window color-scale normalization so "bright red at (N, T)" on the heatmap matches "bright red on node N" on topology when the scrubber is at bin T. + +## Context + +E-21's first five milestones delivered the workbench paradigm (M-038 foundation + click-to-pin cards), the metric selector + edge cards + class filter (M-039), and the `/analysis` analysis surfaces (M-040 sweep/sensitivity, M-041 goal-seek, M-042 optimize). What remains on the "views around the data" side of E-21 is: + +1. A second view of the same model — the heatmap — that reveals temporal patterns a single-bin topology snapshot cannot (a node that is fine in bins 1–4 but saturated in bins 5–8 is invisible on topology; the heatmap makes that obvious). +2. A reusable view-switcher shape so later views (decomposition, comparison, flow-balance — all out of E-21 scope) can slot in without structural refactoring. 
+
+The heatmap's data need (`state_window` per-node per-bin series) is already served by the Engine API. `ui/src/lib/api/flowtime.ts:101` already exposes `getStateWindow(runId, startBin, endBin)`. The topology page (`ui/src/routes/time-travel/topology/+page.svelte`) already calls it to populate sparklines. The heatmap calls it once per scenario, identically.
+
+### Design decisions settled at planning (2026-04-23 Q&A)
+
+The 14-question Q&A on 2026-04-23 locked every design decision below. The key shape:
+
+- **View location (Q1):** Heatmap is a sibling of topology under `/time-travel/topology`, behind a view switcher on the canvas. Not an `/analysis` tab. Not its own route.
+- **Workbench integration (Q2):** Heatmap replaces the **canvas** when selected; toolbar + scrubber + workbench sidebar persist unchanged across view switches. Pin state and scrubber position survive view switches in both directions.
+- **Shared normalization (Q3):** Shared full-window color scale with 99th-percentile clipping. **Topology's per-bin normalization changes to match** — this is explicitly a cross-view parity change, captured under ADR-m-E21-06-02 below.
+- **Axis orientation (Q4):** Nodes as rows (Y, labels on left), bins as columns (X, left-to-right).
+- **View switcher (Q5, Q13):** Horizontal tabs above the canvas (`[ Topology | Heatmap ]`), shadcn-style underline, `Alt+1` / `Alt+2` shortcuts. Typed `ViewSwitcher` component with inline view array on the topology page — **no manifest registry, no Svelte context API**. Captured under ADR-m-E21-06-01.
+- **Class filter (Q6):** Full parity with topology — hides rows AND restricts metric computation AND domain computation. Adds a **row-stability toggle** that dims filtered rows in place (off by default).
+- **Row sort (Q7):** Topological order is the default; modes = topological / node id / max desc / mean desc / variance desc. Pin position is natural within the active sort; the pin glyph (AC10) is the pinned-row indicator.
(Amended mid-implementation 2026-04-23 — see Confirmations item #5 below: the original "pinned-first modifier always-on" language was dropped in favor of pin-agnostic sort.) +- **Cell states (Q8):** Three states — observed (colored), no-data-for-bin (neutral grey + subtle hatch), metric-undefined-for-node (row-level muted). Tooltip always disambiguates. +- **Scrubber coupling (Q9):** Two-way. Scrubber position highlights the current-bin column in the heatmap; clicking a cell jumps the scrubber and pins the node. +- **Pinned row markers (Q10):** Pin glyph in the row-label gutter, click-to-unpin. (Amended 2026-04-23: the glyph is the sole pinned-row indicator — there is no positional float; rows keep their natural sort position.) +- **Bin-axis labels (Q11):** Sparse human time labels on the top axis; stride chosen by column pixel width × bin size. Absolute time when `StateWindowResponse.timestampsUtc` is populated (backend sends per-bin wall-clock timestamps), offset-from-start otherwise. Tooltip always shows both bin index and time. +- **Accessibility baseline (Q12):** Keyboard nav, ARIA grid structure, focus ring, tooltip-on-focus, one keyboard Playwright spec. Pattern encoding / high-contrast / screen-reader polish deferred to M-045. +- **Testing (Q14):** 13 Playwright critical-path specs + 5 vitest pure-logic suites. +- **Node-mode toggle (Q15, added 2026-04-23 after Q14):** Shared toolbar toggle `[ Operational | Full ]` controlling the `mode` parameter on `GET /v1/runs/{runId}/state_window`. Operational (default) hides `expr`/`const`/`pmf` computed nodes — matches the Blazor UI's "operational nodes" toggle. Full exposes them; they render as row-level-muted rows under operational metrics (utilization, queue depth) per AC4, and as coloured rows under metrics that are defined for them (value / output). Toggle state lives in the shared view-state store and applies to **both** topology and heatmap (re-fetches `state_window` on change). 
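The shared normalization decision (Q3) is the one that makes a colour mean the same thing on both views, so a small sketch may help. It computes one domain over every observed value in the full window, clips the top at the 99th percentile, and maps values into `[0, 1]`. The function names, the nearest-rank percentile method, and the treatment of `null` / non-finite values as "not observed" are all assumptions for illustration; the spec does not fix any of them here.

```typescript
// Illustrative sketch only: names and the percentile method are assumed.
// null marks "no data for this bin" in a node's per-bin series.
type Series = ReadonlyArray<number | null>;

interface Domain {
  min: number;
  max: number; // 99th-percentile value, not the raw maximum
}

/** Nearest-rank percentile over an ascending-sorted array, p in [0, 1]. */
function percentile(sortedAsc: readonly number[], p: number): number {
  if (sortedAsc.length === 0) return NaN;
  const rank = Math.ceil(p * sortedAsc.length);
  const idx = Math.min(sortedAsc.length - 1, Math.max(0, rank - 1));
  return sortedAsc[idx];
}

/** One domain for the whole window, shared by every node and every bin. */
function computeSharedDomain(rows: readonly Series[]): Domain | null {
  const observed: number[] = [];
  for (const row of rows) {
    for (const v of row) {
      if (v !== null && Number.isFinite(v)) observed.push(v);
    }
  }
  if (observed.length === 0) return null;
  observed.sort((a, b) => a - b);
  return { min: observed[0], max: percentile(observed, 0.99) };
}

/** Map a value into [0, 1]; anything above the clipped max saturates at 1. */
function normalize(value: number, domain: Domain): number {
  const span = domain.max - domain.min;
  if (span <= 0) return 0;
  return Math.min(1, Math.max(0, (value - domain.min) / span));
}
```

Because the domain is computed once per window rather than per bin, the heatmap cell at (N, T) and the topology node N with the scrubber at bin T would pass the same value through the same `normalize` call and land on the same colour.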
+
+### API contract
+
+**GET /v1/runs/{runId}/state_window** — `src/FlowTime.API/Program.cs:1028`
+
+Query: `startBin`, `endBin` (required); optional `mode` (`operational` default, `full` available), `edgeIds`, `edgeMetrics`, `classIds`. Heatmap calls this **once per scenario** with `startBin=0`, `endBin=binCount-1`, no edge/class filter in the request (class filter is applied client-side to allow the toggle-in-place behaviour from Q6). `mode` is passed from the shared view-state store (AC15 node-mode toggle) and drives which node kinds appear in the response. Response is the same `StateWindowResponse` shape topology already consumes. No new endpoints, no additive response fields, no carve-outs needed.
+
+Client surface: `getStateWindow` gains a `mode?: 'operational' | 'full'` parameter (default `operational`, backward-compatible for existing call sites). The heatmap and topology both read `mode` from the shared store and pass it through.
+
+## Acceptance criteria
+
+### AC-1 — View switcher renders above the canvas
+
+**View switcher renders above the canvas.** `/time-travel/topology` shows a horizontal tab bar `[ Topology | Heatmap ]` above the DAG/heatmap area, implemented as a new `ViewSwitcher` component at `ui/src/lib/components/view-switcher.svelte`. Views are listed inline in the route's `