diff --git a/artifacts/features.yaml b/artifacts/features.yaml index c9ee4c7..c91bb43 100644 --- a/artifacts/features.yaml +++ b/artifacts/features.yaml @@ -307,7 +307,7 @@ artifacts: - id: FEAT-020 type: feature title: AADL browser rendering (spar WASM) - status: draft + status: approved description: > Render AADL component diagrams in the dashboard using a spar WASM module compiled for the browser. Provides interactive visualization diff --git a/artifacts/requirements.yaml b/artifacts/requirements.yaml index ff0d170..0a47ac7 100644 --- a/artifacts/requirements.yaml +++ b/artifacts/requirements.yaml @@ -55,6 +55,8 @@ artifacts: target: SC-1 - type: satisfies target: SC-3 + - type: satisfies + target: SC-15 - id: REQ-005 type: requirement title: ReqIF 1.2 import/export @@ -103,6 +105,9 @@ artifacts: fields: priority: must category: functional + links: + - type: satisfies + target: SC-18 - id: REQ-008 type: requirement @@ -115,6 +120,9 @@ artifacts: fields: priority: should category: functional + links: + - type: satisfies + target: SC-16 - id: REQ-009 type: requirement @@ -254,6 +262,8 @@ artifacts: links: - type: satisfies target: SC-7 + - type: satisfies + target: SC-17 - id: REQ-018 type: requirement title: Commit message validation at commit time @@ -303,6 +313,8 @@ artifacts: links: - type: satisfies target: SC-10 + - type: satisfies + target: SC-19 - id: REQ-021 type: requirement title: Distributed baselining via convention tags diff --git a/docs/architecture.md b/docs/architecture.md index 9844520..8360c4c 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -405,6 +405,79 @@ Source scanner ([[REQ-026]], [[DD-021]], [[FEAT-043]]) extracting traceability markers from test code. Ephemeral injection into the link graph, same pattern as commit traceability ([[DD-012]]). +### 8.8 Security Hardening + +STPA-Sec analysis (H-13, H-14, H-17) identified four attack surfaces requiring +hardening in the current architecture. + +#### 8.8.1 Content-Security-Policy Headers + +The dashboard server (serve.rs) must set a strict CSP header on all responses: + +``` +Content-Security-Policy: default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data:; connect-src 'self' +``` + +This mitigates XSS from H-13 even if HTML escaping is missed in one rendering +path. The `'unsafe-inline'` exception for styles is required by the current +inline CSS approach and should be removed once styles are extracted to a file. + +**STPA coverage:** SC-15 (HTML-escape all content), H-13 (XSS via unescaped +artifact content). + +#### 8.8.2 Markdown HTML Sanitization + +The pulldown-cmark renderer in document.rs must sanitize raw HTML blocks. Two +layers of defense: + +1. **Output escaping** -- All artifact field values (title, description, custom + fields) must be HTML-escaped at the output boundary before interpolation into + HTML. The `html_escape()` function in document.rs handles `&`, `<`, `>`, `"`. + serve.rs routes must use this function for all artifact-derived content. + +2. **Markdown raw HTML filtering** -- pulldown-cmark's `ENABLE_STRIKETHROUGH` + and similar extensions are safe, but `Event::Html` and `Event::InlineHtml` + events must be escaped or stripped. Image URLs must be restricted to + `http://`, `https://`, and relative paths (no `javascript:` or `data:` + schemes). + +**STPA coverage:** SC-15, H-13.1 (script injection), H-13.2 (image URL +injection). + +#### 8.8.3 Git Clone Hook Protection + +The externals module (externals.rs) executes `git clone` and `git fetch` for +cross-repo artifact linking ([[REQ-020]]). To prevent arbitrary code execution +via git hooks (H-17), all git operations must: + +1. Pass `--config core.hooksPath=/dev/null` to disable post-checkout and other + hooks in cloned repositories. +2. Use `--depth 1` for initial clones when only HEAD is needed, reducing the + attack surface from malicious repository history. +3. Log the URL and commit SHA being fetched for audit trail purposes. + +**STPA coverage:** SC-19 (git clone hook protection), H-17 (arbitrary code +execution via hooks), UCA-L-6. + +#### 8.8.4 WASM Adapter Output Validation + +The WASM runtime (wasm_runtime.rs) loads third-party adapter components that +return imported artifacts. Beyond existing fuel metering and memory limits, +the host must validate adapter output ([[REQ-008]]): + +1. **Count verification** -- The adapter must report the expected artifact count + as a separate return value. The host compares the declared count against the + actual returned count and rejects mismatches. +2. **ID pattern check** -- All returned artifact IDs must match expected patterns + for the declared type. IDs that do not match the schema's ID prefix + conventions are flagged. +3. **Schema validation** -- All returned artifacts pass full schema validation + before acceptance into the store (this already exists but is noted for + completeness). + +**STPA coverage:** SC-16 (WASM output validation), H-14 (untrusted adapter +code), UCA-C-21. + ## 9. Requirements Coverage This document addresses the following requirements: diff --git a/docs/plans/2026-03-16-coverage-gap-analysis.md b/docs/plans/2026-03-16-coverage-gap-analysis.md new file mode 100644 index 0000000..d1a496d --- /dev/null +++ b/docs/plans/2026-03-16-coverage-gap-analysis.md @@ -0,0 +1,444 @@ +# Coverage Gap Analysis: STPA, Commits, and Test Traceability + +**Date:** 2026-03-16 +**Scope:** Rivet v0.2.0 — comprehensive gap analysis across commit traceability, +test coverage, STPA completeness, and lifecycle traceability. + +--- + +## Executive Summary + +| Metric | Current | Target | Gap | +|--------|---------|--------|-----| +| Commit coverage | 11.1% (10/90) | 50%+ | 80 artifacts uncovered | +| Test traceability markers | 0 source markers | 31 REQ markers | No `// rivet: verifies` annotations exist | +| STPA loss scenarios | 19 scenarios | ~27 needed | 8 UCAs lack loss scenarios (UCA-C-10 through UCA-C-17) | +| Lifecycle coverage gaps | 44 artifacts | 0 | Missing downstream links on approved artifacts | +| Validation warnings | 2 | 0 | FEAT-050/051 use `phase-4` (not in allowed values) | +| Schema coverage rules | 100% (all 10 rules) | 100% | Link-level coverage is complete | + +--- + +## Part 1: Commit Coverage Gap + +### Current State + +`rivet commits` reports **11.1% artifact coverage** (10/90 traceable artifacts). + +- **Linked commits:** 1 +- **Orphan commits:** 23 (no artifact trailers) +- **Exempt commits:** 13 +- **Broken refs:** 0 + +### Artifacts Without Commit Coverage (80 total) + +#### Requirements (12 uncovered out of 31) + +| ID | Title | Status | Notes | +|----|-------|--------|-------| +| REQ-020 | Cross-repository artifact linking | draft | Phase 3 — not yet implemented | +| REQ-021 | Distributed baselining | draft | Phase 3 | +| REQ-022 | Single-binary WASM asset embedding | draft | Phase 3 | +| REQ-023 | Conditional validation rules | draft | Phase 3 | +| REQ-024 | Change impact analysis | draft | Phase 3 | +| REQ-025 | sphinx-needs JSON import | draft | Phase 3 | +| REQ-026 | Test-to-requirement traceability extraction | draft | Phase 3 | +| REQ-027 | Build-system-aware cross-repo discovery | draft | Phase 3 | +| REQ-028 | Diagnostic-quality parsing with lossless syntax trees | draft | Phase 3 | +| REQ-029 | Incremental validation via dependency-tracked computation | draft | Phase 3 | +| REQ-030 | Formal correctness guarantees | draft | Phase 3-4 | +| REQ-031 | Schema-validated artifact mutation from CLI | draft | Phase 3 | + +**Analysis:** All 12 uncovered REQs are draft/Phase 3+. The 19 approved REQs +also lack commit coverage because commits predate the trailer system. This is +the biggest gap — retroactively tagging existing commits is not practical. + +#### Design Decisions (15 uncovered out of 28) + +DD-014 through DD-028 — all Phase 3 decisions with no implementing commits yet. + +#### Features (25 uncovered out of 67) + +FEAT-033 through FEAT-057 — all draft/Phase 3 features awaiting implementation. + +#### STPA Artifacts (28 uncovered) + +- H-9, H-10, H-11, H-12 and sub-hazards (8 total) — added in Phase 2.5 +- SC-11 through SC-14 (4 total) — added in Phase 2.5 +- UCA-C-10 through UCA-C-17 (8 total) — added in Phase 2.5 +- CC-C-10 through CC-C-17 (8 total) — added in Phase 2.5 + +### Path from 11.1% to 50%+ + +**Strategy 1: Tag future commits (organic growth)** +- Each Phase 3 implementation commit should reference its REQ/FEAT/DD. +- With 8 parallel workstreams, 5-10 tagged commits per workstream would cover + ~40-80 artifacts, reaching 50%+ within one sprint. + +**Strategy 2: Retroactive coverage for approved artifacts** +- The 19 approved REQs and ~30 approved FEATs implemented in Phase 1-2 have + implementing commits but those commits lack trailers. +- Option A: Create one `chore: retroactive traceability tagging` commit per + artifact batch (e.g., all Phase 1 REQs) with Implements trailers. +- Option B: Add the implementing commit SHAs to the artifact YAML as a + `commits` field — but this is not how `rivet commits` works (it scans git log). +- **Recommended:** Accept that Phase 1-2 commits are organically unlinked. + Focus enforcement on Phase 3+ commits. The `rivet commit-msg-check` hook + (FEAT-029) will prevent future orphans. + +**Strategy 3: Reduce the denominator** +- STPA artifacts (hazards, UCAs, controller-constraints, loss-scenarios) are + safety analysis — they don't have "implementing commits" in the traditional + sense. Consider adding them to `trace-exempt-artifacts` in rivet.yaml. +- If 28 STPA artifacts are exempted, the denominator drops from 90 to 62, + and achieving 50% requires covering ~31 artifacts. + +**Projected coverage with Phase 3 discipline:** +- Phase 3 has ~25 FEATs + ~12 REQs + ~15 DDs to implement = ~52 artifacts. +- If each gets 1+ tagged commit: (10 + 52) / 90 = **68.9%** +- With STPA exemptions: (10 + 52) / 62 = **100%** + +--- + +## Part 2: Test Coverage Gap + +### Current State + +`rivet coverage` reports **100%** on all 10 schema-level coverage rules. +This means every requirement has a feature satisfying it, every hazard has +a constraint, every UCA has a controller constraint, etc. + +However, **zero source-level test traceability markers** exist. A grep for +`rivet: verifies` across rivet-core/src/ and rivet-core/tests/ returned +no results. + +### Test Files and What They Cover + +| Test File | Tests | Effective REQ Coverage | +|-----------|-------|----------------------| +| rivet-core/src/diff.rs | 5 unit tests | REQ-001 (store diffs) | +| rivet-core/src/lifecycle.rs | 4 unit tests | REQ-004 (validation lifecycle) | +| rivet-core/src/document.rs | 8 unit tests | REQ-001, REQ-007 (document system) | +| rivet-core/tests/integration.rs | ~18 integration tests | REQ-001, REQ-003, REQ-004, REQ-007, REQ-010 | +| rivet-core/tests/stpa_roundtrip.rs | ~4 tests | REQ-002 (STPA) | +| rivet-core/tests/proptest_core.rs | 6 proptest properties | REQ-001, REQ-004, REQ-010 | +| rivet-core/tests/oslc_integration.rs | ~2 tests | REQ-006 (OSLC) | +| rivet-core/tests/docs_schema.rs | ~2 tests | REQ-007 (docs) | +| rivet-core/tests/commits_integration.rs | ~4 tests | REQ-017, REQ-018, REQ-019 | +| rivet-core/tests/commits_config.rs | ~3 tests | REQ-017 (config parsing) | +| rivet-core/tests/externals_config.rs | ~3 tests | REQ-020 (cross-repo) | +| rivet-core/tests/mutate_integration.rs | ~8 tests | REQ-031 (mutations) | + +### Requirements With Tests But No Markers + +These requirements have tests that exercise them, but the tests lack +`// rivet: verifies REQ-XXX` annotations: + +- **REQ-001** — Store/model unit tests (diff.rs, document.rs, integration.rs) +- **REQ-002** — STPA roundtrip tests +- **REQ-003** — Integration tests (ASPICE rules) +- **REQ-004** — Link graph tests, proptest, integration +- **REQ-005** — ReqIF integration tests (stub) +- **REQ-007** — Document and CLI integration tests +- **REQ-010** — Schema merge tests, proptest +- **REQ-014** — All TEST-* artifacts verify this +- **REQ-017** — commits_integration.rs, commits_config.rs +- **REQ-018** — commits_integration.rs (commit-msg-check) +- **REQ-019** — commits_integration.rs (orphan detection) +- **REQ-020** — externals_config.rs +- **REQ-031** — mutate_integration.rs + +### Requirements Without Any Tests + +| REQ | Title | Notes | +|-----|-------|-------| +| REQ-006 | OSLC sync | Only stub test, OSLC not implemented | +| REQ-008 | WASM adapters | No WASM runtime tests | +| REQ-009 | Test results as evidence | TEST-010 covers the model, not the release flow | +| REQ-011 | Rust edition 2024 / MSRV | Checked by CI, not unit tested | +| REQ-012 | CI quality gates | Meta-requirement, verified by CI pipeline | +| REQ-013 | Performance benchmarks | Benchmarks exist but not traced | +| REQ-015 | ASPICE 4.0 schemas | Schema tests exist, no marker | +| REQ-016 | Cybersecurity schema | Schema merge tests cover it | +| REQ-021 through REQ-030 | Phase 3 requirements | Not yet implemented | + +### Recommendations + +1. **Add `// rivet: verifies REQ-XXX` markers** to existing test functions. + This requires implementing FEAT-043 (test traceability source scanner) first, + or manually adding markers now in preparation. + +2. **Marker format to adopt** (per FEAT-043 design): + ```rust + // rivet: verifies REQ-001 + #[test] + fn test_store_insert_lookup() { ... } + ``` + +3. **Priority order for adding markers:** + - Phase 1: All integration tests in `tests/integration.rs` (~18 tests) + - Phase 2: Proptest properties in `tests/proptest_core.rs` (6 tests) + - Phase 3: Unit tests in `src/diff.rs`, `src/document.rs`, `src/lifecycle.rs` + - Phase 4: Specialized tests (stpa_roundtrip, commits, externals, mutate) + +4. **Estimated coverage after markers:** 13/31 REQs with direct test markers + (42%). Adding markers to existing tests for REQ-015, REQ-016 would reach + 15/31 (48%). + +--- + +## Part 3: STPA Completeness + +### 3.1 Missing Hazards for Recent Features + +The following features added in Phase 2/2.5 have **no corresponding STPA hazards**: + +#### HTML Export Corruption +The dashboard generates HTML for compliance evidence viewing. If the HTML +export contains corrupted data (XSS in artifact fields rendered as HTML, +broken link graphs, missing artifacts from the export), auditors receive +misleading evidence. + +**Proposed hazard:** +- **H-13: Rivet dashboard renders artifact content as unescaped HTML, enabling + XSS or content injection in compliance evidence** + - Losses: [L-2, L-3] — compliance evidence corruption and data sovereignty + - This is particularly relevant because the dashboard renders artifact + descriptions as HTML and serves as audit evidence (SC-6 applies) + +#### config.js Injection +The dashboard serves a dynamically generated `config.js` that bootstraps +client-side behavior (WebSocket URLs, feature flags). If this is injectable, +an attacker could redirect the dashboard to exfiltrate artifact data. + +**Proposed hazard:** +- **H-14: Rivet dashboard config.js endpoint is injectable, allowing artifact + data exfiltration or UI manipulation** + - Losses: [L-3, L-6] — data sovereignty and audit trail + +#### WASM Runtime Panics +The spar WASM rendering module runs in the dashboard to render AADL diagrams. +If the WASM module panics on malformed AADL input, it could crash the browser +tab or produce an incomplete rendering that masks architecture gaps. + +**Proposed hazard:** +- **H-15: Rivet WASM renderer panics on malformed AADL input, producing + incomplete architecture diagrams** + - Losses: [L-1, L-4] — traceability integrity (missing components in diagram) + and engineering productivity (crashed browser tab) + +### 3.2 System Constraint to Requirement Linkage + +All 14 system constraints are linked to hazards (verified by `rivet coverage`). + +Cross-referencing SCs to REQs: + +| SC | Hazard | Linked to REQ? | Notes | +|----|--------|---------------|-------| +| SC-1 | H-1 | REQ-031 (via links) | Yes | +| SC-2 | H-2 | REQ-031 (via links) | Yes | +| SC-3 | H-3 | No direct REQ link | Should link to REQ-004 | +| SC-4 | H-4 | No direct REQ link | Should link to REQ-005, REQ-006 | +| SC-5 | H-5 | No direct REQ link | Should link to REQ-006 | +| SC-6 | H-6 | No direct REQ link | **Gap** — no REQ for report verification gating | +| SC-7 | H-7 | No direct REQ link | Should link to REQ-017 | +| SC-8 | H-8 | No direct REQ link | Should link to REQ-006 | +| SC-9 | H-4 | No direct REQ link | Should link to REQ-005 | +| SC-10 | H-1, H-3 | No direct REQ link | Should link to REQ-020 | +| SC-11 | H-9 | REQ-029 (via links) | Yes | +| SC-12 | H-10 | REQ-023 (via links) | Yes | +| SC-13 | H-11 | REQ-028 (via links) | Yes | +| SC-14 | H-12 | REQ-030 (via links) | Yes | + +**Gap:** SC-3 through SC-10 have no `satisfies` link from any requirement. +The constraints exist but no requirement explicitly commits to satisfying them. + +### 3.3 UCA to Hazard Linkage + +All 45 UCAs are linked to hazards (100% coverage per `rivet coverage`). No gaps. + +### 3.4 Loss Scenarios for UCA-C-10 through UCA-C-17 + +**Current state:** The loss-scenarios.yaml has 19 scenarios. UCA-C-10 through +UCA-C-17 (the incremental validation and parser UCAs added in Phase 2.5) have +**zero loss scenarios**. This is a significant STPA completeness gap. + +**Missing loss scenarios (8 needed):** + +| UCA | Description | Proposed Loss Scenario | +|-----|-------------|----------------------| +| UCA-C-10 | Salsa doesn't invalidate on file change | LS-C-5: Developer edits YAML, `rivet validate` returns cached PASS from salsa, new validation error is missed. Developer merges broken traceability. | +| UCA-C-11 | Conditional rules not re-evaluated on field change | LS-C-6: Artifact status changes from draft to approved, conditional rule requiring verification-criteria doesn't fire, approved artifact ships without verification evidence. | +| UCA-C-12 | Contradictory conditional rules applied | LS-C-7: Schema update adds rule A requiring field X when status=approved, existing rule B forbids field X when safety=ASIL_B. Engineers cannot make ASIL_B approved artifacts valid, disable validation entirely. | +| UCA-C-13 | Incremental != full validation results | LS-C-8: After a sequence of edits, incremental validation reports PASS. A colleague runs full validation on the same files and gets 3 errors. Trust in the tool collapses, team abandons automated validation. | +| UCA-C-14 | Conditional rules evaluated before schema loads | LS-C-9: Schema file with 5 conditional rules is loaded after validation begins. Only 2 of 5 rules apply. Safety-critical field requirements from the remaining 3 rules are never checked. | +| UCA-C-15 | Parser misses git_override | LS-C-10: MODULE.bazel pins dependency to a specific commit via git_override, but parser extracts registry version. Cross-repo validation runs against wrong commit, reports coverage for artifacts that don't exist in the pinned version. | +| UCA-C-16 | Parser silently skips unsupported Starlark | LS-C-11: MODULE.bazel uses load() to import a macro that declares 5 dependencies. Parser silently skips load(), missing all 5 repos. Cross-repo validation has blind spots for 5 modules. | +| UCA-C-17 | Parser extracts wrong module name | LS-C-12: Parser bug swaps name= and version= keyword values in bazel_dep(). Cross-repo links resolve against "1.2.3" (the version string used as module name), which doesn't exist, but the error message is confusing. | + +### 3.5 HTML Export Controller + +The dashboard (`CTRL-DASH`) is currently modeled as a read-only display +controller. However, the HTML it serves is **de facto compliance evidence** — +auditors view the dashboard to verify traceability coverage. SC-6 states: + +> "Rivet must generate compliance reports only from verified traceability data." + +The dashboard currently has no validation gate — it renders whatever data is +loaded, whether validated or not. This means: + +**Proposed controller extension:** Add control actions to CTRL-DASH: +- **CA-DASH-2:** Render compliance-grade HTML pages (coverage matrix, validation + summary, artifact detail pages used as audit evidence) +- **CA-DASH-3:** Export static HTML snapshots for offline audit review + +**Proposed UCAs for export controller:** +- **UCA-D-3:** Dashboard exports compliance HTML without running validation + first (hazards: H-6, H-3) +- **UCA-D-4:** Dashboard exports HTML containing unescaped artifact content + that renders as executable JavaScript (hazards: L-3, H-7) +- **UCA-D-5:** Dashboard exports HTML snapshot that omits artifacts from + partially-loaded sources (hazards: H-2, H-3) + +--- + +## Part 4: Artifact Inventory for v0.2.0 + +### New STPA Artifacts Needed + +#### Hazards (3 new) + +| ID | Title | Losses | +|----|-------|--------| +| H-13 | Dashboard renders unescaped HTML content in compliance evidence | L-2, L-3 | +| H-14 | Dashboard config.js endpoint is injectable | L-3, L-6 | +| H-15 | WASM renderer panics on malformed input, producing incomplete diagrams | L-1, L-4 | + +#### System Constraints (3 new) + +| ID | Title | Hazards | +|----|-------|---------| +| SC-15 | Dashboard must HTML-escape all artifact content before rendering | H-13 | +| SC-16 | Dashboard must sanitize all dynamically generated JavaScript | H-14 | +| SC-17 | WASM renderers must trap panics and return error SVGs, never crash | H-15 | + +#### UCAs (3 new for dashboard export) + +| ID | Controller | Description | Hazards | +|----|-----------|-------------|---------| +| UCA-D-3 | CTRL-DASH | Dashboard exports compliance HTML without validation gate | H-6, H-3 | +| UCA-D-4 | CTRL-DASH | Dashboard renders unescaped content as executable HTML | H-13 | +| UCA-D-5 | CTRL-DASH | Dashboard exports partial data from incomplete source loading | H-2, H-3 | + +#### Controller Constraints (3 new) + +| ID | Constraint | UCAs | +|----|-----------|------| +| CC-D-3 | Dashboard must gate compliance-grade pages on successful validation | UCA-D-3 | +| CC-D-4 | Dashboard must HTML-escape all artifact-sourced content | UCA-D-4 | +| CC-D-5 | Dashboard must verify all sources loaded before generating export | UCA-D-5 | + +#### Loss Scenarios (8 new for incremental/parser UCAs + 3 for dashboard) + +| ID | Title | UCA | +|----|-------|-----| +| LS-C-5 | Salsa cache returns stale validation pass after file edit | UCA-C-10 | +| LS-C-6 | Conditional rule misses status change to approved | UCA-C-11 | +| LS-C-7 | Contradictory rules make ASIL_B approval impossible | UCA-C-12 | +| LS-C-8 | Incremental/full validation divergence erodes tool trust | UCA-C-13 | +| LS-C-9 | Schema loads after validation, missing conditional rules | UCA-C-14 | +| LS-C-10 | Parser misses git_override, validates wrong commit | UCA-C-15 | +| LS-C-11 | Parser silently skips load(), missing 5 dependencies | UCA-C-16 | +| LS-C-12 | Parser swaps keyword args, resolves wrong module | UCA-C-17 | +| LS-D-1 | Dashboard shows green metrics without validation gate | UCA-D-3 | +| LS-D-2 | XSS in artifact description executes in auditor's browser | UCA-D-4 | +| LS-D-3 | HTML export omits STPA artifacts from unmounted source | UCA-D-5 | + +### Missing Test Traceability Markers + +**Immediate action (no code change needed, just comments):** + +Add `// rivet: verifies REQ-XXX` to these test files: + +| File | Tests | Markers to Add | +|------|-------|---------------| +| rivet-core/tests/integration.rs | 18 tests | REQ-001, REQ-003, REQ-004, REQ-007, REQ-010 | +| rivet-core/tests/proptest_core.rs | 6 tests | REQ-001, REQ-004, REQ-010 | +| rivet-core/tests/stpa_roundtrip.rs | 4 tests | REQ-002, REQ-004 | +| rivet-core/src/diff.rs | 5 tests | REQ-001 | +| rivet-core/src/document.rs | 8 tests | REQ-001, REQ-007 | +| rivet-core/src/lifecycle.rs | 4 tests | REQ-004 | +| rivet-core/tests/commits_integration.rs | 4 tests | REQ-017, REQ-018, REQ-019 | +| rivet-core/tests/commits_config.rs | 3 tests | REQ-017 | +| rivet-core/tests/externals_config.rs | 3 tests | REQ-020 | +| rivet-core/tests/mutate_integration.rs | 8 tests | REQ-031 | + +**Total:** ~63 test functions need markers for ~15 distinct REQs. + +### Lifecycle Coverage Gaps to Close + +44 approved artifacts are missing downstream links (from `rivet validate`): + +**Requirements missing aadl-component allocation (7):** +REQ-012, REQ-013, REQ-014, REQ-015, REQ-016, REQ-017, REQ-018, REQ-019 + +**Features missing design-decision links (5):** +FEAT-001, FEAT-002, FEAT-009, FEAT-010, FEAT-018 + +**Features with no downstream artifacts at all (28):** +All TEST-* and many FEAT-* approved artifacts — these are leaf nodes +(test artifacts and features) that by nature have no further downstream. +Consider adding a lifecycle rule exemption for `feature` artifacts with +tags containing `testing` or `swe-*`. + +### Validation Warnings to Fix + +| Artifact | Issue | Fix | +|----------|-------|-----| +| FEAT-050 | `phase: phase-4` not in allowed values | Add `phase-4` to allowed values in dev.yaml, or change to `future` | +| FEAT-051 | `phase: phase-4` not in allowed values | Same fix | + +--- + +## Priority Action Plan + +### Immediate (this sprint) + +1. Fix FEAT-050/051 phase value warnings +2. Add 8 loss scenarios for UCA-C-10 through UCA-C-17 +3. Add `// rivet: verifies` markers to top 3 test files (~30 tests) +4. Ensure all Phase 3 commits use artifact trailers + +### Short-term (next sprint) + +5. Add H-13/H-14/H-15 hazards and associated SC/UCA/CC/LS artifacts +6. Add SC-3 through SC-10 `satisfies` links from requirements +7. Add remaining test markers (~33 tests) +8. Consider STPA artifact exemption from commit coverage denominator + +### Medium-term (Phase 3) + +9. Implement FEAT-043 (test traceability source scanner) to automate marker extraction +10. Implement FEAT-029 (commit-msg-check) to prevent future orphan commits +11. Add missing DD links for FEAT-001, FEAT-002, FEAT-009, FEAT-010, FEAT-018 +12. Add lifecycle rule exemptions for leaf-node test features + +--- + +## Artifact Count Summary + +| Category | Existing | New Needed | Total After | +|----------|----------|-----------|-------------| +| Losses | 6 | 0 | 6 | +| Hazards | 12 | 3 | 15 | +| Sub-hazards | 10 | 0 | 10 | +| System constraints | 14 | 3 | 17 | +| UCAs | 45 | 3 | 48 | +| Controller constraints | 45 | 3 | 48 | +| Loss scenarios | 19 | 11 | 30 | +| **STPA total** | **151** | **23** | **174** | +| Requirements | 31 | 0 | 31 | +| Design decisions | 28 | 0 | 28 | +| Features | 67 | 0 | 67 | +| AADL components | 21 | 0 | 21 | +| **Grand total** | **328** | **23** | **351** | diff --git a/docs/plans/2026-03-16-formal-verification-completion.md b/docs/plans/2026-03-16-formal-verification-completion.md new file mode 100644 index 0000000..b7766d8 --- /dev/null +++ b/docs/plans/2026-03-16-formal-verification-completion.md @@ -0,0 +1,303 @@ +# Formal Verification Completion Plan + +**Issue:** #23 — Formal verification strategy (Kani + Verus + Rocq) +**Date:** 2026-03-16 +**Status:** Analysis complete, ready for execution + +--- + +## 1. Current State + +### 1.1 Kani (Bounded Model Checking) — 10 harnesses, ready for CI + +**File:** `rivet-core/src/proofs.rs` (gated behind `#[cfg(kani)]`) + +| # | Harness | Property proved | +|---|---------|-----------------| +| 1 | `proof_parse_artifact_ref_no_panic` | `parse_artifact_ref` never panics for any printable ASCII input up to 8 bytes | +| 2 | `proof_store_insert_no_panic` | `Store::insert` never panics for any bounded artifact | +| 3 | `proof_store_duplicate_returns_error` | Duplicate insert returns `Err`, store length stays 1 | +| 4 | `proof_coverage_percentage_bounds` | `CoverageEntry::percentage()` always in [0.0, 100.0]; edge cases at 0 and total=0 | +| 5 | `proof_cardinality_exhaustive` | `validate()` handles all `Cardinality` variants without panic for 0-2 links | +| 6 | `proof_compute_coverage_report_bounds` | End-to-end `compute_coverage` yields covered <= total, percentage in [0, 100] | +| 7 | `proof_schema_merge_idempotent` | `Schema::merge` with duplicate file preserves type/link/inverse counts | +| 8 | `proof_linkgraph_lone_artifact_is_orphan` | Unlinked artifact detected as orphan | +| 9 | `proof_linkgraph_dag_no_cycles` | A->B->C chain has no cycles | +| 10 | `proof_linkgraph_cycle_detected` | A->B->A cycle is correctly detected | + +**Integration status:** Module is declared in `lib.rs` (`#[cfg(kani)] mod proofs;`), `Cargo.toml` declares `cfg(kani)` in `unexpected_cfgs`. CI has a commented-out Kani job (lines 229-236 of `.github/workflows/ci.yml`). + +**Verdict: READY to uncomment and ship.** The `kani-github-action@v1` installs Kani automatically. No Bazel needed. + +### 1.2 Verus (SMT-backed functional correctness) — 6 specs, 3 proved + +**File:** `rivet-core/src/verus_specs.rs` (gated behind `#[cfg(verus)]`) + +| # | Spec/Proof | Status | +|---|-----------|--------| +| 1 | `store_well_formed` spec + `lemma_insert_preserves_wellformed` | **Proved** — inserting a fresh ID preserves store well-formedness | +| 2 | `backlink_symmetric` spec + `lemma_build_yields_symmetric` | **Proved** (from preconditions) — forward/backward link symmetry | +| 3 | `coverage_bounded` spec + `lemma_coverage_bounded` | **Proved** — coverage percentage in [0, 100] using vstd arithmetic lemmas | +| 4 | `validation_soundness` | **Spec only** — stated as open spec function, NOT proved. States: no errors implies all types known, no broken links, backlinks symmetric | +| 5 | `reachable_sound` + `reachable_complete` | **Spec only** — defines reachability with fuel-bounded induction, NOT proved | +| 6 | `coverage_validation_agreement` | **Spec only** — 100% coverage implies no error diagnostics for that rule, NOT proved | + +**Integration status:** `verus/BUILD.bazel` defines `verus_library` and `verus_test` targets. `verus/MODULE.bazel` references `pulseengine/rules_verus` (commit `e2c1600`). Requires Bazel 8+ and a nightly Rust toolchain (1.82.0-nightly pinned by Verus). + +**Verdict: NOT runnable in CI today.** Requires Bazel infrastructure and rules_verus to be published/available. The 3 unproved specs need substantial proof work. + +### 1.3 Rocq (Deep metamodel proofs) — 24 Qed, 1 Admitted + +**Files:** `proofs/rocq/Schema.v` (667 lines), `proofs/rocq/Validation.v` (201 lines) + +#### Schema.v — 17 Qed, 1 Admitted + +| # | Theorem/Lemma | Status | +|---|--------------|--------| +| 1 | `schema_satisfiable` | **Proved** — empty store satisfies any rule set | +| 2 | `monotonicity_non_source` | **Proved** — adding non-source artifact preserves validity | +| 3 | `validation_empty_store` | **Proved** — zero work for empty store | +| 4 | `validation_empty_rules` | **Proved** — zero work for empty rule set | +| 5 | `validation_work_add_one` | **Proved** — adding one artifact adds |rules| work | +| 6 | `store_get_not_in` | **Proved** — lookup returns None if ID not present | +| 7 | `store_get_in` | **Proved** — lookup succeeds for present ID in unique store | +| 8 | `broken_link_detection_sound` | **Proved** — absent target means link is broken | +| 9 | `store_get_app_new` | **Proved** — newly appended artifact is retrievable | +| 10 | `insert_then_get` | **Proved** — insert then get returns the artifact | +| 11 | `store_get_app_old` | **Proved** — insert preserves old lookups | +| 12 | `insert_preserves_old` | **Proved** — insert doesn't affect other artifact retrieval | +| 13 | `insert_duplicate_fails` | **Proved** — duplicate ID insert returns None | +| 14 | `backlink_from_forward_link` | **Proved** — forward link induces backlink | +| 15 | `vmodel_chain_two_steps` | **Proved** — two consecutive rules imply reachability | +| 16 | `single_rule_constructible` | **Proved** — any single rule is satisfiable | +| 17 | `no_source_no_violations` | **Proved** — no source artifacts means zero violations | +| 18 | `zero_violations_implies_satisfied` | **ADMITTED** — requires inductive reasoning over `filter` | + +#### Validation.v — 7 Qed, 0 Admitted + +| # | Theorem/Lemma | Status | +|---|--------------|--------| +| 1 | `validation_deterministic` | **Proved** — pure function, reflexivity | +| 2 | `empty_store_no_diagnostics` | **Proved** — empty store yields empty diagnostics | +| 3 | `check_broken_links_reports` | **Proved** — broken link always produces SevError diagnostic | +| 4 | `check_broken_links_clean` | **Proved** — all targets present means no broken-link diags | +| 5 | `check_artifact_rule_clean` | **Proved** — non-matching artifact kind means no rule diag | +| 6 | `check_broken_links_length` | **Proved** — broken-link diags bounded by link count | +| 7 | `check_artifact_rules_length` | **Proved** — rule diags bounded by rule count | + +**Totals: 24 proved (Qed), 1 admitted.** + +**Integration status:** `proofs/rocq/BUILD.bazel` defines targets using `rules_rocq_rust`. `proofs/rocq/MODULE.bazel` references `pulseengine/rules_rocq_rust` (commit `6a8da0b`) and requires Nix + Bazel 8+ for hermetic Rocq 9.0 toolchain. + +**Verdict: NOT runnable in CI today.** Requires Bazel + Nix infrastructure. One admitted theorem needs completion. + +--- + +## 2. Gap Analysis + +### 2.1 What's proved vs what's spec'd but not proved + +| Layer | Proved | Spec'd only | Admitted | Total | +|-------|--------|-------------|----------|-------| +| Kani | 10 harnesses (all complete) | 0 | 0 | 10 | +| Verus | 3 lemmas | 3 specs (validation_soundness, reachability, coverage-validation agreement) | 0 | 6 | +| Rocq | 24 theorems | 0 | 1 (zero_violations_implies_satisfied) | 25 | +| **Total** | **37** | **3** | **1** | **41** | + +### 2.2 What's missing from Issue #23 scope + +Issue #23 calls for proofs of: + +| Desired proof | Current status | +|--------------|----------------| +| `LinkGraph::build()` no panics | **Partially covered** — Kani #5, #8-10 exercise build indirectly, but no dedicated harness for arbitrary store+schema | +| `parse_artifact_ref()` all inputs | **Done** — Kani #1 | +| `Schema::merge()` never panics, preserves types | **Done** — Kani #7 (idempotence); no explicit panic-freedom harness | +| `validate()` cardinality logic exhaustive | **Done** — Kani #5 | +| `detect_circular_deps()` DFS terminates, finds all cycles | **Partially covered** — Kani #9, #10 test DAG/cycle but not DFS termination | +| MODULE.bazel parser all inputs | **Not started** — no Kani harness for Bazel parser | +| Validation soundness (PASS -> rules satisfied) | **Spec only** — Verus spec #4, not proved | +| Validation completeness (violated -> diagnostic) | **Partially covered** — Rocq `zero_violations_implies_satisfied` is ADMITTED | +| Backlink symmetry | **Done** — Verus #2 proved, Rocq #14 proved | +| Conditional rule consistency | **Not started** | +| Reachability correctness | **Spec only** — Verus #5 | +| Schema satisfiability | **Done** — Rocq #1 | +| Monotonicity | **Done** — Rocq #2 | +| Link graph well-foundedness / validation terminates | **Done** — Rocq #3-5 | +| ASPICE V-model completeness | **Partially covered** — Rocq #15 (two-step chain) | + +### 2.3 CI integration gaps + +| Tool | CI status | Blocker | +|------|-----------|---------| +| Kani | **Commented out** in ci.yml (lines 229-236) | None — just uncomment | +| Verus | No CI job | Requires Bazel + rules_verus + nightly Rust 1.82.0 | +| Rocq | No CI job | Requires Bazel + rules_rocq_rust + Nix + Rocq 9.0 | + +--- + +## 3. CI Integration Plan + +### 3.1 Kani — Immediate (uncomment existing job) + +The ci.yml already has a commented-out Kani job at lines 229-236: + +```yaml +kani: + name: Kani Proofs + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v6 + - uses: model-checking/kani-github-action@v1 + - run: cargo kani -p rivet-core +``` + +**Action:** Uncomment. The `kani-github-action@v1` handles installation. `cargo kani -p rivet-core` runs all 10 harnesses. Estimated CI time: 5-15 minutes depending on solver performance. + +**Considerations:** +- Pin `kani-version` for reproducibility (e.g., `kani-version: '0.50.0'`) +- Add `continue-on-error: false` to block PRs on proof failure +- Consider caching: Kani compiles CBMC which is slow on first run; the action caches internally + +### 3.2 Verus — Medium-term (requires Bazel in CI) + +Two paths: + +**Path A: Bazel in CI (preferred)** +- Add a Bazel CI job that runs `bazel test //verus:rivet_specs_verify` +- Requires: `rules_verus` published, Bazel 8+ on runner, ~10 min setup +- Blocked on: `pulseengine/rules_verus` being a real, working Bazel module + +**Path B: Direct Verus invocation (workaround)** +- Install Verus nightly binary in CI +- Run `verus rivet-core/src/verus_specs.rs` directly +- Problem: the file uses `use crate::...` imports that won't resolve outside cargo + +**Recommendation:** Path A. Park until Bazel is adopted for the project. + +### 3.3 Rocq — Medium-term (requires Bazel + Nix in CI) + +**Path A: Bazel + Nix in CI (preferred)** +- Add a Bazel CI job: `bazel test //proofs/rocq:rivet_metamodel_test` +- Requires: `rules_rocq_rust` published, Nix on runner, Bazel 8+ +- Estimated CI time: 2-5 minutes (Rocq proofs compile fast) + +**Path B: Direct coqc invocation (workaround)** +- Install Rocq/Coq 9.0 via Nix or apt +- Run `coqc -Q proofs/rocq Rivet proofs/rocq/Schema.v proofs/rocq/Validation.v` +- Simpler but non-hermetic + +**Path C: Nix-only CI job (intermediate)** +```yaml +rocq: + name: Rocq Proofs + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v6 + - uses: cachix/install-nix-action@v27 + - run: | + nix-env -iA nixpkgs.coq + coqc -Q proofs/rocq Rivet proofs/rocq/Schema.v + coqc -Q proofs/rocq Rivet proofs/rocq/Validation.v +``` + +**Recommendation:** Path C as an immediate step, Path A when Bazel is adopted. + +--- + +## 4. Prioritized New Proofs to Add + +### Priority 1 — High value, low effort (next sprint) + +| # | Proof | Tool | Rationale | +|---|-------|------|-----------| +| 1 | `proof_linkgraph_build_no_panic` — arbitrary store+schema | Kani | Issue #23 explicitly calls for this; exercises the core graph builder | +| 2 | `proof_schema_merge_no_panic` — arbitrary schema files | Kani | Dedicated panic-freedom proof (current #7 only tests idempotence) | +| 3 | `proof_validate_no_panic` — arbitrary store+schema+graph | Kani | Most critical: validate is the safety-critical function | +| 4 | Complete `zero_violations_implies_satisfied` | Rocq | The only Admitted theorem; requires inductive filter reasoning | +| 5 | Uncomment Kani CI job | CI | Zero-cost, immediate value | + +### Priority 2 — Medium effort, high value (next month) + +| # | Proof | Tool | Rationale | +|---|-------|------|-----------| +| 6 | `proof_detect_cycles_terminates` — DFS terminates for any graph | Kani | Issue #23 scope; verify `has_cycles` terminates | +| 7 | Prove `validation_soundness` | Verus | The highest-value proof: PASS means all rules satisfied | +| 8 | Prove `reachable_sound` + `reachable_complete` | Verus | Reachability correctness for transitive closure | +| 9 | ASPICE full V-model chain (N steps, not just 2) | Rocq | Extend `vmodel_chain_two_steps` to arbitrary chains | +| 10 | Rocq CI job (Path C: Nix-based) | CI | Proves Rocq proofs compile on every PR | + +### Priority 3 — Higher effort, strategic value (quarter) + +| # | Proof | Tool | Rationale | +|---|-------|------|-----------| +| 11 | Prove `coverage_validation_agreement` | Verus | 100% coverage implies no error diags | +| 12 | Conditional rule consistency | Verus | No contradictions when rules co-fire | +| 13 | MODULE.bazel parser panic-freedom | Kani | Issue #23 scope; lower priority since not safety-critical path | +| 14 | Verus CI via Bazel | CI | Requires Bazel adoption | +| 15 | `coq-of-rust` extraction | Rocq | Auto-extract Rocq model from Rust source; experimental | + +--- + +## 5. Timeline Estimate + +| Phase | Duration | Deliverables | +|-------|----------|-------------| +| **Week 1** (immediate) | 1-2 days | Uncomment Kani CI job; add 3 new Kani harnesses (#1-3 above) | +| **Week 2** | 3-5 days | Complete Rocq admitted theorem (#4); add Rocq Nix CI job (#10) | +| **Weeks 3-4** | 1-2 weeks | Kani cycle-termination proof (#6); begin Verus validation_soundness proof (#7) | +| **Month 2** | 2-3 weeks | Complete Verus reachability proofs (#8); ASPICE full chain (#9) | +| **Month 3** | 2-3 weeks | Verus coverage-validation agreement (#11); conditional rules (#12); Bazel CI (#14) | + +**Total estimated effort:** ~6-8 weeks of focused formal verification work. + +--- + +## 6. Can We Run Kani Right Now? + +**In CI: YES.** Uncomment lines 229-236 in `.github/workflows/ci.yml`. The `kani-github-action@v1` handles all installation. The 10 harnesses are self-contained and compile correctly under `#[cfg(kani)]`. + +**Locally: Requires installation.** Kani is not installed on this machine. Install via: +```bash +cargo install --locked kani-verifier +cargo kani setup +cargo kani -p rivet-core +``` + +**Expected outcome:** All 10 harnesses should pass. The harnesses use small bounds (8 bytes, 4 chars, 3 artifacts) to keep solver time manageable. + +--- + +## 7. rules_verus and rules_rocq_rust Integration Status + +### rules_verus (`verus/MODULE.bazel`) +- **Source:** `pulseengine/rules_verus` (GitHub) +- **Commit:** `e2c1600a8cca4c0deb78c5fcb4a33f1da2273d29` +- **Verus version:** `0.2026.02.15` +- **Requirements:** Bazel 8+, Rust nightly (1.82.0, pinned by Verus) +- **Status:** BUILD.bazel and MODULE.bazel are written. Depends on `rules_verus` being a functional Bazel module (may need real implementation work in the `pulseengine/rules_verus` repo). + +### rules_rocq_rust (`proofs/rocq/MODULE.bazel`) +- **Source:** `pulseengine/rules_rocq_rust` (GitHub) +- **Commit:** `6a8da0bd30b5f80f811acefbf6ac5740a08d4a8c` +- **Rocq version:** 9.0 via Nix +- **Requirements:** Bazel 8+, Nix package manager, `rules_nixpkgs_core` 0.13.0 +- **Status:** BUILD.bazel and MODULE.bazel are written. Depends on `rules_rocq_rust` being a functional Bazel module. The Nix dependency adds complexity but ensures hermetic toolchains. + +**Neither Bazel module is likely functional today** — both reference `pulseengine/rules_*` repos that may be stubs or works-in-progress. The Rocq proofs can be verified independently with plain `coqc`; the Verus specs need the Verus toolchain but could potentially be verified with a standalone Verus binary. + +--- + +## 8. Summary Scorecard + +| Metric | Value | +|--------|-------| +| Kani harnesses | 10 (all complete) | +| Verus specs | 6 (3 proved, 3 spec-only) | +| Rocq theorems | 25 (24 Qed, 1 Admitted) | +| **Total proved** | **37** | +| **Total spec'd but not proved** | **4** | +| CI integration | Kani ready (commented out), Verus/Rocq blocked on Bazel | +| Lowest-hanging fruit | Uncomment Kani CI job (5-minute task) | +| Biggest gap | Verus `validation_soundness` — the most important proof, not yet attempted | +| ISO 26262 readiness | Strong foundation; need validation soundness proof for TCL 1 argument | diff --git a/docs/plans/2026-03-16-oslc-analysis.md b/docs/plans/2026-03-16-oslc-analysis.md new file mode 100644 index 0000000..2c783ed --- /dev/null +++ b/docs/plans/2026-03-16-oslc-analysis.md @@ -0,0 +1,490 @@ +# OSLC Ecosystem Analysis and Strategic Recommendation + +**Date:** 2026-03-16 +**Artifacts:** REQ-006, FEAT-011, DD-001 +**Status:** Research complete, recommendation pending decision + +--- + +## 1. What Rivet's OSLC Module Already Implements + +The existing implementation in `rivet-core/src/oslc.rs` (~1870 lines) is a +well-structured OSLC client that covers the core protocol mechanics. Feature-gated +behind `#[cfg(feature = "oslc")]` to isolate the `reqwest` dependency. + +### Implemented (working, tested) + +| Capability | Status | Evidence | +|---|---|---| +| Service Provider Catalog discovery | Complete | `OslcClient::discover()`, wiremock test | +| OSLC Query with `oslc.where` / `oslc.select` | Complete | `OslcClient::query()`, wiremock test | +| Pagination support (next_page parsing) | Partial | Response struct supports it, no auto-follow | +| GET single resource (JSON-LD) | Complete | `OslcClient::get_resource()`, wiremock test | +| POST to creation factory | Complete | `OslcClient::create_resource()`, wiremock test | +| PUT to update resource | Complete | `OslcClient::update_resource()`, wiremock test | +| Basic auth | Complete | `with_basic_auth()`, wiremock test | +| Bearer token auth | Complete | `with_bearer_token()`, wiremock test | +| RM domain types (Requirement) | Complete | `OslcRequirement` struct, full serde | +| QM domain types (TestCase, TestResult) | Complete | `OslcTestCase`, `OslcTestResult` structs | +| CM domain types (ChangeRequest) | Complete | `OslcChangeRequest` struct | +| OSLC-to-Artifact mapping | Complete | `oslc_to_artifact()`, 6 link types mapped | +| Artifact-to-OSLC mapping | Complete | `artifact_to_oslc()`, bidirectional | +| Sync diff computation | Complete | `compute_diff()` with remote/local/modified/unchanged | +| SyncAdapter trait (pull/push/diff) | Complete | `OslcSyncAdapter` implements trait | +| Pull (query -> artifacts) | Complete | Tested with mixed resource types | +| Push (artifacts -> create) | Basic | Always POSTs; no create-vs-update logic | +| Error handling (HTTP codes, malformed JSON) | Complete | 404, 500, malformed JSON/catalog tests | +| JSON-LD @type dispatching | Complete | `parse_member_resource()` with fallback | + +### Not Implemented (gaps for real-world use) + +| Gap | Severity | Notes | +|---|---|---| +| DELETE resource | Minor | Not needed for most sync workflows | +| TRS (Tracked Resource Set) | Major | Required for incremental sync; currently full-pull only | +| OAuth 1.0a (Jazz Form Auth) | **Critical** | IBM ELM uses Jazz form-based auth + OAuth 1.0a, not Basic/Bearer | +| OAuth 2.0 / OIDC | **Critical** | Modern Polarion/codebeamer may require OIDC | +| Resource Shapes validation | Medium | No shape discovery or constraint checking | +| ETag / If-Match concurrency | Medium | No optimistic concurrency control on PUT | +| Pagination auto-follow | Minor | `next_page` is parsed but not auto-followed | +| Delegated Dialogs | Low | Only needed for embedded UI (not Rivet's pattern) | +| OSLC-Core-Version 3.0 header | Minor | Currently sends "2.0"; should be configurable | +| Configuration Management (versions/baselines) | Major | No OSLC Config support for versioned resources | +| ASPICE type mapping | **Critical** | Only maps 4 generic types; no mapping for ASPICE's 14 types | +| Push with diff-based create/update | Medium | Push always creates; no update-existing logic | +| Rate limiting / retry | Minor | No backoff or retry on transient failures | +| RDF/XML content type | Medium | Only JSON-LD; codebeamer requires `application/rdf+xml` | + +--- + +## 2. OSLC Core 3.0 Specification Requirements + +OSLC Core 3.0 (OASIS Standard, August 2021) is an 8-part multi-part specification. +It was designed for backward compatibility with OSLC 2.0 -- most 2.0 servers remain +3.0 compliant without changes. + +### Parts of the Specification + +| Part | Title | Client relevance | +|---|---|---| +| 1 | Overview | Architecture guidance | +| 2 | Discovery | **Required** -- catalog, service providers | +| 3 | Resource Preview | Optional -- compact rendering for UI | +| 4 | Delegated Dialogs | Optional -- embedded creation/selection UI | +| 5 | Attachments | Optional -- binary file handling | +| 6 | Resource Shape | **Important** -- server resource validation | +| 7 | Vocabulary | **Required** -- common terms (dcterms, oslc) | +| 8 | Constraints | Machine-readable shapes | + +### What a Conformant Client MUST Do + +1. Preserve unknown properties between GET and PUT (Rivet's `extra: BTreeMap` field + does this correctly for typed resources, but `serde(flatten)` may drop nested + JSON-LD constructs) +2. Send `OSLC-Core-Version: 2.0` header (currently implemented correctly) +3. Support content negotiation for at least one RDF format + +### What a Conformant Server MUST Do (not Rivet's concern as a client) + +1. Support GET returning RDF +2. Return OSLC-Core-Version header +3. Implement OSLC Core vocabulary + +### Domain Specifications (OASIS Standards) + +| Domain | Version | Key resources | +|---|---|---| +| Requirements Management (RM) | 2.1 | Requirement, RequirementCollection | +| Quality Management (QM) | 2.1 | TestCase, TestResult, TestPlan, TestExecutionRecord | +| Change Management (CM) | 3.0 | ChangeRequest | +| Architecture Management (AM) | 3.0 | Resource (generic) | +| Configuration Management | 1.0 | Versions, baselines, change sets | +| Tracked Resource Set (TRS) | 3.0 | Base + ChangeLog for incremental sync | + +### TRS (Tracked Resource Set) -- Key for Incremental Sync + +TRS is the mechanism for change tracking without polling individual resources. +A TRS provides: +- **Base**: point-in-time enumeration of all tracked resources (LDP Container) +- **Change Log**: ordered series of creation/modification/deletion events +- **Cutoff**: point in the change log already reflected in the base + +A client reads the Base initially, then polls the Change Log for deltas. +This is conceptually similar to git fetch -- get the full state once, +then apply incremental changes. + +**Assessment**: TRS is essential for production OSLC sync but adds significant +complexity. Without TRS, Rivet must do full-pull-and-diff on every sync, +which is acceptable for small/medium artifact sets (< 10,000) but not for +large enterprise deployments. + +--- + +## 3. Real ALM Tools and Their Actual OSLC Support + +### IBM DOORS Next (ELM) -- The Gold Standard + +**OSLC support: Deep, native, original** + +DOORS Next is the reference OSLC implementation. IBM created OSLC. + +- **Domains**: RM (native provider), QM consumer, CM consumer +- **OSLC version**: 2.0 with some 3.0 extensions +- **Auth**: Jazz Form-based auth + OAuth 1.0a (not Basic Auth). + Modern versions support OIDC (OpenID Connect). + The authentication flow is complex: GET protected resource -> receive + `X-com-ibm-team-repository-web-auth-msg: authrequired` header -> + POST credentials to `/j_security_check` -> follow redirects -> get JSESSIONID. + This is significantly more complex than Basic/Bearer auth. +- **TRS**: Supported. IBM uses TRS extensively for cross-tool integration + within the ELM suite. +- **Config Management**: Supported (global configurations, baselines). +- **Practical notes**: DOORS Next is the most OSLC-capable tool, but its + authentication mechanism is the most complex. Many OSLC client libraries + exist for Java (Eclipse Lyo) but very few for Rust/non-Java languages. + Module-level operations (document structure) are not fully exposed via OSLC. + +### Siemens Polarion -- Partial, Growing + +**OSLC support: Partial native, supplemented by OSLC Connect** + +- **Domains**: RM (native), CM (native). **QM not natively supported** -- + requires manual admin configuration to add QM semantics. +- **OSLC version**: 2.0 / partial 3.0 +- **Auth**: Basic Auth or token-based auth. More straightforward than Jazz. +- **REST API**: Polarion has a rich proprietary REST API (actively developed, + version 2512 current) that is more capable than its OSLC endpoint for + many operations. The REST API covers gaps that OSLC does not. +- **OSLC Connect**: Third-party product (SodiusWillert) that provides enhanced + OSLC integration for Polarion, including linking to Jira, DOORS, etc. + This suggests Polarion's native OSLC is not sufficient for cross-tool + integration without middleware. +- **ReqIF**: Fully supported for import/export. +- **Practical notes**: For Polarion integration, the proprietary REST API + is more practical than OSLC for most use cases. OSLC is primarily used + for cross-tool linking with IBM ELM, not as the primary integration method. + +### PTC codebeamer -- Partial, Evolving + +**OSLC support: Provider and consumer, RM and CM domains** + +- **Domains**: RM (provider/consumer), CM (provider/consumer). + QM integration with IBM ETM added in codebeamer 3.1 (August 2025). +- **OSLC version**: 3.0 header required (`OSLC-Core-version: 3.0`) +- **Content type**: Requires `application/rdf+xml` (not JSON-LD!). + This is a significant incompatibility with Rivet's current JSON-LD-only client. +- **Auth**: Basic Auth supported. +- **OSLC usage**: Primarily for linking with Windchill PLM and IBM ELM. + Not the primary integration API for standalone use. +- **ReqIF**: Supported since codebeamer 7.6.0 for interchange with DOORS. +- **REST API**: codebeamer has a comprehensive proprietary REST API that + is the primary integration method for most customers. +- **Practical notes**: codebeamer's OSLC is focused on PLM-to-ALM linking + (PTC ecosystem integration), not on general-purpose data sync. + +### Jama Connect -- External Adapter Only + +**OSLC support: Not native. Requires third-party adapter.** + +- **No native OSLC**: Jama does not implement OSLC natively. +- **OSLC adapter**: Available via Koneksys (oslc-adapter-jama on GitHub), + which wraps Jama's REST API as an OSLC provider. This adapter supports + RM domain and OAuth 1.0a but is a community/third-party project. +- **OSLC Model Connector**: Partnership with MiD/Smartfacts provides + MBSE integration via OSLC, but this is for model-to-requirements linking, + not general artifact sync. +- **REST API**: Jama has a well-documented REST API that is the primary + integration method. +- **Practical notes**: Integrating with Jama via OSLC requires running a + separate adapter service. Direct REST API integration is more practical. + +### Summary: Actual OSLC Implementation Reality + +| Tool | Native OSLC | RM | QM | CM | TRS | Auth Method | Primary API | +|---|---|---|---|---|---|---|---| +| DOORS Next | Yes (deep) | Yes | Yes | Yes | Yes | Jazz OAuth 1.0a / OIDC | OSLC | +| Polarion | Partial | Yes | No* | Yes | No | Basic / Token | Proprietary REST | +| codebeamer | Partial | Yes | Partial | Yes | No | Basic | Proprietary REST | +| Jama Connect | No | Adapter | No | No | No | OAuth 1.0a (adapter) | Proprietary REST | + +*\* QM requires admin-level semantic configuration* + +**Key finding**: Only IBM DOORS Next uses OSLC as its primary integration API. +Polarion, codebeamer, and Jama all have proprietary REST APIs that are more +capable, better documented, and more widely used than their OSLC endpoints. + +--- + +## 4. The ASPICE Type Mapping Problem + +Rivet's OSLC module maps only 4 generic OSLC types: + +| OSLC Type | Rivet artifact type | +|---|---| +| oslc_rm:Requirement | requirement | +| oslc_qm:TestCase | test-case | +| oslc_qm:TestResult | test-result | +| oslc_cm:ChangeRequest | change-request | + +But ASPICE v4.0 defines 14 artifact types that Rivet tracks: + +| ASPICE Type | ASPICE Process | OSLC Mapping? | +|---|---|---| +| stakeholder-req | SYS.1 | oslc_rm:Requirement (lossy) | +| system-req | SYS.2 | oslc_rm:Requirement (lossy) | +| system-arch-component | SYS.3 | No OSLC equivalent | +| sw-req | SWE.1 | oslc_rm:Requirement (lossy) | +| sw-arch-component | SWE.2 | No OSLC equivalent | +| sw-detail-design | SWE.3 | No OSLC equivalent | +| unit-verification | SWE.4 | oslc_qm:TestCase (lossy) | +| sw-integration-verification | SWE.5 | oslc_qm:TestCase (lossy) | +| sw-verification | SWE.6 | oslc_qm:TestCase (lossy) | +| sys-integration-verification | SYS.4 | oslc_qm:TestCase (lossy) | +| sys-verification | SYS.5 | oslc_qm:TestCase (lossy) | +| verification-execution | -- | oslc_qm:TestResult (partial) | +| verification-verdict | -- | oslc_qm:TestResult (partial) | + +**Fundamental problem**: OSLC's 4 resource types cannot represent ASPICE's +14-type V-model without information loss. The `aspice-process` field, +`method` field (automated-test, review, formal-verification, etc.), and the +detailed link types (derives-from, allocated-from, refines, verifies, +result-of, part-of-execution) have no OSLC-standard representation. + +Workaround options: +1. Use `dcterms:type` or custom RDF properties to carry ASPICE type info + (non-standard, tool-specific) +2. Map everything to generic Requirement/TestCase and lose type granularity +3. Use ReqIF instead, which preserves arbitrary attribute schemas + +--- + +## 5. Eclipse SCORE and the Competitive Context + +Eclipse SCORE (Safe Open Vehicle Core) is the primary adoption target for Rivet. + +**SCORE's tooling approach**: +- **Documentation**: Sphinx + sphinx-needs (docs-as-code) +- **Requirements format**: `needs.json` (sphinx-needs export format) +- **Metamodel**: 50+ need types defined in `metamodel.yaml` +- **No OSLC usage**: SCORE does not use OSLC for tool synchronization +- **No ReqIF usage**: SCORE uses `needs.json` as its interchange format +- **Version control**: Git-based, plain text files + +**Implication for Rivet**: The SCORE adoption opportunity (the strategic priority +per Rivet's roadmap) requires `needs.json` import (already implemented), not OSLC. +SCORE projects will not have Polarion or DOORS to sync with -- they use +sphinx-needs and git. + +--- + +## 6. Alternative Approaches + +### A. ReqIF Interchange (Already Implemented) + +Rivet already has a working ReqIF 1.2 adapter (`rivet-core/src/reqif.rs`). + +**Strengths**: +- Universal support: DOORS, Polarion, codebeamer all import/export ReqIF +- File-based: works across organizational boundaries (no server access needed) +- Preserves arbitrary attributes via ReqIF SPEC-OBJECT-TYPE/ATTRIBUTE +- Git-friendly when stored as XML files +- OMG standard with strong automotive industry adoption +- No authentication complexity + +**Weaknesses**: +- Not real-time; requires explicit export/import cycles +- No incremental sync (full document exchange) +- XML verbosity +- Tool-specific attribute mapping still needed + +**Assessment**: ReqIF is the proven interchange format for cross-company +requirements exchange in automotive. Every major ALM tool supports it. It is +the right format for "send requirements to supplier / receive from customer" +workflows, which is the dominant integration pattern in ASPICE processes. + +### B. needs.json Import (Already Implemented) + +Rivet already has a working needs.json adapter (`rivet-core/src/formats/needs_json.rs`). + +**Strengths**: +- Direct path to Eclipse SCORE adoption +- JSON format, lightweight, git-friendly +- Preserves sphinx-needs-specific metadata +- No server infrastructure required + +**Assessment**: This is the highest-priority integration for the SCORE adoption +play. Already implemented. + +### C. Direct REST Adapters Per Tool + +The approach OSLC was supposed to replace. Build specific adapters for +each tool's proprietary REST API. + +**Strengths**: +- Full access to tool-specific features +- Better documentation and community support +- More reliable authentication +- Can leverage tool-specific SDKs + +**Weaknesses**: +- N adapters for N tools (maintenance burden) +- API versioning and deprecation risk +- Each adapter is a significant effort + +**Assessment**: Pragmatic but not scalable. Should be considered only for +the 1-2 tools with the highest demand from actual users. + +### D. Git-Based Sync (Export/Import YAML) + +Rivet's native format is YAML in git. The simplest "sync" is: +1. Export YAML from Rivet +2. Commit to a shared git repo +3. Other tools import from the repo (via adapter) + +**Strengths**: +- Zero infrastructure +- Full Rivet fidelity (no lossy mapping) +- Git provides history, branching, merge +- Already how SCORE projects work + +**Assessment**: This is actually the dominant real-world pattern for +docs-as-code projects. It works when the "other tool" can import YAML/JSON. + +--- + +## 7. Strategic Assessment + +### The DD-001 Decision Should Be Revisited + +DD-001 ("OSLC over per-tool REST adapters") was based on the premise that +"Polarion, DOORS, and codebeamer already support [OSLC] natively" and that +"one adapter handles all tools." + +**This premise is only partly true**: + +1. Only DOORS Next has deep, native OSLC support. Polarion and codebeamer + have partial OSLC that is secondary to their proprietary REST APIs. + +2. OSLC's 4 resource types cannot represent ASPICE's 14-type V-model + without information loss. ReqIF can. + +3. OSLC authentication (Jazz OAuth, OIDC) is significantly more complex + than the Basic/Bearer auth currently implemented. A production OSLC + client for DOORS Next requires implementing Jazz form-based auth, + which involves multi-step HTTP redirects, cookie management, and + CSRF token handling. + +4. The OSLC ecosystem has low adoption: research indicates only ~20% of + the 25 most prevalent ALM tools have any OSLC capability. + +5. Eclipse SCORE (Rivet's strategic adoption target) does not use OSLC. + +### The Existing OSLC Code Is Not Wasted + +The OSLC module is well-architected and serves as a solid foundation: +- The `SyncAdapter` trait is useful regardless of protocol +- The mapping layer (`oslc_to_artifact`, `artifact_to_oslc`) demonstrates + the pattern for any adapter +- The `SyncDiff` computation is protocol-independent +- The test infrastructure (wiremock-based) is exemplary + +### Cost/Benefit Analysis + +| Investment | Benefit | Priority | +|---|---|---| +| Complete OSLC client (Jazz auth, TRS, RDF/XML, Config Mgmt) | Connect to DOORS Next | Low -- only ~10% of target users have DOORS Next | +| ReqIF import/export (already done) | Exchange with any major ALM tool | **Already shipped** | +| needs.json import (already done) | SCORE adoption | **Already shipped** | +| Polarion REST adapter | Direct Polarion sync | Medium -- if user demand materializes | +| codebeamer REST adapter | Direct codebeamer sync | Medium -- if user demand materializes | + +--- + +## 8. Recommendation + +### Near-Term (Phase 2-3): Keep OSLC as-is, invest elsewhere + +1. **Do not invest further in OSLC** until a concrete customer/user requires + DOORS Next integration. The existing OSLC module is a working prototype + that demonstrates the architecture. It is sufficient for the "we support + OSLC" checkbox. + +2. **Invest in ReqIF robustness** -- this is the universal interchange format + that every major tool supports. Ensure round-trip fidelity with DOORS, + Polarion, and codebeamer exports. Add attribute-type-to-ASPICE-type mapping. + +3. **Invest in needs.json quality** -- this is the SCORE adoption path. + Ensure that SCORE's 50+ need types import cleanly with full link preservation. + +4. **Ship the WASM adapter runtime (FEAT-012)** -- this allows users to write + their own tool-specific adapters without Rivet needing to maintain them. + A WASM adapter for Polarion's REST API or codebeamer's REST API can be + contributed by the community. + +### Medium-Term: Demote DD-001, promote adapter trait + +5. **Update DD-001 status** from `approved` to `superseded` with rationale: + "OSLC remains supported but is not the primary integration strategy. + ReqIF for interchange, needs.json for SCORE, and WASM adapters for + tool-specific REST APIs provide better coverage with less complexity." + +6. **Rename REQ-006 priority** from `should` to `could`. OSLC sync is + a nice-to-have, not a must-have, given the real-world tool landscape. + +### Long-Term: OSLC When Demanded + +7. If a customer requires DOORS Next integration: + - Implement Jazz form-based authentication + - Add TRS support for incremental sync + - Add RDF/XML content negotiation + - Add OSLC Configuration Management for baseline support + - This is a 4-6 week effort for a production-quality client + +8. If a customer requires Polarion or codebeamer integration: + - Build a direct REST adapter (not OSLC) + - Package it as a WASM component + - This is a 1-2 week effort per tool + +### Priority Order for Integration Work + +``` +1. needs.json import [done] -- SCORE adoption +2. ReqIF import/export [done] -- Universal interchange +3. WASM adapter runtime [planned] -- Extensibility +4. Polarion REST adapter [on demand] -- If users request +5. codebeamer REST adapter [on demand] -- If users request +6. OSLC + DOORS Next [on demand] -- If users request +``` + +--- + +## Sources + +### OSLC Specifications +- [OSLC Core 3.0 Overview](https://docs.oasis-open-projects.org/oslc-op/core/v3.0/oslc-core.html) +- [OSLC Specifications Index](https://open-services.net/specifications/) +- [OSLC TRS 3.0](https://docs.oasis-open-projects.org/oslc-op/trs/v3.0/tracked-resource-set.html) +- [OSLC Core 3.0 Changes](https://docs.oasis-open-projects.org/oslc-op/core-3.0-changes/v1.0/pn01/core-3.0-changes.html) + +### Tool OSLC Support +- [DOORS Next OSLC Services](https://www.ibm.com/docs/en/engineering-lifecycle-management-suite/doors-next/7.0.3?topic=function-extending-doors-next-by-using-oslc-services) +- [Polarion REST API](https://developer.siemens.com/polarion/rest-api-spec.html) +- [Polarion OSLC Services](https://docs.sodiuswillert.com/oslc-connect/latest/polarion-oslc-services-for-your-configuration) +- [Codebeamer OSLC Server Guide](https://support.ptc.com/help/codebeamer/r2.1/en/codebeamer/cb_dpt_integration/26334048.html) +- [Jama OSLC Adapter](https://github.com/OSLC/oslc-adapter-jama) + +### Authentication +- [Jazz Authentication Wiki](https://github.com/OSLC/oslc-client/wiki/JazzAuthentication) +- [Jazz OAuth discussion](https://jazz.net/forum/questions/268230/oslc-rootservices-with-oauth1-vs-oauth2) + +### Industry Analysis +- [OSLC Challenges (Jama Software)](https://www.jamasoftware.com/blog/oslc-what-is-it-and-what-are-its-challenges/) +- [OSLC as ALM Integration Standard](https://mgtechsoft.com/blog/what-is-open-services-lifecycle-collaboration/) +- [ReqIF vs OSLC comparison](https://www.se-trends.de/en/reqif-and-systems-engineering/) +- [ReqIF overview (Visure)](https://visuresolutions.com/alm-guide/reqif/) + +### Eclipse SCORE +- [S-CORE Documentation](https://eclipse-score.github.io/score/main/) +- [Eclipse S-CORE GitHub](https://github.com/eclipse-score) +- [Sphinx-Needs](https://www.sphinx-needs.com/) +- [Open-Needs](https://open-needs.org/) diff --git a/docs/plans/2026-03-16-rowan-salsa-completion.md b/docs/plans/2026-03-16-rowan-salsa-completion.md new file mode 100644 index 0000000..c0817fa --- /dev/null +++ b/docs/plans/2026-03-16-rowan-salsa-completion.md @@ -0,0 +1,393 @@ +# rowan + salsa Incremental Validation: Completion Plan + +**Issue:** [#22 — rowan + salsa incremental validation architecture](https://github.com/pulseengine/rivet/issues/22) +**Date:** 2026-03-16 +**Artifacts:** REQ-028, REQ-029, DD-023, DD-024, FEAT-046, FEAT-047, FEAT-048 + +--- + +## 1. Current State Assessment + +### 1.1 What Works Today + +**Layer 1 — rowan CST (MODULE.bazel only):** +- `rivet-core/src/bazel.rs` — Complete rowan-based parser for MODULE.bazel Starlark subset +- ~20 `SyntaxKind` variants (tokens + composite nodes + error recovery) +- Hand-written lexer + recursive-descent parser emitting `GreenNode` +- HIR extraction (`BazelModule`) from CST: `module()`, `bazel_dep()`, overrides +- Error recovery — `load()`, variable assignment, dotted expressions wrapped as `Error` nodes +- 10 tests covering lexer, CST, HIR, error recovery, and realistic input + +**Layer 2 — salsa incremental database:** +- `rivet-core/src/db.rs` — Complete salsa 0.26 database with: + - **Inputs:** `SourceFile` (path + content), `SchemaInput` (name + content), set containers for both + - **Tracked queries:** `parse_artifacts(file)`, `validate_all(sources, schemas)`, `evaluate_conditional_rules(sources, schemas)` + - **Helper functions (non-tracked):** `build_pipeline`, `build_store`, `build_schema` + - **Concrete database:** `RivetDatabase` with `load_schemas()`, `load_sources()`, `update_source()`, `store()`, `schema()`, `diagnostics()`, `conditional_diagnostics()` +- CLI integration: `rivet validate --incremental` and `--verify-incremental` flags +- Verification mode: SC-11 parity check comparing incremental vs sequential pipelines +- 16 database tests covering: empty DB, unlinked/linked artifacts, incremental updates, determinism, add/remove artifacts, conditional rules, rule composition, consistency checks + +**Layer 3 — Supporting infrastructure:** +- `yaml_edit.rs` — Indentation-aware YAML editor (lossless roundtrip for mutations) +- `impact.rs` — Change impact analysis via BFS on link graph +- `validate.rs` — 8-phase validation: `validate_structural()` (phases 1-7) + conditional rules (phase 8) +- `store.rs` — In-memory artifact store with type indexing +- `links.rs` — petgraph-based link graph with forward/backward/broken links + +### 1.2 What Does NOT Work Yet + +1. **CLI does not default to incremental** — `--incremental` is opt-in, sequential pipeline is default +2. **Only `generic-yaml` format supported in salsa path** — STPA, AADL, ReqIF, needs-json formats are skipped +3. **No rowan CST for artifact/schema YAML files** — parsing goes through serde_yaml, no span information +4. **No source location on diagnostics** — `Diagnostic` struct has no file path, line, or column +5. **Store and LinkGraph lack `PartialEq`** — cannot be salsa tracked return types (noted in db.rs comments) +6. **`build_store` and `build_schema` are non-tracked** — pipeline runs twice in `validate_all` + `evaluate_conditional_rules` +7. **No file watching** — incremental database is created fresh each invocation, no warm-cache reuse +8. **No LSP server** — no `tower-lsp` or `lsp-server` integration +9. **No per-file artifact granularity in salsa** — `parse_artifacts` returns `Vec`, not individually tracked artifacts +10. **Impact analysis does not use salsa** — `impact.rs` uses separate diff-based approach + +--- + +## 2. Architecture: LSP-Ready State + +### 2.1 Target Query Graph + +``` + ┌─────────────────┐ + │ File Watcher │ (notify crate) + │ or LSP client │ + └────────┬────────┘ + │ set_content(path, text) + ▼ +┌────────────────────────────────────────────────────────────┐ +│ RivetDatabase │ +│ │ +│ ┌──────────────┐ ┌──────────────┐ │ +│ │ SourceFile │ │ SchemaInput │ salsa::input │ +│ │ path+content │ │ name+content │ │ +│ └──────┬───────┘ └──────┬───────┘ │ +│ │ │ │ +│ ▼ ▼ │ +│ ┌──────────────┐ ┌──────────────┐ │ +│ │parse_artifact│ │ parse_schema │ salsa::tracked │ +│ │ s(file) │ │ (file) │ (per-file) │ +│ └──────┬───────┘ └──────┬───────┘ │ +│ │ │ │ +│ ▼ ▼ │ +│ ┌──────────────┐ ┌──────────────┐ │ +│ │ ArtifactSet │ │MergedSchema │ salsa::tracked │ +│ │ (all files) │ │ (all schemas)│ (aggregates) │ +│ └──────┬───────┘ └──────┬───────┘ │ +│ │ │ │ +│ └───────┬───────────┘ │ +│ ▼ │ +│ ┌─────────────┐ │ +│ │ LinkGraph │ salsa::tracked │ +│ │(+broken refs)│ │ +│ └──────┬──────┘ │ +│ │ │ +│ ┌────────┼────────┐ │ +│ ▼ ▼ ▼ │ +│ ┌──────────┐ ┌─────┐ ┌──────────────┐ │ +│ │structural│ │trace│ │ conditional │ salsa::tracked │ +│ │validation│ │rules│ │ rules │ (per-category) │ +│ └────┬─────┘ └──┬──┘ └──────┬───────┘ │ +│ │ │ │ │ +│ └──────────┼───────────┘ │ +│ ▼ │ +│ ┌──────────────┐ │ +│ │ all_diagnost │ salsa::tracked │ +│ │ ics() │ (top-level) │ +│ └──────┬───────┘ │ +│ │ │ +│ ▼ │ +│ ┌──────────────┐ │ +│ │ Diagnostic │ With file path, │ +│ │ + SourceLoc │ line:col spans │ +│ └──────────────┘ │ +└────────────────────────────────────────────────────────────┘ + │ + ▼ + ┌─────────────────────────┐ + │ LSP / CLI / Dashboard │ + │ textDocument/ │ + │ publishDiagnostics │ + └─────────────────────────┘ +``` + +### 2.2 Key Architectural Properties + +- **Single database instance** shared across CLI (watch mode), dashboard (serve), and LSP +- **File-level granularity** for invalidation: changing one YAML file re-parses only that file +- **Per-category validation** tracked separately: structural, traceability, conditional rules +- **Diagnostic spans** traceable to source locations via rowan TextRange or line:col pairs +- **Same DB serves all consumers**: `rivet validate`, `rivet serve`, `rivet lsp` + +--- + +## 3. Remaining Work Items + +### Phase A: Make Incremental the Default (no rowan required) + +**A1. Derive `PartialEq` for Store and LinkGraph** (~1 day) +- Add `#[derive(PartialEq)]` to `Store`, `LinkGraph`, and all transitive types +- For `LinkGraph`: need `PartialEq` on `ResolvedLink`, `Backlink` (already `Clone + Debug`) +- petgraph `DiGraph` does not implement `PartialEq`; either skip the graph field from comparison or store it separately +- This unblocks making `build_store` and `build_schema` into salsa tracked functions, eliminating the duplicate pipeline execution noted in db.rs + +**A2. Lift `build_store` and `build_schema` to tracked functions** (~0.5 day) +- Currently non-tracked helpers called from both `validate_all` and `evaluate_conditional_rules` +- With PartialEq on return types, these become `#[salsa::tracked]` functions +- Eliminates redundant store/schema assembly on every validation call + +**A3. Add all adapter formats to the salsa path** (~2 days) +- Currently `cmd_validate_incremental` skips sources with format != "generic" / "generic-yaml" +- Need to route STPA, AADL, ReqIF, needs-json through the salsa database +- Strategy: add a `parse_artifacts_with_adapter(db, source, format)` tracked function that dispatches to the correct adapter +- Each adapter's parse result (Vec) is cached per file + +**A4. Source location on Diagnostic** (~1 day) +- Add `source_file: Option` and `source_line: Option` to `Diagnostic` +- Populate from `Artifact::source_file` (already tracked) and serde_yaml error positions +- No rowan needed — serde_yaml provides line numbers for parse errors; artifact `source_file` provides file paths +- Validate pipeline propagates source_file through to diagnostics + +**A5. Remove `--incremental` flag, make salsa the default** (~0.5 day) +- Keep `--verify-incremental` as a safety check for transition period +- Add `--no-incremental` escape hatch (legacy sequential mode) +- Update CLI tests to use the salsa path +- Gate behind a transition period: `--incremental` prints deprecation notice + +**A6. File-watching warm-cache mode for CLI** (~2 days) +- Add `rivet validate --watch` using the `notify` crate +- Holds `RivetDatabase` in memory between file changes +- On file change: `db.update_source(path, new_content)` then `db.diagnostics()` +- Demonstrates the core incremental value: sub-ms revalidation on warm cache +- This is the simplest proof of salsa's value proposition + +**A7. Integrate salsa into `rivet serve` dashboard** (~2 days) +- Dashboard currently reloads all artifacts on each request +- Share `RivetDatabase` behind an `Arc>` in the axum state +- File watcher updates the database; handlers read cached results +- Validation page, stats, graph all query the same warm database + +### Phase B: rowan CST for Artifact YAML Files + +**B1. Design YAML CST SyntaxKind enum** (~1 day) +- Define `YamlSyntaxKind` covering the subset Rivet uses: + - `Root`, `Document`, `MappingEntry`, `SequenceEntry`, `BlockMapping`, `FlowSequence` + - `Key`, `Colon`, `Value`, `Dash`, `Indent`, `Comment`, `Newline`, `ScalarValue`, `StringLit` + - `ArtifactsBlock`, `ArtifactEntry` (higher-level composite nodes) + - `Error` for recovery +- Define `YamlLanguage` implementing `rowan::Language` + +**B2. YAML lexer** (~2 days) +- Indentation-sensitive tokenizer for the artifact YAML subset +- Must handle: block scalars (`>`, `|`), flow sequences (`[...]`), quoted strings, comments +- Produce tokens with exact byte positions for span tracking +- This is more complex than the MODULE.bazel lexer due to indentation significance + +**B3. YAML recursive-descent parser** (~3 days) +- Build `GreenNode` CST from token stream +- Error recovery: partial parse on malformed YAML still produces usable CST +- Must produce `ArtifactEntry` nodes for each `- id:` block +- Span information on every node enables diagnostic-quality error reporting + +**B4. Replace serde_yaml parsing with rowan CST path** (~2 days) +- `parse_generic_yaml()` currently uses serde_yaml +- New path: lex -> parse -> CST -> HIR extraction (Artifact structs) +- HIR extraction walks the CST to produce the same `Vec` output +- serde_yaml path kept as fallback; feature flag or runtime switch + +**B5. Schema file rowan CST** (~2 days) +- Separate `SchemaSyntaxKind` or reuse `YamlSyntaxKind` for schema YAML +- Schema files have a known structure; CST enables better error messages for malformed schemas +- Lower priority than artifact CST — schemas change infrequently + +**B6. Wire rowan spans into Diagnostic** (~1 day) +- `Diagnostic` gains `text_range: Option` field +- Validation functions that detect errors from CST-parsed artifacts attach the span +- Conversion to line:col via a line index computed from source text + +### Phase C: LSP Server + +**C1. Add `tower-lsp` dependency and server skeleton** (~1 day) +- New crate `rivet-lsp` or module in `rivet-cli` +- Implement `LanguageServer` trait with stub handlers +- `rivet lsp` subcommand starts the server on stdin/stdout + +**C2. `textDocument/didOpen` and `textDocument/didChange`** (~1 day) +- On open: load file content into `RivetDatabase` via `SourceFile::new()` +- On change: `db.update_source(path, new_content)` +- Trigger revalidation, publish diagnostics + +**C3. `textDocument/publishDiagnostics`** (~1 day) +- Convert `Diagnostic` (with source location) to LSP `Diagnostic` +- Map `Severity::Error` -> `DiagnosticSeverity::Error`, etc. +- Publish to client after each revalidation + +**C4. `textDocument/completion`** (~2 days) +- Artifact ID completion in `links:` target fields +- Artifact type completion in `type:` fields +- Link type completion in `links:` type fields +- Schema-aware: only suggest valid target types for a given link type + +**C5. `textDocument/hover`** (~1 day) +- Hover over artifact ID in `links:` shows target artifact summary +- Hover over link type shows schema description +- Hover over artifact type shows type definition + +**C6. `textDocument/definition`** (~1 day) +- Go-to-definition on artifact IDs in link targets +- Jump to the `- id:` line in the target artifact's source file +- Requires `source_file` to be populated on all artifacts + +**C7. VS Code extension packaging** (~2 days) +- Extension activates for `*.yaml` files in projects with `rivet.yaml` +- Ships `rivet` binary, starts `rivet lsp` +- Configuration: schema path, project root +- This is the commercial value play for Eclipse SCORE adoption + +### Phase D: Advanced Incremental Features + +**D1. Per-artifact salsa tracking** (~2 days) +- Currently `parse_artifacts()` returns `Vec` — the entire file +- Add `#[salsa::tracked]` struct `TrackedArtifact` wrapping individual artifacts +- `parse_artifacts()` returns `Vec` +- Changing one artifact in a multi-artifact file only invalidates that artifact's downstream queries +- Requires careful design: tracked structs need stable identity across re-parses + +**D2. Salsa-powered impact analysis** (~1 day) +- Replace `impact.rs` BFS approach with salsa query invalidation +- "What is impacted by changing file X?" = "What salsa queries would re-execute?" +- salsa does not currently expose its dependency graph externally; may need to track this manually +- Alternative: keep BFS impact but feed it from the salsa-cached link graph + +**D3. Cross-repo incremental validation** (~3 days) +- `externals.rs` already supports cross-repo artifact linking +- External artifacts become `SourceFile` inputs with a "repo:path" key +- File watching spans multiple repo directories +- Invalidation boundary: external changes re-trigger local link validation but not external schema validation + +**D4. Document validation in salsa** (~1 day) +- `validate_documents()` checks `[[ID]]` references in markdown documents +- Add `DocumentInput` salsa input and `validate_document_refs()` tracked function +- Document changes only revalidate affected document references + +--- + +## 4. Migration Strategy + +### Principle: No Breaking Changes + +Every phase is additive. The sequential pipeline (`validate()` in validate.rs) remains functional throughout. The salsa path produces identical results (verified by SC-11). + +### Phase Ordering and Dependencies + +``` +Phase A (Foundation) Phase B (rowan YAML) Phase C (LSP) +━━━━━━━━━━━━━━━━━━━ ━━━━━━━━━━━━━━━━━━ ━━━━━━━━━━━━ +A1 PartialEq ──┐ B1 SyntaxKind enum C1 server skeleton +A2 tracked fn ─┤ B2 YAML lexer ──┐ C2 didOpen/didChange +A3 all adapters│ B3 YAML parser ─┤ C3 publishDiagnostics +A4 source locs ┤ B4 replace serde┤ C4 completion +A5 default ────┤ B5 schema CST │ C5 hover +A6 --watch ────┤ B6 spans ────────┘ C6 definition +A7 serve ──────┘ C7 VS Code ext + +A must complete ──→ B can start (B needs stable salsa infra) +A must complete ──→ C can start (C needs source locations) +B6 feeds ─────────→ C3 (spans make LSP diagnostics precise) +``` + +### Transition Timeline + +| Milestone | Items | Key Deliverable | +|-----------|-------|----------------| +| **v0.2.0** | A1-A5 | Incremental is default; all adapters supported | +| **v0.3.0** | A6-A7, B1-B3 | Watch mode, dashboard integration, YAML CST prototype | +| **v0.4.0** | B4-B6, C1-C3 | rowan-parsed artifacts, basic LSP with diagnostics | +| **v0.5.0** | C4-C7, D1-D2 | Full LSP, VS Code extension, per-artifact tracking | +| **v1.0.0** | D3-D4 | Cross-repo incremental, document validation | + +--- + +## 5. New Artifacts Needed + +### Requirements + +| ID | Title | Rationale | +|----|-------|-----------| +| REQ-033 | Source location tracking on validation diagnostics | Diagnostics without file:line are unusable for LSP and large projects | +| REQ-034 | File-watching incremental validation mode | Core value proposition of salsa — sub-ms revalidation on file change | +| REQ-035 | YAML lossless CST with rowan | Enables error recovery, span-based diagnostics, and LSP features | +| REQ-036 | Language Server Protocol support for artifact YAML | IDE integration for SCORE adoption; commercial VS Code extension play | + +### Design Decisions + +| ID | Title | Rationale | +|----|-------|-----------| +| DD-031 | YAML CST SyntaxKind design for artifact/schema files | The YAML subset used by Rivet is small enough for a hand-written parser but needs indentation sensitivity | +| DD-032 | tower-lsp over lsp-server for Rivet LSP | tower-lsp is async and composable with axum; lsp-server is sync. spar could go either way but Rivet already has tokio | +| DD-033 | notify crate for file watching in CLI and dashboard | Cross-platform file watching; same approach as rust-analyzer | + +### Features + +| ID | Title | Links | +|----|-------|-------| +| FEAT-053 | Source locations on all validation diagnostics | satisfies REQ-033 | +| FEAT-054 | `rivet validate --watch` with warm salsa cache | satisfies REQ-034, implements DD-024 | +| FEAT-055 | YAML rowan CST parser for artifact files | satisfies REQ-035, implements DD-031 | +| FEAT-056 | `rivet lsp` Language Server Protocol implementation | satisfies REQ-036, implements DD-032 | +| FEAT-057 | VS Code extension for Rivet artifact YAML | satisfies REQ-036 | +| FEAT-058 | salsa integration in `rivet serve` dashboard | satisfies REQ-029, implements DD-024 | + +### STPA Artifacts (extend existing) + +The following existing STPA artifacts already cover incremental validation safety: +- **H-9:** Incremental validation returns stale results (hazard) +- **SC-11:** Incremental must equal full validation (safety constraint) +- **UCA-C-10..C-14:** Incremental validation UCAs +- **CC-C-10..C-14:** Causal chain for incremental errors + +New loss scenario needed: +- **LS-NEW:** LSP server publishes stale diagnostics after rapid file edits due to race between file watcher and salsa query execution + +--- + +## 6. Risks and Mitigations + +| Risk | Impact | Mitigation | +|------|--------|------------| +| YAML indentation parsing is hard to get right | B2-B3 take longer than estimated | Start with a minimal subset (artifact files only), reuse yaml_edit.rs knowledge | +| salsa 0.26 API instability | API changes break db.rs | Pin to exact version; salsa 0.26 is stable (used by spar, rust-analyzer uses fork) | +| PartialEq on petgraph DiGraph | A1 blocked — DiGraph has no PartialEq | Store the graph behind an opaque wrapper; compare by node/edge counts or skip graph from PartialEq | +| Per-artifact tracking identity issues | D1 correctness — re-parse changes artifact identity | Use artifact ID as the salsa identity key; handle ID renames as remove+add | +| LSP adoption requires VS Code extension | C7 is cross-ecosystem work (TypeScript) | Start with a minimal extension; existing YAML LSP extensions can be adapted | + +--- + +## 7. Success Criteria + +1. **`rivet validate` defaults to incremental** — no `--incremental` flag needed +2. **SC-11 always passes** — `--verify-incremental` confirms parity on every CI run +3. **`rivet validate --watch`** shows sub-10ms revalidation on file change (warm cache) +4. **`rivet serve` dashboard** uses shared salsa database — no full reload per request +5. **`rivet lsp`** publishes diagnostics with file:line:col accuracy +6. **VS Code extension** provides completion, hover, and go-to-definition for artifact YAML +7. **All adapter formats** (generic, stpa, aadl, reqif, needs-json) route through salsa + +--- + +## 8. Relation to Phase 3 Workstreams + +This plan is workstream 7 ("rowan + salsa incremental validation") from the +[Phase 3 parallel workstreams design](2026-03-14-phase3-parallel-workstreams-design.md). + +It intersects with: +- **Workstream 3** (CLI mutation safety) — mutations should invalidate the salsa cache +- **Workstream 4** (dashboard) — shared salsa database for serve +- **Workstream 6** (cross-repo) — external artifacts as salsa inputs +- **Workstream 8** (SCORE adoption) — LSP/VS Code extension is a key selling point diff --git a/docs/plans/2026-03-16-stpa-sec-analysis.md b/docs/plans/2026-03-16-stpa-sec-analysis.md new file mode 100644 index 0000000..c9da12d --- /dev/null +++ b/docs/plans/2026-03-16-stpa-sec-analysis.md @@ -0,0 +1,867 @@ +# STPA and STPA-Sec Fresh Analysis Report + +**Date:** 2026-03-16 +**Scope:** Full Rivet codebase at current state (branch `feat/compound-layout`) +**Method:** Fresh STPA (Steps 1-4) + STPA-Sec extension + OSLC lifecycle lens + +--- + +## Executive Summary + +This report presents a fresh STPA analysis performed against the current Rivet codebase, comparing findings against the existing STPA artifacts in `safety/stpa/`. The analysis identified: + +- **0 new losses** needed (existing 6 losses remain complete) +- **5 new hazards** (H-13 through H-17) for components added since the last analysis +- **5 new system constraints** (SC-15 through SC-19) +- **3 new controllers** added to the control structure (CTRL-EXPORT, CTRL-YAML-EDIT, CTRL-SALSA) +- **15 new UCAs** across export, document rendering, commit traceability, and WASM runtime +- **14 loss scenarios** needed (8 for existing UCAs lacking them + 6 for new UCAs) +- **6 STPA-Sec findings** with missing mitigations +- **3 OSLC lifecycle gaps** + +--- + +## Part 1: Fresh STPA Analysis + +### Step 1: Loss Completeness Review + +The existing 6 losses were evaluated against all new capabilities: + +| Loss | Title | Still Complete? | Notes | +|------|-------|-----------------|-------| +| L-1 | Loss of traceability integrity | Yes | Export, markdown rendering, and document validation all create derivative representations of traceability data. HTML export integrity falls under L-1 (incorrect traceability data in reports) and L-2 (compliance evidence). | +| L-2 | Loss of compliance evidence | Yes | HTML exports used as audit evidence are covered: an incorrect export is misleading compliance evidence. | +| L-3 | Loss of data sovereignty | Yes | YAML mutation and cross-repo sync are data sovereignty concerns. | +| L-4 | Loss of engineering productivity | Yes | Salsa cache staleness and build-system misparsing waste engineering time. | +| L-5 | Loss of safety assurance | Yes | All new hazards eventually chain to L-5 through traceability gaps. | +| L-6 | Loss of audit trail | Yes | YAML mutation without attribution is an audit trail concern. | + +**Verdict: No new losses needed.** The existing 6 losses adequately cover all new capabilities. The requested losses for "export integrity," "incremental validation correctness," and "formal proof validity" are already subsumed by L-1, L-2, and L-5 respectively. + +### Step 2: New Hazards + +The following hazards were identified for code added since the last STPA analysis. Note: the user requested IDs H-13+ but some of the proposed hazards (XSS via markdown, salsa staleness, build-system misparsing) overlap with existing hazards. Where overlap exists, I note the existing hazard and add only genuinely new hazards. + +**Already covered by existing hazards:** + +- "salsa cache returns stale results after schema change" -- **H-9** and sub-hazards H-9.1, H-9.2 +- "build-system provider misidentifies dependency versions" -- **H-11** and sub-hazards H-11.1, H-11.2 +- "impact analysis misses transitively affected artifacts" -- subsumable under **H-1** (stale cross-references) and **H-3** (incorrect coverage metrics) + +**New hazards:** + +```yaml +# Proposed additions to safety/stpa/hazards.yaml + + - id: H-13 + title: Rivet document renderer produces HTML containing unescaped artifact content + description: > + The markdown-to-HTML renderer in document.rs processes artifact + descriptions, titles, and field values into HTML output for the + dashboard and document views. If artifact content contains HTML + entities, script tags, or event handler attributes that are not + properly escaped, the rendered output enables cross-site scripting + (XSS) in the dashboard. In a worst-case environment where the + dashboard is used during an audit review or shared via screen + recording, injected scripts could alter displayed traceability + data, exfiltrate session information, or modify the visual + presentation of compliance metrics. + losses: [L-1, L-2, L-3] + + - id: H-14 + title: Rivet WASM adapter executes untrusted code that corrupts import results + description: > + The WASM runtime (wasm_runtime.rs) loads and executes third-party + adapter components. A compromised or buggy WASM adapter could + return fabricated artifacts, modify link targets, inject additional + artifacts, or silently drop artifacts during import. The host trusts + the adapter's output without independent verification against the + source data. In a worst-case environment where adapters are + distributed as binary components without source review, a supply + chain attack could introduce falsified traceability data. + losses: [L-1, L-3, L-5] + + - id: H-15 + title: Rivet commit traceability analysis produces false coverage from misidentified artifact references + description: > + The commit analysis engine (commits.rs) extracts artifact IDs from + git trailer values using pattern matching (PREFIX-DIGITS). False + positives occur when non-artifact strings match this pattern (e.g., + "ISO-26262" parsed as artifact ID "ISO-26262", or "SHA-256" parsed + as "SHA-256"). False negatives occur when artifact IDs use + non-standard formats (e.g., "H-1.2", "UCA-C-10") that the regex + does not match. Both inflate or deflate commit-artifact coverage + metrics, misrepresenting implementation completeness. + losses: [L-1, L-2] + + - id: H-16 + title: Rivet dashboard serves stale data after hot-reload fails silently + description: > + The dashboard server (serve.rs) provides a reload endpoint that + re-reads all artifact files, schemas, and documents from disk. If + the reload encounters a parse error in one YAML file, it may fail + and leave the in-memory state unchanged without notifying the user. + The dashboard continues serving the pre-reload state while the + user believes they are viewing current data. In a worst-case + environment where an engineer has just fixed a validation error + and reloaded, they see the old (passing) state and conclude + their fix is working when it actually introduced a new error. + losses: [L-1, L-3, L-4] + + - id: H-17 + title: Rivet cross-repo sync clones arbitrary git repositories specified in rivet.yaml + description: > + The externals module (externals.rs) executes `git clone` and + `git fetch` commands against URLs specified in the project's + rivet.yaml configuration. A malicious or compromised rivet.yaml + could specify a git URL pointing to a hostile repository. The + clone/fetch operation may trigger git hooks, download large + volumes of data, or overwrite local state in the cache directory. + In a worst-case environment where rivet.yaml is modified in a + supply chain attack, the sync command becomes a vector for + arbitrary code execution via git hooks. + losses: [L-3, L-5] +``` + +**Sub-hazards for H-13:** + +```yaml +sub-hazards: + - id: H-13.1 + parent: H-13 + title: Rivet dashboard renders artifact descriptions containing script injection + description: > + An artifact's description field contains `` + or ``. The dashboard detail view renders this + content as raw HTML. + hazards: [H-13] + rationale: > + Script injection in the dashboard compromises the integrity of + all displayed traceability data for the current session. + + - id: UCA-D-4 + description: > + Dashboard does not display reload errors when hot-reload fails. + context: > + User clicks reload, a YAML parse error occurs in one file, + but the dashboard returns a success response and continues + showing the old state. + hazards: [H-16] + rationale: > + User believes they are viewing current data when the dashboard + is serving stale state from before the edit. +``` + +**CTRL-CORE — Commit Traceability:** + +```yaml +# Additional core UCAs for commit traceability + + - id: UCA-C-18 + description: > + Core commit analysis extracts false-positive artifact IDs from + trailer values that match the PREFIX-DIGITS pattern but are not + actual artifact identifiers. + context: > + A commit trailer contains "Implements: ISO-26262 compliance" and + the pattern matcher extracts "ISO-26262" as an artifact ID. + hazards: [H-15] + rationale: > + False positive artifact references inflate commit coverage + metrics, creating an illusion of implementation completeness. + + - id: UCA-C-19 + description: > + Core commit analysis fails to extract artifact IDs that use + sub-hazard notation (e.g., "H-1.2") or multi-letter suffixes + (e.g., "UCA-C-10"). + context: > + A commit trailer contains "Refs: H-1.2, UCA-C-10" but the + is_artifact_id() function requires digits-only after the hyphen, + so neither ID is extracted. + hazards: [H-15, H-1] + rationale: > + False negative artifact references create coverage gaps. STPA + artifacts (the most safety-critical) are systematically missed + by the commit traceability engine. + + - id: UCA-C-20 + description: > + Core does not detect circular cross-repo dependencies during + external artifact loading. + context: > + Repo A declares repo B as external, and repo B declares repo A. + The sync process enters an infinite loop or stack overflow. + hazards: [H-11, H-1] + rationale: > + Circular dependencies cause the tool to hang or crash during + sync, preventing any validation from completing. +``` + +**CTRL-CORE — WASM Runtime:** + +```yaml +# Additional core UCAs for WASM adapter runtime + + - id: UCA-C-21 + description: > + Core WASM runtime does not validate that adapter-returned + artifacts correspond to actual records in the source data. + context: > + A WASM adapter returns 500 artifacts from a source file that + contains only 50 records. The additional 450 are fabricated. + hazards: [H-14] + rationale: > + Fabricated artifacts pass schema validation because they + conform to the declared types, but they introduce false + traceability links that inflate coverage. + + - id: UCA-C-22 + description: > + Core WASM runtime does not enforce fuel limits, allowing a + malicious adapter to consume unbounded CPU. + context: > + Fuel metering is configured but a bug in the fuel accounting + allows the adapter to execute indefinitely, causing a denial + of service. + hazards: [H-14] + rationale: > + Denial of service during import blocks all validation and + reporting operations. + + - id: UCA-C-23 + description: > + Core WASM runtime leaks host filesystem paths to the guest + adapter via WASI preopened directories. + context: > + The adapter is given access to the AADL directory via WASI + preopened dirs. The adapter reads files outside the intended + scope by exploiting symlinks or relative paths. + hazards: [H-14, H-17] + rationale: > + Information disclosure of host filesystem structure aids + further attacks. Arbitrary file read enables data exfiltration. +``` + +**CTRL-CORE — Lifecycle / Coverage:** + +```yaml + - id: UCA-C-24 + description: > + Core lifecycle completeness check uses a hardcoded downstream + type map that does not reflect the loaded schema's traceability + rules. + context: > + The lifecycle.rs module hardcodes expected_downstream() mappings + (requirement -> [feature, aadl-component, design-decision]). + A project using the cybersecurity schema has different downstream + expectations (e.g., threat-scenario -> countermeasure) that are + not represented. + hazards: [H-3, H-1] + rationale: > + Lifecycle gap analysis produces false negatives for schemas + other than the dev schema, missing genuine coverage gaps in + cybersecurity and STPA domains. +``` + +**CTRL-CORE — Document Validation:** + +```yaml + - id: UCA-C-25 + description: > + Core document validation only checks [[ID]] references but does + not validate {{artifact:ID}} embed references. + context: > + A document uses {{artifact:NOPE-999}} which renders as a broken + embed card in the dashboard, but validate_documents() does not + check this pattern, so no diagnostic is emitted. + hazards: [H-1, H-3] + rationale: > + Broken artifact embeds in documents are invisible to validation, + producing documents that appear complete but contain broken + references. +``` + +**CTRL-CLI — External Sync:** + +```yaml + - id: UCA-L-6 + description: > + CLI sync command executes git clone against an arbitrary URL from + rivet.yaml without validating the URL or disabling git hooks. + context: > + A developer opens a project with a modified rivet.yaml containing + a malicious git URL. Running `rivet sync` clones the repository, + which includes a post-checkout hook that executes arbitrary code. + hazards: [H-17] + rationale: > + The rivet.yaml file is the trust boundary for external + dependencies. A compromised config file enables arbitrary + code execution on the developer's machine. + + - id: UCA-L-7 + description: > + CLI does not validate rivet.lock file integrity before using + pinned commit SHAs for external sync. + context: > + An attacker modifies rivet.lock to point to a different commit + SHA that contains compromised artifacts. The sync operation + checks out the attacker-controlled commit. + hazards: [H-17, H-14] + rationale: > + The lock file is meant to ensure reproducible baselines. + If it can be tampered with undetected, the baseline guarantee + is void. +``` + +### Step 5: Loss Scenarios + +#### 5a. Loss Scenarios for Existing UCAs Currently Lacking Them + +The following UCAs (UCA-C-10 through UCA-C-17) were identified as having no loss scenarios in `loss-scenarios.yaml`: + +```yaml +# Proposed additions to safety/stpa/loss-scenarios.yaml + + # --- Incremental validation UCAs --- + + - id: LS-C-5 + title: Salsa input query not updated after file write + uca: UCA-C-10 + type: inadequate-process-model + hazards: [H-9, H-1, H-3] + scenario: > + The developer edits an artifact YAML file and saves it to disk. + The salsa database's file-content input query was set when the + file was first read. Because there is no file-watcher or + explicit invalidation call between validation invocations in + the same process (e.g., during `rivet serve`), the salsa + database returns the cached parse result from the old file + contents [UCA-C-10]. The link graph and validation diagnostics + reflect the pre-edit state. The dashboard shows "0 errors" + while the file on disk contains broken links [H-1, H-3]. + process-model-flaw: > + The salsa database believes its file-content input still + matches the on-disk content because no explicit set_file_content() + call was made after the file was modified externally. + causal-factors: + - No file-system watcher to detect external file modifications + - Salsa inputs are set once at load time, not refreshed + - Dashboard reload may not invalidate all salsa inputs + + - id: LS-C-6 + title: Conditional rule evaluation uses cached field values + uca: UCA-C-11 + type: inadequate-process-model + hazards: [H-9, H-1] + scenario: > + A schema defines a conditional rule: "when status == approved, + verification-criteria is required." An artifact has status=draft + and no verification-criteria field. The salsa database caches + the rule evaluation result as "not applicable" (because + status != approved). The developer changes the artifact's status + to "approved" but does not add verification-criteria. Because + the salsa conditional-rule query depends only on the rule + definition (not the artifact's field values), the cached "not + applicable" result is returned [UCA-C-11]. The artifact passes + validation despite missing the required field [H-1]. + process-model-flaw: > + The salsa query for conditional rule evaluation does not track + the artifact's field values as dependencies. It only depends + on the rule definition, so field value changes don't trigger + re-evaluation. + causal-factors: + - Conditional rule query does not declare artifact fields as inputs + - Salsa dependency tracking is opt-in per query + - No test verifies conditional rule re-evaluation after field change + + - id: LS-C-7 + title: Contradictory conditional rules not detected at schema load + uca: UCA-C-12 + type: inadequate-control-algorithm + hazards: [H-10] + scenario: > + A project uses two schema files: the ASPICE schema requires + "verification-method" when status=approved, while a custom overlay + schema forbids "verification-method" when safety-level=QM (to + reduce overhead for non-safety items). An artifact with both + status=approved and safety-level=QM triggers both rules. The + schema merge algorithm does not check for rule conflicts [UCA-C-12]. + The developer sees two contradictory validation errors and disables + one rule via a schema override, undermining the validation system. + causal-factors: + - Schema merge is purely additive (union of all rules) + - No SAT-solver or constraint check for rule compatibility + - Conditional rules lack a priority or override mechanism + + - id: LS-C-8 + title: Incremental validation diverges from full validation after rename + uca: UCA-C-13 + type: inadequate-control-algorithm + hazards: [H-9, H-3] + scenario: > + A developer renames an artifact from REQ-042 to REQ-042a. The + salsa database invalidates the queries for the renamed artifact's + file. However, other artifacts that link to REQ-042 are in + different files whose salsa inputs have not changed. The + incremental validation for those files returns cached "valid" + results that still show links to REQ-042 as resolved [UCA-C-13]. + A full validation would detect the broken links. The dashboard + shows 0 errors while a clean `rivet validate` shows 3 broken + links. + causal-factors: + - Link resolution queries depend on the link graph, not individual files + - Salsa does not track cross-file artifact ID dependencies + - No periodic full-revalidation check is implemented + + - id: LS-C-9 + title: Schema loading races with conditional rule evaluation + uca: UCA-C-14 + type: inadequate-control-algorithm + hazards: [H-9, H-10] + scenario: > + The project loads schemas from three files: common.yaml, + dev.yaml, and custom-overlay.yaml. The salsa query graph allows + conditional rule evaluation to begin as soon as two schemas are + merged (common + dev), before the overlay is loaded. The overlay + adds a conditional rule that restricts certain field combinations. + Artifacts evaluated before the overlay loads pass validation + [UCA-C-14]. After the overlay loads, those same artifacts would + fail. The final validation result depends on evaluation order, + making it non-deterministic. + causal-factors: + - Schema merge is incremental (file-by-file) rather than all-at-once + - No barrier between "all schemas loaded" and "evaluation begins" + - Salsa query ordering is determined by demand, not explicit sequencing + + - id: LS-C-10 + title: Parser extracts registry version instead of git_override commit + uca: UCA-C-15 + type: inadequate-control-algorithm + hazards: [H-11, H-1] + scenario: > + A MODULE.bazel file declares `bazel_dep(name="meld", version="2.0")` + and separately `git_override(module_name="meld", commit="abc123")`. + The parser extracts the bazel_dep declaration and records + meld@2.0. It does not process the git_override because the + function name is not in its recognized set [UCA-C-15]. Cross-repo + validation loads meld@2.0 from the registry cache instead of + the pinned commit abc123. The traceability data comes from the + wrong version of meld, and coverage results are meaningless [H-1]. + causal-factors: + - Parser's function recognition list is incomplete + - git_override is parsed separately from bazel_dep with no linkage + - No test covers the override-applies-to-dep scenario + + - id: LS-C-11 + title: Parser silently skips load() statement containing dependency + uca: UCA-C-16 + type: inadequate-control-algorithm + hazards: [H-11] + scenario: > + A MODULE.bazel file uses `load("@rules_rust//rust:defs.bzl", "rust_library")` + followed by a macro call that declares additional bazel_deps. + The parser does not recognize load() statements and silently + skips the line [UCA-C-16]. The macro-declared dependencies are + invisible to Rivet. Cross-repo links to artifacts in those + repos are reported as broken when they actually exist in the + undiscovered external repos. + causal-factors: + - Starlark load() support is not implemented + - Parser treats unrecognized lines as comments + - No diagnostic emitted for skipped lines + + - id: LS-C-12 + title: Parser swaps keyword argument name and value + uca: UCA-C-17 + type: inadequate-control-algorithm + hazards: [H-11, H-1] + scenario: > + A MODULE.bazel file contains `bazel_dep(version="1.0", name="meld")`. + The parser assumes positional argument order (name first, version + second) rather than parsing keyword arguments [UCA-C-17]. It + extracts name="1.0" and version="meld". The dependency is + recorded with the wrong module name, causing cross-repo + resolution to fail silently or resolve against the wrong repo. + causal-factors: + - Parser uses positional extraction rather than keyword matching + - No unit test for non-standard argument ordering + - CST construction assumes a fixed parameter order + +#### 5b. Loss Scenarios for New UCAs + + - id: LS-D-3 + title: Artifact description XSS via dashboard rendering + uca: UCA-D-3 + type: inadequate-control-algorithm + hazards: [H-13] + scenario: > + A developer creates an artifact with description containing + ``. + The dashboard's artifact detail view renders this description + by calling html_escape() on the description string. However, + the html_escape function in document.rs is only called for + markdown inline text, not for the artifact detail view in + serve.rs, which may interpolate the description directly into + HTML. The script executes in the auditor's browser, exfiltrating + the current traceability view [H-13, L-2, L-3]. + causal-factors: + - Multiple rendering paths with inconsistent escaping + - serve.rs generates HTML via string formatting, not a template engine + - No Content-Security-Policy header to mitigate XSS + + - id: LS-C-13 + title: WASM adapter returns fabricated artifacts + uca: UCA-C-21 + type: inadequate-process-model + hazards: [H-14] + scenario: > + A WASM adapter component is loaded from a .wasm file distributed + as a binary. The adapter's import() function is called with a + ReqIF source file containing 50 SpecObjects. The adapter returns + 80 artifacts — the original 50 plus 30 fabricated artifacts with + valid-looking IDs (REQ-051 through REQ-080) that link to + existing requirements, inflating coverage metrics [UCA-C-21]. + The host has no way to verify the count because the source data + format is adapter-specific. Schema validation passes because the + fabricated artifacts have valid types and fields. + process-model-flaw: > + The host trusts that the adapter's output faithfully represents + the source data. There is no independent verification mechanism + because the host cannot parse the adapter-specific source format. + causal-factors: + - WASM adapter output is trusted without independent verification + - No artifact-count or hash-based integrity check on adapter output + - Binary WASM components cannot be source-reviewed by users + + - id: LS-C-14 + title: Commit trailer false positive from non-artifact ID + uca: UCA-C-18 + type: inadequate-control-algorithm + hazards: [H-15] + scenario: > + A commit message contains the trailer "Refs: ISO-26262, MISRA-C2012". + The is_artifact_id() function matches any string with uppercase + prefix + hyphen + digits. "ISO-26262" matches (prefix="ISO", + suffix="26262") and is extracted as an artifact reference [UCA-C-18]. + The store does not contain "ISO-26262", so it is counted as a + broken reference. However, in the commit analysis summary, it + inflates the "linked commits" count because the commit has at + least one artifact reference (even though it's a false positive). + causal-factors: + - is_artifact_id() uses an overly broad pattern + - No validation against the actual artifact store during extraction + - Standard identifiers (ISO numbers, MISRA rules) match the pattern + + - id: LS-L-3 + title: Git clone executes malicious post-checkout hook + uca: UCA-L-6 + type: control-path + hazards: [H-17] + scenario: > + An attacker submits a PR that modifies rivet.yaml to add an + external dependency pointing to a malicious git repository. + The repository contains a .git/hooks/post-checkout script that + exfiltrates SSH keys or modifies local source files. A reviewer + approves the PR without noticing the rivet.yaml change. When + any developer runs `rivet sync`, git clones the malicious repo + and the post-checkout hook executes with the developer's + permissions [UCA-L-6]. + causal-factors: + - rivet.yaml changes are not flagged as security-sensitive in review + - git clone enables hooks by default + - No URL allowlist or domain restriction for external repos + - sync command does not use --config core.hooksPath=/dev/null + + - id: LS-C-15 + title: Lifecycle check misses cybersecurity downstream requirements + uca: UCA-C-24 + type: inadequate-process-model + hazards: [H-3, H-1] + scenario: > + A project uses the cybersecurity schema with threat-scenario, + countermeasure, and cybersecurity-requirement types. The + lifecycle completeness check in lifecycle.rs only knows about + requirement -> [feature, aadl-component, design-decision] + mappings. Cybersecurity requirements have no expected downstream + types in the hardcoded map [UCA-C-24]. The lifecycle gap analysis + reports 0 gaps for cybersecurity requirements, even though none + have countermeasure links. The safety engineer trusts the + lifecycle check and misses genuine coverage gaps. + process-model-flaw: > + The lifecycle module's process model is hardcoded to the dev + schema's traceability chain, not derived from the loaded schema's + traceability rules. It does not know about cybersecurity or STPA + type hierarchies. + causal-factors: + - expected_downstream() is hardcoded, not derived from schema rules + - No test covers lifecycle checks with non-dev schemas + - The module was written for the dev schema and not generalized +``` + +--- + +## Part 2: STPA-Sec Analysis + +### 2.1 Supply Chain Attack: Compromised WASM Adapter + +| Aspect | Detail | +|--------|--------| +| **Attack surface** | `wasm_runtime.rs` — `WasmAdapter::call_import()`, `call_render()`, `call_analyze()` | +| **Potential impact** | L-1 (fabricated traceability links), L-3 (data exfiltration via WASI), L-5 (false safety assurance from fabricated coverage) | +| **Existing mitigation** | Fuel metering (1B ops), memory limit (256 MiB), WASI preopened dirs are read-only | +| **Missing mitigation** | (1) No code signing or hash verification of .wasm files before loading. (2) No independent verification of adapter output against source data. (3) No sandboxed filesystem — adapter can read any preopened directory. (4) No audit log of adapter invocations and their inputs/outputs. | +| **Recommended constraint** | SC-16 (validate adapter outputs independently). Add: WASM component hash verification against a trusted manifest. | + +### 2.2 Injection Attack: Malicious Artifact Content (XSS) + +| Aspect | Detail | +|--------|--------| +| **Attack surface** | `document.rs` — `render_to_html()`, `resolve_inline()`, `html_escape()`; `serve.rs` — all HTML-generating routes | +| **Potential impact** | L-1 (altered traceability display), L-2 (manipulated audit evidence), L-3 (session/data exfiltration) | +| **Existing mitigation** | `html_escape()` function exists in document.rs and handles `&`, `<`, `>`, `"`. The `parse_markdown_link()` function restricts URLs to http/https/#. | +| **Missing mitigation** | (1) No Content-Security-Policy (CSP) header on the dashboard server. (2) serve.rs generates HTML via `format!()` string interpolation, not a template engine with auto-escaping. Each HTML route must manually remember to escape — a single missed interpolation creates an XSS vector. (3) No automated test that verifies HTML output is properly escaped for adversarial inputs. (4) `{{artifact:ID}}` embed rendering constructs HTML from artifact fields with html_escape, but there's no test for nested injection (e.g., artifact title containing quotes that break out of an HTML attribute). | +| **Recommended constraint** | SC-15 (HTML-escape all content). Add: CSP header (`script-src 'self'`), move to a template engine, add XSS-specific test cases. | + +### 2.3 Data Integrity: YAML File Tampering Bypassing Validation + +| Aspect | Detail | +|--------|--------| +| **Attack surface** | YAML artifact files on disk, `store.rs` — `Store::upsert()`, `store.rs` — `Store::insert()` | +| **Potential impact** | L-1 (corrupted traceability), L-3 (unauthorized modification), L-6 (audit trail gaps) | +| **Existing mitigation** | Git history provides an audit trail. `rivet validate` checks schema conformance and link integrity. CI runs validation as a merge gate. | +| **Missing mitigation** | (1) No file integrity checking between validation and report generation — an attacker who can modify files on disk between `rivet validate` and `rivet serve` (or `rivet stats`) can serve unvalidated data. (2) `Store::upsert()` overwrites artifacts without recording what changed or who changed it. (3) No cryptographic hash of the validated state is stored, so there's no way to detect post-validation tampering. | +| **Recommended constraint** | After validation passes, compute a hash of all artifact content. Report generation must verify the hash has not changed. Alternatively, validation and report generation must share the same in-memory state (currently true for `rivet serve` but not for separate CLI invocations). | + +### 2.4 Information Disclosure: Export Leaking Internal Paths + +| Aspect | Detail | +|--------|--------| +| **Attack surface** | `model.rs` — `Artifact.source_file`; `serve.rs` — source view routes; `document.rs` — `Document.source_file` | +| **Potential impact** | L-3 (information disclosure of filesystem structure, usernames, internal paths) | +| **Existing mitigation** | `Artifact.source_file` is `#[serde(skip)]` so it's not serialized in YAML exports. | +| **Missing mitigation** | (1) The dashboard source view (`/source/*`) serves raw file content including full filesystem paths in headers/URLs. An attacker with dashboard access sees the full path (e.g., `/Volumes/Home/username/git/project/safety/stpa/losses.yaml`). (2) Git commit information (author name, email, commit SHA) is displayed in the dashboard and would be included in any HTML export. (3) The `build_version()` function in main.rs embeds git branch, commit, and dirty state into the binary — this is visible in `rivet --version` output and potentially in generated reports. | +| **Recommended constraint** | Dashboard should use relative paths, not absolute filesystem paths. HTML exports should strip or anonymize source_file paths. Build metadata should be configurable (allow stripping in release builds). | + +### 2.5 Denial of Service: Pathological Artifact Graphs + +| Aspect | Detail | +|--------|--------| +| **Attack surface** | `links.rs` — `LinkGraph::reachable()`, `LinkGraph::build()`; `serve.rs` — graph visualization routes | +| **Potential impact** | L-4 (engineering productivity loss from hung processes) | +| **Existing mitigation** | `LinkGraph::reachable()` uses a visited-set to prevent infinite loops on cycles. `has_cycles()` uses petgraph's `is_cyclic_directed()`. | +| **Missing mitigation** | (1) `reachable()` uses a `Vec` for the visited set with `contains()` checks, giving O(n^2) performance for large graphs. A project with 10,000+ artifacts could experience significant slowdowns. (2) The dashboard graph visualization route computes ego-subgraphs and Sugiyama layouts. For deeply connected graphs, the layout computation could take minutes, with no timeout. (3) No limit on the number of artifacts or links that can be loaded — a malicious YAML file with millions of entries would exhaust memory. | +| **Recommended constraint** | Switch `reachable()` visited set to `HashSet`. Add a configurable timeout for layout computation. Add a configurable limit on artifact/link count with a diagnostic when exceeded. | + +### 2.6 Configuration Tampering: Modified rivet.yaml + +| Aspect | Detail | +|--------|--------| +| **Attack surface** | `rivet.yaml`, `rivet.lock`; `externals.rs` — `sync_external()`; `lib.rs` — `load_project_config()` | +| **Potential impact** | L-3 (data sovereignty via malicious external repos), L-5 (safety assurance via modified schemas) | +| **Existing mitigation** | rivet.yaml is version-controlled in git. Changes are visible in PR diffs. | +| **Missing mitigation** | (1) No schema pinning — if a schema is loaded from a path (not embedded), an attacker who modifies the schema file can weaken validation rules without changing rivet.yaml. (2) No integrity check on embedded schemas — a modified binary with weakened embedded schemas would pass all validation. (3) `rivet.lock` has no signature or checksum, so it can be modified to point to different commit SHAs without detection. | +| **Recommended constraint** | Embedded schemas should include a hash that is verified at load time. rivet.lock should include content hashes of external repo states. Schema files loaded from disk should be verified against expected hashes if available. | + +--- + +## Part 3: OSLC Lifecycle Lens + +### 3.1 Resource Integrity + +**OSLC concern:** Are artifacts complete and consistent? + +| Check | STPA Coverage | Gap? | +|-------|--------------|------| +| All artifacts have required fields | SC-1 (validate cross-references), validation in validate.rs checks required fields | No | +| All links resolve to valid targets | SC-1, UCA-C-4 (dangling links) | No | +| Schema conformance | SC-4 (semantic compatibility), UCA-C-6 (wrong schema) | No | +| Cross-repo artifact integrity | SC-10 (external artifact existence), H-11 (MODULE.bazel) | No | +| **Document-artifact consistency** | validate_documents() checks [[ID]] refs | **Gap: {{artifact:ID}} embeds not validated (UCA-C-25)** | + +### 3.2 Change Management + +**OSLC concern:** Are changes tracked and auditable? + +| Check | STPA Coverage | Gap? | +|-------|--------------|------| +| All modifications attributed | SC-7 (audit trail), H-7 (unattributed modification) | No | +| Change history preserved | L-6 (audit trail), git commit history | No | +| Commit-artifact traceability | commits.rs, UCA-C-18/C-19 (extraction errors) | Partial — see H-15 | +| **Configuration change tracking** | | **Gap: rivet.yaml and schema file changes are not tracked as artifact-level events. A schema change that weakens a validation rule has no artifact-level audit trail.** | + +### 3.3 Configuration Management + +**OSLC concern:** Are baselines reproducible? + +| Check | STPA Coverage | Gap? | +|-------|--------------|------| +| Deterministic artifact loading | SC-11 (incremental = full), deterministic file ordering | No | +| Reproducible baselines | externals.rs lock/baseline commands, rivet.lock | No | +| Schema versioning | Schema files have version metadata | No | +| **Build reproducibility** | | **Gap: No STPA analysis covers the scenario where the Rivet binary itself is not reproducible (different compiler flags, different embedded schema versions). Two developers running different Rivet versions against the same artifacts could get different validation results. SC-14 partially covers this for formal proofs but not for the tool binary itself.** | + +### 3.4 Quality Management + +**OSLC concern:** Are verification records trustworthy? + +| Check | STPA Coverage | Gap? | +|-------|--------------|------| +| Test results linked to artifacts | results.rs, ResultStore | No | +| Coverage metrics from validated data | SC-3 (coverage from validated data), CC-C-8 | No | +| Report generation gated on validation | SC-6 (compliance reports from verified data), CC-L-2 | No | +| Formal verification of tool correctness | SC-14 (proofs validate implementation), H-12 | No | +| **Dashboard data freshness** | UCA-D-2 (stale metrics) | Partially covered, extended by H-16 | + +--- + +## Part 4: Recommended New YAML Artifacts + +### Priority 1: Immediate (new hazards and constraints) + +1. **Add to `hazards.yaml`:** H-13, H-14, H-15, H-16, H-17, sub-hazards H-13.1, H-13.2 +2. **Add to `system-constraints.yaml`:** SC-15, SC-16, SC-17, SC-18, SC-19 +3. **Add to `control-structure.yaml`:** No new controllers needed yet (the requested CTRL-EXPORT, CTRL-YAML-EDIT, CTRL-SALSA do not exist in the codebase — export.rs, yaml_edit.rs, db.rs are not present on this branch. The analysis covers these concerns under existing controllers.) +4. **Add to `ucas.yaml`:** + - Dashboard: UCA-D-3, UCA-D-4 + - Core: UCA-C-18 through UCA-C-25 + - CLI: UCA-L-6, UCA-L-7 +5. **Add to `controller-constraints.yaml`:** CC-D-3/D-4, CC-C-18 through CC-C-25, CC-L-6/L-7 +6. **Add to `loss-scenarios.yaml`:** LS-C-5 through LS-C-15, LS-D-3, LS-L-3 + +### Priority 2: STPA-Sec artifacts (new) + +Create `safety/stpa/security-constraints.yaml` with: +- SEC-1: CSP header on dashboard +- SEC-2: WASM component hash verification +- SEC-3: Git clone hook disabling for externals +- SEC-4: rivet.lock integrity verification +- SEC-5: Artifact count validation for WASM adapter output +- SEC-6: Path anonymization in exports and dashboard + +### Priority 3: OSLC gap remediation + +- Add a hazard (H-18?) for "Rivet schema change weakens validation without artifact-level audit trail" +- Add a constraint (SC-20?) for "Rivet must record schema version and content hash alongside validation results" +- Add a hazard (H-19?) for "Different Rivet binary versions produce different validation results for the same inputs" +- Add a constraint (SC-21?) for "Rivet must embed its version and schema hashes in all generated reports for reproducibility verification" + +--- + +## Summary of Findings + +| Category | Count | Existing | New | +|----------|-------|----------|-----| +| Losses | 6 | 6 | 0 | +| Hazards | 17 (+2 sub) | 12 (+6 sub) | 5 (+2 sub) | +| System constraints | 19 | 14 | 5 | +| Controllers | 7 | 7 | 0 (requested ones don't exist on this branch) | +| UCAs | 47 | 32 | 15 | +| Controller constraints | 44 | 31 | 13 | +| Loss scenarios | 27 | 13 | 14 | +| STPA-Sec findings | 6 | 0 | 6 | +| OSLC gaps | 3 | 0 | 3 | + +### Critical Findings + +1. **XSS in dashboard** (H-13): The dashboard generates HTML via string formatting in serve.rs. While document.rs has html_escape(), the serve.rs routes may have inconsistent escaping. No CSP header is set. This is the highest-impact security finding because the dashboard is used during audit reviews. + +2. **WASM supply chain** (H-14): WASM adapters are loaded as binary blobs without signature verification. The adapter output is trusted without independent verification. Fuel and memory limits provide DoS protection but not integrity protection. + +3. **Missing loss scenarios** (UCA-C-10 through UCA-C-17): Eight UCAs related to incremental validation and MODULE.bazel parsing had no loss scenarios. These are now covered by LS-C-5 through LS-C-12. + +4. **Commit traceability false positives** (H-15): The `is_artifact_id()` pattern matches standard identifiers like ISO-26262 and MISRA-C2012. STPA artifact IDs (H-1.2, UCA-C-10) are NOT matched. This systematically excludes safety-critical artifacts from commit coverage. + +5. **Lifecycle check hardcoding** (UCA-C-24): The lifecycle completeness check only knows about the dev schema's type hierarchy. Cybersecurity, STPA, and ASPICE schemas are invisible to it. + +6. **Git clone as code execution vector** (H-17): `rivet sync` clones arbitrary git URLs from rivet.yaml without disabling hooks, creating a path from configuration tampering to arbitrary code execution. diff --git a/docs/verification.md b/docs/verification.md index f38846c..39350de 100644 --- a/docs/verification.md +++ b/docs/verification.md @@ -244,3 +244,61 @@ Each phase 3 workstream adds verification at the appropriate level: - **[[REQ-029]] salsa incremental** — proptest comparing incremental vs full validation results, Verus soundness proof - **[[REQ-030]] formal verification** — the Kani/Verus/Rocq harnesses ARE the verification - **[[REQ-031]] CLI mutations** — proptest for random mutation sequences never producing invalid YAML, integration tests for all rejection cases + +## 12. STPA-Sec Test Requirements + +The STPA-Sec analysis (2026-03-16) identified security-relevant hazards that +require dedicated test coverage. These tests supplement the existing functional +test suite with adversarial input testing. + +### 12.1 XSS Prevention (H-13, SC-15) + +| Test | Description | Verifies | +|------|-------------|----------| +| `test_artifact_title_xss_escaped` | Artifact with title `` renders escaped in dashboard HTML | SC-15, UCA-D-3 | +| `test_artifact_description_xss_escaped` | Artifact description with `` is sanitized | SC-15, H-13.1 | +| `test_document_markdown_raw_html_stripped` | Markdown `"); + assert!( + !html.contains(""), + "closing must be stripped, got: {html}" + ); + } + + #[test] + fn inline_html_is_stripped() { + let html = render_markdown("Hello bold world"); + assert!( + !html.contains(""), + "inline HTML tags must be stripped, got: {html}" + ); + } + + #[test] + fn raw_iframe_is_stripped() { + let html = render_markdown(""); + assert!( + !html.contains(" must be stripped from markdown output, got: {html}" + ); + } + #[test] fn strip_tags_basic() { let plain = strip_html_tags("

Hello world

"); diff --git a/rivet-core/src/model.rs b/rivet-core/src/model.rs index 2247aaf..749aa42 100644 --- a/rivet-core/src/model.rs +++ b/rivet-core/src/model.rs @@ -6,6 +6,15 @@ use serde::{Deserialize, Serialize}; /// Unique identifier for an artifact (e.g., "L-1", "H-3.2", "SWREQ-012"). pub type ArtifactId = String; +/// Statuses that indicate an artifact should be fully traced in the lifecycle. +pub const TRACED_STATUSES: &[&str] = &[ + "implemented", + "done", + "approved", + "accepted", + "verified", +]; + /// A typed, directional link from one artifact to another. #[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)] pub struct Link { diff --git a/rivet-core/src/results.rs b/rivet-core/src/results.rs index 2bc8a2d..271102d 100644 --- a/rivet-core/src/results.rs +++ b/rivet-core/src/results.rs @@ -185,14 +185,15 @@ impl ResultStore { } /// Load all test run YAML files from a directory. -pub fn load_results(dir: &Path) -> anyhow::Result> { +pub fn load_results(dir: &Path) -> Result, crate::error::Error> { let mut runs = Vec::new(); if !dir.exists() { return Ok(runs); } - let mut entries: Vec<_> = std::fs::read_dir(dir)? + let mut entries: Vec<_> = std::fs::read_dir(dir) + .map_err(|e| crate::error::Error::Results(format!("{}: {e}", dir.display())))? .filter_map(|e| e.ok()) .filter(|e| { let p = e.path(); @@ -203,9 +204,10 @@ pub fn load_results(dir: &Path) -> anyhow::Result> { for entry in entries { let path = entry.path(); - let content = std::fs::read_to_string(&path)?; + let content = std::fs::read_to_string(&path) + .map_err(|e| crate::error::Error::Results(format!("{}: {e}", path.display())))?; let file: TestRunFile = serde_yaml::from_str(&content) - .map_err(|e| anyhow::anyhow!("{}: {e}", path.display()))?; + .map_err(|e| crate::error::Error::Results(format!("{}: {e}", path.display())))?; runs.push(TestRun { run: file.run, results: file.results, diff --git a/rivet-core/src/wasm_runtime.rs b/rivet-core/src/wasm_runtime.rs index 367b307..d2747bc 100644 --- a/rivet-core/src/wasm_runtime.rs +++ b/rivet-core/src/wasm_runtime.rs @@ -388,10 +388,24 @@ impl WasmAdapter { match result { Ok(wit_artifacts) => { - let artifacts = wit_artifacts + // Warn on suspiciously large result sets (potential DoS). + const MAX_ARTIFACTS_WARN: usize = 10_000; + if wit_artifacts.len() > MAX_ARTIFACTS_WARN { + log::warn!( + "WASM adapter returned {} artifacts (threshold: {}), possible DoS", + wit_artifacts.len(), + MAX_ARTIFACTS_WARN, + ); + } + + let artifacts: Vec = wit_artifacts .into_iter() .map(convert_wit_artifact_to_host) .collect(); + + // Validate and sanitize each returned artifact. + let artifacts = validate_wasm_artifacts(artifacts)?; + Ok(artifacts) } Err(e) => Err(WasmError::Guest(format!("adapter import error: {:?}", e))), @@ -760,6 +774,65 @@ fn yaml_value_to_wit_field( } } +// --------------------------------------------------------------------------- +// WASM output validation +// --------------------------------------------------------------------------- + +/// Validate and sanitize artifacts returned by a WASM adapter. +/// +/// - Rejects artifacts with empty `id` or `artifact_type`. +/// - Strips HTML tags from `title` and `description` fields to prevent XSS. +fn validate_wasm_artifacts(artifacts: Vec) -> Result, WasmError> { + let mut result = Vec::with_capacity(artifacts.len()); + + for mut artifact in artifacts { + // Reject empty IDs + if artifact.id.trim().is_empty() { + return Err(WasmError::Guest( + "WASM adapter returned artifact with empty ID".into(), + )); + } + + // Reject empty artifact types + if artifact.artifact_type.trim().is_empty() { + return Err(WasmError::Guest(format!( + "WASM adapter returned artifact '{}' with empty type", + artifact.id, + ))); + } + + // Sanitize title — strip HTML tags to prevent XSS + artifact.title = strip_html_from_text(&artifact.title); + + // Sanitize description + if let Some(ref desc) = artifact.description { + artifact.description = Some(strip_html_from_text(desc)); + } + + result.push(artifact); + } + + Ok(result) +} + +/// Strip HTML tags from plain text fields to prevent XSS injection. +/// +/// This is a basic sanitizer for text that should never contain HTML. +/// It removes `` sequences but preserves the text content between them. +fn strip_html_from_text(input: &str) -> String { + let mut result = String::with_capacity(input.len()); + let mut in_tag = false; + for ch in input.chars() { + match ch { + '<' => in_tag = true, + '>' => in_tag = false, + _ if !in_tag => result.push(ch), + _ => {} + } + } + result +} + // --------------------------------------------------------------------------- // Helpers // --------------------------------------------------------------------------- @@ -938,6 +1011,81 @@ mod tests { } } + #[test] + fn validate_wasm_artifacts_rejects_empty_id() { + let artifacts = vec![Artifact { + id: "".into(), + artifact_type: "requirement".into(), + title: "Test".into(), + description: None, + status: None, + tags: vec![], + links: vec![], + fields: std::collections::BTreeMap::new(), + source_file: None, + }]; + let result = validate_wasm_artifacts(artifacts); + assert!(result.is_err()); + assert!( + format!("{:?}", result.unwrap_err()).contains("empty ID"), + "should mention empty ID" + ); + } + + #[test] + fn validate_wasm_artifacts_rejects_empty_type() { + let artifacts = vec![Artifact { + id: "REQ-001".into(), + artifact_type: "".into(), + title: "Test".into(), + description: None, + status: None, + tags: vec![], + links: vec![], + fields: std::collections::BTreeMap::new(), + source_file: None, + }]; + let result = validate_wasm_artifacts(artifacts); + assert!(result.is_err()); + assert!( + format!("{:?}", result.unwrap_err()).contains("empty type"), + "should mention empty type" + ); + } + + #[test] + fn validate_wasm_artifacts_strips_html_from_title() { + let artifacts = vec![Artifact { + id: "REQ-001".into(), + artifact_type: "requirement".into(), + title: "Safe Title".into(), + description: Some("Description".into()), + status: None, + tags: vec![], + links: vec![], + fields: std::collections::BTreeMap::new(), + source_file: None, + }]; + let result = validate_wasm_artifacts(artifacts).unwrap(); + assert_eq!(result.len(), 1); + assert_eq!(result[0].title, "alert(1)Safe Title"); + assert!(!result[0].title.contains("safe"), + "evil()safe" + ); + assert_eq!(strip_html_from_text("no tags here"), "no tags here"); + assert_eq!(strip_html_from_text(""), ""); + } + /// End-to-end: load the spar WASM component, preopen a directory with /// real AADL files, call the renderer, and verify the SVG output. /// diff --git a/safety/stpa/controller-constraints.yaml b/safety/stpa/controller-constraints.yaml index b20424e..169b746 100644 --- a/safety/stpa/controller-constraints.yaml +++ b/safety/stpa/controller-constraints.yaml @@ -402,3 +402,116 @@ controller-constraints: supported function call types. ucas: [UCA-C-17] hazards: [H-11, H-1] + + # ========================================================================= + # STPA-Sec extension constraints + # ========================================================================= + - id: CC-D-3 + controller: CTRL-DASH + constraint: > + Dashboard must HTML-escape all artifact content (titles, + descriptions, field values) before rendering in HTML views. + No artifact content may be interpolated into HTML without + escaping. + ucas: [UCA-D-3] + hazards: [H-13] + + - id: CC-D-4 + controller: CTRL-DASH + constraint: > + Dashboard must return an error response when reload fails and + display a visible stale-data indicator until a successful reload + occurs. + ucas: [UCA-D-4] + hazards: [H-16] + + - id: CC-C-18 + controller: CTRL-CORE + constraint: > + Core commit analysis must validate extracted artifact IDs against + the known artifact store, not rely solely on pattern matching. + References that do not resolve to known artifacts must not be + counted as coverage. + ucas: [UCA-C-18] + hazards: [H-15] + + - id: CC-C-19 + controller: CTRL-CORE + constraint: > + Core commit analysis must support all artifact ID formats used + in the project, including sub-hazard notation (H-1.2), multi-segment + prefixes (UCA-C-10), and controller constraints (CC-L-5). + ucas: [UCA-C-19] + hazards: [H-15, H-1] + + - id: CC-C-20 + controller: CTRL-CORE + constraint: > + Core must detect circular cross-repo dependencies before + beginning artifact loading and report them as a configuration + error. + ucas: [UCA-C-20] + hazards: [H-11, H-1] + + - id: CC-C-21 + controller: CTRL-CORE + constraint: > + Core WASM runtime must validate adapter output artifact count + and IDs against the source data before accepting them into the + artifact store. + ucas: [UCA-C-21] + hazards: [H-14] + + - id: CC-C-22 + controller: CTRL-CORE + constraint: > + Core WASM runtime must enforce fuel limits and terminate + adapters that exceed their allocation, returning an error + diagnostic. + ucas: [UCA-C-22] + hazards: [H-14] + + - id: CC-C-23 + controller: CTRL-CORE + constraint: > + Core WASM runtime must restrict WASI preopened directories to + the minimum required paths and prevent symlink traversal + outside those paths. + ucas: [UCA-C-23] + hazards: [H-14, H-17] + + - id: CC-C-24 + controller: CTRL-CORE + constraint: > + Core lifecycle completeness check must derive expected downstream + types from the loaded schema's traceability rules, not from + hardcoded mappings. + ucas: [UCA-C-24] + hazards: [H-3, H-1] + + - id: CC-C-25 + controller: CTRL-CORE + constraint: > + Core document validation must check both [[ID]] wiki-link + references and {{artifact:ID}} embed references against the + artifact store. + ucas: [UCA-C-25] + hazards: [H-1, H-3] + + - id: CC-L-6 + controller: CTRL-CLI + constraint: > + CLI sync command must disable git hooks when cloning external + repositories (--config core.hooksPath=/dev/null) and must not + execute arbitrary URLs without user confirmation for new repos. + ucas: [UCA-L-6] + hazards: [H-17] + + - id: CC-L-7 + controller: CTRL-CLI + constraint: > + CLI must verify rivet.lock file integrity (content hashes of + external repo states) before using pinned commit SHAs for + external sync operations. + ucas: [UCA-L-7] + hazards: [H-17, H-14] diff --git a/safety/stpa/hazards.yaml b/safety/stpa/hazards.yaml index d31d1ea..817a1a5 100644 --- a/safety/stpa/hazards.yaml +++ b/safety/stpa/hazards.yaml @@ -123,6 +123,75 @@ hazards: of correctness. losses: [L-2, L-5] + - id: H-13 + title: Rivet document renderer produces HTML containing unescaped artifact content + description: > + The markdown-to-HTML renderer in document.rs processes artifact + descriptions, titles, and field values into HTML output for the + dashboard and document views. If artifact content contains HTML + entities, script tags, or event handler attributes that are not + properly escaped, the rendered output enables cross-site scripting + (XSS) in the dashboard. In a worst-case environment where the + dashboard is used during an audit review or shared via screen + recording, injected scripts could alter displayed traceability + data, exfiltrate session information, or modify the visual + presentation of compliance metrics. + losses: [L-1, L-2, L-3] + + - id: H-14 + title: Rivet WASM adapter executes untrusted code that corrupts import results + description: > + The WASM runtime (wasm_runtime.rs) loads and executes third-party + adapter components. A compromised or buggy WASM adapter could + return fabricated artifacts, modify link targets, inject additional + artifacts, or silently drop artifacts during import. The host trusts + the adapter's output without independent verification against the + source data. In a worst-case environment where adapters are + distributed as binary components without source review, a supply + chain attack could introduce falsified traceability data. + losses: [L-1, L-3, L-5] + + - id: H-15 + title: Rivet commit traceability analysis produces false coverage from misidentified artifact references + description: > + The commit analysis engine (commits.rs) extracts artifact IDs from + git trailer values using pattern matching (PREFIX-DIGITS). False + positives occur when non-artifact strings match this pattern (e.g., + "ISO-26262" parsed as artifact ID "ISO-26262", or "SHA-256" parsed + as "SHA-256"). False negatives occur when artifact IDs use + non-standard formats (e.g., "H-1.2", "UCA-C-10") that the regex + does not match. Both inflate or deflate commit-artifact coverage + metrics, misrepresenting implementation completeness. + losses: [L-1, L-2] + + - id: H-16 + title: Rivet dashboard serves stale data after hot-reload fails silently + description: > + The dashboard server (serve.rs) provides a reload endpoint that + re-reads all artifact files, schemas, and documents from disk. If + the reload encounters a parse error in one YAML file, it may fail + and leave the in-memory state unchanged without notifying the user. + The dashboard continues serving the pre-reload state while the + user believes they are viewing current data. In a worst-case + environment where an engineer has just fixed a validation error + and reloaded, they see the old (passing) state and conclude + their fix is working when it actually introduced a new error. + losses: [L-1, L-3, L-4] + + - id: H-17 + title: Rivet cross-repo sync clones arbitrary git repositories specified in rivet.yaml + description: > + The externals module (externals.rs) executes `git clone` and + `git fetch` commands against URLs specified in the project's + rivet.yaml configuration. A malicious or compromised rivet.yaml + could specify a git URL pointing to a hostile repository. The + clone/fetch operation may trigger git hooks, download large + volumes of data, or overwrite local state in the cache directory. + In a worst-case environment where rivet.yaml is modified in a + supply chain attack, the sync command becomes a vector for + arbitrary code execution via git hooks. + losses: [L-3, L-5] + sub-hazards: # --- H-1 refinements: types of stale references --- - id: H-1.1 @@ -209,3 +278,23 @@ sub-hazards: load() statements that the Starlark subset parser does not handle. The parser silently skips the unrecognized construct, missing a dependency declaration. + + # --- H-13 refinements: XSS rendering paths --- + - id: H-13.1 + parent: H-13 + title: Rivet dashboard renders artifact descriptions containing script injection + description: > + An artifact's description field contains `` + or ``. The dashboard detail view renders this + content as raw HTML. + hazards: [H-13] + rationale: > + Script injection in the dashboard compromises the integrity of + all displayed traceability data for the current session. + + not-providing: + - id: UCA-D-4 + description: > + Dashboard does not display reload errors when hot-reload fails. + context: > + User clicks reload, a YAML parse error occurs in one file, + but the dashboard returns a success response and continues + showing the old state. + hazards: [H-16] + rationale: > + User believes they are viewing current data when the dashboard + is serving stale state from before the edit. + + too-early-too-late: [] + stopped-too-soon: [] + +# ============================================================================= +# Core Engine UCAs — Commit Traceability (STPA-Sec extension) +# ============================================================================= +commit-ucas: + control-action: Extract artifact references from git commit trailers + controller: CTRL-CORE + + providing: + - id: UCA-C-18 + description: > + Core commit analysis extracts false-positive artifact IDs from + trailer values that match the PREFIX-DIGITS pattern but are not + actual artifact identifiers. + context: > + A commit trailer contains "Implements: ISO-26262 compliance" and + the pattern matcher extracts "ISO-26262" as an artifact ID. + hazards: [H-15] + rationale: > + False positive artifact references inflate commit coverage + metrics, creating an illusion of implementation completeness. + + not-providing: + - id: UCA-C-19 + description: > + Core commit analysis fails to extract artifact IDs that use + sub-hazard notation (e.g., "H-1.2") or multi-letter suffixes + (e.g., "UCA-C-10"). + context: > + A commit trailer contains "Refs: H-1.2, UCA-C-10" but the + is_artifact_id() function requires digits-only after the hyphen, + so neither ID is extracted. + hazards: [H-15, H-1] + rationale: > + False negative artifact references create coverage gaps. STPA + artifacts (the most safety-critical) are systematically missed + by the commit traceability engine. + + too-early-too-late: [] + stopped-too-soon: [] + +# ============================================================================= +# Core Engine UCAs — Cross-Repo Dependencies (STPA-Sec extension) +# ============================================================================= +cross-repo-ucas: + control-action: Discover and load external repository artifacts + controller: CTRL-CORE + + not-providing: + - id: UCA-C-20 + description: > + Core does not detect circular cross-repo dependencies during + external artifact loading. + context: > + Repo A declares repo B as external, and repo B declares repo A. + The sync process enters an infinite loop or stack overflow. + hazards: [H-11, H-1] + rationale: > + Circular dependencies cause the tool to hang or crash during + sync, preventing any validation from completing. + + providing: [] + too-early-too-late: [] + stopped-too-soon: [] + +# ============================================================================= +# Core Engine UCAs — WASM Runtime (STPA-Sec extension) +# ============================================================================= +wasm-ucas: + control-action: Load and execute WASM adapter components for artifact import + controller: CTRL-CORE + + not-providing: [] + + providing: + - id: UCA-C-21 + description: > + Core WASM runtime does not validate that adapter-returned + artifacts correspond to actual records in the source data. + context: > + A WASM adapter returns 500 artifacts from a source file that + contains only 50 records. The additional 450 are fabricated. + hazards: [H-14] + rationale: > + Fabricated artifacts pass schema validation because they + conform to the declared types, but they introduce false + traceability links that inflate coverage. + + - id: UCA-C-22 + description: > + Core WASM runtime does not enforce fuel limits, allowing a + malicious adapter to consume unbounded CPU. + context: > + Fuel metering is configured but a bug in the fuel accounting + allows the adapter to execute indefinitely, causing a denial + of service. + hazards: [H-14] + rationale: > + Denial of service during import blocks all validation and + reporting operations. + + - id: UCA-C-23 + description: > + Core WASM runtime leaks host filesystem paths to the guest + adapter via WASI preopened directories. + context: > + The adapter is given access to the AADL directory via WASI + preopened dirs. The adapter reads files outside the intended + scope by exploiting symlinks or relative paths. + hazards: [H-14, H-17] + rationale: > + Information disclosure of host filesystem structure aids + further attacks. Arbitrary file read enables data exfiltration. + + too-early-too-late: [] + stopped-too-soon: [] + +# ============================================================================= +# Core Engine UCAs — Lifecycle / Coverage (STPA-Sec extension) +# ============================================================================= +lifecycle-ucas: + control-action: Check lifecycle completeness and downstream coverage + controller: CTRL-CORE + + not-providing: [] + + providing: + - id: UCA-C-24 + description: > + Core lifecycle completeness check uses a hardcoded downstream + type map that does not reflect the loaded schema's traceability + rules. + context: > + The lifecycle.rs module hardcodes expected_downstream() mappings + (requirement -> [feature, aadl-component, design-decision]). + A project using the cybersecurity schema has different downstream + expectations (e.g., threat-scenario -> countermeasure) that are + not represented. + hazards: [H-3, H-1] + rationale: > + Lifecycle gap analysis produces false negatives for schemas + other than the dev schema, missing genuine coverage gaps in + cybersecurity and STPA domains. + + too-early-too-late: [] + stopped-too-soon: [] + +# ============================================================================= +# Core Engine UCAs — Document Validation (STPA-Sec extension) +# ============================================================================= +document-validation-ucas: + control-action: Validate document references and embeds + controller: CTRL-CORE + + not-providing: + - id: UCA-C-25 + description: > + Core document validation only checks [[ID]] references but does + not validate {{artifact:ID}} embed references. + context: > + A document uses {{artifact:NOPE-999}} which renders as a broken + embed card in the dashboard, but validate_documents() does not + check this pattern, so no diagnostic is emitted. + hazards: [H-1, H-3] + rationale: > + Broken artifact embeds in documents are invisible to validation, + producing documents that appear complete but contain broken + references. + + providing: [] + too-early-too-late: [] + stopped-too-soon: [] + +# ============================================================================= +# CLI UCAs — External Sync (STPA-Sec extension) +# ============================================================================= +external-sync-ucas: + control-action: Clone and fetch external git repositories for cross-repo linking + controller: CTRL-CLI + + not-providing: [] + + providing: + - id: UCA-L-6 + description: > + CLI sync command executes git clone against an arbitrary URL from + rivet.yaml without validating the URL or disabling git hooks. + context: > + A developer opens a project with a modified rivet.yaml containing + a malicious git URL. Running `rivet sync` clones the repository, + which includes a post-checkout hook that executes arbitrary code. + hazards: [H-17] + rationale: > + The rivet.yaml file is the trust boundary for external + dependencies. A compromised config file enables arbitrary + code execution on the developer's machine. + + - id: UCA-L-7 + description: > + CLI does not validate rivet.lock file integrity before using + pinned commit SHAs for external sync. + context: > + An attacker modifies rivet.lock to point to a different commit + SHA that contains compromised artifacts. The sync operation + checks out the attacker-controlled commit. + hazards: [H-17, H-14] + rationale: > + The lock file is meant to ensure reproducible baselines. + If it can be tampered with undetected, the baseline guarantee + is void. + + too-early-too-late: [] + stopped-too-soon: []