Standard: Component Readiness Grades (CRG) v1.0 Assessed: 2026-03-01 Assessor: Jonathan D.A. Jewell + Claude Opus 4.6
| Component | Grade | Release Stage | Evidence Summary |
|---|---|---|---|
assail |
C | Beta | Dogfooded on self; 22 findings. Tested on 141 repos via assemblyline. |
attack |
D | Alpha | Works on example binary (cpu axis). Other axes not tested on diverse targets. |
assault |
D | Alpha | Works on self + example binary. Full multi-axis only tested on one target. |
ambush |
D | Alpha | Works with and without timeline. Timeline events skip when target exits fast (correct behaviour). |
amuck |
D | Alpha | Generates mutated files. Preset light works. Dangerous preset and exec-program untested on diverse targets. |
abduct |
D | Alpha | File isolation + mtime-shift works. Time-skewing (frozen/slow modes) and exec-program untested on diverse targets. |
adjudicate |
D | Alpha | Aggregates 2+ reports with expert-system verdict. Only tested on panic-attack's own reports. |
axial |
D | Alpha | Observation with --report works. Exec-program observation works. grep/agrep/aspell/pandoc untested. |
analyze |
C | Beta | Detects UseAfterFree, NullPointerDeref from crash reports. Both rule evaluation and stderr matching work on synthetic data. |
report |
C | Beta | Renders assault reports in terminal. Works on self-generated reports. All view modes available. |
tui |
E | Pre-alpha | Initialises but requires real terminal. Cannot be tested in CI/headless. No smoke test possible. |
gui |
E | Pre-alpha | Initialises but requires display server. Cannot be tested in CI/headless. |
diff |
C | Beta | Compares two reports correctly. Shows robustness delta, weak point delta, per-axis changes. |
manifest |
C | Beta | Exports AI.a2ml to Nickel format. Works on self. Output is valid Nickel. |
a2ml-export |
C | Beta | Round-trips assault report to A2ML bundle. Works on self-generated reports. |
a2ml-import |
C | Beta | Round-trips A2ML bundle back to JSON. Verified round-trip integrity. |
panll |
C | Beta | Exports event-chain with real constraints. 2 critical WPs, attack events extracted correctly. |
assemblyline |
C | Beta | Scanned 141 repos in parallel (rayon). BLAKE3 fingerprinting. 3448 findings, 254 critical. |
diagnostics |
C | Beta | Reports version, manifest, directories, integrations. Works on self. |
help |
C | Beta | Lists all 19 subcommands with descriptions and options. |
- Components at C (Beta) or above: 14/19 (74%)
- Components at D (Alpha): 5/19 (26%)
- Components at E (Pre-alpha): 2/19 (11%)
- Components at F (Reject): 0/19 (0%)
- Minimum project-wide grade: E (tui, gui)
- Weighted assessment: The project is Beta-quality for its core workflow (assail/assault/report/assemblyline) and Alpha-quality for the full dynamic testing suite.
Evidence:
- Successfully scans its own codebase: 22 weak points detected (2 critical, 9 high, 10 medium, 1 low)
- Verbose mode shows per-file risk breakdown with 40 files ranked
- Logic engine produces 125 facts and 9 derived facts
- JSON output is well-formed and machine-readable
- Exercised across 141 repos via assemblyline (3448 total findings)
- 47 language analyzers registered
Known limitations:
- Framework detection has false positives (reports Phoenix/Ecto/Cowboy/OTP on a pure Rust project)
- Some patterns detect their own search strings as findings (e.g., "transmute" in analyzer.rs)
Promotion path to B: Test on 6 diverse projects in different languages (not just Rust repos via assemblyline).
Evidence:
- CPU axis works on example binary (exits cleanly, 0 crashes)
- Report output is structured and correct
Known limitations:
- Only tested on one binary with one axis
- Memory/disk/network/concurrency/time axes not individually validated
- No test against a program that actually crashes under stress
Promotion path to C: Test all 6 axes on panic-attack's own test binaries and the vulnerable_program example.
Evidence:
- Combines assail + attack successfully
- Produces structured AssaultReport with all sections
- VerisimDB hexad storage works automatically
- Multi-format output (JSON, YAML, Nickel) works
Known limitations:
- Only tested with cpu axis (full multi-axis on self not validated in this session)
- Previous session ran full multi-axis; results were valid but only on one target
Promotion path to C: Run full multi-axis assault on panic-attack's own binary.
Evidence:
- Works without timeline (falls back to standard attack flow)
- Timeline YAML parsing works correctly (4 events across 3 tracks)
- Timeline events are correctly scheduled with start offsets
- Events are correctly skipped when target exits before their start time
Known limitations:
- Timeline events only tested once; stressor threads for cpu/memory/concurrency verified but only in isolation
- No test with a long-running program that exercises the full timeline duration
Promotion path to C: Create a test binary that runs for 15+ seconds, run with the timeline spec, verify all events fire in sequence.
Evidence:
- Light preset generates 1 mutated variant with prepend/append operations
- Output file written to runtime/amuck/
- JSON report correctly records operations applied
Known limitations:
- Dangerous preset not tested
- Custom spec file not tested
- exec-program integration not tested (compile and test mutated files)
Promotion path to C: Test dangerous preset, write a custom spec, and use exec-program to compile and test mutated variants of our own source files.
Evidence:
- Direct scope copies target + dependencies correctly
- mtime-offset-days shifts file timestamps
- Readonly lock is applied to copied files
- Workspace created in runtime/abduct/
Known limitations:
- frozen/slow time modes not tested
- virtual-now not tested
- exec-program integration not tested
- twohops/directory scope not tested
Promotion path to C: Test frozen time mode with exec-program on a binary that checks timestamps.
Evidence:
- Processes 2 assault reports correctly
- Expert-system verdict ("fail" based on critical weak points) is generated
- Rule hits documented with confidence scores
- Priorities extracted correctly
Known limitations:
- Only tested with assault reports; amuck/abduct report aggregation untested
- Only 2 reports aggregated; scaling untested
- Only one campaign pattern exercised (campaign_fail_on_high_signal)
Promotion path to C: Test with all 3 report types (assault, amuck, abduct) and with 5+ reports.
Evidence:
- Report observation mode works (reads assault JSON, produces markdown)
- Exec-program mode works (runs binary, captures output)
- Markdown output is well-formatted
Known limitations:
- grep/agrep pattern matching not tested
- aspell integration not tested
- pandoc conversion not tested
- i18n (non-English output) not tested via this subcommand
Promotion path to C: Test grep patterns on stderr of a crashing program, test aspell on output text.
Evidence:
- Detects UseAfterFree from both rule evaluation (Alloc→Free→Use sequence) and stderr patterns
- Detects NullPointerDeref from SIGSEGV in signal field
- Confidence scores differentiated (0.85 rule-based, 0.95 stderr-based)
- Variable bindings reported in evidence (X_loc, X_loc2, X = heap_var)
Known limitations:
- Only synthetic crash reports tested (no real crash from running code)
- Deadlock, DataRace, MemoryLeak, BufferOverflow rules not exercised
Promotion path to B: Feed in real crash reports from at least 6 different crash scenarios (use ASAN/TSAN output from real C/Rust programs).
Evidence:
- Renders full assault report in terminal with sections: assail, detail panel, attack results, signatures, assessment
- All view modes available via --report-view flag
Promotion path to B: Test rendering of reports from 6+ diverse projects.
Evidence:
- Code exists and compiles
- Attempts to initialize crossterm terminal but fails without a real TTY (os error 6)
- Cannot be smoke-tested in a headless/CI environment
Promotion path to D: Add a --dry-run flag or test harness that validates the report loading without needing a terminal.
Evidence:
- Code exists and compiles
- Times out (no display server in CLI context)
- eframe-based; requires Wayland/X11
Promotion path to D: Same as TUI — add headless validation mode.
Evidence:
- Correctly compares two reports: robustness delta, weak point delta, per-axis status changes
- Framework changes tracked
- Severity breakdown tracked (critical, high, medium, low)
Promotion path to B: Compare reports from 6+ diverse projects at different points in time.
Evidence:
- Parses AI.a2ml and exports to Nickel format
- Output includes all manifest sections: version, project, canonical-locations, critical-invariants, lifecycle, tools, reports
Promotion path to B: Test on AI.a2ml files from 6+ different repos.
Evidence:
- Round-trip verified: assault JSON → A2ML bundle → JSON
- Output file sizes match (3371 lines round-tripped)
- Kind discrimination works (--kind assault)
Promotion path to B: Test with all report kinds (assault, amuck, abduct) from 6+ projects.
Evidence:
- Exports event chain from assault report
- 2 constraints extracted from critical weak points
- Attack events correctly represented
- Summary includes weak points, crashes, robustness score
Promotion path to B: Test with reports from 6+ projects with varying numbers of findings.
Evidence:
- Scanned 141 repos in parallel via rayon
- 3448 weak points found, 254 critical
- BLAKE3 fingerprinting computed for all repos
- Results sorted by risk (developer-ecosystem: 633, idaptik: 427, ...)
- Filters (--findings-only, --min-findings) work correctly
Promotion path to B: Run on 6+ different parent directories (different machines, different repo structures).
Evidence:
- Reports version, AI manifest status, directory existence, report cache counts
- Correctly identifies missing integration configs (Hypatia, gitbot-fleet)
Promotion path to B: Validate diagnostics output on 6+ repos with different configurations.
Evidence:
- Lists all 19 subcommands with accurate descriptions
- Shows all global options
- Per-subcommand help available
Promotion path to B: Help is generic by nature; B/A grades apply once external users confirm the docs are clear.
No components earned an F. Candidates considered and rejected:
- tui/gui: These are E, not F. They serve a real purpose (interactive report review) and are salvageable with a dry-run mode. No better alternative exists specifically for panic-attack reports.
- axial: Some features (aspell, pandoc) could be delegated to external tools, but the integrated observation + report correlation is unique to panic-attack. Not an F.
- Framework detection false positives: assail detects Elixir/Erlang frameworks (Phoenix, Ecto, OTP) on a pure Rust project. This should be gated on detected language.
- Self-detection: The tool detects its own pattern-matching strings as vulnerabilities. Consider adding self-exclusion or annotation support.
- Timeline event scheduling: The DAW-style timeline works but needs a long-running test target to exercise fully.
- amuck dangerous preset: Untested. Could generate broken mutations that confuse users.
- abduct time modes: frozen/slow modes are implemented but completely untested.