Bridge the gap between your controls and your frameworks.
Evidentia is an open-source, Python-first Governance, Risk, and Compliance (GRC) platform that turns compliance from a spreadsheet problem into a software problem. It provides composable building blocks for control gap analysis, AI-generated risk statements, automated evidence collection, and compliance reporting — all usable from a Python library, a CLI, or a REST API.
GRC tooling has been waiting for its Terraform moment. Vanta and Drata are the AWS Consoles of compliance — polished SaaS dashboards charging $30K–$80K/year per framework. Evidentia is the library-first compliance infrastructure layer underneath: composable, embeddable, and built on the open standards (OSCAL) that the entire federal compliance stack is moving toward in 2026.
The moat trinity (v0.9.6 sharpened framing — May 2026): Evidentia's
durable differentiation against the wave of "agentic GRC" launches in
Q1 2026 collapses to three compounding moats — OSCAL emission +
DFAH / PRT determinism + cryptographic CIMD provenance. The
positioning posture explicitly does NOT chase "agentic" vocabulary;
the leading frame is "deterministic, auditable, OSS-native
compliance engineering." See docs/positioning-and-value.md
§6.1.A + §6.1.B for the full counter-positioning rationale.
It's the only OSS tool today that combines all of the following in one package:
- OSCAL-native end-to-end — ingests NIST OSCAL catalogs, emits OSCAL Assessment Results. Ready for the September 2026 federal mandate (OMB M-24-15 + FedRAMP RFC-0024). Vanta, Drata, AuditBoard, OneTrust, ServiceNow IRM, MetricStream all ship zero OSCAL output today.
- Cryptographically signed evidence — Sigstore/Rekor keyless signing of every Assessment Results document, PEP 740 attestations on every released wheel + sdist, and a CycloneDX SBOM attached to every GitHub Release. No other OSS GRC tool puts cryptographic provenance on the evidence itself.
- 89 framework catalogs bundled — NIST 800-53 Rev 5 (full 1,196
controls + Low/Moderate/High/Privacy baselines), CSF 2.0, FedRAMP,
CMMC 2.0, EU AI Act, DORA, NIS2, GDPR, all 15 comprehensive US state
privacy laws, the full FFIEC IT Examination Handbook stack (5
booklets: Information Security / Audit / Management / Operations
/ Outsourcing), FFIEC Cybersecurity Assessment Tool, OCC Bulletin
2026-13a / FRB SR 26-02 (Model Risk Management), plus 20 Tier-C
licensed-stub frameworks with
evidentia catalog importfor your licensed copies. More than any commercial vendor (Vanta: 35+, Drata: 20+, RegScale: 60+). - Apache 2.0 license — embeddable in commercial products without AGPL friction. The OSS GRC alternatives (CISO Assistant, Eramba, Comp AI) are AGPL with paid commercial tiers.
- Library-first, CLI-second, API-third —
pip install evidentia-core; from evidentia_core import GapAnalyzer. The closest peers (compliance-trestle, RegScale OSCAL Hub) are workflow / CLI tools, not embeddable libraries. - Air-gap capable —
--offlineflag refuses network egress; signs evidence with GPG when Sigstore can't reach Fulcio. Built for FedRAMP High, CMMC Level 2, and EU sovereign-cloud deployments where SaaS GRC is a non-starter. - AI-optional, not AI-mandatory — risk-statement generation and control explanation use LLMs via LiteLLM (any provider — OpenAI, Anthropic, Google, Azure, Bedrock, Ollama, vLLM). Everything else is deterministic. No leakage of sensitive evidence to third-party AI APIs unless you explicitly opt in.
- CI-native via composite GitHub Action — drop in
uses: polycentric-labs/evidentia/.github/actions/gap-analysis@v0and every PR runs gap analysis, posts a sticky compliance comment, and blocks merge on regression. No commercial GRC tool does this at the PR level.
For the full competitive analysis, market tailwinds, intellectual
ancestry, and 12-month direction, see
docs/positioning-and-value.md — a
12,000-word synthesis from 7 parallel research streams.
Modern GRC is stuck in 2005. The typical compliance program runs on:
- Spreadsheets that get copy-pasted between auditors, engineers, and exec staff
- Vendor GRC suites that cost $50K-500K per year, lock you in, and still require weeks of manual work to map one framework to another
- Point solutions that handle one piece (a vulnerability scanner, a policy tracker, a questionnaire manager) but can't talk to each other
- Consultants who re-learn your environment from scratch every audit cycle
Meanwhile, the compliance workload keeps growing. A single fintech or healthcare SaaS company today might simultaneously be in scope for SOC 2, PCI DSS 4.0, HIPAA, GDPR, CCPA, ISO 27001, and NYDFS Part 500 — seven frameworks with substantial overlap, each demanding its own evidence, gap analysis, and risk documentation.
The same control (say, "MFA on all privileged accounts") satisfies requirements in all seven. But because each framework uses different vocabulary, numbering, and organization, compliance teams end up documenting the same control seven different ways — and audit season becomes a months-long exercise in cross-referencing.
This is a software problem. It should be solved the way software problems get solved: with composable libraries, structured data, version control, and automation.
Evidentia is built on four principles:
-
Open standards, not vendor lock-in. Inputs and outputs use OSCAL — NIST's open standard for control catalogs and assessment results. If you outgrow Evidentia, your data travels with you.
-
Library-first, CLI-second, API-third. The Python library is the canonical interface. The CLI is a thin wrapper. The REST API is a thin wrapper. Everything Evidentia can do via the CLI, it can do from a Python script — which means you can embed it in CI pipelines, compliance portals, or custom integrations.
-
AI where it helps, not where it hurts. Evidentia uses LLMs for tasks where language understanding is the bottleneck (writing NIST SP 800-30 risk statements from a gap, validating whether a policy PDF actually covers a control). It uses deterministic code for tasks where correctness matters (OSCAL parsing, gap arithmetic, cross-framework mapping).
-
Provider-agnostic LLM access. All AI features route through LiteLLM + Instructor, giving you structured Pydantic output from any model — OpenAI, Anthropic, Google, Azure, Ollama, vLLM, or any OpenAI-compatible endpoint. No vendor lock-in on the AI layer either.
- Security engineers at startups and mid-size companies who need to hit SOC 2 Type II without hiring a full compliance team
- GRC consultants who want to stop rebuilding the same spreadsheets for every engagement
- Platform teams who want to embed gap analysis into their CI pipelines and catch drift before the auditor does
- CISO offices that want a real audit trail on risk decisions, backed by versioned structured data instead of Slack threads
- Anyone who has ever said "I know this NIST control is the same as this SOC 2 criterion, but I don't want to re-document it for the fifth time"
v0.10.5 (May 2026) — Output-format expansion (OCSF Detection
Finding + CycloneDX VEX) + workspace refactor (evidentia-eval
package extraction) + collector idempotency hardening for the v1.0
API freeze + positioning rewrite + design-partner program draft.
Patch on v0.10.4 (released ~24 hours prior). Two new evidentia gap analyze output formats: --format ocsf-detection emits OCSF
Detection Finding (class_uid 2004) JSON arrays — the SIEM-target
counterpart to v0.10.4's Compliance Finding 2003; Splunk / Elastic /
Microsoft Sentinel / Datadog ingest 2004 natively as production
traffic. --format cyclonedx-vex emits CycloneDX 1.6 VEX
documents, complementing the existing release-time CycloneDX SBOM
emit. OutputFormat Literal extended additively; existing emits
(json / csv / markdown / oscal-ar / sarif / ocsf)
unchanged. New 8th workspace package evidentia-eval: the DFAH
determinism + faithfulness harness moves out of evidentia-ai/eval/
to its own pip-installable package, making the air-gap clean
install posture a hard contract (the production runtime no longer
transitively loads sentence-transformers / numpy); deprecation
shim through v0.11.x, removal in v0.12.0. Collector idempotency
hardening (Phase 10): deterministic SecurityFinding.id
derivation from (source_system, source_finding_id) natural keys
via UUID v5 + pinned namespace — closes the principal-engineer-
audit-flagged gap that two collect() runs against an unchanged
source produced random uuid4 ids; additive-only, OCSF round-trip
preserved. Positioning rewrite (Phase 11): §10 rewritten with
no metaphors (direct enumerated claims only until reference
customers exist); EU AI Act Annex III date corrected from
2026-08-02 → 2027-12-02 per Digital Omnibus 2026-05-07 political
agreement; v1.0 acceptance gates rewritten (OpenSSF Gold honest-gap
tied to SOC 2 Type I segregation-of-duties). Design-partner
program draft (Phase 12): 6-8 week fixed-fee "Evidentia CMMC +
FedRAMP Readiness Accelerator" engagement at $25K-$75K tiered
pricing; 9 open questions queued for review. Phases 1-5 of the
2026-05-24 v0.10.5 draft (OSPS Baseline catalog + OSCAL conversion
- OSPS-CONFORMANCE.md + SECURITY.md refresh + EOL.md +
verification-recipe) deferred to v0.10.6 without scope loss.
3,443 tests passing / 14 skipped / 3,457 collected across 278
source files; mypy strict 0/0; ruff clean. PyPI: 8 packages at
0.10.5 (was 7;
evidentia-evaladded).
v0.10.2 (May 2026) — MCP-as-backend tool surface expansion +
GRC Engineering Club marketplace plugin (staged) + close v0.10.1
F-V101-L1 SSRF surface. Third release of the v0.10.x line on the
same calendar day as v0.10.0 + v0.10.1 ships. 4 new MCP tools
(gap_analyze_sarif, collect_ocsf, tprm_vendor_list,
poam_list) expand the §MCP tool contract from 8 → 12, bringing
the v0.10.x OCSF / SARIF / TPRM / POA&M surface into Claude
Desktop, Claude Code, and other MCP-aware AI clients.
GRC Engineering Club marketplace plugin staged in
marketplace/grc-engineering-suite/plugins/evidentia/ — first
concrete OSS-vs-paid scope decision locked at generalist OSS only
(persona-tied skills reserved for the future Evidentia Pro /
Federal commercial tier). F-V101-L1 SSRF surface CLOSED via a
new default-on --block-private-ips flag on evidentia collect ocsf URL mode (rejects RFC1918 + link-local incl. AWS metadata
169.254.169.254 + loopback + multicast + reserved ranges via
socket.getaddrinfo pre-resolution). All v0.10.x findings now
closed (zero unfixed CRITICAL / HIGH / MEDIUM / LOW). 3348 tests
passing / 14 skipped across 267 source files; mypy strict 0/0;
ruff clean. PyPI: 7 packages at 0.10.2.
v0.10.1 (May 2026) — OCSF integration consolidation + close
both v0.10.0 review findings. Same-day patch on v0.10.0 (v0.10.0
published 03:07 UTC, v0.10.1 ships ~the same day). Closes
F-V100-L1 (trust-boundary on the OCSF unmapped["evidentia"]
block — finding_from_ocsf gains a trust_unmapped: bool = True
keyword-only parameter; the new ingestion collector passes
False) and F-V100-M1 (release-tooling — scripts/ bump_version.py reads [tool.uv.sources] as the workspace
allowlist so third-party pins like py-ocsf-models>=0.9.0,<0.10.0
are no longer over-bumped). Ships the deferred third-party
OCSF ingestion collector (evidentia collect ocsf --input <file-or-url> — HTTPS-only URL mode with size + timeout caps;
Detection Finding path handles Prowler + AWS Security Hub output)
and evidentia collect convert --format ocsf (SecurityFinding
→ OCSF Compliance Finding bundle). Migrates the remaining 11
collectors (Okta + 4 SQL adapters + Databricks + Snowflake +
Vanta + Drata + BitSight + SecurityScorecard) to populate
compliance_status per the v0.10.0 pilot pattern. Adds the
Finding class-name alias on SecurityFinding (deprecation
target: v1.0.0) and the EventAction.COLLECT_OCSF_EMITTED
audit value. 3332 tests passing / 14 skipped across 267 source
files; mypy strict 0/0; ruff clean. PyPI: 7 packages at 0.10.1.
v0.10.0 (May 2026) — OCSF-aligned findings schema + SARIF
CI-gate output. Opens the v0.10.x research-driven integration line.
Ships an additive normalized findings schema (SecurityFinding gains
compliance_status + remediation Optional fields mirroring OCSF
compliance.status + remediation.desc; new ComplianceStatus
enum), a bidirectional OCSF Compliance Finding mapping layer
(evidentia_core.ocsf — behind the new optional [ocsf] extra
pulling py-ocsf-models; finding_to_ocsf / finding_from_ocsf
with lossless round-trip for Evidentia-produced findings via the
OCSF-standard unmapped block), SARIF 2.1.0 output for
evidentia gap analyze --format sarif (each ControlGap → a
SARIF result; stable partialFingerprints track gaps across runs;
surfaces in GitHub code scanning + GitLab security dashboards), and
3 pilot collectors (AWS, GitHub, Postgres) populating the new
fields. finding.py joins the docs/api-stability.md
frozen-models table. The remaining ~11 collectors + the third-party
OCSF ingestion collector (incl. a Detection Finding path for
Prowler / AWS Security Hub) are tracked for v0.10.1. 3292 tests
passing / 17 skipped across 265 source files; mypy strict 0/0; ruff
clean. PyPI: 7 packages at 0.10.0.
v0.9.9 (May 2026) — Supply-chain hygiene + pre-push gate
fidelity. A focused supply-chain patch — no source or test code
changed. Closes paramiko CVE-2026-44405 (compliance-trestle
4.0.3 pulls paramiko to 5.0.0, a dev-only transitive dependency);
adds an osv-scanner --sbom pre-push gate wired into CI and the
release checklist through one shared script, so transitive and
disputed advisories surface before a tag; and clears the entire
Dependabot PR queue (five grouped version-update PRs merged, three
orphaned PRs closed). 3250 tests passing / 14 skipped across 261
source files; mypy strict 0/0; ruff clean. PyPI: 7 packages at
0.9.9.
v0.9.8 (May 2026) — v0.9.7 deferral closure + v1.0-prep
integration wiring. Wires v0.9.7's data/decision-only primitives
into live surfaces: multi-tenant RBAC enforced end-to-end across
CLI, REST, and the POA&M / evidence stores; MCP tool outputs signed
at the FastMCP dispatch layer with an in-tree Sigstore-keyless
reference signer (evidentia_mcp.sigstore_signer); a shared
evidentia_core.factory_resolver; and the FedRAMP Rev 5 High +
CMMC L2 HF-Hub eval-suite subsets. Closes the CR-V97 review polish,
the F-V97-mcp-signer-trust + F-V97-multi-tenant-claim-spoofing
findings, three SigningContext.production() runtime breaks
(sigstore 4.2.0 API migration), and the idna CVE-2026-45409 bump.
3250 tests passing / 14 skipped across 262 source files; mypy
strict 0/0; ruff clean. PyPI: 7 packages at 0.9.8.
v0.9.5 (May 2026) — Walk-through refinement + collaboration
primitives + carry-over closure. Closes 18 deferred review
findings (7 v0.9.3 LOWs + 8 v0.9.4 LOWs + 2 INFOs + 1
rebucketed Q-finding), adds 3 collaboration-primitive surfaces,
validates the federal-SI walk-through against an AI-persona
reviewer with FedRAMP 20x / RFC-0024 framing, and ships
daemon-status REST expansion. Phase 1 carry-over closure:
new evidentia_core.security.atomic_write_text helper (lifts
the v0.9.4 inline .tmp cleanup across 4 call sites); auto-
wired ProxyHeadersMiddleware via
EVIDENTIA_TRUST_PROXY_HEADERS=1 env var or
create_app(trust_proxy_headers=True); pytest-randomly +
schemathesis + playwright in [dev] deps with
tests/dast/ scaffold; 7 v0.9.3 LOWs (explicit SSL context on
webhook urlopen, trust-boundary doc on EVIDENTIA_AI_REGISTRY_ DIR, SIGINT race window doc, state-file size cap, RFC 5321
recipient validation, dedup-state mtime cache, sleep_fn
typing); 8 v0.9.4 LOWs (FileLock fd-leak fix CWE-404, fcntl
per-fd doc CWE-662, rate-limit LRU spray protection CWE-400,
idempotency replay-after-delete regression test, IPv6 scope-id
correct sort, sleep_fn Callable typing, cross-process
FileLock test, rate-limit docstring tightening); 2 INFO
closures (Pydantic-upgrade body-hash audit guidance + model_ copy validator re-validate pattern). Phase 2 operator
polish: AI-persona walk-through validation (10 refinement
recommendations: CLI flag bug fix, OMB M-24-10 / NIST AI RMF
reframe, FedRAMP 20x framing, Step 8 OSCAL POA&M emit, CISA
Secure-by-Design framing, CA-7 meta-control clarification,
SCR Form adjacency, FIPS 199 + ATO-linkage as v0.9.6 targets);
GET /api/conmon/daemon-history?limit=N rolling-history
endpoint + --history-file daemon CLI flag; Prometheus
evidentia_conmon_daemon_* gauges at /api/metrics. Phase 3
collaboration primitives: POA&M Milestone.owner +
Milestone.reviewer Optional fields with CLI + REST filters;
EvidenceArtifact.version + lineage_id + predecessor_id +
new_version() factory helper (data-model + helper at v0.9.5;
WORM store-side append-only lands v0.9.6); evidentia_core. rbac package with Role enum / RBACPolicy / check_ permission / FastAPI require_role(action) dependency
factory + EVIDENTIA_RBAC_POLICY_FILE env-var policy loading
(default permissive policy preserves v0.9.4 behavior; CLI-side
enforcement deferred to v0.9.6). 2862 tests passing / 17
skipped across ~225 source files; mypy strict 0/0; ruff
clean. PyPI: 7 packages at 0.9.5. First direct-push
ship-cycle since the v0.9.x line started using PR-based
workflow (per the post-v0.9.4 lesson;
enforce_admins: False on branch protection always allowed
this; PR ceremony was self-imposed not branch-protection-
required).
v0.9.4 (May 2026) — Daemon hardening + operator polish +
federal-SI walk-through. Consolidation pass closing v0.9.3
deferred review items. Phase 1 daemon hardening: cross-platform
file-lock helper (POSIX fcntl.flock + Windows msvcrt.locking)
wrapping mark_completed + AlertDeduper.mark_dispatched behind
opt-in --state-lock flag — closes F-V93-Q3 HIGH race-condition;
webhook SSRF mitigation (default-deny http:// + loopback/RFC1918/
link-local/reserved IPs; opt-in --webhook-allow-plaintext +
--webhook-allow-private-network) — closes F-V93-S2 MEDIUM
(CWE-918, blocks cloud-metadata-service IAM exfiltration);
per-client-IP token-bucket rate-limit middleware on
POST /api/ai-gov/register + /classify + X-Idempotency-Key header
support on register — closes F-V93-S10 LOW; polish batch
(F-V93-Q11 dynamic User-Agent + Q12 Windows latency doc + Q14
narrow except + S9 path-disclosure doc). Phase 2 operator
polish: GET /api/conmon/daemon-status endpoint + status
sidecar; evidentia conmon dedup-list CLI verb; evidentia ai-gov update + retire CLI verbs wiring the AI_SYSTEM_UPDATED +
AI_SYSTEM_RETIRED EventActions. Phase 3 federal-SI
walk-through (reserved since v0.9.0): synthetic fixtures +
7-step recipe + smoke test; 3 walk-through-surfaced refinements
applied. Phase 4 hygiene: workflow_dispatch on test.yml,
token-rotation doc fix, flaky-Jira-test real fix (root cause was
0.7%-probability assertion-collision with random request_id, NOT
fixture leak). 19th consecutive PROCEED-CLEAN of v0.7.x →
v0.8.x → v0.9.x line. 2798 tests passing / 17 skipped across
219 source files; mypy strict 0/0; ruff clean. PyPI: 7 packages
at 0.9.4.
v0.9.3 (May 2026) — CONMON daemon + AI governance + carry-overs.
The largest minor release of the v0.9.x line so far. Combines two
originally-PROPOSED themes into a single ship: Theme A (CONMON
daemon) — evidentia conmon watch --poll long-running daemon
with state-file-driven slug→last_completed tracking, configurable
poll interval (default 3600s; min 60s), graceful SIGINT/SIGTERM
shutdown, SMTP + generic-webhook alerting channels (STARTTLS-only
with has_extn assertion + capture-replay-protected HMAC signing
with X-Evidentia-Timestamp header), control health scoring CLI +
REST endpoint, ContinuousEvidenceSource plugin Protocol. Theme B
(AI governance) — EU AI Act catalog enrichment (risk_tier +
applies_to_annex_iii on every Article 9-15 control; tier promoted
D→A), NIST AI RMF crosswalks to EU AI Act + ISO 42001,
evidentia_core.ai_governance classification + registry +
file-backed store, evidentia ai-gov CLI (5 verbs) + /api/ai-gov/*
REST router (5 endpoints with audit-event parity). Plus
carry-overs: LLM-rater κ recompute on full 147-entry corpus
(framework-agnostic κ = 0.8820; NIST κ = 1.0000; overall κ = 0.7956;
3 of 5 subsets PASS κ≥0.80); docker/requirements drift CI gate;
GHCR public-flip release-checklist; api-stability.md DRAFT.
Step 5.A pre-release-review batch closes 8 of 10 MEDIUM
findings inline (F-V93-S1 SMTP STARTTLS hardening + F-V93-S3
webhook HMAC replay protection + F-V93-Q1 dead unknown field
removal + F-V93-Q2 AI gov REST router audit-event wiring + F-V93-Q5
new CONMON_DAEMON_POLL_FAILED action + F-V93-Q7 enum compare +
F-V93-Q8 upfront --deployment-status validation + F-V93-Q10
dedup-state corruption backup). The 1 HIGH (F-V93-Q3 race-condition)
closed via documented single-writer contract matching v0.9.0
poam_store / v0.7.9 vendor_store precedent. 18th consecutive
PROCEED-CLEAN of v0.7.x → v0.8.x → v0.9.x line. 2742 tests
passing / 17 skipped across 217 source files; mypy strict 0/0;
ruff clean. PyPI: 7 packages all at 0.9.3 with PEP 740
attestations.
v0.9.2 (May 2026) — CONMON REST parity + LLM rater + federal
corpus. Walk-through-driven refinement delivering HTTP API surface
for continuous-monitoring cadences (4 endpoints under /api/conmon/*
matching CLI parity), LLM-assisted second rater (scripts/llm_rater.py
with temperature-0 deterministic labeling + --rule llm mode in the
κ script), federal calibration corpus (corpus_federal.jsonl —
24 entries spanning FedRAMP ConMon + POA&M + NIST 800-53 CA-7;
total corpus 147 entries), and 10 federal-SI walk-through scenarios
in capability-matrix.md. 17th consecutive PROCEED-CLEAN.
v0.9.1 (May 2026) — Polycentric Labs org migration. Repo
transferred from allenfbyrd/evidentia to polycentric-labs/ evidentia to unblock the org's GitHub Team plan + future federal-
SaaS shipping context. All 7 PyPI Trusted Publisher entries
re-registered. 16th consecutive PROCEED-CLEAN.
v0.9.0 (May 2026) — Federal compliance — POA&M lifecycle +
CONMON cycle calendar + walk-through-as-validation. First minor
of the v0.9.x line. Opens the federal-compliance theme reserved
at v0.8.7 cycle-close. Lands operator-facing surfaces auditors
expect in any regulated-industry GRC tool: Plan-of-Action-and-
Milestones tracking + Continuous Monitoring cycle calendar.
Phase 1 — POA&M data layer: POAMState 5-state enum
(planned / in_progress / overdue / completed / verified)
aligned to FedRAMP POA&M Template Completion Guide v3.0 + NIST
SP 800-53A Rev 5 Appendix F; forward-only state transitions;
Milestone Pydantic record; ControlGap.poam_milestones
optional list (default-empty for v0.7.x + v0.8.x backward-compat);
new evidentia_core.poam sub-package (state.py + milestone.py);
new evidentia_core.poam_store JSON file-store mirroring
v0.7.9 vendor_store; 6 new EventActions.
Phase 2 — evidentia poam CLI (7 verbs: create from gap
report / list / show / update / milestone add|update / delete /
calendar); /api/poam/* FastAPI router (8 endpoints);
evidentia_core.oscal.poam_exporter.gap_report_to_oscal_poam()
emitting OSCAL 1.1.2 plan-of-action-and-milestones JSON with
SHA-256 back-matter integrity (mirrors v0.7.0 finding-resource
embedding). Default severity-filter is CRITICAL + HIGH per FedRAMP
§3.1 auditor-default; --all opts into the full set.
Phase 3 — evidentia_core.conmon pure-function library
with 7 bundled cadences (NIST 800-53 CA-7 monthly + FedRAMP
ConMon × 3 + CMMC L2 triennial + DoD RMF annual + OCC 2026-13a
model-risk annual); evidentia conmon CLI (list / next /
check); 2 new EventActions. No daemon — operators poll.
Plus new operator runbooks (docs/poam-runbook.md +
docs/conmon-runbook.md); 14-item Step 5.A refinement batch
(UUID canonicalization in poam_store + vendor_store via
str(UUID(id)) preventing duplicate-records-per-alias +
non-conformant OSCAL UUID emit; _enum_value extracted to
evidentia_core.models.common; stale-doc refreshes across
governance + config + generation_context references).
15th consecutive PROCEED-CLEAN of v0.7.x → v0.8.x → v0.9.x
line. 2583 tests passing / 17 skipped across 227 source files;
mypy strict 0/0; ruff clean. Phase 4 walk-through deferred to
v0.9.1 per §31.A POA&M-first / walk-through-as-validation
posture.
v0.8.7 (May 2026) — Final v0.8.x wrap-up. Single focused
session closing the v0.8.6 P3 CLI deferral + backfilling
v0.8.6 cycle-close artifacts deferred during single-session
compression. NEW --faithfulness-threshold-mode {framework-aware,fixed} flag on evidentia eval risk- determinism (default framework-aware) closes the v0.8.6
P3 deferral; --faithfulness-threshold default changed from
0.3 → None sentinel for backward-compatible framework-
aware default resolution. Resolution precedence: explicit
value wins → framework-aware mode (extracts framework from
prompt_id; looks up via resolve_threshold(framework, method))
→ fixed mode (0.30 framework-agnostic). 6 v0.8.6 cycle-close
artifacts backfilled (docs/security-review-v0.8.6.md +
docs/v0.8.6-plan.md + threat-model v0.8.6 delta +
capability-matrix v0.8.6 snapshot + README v0.8.6 entry +
ROADMAP v0.8.6 PLANNED → SHIPPED transition). 14th
consecutive PROCEED-CLEAN of v0.7.x → v0.8.x line. 2386
tests passing across 217 source files; mypy strict 0/0; ruff
clean. FINAL v0.8.x patch — v0.9.0 opens with the
federal-compliance theme per the 2026-04-28 §10 Q4 lock-in.
v0.8.6 (May 2026) — CIMD scope enforcement at MCP-
protocol level + Cohen's Kappa rater agreement + per-claim
confidence + framework-aware threshold defaults + v0.7.x
retrospective + v1.0 transition narrative DRAFT. Aggressive
~2-3 week comprehensive scope (single-session compression
matching v0.8.3 + v0.8.4 + v0.8.5 cadence). Closes ALL 3
v0.8.5 carry-overs + 3 cycle-additions. MCP CIMD scope
enforcement at MCP-protocol level: NEW evidentia_mcp.scope
module monkey-binds FastMCP.call_tool (mcp Python SDK 1.27
has no public middleware hook); --default-client-id <slug>
CLI flag; pass-through preserves v0.8.5 default no-gating
behavior; per-call AI_MCP_TOOL_AUTHORIZED /
AI_MCP_TOOL_DENIED audit events; deny paths raise
McpError code -32602. Cohen's Kappa rater agreement
script (scripts/compute_inter_rater_kappa.py): two-rater
file mode + rule-based-rater mode; CI-gateable exit codes;
empirical κ = 0.4848 (moderate) at jaccard threshold 0.85
ships as "single-rater + κ probe inconclusive" per the
documented R3 mitigation. Per-claim bootstrap-resampled
confidence: FaithfulnessResult.confidence field
(default-off cost-aware ~100ms/claim; opt-in via
compute_confidence=True); framework-aware threshold
defaults: DEFAULT_THRESHOLDS_BY_FRAMEWORK_JACCARD map
(NIST 0.60 / FFIEC 0.35 / ISO27001 0.30) +
resolve_threshold(framework, method) helper +
FaithfulnessResult.framework field. docs/v0.7.x- retrospective.md publishes the 18-release v0.7.x cycle
narrative; docs/v1.0-transition.md DRAFT captures v1.0
theme candidates + acceptance gates + open questions for
v0.9.0 cycle-open. Thirteenth consecutive PROCEED-CLEAN of
the v0.7.x → v0.8.x line. 2383 tests passing across 217
source files; mypy strict 0/0; ruff clean. v0.8.7 wrap-up
release closes the v0.8.6 P3 CLI deferral.
v0.8.5 (May 2026) — DFAH CLI flags + corpus expansion +
real-LLM integration tests + MCP CIMD richness. Aggressive
~2-3 week comprehensive scope (single-session compression
matching v0.8.3 + v0.8.4 cadence). Closes ALL 4 v0.8.4
carry-overs in one focused session per Allen's explicit
"implement CIMD now" directive — ending the 5-cycle CIMD
deferral pattern. DFAH faithfulness CLI flags:
evidentia eval risk-determinism --check-faithfulness --faithfulness-threshold N --faithfulness-method {jaccard,semantic} --source-clauses-file <yaml>. Closes the
v0.8.4 P1.2 CLI-surface deferral; pre-condition validation
rejects malformed inputs BEFORE any LLM call fires. DFAH
corpus expansion to 123 entries with per-framework subsets
(NIST 24 / FFIEC 24 / ISO 27001 24). tune_faithfulness_threshold.py --corpus-pattern <glob> for per-framework sweep; empirical
per-framework recommended thresholds documented. Real-LLM
integration tests at tests/integration/test_eval/ opt-in
via EVIDENTIA_LLM_INTEGRATION=1. MCP CIMD richness:
new evidentia_mcp.cimd module ships CIMDDocument (per
RFC 7591 + MCP conventions) + CIMDRegistry (JSON-file-
backed, version-tagged). evidentia mcp serve --cimd-registry <path> flag wires it through stdio + SSE + HTTP transports.
Twelfth consecutive PROCEED-CLEAN of v0.7.x → v0.8.x line.
2338 tests passing across 216 source files; mypy strict
0/0; ruff clean. v0.8.6 reservations: per-tool scope
enforcement at MCP-protocol level + multi-rater corpus pass
- per-claim confidence scoring.
v0.8.4 (May 2026) — G4 Path 2 + DFAHarness wiring.
Aggressive ~2-3 week focused scope (single-session compression
matching v0.8.3 cadence). Closes the v0.8.3 ship-failure root
cause via G4 Path 2 (post-PyPI regeneration in
release.yml — sidesteps cross-platform reproducibility
entirely). New regeneration step BETWEEN Wait-for-PyPI + docker
build runs pip-compile --generate-hashes --no-emit-find-links
against PyPI's just-published wheels → ephemeral
docker/requirements.txt overwrite → docker build picks it
up. Hashes match because pip-compile downloads from PyPI's
bytes in the Linux CI runner — same source as the container
build's pip install. Built-in 3-attempt retry loop with 30s
sleeps absorbs PyPI propagation lag. DFAHarness
check_faithfulness=True wiring closes the v0.8.3 P1.2
deferral: EvalSample.source_clauses field +
EvalResult.faithfulness_results list +
DFAHarness.run(check_faithfulness=, faithfulness_threshold=, faithfulness_method=, claim_extraction_fn=, faithfulness_score_fn=) kwargs.
EventAction.AI_EVAL_FAITHFULNESS_CHECKED +
AI_EVAL_FAITHFULNESS_VIOLATION (reserved-but-inactive in
v0.8.0) ACTIVATED. Mock-callable injection points keep harness
tests cost-zero while exercising real production code paths.
14 new unit tests across 5 test classes. Eleventh consecutive
PROCEED-CLEAN of the v0.7.x → v0.8.x line. 2313 tests passing
across 220+ source files; mypy strict 0/0; ruff clean. MCP
CIMD richness deferred to v0.8.5 (5th cycle-deferral; v0.8.5
re-evaluates with potential formal retirement) + CLI flags +
calibration corpus expansion + real-LLM integration tests
deferred to v0.8.5.
v0.8.3 + v0.8.3.1 hot-fix (May 2026) — Supply-chain G4
attempt + AI-quality completion. Aggressive ~3-week cycle
executed in a single focused session. Closes 5 of 8 v0.8.2
carry-overs (G4 attempt failed first-fire + reverted in
same-day hot-fix). G4 Path 1 ATTEMPTED: Dockerfile flipped
from exact-version pinning to
pip install --require-hashes -r /tmp/requirements.txt against
hash-pinned docker/requirements.txt; release.yml exported
SOURCE_DATE_EPOCH for SOURCE_DATE_EPOCH-driven uv build
reproducibility. First-fire revealed uv build is NOT
byte-identical between Windows local + Linux CI runner even
with same SOURCE_DATE_EPOCH (file-ordering / timestamp-precision
drift). PyPI publish succeeded but container build's
pip install --require-hashes failed: local-computed hashes ≠
Linux-CI-built wheel hashes. v0.8.3.1 hot-fix REVERTED the
Dockerfile to exact-version pinning (same v0.8.2 surface; no
regression); container ship recovered same-day. Recurring
Scorecard PinnedDependencies false-positive cycle continued
(alerts dismissed per the runbook). G4 closure deferred to
v0.8.4 with Path 2 (post-PyPI regeneration; sidesteps cross-
platform reproducibility entirely). F-V82-S1:
bump_version.py --regenerate-requirements auto-detects host
platform; on non-Linux hosts auto-invokes pip-compile inside
Linux base image. F-V82-S2: evidentia eval verify CLI
replaces broad except Exception with specific SigstoreError
subclass catches mapped to distinct exit codes. DFAH
sentence-transformers path (P1.1): opt-in
[eval-faithfulness] extra; default model all-MiniLM-L6-v2
(~90 MB); catches paraphrases that the v0.8.2 Jaccard baseline
misses. LLM atomic-claim extraction (P1.2): new
extract_claims() function decomposes AI-generated artifacts
into atomic verifiable claims for faithfulness scoring.
Calibration corpus (P1.3): 50-entry corpus + threshold-
tuning script empirically guide operators on per-corpus
threshold selection. Tenth consecutive PROCEED-CLEAN of the
v0.7.x → v0.8.x line. 2299 tests passing across 220+ source
files; mypy strict 0/0; ruff clean. MCP CIMD richness
deferred to v0.8.4 (4th cycle-deferral; gated on empirical
operator demand) + DFAHarness check_faithfulness=True wiring
deferred to v0.8.4 polish.
v0.8.2 (May 2026) — Review-deferral closure + supply-chain
hardening + test-quality + DFAH faithfulness. Aggressive ~3-week
cycle executed in a single focused session. Closes 8 reservations
carried out of v0.8.1. F-V81-S1: evidentia mcp serve --allow-root <path> gates file-path tool inputs via
validate_within. F-V81-S2: AuthProvider construction moved
from import-time module-level → FastAPI lifespan event;
imports are side-effect-free. G4 Dockerfile foundation:
docker/requirements.txt regenerated against the v0.8.2 dep
tree with SHA256 hashes per transitive (activation deferred to
v0.8.3 per §25.6 R1 build-determinism). G1 mutmut + G2
hypothesis: mutation-testing baseline + 8 property-based
tests on normalizer + crosswalk. DFAH faithfulness scoring:
second arXiv 2601.15322 metric via stdlib Jaccard token-overlap
(threshold 0.3 default). First-class Sigstore signing for
evidentia eval: --sign / --no-sign flag + evidentia eval verify subcommand. Ninth consecutive PROCEED-CLEAN of
the v0.7.x → v0.8.x line. 2277 tests passing across ~215
source files; mypy strict 0/0; ruff clean. CIMD richness +
sentence-transformers faithfulness + DFAH calibration corpus
deferred to v0.8.3.
v0.8.1 (May 2026) — *Review-deferral close-out + LLM richness
- network surfaces*. Aggressive ~4-week cycle compressed to a
single focused session. Closes ALL 12 v0.8.0-bucketed review
findings (2 HIGH + 4 MEDIUM + 6 LOW polish + 2 INFO). Ships the
LLM-driven richness for the v0.8.0 P0 surfaces:
evidentia eval risk-determinism --context X --gaps Yruns the DFAHarness against the live RiskStatementGenerator; PRT LLM-driven per-claim decomposition replaces the v0.8.0 stub (trace_kind=v0.8.1-llmvs=v0.8.0-stubaudit-log discriminator). Network surfaces:evidentia mcp serve --transport sse|http+ FastAPI AuthProvider middleware (evidentia serve --auth-token-file <path>). Closes the v0.8.0 F-V08-S3/api/metricsauth gate. Eighth consecutive PROCEED-CLEAN of the v0.7.x → v0.8.x line. Three Phase 4 infra primitives (G4 Dockerfile--require-hashes, G1 mutmut, G2 hypothesis) deferred to v0.8.2 per §24.6 R6. 2240 tests passing across 211 source files; mypy strict 0/0; ruff clean.
v0.8.0 (May 2026) — The OSS-native AI moat. First minor
after the v0.7.x cycle close. Lands four AI-quality features
that distinguish a Vanta-class dashboard from a compliance-
engineering tool: DFAH determinism harness (evidentia eval stub-smoke — auditor-defensible numerical proof that AI
artifact generation is reproducible per arXiv 2601.15322),
Policy Reasoning Traces (evidentia risk generate --emit-trace — decomposes risk statements into ordered claims
with policy clause citations per arXiv 2509.23291), MCP
server (evidentia mcp serve — exposes Evidentia to MCP-
aware AI clients over stdio with 4 read-only tools), and
plugin contract scaffolding (evidentia_core.plugins —
4 ABCs + 3 reference implementations + entry-point discovery
for community catalog providers + SI-partner extensions).
M-4 collector base-class refactor consolidates ~60% of the
HTTP scaffolding across the 4 vendor-risk collectors. New
/api/metrics Prometheus endpoint + docs/evidence-integrity.md
operator deployment guidance. Pre-release-review v4 Pre-tag
PROCEED-CLEAN with 5 inline-fixes from the parallel security
- code-quality reviews; 12 findings bucketed to v0.8.1 with documented rationale. 2227 tests passing across 210 source files; mypy strict 0/0; ruff clean.
v0.7.16 (May 2026) — Final v0.7.x cycle release. Closes the v0.7.x cycle (18 patches + 2 hot-fixes over ~12 days; 6 consecutive PROCEED-CLEAN). python-dotenv CVE bump
- commit-msg hook variant of standing-rule sweep + post- ship release.yml hardening validation. v0.8.0 design phase opens immediately post-ship.
v0.7.15 (May 2026) — Tailwind 4 + SettingsPage refactor +
standing-rule pre-commit hook. Final v0.7.x cycle release before
v0.8.0 design opens. Tailwind 3→4 migration (full shadcn/ui
preset rewrite to CSS-first @theme {}; PostCSS chain replaced
with @tailwindcss/vite plugin; tailwindcss-animate v3-era →
tw-animate-css v4-compatible). SettingsPage.tsx refactored to
key-based remount of <SettingsForm/> sub-component; lint rule
react-hooks/set-state-in-effect promoted from warn to
error. New scripts/standing_rule_sweep.sh + pre-commit hook
runs the canonical 21-pattern guard at commit-time. Fifth
consecutive PROCEED-CLEAN /security-review. Post-ship hardening
(commit fd36e78) extends release.yml publish-container Wait
step to all 6 packages — closes the LAST PyPI propagation race
surface. 2120 tests passing across 188 source files.
v0.7.14 (May 2026) — frontend modernization + Codecov
P2.1 RESOLVED + final v0.7.x hygiene + v0.8.0 G4 foundation.
7 of 8 PR #21 frontend major bumps (TypeScript 5→6, ESLint
9→10 flat-config, plugin-react-hooks 5→7, jsdom + minors;
tailwind 3→4 deferred to v0.7.15). 3 deferred v0.7.8 LOWs
closed (Tableau Windows tempfile via TemporaryDirectory,
Databricks LTS env-var, test-coverage gaps). Codecov dashboard
fixed (was 0% since v0.7.10; now 82.14% via removing the
flag_management.individual_flags[].paths glob that filtered
all files out). container-build Wait extended to all 6 packages.
Hash-pinned docker/requirements.txt preview lands as v0.8.0
G4 reproducible-build foundation. Fourth consecutive PROCEED-
CLEAN /security-review. 2120 tests passing across 188 source
files.
v0.7.13 (May 2026) — *dependency modernization + Codecov fix
- P3 carry-over closures + release-notes hygiene*. Wrap-up of
the v0.7.x cycle. No new public surfaces. Codecov coverage
upload fixed (switched to
source_pkgsso Cobertura XML emits full repo-relative paths).release.ymlnow auto-populates the GitHub Release body fromCHANGELOG.mdvia a newextract_changelog_block.pystep +body_patharg — closes the v0.7.5–v0.7.12 stub-body gap structurally; future releases auto-populate. P3 carry-overs closed: M-9 OSCAL UUID conformance + L-2 Vanta/Drata high-risk extended fields + L-4 SIG BYO sparse-row debug logging + 5 of 9 v0.7.8 LOWs. Third consecutive PROCEED-CLEAN /security-review. Plus 10 historical release-body backfills landed retroactively. 2100 tests passing across 188 source files.
v0.7.12 (May 2026) — concrete cloud-WORM backends + FAIR
Monte Carlo + GDPR purge-flow + alert-zero. Adds the three cloud
backends to the WORMBackend ABC introduced in v0.7.11:
S3ObjectLockWORM, AzureImmutableBlobWORM, GCSBucketLockWORM
(installed via evidentia[worm-s3] / [worm-azure] /
[worm-gcs] extras). Adds FAIR Monte Carlo simulation
(risk quantify --method fair-mc) using stdlib-only Beta-PERT
sampling. Adds GDPR Article 17 purge-flow (purge_immediately +
force_gdpr_purge operator override). Plus 3 cloud-WORM operator
runbooks, alert-zero closure (CodeQL custom sanitizer pack
registers validate_within as a path-injection sanitizer),
bump_version.py inter-package pin tightening, and
release-checklist Steps 5.5 + 9.5 doc-consistency + release-notes
practices. Second consecutive PROCEED-CLEAN /security-review.
2075 tests passing across 188 source files.
v0.7.11 (May 2026) — *audit chain-of-custody + governance trio
- Open FAIR + 6-store harmony*. Adds the
evidentia retentionCLI (set / list / show / extend / transition / delete / report) with 10-regime classification (SEC 17a-4 / FINRA 3110 / IRS / SOX / HIPAA / GLBA / PCI / SR 11-7 / GDPR / generic) and aWORMBackendABC with aLocalFilesystemWORMreference impl. Adds KRI/KPI/KGI metrics overlay (P1.5 G3), Open FAIR risk quantification (P1.5 G4 deterministic PERT-mean), and process-as-code governance workflows (P1.5 G5). Closes 9 of 17 v0.7.10 P3 deferrals including F-V10-S2 ($EDITORallowlist). First v0.7.x PROCEED-CLEAN /security-review (0 findings). 1929 tests passing across 184 source files.
v0.7.9 + v0.7.10 (May 2026) — industry overlay (financial
services TPRM + model risk + governance primitives) + federal-
compliance carry-overs. v0.7.9 ships evidentia tprm (vendor
inventory + DD-questionnaire generator with 5 output formats
including SIG BYO + caiq-full + concentration-report) + 4
vendor-risk SaaS collectors (Vanta / Drata / BitSight /
SecurityScorecard) + OSCAL TPRM emit. v0.7.10 adds the model-
risk overlay (SR 11-7 / OCC 2011-12 model inventory + validation
report templates) + 7 new bundled catalogs (5 FFIEC IT Handbook
booklets + OCC 2011-12 / FRB SR 11-7 + FFIEC CAT). Bundled
catalog count: 82 → 89.
v0.7.8 (May 2026) — cloud data-warehouse collectors + BI
integrations. Adds two read-only evidence collectors for cloud
data warehouses (Databricks workspace API + Snowflake
account_usage views; mapped to NIST 800-53 controls AC-2 / AC-3
/ AC-6 / AC-7 / AU-2 / AU-3 / IA-2 / IA-5 / SC-7 / SC-12 / SI-2)
plus the first output integrations to enterprise BI platforms
(Tableau Server / Cloud + Power BI). Three published datasets per
BI platform: gap inventory, NIST SP 800-30 risk register with AI-
provenance fields, and CollectionContext audit trail. Ships
walkthrough docs (docs/cloud-dw-collectors.md,
docs/bi-integrations.md) and an end-to-end Meridian-with-BI
demo.
v0.7.7 (May 2026) — SQL family evidence collectors. Five
read-only relational-DB adapters ([sql-postgres], [sql-mysql],
[sql-sqlite], [sql-mssql], [sql-oracle]) mapping DB-resident
compliance evidence to NIST 800-53 controls. Plus ServiceNow
output integration carry-forward.
See CHANGELOG.md for the full version history
(v0.1.0 through v0.9.9). For forward direction, see
docs/v0.8.0-plan.md (the OSS-native AI
moat — DFAH + PRT + MCP + plugin contracts) and
docs/ROADMAP.md (everything else).
-
Gap analysis against 89 bundled frameworks across four redistribution tiers:
-
Tier A — US federal (25 frameworks, verbatim public domain): NIST 800-53 Moderate sample, 800-171 Rev 2/Rev 3, 800-172, CSF 2.0, AI RMF 1.0, Privacy Framework 1.0, SSDF 800-218; FedRAMP Rev 5 Low/Moderate/High/LI-SaaS baselines; CMMC 2.0 Levels 1/2/3; HIPAA Security/Privacy/Breach Notification Rules; GLBA Safeguards, NY DFS 500, NERC CIP v7, FDA 21 CFR Part 11, IRS 1075, CMS ARS, FBI CJIS v6, CISA Cross-Sector CPGs.
-
Tier A — International (6 frameworks): UK NCSC CAF 3.2, UK Cyber Essentials, Australian Essential Eight, Australian ISM, Canada ITSG-33, New Zealand NZISM.
-
Tier D — Statutory obligations (21 frameworks, government edicts, uncopyrightable): EU GDPR, EU AI Act, EU NIS2, EU DORA, UK DPA 2018, Canada PIPEDA, plus all 15 comprehensive US state privacy laws (CA CCPA/CPRA, VA, CO, CT, UT, TX, OR, DE, MT, IA, FL, TN, NH, MD, MN).
-
Tier C — Licensed stubs (20 frameworks): ISO/IEC 27001:2022, 27002:2022, 27017, 27018, 27701, 42001 (AI), 22301 (BC); PCI DSS v4.0.1; HITRUST CSF v11; COBIT 2019; SWIFT CSCF 2024; CIS Controls v8.1 plus 5 CIS Benchmarks (AWS, Azure, GCP, Kubernetes, RHEL 9); Secure Controls Framework 2024; IEC 62443; SOC 2 TSC. Copyrighted authoritative text isn't bundled — ships with public clause numbering plus a
evidentia catalog importhook for your licensed copy. -
Tier B — Threat and vulnerability catalogs (4 frameworks): MITRE ATT&CK Enterprise (41 techniques), MITRE CWE Top 25 (2024), MITRE CAPEC sample, CISA KEV sample (Log4Shell, MOVEit, EternalBlue, etc).
-
-
Six bundled crosswalks: NIST CSF 2.0 → 800-53, FedRAMP Moderate → CMMC L2, NIST 800-53 → HIPAA Security, ISO 27001 → NIST 800-53, VCDPA → CCPA/CPRA, NIST 800-53 → SOC 2 TSC.
-
Multi-format inventory parsing. Load your controls from YAML, CSV, JSON (including OSCAL component-definition), or any format with fuzzy-matched column headers. Status normalization handles "implemented", "partial", "planned", "in progress", "missing", etc.
-
Cross-framework crosswalk engine. Bidirectional mapping index: ask "what NIST 800-53 controls satisfy my SOC 2 criterion?" or "what CMMC Level 2 controls match my FedRAMP Moderate posture?". v0.2.0 ships six bundled crosswalks (118 mappings total). Custom crosswalks are drop-in JSON files in
catalogs/data/mappings/. -
Prioritized gap reports. Severity by implementation state, effort-weighted priority scores, efficiency opportunities (controls that close gaps in 2+ frameworks simultaneously), and a prioritized remediation roadmap.
-
Five output formats: JSON (canonical), CSV (flat), Markdown (human review), OSCAL Assessment Results (for audit handoff and tool interop), and SARIF 2.1.0 (
--format sarif, v0.10.0 — runs gap analysis as a CI gate, surfaced in GitHub code scanning and GitLab security dashboards). -
AI risk statement generator. NIST SP 800-30 Rev 1 compliant risk statements from gaps + system context. Uses Instructor to enforce the
RiskStatementPydantic schema on LLM output, with automatic retries on validation failure. Works with any LiteLLM-supported model. -
Typer + Rich CLI with
init,catalog list/show/crosswalk/import/ where/license-info/remove,gap analyze,gap diff,risk generate,explain control,collect aws,collect github,integrations jira,oscal verify(v0.7.0 — verifies SHA-256 digests + GPG.ascand/or Sigstore.sigstore.jsonsignatures),serve(web UI),doctor, andversioncommands.catalog listsupports--tierand--categoryfilters;catalog importaccepts direct JSON or an OSCAL profile (via--profile <profile.json> --catalog <source.json>). Global flags:--offline(air-gap mode),--json-logs(ECS 8.11 structured output for SIEM ingestion),--config <path>,--verbose,--quiet. -
3292 tests passing + 17 environmental skips (verified locally on Windows; full suite runs on Linux CI per the v0.9.9 baseline) covering models, catalog loading (with a parametric smoke test per bundled framework), recursive enhancement flattener for NIST Rev 5 3-level IDs, tier invariants, OSCAL profile resolution, user-import directory precedence, crosswalk bidirectionality, multi-format inventory parsing, severity calculation, all five report exporters (incl. OSCAL Assessment Results + SARIF 2.1.0), Jira + ServiceNow + Tableau + Power BI push/sync, the full evidence-collector surface (AWS Config
- Security Hub + IAM Access Analyzer + GitHub branch protection +
CODEOWNERS + Dependabot + Okta + Vanta + Drata + BitSight +
SecurityScorecard + Databricks + Snowflake + SQL family), FastAPI
/api/*endpoints, air-gap mode, OSCAL AR digest + GPG + Sigstore round-trip verification, POA&M state-machine transitions, CONMON cadence enforcement (incl. the CONMON daemon), DFAH faithfulness scoring (Jaccard + sentence-transformers semantic + LLM atomic-claim extraction), multi-tenant RBAC, WORM evidence-store invariants, MCP CIMD scope gating, OCSF mapping, the federal-SI walk-through scenarios, Hypothesis property tests, and 3 trestle conformance tests against the NIST OSCAL reference impl.
- Security Hub + IAM Access Analyzer + GitHub branch protection +
CODEOWNERS + Dependabot + Okta + Vanta + Drata + BitSight +
SecurityScorecard + Databricks + Snowflake + SQL family), FastAPI
Setting expectations matters. v0.7.x → v0.9.x has been a steady cadence of enterprise hardening (SLSA L3 build provenance + cosign keyless signing + PEP 740 attestations + OpenSSF Scorecard / OSV gates + Silver-tier badge), AI features (DFAH faithfulness scoring, AI governance, MCP CIMD scope enforcement), federal compliance (POA&M state machine + CONMON cadences + CONMON daemon + WORM evidence store + multi-tenant RBAC + OSCAL 1.2.1), and integration-surface expansion (Okta, Vanta, Drata, ServiceNow, BitSight, SecurityScorecard, Databricks, Snowflake, SQL-family, Tableau, Power BI). v0.10.0 added the OCSF-aligned findings schema
- SARIF 2.1.0 CI-gate export. See
CHANGELOG.mdanddocs/ROADMAP.mdfor the full per-release history. The following are still on the roadmap but not yet shipped:
- Multi-modal LLM evidence validation (Phase 3 / v1.x) — "is
this screenshot actually proof of MFA?" scoring, freshness
detection, multi-modal validation via Document Screenshot
Embedding (DSE). DFAH textual-claim faithfulness scoring shipped
in v0.8.2–v0.8.6 (Jaccard + sentence-transformers semantic + LLM
atomic-claim extraction + Cohen's Kappa rater agreement); DSE
itself remains academic-only and is tracked in
docs/positioning-and-value.md§13. - Additional cloud collectors — Azure + GCP evidence collectors. Carried forward as optional / community-driven items.
- Authoritative control text for copyrighted frameworks (ISO
27001/27002, SOC 2 TSC, PCI DSS, HITRUST CSF, etc.) — ships as
Tier-C stubs with public clause numbering only. Use
evidentia catalog importto load your own licensed copy. (Design decision, not a roadmap item — the project will never bundle copyrighted text.)
- Python 3.12+
- uv 0.4+ (recommended) or pip
pip install evidentiaThis installs the evidentia and cb CLI commands, plus the five workspace
sub-packages as transitive dependencies (evidentia-core, evidentia-ai,
evidentia-collectors, evidentia-integrations, evidentia-api).
git clone https://github.com/polycentric-labs/evidentia.git
cd evidentia
uv sync --all-packagesThis downloads Python 3.12 (if needed), creates a .venv, and installs all
8 workspace packages in editable mode (as of v0.10.5: evidentia-core,
evidentia-ai, evidentia-eval, evidentia-collectors,
evidentia-integrations, evidentia-api, evidentia-mcp, and the
evidentia meta-package).
uv run pytest tests/ -q
# Expected: full suite passes in ~10s on a warm checkoutHit a snag? See
docs/troubleshooting.mdfor common first-run issues — wrong Python version, missing SPA bundle, Sigstore TUF metadata fetch failures, Docker bind-mount permissions.Want the absolute shortest path? See
docs/quickstart.md— five commands frompip installto a verified OSCAL Assessment Results document.
evidentia serve brings up a FastAPI + React SPA on
http://127.0.0.1:8000. Five interactive surfaces ship today,
mirroring the CLI 1:1:
| Page | What it does |
|---|---|
| Home | Three-path onboarding (sample data / upload / wizard) |
| Frameworks | Browse all 89 bundled catalogs with tier + category filters |
| Gap Analyze | Form + framework picker → TanStack Table results |
| Gap Diff | Two-report classification + PR-comment markdown export |
| Risk Generate | Streamed AI risk statements per gap |
See docs/gui/README.md for a per-page
walkthrough + accessibility notes + troubleshooting.
Evidentia ships with a realistic fictional fintech scenario in
examples/meridian-fintech/. Walk through it in five steps:
# 1. Verify installation
uv run evidentia doctor
# 2. Explore available frameworks
uv run evidentia catalog list
# 3. Inspect a specific control
uv run evidentia catalog show nist-800-53-mod --control SI-4
# 4. See how one framework maps to another
uv run evidentia catalog crosswalk \
--source nist-800-53-mod --target soc2-tsc --control AC-2
# 5. Run gap analysis on the Meridian Financial sample inventory
cd examples/meridian-fintech
uv --project ../.. run evidentia gap analyze \
--inventory my-controls.yaml \
--frameworks nist-800-53-mod,soc2-tsc \
--output report.md --format markdown \
--min-efficiency-frameworks 2Expected output: a 17-gap report against 28 required controls, 39.3% coverage, 11 critical / 5 high / 1 medium severities, with the top of the priority queue dominated by monitoring/detection gaps (CC7.1, CC7.2, SI-4, AU-6).
v0.7.0 ships a composite GitHub Action that turns every PR into a
compliance check. It runs evidentia gap analyze, diffs against the
main-branch baseline, posts a sticky PR comment with the diff, and gates
merge on regressions.
# .github/workflows/compliance.yml
name: Compliance check
on:
pull_request:
branches: [main]
push:
branches: [main]
permissions:
contents: read
pull-requests: write
jobs:
compliance:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
with: { fetch-depth: 2 }
- uses: polycentric-labs/evidentia/.github/actions/gap-analysis@v0
with:
inventory: inventory.yaml
frameworks: nist-800-53-rev5-moderate,soc2-tsc
github-token: ${{ secrets.GITHUB_TOKEN }}See .github/actions/gap-analysis/README.md
for the full input/output surface, OSCAL AR + Sigstore signing options,
SHA-pinned variants for audit pipelines, and the migration guide from
the legacy standalone allenfbyrd/evidentia-action@v1 (now archived).
Requires an LLM API key. Any LiteLLM-supported provider works:
export OPENAI_API_KEY=sk-... # or ANTHROPIC_API_KEY, etc.
uv --project ../.. run evidentia risk generate \
--context system-context.yaml \
--gaps report.json \
--model gpt-4o \
--output risks.json \
--limit 5This produces five validated RiskStatement objects (NIST SP 800-30 structure)
for the five highest-priority gaps.
# From an empty directory
uv run evidentia init
# Creates:
# evidentia.yaml — config with defaults
# my-controls.yaml — template control inventory
# system-context.yaml — template system context
# .evidentia/ — local storageEdit my-controls.yaml with your real inventory and run evidentia gap analyze.
Evidentia is a uv workspace monorepo of six composable Python packages plus a React/Vite frontend workspace:
| Package | Role |
|---|---|
evidentia-core |
Pydantic data models, OSCAL catalog loader, crosswalk engine, gap analyzer |
evidentia-ai |
LiteLLM + Instructor client, risk statement generator, control explainer |
evidentia-collectors |
Evidence collection agents — AWS (Config + Security Hub), GitHub (branch protection + CODEOWNERS), Okta (SSO + MFA), Databricks (Unity Catalog + clusters), Snowflake (LOGIN_HISTORY + grants + masking), 5 SQL adapters (Postgres / MySQL / SQLite / MSSQL / Oracle), 4 vendor-risk APIs (Vanta / Drata / BitSight / SecurityScorecard) |
evidentia-integrations |
Jira push + bidirectional status sync, ServiceNow ticket sync, Tableau + Power BI publish |
evidentia-api |
FastAPI server (26 REST routes across 12 router modules) that bundles the React SPA for evidentia serve |
evidentia |
CLI meta-package: Typer/Rich entry points (evidentia + cb alias) |
evidentia-ui (non-Python) |
Vite + React 18 + shadcn/ui frontend; built bundle is copied into evidentia-api at wheel time |
The 6 v0.5.1 controlbridge-* deprecation shims published in v0.6.0
were removed at v0.7.0 per the public migration contract. Existing
v0.5.1 installs continue to work; future releases no longer produce
shim wheels.
┌─────────────────┐ ┌─────────────────┐ ┌────────────────────┐
│ my-controls.yaml│ │ OSCAL catalogs │ │ framework mappings │
│ .csv │ │ (77 bundled; │ │ (crosswalks) │
│ .json │ │ manifest-driven) │ └──────────┬───────┘
└────────┬────────┘ └────────┬────────┘ │
│ ▼ ▼
│ ┌──────────────────────────────────────┐
└──────────▶│ GapAnalyzer │
│ normalize → match → score → rank │
└──────────────────┬───────────────────┘
│
┌──────────────────┴───────────────────┐
│ │
▼ ▼
┌──────────────────────┐ ┌──────────────────────┐
│ GapAnalysisReport │ │ RiskStatementGen │
│ (JSON/CSV/MD/OSCAL) │ │ (NIST SP 800-30) │
└──────────────────────┘ └──────────┬───────────┘
│
▼
┌──────────────────────┐
│ LiteLLM+Instructor │
│ (any LLM provider) │
└──────────────────────┘
- Pydantic v2 everywhere with
ConfigDict(use_enum_values=True, extra="forbid", str_strip_whitespace=True). Structured data, strict validation, JSON-roundtripping for free. - OSCAL as the lingua franca. Inputs parse OSCAL catalogs and component-definitions. Outputs include OSCAL Assessment Results. Your data is portable.
- Instructor for AI structured output. LLMs return raw text; Instructor enforces a Pydantic schema and automatically retries on validation failure. No regex parsing of LLM output.
- Hatchling build backend with
[tool.hatch.build.targets.wheel] packages = ["src/..."]and[dependency-groups] dev = [...](the modern uv spec, not the deprecated[tool.uv] dev-dependencies).
See docs/ROADMAP.md for the detailed version-level plan.
Summary below.
- Core data models
- OSCAL catalog loader + crosswalk engine
- Multi-format inventory parser
- Gap analyzer with priority scoring
- Report exporters (JSON/CSV/Markdown/OSCAL-AR)
- AI risk statement generator
- CLI (init, catalog, gap, risk, doctor)
- Sample data + end-to-end walkthrough
- Phase 1.5 (v0.2.0 big-bang): exhaustive framework expansion
— full upstream NIST 800-53 Rev 5 OSCAL (~1189 controls + Low/Mod/High/Privacy baselines),
NIST 800-171 r2/r3, 800-172, CSF 2.0, AI RMF, SSDF, Privacy Framework;
FedRAMP Rev 5 baselines; CMMC 2.0 L1/L2/L3; CJIS, CISA CPGs, HIPAA,
GLBA, NY DFS 500, NERC CIP, FDA 21 CFR Pt 11, IRS 1075, CMS ARS;
EU GDPR/AI Act/NIS2/DORA, UK NCSC CAF, Essential Eight, ACSC ISM,
Canada ITSG-33/PIPEDA, NZISM; 15 US state privacy laws; Tier-C stubs
for ISO 27001/27002/27017/27018/27701/42001/22301/9001, SOC 2 TSC,
PCI DSS 4.0, HITRUST, COBIT, SWIFT CSCF, CIS Controls + Benchmarks,
SCF, IEC 62443; MITRE ATT&CK, CWE, CAPEC, CISA KEV;
evidentia catalog importfor user-licensed Tier-C content; GitHub Actions refresh CI for upstream change detection.
-
evidentia gap diff— classify gaps opened / closed / severity-changed / unchanged -
--fail-on-regressionfor CI integration -
evidentia explain <control_id>— LLM-generated plain-English control translations - Three realistic example scenarios (Meridian fintech, Acme Healthtech, Northstar DoD)
- FastAPI REST server (
evidentia serve) — 26/api/*routes across 12 router modules - React + Vite + shadcn/ui web UI (WCAG 2.1 AA via Radix primitives)
- Air-gapped mode (
--offlineflag +doctor --check-air-gapvalidator) - Reusable GitHub Action (
allenfbyrd/evidentia-action@v1)
- Base collector architecture with
check_connection(),collect(),get_supported_controls() - AWS collector — Config rules + Security Hub (FSBP / CIS)
- GitHub collector — branch protection, CODEOWNERS, visibility
- Jira integration — push gaps as issues + bidirectional status sync
- ControlBridge → Evidentia across code, PyPI, GitHub, docs
- v0.5.1 deprecation shims for the six old PyPI names
- SHA-256 digest per evidence item in OSCAL AR exports
- Optional GPG signing of the AR document (air-gap path)
- Sigstore/Rekor signing of the AR (online path, OIDC-based)
- CycloneDX SBOM on every release
- PyPI Trusted Publisher (OIDC) + PEP 740 attestations
- OSCAL conformance via
compliance-trestleround-trip in CI - AWS IAM Access Analyzer + GitHub Dependabot collectors
- ECS-8.11 / NIST-AU-3 / OpenTelemetry structured logs
- Consolidated GitHub Action at
.github/actions/gap-analysis/ - Tamper-evident audit trail for external-auditor review
-
GenerationContextmetadata on every AI-generated artifact (sibling ofCollectionContext) - 9 new
evidentia.ai.*EventActionentries for ECS-structured AI audit events - Typed exception hierarchy in
evidentia_ai.exceptions— closes BLOCKER B3 forrisk_statements/+explain/ - Bounded retry via
with_retry_async+build_retrying/build_async_retryingagainst sharedLLM_TRANSIENT_EXCEPTIONSset -
run_id-correlated audit trails across AI generated/failed/retry/cache_hit/batch_completed events - Best-effort operator identity via
evidentia_ai.client.get_operator_identity()— closes NIST AU-3 "Identity" gap for AI artifacts - 116+ net new tests across
test_ai/,test_audit/,test_models/
- OpenSSF Scorecard weekly workflow (
.github/workflows/scorecard.yml) publishing tosecurityscorecards.dev - Cursor + VS Code workspace config (
.vscode/{4 files}+.cursorrules+.editorconfig) for testing/validation inline -
docs/ide-setup.mdwalkthrough — pytest discovery, mypy strict, ruff format-on-save, coverage gutters, 7 debug configs, 16 pre-canned tasks - Catalog-drift detector fix — pinned
yaml.safe_dump(width=200)for byte-stable manifest emit +--ignore-all-spaceworkflow guard (closes issues #1-#4) - Pre-release-review refinements — 4 MEDIUM doc/config polish fixes (DORA past-tense, doc stamp date, Windows venv path, regen stderr warning)
- Risk-statement quality validator (NIST SP 800-30 / IR 8286 scoring + auto-regeneration)
- Additional collectors — IAM Access Analyzer, Dependabot, Okta, Azure, GCP
- Additional integrations — ServiceNow, Vanta, Drata
- Compliance ROI scoring ("close N gaps across M frameworks with one remediation")
- Auto-generated TypeScript types from FastAPI OpenAPI schema
- Tauri desktop packaging for offline-first users
- Evidence-to-control relevance scoring (is this screenshot actually proof of MFA?)
- Freshness / staleness detection per framework (SOC 2 = 90 days, NIST = 365)
- Multi-modal validation (PDFs, screenshots, log exports, JSON)
- Coverage heatmaps
- Multi-user auth / RBAC for network deployments (localhost-only today)
- Multi-tenant database backend (PostgreSQL)
- Cost analytics (LLM spend per control / per framework)
- Plugin system for custom collectors
- OSCAL catalog marketplace / community contributions
- Integration with policy-as-code tools (OPA, Cedar)
- Terraform provider for compliance-as-code
See Evidentia-Architecture-and-Implementation-Plan.md
for the full canonical plan (~318 KB) including all code sketches, data
flows, and technology rationales.
Evidentia/
├── packages/
│ ├── evidentia-core/ # Pydantic models, catalogs, gap analyzer
│ ├── evidentia-ai/ # LiteLLM client, risk generator, explain
│ ├── evidentia-collectors/ # AWS (Config + Security Hub), GitHub
│ ├── evidentia-integrations/ # Jira (push + sync)
│ ├── evidentia-api/ # FastAPI REST server + bundled SPA
│ ├── evidentia/ # CLI meta-package (Typer entry points)
│ └── evidentia-ui/ # Vite + React + shadcn/ui frontend
├── tests/
│ ├── fixtures/ # Sample inventories + recorded fixtures
│ ├── unit/ # Unit tests (per-package subtrees)
│ └── integration/ # CLI + examples smoke tests
├── examples/
│ ├── meridian-fintech/ # Realistic fintech walkthrough
│ ├── acme-healthtech/ # HIPAA-focused scenario
│ └── northstar-systems/ # DoD / CMMC scenario
├── docs/
│ ├── ROADMAP.md # Version-level plan
│ ├── air-gapped.md # `--offline` mode guide
│ ├── architecture/ # Deep-dive docs
│ ├── github-action/ # Reusable action docs
│ └── gui/ # Web UI guide
├── .github/
│ ├── workflows/test.yml # CI: pytest matrix + ruff + mypy
│ ├── workflows/release.yml # Auto-release on main-branch deploys
│ └── ISSUE_TEMPLATE/ # Bug report / feature request
└── pyproject.toml # uv workspace root
uv run pytest tests/ -q # All tests
uv run pytest tests/unit/ -v # Unit tests with verbose output
uv run pytest tests/unit/test_gap_analyzer/ # One subpackage- Drop an OSCAL catalog JSON file in
packages/evidentia-core/src/evidentia_core/catalogs/data/<framework-id>.json. - Register its metadata in
catalogs/registry.pyunderFRAMEWORK_METADATA. - Optionally add crosswalks in
catalogs/data/mappings/. - Run
evidentia catalog list— your framework should appear.
- Python 3.12+ syntax:
str | None,list[str],from datetime import UTC from __future__ import annotationsat the top of every module- Ruff + mypy (configured in
pyproject.toml)
Phases 1, 1.5, 2 (Jira + AWS + GitHub), and Accessible GRC (v0.4.x web UI
- air-gap mode) are shipped. High-value contribution areas:
- Additional crosswalks — especially ISO 27001 ↔ NIST 800-53 and PCI DSS ↔ SOC 2
- Queued collectors — IAM Access Analyzer, Dependabot, Okta, Azure, GCP
- Queued integrations — ServiceNow, Vanta, Drata
- Evidence chain of custody (v0.7.0) — SHA-256 digests + GPG signing of OSCAL AR exports
- Risk-statement quality validation — NIST SP 800-30 / IR 8286 scoring of AI output
- Production OSCAL catalogs — drop-in JSON files from upstream sources
- Test coverage — edge cases in CSV header matching, OSCAL parsing, and air-gap guard
Please do not open a public GitHub issue for security concerns.
See SECURITY.md for the disclosure process —
GitHub Private Vulnerability Reporting is the preferred channel;
email is documented as a backup. The policy also covers the
supported-version table, scope, disclosure timeline, and
supply-chain provenance verification.
Every release ships with cryptographic provenance: PEP 740
attestations on every wheel + sdist (Sigstore + Rekor), CycloneDX
1.6 SBOM attached to each GitHub
Release.
Verification command in SECURITY.md.
Evidentia stands on the shoulders of excellent open-source projects:
- NIST OSCAL — the structured data standard that makes framework interop possible
- Pydantic — type-safe data models without the boilerplate
- LiteLLM — unified LLM access across every provider
- Instructor — structured output extraction from LLMs
- Typer and Rich — the CLI is only as good as the framework
- uv — Python packaging that finally feels modern
This project was developed alongside AI platforms.
Models used: Claude Opus 4.6, Claude Opus 4.7, Sonar Deep Research