rayyagari2-create · May 22, 2026
diff --git a/docs/adapter-designs/DESIGN-01.md b/docs/adapter-designs/DESIGN-01.md
@@ -350,7 +350,7 @@ means no observed violation. It does not mean no violation occurred.
 | Surface | Value | Notes |
 |---|---|---|
 | contextInjectionSurface | `rules_file` + `agents_md` | .cursor/rules/*.mdc + AGENTS.md |
-| eventSurface | `hook_rich` | 18 hook events including subagentStart/Stop |
+| eventSurface | `hook_rich` | ~21 hook events including subagentStart/Stop; includes Tab hooks and workspaceOpen |
 | controlSurface | `tool_level` | failClosed: true blocks on hook failure |
 | approvalSurface | `pre_tool` + `pre_session` | AWF intake gate + per-tool via hooks |
 | artifactSurface | `git_diff` + `pr_url` + `test_results` + `handoff_note` | Cloud agent REST API artifacts |
@@ -361,15 +361,21 @@ means no observed violation. It does not mean no violation occurred.
 | sandboxSurface | `sandbox_mode` + `worktree` | --sandbox + --worktree flags |
 | trustSubjectType | `agent` | .cursor/agents/ definitions |
 | enforcementDepth | `tool_level` | Event-rich, comparable to Claude Code |
-| currentEvidenceStrength | E1 | Artifact level until hooks wired to AWF |
-| targetEvidenceStrength | E3 | Local hooks / E2 cloud agents (SSE) |
+| currentEvidenceStrength | per-mode (see below) | Mode A local: E1 until hooks wired. Mode B cloud: E1 before adapter / E2 after Gate 2 SSE verification. Mode C self-hosted: E1. Mode D human: E1. |
+| targetEvidenceStrength | per-mode (see below) | Mode A local: E3 (conditions in DESIGN-06 §8). Mode B cloud: E2 hard ceiling. Mode C self-hosted: E3 pending worker hook test. Mode D human: E3-observed, not E3-enforced. |
 | requiresRuntimeHookInstall | true | AWF hooks deployed to .cursor/hooks.json |
-| supportsFailClosed | true | failClosed: true blocks on hook failure |
+| supportsFailClosed | true | Requires `failClosed: true` on each blocking hook entry; the default is fail-open. E3 claims require Cursor build >= 2026-05-14 patched release. See DESIGN-06 §8. |
 
 **Notes:**
 - Cloud agent outbound webhooks: "coming soon." Use SSE stream as substitute.
 - Cursor is classified event-rich/controlled. Closer to Claude Code than Codex.
 
+**Known gaps:**
+- Fail-closed requires explicit per-hook opt-in; default is fail-open.
+- E3 claims require Cursor build >= 2026-05-14 patched release.
+- SDK has no canUseTool-equivalent; hooks are the only synchronous enforcement path.
+- Enterprise audit log is admin-action only, not a tool-call audit surface.
+
 ---
 
 ### Devin

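The two fail-closed gaps in the DESIGN-01 hunk above can be read as a precondition check. The following is a minimal sketch, not AWF or Cursor code: the hook-entry shape (a dict with a `blocking` key) and the function name are hypothetical, and only the `failClosed` flag and the 2026-05-14 build date come from the table. It covers the two listed prerequisites only; DESIGN-06 §8 may impose further conditions.

```python
from datetime import date

# From the table: E3 claims require the 2026-05-14 patched release or later.
MIN_E3_BUILD = date(2026, 5, 14)


def meets_e3_prerequisites(hooks, build_date):
    """Check the two documented prerequisites for an E3 claim.

    `hooks` is a list of dicts. The `blocking` key is a hypothetical
    marker for hooks that gate a tool call; `failClosed` mirrors the
    per-hook flag named in the design table.
    """
    if build_date < MIN_E3_BUILD:
        return False
    blocking = [h for h in hooks if h.get("blocking", False)]
    if not blocking:
        # No blocking hooks wired means there is nothing to enforce.
        return False
    # The default is fail-open, so a missing flag counts as not fail-closed.
    return all(h.get("failClosed", False) is True for h in blocking)


hooks = [
    {"name": "pre-tool-check", "blocking": True, "failClosed": True},
    {"name": "telemetry", "blocking": False},
]
print(meets_e3_prerequisites(hooks, date(2026, 5, 14)))  # True
print(meets_e3_prerequisites(hooks, date(2026, 5, 13)))  # False
```

Treating a missing `failClosed` as `False` follows the "default is fail-open" note, so an adapter using a check like this would refuse the E3 claim rather than assume enforcement.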
diff --git a/docs/adapter-designs/DESIGN-02.md b/docs/adapter-designs/DESIGN-02.md
@@ -79,7 +79,7 @@ OpenClaw (sub-agents via ACP)
 
 **subject_key pattern:**
 ```
-{runtime}::subagent::{parent_agent_id}::{task_class}::{session_id}
+{runtime}::subagent::{parent_agent_id}::{task_class}::{workspace}::{repo}::{session_id}
 ```
 
 **Examples:**
@@ -148,7 +148,7 @@ No internal agent decomposition is visible to AWF.
 
 **subject_key pattern:**
 ```
-{runtime}::session::{workspace}::{task_type}::{playbook_or_config}
+{runtime}::session::{workspace}::{repo}::{task_type}::{playbook_or_config}
 ```
 
 **Examples:**
@@ -215,7 +215,7 @@ not an autonomous agent's behavior.
 
 **subject_key pattern:**
 ```
-{runtime}::human::{human_actor_id}::{workspace}::{task_type}
+{runtime}::human::{human_actor_id}::{workspace}::{repo}::{task_type}
 ```
 
 **Examples:**
@@ -348,11 +348,11 @@ the same subject_key.
 | Trust type | Stable components | Unstable (exclude) |
 |---|---|---|
 | agent | runtime, "agent", agent_name, workspace | session_id, run_id |
-| subagent | runtime, "subagent", parent_agent_id, task_class, session_id | run_id |
-| role_profile | runtime, role, workspace, task_type, risk_lane, sandbox_mode, skill_set | session_id |
-| session | runtime, "session", workspace, task_type, playbook | session_id, run_id |
+| subagent | runtime, "subagent", parent_agent_id, task_class, workspace, repo, session_id | run_id |
+| role_profile | runtime, role, workspace, repo, task_type, risk_lane, sandbox_mode, skill_set | session_id |
+| session | runtime, "session", workspace, repo, task_type, playbook | session_id, run_id |
 | graph_node | "langgraph", graph_id, node_name, workspace | invocation_id |
-| human_runtime | runtime, "human", human_actor_id, workspace, task_type | session_id |
+| human_runtime | runtime, "human", human_actor_id, workspace, repo, task_type | session_id |
 | task | runtime, "task", workspace, task_class, work_item_id | run_id |
 
 ---
@@ -409,7 +409,7 @@ confidence in a STANDARD tier (not enough sessions to be certain).
    An agent-srv with HIGH trust on Claude Code starts PROVISIONAL on Codex.
 
 2. **Trust subjects are runtime-scoped.**
-   `claude_code::agent::agent-srv::ruvoni` and `codex::worker::ruvoni::backend::medium-risk::workspace-write`
+   `claude_code::agent::agent-srv::ruvoni::family-trip-ai` and `codex::worker::ruvoni::family-trip-ai::backend::medium-risk::workspace-write::backend-skill`
    are different trust subjects even if they represent the same "backend agent concept."
 
 3. **Workspace isolation is enforced.**
@@ -433,6 +433,11 @@ confidence in a STANDARD tier (not enough sessions to be certain).
 trust subject was created. It covers agent instruction files, role TOML configs,
 Devin playbook content and skill_set composition.
 
+config_hash covers agent/role/playbook definition, allowed tools, skill set,
+sandbox mode, and governance-relevant runtime config. Changing allowed tools
+or sandbox mode resets the confidence band even if the prompt definition is
+unchanged.
+
 **When config_hash changes:**
 - The trust_subject_id remains the same (history is preserved)
 - confidence_band resets to LOW unless the change is verified cosmetic
@@ -449,8 +454,10 @@ Devin playbook content and skill_set composition.
 
 ## Open Questions
 
-1. Should subagent trust_subject_id accumulate across sessions for named subagents in Claude Code?
-   Currently specified as session-scoped. Needs decision.
+1. OQ-1 RESOLVED: Subagent trust remains session-scoped by default. If a runtime exposes
+   persistent named subagents with stable definitions and reliable lifecycle events, AWF may
+   promote them to agent trust subjects in a future adapter version. Until then, subagents are
+   contributing subjects only, not long-lived primary trust subjects.
 2. For Codex role_profile subjects, if the .codex/agents/*.toml file is modified mid-sprint,
    should config_hash check fire before or after session start?
 3. LangGraph: should graph_node trust_subject_id be per-graph version or per-graph-name?
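
The config_hash rule added in this diff (changing allowed tools or sandbox mode resets the confidence band even when the prompt definition is unchanged) can be illustrated as follows. This is a sketch under stated assumptions: SHA-256 over canonical JSON is an illustrative choice, not something DESIGN-02 specifies, and the function names, field names, and the `MEDIUM` band label are hypothetical.

```python
import hashlib
import json


def compute_config_hash(definition, allowed_tools, skill_set,
                        sandbox_mode, runtime_config):
    """Hash every governance-relevant input, not just the prompt text."""
    payload = {
        "definition": definition,
        # Sort list inputs so that reordering alone does not change the hash.
        "allowed_tools": sorted(allowed_tools),
        "skill_set": sorted(skill_set),
        "sandbox_mode": sandbox_mode,
        "runtime_config": runtime_config,
    }
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def confidence_band_after_check(stored_hash, new_hash, current_band,
                                verified_cosmetic=False):
    """Keep the band if nothing material changed; otherwise reset to LOW."""
    if new_hash == stored_hash or verified_cosmetic:
        return current_band
    return "LOW"


base = dict(
    definition="You are the backend agent.",
    allowed_tools=["read", "edit"],      # hypothetical tool names
    skill_set=["backend-skill"],
    sandbox_mode="workspace-write",
    runtime_config={},
)
stored = compute_config_hash(**base)
changed = compute_config_hash(**{**base, "sandbox_mode": "read-only"})

# Same prompt definition, different sandbox mode: the band resets.
print(confidence_band_after_check(stored, changed, "MEDIUM"))  # LOW
```

The trust subject's id is untouched in this sketch; only the band is recomputed, which matches the "history is preserved" rule.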

diff --git a/schemas/README.md b/schemas/README.md
@@ -3,13 +3,14 @@
 Generic, product-agnostic JSON Schemas for the Agentic Workforce Framework.
 All schemas are written for **AJV with JSON Schema Draft 2020-12**.
 
-These schemas describe five governance artifacts that any single-workspace
+These schemas describe six governance artifacts that any single-workspace
 deployment must produce and consume:
 
 - **AgentTaskManifest** the dispatch contract. No manifest = no dispatch.
 - **QAVerdict** structured QA result with defect classification and trust impact.
 - **FailureRecord** entry in the self-learning failure library (17-class taxonomy).
 - **TrustScore** D1-D4 session score plus the 8-dimension long-term profile.
+- **TrustSubject** the accountable identity that AWF scores: agent, subagent, role_profile, session, graph_node, human_runtime, or task (DESIGN-02).
 
 ## v1 schemas (current)
 
@@ -19,6 +20,7 @@ deployment must produce and consume:
 | [`v1/qa-verdict.schema.json`](v1/qa-verdict.schema.json) | `QAVerdict` | Structured QA verdict with per-finding evidence and trust delta |
 | [`v1/failure-record.schema.json`](v1/failure-record.schema.json) | `FailureRecord` | 17-class taxonomy, recurrence count, prevention artifacts, agents involved |
 | [`v1/trust-score.schema.json`](v1/trust-score.schema.json) | `TrustScore` | D1-D4 session score plus 8-dimension continuous profile, trust tier, confidence band |
+| [`v1/trust-subject.schema.json`](v1/trust-subject.schema.json) | `TrustSubject` | Accountable trust subject (agent, subagent, role_profile, session, graph_node, human_runtime, task) with subject_key construction rules, config_hash, archive lifecycle (DESIGN-02) |
 | [AgentSpawnSidecar](v1/agent-spawn-sidecar.schema.json) | Hook-readable spawn authorization record. Written by the Orchestrator before Agent tool call. Validated by PreToolUse hook. The enforcement artifact for agent spawn governance. |
 
 > **Schema dependency:** The AgentSpawnSidecar schema is the

diff --git a/schemas/v1/trust-subject.schema.json b/schemas/v1/trust-subject.schema.json
@@ -0,0 +1,117 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "https://github.com/agentic-workforce-framework/schemas/v1/trust-subject.schema.json",
+  "title": "TrustSubject",
+  "description": "Accountable trust subject for every unit of agentic work. The trust subject is what AWF actually scores, replacing the V4.3 assumption that every runtime exposes a stable agent identity. The subject_type is determined by what the runtime exposes (agent, subagent, role_profile, session, graph_node, human_runtime, or task), not by what we would prefer it to expose. Trust subjects are runtime-scoped, workspace-isolated, and immutable once created (archive, never delete). See DESIGN-02 (Trust Subject Model) for the full specification.",
+  "version": "1.0",
+  "type": "object",
+  "required": [
+    "id",
+    "subject_type",
+    "runtime_provider",
+    "subject_key",
+    "workspace_id",
+    "created_at"
+  ],
+  "properties": {
+    "id": {
+      "type": "string",
+      "format": "uuid",
+      "description": "Immutable UUID primary key. Once assigned, this value never changes. Failure records and capability profiles reference this id. To retire a subject, set archived_at — do not delete."
+    },
+    "subject_type": {
+      "type": "string",
+      "enum": [
+        "agent",
+        "subagent",
+        "role_profile",
+        "session",
+        "graph_node",
+        "human_runtime",
+        "task"
+      ],
+      "description": "Which of the seven trust subject types this row represents. Determined by what the runtime exposes: agent (named agent with stable identity, e.g. Claude Code, Cursor, OpenClaw), subagent (spawned within a session), role_profile (Codex worker/explorer/custom roles), session (Devin, Multica — runtime exposes no internal agent structure), graph_node (LangGraph node/edge with embedded AWF governance), human_runtime (human + runtime pair for IDE-assisted work), task (lowest-granularity fallback when no higher subject type is identifiable)."
+    },
+    "runtime_provider": {
+      "type": "string",
+      "enum": [
+        "claude_code",
+        "codex",
+        "cursor",
+        "openclaw",
+        "devin",
+        "multica",
+        "langgraph"
+      ],
+      "description": "Runtime that produces sessions for this subject. Trust does not transfer across runtimes — the same logical agent on a new runtime starts PROVISIONAL. Implementations may extend this enum when onboarding additional runtimes; doing so does not loosen the per-runtime isolation rule."
+    },
+    "subject_key": {
+      "type": "string",
+      "pattern": "^[a-z0-9_-]+(::[a-z0-9_-]+)+$",
+      "maxLength": 200,
+      "description": "Stable, deterministic identity string for this subject. Same runtime + same config MUST produce the same subject_key. Construction rules: (1) all lowercase; (2) components separated by '::'; (3) no spaces, use hyphens within a component; (4) include only components stable across sessions for this subject_type; (5) maximum 200 characters. Patterns by subject_type: agent: {runtime}::agent::{agent_name}::{workspace}::{repo}; subagent: {runtime}::subagent::{parent_agent_id}::{task_class}::{workspace}::{repo}::{session_id}; role_profile: codex::{role}::{workspace}::{repo}::{task_type}::{risk_lane}::{sandbox_mode}::{skill_set}; session: {runtime}::session::{workspace}::{repo}::{task_type}::{playbook_or_config}; graph_node: langgraph::{graph_id}::{node_name}::{workspace}; human_runtime: {runtime}::human::{human_actor_id}::{workspace}::{repo}::{task_type}; task: {runtime}::task::{workspace}::{task_class}::{work_item_id}. Unique per (workspace_id, runtime_provider, subject_key)."
+    },
+    "workspace_id": {
+      "type": "string",
+      "format": "uuid",
+      "description": "Workspace this subject belongs to. Trust is workspace-isolated: history accumulated in workspace A does not count toward workspace B, even for the same subject_key string."
+    },
+    "repo": {
+      "type": ["string", "null"],
+      "description": "Repository scope for this subject. Required for subject_types whose subject_key includes {repo} (agent, subagent, role_profile, session, human_runtime) when the workspace contains multiple repos, because trust must not bleed across repos. Null for subject_types where repo is not part of the key (graph_node, task)."
+    },
+    "task_type": {
+      "type": ["string", "null"],
+      "description": "Task classification this subject is scoped to (e.g. frontend-bugfix, auth-change, database-migration). Part of the subject_key for role_profile, session, and human_runtime types. A subject trusted for one task_type starts PROVISIONAL for a new task_type — see cross-runtime trust rule 4 in DESIGN-02."
+    },
+    "risk_lane": {
+      "type": ["string", "null"],
+      "description": "Risk lane this subject operates in (e.g. low-risk, medium-risk, high-risk, restricted). Part of the subject_key for role_profile so that the same role at different risk lanes has separate trust history."
+    },
+    "sandbox_mode": {
+      "type": ["string", "null"],
+      "description": "Sandbox/permission mode the subject runs under (e.g. read-only, workspace-write, full-access). Part of the trust identity for role_profile — a workspace-write role and a read-only role are different subjects even with otherwise identical configuration."
+    },
+    "skill_set": {
+      "type": ["array", "null"],
+      "items": { "type": "string" },
+      "description": "Skill bundle composition for this subject (e.g. ['frontend-skill'], ['awf-risk-plan']). Part of the subject_key for role_profile. Changes to skill_set composition are a material configuration change and trigger config_hash recomputation."
+    },
+    "human_actor_id": {
+      "type": ["string", "null"],
+      "format": "uuid",
+      "description": "Required and non-null when subject_type is 'human_runtime' — identifies the human driving the IDE-assisted session. Null for all other subject_types. Trust for human_runtime subjects tracks human+runtime productivity patterns and is used for analytics, not autonomy gating."
+    },
+    "config_hash": {
+      "type": ["string", "null"],
+      "description": "Hash of the underlying configuration (agent instruction file, role TOML, Devin playbook content, allowed tools, skill_set composition, sandbox mode, and governance-relevant runtime config) captured at subject creation. Recomputed by the adapter at each session start and compared. When it changes: trust_subject_id remains stable, confidence_band resets to LOW unless the change is verified cosmetic (whitespace, comments, formatting), and a new entry is logged in subject metadata."
+    },
+    "subject_version": {
+      "type": ["string", "null"],
+      "description": "Human-readable version label for the subject configuration. Bumped for cosmetic changes that keep config_hash unchanged, and also recorded alongside config_hash changes for audit readability."
+    },
+    "archived_at": {
+      "type": ["string", "null"],
+      "format": "date-time",
+      "description": "ISO 8601 timestamp when this subject was retired. Subjects are never deleted — archiving preserves history and keeps existing failure records and capability profiles intact. Null while the subject is active."
+    },
+    "created_at": {
+      "type": "string",
+      "format": "date-time",
+      "description": "ISO 8601 timestamp when the subject row was created."
+    }
+  },
+  "additionalProperties": false,
+  "$defs": {
+    "trustLevel": {
+      "type": "string",
+      "enum": ["PROVISIONAL", "RESTRICTED", "STANDARD", "HIGH", "PROBATION"],
+      "description": "Autonomy tier derived from total D1-D4 score across recent sessions, attached to the trust subject (not to a session). PROVISIONAL (<60): read-only analysis, every task needs human approval — new subjects always start here. RESTRICTED (60-74): draft plans and draft PRs only, human reviews before merge, 3+ sessions to exit. STANDARD (75-89): PR creation allowed after CI, no high-risk files without approval, 5+ sessions at STANDARD to reach HIGH. HIGH (90-100): low-risk task lane, human merge approval still required (Invariant 5 — never waived). PROBATION (any session <40): immediate demotion regardless of prior tier, 3 consecutive clean sessions required to exit. Lives on trust_capability_profiles, not on the trust_subject row itself — defined here for cross-schema reuse."
+    },
+    "evidenceStrength": {
+      "type": "string",
+      "enum": ["E0", "E1", "E2", "E3"],
+      "description": "Strength of evidence backing a trust assessment. E0: post-hoc / session-outcome only (Devin, Multica when AWF runs above it). E1: session-outcome plus surfaced telemetry. E2: per-decision evidence from runtime hooks. E3: full per-action evidence including pre-tool and post-tool hook coverage. Lives on trust_capability_profiles — defined here for cross-schema reuse."
+    }
+  }
+}
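
The `subject_key` construction rules in the schema above (lowercase, `::` separators, hyphens instead of spaces, 200-character cap) can be sketched as a small builder. This is an illustration, not part of the PR: the helper name is hypothetical, and normalizing input (lowercasing, replacing spaces) rather than rejecting it is an assumption. The regular expression and the length limit are copied from the schema.

```python
import re

# Pattern and limit copied from v1/trust-subject.schema.json.
SUBJECT_KEY_PATTERN = re.compile(r"^[a-z0-9_-]+(::[a-z0-9_-]+)+$")
MAX_SUBJECT_KEY_LENGTH = 200


def build_subject_key(*components):
    """Join components with '::' and validate against the schema rules.

    Normalization here (lowercase, spaces to hyphens) is an assumption;
    an adapter could equally reject non-conforming input outright.
    """
    parts = [c.strip().lower().replace(" ", "-") for c in components]
    key = "::".join(parts)
    if len(key) > MAX_SUBJECT_KEY_LENGTH:
        raise ValueError(f"subject_key exceeds {MAX_SUBJECT_KEY_LENGTH} characters")
    if not SUBJECT_KEY_PATTERN.fullmatch(key):
        raise ValueError(f"subject_key does not match schema pattern: {key!r}")
    return key


# Agent pattern: {runtime}::agent::{agent_name}::{workspace}::{repo}
print(build_subject_key("claude_code", "agent", "agent-srv", "ruvoni", "family-trip-ai"))
# claude_code::agent::agent-srv::ruvoni::family-trip-ai
```

The pattern requires at least two components, so a bare runtime name is rejected, as is any component that is empty or contains characters outside `a-z`, `0-9`, `_`, and `-`.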