Problem
Vault currently uses a single shared token (VAULT_TOKENS env var) for the entire team. Any team member (or compromised agent) with the token has full access to all decryption operations. There is no way to:
- Identify which user made a request
- Limit a user's access scope
- Revoke a single user's access without rotating the shared token
User Flows
Flow 1: Initial Setup (Fresh Install)
install.sh creates default role/token config files, adds the user to the docker group, and installs the runevault CLI alias.
install.sh
├── create vault-roles.yml (default roles: admin, agent)
├── create vault-tokens.yml (empty: tokens: [])
├── usermod -aG docker $SUDO_USER
├── add 'runevault' alias to user's shell profile
├── configure docker-compose volumes
└── print instructions: "Run 'runevault token issue' to create your first token"
# After install, admin issues the first token:
$ runevault token issue --user alice --role agent --expires 90d
The admin's auth is SSH access to the host. No admin token — if you can SSH in and run runevault, you are the admin.
Flow 2: Issue Token for Team Member
$ runevault token issue --user alice --role agent --expires 90d
Token issued for 'alice':
Role: agent
Scope: get_public_key, decrypt_scores, decrypt_metadata
Top-K: 5
Expires: 2026-06-18
Token: evt_7f3a9c...
⚠ This token will NOT be shown again. Share it securely with the user.
What happens internally:
- The runevault alias runs docker exec rune-vault python /app/vault_admin.py token issue ...
- vault_admin.py inside the container sends an HTTP POST to the internal unix socket
- Vault admin server validates the request and generates evt_ + secrets.token_hex(16)
- Updates the in-memory token store immediately
- Asynchronously writes the updated state to vault-tokens.yml
- Returns the token to the CLI, where it is printed once and not stored in logs
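The token generation and expiry steps above can be sketched as follows. This is a minimal illustration, not the actual vault_admin.py code: the function names, the store keyed by username, and the restriction of --expires to the "<N>d" form are all assumptions made for the example.

```python
import secrets
from datetime import date, timedelta

TOKEN_PREFIX = "evt_"


def generate_token() -> str:
    # 16 random bytes rendered as 32 hex characters, prefixed with "evt_"
    return TOKEN_PREFIX + secrets.token_hex(16)


def parse_expiry(spec: str, today: date) -> date:
    # Hypothetical parser: only the "<N>d" form used in the examples is handled
    if not spec.endswith("d") or not spec[:-1].isdigit():
        raise ValueError(f"unsupported duration: {spec!r}")
    return today + timedelta(days=int(spec[:-1]))


def issue_token(store: dict, user: str, role: str, expires: str, today: date) -> str:
    # Assumption: one active token per user, so the store is keyed by username
    token = generate_token()
    store[user] = {
        "token": token,
        "role": role,
        "created": today,
        "expires": parse_expiry(expires, today),
    }
    return token
```

With today set to 2026-03-20, a 90d expiry lands on 2026-06-18, which matches the dates used in the examples in this issue.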
Flow 3: Team Member Configures Client
After receiving token from admin:
# In their Claude Desktop client config
{
"servers": {
"rune-vault": {
"url": "https://vault.example.com:50051",
"token": "evt_7f3a9c..."
}
}
}
No change to the client-side UX — same single token field.
Flow 4: API Request with Per-User Token
Client (alice) → gRPC: DecryptScores(token=evt_7f3a9c..., top_k=3)
│
├── validate_token()
│ ├── lookup token → found: user=alice, role=agent
│ ├── check expiry → OK (2026-06-18)
│ ├── check top_k → request(3) ≤ role_limit(5) → OK
│ └── check rate_limit → 12/30 in window → OK
│
├── derive agent_id from user ("alice") instead of token hash
│ └── agent_dek = HMAC-SHA256(master_key, "alice")
│
└── return DecryptScoresResponse(results=[...])
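A rough sketch of the validate_token() check order (lookup → expiry → top_k → rate limit) is shown below. Status codes are returned as plain strings so the example runs without the grpc package. The Role and TokenRecord shapes, the retry_after callback, the inclusive expiry date, and the handling of a missing role are assumptions for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Role:
    top_k: int


@dataclass
class TokenRecord:
    user: str
    role: str
    expires: date


def validate_token(tokens, roles, token, top_k, today, retry_after=None):
    """Return ("OK", user) or (status, detail) for the first failing check."""
    record = tokens.get(token)
    if record is None:
        return ("UNAUTHENTICATED", "Invalid authentication token")
    # Assumption: the token is still valid on its expiry date
    if today > record.expires:
        return ("UNAUTHENTICATED", f"Token expired for user '{record.user}'")
    role = roles.get(record.role)
    if role is None:
        # Assumption: the issue does not name a status for a deleted role
        return ("UNAUTHENTICATED", "Invalid authentication token")
    if top_k > role.top_k:
        detail = f"top_k {top_k} exceeds limit {role.top_k} for role '{record.role}'"
        return ("INVALID_ARGUMENT", detail)
    if retry_after is not None:
        # retry_after(user) is a hypothetical hook: 0 means allowed,
        # otherwise the number of seconds the caller should wait
        wait = retry_after(record.user)
        if wait > 0:
            return ("RESOURCE_EXHAUSTED", f"Rate limit exceeded. Retry after {wait}s")
    return ("OK", record.user)
```

Returning the username on success lets the caller derive agent_id from the user rather than from the token.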
Flow 5: Rate Limit Hit
Client (alice) → 31st request in 60s window
│
├── validate_token()
│ ├── lookup token → found: user=alice, role=agent
│ ├── check rate_limit → agent: 30/60s
│ └── ✗ 31 > 30 in current window
│
└── return RESOURCE_EXHAUSTED: "Rate limit exceeded. Retry after 23s"
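One way to implement the 30/60s limit is a per-user fixed window, sketched below. The issue only says "current window", so the fixed-window algorithm, the class name, and the explicit now argument (used here so the behaviour is easy to test) are assumptions.

```python
import math
import re


def parse_rate_limit(spec: str):
    # "30/60s" -> (30, 60): at most 30 requests per 60-second window
    match = re.fullmatch(r"(\d+)/(\d+)s", spec)
    if match is None:
        raise ValueError(f"unsupported rate limit: {spec!r}")
    return int(match.group(1)), int(match.group(2))


class FixedWindowLimiter:
    def __init__(self, limit: int, window_s: int):
        self.limit = limit
        self.window_s = window_s
        self._windows = {}  # user -> (window_start, request_count)

    def check(self, user: str, now: float) -> int:
        """Return 0 if the request is allowed, else seconds until the window resets."""
        start, count = self._windows.get(user, (now, 0))
        if now - start >= self.window_s:
            start, count = now, 0  # previous window is over, start a new one
        if count >= self.limit:
            return math.ceil(start + self.window_s - now)
        self._windows[user] = (start, count + 1)
        return 0
```

If alice's window opened at t=0 and she has already made 30 requests, a 31st request at t=37 is rejected with a retry hint of 23 seconds, as in the flow above.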
Flow 6: Token Expiry
Client (alice, expired) → gRPC: GetPublicKey(token=evt_7f3a9c...)
│
├── validate_token()
│ ├── lookup token → found: user=alice
│ └── ✗ expires=2026-06-18, now=2026-07-01
│
└── return UNAUTHENTICATED: "Token expired for user 'alice'"
Admin must reissue:
$ runevault token revoke --user alice
$ runevault token issue --user alice --role agent --expires 90d
Flow 7: Revoke Token (Team Member Leaves)
$ runevault token revoke --user alice
Revoked token for 'alice'.
# Takes effect immediately — no restart needed.
What happens:
- docker exec runs vault_admin.py, which sends an HTTP DELETE to the internal unix socket
- Vault removes alice from the in-memory token store immediately
- Asynchronously writes the updated state to vault-tokens.yml
- Alice's agent_id derived keys remain valid for previously encrypted metadata (the data does not become inaccessible; alice simply cannot perform new operations)
Flow 8: List Tokens
$ runevault token list
USER ROLE TOP_K RATE EXPIRES
alice agent 5 30/60s 2026-06-18
bob agent 5 30/60s 2026-09-01
Token values are never shown in list output.
Flow 9: Role Management
Roles are managed via the runevault role subcommand.
# List roles
$ runevault role list
ROLE SCOPE TOP_K RATE
admin get_public_key,decrypt_scores,decrypt_metadata 10 60/60s
agent get_public_key,decrypt_scores,decrypt_metadata 5 30/60s
# Create a custom role
$ runevault role create --name researcher --scope get_public_key,decrypt_scores --top-k 3 --rate-limit 10/60s
Role 'researcher' created.
# Update a role
$ runevault role update --name agent --top-k 8
Role 'agent' updated. Changes take effect immediately for all tokens with this role.
# Delete a role
$ runevault role delete --name researcher
Role 'researcher' deleted.
⚠ Tokens assigned to this role will fail validation until reassigned.
What happens internally:
- vault_admin.py sends an HTTP request to the internal unix socket
- Admin server updates the in-memory role store immediately
- Asynchronously writes the updated state to vault-roles.yml
- Role changes take effect immediately for all tokens assigned to that role
Default roles (admin, agent) are created at install time and can be modified but not deleted.
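The role rules above (defaults can be modified but not deleted, and updates apply immediately) could look roughly like this in the in-memory role store. The RoleStore class, its method names, and the exception types are hypothetical and chosen only for the sketch.

```python
DEFAULT_ROLES = frozenset({"admin", "agent"})
ROLE_FIELDS = {"scope", "top_k", "rate_limit"}


class RoleStore:
    def __init__(self, roles=None):
        # Copy the specs so callers cannot mutate the store by accident
        self._roles = {name: dict(spec) for name, spec in (roles or {}).items()}

    def create(self, name, scope, top_k, rate_limit):
        if name in self._roles:
            raise ValueError(f"role '{name}' already exists")
        self._roles[name] = {"scope": list(scope), "top_k": top_k, "rate_limit": rate_limit}

    def update(self, name, **changes):
        if name not in self._roles:
            raise KeyError(name)
        unknown = set(changes) - ROLE_FIELDS
        if unknown:
            raise ValueError(f"unknown role fields: {sorted(unknown)}")
        # Tokens look roles up by name, so the change is visible immediately
        self._roles[name].update(changes)

    def delete(self, name):
        if name in DEFAULT_ROLES:
            raise PermissionError(f"default role '{name}' cannot be deleted")
        del self._roles[name]

    def get(self, name):
        # Returns None for a deleted role; validation of its tokens then fails
        return self._roles.get(name)
```

Because tokens reference a role by name instead of copying its limits, runevault role update --name agent --top-k 8 needs no per-token rewrite.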
Design Decisions
Memory-first token and role management
Both token and role changes take effect immediately in memory. The files (vault-tokens.yml, vault-roles.yml) serve as the single source of truth (SSOT) for startup and recovery, but runtime changes flow through the Admin HTTP API → in-memory store → async file persist. This eliminates the need for container restarts, which is critical for security incident response (immediate revocation).
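The memory-first pattern can be sketched with a background writer thread: the mutation is visible to readers as soon as the call returns, and the file write happens afterwards. The MemoryFirstStore class is hypothetical, and json.dumps stands in for a YAML serializer (which would be passed in through the serialize parameter) so that the example needs only the standard library.

```python
import json
import queue
import threading


class MemoryFirstStore:
    def __init__(self, path, serialize=json.dumps):
        self.path = path
        self._serialize = serialize  # assumption: a YAML dumper would go here
        self._data = {}
        self._lock = threading.Lock()
        self._pending = queue.Queue()
        threading.Thread(target=self._writer, daemon=True).start()

    def set(self, key, value):
        with self._lock:
            self._data[key] = value
            # Enqueue under the lock so snapshots are written in mutation order
            self._pending.put(dict(self._data))

    def delete(self, key):
        with self._lock:
            self._data.pop(key, None)
            self._pending.put(dict(self._data))

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def flush(self):
        # Block until every queued snapshot has been written
        self._pending.join()

    def _writer(self):
        while True:
            snapshot = self._pending.get()
            try:
                # Written in place: renaming over a single bind-mounted file
                # may not work inside a container
                with open(self.path, "w") as f:
                    f.write(self._serialize(snapshot))
            finally:
                self._pending.task_done()
```

A revocation is therefore effective as soon as delete() returns, even if the file write is still queued.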
Admin API: container-internal unix socket
The admin HTTP server listens on a unix socket inside the container only (/var/run/vault-admin.sock). It is NOT mounted to the host. Access path:
SSH → host shell → runevault alias → docker exec → vault_admin.py → curl → unix socket (internal)
Why no admin token:
- SSH access to the host is the authentication boundary
- docker exec requires docker group membership (set up by install.sh)
- The unix socket is not exposed outside the container
- Three layers of protection (SSH + docker group + container isolation) make an admin token redundant
runevault CLI alias
install.sh adds a bash alias to the admin user's shell profile:
alias runevault='docker exec rune-vault python /app/vault_admin.py'
- No host Python dependency: Python runs inside the container
- No host-side files to manage: vault_admin.py lives in the container image
- docker exec requires docker group membership, which install.sh configures via usermod -aG docker $SUDO_USER
Config files as SSOT for persistence
Token and role storage uses YAML config files (vault-tokens.yml, vault-roles.yml), not SQLite or other DB. Reasons:
- Human-readable, diff-able, git-trackable (with token values excluded)
- Docker-native (volume mount)
- Vault stays as stateless as possible — only FHE keys and these configs are persistent state
- Token count is small (<50), change frequency is low (monthly)
- Files are loaded at startup to populate in-memory stores; async-written after each change
Two default roles: admin and agent
Rune plugins need encrypt + decrypt capabilities together — the plugin captures organizational context (encrypt with public key) and retrieves it (decrypt scores + metadata). An "encrypt-only" or "score-only" role has no practical use case. The meaningful access boundary is:
- admin: System management + all Vault operations
- agent: Standard Vault operations (all 3 gRPC methods, with lower top_k and rate limits)
- Custom roles can be created via runevault role create for specific needs
No token hashing at rest
Tokens are stored as plaintext in the config file. File permissions (600) protect them at rest. TLS protects them in transit. The Vault host is assumed to be an admin-only zone — if an attacker can read the config file, they already have host-level access.
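Since the plaintext token file relies on mode 600, the writer should enforce that mode itself and not depend on the process umask. A small POSIX-only sketch follows; the helper names are hypothetical.

```python
import os
import stat


def write_private(path: str, text: str) -> None:
    # The mode passed to os.open only applies when the file is created,
    # so fchmod is used to tighten a file that already existed
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        os.fchmod(f.fileno(), 0o600)
        f.write(text)


def is_private(path: str) -> bool:
    # True when only the owner can read and write the file
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600
```

A startup check such as is_private("vault-tokens.yml") could be used to warn when the file has been left with looser permissions.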
Agent ID derived from username, not token
Currently agent_id = sha256(token)[:32]. After this change, agent_id = sha256(username)[:32]. This means:
- Token rotation doesn't change agent_id (metadata DEK stays consistent)
- User identity is stable across token reissues
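The derivation can be illustrated in a few lines. The issue gives agent_id = sha256(username)[:32] and, in Flow 4, agent_dek = HMAC-SHA256(master_key, "alice"). The UTF-8 encoding, the use of the hex digest for slicing, and the function names are assumptions made for this sketch.

```python
import hashlib
import hmac


def agent_id_for(username: str) -> str:
    # Assumption: the first 32 characters of the hex-encoded SHA-256 digest
    return hashlib.sha256(username.encode("utf-8")).hexdigest()[:32]


def agent_dek_for(master_key: bytes, username: str) -> bytes:
    # HMAC-SHA256 keyed with the master key over the username (32-byte result)
    return hmac.new(master_key, username.encode("utf-8"), hashlib.sha256).digest()
```

Neither function takes the token as input, so revoking and reissuing alice's token leaves both her agent_id and her metadata DEK unchanged.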
Architecture
Vault host machine
├── vault-roles.yml ← async-persisted by server
├── vault-tokens.yml ← async-persisted by server
├── .env ← TLS config (mode 600)
├── docker-compose.yml
└── rune-vault container
├── /app/vault_admin.py ← CLI, called via docker exec
├── /app/vault-roles.yml ← volume mount (read-write)
├── /app/vault-tokens.yml ← volume mount (read-write)
├── /var/run/vault-admin.sock ← internal unix socket (NOT mounted)
├── gRPC server (0.0.0.0:50051) ← client-facing
└── Admin HTTP server ← unix socket only (container-internal)
├── POST /tokens (issue)
├── DELETE /tokens/{user} (revoke)
├── GET /tokens (list)
├── POST /roles (create)
├── PUT /roles/{name} (update)
├── DELETE /roles/{name} (delete)
└── GET /roles (list)
Admin access flow:
Admin (SSH) → runevault token issue --user alice --role agent
→ docker exec rune-vault python /app/vault_admin.py token issue --user alice --role agent
→ vault_admin.py: curl --unix-socket /var/run/vault-admin.sock POST /tokens {...}
→ Admin HTTP handler: generate token, update memory, async persist
→ return token to stdout
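The admin path can be prototyped end to end with the standard library. The sketch below serves the three token endpoints on a unix socket and talks to them through an http.client subclass. The handler, the module-level TOKENS dict, the JSON payload shapes, and the HTTP status codes are assumptions, and role endpoints and file persistence are left out. It requires a platform with AF_UNIX sockets.

```python
import http.client
import json
import secrets
import socket
import socketserver
from http.server import BaseHTTPRequestHandler

TOKENS = {}  # user -> {"token": ..., "role": ...}


class AdminHandler(BaseHTTPRequestHandler):
    def log_message(self, fmt, *args):
        pass  # unix-socket peers have no (host, port), so skip default logging

    def _reply(self, status, payload):
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        if self.path != "/tokens":
            return self._reply(404, {"error": "not found"})
        length = int(self.headers.get("Content-Length", 0))
        req = json.loads(self.rfile.read(length))
        token = "evt_" + secrets.token_hex(16)
        TOKENS[req["user"]] = {"token": token, "role": req["role"]}
        self._reply(201, {"user": req["user"], "token": token})

    def do_GET(self):
        if self.path != "/tokens":
            return self._reply(404, {"error": "not found"})
        # Token values are deliberately left out of the listing
        listing = [{"user": u, "role": r["role"]} for u, r in TOKENS.items()]
        self._reply(200, {"tokens": listing})

    def do_DELETE(self):
        prefix = "/tokens/"
        user = self.path[len(prefix):] if self.path.startswith(prefix) else ""
        if TOKENS.pop(user, None) is None:
            return self._reply(404, {"error": "not found"})
        self._reply(200, {"revoked": user})


class AdminServer(socketserver.ThreadingMixIn, socketserver.UnixStreamServer):
    daemon_threads = True


class UnixHTTPConnection(http.client.HTTPConnection):
    def __init__(self, socket_path):
        super().__init__("localhost")  # host is only used for the Host header
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)


def admin_request(socket_path, method, path, payload=None):
    conn = UnixHTTPConnection(socket_path)
    body = json.dumps(payload) if payload is not None else None
    headers = {"Content-Type": "application/json"} if body else {}
    conn.request(method, path, body=body, headers=headers)
    resp = conn.getresponse()
    data = json.loads(resp.read())
    conn.close()
    return resp.status, data
```

Here the server would be started with AdminServer(path, AdminHandler).serve_forever() in a background thread, and the CLI side would call, for example, admin_request(path, "POST", "/tokens", {"user": "alice", "role": "agent"}).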
Config File Formats
vault-roles.yml
roles:
admin:
scope: [get_public_key, decrypt_scores, decrypt_metadata, manage_tokens]
top_k: 10
rate_limit: 60/60s
agent:
scope: [get_public_key, decrypt_scores, decrypt_metadata]
top_k: 5
rate_limit: 30/60s
vault-tokens.yml
tokens:
- user: alice
token: evt_7f3a9c1e2b4d6f8a0c2e4b6d8f0a1c2e
role: agent
created: 2026-03-20
expires: 2026-06-18
- user: bob
token: evt_def456789abc012def456789abc012de
role: agent
created: 2026-03-20
expires: 2026-09-01
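At startup these two files populate the in-memory stores. The sketch below turns already-parsed documents into typed records. It assumes the YAML has been parsed into plain dicts by some YAML library (none is named in this issue) and that created and expires arrive as date objects. The Role and TokenRecord classes and the loader names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Role:
    scope: tuple
    top_k: int
    rate_limit: str


@dataclass(frozen=True)
class TokenRecord:
    user: str
    token: str
    role: str
    created: date
    expires: date


def load_roles(doc: dict) -> dict:
    # doc is the parsed vault-roles.yml: {"roles": {name: {...}}}
    return {
        name: Role(tuple(spec["scope"]), int(spec["top_k"]), str(spec["rate_limit"]))
        for name, spec in doc["roles"].items()
    }


def load_tokens(doc: dict) -> dict:
    # doc is the parsed vault-tokens.yml; a fresh install has "tokens: []"
    by_token = {}
    for entry in doc.get("tokens") or []:
        record = TokenRecord(**entry)
        if record.token in by_token:
            raise ValueError(f"duplicate token value for user '{record.user}'")
        by_token[record.token] = record  # indexed by token value for fast lookup
    return by_token
```

Indexing by token value keeps the lookup on the gRPC path to a single dictionary access, while the role is resolved by name at validation time.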
Requirements
runevault CLI setup
- install.sh: usermod -aG docker $SUDO_USER for docker group access
- install.sh: Add alias runevault='docker exec rune-vault python /app/vault_admin.py' to user's shell profile (~/.bashrc or ~/.zshrc)
- runevault command availability
In-container admin utility (vault_admin.py)
Token management:
- runevault token issue --user <name> --role <role> [--expires <duration>]
- runevault token revoke --user <name>
- runevault token list
Role management:
- runevault role list
- runevault role create --name <name> --scope <scopes> --top-k <n> --rate-limit <rate>
- runevault role update --name <name> [--scope <scopes>] [--top-k <n>] [--rate-limit <rate>]
- runevault role delete --name <name>
- Python CLI using argparse + urllib with unix socket support
Vault server-side: Admin HTTP API
- HTTP server on internal unix socket (/var/run/vault-admin.sock)
- Token endpoints:
  - POST /tokens: issue new token
  - DELETE /tokens/{user}: revoke token
  - GET /tokens: list tokens (no token values)
- Role endpoints:
  - POST /roles: create new role
  - PUT /roles/{name}: update existing role
  - DELETE /roles/{name}: delete role (reject if default role)
  - GET /roles: list all roles
- No admin token required: access is protected by SSH + docker group + container isolation
- In-memory token and role stores with async file persistence
- Python stdlib http.server based (no external dependencies)
Vault server-side: Auth changes
- Load vault-roles.yml and vault-tokens.yml at startup → populate in-memory stores
- Replace VAULT_TOKENS env var with in-memory token store as token source
- validate_token() checks: token lookup → expiry → top_k → rate_limit
- Derive agent_id as sha256(username)[:32]
- If VAULT_TOKENS env var exists and no config files, use legacy mode with deprecation warning
Docker Compose integration
- Mount vault-tokens.yml as read-write volume
- Mount vault-roles.yml as read-write volume
- Remove VAULT_TOKENS env var from .env.example and docker-compose.yml
install.sh integration
- Generate vault-roles.yml with admin/agent roles
- Generate vault-tokens.yml (tokens: [])
- usermod -aG docker $SUDO_USER for passwordless docker access
- Add runevault alias to user's shell profile (auto-detect bash/zsh)
Per-user token value
gRPC error codes
| Condition | gRPC Status | Detail |
| --- | --- | --- |
| Token not found | UNAUTHENTICATED | Invalid authentication token |
| Token expired | UNAUTHENTICATED | Token expired for user '<name>' |
| Rate limited | RESOURCE_EXHAUSTED | Rate limit exceeded. Retry after <n>s |
| top_k exceeded | INVALID_ARGUMENT | top_k <n> exceeds limit <max> for role '<role>' |
Affected Files
- New: vault/vault_admin.py (in-container admin CLI utility)
- New: vault/admin_server.py (Admin HTTP server: internal unix socket, token + role endpoints)
- Modify: vault/vault_core.py (validate_token(), in-memory token/role stores, agent_id derivation)
- Modify: vault/vault_grpc_server.py (user identity in context, startup integration with admin server)
- Modify: vault/Dockerfile (include vault_admin.py and admin_server.py)
- Modify: vault/monitoring.py (per-user metrics labels)
- Modify: vault/docker-compose.yml (volume mounts, read-write for both configs; remove VAULT_TOKENS)
- Modify: vault/.env.example (remove VAULT_TOKENS)
- Modify: install.sh (docker group setup, runevault alias, generate config files)
- Modify: tests/unit/test_auth.py (per-user token, expiry, rate limit, role CRUD tests)
Priority
High: limits blast radius when a single user's agent is compromised via prompt injection.
Dependencies