Tool guards are a safety layer between the LLM deciding to call a tool and the actual execution. Every tool call passes through the guard system before reaching the tool implementation. Guards can allow, deny, escalate to an LLM validator, or prompt the user for approval.
Agent Loop (tool call dispatch)
──────────────────────────────
│ tc.Name, tc.Arguments
▼
┌──────────────────────┐
│ Guard Manager │
│ (Evaluate) │
│ │
│ ┌───────────────┐ │
│ │ Rule 1: Deny │──┤── VerdictDeny → return error to LLM
│ │ Rule 2: Ask │──┤── VerdictAsk → prompt user
│ │ Rule 3: LLM │──┤── VerdictLLM → call LLM guard
│ │ ... │ │── VerdictAllow → proceed
│ └───────────────┘ │
└──────────┬───────────┘
│ Decision
▼
┌──────────────────────┐
│ Denied? → error msg │
│ Ask? → user prompt │
│ Allow? → execute │
└──────────────────────┘
The guard sits at a single point in app/agent.go, inside the tool call loop, before cfg.ToolManager.ExecuteTool(). The Tool interface is untouched — guards are completely transparent to tool implementations.
// app/agent.go — inside the tool call loop
if cfg.GuardMgr != nil {
decision, guardErr := cfg.GuardMgr.Evaluate(ctx, tc.Name, tc.Arguments)
if guardErr != nil {
// send error as tool result, continue to next tool call
}
if decision != nil && decision.Verdict == guard.VerdictDeny {
// send blocked message as tool result, continue to next tool call
}
}
// Only reaches here if guard allowed (or no guard configured)
result, err := cfg.ToolManager.ExecuteTool(tc.Name, tc.Arguments, eventsCh)

When a tool call is blocked, the LLM receives a tool result explaining the block (e.g. "Operation blocked by safety guard: dangerous command blocked: rm -rf /"). This allows the LLM to explain the situation to the user and suggest alternatives, rather than silently failing.
All types live in internal/guard/guard.go.
type Verdict string
const (
VerdictAllow Verdict = "allow" // safe, proceed with execution
VerdictDeny Verdict = "deny" // blocked, return error to LLM
VerdictAsk Verdict = "ask" // prompt the user for approval
VerdictLLM Verdict = "llm" // escalate to LLM-based validator
)

type Decision struct {
    Verdict Verdict
    Reason  string // human-readable explanation shown to user/LLM
}

type EvalContext struct {
    ToolName   string          // e.g. "Bash", "Write", "Read"
    Input      json.RawMessage // raw JSON arguments from the LLM
    WorkingDir string          // current working directory (os.Getwd())
}

type Rule interface {
    Evaluate(ctx *EvalContext) *Decision // nil = abstain (no opinion)
}

Rules return nil to abstain — indicating they have no opinion on this tool call. This is distinct from returning `VerdictAllow`, which is a positive assertion that the call is safe.
type PermissionHandler func(toolName string, decision Decision) bool

Called when the final verdict is VerdictAsk. Blocks until the user responds. Returns true if approved.
The Manager (internal/guard/manager.go) orchestrates rule evaluation, verdict escalation, session caching, and user prompting.
type Manager struct {
rules []Rule
llmValidator LLMValidator // nil = disabled
permHandler PermissionHandler // nil = auto-deny
sessionApproved map[string]bool // "Bash:reason" → true
mu sync.RWMutex
}

Evaluate(ctx context.Context, toolName, input string) (*Decision, error):
1. Build context — Parse `input` as `json.RawMessage`, get cwd via `os.Getwd()`, construct `EvalContext`.
2. Run rules in order — First non-nil `Decision` wins. If no rule fires, return `VerdictAllow`.
3. Handle verdict:
   - `VerdictAllow` → return immediately (tool executes)
   - `VerdictDeny` → return immediately (tool blocked)
   - `VerdictLLM` → call `llmValidator.Validate()` if configured; on error or if unconfigured, fall back to `VerdictAsk`
   - `VerdictAsk` → check session cache → if miss, call `permHandler` → if no handler, auto-deny
4. Session caching — When a user approves a prompt, the approval is cached using the key `"toolName:reason"` for the process lifetime. Subsequent identical prompts are auto-approved without re-prompting.
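Step 2's first-non-nil-wins evaluation can be sketched as follows. The `denyTool` rule and `firstDecision` helper are illustrative, not the actual Manager code:

```go
package main

import "fmt"

type Verdict string

const (
	VerdictAllow Verdict = "allow"
	VerdictDeny  Verdict = "deny"
)

type Decision struct {
	Verdict Verdict
	Reason  string
}

type EvalContext struct{ ToolName string }

type Rule interface {
	Evaluate(ctx *EvalContext) *Decision
}

// denyTool is a toy rule: deny one named tool, abstain otherwise.
type denyTool struct{ name string }

func (r denyTool) Evaluate(ctx *EvalContext) *Decision {
	if ctx.ToolName == r.name {
		return &Decision{Verdict: VerdictDeny, Reason: r.name + " is blocked"}
	}
	return nil
}

// firstDecision mirrors step 2: run rules in registration order, the first
// non-nil Decision wins; if every rule abstains, default to VerdictAllow.
func firstDecision(rules []Rule, ctx *EvalContext) *Decision {
	for _, r := range rules {
		if d := r.Evaluate(ctx); d != nil {
			return d
		}
	}
	return &Decision{Verdict: VerdictAllow, Reason: "no rule fired"}
}

func main() {
	rules := []Rule{denyTool{"Bash"}, denyTool{"Write"}}
	fmt.Println(firstDecision(rules, &EvalContext{ToolName: "Write"}).Verdict) // deny
	fmt.Println(firstDecision(rules, &EvalContext{ToolName: "Read"}).Verdict)  // allow
}
```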
Rules are evaluated in registration order. The first non-nil decision wins. This means:
- DangerousCommandRule runs first to catch catastrophic commands before any other logic
- WorkingDirRule runs second to enforce directory boundaries
- SensitiveFileRule runs third to protect sensitive files
- Plugin rules run next (project-specific overrides)
- DefaultPolicyRule runs last as the catch-all baseline
// app/main.go — registration order
guardMgr.AddRule(&guard.DangerousCommandRule{}) // highest priority
guardMgr.AddRule(&guard.WorkingDirRule{})
guardMgr.AddRule(&guard.SensitiveFileRule{})
for _, r := range guard.LoadPlugins() { // project-specific
guardMgr.AddRule(r)
}
guardMgr.AddRule(&guard.DefaultPolicyRule{}) // catch-all (lowest)

All built-in rules live in internal/guard/rules.go.
Applies to: Bash tool only.
Matches command strings against two sets of patterns:
Deny list (always blocked, no user override):
| Pattern | What it catches |
|---|---|
| `rm -rf /`, `rm -rf ~`, `rm -rf $HOME` | Recursive deletion of root/home |
| `mkfs` | Filesystem formatting |
| `dd if=... of=/dev/...` | Raw device writes |
| `:(){ ...\|... };:` | Fork bombs |
| `chmod -R 777 /` | Recursive permission change on root |
Ask list (user must approve):
| Pattern | Reason |
|---|---|
| `sudo` | Privilege escalation |
| `curl\|sh`, `wget\|sh` | Pipe-to-shell |
| `git push --force`, `git push -f` | Force push |
| `git reset --hard` | Hard reset |
| `npm publish`, `cargo publish` | Package publishing |
| `docker run`, `docker exec` | Container execution |
Commands not matching either list cause the rule to abstain (nil), deferring to later rules.
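A minimal sketch of the two-tier matching, assuming simple substring checks and a small illustrative subset of the patterns (the real rule in rules.go may use different matching logic):

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative subsets of the deny and ask pattern lists.
var denySubstrings = []string{"rm -rf /", "mkfs", ":(){"}
var askSubstrings = []string{"sudo", "git push --force", "npm publish"}

// classify returns "deny", "ask", or "" (abstain) for a Bash command string.
// Deny patterns are checked first: a command matching both tiers is denied.
func classify(cmd string) string {
	for _, p := range denySubstrings {
		if strings.Contains(cmd, p) {
			return "deny"
		}
	}
	for _, p := range askSubstrings {
		if strings.Contains(cmd, p) {
			return "ask"
		}
	}
	return "" // abstain: defer to later rules
}

func main() {
	fmt.Println(classify("rm -rf / --no-preserve-root")) // deny
	fmt.Println(classify("sudo apt update"))             // ask
	fmt.Println(classify("ls -la") == "")                // true — abstain
}
```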
Applies to: Read, Write, Edit, Glob (file path check), Bash (path extraction).
File tools: Parses file_path or path from the JSON input, resolves to an absolute path via filepath.Abs + filepath.Clean, and checks whether it falls inside the working directory. Files inside cwd → abstain. Files outside cwd → VerdictAsk.
Bash: Extracts absolute paths from the command string using the regex /[^\s;|&>"']+. For each path outside cwd, checks if the command is write-oriented (rm, mv, cp, chmod, mkdir, rmdir, tee, dd, chown, touch). Write commands outside cwd → VerdictAsk. Read-only commands outside cwd → abstain (reading outside cwd is usually harmless).
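The containment check for file tools can be sketched like this. The prefix comparison on cleaned absolute paths is an assumption based on the filepath.Abs + filepath.Clean description above:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// insideDir reports whether path, resolved to a cleaned absolute path,
// falls inside dir. The separator-suffixed prefix check prevents sibling
// directories like /home/user/project2 from matching /home/user/project.
func insideDir(path, dir string) bool {
	abs, err := filepath.Abs(filepath.Clean(path))
	if err != nil {
		return false
	}
	dir = filepath.Clean(dir)
	return abs == dir || strings.HasPrefix(abs, dir+string(filepath.Separator))
}

func main() {
	cwd := "/home/user/project"
	fmt.Println(insideDir("/home/user/project/main.go", cwd))  // true
	fmt.Println(insideDir("/etc/passwd", cwd))                 // false
	fmt.Println(insideDir("/home/user/project/../other", cwd)) // false — Clean resolves ..
}
```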
Applies to: Write and Edit only (reading sensitive files is allowed).
Checks the file basename against glob patterns:
| Pattern | Files matched |
|---|---|
| `.env`, `.env.*` | Environment configuration |
| `*credentials*` | Credential files |
| `*secret*` | Secret files |
| `*.pem`, `*.key` | Cryptographic keys |
| `.git/config` | Git configuration |
| `.ssh/*` | SSH configuration |
Match → VerdictAsk. No match → abstain.
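A sketch of the basename glob check, assuming filepath.Match semantics; the pattern list here is the basename subset from the table above (path-based patterns like .git/config would need a separate check):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Illustrative subset of the sensitive-file globs matched against basenames.
var sensitiveGlobs = []string{".env", ".env.*", "*credentials*", "*secret*", "*.pem", "*.key"}

// sensitive reports whether the file's basename matches any glob.
func sensitive(path string) bool {
	base := filepath.Base(path)
	for _, g := range sensitiveGlobs {
		if ok, _ := filepath.Match(g, base); ok {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(sensitive("config/.env.production")) // true  (.env.*)
	fmt.Println(sensitive("server.key"))             // true  (*.key)
	fmt.Println(sensitive("main.go"))                // false → abstain
}
```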
Applies to: All tools. Always returns a non-nil decision (never abstains). This is the catch-all rule and must be registered last.
| Tool | Verdict | Rationale |
|---|---|---|
| Skill, Read, Glob | `VerdictAllow` | Read-only, no side effects |
| Write, Edit | `VerdictAllow` | Earlier rules already catch sensitive files and out-of-cwd writes |
| Bash (known-safe) | `VerdictAllow` | Matches against a safe-list |
| Bash (unknown) | `VerdictLLM` | Escalates to LLM guard (falls back to `VerdictAsk` if LLM guard is disabled) |
| Other tools | `VerdictAllow` | Unknown tools default to allow |
Known-safe Bash patterns (commands that skip the guard):
- Shell builtins: `echo`, `pwd`, `which`, `env`, `printenv`
- Read-only: `ls`, `cat`, `head`, `tail`, `wc`, `sort`, `uniq`, `diff`
- Git read-only: `git status`, `git log`, `git diff`, `git branch`, `git show`, `git stash`
- Build/test: `go build/test/run/vet/fmt`, `npm test/run/ci/install`, `cargo build/test/check`, `make`, `cmake`
- Search: `grep`, `rg`, `ag`, `fd`, `find`
Matching is prefix-based — go test ./... matches because it starts with go test.
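The prefix matching can be sketched as follows. The trailing-space guard (so a `go` entry would not match `gofmt`) is an assumption about the implementation, and `safePrefixes` is a small illustrative subset:

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative subset of the known-safe command prefixes.
var safePrefixes = []string{"ls", "git status", "git log", "go test", "grep"}

// isKnownSafe reports whether cmd equals a safe entry or extends it with
// further arguments (prefix match at a word boundary).
func isKnownSafe(cmd string) bool {
	cmd = strings.TrimSpace(cmd)
	for _, p := range safePrefixes {
		if cmd == p || strings.HasPrefix(cmd, p+" ") {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isKnownSafe("go test ./..."))    // true — starts with "go test"
	fmt.Println(isKnownSafe("git status"))       // true — exact match
	fmt.Println(isKnownSafe("git push --force")) // false — not on the list
}
```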
Optional LLM-based validator (internal/guard/llm_guard.go) for ambiguous commands that don't match any safe-list pattern. When enabled, unknown Bash commands are sent to a fast/cheap LLM for evaluation instead of immediately prompting the user.
Sends a single completion request with this prompt:
You are a security evaluator for a CLI coding agent working in: {cwd}
Evaluate this tool call:
Tool: {toolName}
Input: {sanitized input, truncated to 500 bytes}
Respond with exactly one line:
ALLOW
DENY: <reason>
ASK: <reason>
Consider: working directory boundaries, system damage risk, data exfiltration,
common dev operations.
The response is parsed line-by-line. Only the first line is considered:
- `ALLOW` → `VerdictAllow`
- `DENY: <reason>` → `VerdictDeny`
- `ASK: <reason>` → `VerdictAsk`
- Anything else → `VerdictAsk` (an ambiguous response defaults to prompting)
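The first-line parsing can be sketched as follows; `parseGuardResponse` is a hypothetical name, not necessarily the function in llm_guard.go:

```go
package main

import (
	"fmt"
	"strings"
)

// parseGuardResponse considers only the first line of the LLM response:
// ALLOW, DENY: <reason>, or ASK: <reason>. Anything else defaults to "ask".
func parseGuardResponse(resp string) (verdict, reason string) {
	line := strings.TrimSpace(strings.SplitN(resp, "\n", 2)[0])
	switch {
	case line == "ALLOW":
		return "allow", ""
	case strings.HasPrefix(line, "DENY:"):
		return "deny", strings.TrimSpace(strings.TrimPrefix(line, "DENY:"))
	case strings.HasPrefix(line, "ASK:"):
		return "ask", strings.TrimSpace(strings.TrimPrefix(line, "ASK:"))
	default:
		return "ask", "ambiguous guard response"
	}
}

func main() {
	v, r := parseGuardResponse("DENY: writes to /dev/sda\nextra text ignored")
	fmt.Println(v, "-", r) // deny - writes to /dev/sda
	v, _ = parseGuardResponse("Sure, that looks fine!")
	fmt.Println(v) // ask — ambiguous responses default to prompting
}
```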
Controlled via environment variables:
| Variable | Default | Description |
|---|---|---|
| `BITCODE_GUARD_LLM` | (unset) | Set to `true` to enable |
| `BITCODE_GUARD_LLM_MODEL` | Main model | Model to use for evaluation |
| `BITCODE_GUARD_LLM_BASE_URL` | Main base URL | API endpoint |
| `BITCODE_GUARD_LLM_API_KEY` | Main API key | API key |
Typically you'd point this at a fast, cheap model (e.g. a small local model or a low-cost API tier) since it only needs to classify commands as safe/unsafe.
- If the LLM guard is enabled but the API call fails → falls back to `VerdictAsk` (prompt the user)
- If the LLM guard is not enabled → `VerdictLLM` automatically falls back to `VerdictAsk`
The Guard Agent is an expert multi-turn LLM agent designed for security-aware tool validation. It replaces the simple single-turn LLM guard with a sophisticated agent that uses domain-specific skills and structured reasoning.
Tool Call (Name + JSON args)
│
▼
┌──────────────────────┐
│ Guard Manager │
│ Rule Chain │
└──────────┬───────────┘
│ VerdictLLM
▼
┌──────────────────────────────────────┐
│ Guard Agent │
│ ────────────────────────────────── │
│ Expert persona system prompt │
│ │
│ tools.Manager │
│ └─ SkillTool ◄─ GuardSkillMgr │◄── guard-skills/ dirs
│ (embedded built-ins) │
│ │
│ Standard tool-call agent loop: │
│ Complete() → FinishToolCalls │
│ → SkillTool.Execute() │
│ → inject skill body as result │
│ → Continue until FinishStop │
│ → parse ALLOW/DENY/ASK │
└──────────────────────────────────────┘
- Expert Persona — A senior security engineer and sysadmin with extensive cloud deployment experience (AWS, GCP, Azure, Kubernetes).
- Multi-turn Reasoning — Uses a standard tool-call loop (the same pattern as the main agent) to reason about complex tool calls. The agent can invoke skills, analyze code, and make informed decisions.
- Language-Specific Skills — Automatically detects the programming language/runtime and injects relevant security context:
  - Bash — Command substitution, eval, redirection attacks, pipe-to-shell
  - Python — `eval()`, `exec()`, `subprocess` with `shell=True`, pickle deserialization
  - Go — `exec.Command`, shell invocation, the `unsafe` package, path traversal
  - JavaScript/TypeScript — `eval()`, `child_process`, prototype pollution, SSRF
- Code Simulation — An on-demand skill for step-by-step code tracing to predict execution behavior before allowing the call.
Guard skills work the same as the main agent's skills system but are loaded from guard-skills/ directories:
~/.agents/guard-skills/ ← lowest precedence (disk)
~/.claude/guard-skills/
~/.bitcode/guard-skills/
.agents/guard-skills/ ← project-level
.claude/guard-skills/
.bitcode/guard-skills/ ← highest precedence (disk)
internal/guard/skills/ ← embedded built-ins (lowest of all)
The following skills are embedded in the binary:
| Skill | Language | Auto-invoke | Description |
|---|---|---|---|
| `bash.md` | bash | Yes | Bash security patterns and dangerous constructs |
| `python.md` | python | Yes | Python security patterns (subprocess, eval, pickle) |
| `go.md` | go | Yes | Go security patterns (exec.Command, unsafe) |
| `js.md` | js | Yes | JS/TS security patterns (eval, child_process) |
| `simulate.md` | — | No | Code simulation protocol (on-demand) |
Guard skills support additional frontmatter fields:
---
name: Bash Security Expert
description: Pattern library for dangerous Bash constructs
language: bash
auto_invoke: true
---
# Bash Security Patterns
...

| Field | Type | Description |
|---|---|---|
| `language` | string | The language this skill applies to |
| `auto_invoke` | bool | If true, automatically inject the skill body into the guard context |
When the guard agent evaluates a tool call:

1. Language Detection — The guard detects the language/runtime from the tool call:
   - `Bash` tool → always `bash`
   - `Bash` with `python`/`python3`/`uv run` → `python`
   - `Bash` with `go run`/`go build`/`go test` → `go`
   - `Bash` with `node`/`deno`/`bun`/`npx` → `js`
   - File tools with `.py`/`.go`/`.js`/`.ts` extensions
2. Auto-Inject Skills — Skills with `auto_invoke: true` matching the detected language have their bodies pre-injected into the first user message. The guard LLM sees them immediately without needing to make a tool call.
3. On-Demand Skills — All skills are listed in the system prompt. The guard can invoke them via `SkillTool` when it needs deeper analysis.
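The detection step can be sketched like this. `detectLanguage` and its exact prefix sets are assumptions for illustration; the real helper is `DetectLanguage()` in internal/guard/langdetect.go:

```go
package main

import (
	"fmt"
	"strings"
)

// detectLanguage maps a Bash command's leading word to a language tag,
// defaulting to "bash" for anything unrecognized. Non-Bash tools are
// handled elsewhere (by file extension) and return "" here.
func detectLanguage(toolName, command string) string {
	if toolName != "Bash" {
		return ""
	}
	fields := strings.Fields(command)
	if len(fields) == 0 {
		return "bash"
	}
	switch fields[0] {
	case "python", "python3", "uv":
		return "python"
	case "go":
		return "go"
	case "node", "deno", "bun", "npx":
		return "js"
	default:
		return "bash" // Bash tool → always at least bash
	}
}

func main() {
	fmt.Println(detectLanguage("Bash", "python3 -c 'print(1)'")) // python
	fmt.Println(detectLanguage("Bash", "go test ./..."))         // go
	fmt.Println(detectLanguage("Bash", "rm -rf /tmp/x"))         // bash
}
```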
[User message to guard]
Tool: Bash
Input: python3 -c "import subprocess; subprocess.run('rm -rf /tmp/old', shell=True)"
Auto-context: [python.md body pre-injected]
[Assistant — FinishToolCalls]
{"tool": "Skill", "arguments": {"skill": "simulate"}}
[Tool result]
# Code Simulation Protocol
... (simulate.md body)
[Assistant — FinishStop]
"Tracing the code: subprocess.run with shell=True executing 'rm -rf /tmp/old'.
/tmp/old is outside working directory. Shell=True with a literal string is acceptable
but the path is fixed (/tmp) not cwd-relative.
ASK: subprocess.run with shell=True deletes outside working directory"
→ Decision{Verdict: VerdictAsk, Reason: "subprocess.run with shell=True deletes outside working directory"}
| Variable | Default | Description |
|---|---|---|
| `BITCODE_GUARD_LLM` | `true` | Enable the Guard Agent (set to `false` to disable) |
| `BITCODE_GUARD_MODEL` | main model | Model to use for the guard agent |
| `BITCODE_GUARD_MAX_TURNS` | `5` | Max turns for the guard agent's reasoning loop |
To use only rule-based guards without LLM evaluation:
BITCODE_GUARD_LLM=false ./bitcode

This causes all VerdictLLM verdicts to fall back to VerdictAsk (prompt the user) instead of calling the Guard Agent.
When the final verdict is VerdictAsk, the guard system prompts the user for approval using a minimal bubbletea program (internal/guard/prompt.go).
⚠ Guard: Bash command modifies /tmp/old-builds which is outside working directory /home/user/project
Tool: Bash
[y] Allow once [a] Always allow [n] Deny
| Key | Action |
|---|---|
| `y` | Allow this one tool call |
| `a` | Always allow (caches for the session) |
| `n`, `q`, `Esc`, `Ctrl+C` | Deny |
The permission prompt needs to pause the thinking spinner before displaying. The TerminalPermissionHandler accepts pauseThinking and resumeThinking callbacks. In practice, only pauseThinking is needed — the spinner restarts automatically on the next OnThinking(true) call when the LLM begins its next turn.
// app/main.go — wiring
guardMgr.SetPermissionHandler(guard.TerminalPermissionHandler(
func() { // pauseThinking
if spin != nil {
spin.Stop()
spin = nil
}
},
nil, // resumeThinking not needed
))

When BitCode runs with -p (single-shot mode), the AutoDenyHandler is used instead. All VerdictAsk decisions are automatically denied, and the LLM receives an error explaining why the operation was blocked.
if isNonInteractive {
guardMgr.SetPermissionHandler(guard.AutoDenyHandler())
}

When a user chooses "Always allow" (a), the approval is cached in sessionApproved using the key "toolName:reason". This means:
- The same type of operation (same tool, same reason string) won't prompt again
- The cache lives only for the current process — restarting BitCode resets it
- There is no persistent allow-list on disk
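The session cache can be sketched as follows. `approvalCache` is an illustrative standalone version of the `sessionApproved` map and mutex on Manager, not the actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// approvalCache caches "Always allow" decisions for the process lifetime,
// keyed by "toolName:reason" and guarded by a RWMutex for concurrent reads.
type approvalCache struct {
	mu       sync.RWMutex
	approved map[string]bool
}

func newApprovalCache() *approvalCache {
	return &approvalCache{approved: make(map[string]bool)}
}

func (c *approvalCache) key(toolName, reason string) string {
	return toolName + ":" + reason
}

func (c *approvalCache) isApproved(toolName, reason string) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.approved[c.key(toolName, reason)]
}

func (c *approvalCache) approve(toolName, reason string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.approved[c.key(toolName, reason)] = true
}

func main() {
	c := newApprovalCache()
	fmt.Println(c.isApproved("Bash", "sudo requires approval")) // false → prompt the user
	c.approve("Bash", "sudo requires approval")                 // user pressed [a]
	fmt.Println(c.isApproved("Bash", "sudo requires approval")) // true → no re-prompt
}
```

Because the key includes the reason string, two different risky operations on the same tool (different reasons) still prompt separately.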
Guard plugins allow projects to define custom rules via configuration files (internal/guard/plugins.go).
Same precedence model as reminders and skills. Later entries with the same id overwrite earlier ones:
- `~/.agents/guards/` (lowest)
- `~/.claude/guards/`
- `~/.bitcode/guards/`
- `.agents/guards/` (project-level)
- `.claude/guards/`
- `.bitcode/guards/` (highest)
YAML:
# .bitcode/guards/block-docker.yaml
id: block-docker
tool: Bash
patterns:
- match: "docker"
verdict: ask
  reason: "Docker commands require approval"

Markdown with frontmatter:
---
id: protect-env
tool: Write,Edit
patterns:
- file_match: ".env*"
verdict: ask
reason: "Modifying environment configuration"
---

| Field | Type | Required | Description |
|---|---|---|---|
| `id` | string | No (derived from filename) | Unique identifier for deduplication |
| `tool` | string | No | Comma-separated tool names to match (e.g. `Bash`, `Write,Edit`). Empty = all tools. |
| `patterns` | list | Yes | Pattern rules to evaluate |

Each entry in `patterns` supports:

| Field | Type | Description |
|---|---|---|
| `match` | string (regex) | Regex matched against the Bash command string |
| `file_match` | string (glob) | Glob matched against the file basename for file tools (Read, Write, Edit) |
| `verdict` | string | `allow`, `deny`, or `ask` |
| `reason` | string | Human-readable reason shown to the user |
Each pattern can have either match (for Bash) or file_match (for file tools), or both. Patterns are checked in order; the first matching pattern determines the verdict.
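First-match-wins pattern evaluation can be sketched as follows. The `pattern` struct and `evalPatterns` helper are illustrative, not the actual plugins.go code (which would compile regexes once at load time rather than per call):

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
)

// pattern mirrors one plugin pattern entry: match is a regex against the
// Bash command, fileMatch a glob against the file basename.
type pattern struct {
	match     string // regex, "" if unused
	fileMatch string // glob, "" if unused
	verdict   string
	reason    string
}

// evalPatterns checks patterns in order and returns the verdict of the
// first one matching the command or file path, or "" if none matched
// (the plugin abstains).
func evalPatterns(patterns []pattern, command, filePath string) string {
	for _, p := range patterns {
		if p.match != "" && command != "" &&
			regexp.MustCompile(p.match).MatchString(command) {
			return p.verdict
		}
		if p.fileMatch != "" && filePath != "" {
			if ok, _ := filepath.Match(p.fileMatch, filepath.Base(filePath)); ok {
				return p.verdict
			}
		}
	}
	return "" // no pattern matched → defer to later rules
}

func main() {
	ps := []pattern{{match: `\bdocker\b`, verdict: "ask", reason: "Docker commands require approval"}}
	fmt.Println(evalPatterns(ps, "docker run ubuntu", "")) // ask
	fmt.Println(evalPatterns(ps, "ls -la", "") == "")      // true — abstain
}
```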
Block all network access:
id: no-network
tool: Bash
patterns:
- match: "\\b(curl|wget|ssh|scp|nc|netcat)\\b"
verdict: ask
  reason: "Network access requires approval"

Protect database migrations:
id: protect-migrations
tool: Write,Edit
patterns:
- file_match: "*.sql"
verdict: ask
  reason: "Modifying SQL migration files requires approval"

Block package installation:
id: no-install
tool: Bash
patterns:
- match: "\\b(apt|yum|brew|pip|gem)\\s+install\\b"
verdict: ask
  reason: "Package installation requires approval"

Guard events are rendered in yellow/amber to visually distinguish them from normal tool events (green) and errors (red).
⚠ Guard(Bash)
⎿ Blocked: dangerous command blocked: rm -rf /
The event uses PreviewType: PreviewGuard (defined in internal/event.go) and is rendered by renderGuardEvent() in app/render.go.
The system prompt (app/system_prompt.go) includes instructions telling the LLM about guards:
# Safety Guards
Tool calls are subject to safety guards. If a tool call is blocked, you will receive
an error explaining why. Do not retry blocked operations. Instead, explain to the user
what you wanted to do and suggest alternatives.
This prevents the LLM from entering retry loops when a tool call is blocked.
internal/guard/
guard.go # Core types (Verdict, Decision, EvalContext, Rule, PermissionHandler)
manager.go # Manager — rule chain evaluation, session caching, verdict escalation
rules.go # 4 built-in rules (DangerousCommand, WorkingDir, SensitiveFile, DefaultPolicy)
llm_guard.go # Deprecated — replaced by GuardAgent
guard_agent.go # GuardAgent — multi-turn LLM agent with SkillTool support
guard_prompt.go # BuildGuardSystemPrompt() — expert persona prompt
langdetect.go # DetectLanguage() + SkillsForLanguage() helpers
prompt.go # Terminal permission prompt (TerminalPermissionHandler, AutoDenyHandler)
plugins.go # Plugin loading from guards/ directories
guard_test.go # Tests for rules, manager, LLM parsing, plugin parsing
skills/ # Embedded built-in guard skills
bash.md # Bash security patterns (auto_invoke: true)
python.md # Python security patterns (auto_invoke: true)
go.md # Go security patterns (auto_invoke: true)
js.md # JS/TS security patterns (auto_invoke: true)
simulate.md # Code simulation protocol (on-demand)
internal/skills/
skills.go # Skill manager with Config support (SubDir, Embedded, Metadata)
Integration points in app/:
- `agent.go` — Guard check before `ExecuteTool` in the tool call loop; `GuardMgr` field on `AgentConfig`
- `main.go` — Creates the Manager, registers built-in rules in priority order, loads plugins, configures the LLM guard and permission handlers
- `render.go` — `renderGuardEvent()` for the yellow guard event display
- `system_prompt.go` — Safety guard instructions for the LLM
- `internal/event.go` — `PreviewGuard` constant