

Nanny

Open-source execution boundary for autonomous systems.
Hard limits. Deterministic stops. Structured audit trail.


Documentation · Quickstart · Changelog · Report a Bug · Contributing


What is Nanny?

You deploy a multi-agent system on Friday. Monday morning your CFO sends a Slack: "Why did we spend $4,000 over the weekend?" One agent got stuck in a loop. Nobody stopped it. No audit trail. Nothing.

This is happening right now at hundreds of companies.

Nanny is the execution boundary that prevents it.

You tell Nanny what each agent is allowed to do — how many steps, how much budget, which tools, how long. The moment any limit is crossed, Nanny kills the process immediately, emits a structured log saying exactly what happened and why, and exits. No grace period. No recovery logic. No second chances.

When you have multiple specialized agents — a researcher, an analyst, a reporter — Nanny gives each one its own budget, its own tool allowlist, and its own kill switch. The analysis agent cannot call the reporter's tools. A loop-detection rule stops any agent from running the same computation five times in a row. The moment any agent steps outside its role or hits its ceiling, it stops. You get a full audit trail of every call, every decision, and every stop reason.

Think of it as a hard execution boundary — deterministic, auditable, and structurally impossible for any agent to bypass.

flowchart TD
    CMD(["nanny run"])
    CMD --> NANNY

    subgraph NANNY["Nanny — parent process"]
        direction LR

        subgraph CHILD["Child process"]
            AGENT["python agent.py"]
        end

        subgraph ENFORCE[" "]
            direction TB
            STEPS["steps"]
            COST["cost"]
            TIMER["timeout"]
        end

        AGENT -- "tool call" --> ENFORCE
        ENFORCE -- "✓  allowed" --> AGENT
    end

    ENFORCE -- "✗  limit reached → killed" --> DEAD(["process exits"])
    DEAD --> LOG["ExecutionStopped\nreason · steps · cost_spent\n→ stdout"]

The Nanny ecosystem

| Layer | What it does |
| --- | --- |
| Nanny CLI | Hard timeout, step, and cost limits for any agent process in any language. |
| Rust SDK | Per-function cost metering, allowlist enforcement, and custom rules — in-process. |
| Python SDK | Per-function and per-role governance for Python agents — each agent in your fleet gets its own budget, tool allowlist, and custom rules. |
| Nanny Cloud (v0.2.0) | Durable audit logs, team dashboards, org-level budget aggregation, and cross-process fleet enforcement. |

→ Full docs at docs.nanny.run


Sample applications

Four complete agent samples ship in examples/. All use Ollama — no API key required.

| Sample | What it does | Stop reasons demonstrated |
| --- | --- | --- |
| examples/rust/webdingo | Web research agent (Rust) — fetches pages, synthesises a report. Classic spiral risk. | BudgetExhausted, RuleDenied |
| examples/rust/qabud | Code review agent (Rust) — reads source files, identifies issues, blocks sensitive files before they're opened. | RuleDenied, ToolDenied, MaxStepsReached |
| examples/python/dev_assist | Debug agent (LangChain) — given a stack trace, reads relevant files and searches for related symbols. | BudgetExhausted, RuleDenied, ToolDenied |
| examples/python/metrics_crew | Multi-agent governance (CrewAI) — four specialized agents with per-role budgets, per-role tool allowlists, and a loop-detection rule. The analysis agent cannot call the reporter's tools. If it tries, ToolDenied fires. This is what least-privilege fleet governance looks like in 200 lines of Python. | BudgetExhausted, RuleDenied, ToolDenied |
# Rust examples
cd examples/rust/webdingo && nanny run -- "best Rust HTTP clients"
cd examples/rust/qabud && nanny run -- ./src

# Python examples
cd examples/python/dev_assist && nanny run
cd examples/python/metrics_crew && nanny run

Scope: Nanny governs agents within a single process today. When all agents run in the same process — as in CrewAI, LangGraph, AutoGen, or any framework that orchestrates agents within one Python or Rust runtime — every agent is governed. Cross-process and cross-machine fleet enforcement is the v0.2.0 cloud layer.


Install

The Nanny CLI is a system tool — install it once globally and use nanny run from any project that has a nanny.toml.

macOS

brew tap nanny-run/nanny
brew install nannyd

Linux

curl -fsSL https://install.nanny.run | sh

Have Rust installed? cargo install nannyd also works.

Windows

irm https://install.nanny.run/windows | iex

Installs to %LOCALAPPDATA%\nanny\ and adds to PATH. Restart your terminal after installing.

Or download a pre-built binary directly from GitHub Releases.


SDK installation

SDKs are project dependencies — add them per project, not globally.

Rust

cargo add nannyd

Python

pip install nanny-sdk

60-second quickstart

# 1. Scaffold a nanny.toml in your project root
nanny init

# 2. Run your agent
nanny run

# 3. Use a named limit set for specific workloads
nanny run --limits=researcher

nanny.toml:

[runtime]
mode = "local"

[start]
cmd = "python agent.py"   # nanny run always reads this

[limits]
steps   = 100     # max tool calls
cost    = 1000    # max cost units
timeout = 30000   # wall-clock ms

[limits.researcher]
steps   = 200
cost    = 5000
timeout = 120000

[tools]
allowed = ["http_get", "read_file"]   # anything not listed is denied

Nanny demo — BudgetExhausted stops a web research agent mid-run


Rust SDK — all three macros

For Rust agents, annotate functions directly to get per-function cost accounting, allowlist enforcement, and custom policy rules:

use nannyd::{tool, rule, agent, PolicyContext};

/// Each call charges 10 cost units and requires the tool to be in the allowlist.
#[tool(cost = 10)]
fn search_web(query: String) -> String {
    // ... HTTP request ...
    String::new()
}

/// Return false to stop the agent immediately with RuleDenied.
#[rule("no_spiral")]
fn check_spiral(ctx: &PolicyContext) -> bool {
    let h = &ctx.tool_call_history;
    // Stop if the last 3 calls were all search_web
    !(h.len() >= 3 && h.iter().rev().take(3).all(|t| t == "search_web"))
}

/// Activates [limits.researcher] for the duration of this function.
/// Limits revert automatically on return, even if the function panics.
#[agent("researcher")]
async fn run_research(topic: &str) {
    // ... agent loop — search_web governed by nanny ...
}

All macros are no-ops when running outside nanny run — no enforcement overhead.

Nanny demo — named agent scopes (planner → researcher → synthesizer) entering and exiting

→ Full Rust SDK guide at docs.nanny.run/v0.1/guides/rust-sdk


Python SDK — all three decorators

For Python agents, the same model as the Rust SDK — as decorators:

from nanny_sdk import tool, rule, agent

@tool(cost=10)
def search_web(query: str) -> str:
    import httpx
    return httpx.get(f"https://en.wikipedia.org/wiki/{query}").text

@rule("no_spiral")
def check_spiral(ctx) -> bool:
    h = ctx.tool_call_history
    return not (len(h) >= 3 and len(set(h[-3:])) == 1)

@agent("researcher")
def run_research(topic: str) -> list[str]:
    # Runs under [limits.researcher] from nanny.toml
    return [search_web(topic)]
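The loop-detection predicate is plain Python, so you can exercise it on its own before wiring it into a run. Here SimpleNamespace stands in for the context object the SDK would pass:

```python
# Standalone check of the no_spiral logic from above, without the SDK.
# SimpleNamespace is a hypothetical stand-in for the SDK's context object.
from types import SimpleNamespace

def check_spiral(ctx) -> bool:
    h = ctx.tool_call_history
    return not (len(h) >= 3 and len(set(h[-3:])) == 1)

ok = SimpleNamespace(tool_call_history=["search_web", "read_file", "search_web"])
spiral = SimpleNamespace(tool_call_history=["search_web"] * 3)

print(check_spiral(ok))      # → True  (varied calls, agent may continue)
print(check_spiral(spiral))  # → False (three identical calls in a row)
```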

Works with any framework — LangChain, CrewAI, plain Python. Stack decorators to combine framework registration with Nanny governance:

from langchain_core.tools import tool as lc_tool
from nanny_sdk import tool as nanny_tool

@lc_tool                   # outer — LangChain registers for dispatch
@nanny_tool(cost=5)        # inner — Nanny intercepts before the function runs
def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

All decorators are no-ops when running outside nanny run — zero overhead in development and CI.

→ Full Python SDK guide at docs.nanny.run/v0.1/guides/python-sdk


Event log

Every run emits NDJSON to stdout. One event per line. Always starts with ExecutionStarted, always ends with ExecutionStopped.

{"event":"ExecutionStarted","ts":1711234567000,"limits":{"steps":100,"cost":1000,"timeout":30000},"limits_set":"[limits]","command":"python agent.py"}
{"event":"ToolAllowed","ts":1711234567120,"tool":"http_get"}
{"event":"StepCompleted","ts":1711234567800,"step":1}
{"event":"ExecutionStopped","ts":1711234572000,"reason":"BudgetExhausted","steps":12,"cost_spent":1000,"elapsed_ms":5000}

Pipe it to a file, stream it to your log aggregator, or query it inline:

nanny run > nanny.log
nanny run | tee nanny.log
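Because the stream is one JSON object per line, post-processing takes a few lines of stdlib Python. A generic sketch using the sample events above (in practice you would read from a file or pipe rather than a string):

```python
# Minimal consumer for the NDJSON event stream: one json.loads per line,
# then pull out the terminal ExecutionStopped event and summarize it.
import json

log = """\
{"event":"ExecutionStarted","ts":1711234567000,"limits":{"steps":100,"cost":1000,"timeout":30000},"limits_set":"[limits]","command":"python agent.py"}
{"event":"ExecutionStopped","ts":1711234572000,"reason":"BudgetExhausted","steps":12,"cost_spent":1000,"elapsed_ms":5000}
"""

events = [json.loads(line) for line in log.splitlines() if line]
stopped = next(e for e in events if e["event"] == "ExecutionStopped")
print(f'{stopped["reason"]}: {stopped["steps"]} steps, '
      f'{stopped["cost_spent"]} cost units in {stopped["elapsed_ms"]} ms')
# → BudgetExhausted: 12 steps, 1000 cost units in 5000 ms
```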

Documentation

Full reference at docs.nanny.run — quickstart, concepts, CLI reference, nanny.toml schema, event log, Rust SDK guide, and Python SDK guide.


Contributing

See CONTRIBUTING.md.


License

Apache-2.0 — see LICENSE.
