See where your AI tokens actually go.
tokmon is a transparent local proxy between coding agents (Claude Code, Cursor, aider) and LLM APIs (Anthropic, OpenAI). It logs token usage and estimated cost, detects anomaly patterns (cache breaks, retry loops, token spikes), and shows a real-time terminal dashboard.
Zero config. Local-first. No prompt or response content stored.
Developers are hitting token limits and cost spikes without visibility into where usage comes from. The HN thread that triggered this project: "Claude Code users hitting usage limits way faster than expected". Real pain points from that discussion: silent cache-break inflation, retry loops burning quota, and "token anxiety" from opaque usage behavior.
Install with Homebrew, `go install`, or a release binary:

```shell
brew install tokmon
# or
go install github.com/evgenybalyakin/tokmon@latest
```
```shell
# Start proxy
tokmon

# Run agent through tokmon
ANTHROPIC_BASE_URL=http://localhost:4100/anthropic claude

# See live telemetry
tokmon dash
```

The tokmon dashboard updates every 500 ms with:

- totals for requests, tokens, cache, cost, and error rate
- the last 10 requests with cache ratio and per-request cost
- cache-break, retry-loop, and token-spike warnings
- Transparent local proxy for Anthropic/OpenAI with streaming support.
- OpenAI streaming compatibility patch (`stream_options.include_usage=true` injected only when missing).
- SQLite telemetry in WAL mode with an async, bounded write queue.
- Session-aware request fingerprinting with prompt text stripped.
- Real-time terminal dashboard built with Bubble Tea.
- `stats` command with plain-text and JSON output.
- CI guardrails via `stats --assert-*` (budget, error-rate, and retry-loop checks).
- Telemetry export to `json`, `jsonl`, or `csv`.
- Retention pruning with dry-run by default.
- Setup helper for shell env var bootstrap.
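The "fingerprinting with prompt text stripped" idea can be sketched as hashing only the *shape* of a request. This is a hypothetical illustration of the concept, not tokmon's exact scheme; the field choice and 16-hex-char truncation are assumptions:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// fingerprint derives a stable id from request shape only: session,
// model, endpoint, message count, and total prompt length. The prompt
// text itself never enters the hash.
func fingerprint(session, model, path string, msgCount, promptLen int) string {
	h := sha256.New()
	fmt.Fprintf(h, "%s|%s|%s|%d|%d", session, model, path, msgCount, promptLen)
	return hex.EncodeToString(h.Sum(nil))[:16]
}

func main() {
	a := fingerprint("s1", "claude-sonnet", "/anthropic/v1/messages", 4, 2048)
	b := fingerprint("s1", "claude-sonnet", "/anthropic/v1/messages", 4, 2048)
	fmt.Println(a == b) // true: identical shape, identical fingerprint
}
```

Identical shapes hashing to the same id is what makes retry loops detectable without ever storing prompt content.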
```
┌─────────────┐     ┌─────────┐     ┌───────────────────┐
│ Claude Code │────▶│ tokmon  │────▶│ api.anthropic.com │
│ / Cursor /  │◀────│ (proxy) │◀────│ / api.openai.com  │
│ aider / etc │     └────┬────┘     └───────────────────┘
└─────────────┘          │
                    ┌────▼────┐
                    │ SQLite  │
                    │ (logs)  │
                    └─────────┘
```
tokmon listens on `localhost:<port>` and routes:

- `/anthropic/*` -> `https://api.anthropic.com/*`
- `/openai/*` -> `https://api.openai.com/*`
| Variable | Meaning | Default |
|---|---|---|
| `TOKMON_DB` | SQLite database file | `~/.tokmon/tokmon.db` |
| `TOKMON_PORT` | proxy listen port | `4100` |
| `TOKMON_SESSION` | explicit session id | auto-generated |
| `TOKMON_TZ` | timezone for stats day/week windows | local timezone |
| Flag | Meaning | Default |
|---|---|---|
| `--port` | proxy listen port | `4100` |
| `--budget` | budget limit in USD | `0` (disabled) |
| `--budget-scope` | `session` or `day` | `session` |
| `--budget-action` | `warn`, `pause`, or `stop` | `warn` |
| `--db` | SQLite path override | env/default |
```shell
# Fail the pipeline if estimated cost exceeds $5
tokmon stats --json --assert-budget 5

# Fail if the error rate exceeds 1.0%
tokmon stats --assert-error-rate 1.0

# Fail if retry loops are detected
tokmon stats --assert-no-retry-loops
```

**Does my API key pass through the proxy?** Yes, because tokmon forwards headers to the upstream provider. It does not persist API keys and does not log prompt/response bodies.
**How much latency does it add?** It forwards streams chunk-by-chunk (tee pattern) and writes telemetry asynchronously, so proxying overhead is minimal.
**Does it work on a subscription plan rather than API billing?** Yes. Usage logging is independent of your billing method, and cost estimates remain useful as a normalized comparison metric.
**Does it support OpenAI?** Supported. Set `OPENAI_BASE_URL=http://localhost:4100/openai`.
**Why does tokmon modify OpenAI streaming requests?** OpenAI omits usage data in streamed responses unless `stream_options.include_usage=true` is set. tokmon injects that field only when `stream: true` and `include_usage` is missing.
**Can I use it in CI?** Yes. Start the proxy in the background, run your agent with the `*_BASE_URL` override, then collect `tokmon stats --json`.
Contributions are welcome. Please open an issue first for major design changes.
```shell
go test ./...
```

MIT licensed.
