Clawd Throttle

Route every LLM request to the cheapest model that can handle it.

Clawd Throttle is an OpenClaw skill (MCP server) and HTTP reverse proxy. It classifies prompt complexity in under 1 ms and routes each request to the cheapest capable model across 8 hosted LLM providers (plus local models through Ollama) and 30+ models.

Supported Providers

Provider	Models	Input $/MTok	Output $/MTok
Anthropic	Opus 4.6, Opus 4.5, Sonnet 4.5, Haiku 4.5, Haiku 3.5	$0.25–$5.00	$1.25–$25.00
OpenAI	GPT-5.2, GPT-5.1, GPT-5-mini, GPT-5-nano, GPT-4o, GPT-4o-mini, o3	$0.10–$5.00	$0.40–$30.00
Google	Gemini 2.5 Pro, 2.5 Flash, 2.0 Flash-Lite	$0.01–$1.25	$0.02–$10.00
DeepSeek	DeepSeek-Chat, DeepSeek-Reasoner	$0.14–$0.55	$0.28–$2.19
xAI	Grok-4, Grok-3, Grok-3-mini, Grok-4.1-fast	$0.30–$3.00	$0.50–$15.00
Moonshot	Kimi K2.5, K2-thinking	$0.35–$0.60	$1.50–$2.50
Mistral	Mistral Large, Small, Codestral	$0.10–$2.00	$0.30–$6.00
MiniMax	M2.5	$0.30	$1.20
Ollama	Local models (any)	$0.00	$0.00

Every API key is individually optional: configure one or more providers, and Clawd Throttle automatically routes to the best available model among them.

Quick Start

Standalone Use

# 1. Clone and install
git clone <repo> && cd clawd-throttle
npm install

# 2. Set at least one API key
export ANTHROPIC_API_KEY=sk-...    # or any other provider

# 3. Run setup (optional — prompts for all keys and mode)
npm run setup          # Windows
npm run setup:unix     # macOS/Linux

# 4. Start
npm start              # MCP stdio server
npm start -- --http    # MCP + HTTP proxy
npm start -- --http-only  # HTTP proxy only

OpenClaw Integration

📖 See OPENCLAW_SETUP.md for the full integration guide.

Quick version:

1. Install: cd /root/clawd/skills && git clone <repo> && cd clawd-throttle && npm install
2. Configure: create ~/.config/clawd-throttle/config.json with your API keys and mode
3. Start the service: sudo systemctl start clawd-throttle-http (see the systemd template, clawd-throttle-http.service)
4. Route OpenClaw: add ANTHROPIC_BASE_URL=http://127.0.0.1:8484 to env.vars in openclaw.json
5. Restart OpenClaw: openclaw gateway stop && openclaw gateway start

⚠️ DO NOT add throttle as an MCP provider under auth.profiles in openclaw.json — this breaks the gateway. Use the HTTP proxy method with ANTHROPIC_BASE_URL instead.

HTTP Proxy Mode

Clawd Throttle runs as an HTTP reverse proxy that accepts OpenAI and Anthropic API formats. Any client that can point at a custom base URL works without code changes.

Starting the Proxy

CLAWD_THROTTLE_HTTP=true npm start           # Enable HTTP proxy
npm start -- --http                           # CLI flag (HTTP + MCP)
npm start -- --http-only                      # HTTP only
CLAWD_THROTTLE_HTTP_PORT=9090 npm start -- --http  # Custom port

Endpoints

Method	Path	Description
POST	`/v1/messages`	Anthropic Messages API format
POST	`/v1/chat/completions`	OpenAI Chat Completions format
GET	`/health`	Health check with uptime and mode
GET	`/stats`	Routing stats (optional `?days=N`, default 30)
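
As a rough sketch of calling the stats endpoint from code, the helper below builds the `/stats` URL with the optional `days` parameter and fetches it. `statsUrl` and `printStats` are hypothetical names, and the response schema is not described in this README, so the result is treated as opaque JSON.

```typescript
// Hypothetical helper: build the /stats URL, adding ?days=N only when requested.
function statsUrl(baseUrl: string, days?: number): string {
  const url = new URL("/stats", baseUrl);
  if (days !== undefined) {
    url.searchParams.set("days", String(days));
  }
  return url.toString();
}

// Fetch and print the stats. Uses the global fetch available in Node.js 18+.
async function printStats(baseUrl: string, days?: number): Promise<void> {
  const res = await fetch(statsUrl(baseUrl, days));
  if (!res.ok) {
    throw new Error(`stats request failed with status ${res.status}`);
  }
  // The response shape is not documented here, so it is logged as-is.
  console.log(await res.json());
}
```

For example, `printStats("http://localhost:8484", 7)` would request the last 7 days, assuming the proxy is running on the default port.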

Examples

OpenAI format:

curl http://localhost:8484/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are helpful."},
      {"role": "user", "content": "Explain monads"}
    ],
    "max_tokens": 1000
  }'

Streaming:

curl --no-buffer http://localhost:8484/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Write a haiku"}],
    "max_tokens": 100,
    "stream": true
  }'

Force a specific model:

curl http://localhost:8484/v1/messages \
  -H "Content-Type: application/json" \
  -H "X-Throttle-Force-Model: deepseek" \
  -d '{
    "messages": [{"role": "user", "content": "hello"}],
    "max_tokens": 100
  }'

Response Headers

Header	Description
`X-Throttle-Model`	The model that handled the request
`X-Throttle-Tier`	Classified tier: simple, standard, or complex
`X-Throttle-Score`	Raw classifier score (0.00–1.00)
`X-Throttle-Request-Id`	Unique request ID for log correlation
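
A client can read these headers to see how a request was routed. The following is a minimal sketch, not part of the project: `readThrottleHeaders` and the `ThrottleDecision` shape are hypothetical, and the only things taken from the table above are the header names and the three tier values.

```typescript
type Tier = "simple" | "standard" | "complex";

// Hypothetical container for the routing metadata carried in the response headers.
interface ThrottleDecision {
  model: string | null;
  tier: Tier | null;
  score: number | null;
  requestId: string | null;
}

// Anything with a get() method, such as the Headers object on a fetch Response.
interface HeaderSource {
  get(name: string): string | null;
}

function readThrottleHeaders(headers: HeaderSource): ThrottleDecision {
  const tierRaw = headers.get("X-Throttle-Tier");
  const tier: Tier | null =
    tierRaw === "simple" || tierRaw === "standard" || tierRaw === "complex" ? tierRaw : null;

  // The score is documented as a number between 0.00 and 1.00; anything unparsable becomes null.
  const scoreRaw = headers.get("X-Throttle-Score");
  const score = scoreRaw === null ? Number.NaN : Number.parseFloat(scoreRaw);

  return {
    model: headers.get("X-Throttle-Model"),
    tier,
    score: Number.isNaN(score) ? null : score,
    requestId: headers.get("X-Throttle-Request-Id"),
  };
}
```

With `fetch`, this would be called as `readThrottleHeaders(response.headers)` after a request to either POST endpoint.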

Client Configuration

Point any OpenAI- or Anthropic-compatible client at the proxy by overriding its base URL:

# Python (openai SDK)
import openai
client = openai.OpenAI(base_url="http://localhost:8484/v1", api_key="unused")
response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "hello"}],
)

// TypeScript (Anthropic SDK)
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic({
  baseURL: 'http://localhost:8484',
  apiKey: 'unused',
});

Routing Modes

Routing uses preference lists — ordered arrays of models per (mode, tier). The router picks the first model whose provider has a configured API key.

Mode	Simple	Standard	Complex
eco	Grok Fast, Flash, GPT-5-nano	Flash, Grok Fast, GPT-4o-mini	Haiku, DeepSeek-R1, Kimi
standard	Grok Fast, Flash, Haiku	Haiku, Grok Fast, Flash	Sonnet, Haiku, GPT-5.1
performance (default)	Haiku, MiniMax, Grok Fast	MiniMax, Haiku, Kimi	MiniMax, Sonnet, Kimi
gigachad	Haiku, Grok Fast, Flash	Sonnet, Haiku, GPT-5.1	Opus 4.6, Sonnet, GPT-5.2
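
The preference-list lookup can be sketched as follows. This is an illustration under stated assumptions rather than the project's code: `ModelRef` and `pickModel` are hypothetical names, and the provider assigned to each model is inferred from the Supported Providers table.

```typescript
// Illustrative type; the real project may model this differently.
interface ModelRef {
  provider: string; // e.g. "anthropic"
  model: string;    // short name as shown in the table above
}

// One cell of the table (mode "eco", tier "complex") as an ordered preference list.
const ecoComplex: ModelRef[] = [
  { provider: "anthropic", model: "Haiku" },
  { provider: "deepseek", model: "DeepSeek-R1" },
  { provider: "moonshot", model: "Kimi" },
];

// Return the first preferred model whose provider has an API key configured.
function pickModel(
  prefs: readonly ModelRef[],
  configured: ReadonlySet<string>,
): ModelRef | undefined {
  return prefs.find((m) => configured.has(m.provider));
}

// With only DeepSeek and Moonshot keys set, Haiku is skipped.
const choice = pickModel(ecoComplex, new Set(["deepseek", "moonshot"]));
console.log(choice?.model); // prints "DeepSeek-R1"
```

Because the list is ordered, adding an Anthropic key would change the result for this cell to Haiku without any other configuration change.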

How It Works

1. A prompt arrives via the route_request MCP tool or an HTTP proxy endpoint
2. The classifier scores it on 8 dimensions in <1 ms
3. The composite score maps to a tier: simple (<= 0.30), standard (between 0.30 and 0.65), or complex (>= 0.65)
4. Preference list lookup: the first model whose provider is configured wins
5. Fallback: if no preferred model is available, the cheapest available model is used
6. The request is proxied to the selected provider's API
7. The decision is logged to JSONL for cost tracking
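
The tier thresholds and the fallback step can be sketched like this. The function names are hypothetical, and ranking "cheapest" by input price alone is an assumption; the real router may also weigh output price. The two catalog entries use the prices listed in the Supported Providers table.

```typescript
type Tier = "simple" | "standard" | "complex";

// Thresholds as stated above: <= 0.30 is simple, >= 0.65 is complex, anything between is standard.
function tierForScore(score: number): Tier {
  if (score <= 0.3) return "simple";
  if (score >= 0.65) return "complex";
  return "standard";
}

interface PricedModel {
  provider: string;
  model: string;
  inputPerMTok: number; // USD per million input tokens
}

// Fallback sketch: among models whose provider is configured, take the lowest input price.
function cheapestAvailable(
  models: readonly PricedModel[],
  configured: ReadonlySet<string>,
): PricedModel | undefined {
  return models
    .filter((m) => configured.has(m.provider))
    .reduce<PricedModel | undefined>(
      (best, m) => (best === undefined || m.inputPerMTok < best.inputPerMTok ? m : best),
      undefined,
    );
}

// Two entries taken from the provider table, for illustration only.
const catalog: PricedModel[] = [
  { provider: "minimax", model: "M2.5", inputPerMTok: 0.3 },
  { provider: "ollama", model: "local", inputPerMTok: 0 },
];
```

In this sketch a score of exactly 0.30 is simple and exactly 0.65 is complex, matching the inclusive bounds given in the list.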

MCP Tools

Tool	Description
`route_request`	Send prompt to cheapest capable model, get response + routing metadata
`classify_prompt`	Analyze complexity without API call (diagnostic)
`get_routing_stats`	Cost savings, model distribution, tier breakdown
`set_mode`	Change routing mode at runtime
`get_config`	View config with all 8 providers (keys redacted)
`get_recent_routing_log`	Inspect recent routing decisions

Overrides

Heartbeats/summaries: prompts such as "ping" or "summarize this" always go to the cheapest model
Force model: /opus, /deepseek, /grok, /kimi, /mistral, /local, /gpt-5, /o3, etc.
HTTP header: X-Throttle-Force-Model: deepseek
Sub-agents: Pass parentRequestId to step down one tier automatically
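
Two of these overrides lend themselves to a short sketch. Both functions are hypothetical: the parser assumes the force-model prefix is the first token of the prompt, and the step-down function assumes that a simple parent tier stays simple, neither of which is confirmed by this README.

```typescript
type Tier = "simple" | "standard" | "complex";

interface ParsedPrompt {
  forcedModel: string | null; // e.g. "opus" for a prompt starting with "/opus"
  prompt: string;             // the prompt with the prefix removed
}

// Assumption: the force-model prefix is the first whitespace-delimited token of the prompt.
function parseForcePrefix(raw: string): ParsedPrompt {
  const match = /^\/([A-Za-z0-9._-]+)(?:\s+([\s\S]*))?$/.exec(raw.trim());
  if (match === null) {
    return { forcedModel: null, prompt: raw };
  }
  const name = match[1] ?? "";
  return { forcedModel: name.toLowerCase(), prompt: match[2] ?? "" };
}

// Sub-agent sketch: route one tier below the parent request's tier.
// Assumption: "simple" has nothing below it, so it is returned unchanged.
function stepDown(parentTier: Tier): Tier {
  return parentTier === "complex" ? "standard" : "simple";
}
```

Under these assumptions, `parseForcePrefix("/deepseek explain monads")` yields the forced model `deepseek` and the prompt `explain monads`, while a prompt that merely begins with a file path such as `/usr/bin/node` is left untouched.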

Configuration

Config file: ~/.config/clawd-throttle/config.json

Environment Variables

Variable	Description
`ANTHROPIC_API_KEY`	Anthropic API key
`GOOGLE_AI_API_KEY`	Google AI API key
`OPENAI_API_KEY`	OpenAI API key
`DEEPSEEK_API_KEY`	DeepSeek API key
`XAI_API_KEY`	xAI/Grok API key
`MOONSHOT_API_KEY`	Moonshot/Kimi API key
`MISTRAL_API_KEY`	Mistral API key
`MINIMAX_API_KEY`	MiniMax API key
`OLLAMA_BASE_URL`	Ollama base URL (default: http://localhost:11434/v1)
`CLAWD_THROTTLE_MODE`	eco, standard, performance (default), or gigachad
`CLAWD_THROTTLE_LOG_LEVEL`	debug, info, warn, error
`CLAWD_THROTTLE_HTTP`	Set to `true` to enable HTTP proxy
`CLAWD_THROTTLE_HTTP_PORT`	HTTP proxy port (default: 8484)
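
The non-secret variables above could be read along these lines. This is a sketch only: `loadSettings` and `Settings` are hypothetical, the defaults are the ones documented in this README, and how the real tool treats invalid values or combines the environment with config.json is not specified here.

```typescript
type Mode = "eco" | "standard" | "performance" | "gigachad";

interface Settings {
  mode: Mode;
  httpEnabled: boolean;
  httpPort: number;
  ollamaBaseUrl: string;
}

const MODES: readonly Mode[] = ["eco", "standard", "performance", "gigachad"];

// Takes the environment as a plain object so it can be called with process.env.
function loadSettings(env: Record<string, string | undefined>): Settings {
  // Assumption: an unknown or missing mode falls back to the documented default.
  const modeRaw = env["CLAWD_THROTTLE_MODE"];
  const mode: Mode = MODES.find((m) => m === modeRaw) ?? "performance";

  // Assumption: an unparsable or out-of-range port falls back to 8484.
  const port = Number.parseInt(env["CLAWD_THROTTLE_HTTP_PORT"] ?? "", 10);
  const portValid = Number.isInteger(port) && port > 0 && port < 65536;

  return {
    mode,
    httpEnabled: env["CLAWD_THROTTLE_HTTP"] === "true",
    httpPort: portValid ? port : 8484,
    ollamaBaseUrl: env["OLLAMA_BASE_URL"] ?? "http://localhost:11434/v1",
  };
}
```

In a Node.js process this would be called as `loadSettings(process.env)`.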

Requirements

Node.js 18+
At least one LLM provider API key (or Ollama running locally)

Privacy

Prompt content is never stored; logs contain only SHA-256 hashes of prompts
All data stays local in ~/.config/clawd-throttle/
API keys stored in your local config file or environment variables
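
To illustrate the hash-only logging described above, here is a minimal sketch using Node's built-in crypto module. The `RoutingLogEntry` fields and the function names are hypothetical and are not the project's actual log schema; the point is only that the JSONL line carries a digest rather than the prompt text.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of the prompt text.
function hashPrompt(prompt: string): string {
  return createHash("sha256").update(prompt, "utf8").digest("hex");
}

// Hypothetical log record; the field names are illustrative only.
interface RoutingLogEntry {
  requestId: string;
  promptHash: string;
  model: string;
  tier: string;
}

// Serialize one routing decision as a single JSONL line without the prompt itself.
function toJsonlLine(requestId: string, prompt: string, model: string, tier: string): string {
  const entry: RoutingLogEntry = { requestId, promptHash: hashPrompt(prompt), model, tier };
  return JSON.stringify(entry) + "\n";
}
```

Identical prompts produce identical hashes, so repeated requests can still be correlated in the log without the text being recoverable from it.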

Development

npm run dev          # Watch mode
npm test             # Run tests
npm run stats        # View routing stats
