Credit: forked from JuliusBrussee/caveman. Julius built the compression concept. This project adds the rehydration layer.
Claude speaks cavespeak. Gemma rehydrates. You read prose. Token cost stays low.
```
Claude Code CLI
  │ ANTHROPIC_BASE_URL=http://localhost:3000
  ▼
proxy.js (Node, local)
  ├─ injects cavespeak system prompt → Anthropic API
  ├─ tool_use responses → PASS THROUGH (no rehydration)
  └─ end_turn responses → Ollama (Gemma 4) → polished prose → CLI
```
Claude compresses internally. Gemma translates back to English. Tool calls (file reads, bash commands, searches) pass through untouched so the pipeline stays fast.
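The routing rule above can be sketched in a few lines of Node. This is an illustrative sketch, not the actual proxy.js; it assumes only the documented Anthropic `stop_reason` field:

```javascript
// Sketch of the proxy's routing decision (illustrative, not the real proxy.js).
// Anthropic responses carry a stop_reason: "tool_use" means Claude is invoking
// a tool (file read, bash, search); "end_turn" means prose headed for the user.
function shouldRehydrate(anthropicResponse) {
  // Only end-of-turn prose goes to Ollama for rehydration.
  return anthropicResponse.stop_reason === "end_turn";
}

// A tool call passes through untouched...
console.log(shouldRehydrate({ stop_reason: "tool_use" })); // false
// ...while a final answer is sent to Gemma for polishing.
console.log(shouldRehydrate({ stop_reason: "end_turn" })); // true
```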
- `/skill cavespeak`: compression only. Responses are terse caveman-speak. Cheap and fast, but raw.
- `/skill eloquent-cavespeak`: full pipeline. Same token savings, polished output. Requires Ollama running locally.
Prerequisites: Node.js 18+, Ollama
```shell
# 1. Install Ollama
brew install ollama   # or: https://ollama.com/download

# 2. Install the skills into Claude Code
npx skills add coderlevelup/eloquent-cavespeak

# 3. Clone and install dependencies
git clone https://github.com/coderlevelup/eloquent-cavespeak
cd eloquent-cavespeak
npm install
```

Then in Claude Code, run the skill. It handles starting the proxy, checking Ollama, and pulling the model automatically:

```
/skill eloquent-cavespeak
```
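The skill checks Ollama for you, but you can also verify it yourself beforehand; `/api/tags` is Ollama's standard endpoint for listing locally pulled models:

```shell
# Succeeds with a JSON model list if Ollama is serving on its default port.
curl -sf --max-time 2 http://localhost:11434/api/tags || echo "Ollama is not running"
```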
Follow the prompt it returns to restart Claude Code with ANTHROPIC_BASE_URL=http://localhost:3000.
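If you launch Claude Code from a shell, that restart looks something like this (the exact restart step depends on how you start the CLI):

```shell
# Point Claude Code at the local proxy for this shell session.
export ANTHROPIC_BASE_URL=http://localhost:3000
# Then start Claude Code as usual from this shell.
```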
| Variable | Default | Description |
|---|---|---|
| `PORT` | `3000` | Proxy listen port |
| `OLLAMA_HOST` | `http://localhost:11434` | Ollama base URL |
| `GEMMA_MODEL` | `gemma4:latest` | Model used for rehydration |
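All three can be overridden in the environment before starting the proxy. For example (the values below are illustrative, not recommendations):

```shell
# Run the proxy on a different port with an explicit model tag.
export PORT=3001
export GEMMA_MODEL=gemma4:latest
# node proxy.js   # start the proxy with these settings
```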
Run `/skill eloquent-cavespeak` inside this repo, then ask Claude to explain how the proxy works. It will explain itself: compressed upstream, polished on the way out.