Run Claude Code with local Ollama models (free) or authenticate via Claude Pro subscription / Anthropic API.
Covers macOS M-Series, NVIDIA RTX 40-series, Blackwell, and DGX Spark hardware.
| File | Description |
|---|---|
| `Claude_Code_Complete_Setup_Guide.pdf` | Full illustrated guide with tables, hardware specs, and troubleshooting |
| `scripts/setup_ollama.sh` | One-shot setup script for macOS/Linux — Ollama + Claude Code |
| `scripts/reset_claude_code.sh` | Clean-slate reset — removes all Ollama/OpenClaw/dummy-key config |
```shell
# 1. Install Ollama
brew install ollama                              # macOS
curl -fsSL https://ollama.com/install.sh | sh    # Linux

# 2. Pull a model (see hardware guide below)
ollama pull qwen3-coder

# 3. Install Claude Code
curl -fsSL https://claude.ai/install.sh | bash

# 4. Set env vars in ~/.zshrc
echo 'export ANTHROPIC_BASE_URL="http://localhost:11434"' >> ~/.zshrc
echo 'export ANTHROPIC_AUTH_TOKEN="ollama"' >> ~/.zshrc
echo 'export ANTHROPIC_DEFAULT_HAIKU_MODEL="glm-4.7-flash"' >> ~/.zshrc
echo 'export ANTHROPIC_DEFAULT_SONNET_MODEL="qwen3-coder"' >> ~/.zshrc
echo 'export ANTHROPIC_DEFAULT_OPUS_MODEL="qwen3-coder"' >> ~/.zshrc
source ~/.zshrc

# 5. Launch
ollama serve &
claude
```
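Before launching `claude`, it can help to confirm the local endpoint is actually reachable. A minimal sketch — the helper name `ollama_tags_url` is ours, but `/api/tags` is Ollama's model-listing route:

```shell
# Hypothetical helper: build Ollama's health-check URL from ANTHROPIC_BASE_URL,
# falling back to the default local endpoint when the variable is unset.
ollama_tags_url() {
  local base="${ANTHROPIC_BASE_URL:-http://localhost:11434}"
  printf '%s/api/tags\n' "${base%/}"
}

# Usage: a JSON model list in the response means Ollama is up.
#   curl -s "$(ollama_tags_url)"
```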
```shell
# No Ollama needed. Just install Claude Code and login.
curl -fsSL https://claude.ai/install.sh | bash
claude login   # Opens browser — sign in with your claude.ai account
```

```shell
# Or skip login and authenticate with an Anthropic API key instead.
curl -fsSL https://claude.ai/install.sh | bash
echo 'export ANTHROPIC_API_KEY="sk-ant-api03-your-key-here"' >> ~/.zshrc
source ~/.zshrc
claude
```

| RAM | Chip | Recommended Models |
|---|---|---|
| 8 GB | M1/M2/M3/M4 base | glm-4.7-flash, qwen2.5-coder:7b |
| 16 GB | M1/M2/M3/M4 Pro | qwen3-coder, deepseek-coder:13b |
| 24 GB | M1/M2/M3 Max · M4 Max | qwen3-coder, gpt-oss:20b, deepseek-r1:32b |
| 48 GB+ | M1/M2/M3 Ultra · M4 Ultra | deepseek-r1:70b, qwen3-coder:72b |
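To pick a row from the table above without guessing, you can read installed RAM from the OS. A small sketch — the function name `mem_gb` is ours; `hw.memsize` is the macOS sysctl key, with a `/proc/meminfo` fallback for Linux:

```shell
# Report installed RAM in whole GiB: macOS via sysctl, Linux via /proc/meminfo.
mem_gb() {
  if sysctl -n hw.memsize >/dev/null 2>&1; then
    echo $(( $(sysctl -n hw.memsize) / 1073741824 ))                      # bytes -> GiB
  else
    echo $(( $(awk '/MemTotal/ {print $2}' /proc/meminfo) / 1048576 ))    # kB -> GiB
  fi
}
```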
| GPU | VRAM | Recommended Models |
|---|---|---|
| RTX 4060 | 8 GB | glm-4.7-flash, qwen2.5-coder:7b |
| RTX 4060 Ti | 16 GB | qwen3-coder, deepseek-coder:13b |
| RTX 4070 / Ti / Super | 12–16 GB | qwen3-coder, gpt-oss:20b |
| RTX 4080 / Super | 16 GB | gpt-oss:20b, deepseek-r1:32b (Q4) |
| RTX 4090 Workstation | 24 GB | deepseek-r1:32b, llama3.3:70b (Q4) |
| RTX 6000 Ada / Blackwell | 48–96 GB | deepseek-r1:70b, qwen3-coder:72b |
| Multi-GPU RTX 6000 (NVLink) | 96–192 GB | Full 70B+ in FP16 |
| NVIDIA DGX Spark (GB10) | 128 GB | All 70B+ models, multiple concurrent agents |
Claude Code uses 3 internal tiers. You must map all three:
```shell
# Example for 16 GB hardware (Mac 16 GB / RTX 4060 Ti)
export ANTHROPIC_DEFAULT_HAIKU_MODEL="glm-4.7-flash"
export ANTHROPIC_DEFAULT_SONNET_MODEL="qwen3-coder"
export ANTHROPIC_DEFAULT_OPUS_MODEL="qwen3-coder"
```

| Hardware | Haiku | Sonnet | Opus |
|---|---|---|---|
| 8 GB | glm-4.7-flash | qwen2.5-coder:7b | qwen2.5-coder:7b |
| 16 GB | glm-4.7-flash | qwen3-coder | qwen3-coder |
| 24 GB | qwen2.5-coder:7b | qwen3-coder | deepseek-r1:32b |
| 48 GB+ | qwen2.5-coder:7b | qwen3-coder:72b | deepseek-r1:70b |
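The mapping above can also be scripted so the three tiers stay consistent. A sketch under our own naming — `set_claude_tiers` is hypothetical; thresholds and model names follow the table:

```shell
# Export all three tier variables from available memory in GB,
# using the hardware-to-model mapping in the table above.
set_claude_tiers() {
  mem="$1"
  if [ "$mem" -ge 48 ]; then
    export ANTHROPIC_DEFAULT_HAIKU_MODEL="qwen2.5-coder:7b"
    export ANTHROPIC_DEFAULT_SONNET_MODEL="qwen3-coder:72b"
    export ANTHROPIC_DEFAULT_OPUS_MODEL="deepseek-r1:70b"
  elif [ "$mem" -ge 24 ]; then
    export ANTHROPIC_DEFAULT_HAIKU_MODEL="qwen2.5-coder:7b"
    export ANTHROPIC_DEFAULT_SONNET_MODEL="qwen3-coder"
    export ANTHROPIC_DEFAULT_OPUS_MODEL="deepseek-r1:32b"
  elif [ "$mem" -ge 16 ]; then
    export ANTHROPIC_DEFAULT_HAIKU_MODEL="glm-4.7-flash"
    export ANTHROPIC_DEFAULT_SONNET_MODEL="qwen3-coder"
    export ANTHROPIC_DEFAULT_OPUS_MODEL="qwen3-coder"
  else
    export ANTHROPIC_DEFAULT_HAIKU_MODEL="glm-4.7-flash"
    export ANTHROPIC_DEFAULT_SONNET_MODEL="qwen2.5-coder:7b"
    export ANTHROPIC_DEFAULT_OPUS_MODEL="qwen2.5-coder:7b"
  fi
}

# Usage: set_claude_tiers 16   # then launch claude
```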
If you had a previous Ollama / OpenClaw / dummy-key setup:
```shell
chmod +x scripts/reset_claude_code.sh
./scripts/reset_claude_code.sh
```

This removes all Anthropic env vars, logs out, and resets Claude Code config. After reset, run `claude login` to authenticate with Claude Pro.
- `ANTHROPIC_API_KEY` set + Claude Pro subscription → Claude Code uses the API key and bills per-token, ignoring your subscription. Remove the env var and use `claude login` instead.
- Not setting all 3 model tiers → causes `model not found` errors mid-session.
- Ollama not running → `ConnectionRefused`. Always run `ollama serve` before launching Claude.
- VS Code not picking up changes → must fully quit (`Cmd+Q`) and reopen, not just close the window.
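The first two pitfalls can be caught with a quick pre-flight check before starting a session. A sketch — the function name `claude_preflight` is ours, and the API-key check is only a rough proxy (a shell script cannot see your subscription, so it warns whenever both auth styles are configured at once):

```shell
# Warn about common misconfigurations before launching claude.
claude_preflight() {
  ok=0
  # Rough proxy for pitfall 1: both auth mechanisms configured at once.
  if [ -n "${ANTHROPIC_API_KEY:-}" ] && [ -n "${ANTHROPIC_AUTH_TOKEN:-}" ]; then
    echo "warning: both ANTHROPIC_API_KEY and ANTHROPIC_AUTH_TOKEN are set"
    ok=1
  fi
  # Pitfall 2: all three tier variables must be mapped.
  for v in ANTHROPIC_DEFAULT_HAIKU_MODEL \
           ANTHROPIC_DEFAULT_SONNET_MODEL \
           ANTHROPIC_DEFAULT_OPUS_MODEL; do
    eval "val=\${$v:-}"
    if [ -z "$val" ]; then
      echo "warning: $v is not set"
      ok=1
    fi
  done
  return $ok
}

# Usage: claude_preflight && claude
```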
Found a better model for a specific hardware config? Open a PR or raise an issue — hardware landscape changes fast and community updates are welcome.
MIT — free to use, share, and adapt with attribution.