
🔌 Providers & Model Configuration

Back to README

Providers

Note

Voice transcription can use a configured multimodal model via voice.model_name. Groq Whisper remains available as a fallback when no voice model is configured.

| Provider | Purpose | Get API Key |
| --- | --- | --- |
| gemini | LLM (Gemini direct) | aistudio.google.com |
| zhipu | LLM (Zhipu direct) | bigmodel.cn |
| zai-coding | LLM (Z.AI Coding Plan) | z.ai |
| volcengine | LLM (Volcengine direct) | volcengine.com |
| openrouter | LLM (recommended, access to all models) | openrouter.ai |
| anthropic | LLM (Claude direct) | console.anthropic.com |
| openai | LLM (GPT direct) | platform.openai.com |
| deepseek | LLM (DeepSeek direct) | platform.deepseek.com |
| qwen | LLM (Qwen direct) | dashscope.console.aliyun.com |
| groq | LLM + Voice transcription (Whisper) | console.groq.com |
| cerebras | LLM (Cerebras direct) | cerebras.ai |
| vivgrid | LLM (Vivgrid direct) | vivgrid.com |
| nvidia | LLM (NVIDIA NIM) | build.nvidia.com |
| moonshot | LLM (Kimi/Moonshot direct) | platform.moonshot.cn |
| minimax | LLM (MiniMax direct) | platform.minimaxi.com |
| avian | LLM (Avian direct) | avian.io |
| mistral | LLM (Mistral direct) | console.mistral.ai |
| longcat | LLM (LongCat direct) | longcat.ai |
| modelscope | LLM (ModelScope direct) | modelscope.cn |
| mimo | LLM (Xiaomi MiMo direct) | platform.xiaomimimo.com |

Model Configuration (model_list)

What's New? PicoClaw now uses a model-centric configuration approach. Simply specify vendor/model format (e.g., zhipu/glm-4.7) to add new providers—zero code changes required!

This design also enables multi-agent support with flexible provider selection:

  • Different agents, different providers: Each agent can use its own LLM provider
  • Model fallbacks: Configure primary and fallback models for resilience
  • Load balancing: Distribute requests across multiple endpoints
  • Centralized configuration: Manage all providers in one place

📋 All Supported Vendors

| Vendor | Model Prefix | Default API Base | Protocol | API Key |
| --- | --- | --- | --- | --- |
| OpenAI | openai/ | https://api.openai.com/v1 | OpenAI | Get Key |
| Anthropic | anthropic/ | https://api.anthropic.com/v1 | Anthropic | Get Key |
| 智谱 AI (GLM) | zhipu/ | https://open.bigmodel.cn/api/paas/v4 | OpenAI | Get Key |
| Z.AI Coding Plan | openai/ | https://api.z.ai/api/coding/paas/v4 | OpenAI | Get Key |
| DeepSeek | deepseek/ | https://api.deepseek.com/v1 | OpenAI | Get Key |
| Google Gemini | gemini/ | https://generativelanguage.googleapis.com/v1beta | OpenAI | Get Key |
| Groq | groq/ | https://api.groq.com/openai/v1 | OpenAI | Get Key |
| Moonshot | moonshot/ | https://api.moonshot.cn/v1 | OpenAI | Get Key |
| 通义千问 (Qwen) | qwen/ | https://dashscope.aliyuncs.com/compatible-mode/v1 | OpenAI | Get Key |
| NVIDIA | nvidia/ | https://integrate.api.nvidia.com/v1 | OpenAI | Get Key |
| Ollama | ollama/ | http://localhost:11434/v1 | OpenAI | Local (no key needed) |
| OpenRouter | openrouter/ | https://openrouter.ai/api/v1 | OpenAI | Get Key |
| LiteLLM Proxy | litellm/ | http://localhost:4000/v1 | OpenAI | Your LiteLLM proxy key |
| vLLM | vllm/ | http://localhost:8000/v1 | OpenAI | Local |
| Cerebras | cerebras/ | https://api.cerebras.ai/v1 | OpenAI | Get Key |
| VolcEngine (Doubao) | volcengine/ | https://ark.cn-beijing.volces.com/api/v3 | OpenAI | Get Key |
| 神算云 | shengsuanyun/ | https://router.shengsuanyun.com/api/v1 | OpenAI | - |
| BytePlus | byteplus/ | https://ark.ap-southeast.bytepluses.com/api/v3 | OpenAI | Get Key |
| Vivgrid | vivgrid/ | https://api.vivgrid.com/v1 | OpenAI | Get Key |
| LongCat | longcat/ | https://api.longcat.chat/openai | OpenAI | Get Key |
| ModelScope (魔搭) | modelscope/ | https://api-inference.modelscope.cn/v1 | OpenAI | Get Token |
| Xiaomi MiMo | mimo/ | https://api.xiaomimimo.com/v1 | OpenAI | Get Key |
| Azure OpenAI | azure/ | https://{resource}.openai.azure.com | Azure | Get Key |
| Antigravity | antigravity/ | Google Cloud | Custom | OAuth only |
| GitHub Copilot | github-copilot/ | localhost:4321 | gRPC | - |

Basic Configuration

```json
{
  "model_list": [
    {
      "model_name": "ark-code-latest",
      "model": "volcengine/ark-code-latest",
      "api_key": "sk-your-api-key"
    },
    {
      "model_name": "gpt-5.4",
      "model": "openai/gpt-5.4",
      "api_key": "sk-your-openai-key"
    },
    {
      "model_name": "claude-sonnet-4.6",
      "model": "anthropic/claude-sonnet-4.6",
      "api_key": "sk-ant-your-key"
    },
    {
      "model_name": "glm-4.7",
      "model": "zhipu/glm-4.7",
      "api_key": "your-zhipu-key"
    }
  ],
  "agents": {
    "defaults": {
      "model_name": "gpt-5.4"
    }
  }
}
```

Voice Transcription

You can configure a dedicated model for audio transcription with voice.model_name. This lets you reuse existing multimodal providers that support audio input instead of relying only on Groq.

If voice.model_name is not configured, PicoClaw will continue to fall back to Groq transcription when a Groq API key is available.

```json
{
  "model_list": [
    {
      "model_name": "voice-gemini",
      "model": "gemini/gemini-2.5-flash",
      "api_key": "your-gemini-key"
    }
  ],
  "voice": {
    "model_name": "voice-gemini",
    "echo_transcription": false
  },
  "providers": {
    "groq": {
      "api_key": "gsk_xxx"
    }
  }
}
```

Vendor-Specific Examples

OpenAI

```json
{
  "model_name": "gpt-5.4",
  "model": "openai/gpt-5.4",
  "api_key": "sk-..."
}
```

VolcEngine (Doubao)

```json
{
  "model_name": "ark-code-latest",
  "model": "volcengine/ark-code-latest",
  "api_key": "sk-..."
}
```

智谱 AI (GLM)

```json
{
  "model_name": "glm-4.7",
  "model": "zhipu/glm-4.7",
  "api_key": "your-key"
}
```

Z.AI Coding Plan (GLM)

Z.AI and 智谱 AI (Zhipu) are two brands of the same provider. For the Z.AI Coding Plan, use the openai/ model prefix together with the Z.AI api_base, rather than the zhipu configuration:

```json
{
  "model_name": "glm-4.7",
  "model": "openai/glm-4.7",
  "api_key": "your-z.ai-key",
  "api_base": "https://api.z.ai/api/coding/paas/v4"
}
```

DeepSeek

```json
{
  "model_name": "deepseek-chat",
  "model": "deepseek/deepseek-chat",
  "api_key": "sk-..."
}
```

Anthropic (with API key)

```json
{
  "model_name": "claude-sonnet-4.6",
  "model": "anthropic/claude-sonnet-4.6",
  "api_key": "sk-ant-your-key"
}
```

Run picoclaw auth login --provider anthropic to paste your API token.

Anthropic Messages API (native format)

For direct Anthropic API access or custom endpoints that only support Anthropic's native message format:

```json
{
  "model_name": "claude-opus-4-6",
  "model": "anthropic-messages/claude-opus-4-6",
  "api_key": "sk-ant-your-key",
  "api_base": "https://api.anthropic.com"
}
```

Use anthropic-messages protocol when:

  • Using third-party proxies that only support Anthropic's native /v1/messages endpoint (not OpenAI-compatible /v1/chat/completions)
  • Connecting to services such as MiniMax or Synthetic that require Anthropic's native message format
  • The existing anthropic protocol returns 404 errors (indicating the endpoint doesn't support OpenAI-compatible format)

Note: The anthropic protocol uses OpenAI-compatible format (/v1/chat/completions), while anthropic-messages uses Anthropic's native format (/v1/messages). Choose based on your endpoint's supported format.
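The practical difference is just the endpoint path and payload shape. A minimal sketch of how a request could be built for each protocol (the helper function is hypothetical, not PicoClaw's actual code):

```python
def build_request(protocol: str, api_base: str, model: str, text: str):
    """Return (url, payload) for a single-turn user message.

    Illustrative only: shows the endpoint/payload difference between the
    'anthropic-messages' (native) and 'anthropic' (OpenAI-compatible) protocols.
    """
    if protocol == "anthropic-messages":
        # Anthropic native format: POST {base}/v1/messages, max_tokens is required
        return (f"{api_base}/v1/messages", {
            "model": model,
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": text}],
        })
    # 'anthropic' protocol: OpenAI-compatible POST {base}/v1/chat/completions
    return (f"{api_base}/v1/chat/completions", {
        "model": model,
        "messages": [{"role": "user", "content": text}],
    })

url, payload = build_request("anthropic-messages", "https://api.anthropic.com",
                             "claude-opus-4-6", "Hello")
print(url)  # https://api.anthropic.com/v1/messages
```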

Ollama (local)

```json
{
  "model_name": "llama3",
  "model": "ollama/llama3"
}
```

Custom Proxy/API

```json
{
  "model_name": "my-custom-model",
  "model": "openai/custom-model",
  "api_base": "https://my-proxy.com/v1",
  "api_key": "sk-...",
  "request_timeout": 300
}
```

LiteLLM Proxy

```json
{
  "model_name": "lite-gpt4",
  "model": "litellm/lite-gpt4",
  "api_base": "http://localhost:4000/v1",
  "api_key": "sk-..."
}
```

PicoClaw strips only the outer litellm/ prefix before sending the request, so proxy aliases like litellm/lite-gpt4 send lite-gpt4, while litellm/openai/gpt-4o sends openai/gpt-4o.
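The prefix handling can be pictured with a small sketch (the function name is illustrative, not PicoClaw's internal API):

```python
def strip_outer_litellm_prefix(model: str) -> str:
    """Remove only the leading 'litellm/' routing prefix; any inner vendor
    prefix such as 'openai/' is passed through to the proxy untouched."""
    prefix = "litellm/"
    return model[len(prefix):] if model.startswith(prefix) else model

print(strip_outer_litellm_prefix("litellm/lite-gpt4"))      # lite-gpt4
print(strip_outer_litellm_prefix("litellm/openai/gpt-4o"))  # openai/gpt-4o
```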

Z.AI Coding Plan

If the standard Zhipu endpoint (https://open.bigmodel.cn/api/paas/v4) returns 429 (code 1113: insufficient balance), try using the Z.AI Coding Plan endpoint instead:

```json
{
  "model_name": "glm-4.7",
  "model": "openai/glm-4.7",
  "api_key": "your-zhipu-api-key",
  "api_base": "https://api.z.ai/api/coding/paas/v4"
}
```

Note: The Z.AI Coding Plan endpoint and standard Zhipu endpoint use the same API key format but have separate billing. If you encounter 429 errors with the standard Zhipu endpoint, the Z.AI Coding Plan endpoint may have available balance.

Load Balancing

Configure multiple endpoints for the same model name—PicoClaw will automatically round-robin between them:

```json
{
  "model_list": [
    {
      "model_name": "gpt-5.4",
      "model": "openai/gpt-5.4",
      "api_base": "https://api1.example.com/v1",
      "api_key": "sk-key1"
    },
    {
      "model_name": "gpt-5.4",
      "model": "openai/gpt-5.4",
      "api_base": "https://api2.example.com/v1",
      "api_key": "sk-key2"
    }
  ]
}
```

Automatic Model Failover (Cascade)

PicoClaw already supports automatic failover when you configure primary + fallbacks in the agent model settings. The runtime fallback chain retries the next candidate for retriable failures such as HTTP 429, quota/rate-limit errors, and timeout errors. It also applies cooldown tracking per candidate to avoid immediately retrying a recently failed target.

```json
{
  "model_list": [
    {
      "model_name": "qwen-main",
      "model": "openai/qwen3.5:cloud",
      "api_base": "https://api.example.com/v1",
      "api_key": "sk-main"
    },
    {
      "model_name": "deepseek-backup",
      "model": "deepseek/deepseek-chat",
      "api_key": "sk-backup-1"
    },
    {
      "model_name": "gemini-backup",
      "model": "gemini/gemini-2.5-flash",
      "api_key": "sk-backup-2"
    }
  ],
  "agents": {
    "defaults": {
      "model": {
        "primary": "qwen-main",
        "fallbacks": ["deepseek-backup", "gemini-backup"]
      }
    }
  }
}
```

If you use key-level failover for the same model, PicoClaw can chain through additional key-backed candidates before moving to cross-model backups.
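The cascade behavior described above (retry only retriable failures, skip candidates that failed recently) can be sketched roughly as follows. This is an illustrative approximation, not PicoClaw's implementation; the function names, status codes, and 60-second cooldown are assumptions:

```python
import time

RETRIABLE = {408, 429}        # timeout / rate-limit style statuses (assumed)
COOLDOWN_SECONDS = 60.0       # assumed cooldown window
_last_failure: dict[str, float] = {}

def call_with_fallback(candidates, send):
    """Try each candidate in order; skip any still cooling down after a
    recent retriable failure, and stop at the first success."""
    now = time.monotonic()
    for name in candidates:
        if now - _last_failure.get(name, -COOLDOWN_SECONDS) < COOLDOWN_SECONDS:
            continue  # failed too recently, let it cool down
        status, reply = send(name)
        if status == 200:
            return reply
        if status in RETRIABLE:
            _last_failure[name] = time.monotonic()
            continue  # move on to the next candidate
        raise RuntimeError(f"non-retriable error {status} from {name}")
    raise RuntimeError("all candidates exhausted or cooling down")
```

With the config above, the candidate list would be `["qwen-main", "deepseek-backup", "gemini-backup"]`: a 429 from the primary marks it as cooling down and routes the next request straight to the first backup.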

Migration from Legacy providers Config

The old providers configuration is deprecated but still supported for backward compatibility.

Old Config (deprecated):

```json
{
  "providers": {
    "zhipu": {
      "api_key": "your-key",
      "api_base": "https://open.bigmodel.cn/api/paas/v4"
    }
  },
  "agents": {
    "defaults": {
      "provider": "zhipu",
      "model": "glm-4.7"
    }
  }
}
```

New Config (recommended):

```json
{
  "model_list": [
    {
      "model_name": "glm-4.7",
      "model": "zhipu/glm-4.7",
      "api_key": "your-key"
    }
  ],
  "agents": {
    "defaults": {
      "model_name": "glm-4.7"
    }
  }
}
```

For detailed migration guide, see migration/model-list-migration.md.

Provider Architecture

PicoClaw routes providers by protocol family:

  • OpenAI-compatible protocol: OpenRouter, OpenAI-compatible gateways, Groq, Zhipu, and vLLM-style endpoints.
  • Anthropic protocol: Claude-native API behavior.
  • Codex/OAuth path: OpenAI OAuth/token authentication route.

This keeps the runtime lightweight while making new OpenAI-compatible backends mostly a config operation (api_base + api_key).
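Routing by protocol family amounts to a small lookup from the vendor prefix to a protocol, with OpenAI-compatible as the default. A hypothetical sketch (the table contents are an illustrative subset, not PicoClaw's actual routing code):

```python
# Vendor prefixes that need something other than the OpenAI-compatible
# protocol (illustrative subset based on the vendor table above)
SPECIAL_PROTOCOLS = {
    "anthropic": "anthropic",
    "anthropic-messages": "anthropic-messages",
    "azure": "azure",
    "github-copilot": "grpc",
}

def route_model(model: str) -> tuple[str, str]:
    """Split 'vendor/model' at the first slash and pick the protocol family."""
    vendor, _, name = model.partition("/")
    return SPECIAL_PROTOCOLS.get(vendor, "openai"), name

print(route_model("zhipu/glm-4.7"))                # ('openai', 'glm-4.7')
print(route_model("anthropic/claude-sonnet-4.6"))  # ('anthropic', 'claude-sonnet-4.6')
```

Because the default branch covers every OpenAI-compatible vendor, adding a new one is purely a config change: no new entry in the routing table is needed.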

Zhipu

1. Get API key and base URL

2. Configure

```json
{
  "agents": {
    "defaults": {
      "workspace": "~/.picoclaw/workspace",
      "model_name": "glm-4.7",
      "max_tokens": 8192,
      "temperature": 0.7,
      "max_tool_iterations": 20
    }
  },
  "providers": {
    "zhipu": {
      "api_key": "Your API Key",
      "api_base": "https://open.bigmodel.cn/api/paas/v4"
    }
  }
}
```

3. Run

```shell
picoclaw agent -m "Hello"
```

Full config example
```json
{
  "agents": {
    "defaults": {
      "model_name": "anthropic/claude-opus-4-5"
    }
  },
  "session": {
    "dm_scope": "per-channel-peer"
  },
  "providers": {
    "openrouter": {
      "api_key": "sk-or-v1-xxx"
    },
    "groq": {
      "api_key": "gsk_xxx"
    }
  },
  "voice": {
    "model_name": "voice-gemini",
    "echo_transcription": false
  },
  "channels": {
    "telegram": {
      "enabled": true,
      "token": "123456:ABC...",
      "allow_from": ["123456789"]
    },
    "discord": {
      "enabled": true,
      "token": "",
      "allow_from": [""]
    },
    "whatsapp": {
      "enabled": false,
      "bridge_url": "ws://localhost:3001",
      "use_native": false,
      "session_store_path": "",
      "allow_from": []
    },
    "feishu": {
      "enabled": false,
      "app_id": "cli_xxx",
      "app_secret": "xxx",
      "encrypt_key": "",
      "verification_token": "",
      "allow_from": []
    },
    "qq": {
      "enabled": false,
      "app_id": "",
      "app_secret": "",
      "allow_from": []
    }
  },
  "tools": {
    "web": {
      "brave": {
        "enabled": false,
        "api_key": "BSA...",
        "max_results": 5
      },
      "duckduckgo": {
        "enabled": true,
        "max_results": 5
      },
      "perplexity": {
        "enabled": false,
        "api_key": "",
        "max_results": 5
      },
      "searxng": {
        "enabled": false,
        "base_url": "http://localhost:8888",
        "max_results": 5
      }
    },
    "cron": {
      "exec_timeout_minutes": 5
    }
  },
  "heartbeat": {
    "enabled": true,
    "interval": 30
  }
}
```

📝 API Key Comparison

| Service | Pricing | Use Case |
| --- | --- | --- |
| OpenRouter | Free: 200K tokens/month | Multiple models (Claude, GPT-4, etc.) |
| Volcengine Coding Plan | ¥9.9/first month | Best for Chinese users; multiple SOTA models (Doubao, DeepSeek, etc.) |
| Zhipu | Free: 200K tokens/month | Suitable for Chinese users |
| Brave Search | $5/1000 queries | Web search functionality |
| SearXNG | Free (self-hosted) | Privacy-focused metasearch (70+ engines) |
| Groq | Free tier available | Fast inference (Llama, Mixtral) |
| Cerebras | Free tier available | Fast inference (Llama, Qwen, etc.) |
| LongCat | Free: up to 5M tokens/day | Fast inference |
| ModelScope | Free: 2000 requests/day | Inference (Qwen, GLM, DeepSeek, etc.) |

PicoClaw Meme