Problem
VibePod's LLM integration works by injecting ANTHROPIC_BASE_URL / CODEX_OSS_BASE_URL into agent containers, expecting an Anthropic- or OpenAI-compatible endpoint at that URL.
However, ollama.com's cloud API only exposes its native endpoint (/api/chat with Authorization: Bearer) — not the /v1/messages (Anthropic) or /v1/chat/completions (OpenAI) compatibility layers. Those are features of the local Ollama server process, not the hosted service.
This means users who want to use Ollama cloud models today must run a local Ollama instance, defeating the purpose of a cloud-backed workflow.
Proposed Solution: Ollama Sidecar Container
Add a managed ollama container to VibePod's stack (alongside proxy and datasette). The sidecar runs Ollama locally but uses :cloud-suffixed model names to offload inference to ollama.com. Agent containers hit the sidecar's compatibility endpoints as normal.
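One way the sidecar could be wired in, sketched here as a compose-style fragment. This is an assumption about the shape of the wiring, not VibePod's actual container definition; the service name `vibepod-ollama` matches the `base_url` in the config below.

```yaml
# Hypothetical compose-style sketch of the sidecar (illustrative only;
# VibePod manages its containers internally and may differ).
services:
  vibepod-ollama:
    image: ollama/ollama:latest
    environment:
      # Passed through from the host so :cloud models can authenticate
      - OLLAMA_API_KEY=${OLLAMA_API_KEY}
    ports:
      - "11434:11434"  # serves the compat endpoints agents expect
```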
Config:

```yaml
# ~/.config/vibepod/config.yaml
llm:
  enabled: true
  base_url: "http://vibepod-ollama:11434"
  api_key: "ollama"
  model: "gpt-oss:120b-cloud"
```
Usage:

```sh
# Sidecar starts automatically, OLLAMA_API_KEY is passed through
VP_LLM_ENABLED=true VP_LLM_MODEL=gpt-oss:120b-cloud vp run claude
```
This fits the existing image namespace pattern:

```
ollama -> ollama/ollama:latest
```

With an optional env override:

```sh
VP_IMAGE_OLLAMA=ollama/ollama:latest vp run claude
```
Why this approach
- No changes needed to `vibepod-proxy` or agent containers
- Both `claude` and `codex` agents work immediately (Anthropic + OpenAI compat layers are exposed on the sidecar)
- Users get cloud inference without installing Ollama on the host; `OLLAMA_API_KEY` is the only credential needed
- Lightweight — the sidecar itself does no local inference for `:cloud` models
Alternative Considered: Proxy Translation Layer
Extend `vibepod-proxy` with a mitmproxy addon that intercepts `/v1/messages` or `/v1/chat/completions` calls and rewrites them to Ollama's native `https://ollama.com/api/chat` format. More powerful (no sidecar needed, everything cloud-native), but significantly more complex to implement correctly, especially for streaming/SSE responses.
Could be pursued as a follow-up.
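If that route is explored, the core request rewrite might look like the minimal Python sketch below. The payload shapes follow the public Anthropic Messages and Ollama `/api/chat` APIs, not any VibePod code, and the genuinely hard part (translating streaming/SSE responses back) is omitted entirely.

```python
# Sketch of the payload translation a mitmproxy addon would perform.
# Field names are based on the public Anthropic and Ollama API shapes;
# this is illustrative, not VibePod's implementation.

def anthropic_to_ollama(payload: dict) -> dict:
    """Rewrite an Anthropic /v1/messages body into an Ollama /api/chat body."""
    messages = []
    # Anthropic carries the system prompt as a top-level field
    if "system" in payload:
        messages.append({"role": "system", "content": payload["system"]})
    for m in payload["messages"]:
        content = m["content"]
        # Anthropic allows a list of content blocks; keep only text blocks
        if isinstance(content, list):
            content = "".join(b["text"] for b in content if b.get("type") == "text")
        messages.append({"role": m["role"], "content": content})
    return {
        "model": payload["model"],
        "messages": messages,
        "stream": payload.get("stream", False),
        # Ollama's num_predict option caps generated tokens
        "options": {"num_predict": payload.get("max_tokens", -1)},
    }
```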
Implementation Notes
- The sidecar needs `OLLAMA_API_KEY` injected so cloud model auth works
- Use the `host.docker.internal` pattern already established in the LLM docs for Docker networking
- `vp run ollama` or auto-start as a dependency when `llm.enabled: true` and `base_url` points to the sidecar
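The auto-start condition from the last note could be as simple as the sketch below. The function name, config shape, and sidecar hostname are assumptions for illustration, not VibePod internals.

```python
# Hypothetical auto-start check (names are assumptions, not VibePod code).
from urllib.parse import urlparse

SIDECAR_HOST = "vibepod-ollama"  # assumed managed-container hostname

def should_start_sidecar(llm_cfg: dict) -> bool:
    """Start the ollama container only when the LLM feature is enabled
    and base_url actually points at the managed sidecar (not, say, a
    host-local Ollama reached via host.docker.internal)."""
    if not llm_cfg.get("enabled"):
        return False
    return urlparse(llm_cfg.get("base_url", "")).hostname == SIDECAR_HOST
```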