
LLM Backends

CUCo supports multiple LLM providers for code generation, mutation, judging, meta-summarization, and embeddings. This document covers setup for each provider.

Provider Overview

| Provider | Model Name Format | Client | Use Case |
| --- | --- | --- | --- |
| Anthropic (direct) | claude-sonnet-4-6, claude-opus-4-6 | anthropic.Anthropic() | Direct Anthropic API |
| Anthropic (Bedrock) | bedrock/us.anthropic.claude-opus-4-6-v1 | anthropic.AnthropicBedrock() | AWS-managed Anthropic |
| OpenAI | gpt-4.1-mini, o3-mini | openai.OpenAI() | OpenAI API |
| Azure OpenAI | azure-gpt-4.1-mini | openai.AzureOpenAI() | Azure-managed OpenAI |
| DeepSeek | deepseek-chat, deepseek-reasoner | openai.OpenAI(base_url=...) | DeepSeek API |
| Google Gemini | gemini-2.0-flash, gemini-2.5-pro | openai.OpenAI(base_url=...) | Google AI API |
| Claude CLI | claude-cli/opus, claude-cli/sonnet | subprocess | Claude Code CLI |
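
All providers are selected the same way: list model names in llm_models on the EvolutionConfig (shown in full later in this document). A minimal sketch:

evo_config = EvolutionConfig(
    # Any model name from the table above works here; the remaining
    # fields are workload-specific and elided.
    llm_models=["bedrock/us.anthropic.claude-sonnet-4-6"],
    ...
)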

Anthropic (Direct API)

Environment Variables

ANTHROPIC_API_KEY=sk-ant-...

Available Models

llm_models=["claude-sonnet-4-6"]
llm_models=["claude-opus-4-6"]
llm_models=["claude-haiku-4-5"]

Anthropic via AWS Bedrock (recommended)

This is the default provider in the included workloads.

Environment Variables

AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION_NAME=us-east-1

Available Models

Model names use the bedrock/ prefix followed by the Bedrock model ID:

# Sonnet 4.6
llm_models=["bedrock/us.anthropic.claude-sonnet-4-6"]

# Opus 4.6 (strongest, most expensive)
llm_models=["bedrock/us.anthropic.claude-opus-4-6-v1"]

# Sonnet 4.5
llm_models=["bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0"]

# Sonnet 4
llm_models=["bedrock/us.anthropic.claude-sonnet-4-20250514-v1:0"]

# Haiku 4.5 (fastest, cheapest)
llm_models=["bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0"]

Pricing (per million tokens)

| Model | Input | Output |
| --- | --- | --- |
| Claude Opus 4.6 | $5.00 | $25.00 |
| Claude Sonnet 4.6 | $3.00 | $15.00 |
| Claude Sonnet 4.5 | $3.00 | $15.00 |
| Claude Haiku 4.5 | $1.00 | $5.00 |
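
As a worked example of these rates, a single Sonnet 4.6 call with 20,000 input tokens and 2,000 output tokens costs nine cents:

# Sonnet 4.6: $3.00 per 1M input tokens, $15.00 per 1M output tokens.
input_tokens, output_tokens = 20_000, 2_000
cost = input_tokens * 3.00 / 1_000_000 + output_tokens * 15.00 / 1_000_000
print(f"${cost:.3f}")  # $0.090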

OpenAI

Environment Variables

OPENAI_API_KEY=sk-...

Available Models

llm_models=["gpt-4.1-mini"]
llm_models=["gpt-4.1"]
llm_models=["o3-mini"]
llm_models=["o4-mini"]

Azure OpenAI

Environment Variables

AZURE_OPENAI_API_KEY=...
AZURE_API_VERSION=2024-02-15-preview
AZURE_API_ENDPOINT=https://your-resource.openai.azure.com/

Model Names

Use the azure- prefix:

llm_models=["azure-gpt-4.1-mini"]
llm_models=["azure-gpt-4.1"]

DeepSeek

Environment Variables

DEEPSEEK_API_KEY=...

Available Models

llm_models=["deepseek-chat"]
llm_models=["deepseek-reasoner"]

Google Gemini

Environment Variables

GEMINI_API_KEY=...

Available Models

llm_models=["gemini-2.0-flash"]
llm_models=["gemini-2.5-pro-preview-05-06"]
llm_models=["gemini-2.5-flash-preview-04-17"]

Claude CLI

Uses the Claude Code CLI (claude -p) as a subprocess. No API key is needed if the Claude CLI is already authenticated.

llm_models=["claude-cli/opus"]
llm_models=["claude-cli/sonnet"]
llm_models=["claude-cli/haiku"]

This is primarily used for the fast-path agent mode, where the LLM gets full file system autonomy.
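
A rough sketch of the kind of subprocess call involved; the exact flags CUCo passes are not shown here, but claude -p runs a single non-interactive prompt and --model accepts aliases like sonnet or opus:

import subprocess

result = subprocess.run(
    ["claude", "-p", "Summarize this repo", "--model", "sonnet"],
    capture_output=True,
    text=True,
)
print(result.stdout)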

Reasoning Models

Some models support extended thinking / chain-of-thought reasoning. CUCo automatically enables this for known reasoning models:

| Provider | Reasoning Models |
| --- | --- |
| Anthropic | claude-3-7-sonnet-*, claude-sonnet-4-*, claude-opus-4-* |
| OpenAI | o3-mini, o4-mini |
| DeepSeek | deepseek-reasoner |
| Gemini | gemini-2.5-pro-*, gemini-2.5-flash-* |

For reasoning models, CUCo:

  • Sets temperature to 1.0 (required by most reasoning APIs)
  • Adds thinking/budget parameters (e.g., thinking.budget_tokens for Anthropic)
  • Passes reasoning_effort if configured in llm_kwargs
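
At the Anthropic API level, for example, extended thinking looks like this (a sketch reusing the direct client from earlier; budget_tokens must be below max_tokens):

reply = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16_000,
    temperature=1.0,  # required when thinking is enabled
    thinking={"type": "enabled", "budget_tokens": 8_000},
    messages=[{"role": "user", "content": "..."}],
)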

Embedding Models

Used for novelty filtering and similarity-based retrieval.

| Provider | Model | Environment Variable |
| --- | --- | --- |
| OpenAI | text-embedding-3-small, text-embedding-3-large | OPENAI_API_KEY |
| Azure | azure-text-embedding-3-small | AZURE_OPENAI_API_KEY |
| Gemini | gemini-embedding-001 | GEMINI_API_KEY |
| Bedrock | bedrock-amazon.titan-embed-text-v1 | AWS_ACCESS_KEY_ID |

Configure via:

evo_config = EvolutionConfig(
    embedding_model="bedrock-amazon.titan-embed-text-v1",
    ...
)

Dynamic Model Selection

CUCo can automatically select between multiple models using a bandit algorithm:

evo_config = EvolutionConfig(
    llm_models=[
        "bedrock/us.anthropic.claude-opus-4-6-v1",
        "bedrock/us.anthropic.claude-sonnet-4-6",
    ],
    llm_dynamic_selection="ucb",  # Asymmetric Upper Confidence Bound
    ...
)

The UCB bandit tracks which models produce higher-scoring candidates and allocates more queries to better-performing models over time.

Alternatively, leave llm_dynamic_selection as None (the default) for round-robin selection across models.
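
For intuition, an illustrative UCB1-style selection rule (not CUCo's exact implementation) looks like this:

import math

def ucb_pick(models, pulls, mean_scores, total_pulls, c=1.0):
    # Pick the model with the best mean candidate score plus an
    # exploration bonus that shrinks the more a model has been queried.
    def ucb(m):
        if pulls[m] == 0:
            return float("inf")  # query every model at least once
        return mean_scores[m] + c * math.sqrt(math.log(total_pulls) / pulls[m])
    return max(models, key=ucb)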

Cost Tracking

CUCo tracks API costs for all LLM calls. Each QueryResult includes input_cost and output_cost based on the pricing tables in cuco/llm/models/pricing.py. Cumulative costs are logged during evolution.

To add a new model, add its pricing entry to the appropriate dictionary in pricing.py:

BEDROCK_MODELS = {
    "bedrock/your-new-model-id": {
        "input_price": X / M,   # X: USD per million input tokens
        "output_price": Y / M,  # Y: USD per million output tokens
    },
    ...
}

Choosing a Model

Recommendations for CUCo workloads:

| Role | Recommended | Rationale |
| --- | --- | --- |
| Mutation (slow-path) | Opus 4.6 or Sonnet 4.6 | Complex code reasoning, large context |
| Meta-summarization | Opus 4.6 | Cross-generation pattern analysis |
| Fast-path rewrite | Sonnet 4.6 | Good balance of quality and cost |
| Fast-path judge | Same as rewriter | Simpler task, lower token count |
| Evaluation feedback | Sonnet 4.6 | Quick factual analysis |
| Embeddings | Titan or text-embedding-3-small | Cheap, fast |

For budget-conscious runs, Sonnet 4.6 works well for all roles. For maximum quality, use Opus 4.6 for mutation and meta-summarization.