A reference library showcasing agent-level optimization techniques for LLM-based applications. This is a research/prototype project designed to help developers understand and explore different optimization strategies they can apply to their own agents.
This library is grounded in academic research from 2024-2025. See docs/RESEARCH_BASIS.md for the research papers and industry resources that inform these techniques.
A collection of 15+ independent optimization modules covering:
- Caching strategies (simple, semantic, response-based, prefix, KV caching)
- Model routing (cost-aware, cascade, fallback patterns)
- Prompt optimization (compression, few-shot selection, templates)
- Context management (sliding windows, token counting, memory hierarchy)
- Request batching (sync/async request batching)
- Advanced retrieval (hybrid search, reranking, hypothetical documents)
- Advanced prompting (chain-of-thought, few-shot CoT, ReAct)
- Structured output handling (parsing, validation, function calls)
- Cost tracking (budget management, cost analysis)
- Evaluation & monitoring (feedback loops, metrics, agent monitoring)
- Framework integrations (DSPy, LangGraph)
- Specialized techniques (streaming, parallel execution, speculative execution)
Each module can be used independently or composed together.
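As a minimal sketch of what "composed together" means, the snippet below layers a cache in front of a cheap-to-expensive model cascade. Everything here is hypothetical (the `call_model` stub, the model names, and the 0.7 confidence threshold are not part of this package):

```python
# Illustrative composition: a cache in front of a cheap -> expensive model cascade.
def call_model(model: str, prompt: str) -> tuple[str, float]:
    """Stub LLM call returning (answer, confidence)."""
    if model == "cheap-model":
        return ("maybe", 0.4)        # low confidence -> escalate
    return ("definitely", 0.95)      # expensive model is confident

cache: dict[str, str] = {}

def answer(prompt: str) -> str:
    if prompt in cache:              # caching layer: skip the LLM entirely
        return cache[prompt]
    for model in ("cheap-model", "expensive-model"):   # cascade routing layer
        text, confidence = call_model(model, prompt)
        if confidence >= 0.7:        # accept the first confident answer
            break
    cache[prompt] = text
    return text

print(answer("Is composition useful?"))  # cascade escalates -> "definitely"
print(answer("Is composition useful?"))  # second call is served from the cache
```

Each layer is independent: you could drop the cascade and keep the cache, or vice versa, without touching the other.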
- Not production-ready code - This is prototype/research quality
- Not model-level optimization - Focuses only on agent/application layer
- Not hardware-level optimization - Does not address compute or infrastructure
- Not a turnkey framework - Requires understanding and adaptation for specific use cases
Testing: 282 tests passing (244 unit + 38 integration tests)
Documentation: Quickstart guide + module docs + working examples
Code Quality: Clean APIs with comprehensive test coverage; production-grade error handling is limited
pip install -e .

Or install specific feature sets:
pip install -e ".[tiktoken]" # For token counting
pip install -e ".[dspy]" # For DSPy integration
pip install -e ".[langgraph]" # For LangGraph integration
pip install -e ".[hnswlib]"   # For HNSW retrieval

from agent_opt.caching import SimpleCache
cache = SimpleCache(max_size=1000, ttl_seconds=3600)
# Check cache
result = cache.get(prompt)
if result is None:
    result = llm.call(prompt)
    cache.set(prompt, result)
print(cache.get_stats())

from agent_opt.routing import CostAwareRouter, CostModelConfig
router = CostAwareRouter(
    models=[
        CostModelConfig("gpt-4", cost_per_1k_input=0.03, cost_per_1k_output=0.06),
        CostModelConfig("gpt-3.5", cost_per_1k_input=0.0005, cost_per_1k_output=0.0015),
    ],
    optimization_goal="balanced",
)
selected = router.select_model(input_tokens=100, output_tokens=50)
print(f"Using {selected.name}")

from agent_opt.context import SlidingWindow
window = SlidingWindow(max_messages=10, max_tokens=4000)
window.add_message({"role": "user", "content": "Hello"})
window.add_message({"role": "assistant", "content": "Hi there"})
messages = window.get_messages()  # Bounded context

Run the examples to see techniques in action:
python examples/basic_caching_example.py
python examples/model_routing_example.py
python examples/comprehensive_optimization_example.py

src/agent_opt/
├── caching/ # Cache implementations
├── routing/ # Model routing strategies
├── prompts/ # Prompt optimization
├── context/ # Context window management
├── batching/ # Request batching
├── retrieval/ # Advanced retrieval techniques
├── advanced_prompts/ # Complex prompting strategies
├── structured/ # Structured output handling
├── cost/ # Cost tracking and budgeting
├── evaluation/ # Evaluation and monitoring
├── advanced_caching/ # Specialized caching techniques
├── streaming/ # Token streaming utilities
├── request_optimization/ # Request-level optimizations
├── dspy_integration/ # DSPy framework support
└── langgraph_integration/ # LangGraph support
examples/ # Usage examples
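To illustrate the idea behind the `batching/` package above, here is a self-contained micro-batching sketch. It is not this library's API: `batch_llm_call`, the `Batcher` class, and the batch size are stand-ins (the stub "responds" by uppercasing each prompt in place of a real provider batch endpoint).

```python
# Illustrative request batching: buffer prompts, then flush them as one batch call.
def batch_llm_call(prompts: list[str]) -> list[str]:
    """Stub batch endpoint: one response per prompt."""
    return [p.upper() for p in prompts]

class Batcher:
    def __init__(self, max_batch: int = 4):
        self.max_batch = max_batch
        self._pending: list[str] = []
        self._results: dict[str, str] = {}

    def submit(self, prompt: str) -> None:
        self._pending.append(prompt)
        if len(self._pending) >= self.max_batch:
            self.flush()                   # full batch: send it now

    def flush(self) -> None:
        if self._pending:
            for p, r in zip(self._pending, batch_llm_call(self._pending)):
                self._results[p] = r
            self._pending.clear()

    def result(self, prompt: str) -> str:
        self.flush()                       # make sure stragglers are sent
        return self._results[prompt]

b = Batcher(max_batch=2)
b.submit("hello")
b.submit("world")                          # second submit fills the batch
print(b.result("hello"))                   # -> "HELLO"
```

The trade-off the sketch shows: buffering amortizes per-request overhead across a batch, at the cost of added latency for the first request in the buffer.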
- Quick Start Guide - Getting started with key modules
- Module Docstrings - Each module has detailed API documentation
- Examples - Working examples in examples/
Run the test suite:
pytest test/unit/ -v

Current Status: 244 unit tests passing across all 15 modules
See CONTRIBUTING.md for guidelines on contributing to this project.
If you discover a security issue, please create an issue in our issue tracker. As a reminder, this is a research-grade project; if you build a system on top of this library, you are responsible for hardening the code for production use.
MIT