An MCP server that gives your IDE or agent access to Google Gemini with autonomous codebase exploration. Your pal Gemini.
When you ask gpal a question, Gemini doesn't just guess — it explores your codebase itself. It lists directories, reads files, and searches for patterns before answering. This makes it ideal for:
- 🔍 Deep code analysis — "Find all error handling patterns in this codebase"
- 🏗️ Architectural reviews — "How is authentication implemented?"
- 🐛 Bug hunting — "Why might this function return null?"
- 📚 Codebase onboarding — "Explain how the request pipeline works"
- 🖼️ Visual review — Analyze screenshots, diagrams, and video via `media_paths`
- 📋 Structured extraction — "List all API endpoints as JSON"
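For example, a visual-review request could attach media through the `media_paths` parameter of the `consult_gemini` tool described below (the question and file path here are illustrative):

```python
# Illustrative call: attach an image for Gemini to analyze alongside the question.
# media_paths is the documented parameter; the path is a placeholder.
consult_gemini(
    "Does this sequence diagram match the request pipeline?",
    media_paths=["docs/pipeline.png"],
)
```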
| Feature | Description |
|---|---|
| Stateful sessions | Maintains conversation history via `ctx.session_id` |
| Autonomous exploration | Gemini has tools to list, read, and search files |
| FileSearch | Semantic code search via Google's native FileSearch API |
| Gemini 3 Series | Supports Flash and Pro with unified auto mode |
| Context Caching | Store large code contexts to reduce costs and latency |
| Observability | Native OpenTelemetry support (OTLP gRPC) |
| Distributed Tracing | Propagates `traceparent` from MCP requests |
| Multimodal | Analyze images, audio, video, PDFs |
| Batch Processing | Async discounted (~50%) Gemini batch API |
Limits: 10MB file reads, 20MB inline media, 20 search matches max.
| Tool | Model | Use Case |
|---|---|---|
| `consult_gemini` | `auto` (default) | Lite explores, then Flash synthesizes |
| `consult_gemini` | `flash` | Fast, efficient mapping and searching |
| `consult_gemini` | `pro` | Deep reasoning, complex reviews |
| `consult_gemini_oneshot` | `flash` / `pro` | Stateless single-shot queries, no session history |
Auto mode: Lite autonomously explores the codebase (cheap, thorough), then Flash synthesizes over what Lite found. Use `model="pro"` for deep reasoning: Lite explores, then Pro synthesizes with thinking set to HIGH.
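A few illustrative calls showing the `model` parameter (the questions are placeholders):

```python
# Default auto mode: Lite explores, Flash synthesizes
consult_gemini("How is authentication implemented?")

# Deep reasoning: Lite explores, then Pro synthesizes with thinking HIGH
consult_gemini("Review the locking strategy for race conditions", model="pro")

# Stateless one-shot query with no session history
consult_gemini_oneshot("List all API endpoints as JSON", model="flash")
```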
gpal supports native OpenTelemetry for monitoring and distributed tracing. It automatically propagates `traceparent` headers from incoming MCP requests.
```bash
# Configure via standard environment variables
export OTEL_SERVICE_NAME="gpal-server"
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"

# Or via CLI argument
uv run gpal --otel-endpoint localhost:4317
```

Reduce costs for large projects by caching context on Google's servers:
1. Upload large files using `upload_file`.
2. Create a cache using `create_context_cache` with the returned URIs.
3. Reference the cache name in `consult_gemini` calls via the `cached_content` parameter.
4. View active caches via the `gpal://caches` resource.
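A sketch of that flow in tool-call form, assuming `upload_file` returns a URI and `create_context_cache` returns a cache name (argument names beyond the documented `cached_content` parameter are illustrative):

```python
# 1. Upload a large file; the returned URI identifies it on Google's servers
uri = upload_file("src/generated/parser.py")

# 2. Create a cache over the uploaded content (argument shape is an assumption)
cache = create_context_cache(uris=[uri])

# 3. Reference the cache in later consultations to skip re-sending the context
consult_gemini("Explain how the parser handles errors", cached_content=cache)
```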
Semantic code search using Google's native FileSearch API — no local embeddings or databases:
```python
# Create a store and upload files
create_file_store("my-project")
upload_to_file_store("stores/...", "src/server.py")

# Gemini searches stores automatically during generation
consult_gemini("find authentication logic", model="auto")
```

- Google handles chunking, embedding, and retrieval
- Stores are managed via `create_file_store`, `upload_to_file_store`, `list_file_stores`, and `delete_file_store`
- When stores exist, Gemini searches them automatically during `consult_gemini` calls
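The full store lifecycle in the same style (return values and argument shapes are assumptions):

```python
store = create_file_store("my-project")       # create a named store
upload_to_file_store(store, "src/server.py")  # chunk, embed, and index the file
list_file_stores()                            # enumerate active stores
delete_file_store(store)                      # remove the store when done
```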
Customize what Gemini "knows" about you, your project, or your workflow by composing system prompts from multiple sources.
Config file (`~/.config/gpal/config.toml`):

```toml
# Files loaded in order and concatenated
system_prompts = [
    "~/.config/gpal/GEMINI.md",
    "~/CLAUDE.md",
]

# Inline text appended after files
system_prompt = "常に日本語で回答してください (Always respond in Japanese)"

# Set to false to fully replace the built-in prompt with your own
include_default_prompt = true
```

Paths support `~` and `$ENV_VAR` expansion, so you can use `$WORKSPACE/CLAUDE.md` etc.
CLI flags (repeatable, concatenated in order):
```bash
# Append additional prompt files
uv run gpal --system-prompt /path/to/project-context.md

# Multiple files
uv run gpal --system-prompt ~/GEMINI.md --system-prompt ./CLAUDE.md

# Replace the built-in prompt entirely
uv run gpal --system-prompt ~/my-prompt.md --no-default-prompt
```

Composition order:
1. Built-in gpal system instruction (unless `include_default_prompt = false` or `--no-default-prompt`)
2. Files from `system_prompts` in config.toml
3. Inline `system_prompt` from config.toml
4. Files from `--system-prompt` CLI flags
Check what's active via the `gpal://info` resource — it shows which sources contributed and the total instruction length.
- Python 3.12+
- uv (recommended)
- Gemini API key
```bash
git clone https://github.com/tobert/gpal.git
cd gpal
export GEMINI_API_KEY="your_key_here"  # or GOOGLE_API_KEY
uv run gpal
```

Add to your MCP config (e.g., `claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "gpal": {
      "command": "uv",
      "args": ["--directory", "/path/to/gpal", "run", "gpal"],
      "env": {
        "GEMINI_API_KEY": "your_key_here"
      }
    }
  }
}
```

Then ask your AI assistant:
"Ask Gemini to analyze the authentication flow in this codebase"
"Use
consult_geminito find where errors are handled"
```bash
uv run pytest      # Run tests
uv run pytest -v   # Verbose output
```

Some tests (`test_connectivity.py`, `test_agentic.py`, `test_switching.py`) make live API calls and will incur Gemini API costs.
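To run the suite without incurring API charges, you can deselect those files with pytest's standard `--ignore` flag (the paths assume the files sit at the repo root; adjust to the actual layout):

```bash
uv run pytest --ignore=test_connectivity.py \
              --ignore=test_agentic.py \
              --ignore=test_switching.py
```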
- cpal — The inverse: an MCP server that lets Gemini (or any MCP client) consult Claude. Your pal Claude.
MIT — see LICENSE
- Refactoring Agent: a loop that edits files, runs tests (via `code_execution` or shell), and iterates until green (sketched below).
- Review Agent: a specialized system instruction for code review that outputs structured comments.