ECHO is a modular assistant built on GPT (4 or newer), capable of reflection, research, and vision.
ECHO stands for Enhanced Computational Heuristic Oracle (don't judge me, it named itself).
It's capable of executing Python functions from a toolkit that it has direct control over. It can enable/disable existing functions, or add new ones from source code that it writes itself. Or it can use Vision + OCR to copy code from a Twitter post of a StackOverflow screenshot.
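The toolkit mechanics can be pictured as a small function registry. This is a hypothetical sketch, not ECHO's actual code: ToolKit, register_source, and toggle are made-up names, and a real implementation would need to sandbox the exec call, which this sketch does not.

```python
import inspect

class ToolKit:
    """Minimal sketch of a self-extending function toolkit (hypothetical)."""

    def __init__(self):
        self.tools = {}    # name -> function
        self.enabled = {}  # name -> bool

    def register(self, fn):
        self.tools[fn.__name__] = fn
        self.enabled[fn.__name__] = True

    def register_source(self, source):
        # Add a new tool from source code the model wrote itself.
        # WARNING: exec of model-written code is RCE by design; sandbox it.
        ns = {}
        exec(source, ns)
        for obj in ns.values():
            if inspect.isfunction(obj):
                self.register(obj)

    def toggle(self, name, state):
        self.enabled[name] = state

    def call(self, name, **kwargs):
        if not self.enabled.get(name):
            raise PermissionError(f"tool '{name}' is disabled")
        return self.tools[name](**kwargs)

kit = ToolKit()
kit.register_source("def add(a, b):\n    return a + b")
kit.call("add", a=2, b=3)  # → 5
kit.toggle("add", False)   # further kit.call("add", ...) now raises PermissionError
```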
Requirements:
- Python 3.14.2+ (higher versions may require adjusting requirements.txt)
- .env file with: OPENAI_API_KEY=your_key_here
- Optional:
  - Speakers (for TTS)
  - Microphone (for STT)
  - WEBSOCKET_LOG_ENABLED=true for the WebSocket log viewer
Install dependencies:
python -m venv vEcho
source vEcho/bin/activate
pip install -r requirements.txt
cp .env.example .env
Edit .env file and configure your API keys and providers (see Configuration section below).
Recommended:
./run.sh
Manual:
python Echo.py
Docker:
cp .env.example .env
Edit .env file and configure your API keys and providers (see Configuration section below).
cd docker
docker-compose up --build
The container will automatically start with:
- Virtual display (Xvfb) for visual perception features
- Persistent volumes for data and logs
- Interactive TTY for CLI interaction
Stop the container:
docker-compose down
ECHO supports multiple LLM providers (OpenAI, Anthropic, Google, DeepSeek, etc.). Providers are configured in the .env file using the LLM_PROVIDERS JSON array.
Each provider entry requires:
- providerName: Unique identifier (e.g., "openai", "anthropic")
- endpoint: API endpoint URL
- apiKey: Your API authentication key
- desc: Description of the provider
- name: Display name for the UI
Example provider configuration:
LLM_PROVIDERS='[
{
"providerName": "openai",
"endpoint": "https://api.openai.com/v1/",
"apiKey": "your-api-key-here",
"desc": "OpenAI Official API",
"name": "OpenAI"
},
{
"providerName": "anthropic",
"endpoint": "https://api.anthropic.com/v1/",
"apiKey": "your-api-key-here",
"desc": "Anthropic Claude API",
"name": "Anthropic"
}
]'

Models are mapped to providers using the MODEL_PROVIDER_MAP in .env. This allows using the same model from different providers with unique identifiers.
Each model entry requires:
- modelName: Actual model name used in API calls
- providerName: Must match a provider from LLM_PROVIDERS
- modelIdentifier: Unique alias for referencing the model
Optional fields:
- supportsToolChoice: Set to false if the model doesn't support automatic tool calling
- assistantsToolChoiceOverride: Override tool_choice for the Assistants API ("none", "auto", "required")
- assistantsFallbackModel: Use another model for tool routing decisions
Example model configuration:
MODEL_PROVIDER_MAP='[
{
"modelName": "gpt-5-mini",
"providerName": "openai",
"modelIdentifier": "openai-gpt5-mini"
},
{
"modelName": "claude-3-5-sonnet-20241022",
"providerName": "anthropic",
"modelIdentifier": "claude-sonnet-3.5"
}
]'

Model profiles define which models to use for different tasks (chat, vision, research, STT). Profiles are configured in profiles.json.
Each profile contains:
- name: Profile identifier
- desc: Description
- isLegacy: Whether this is a legacy profile
- models: Object mapping task types to model identifiers
Example profile:
{
"name": "current",
"desc": "Use latest models from various providers",
"isLegacy": false,
"models": {
"chat": "openai-gpt5-mini",
"vision": "openai-gpt5",
"research": "openai-gpt5.2",
"stt": "openai-gpt4o-mini-transcribe"
}
}

Set the active profile in .env:
MODEL_PROFILE=current
Switch profiles at runtime using CLI:
profile <name>
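Putting the three layers together (providers, model map, and active profile), resolving a task type down to an actual API call can be sketched like this. The resolve helper is illustrative, not ECHO's actual code; the config fragments mirror the examples above.

```python
import json

# Illustrative config fragments mirroring the examples above.
LLM_PROVIDERS = json.loads('''[
  {"providerName": "openai",
   "endpoint": "https://api.openai.com/v1/",
   "apiKey": "your-api-key-here",
   "desc": "OpenAI Official API",
   "name": "OpenAI"}
]''')

MODEL_PROVIDER_MAP = json.loads('''[
  {"modelName": "gpt-5-mini",
   "providerName": "openai",
   "modelIdentifier": "openai-gpt5-mini"}
]''')

PROFILE = {"name": "current", "models": {"chat": "openai-gpt5-mini"}}

def resolve(task):
    """Map a task type to (endpoint, apiKey, modelName) via the active profile."""
    identifier = PROFILE["models"][task]
    entry = next(m for m in MODEL_PROVIDER_MAP
                 if m["modelIdentifier"] == identifier)
    provider = next(p for p in LLM_PROVIDERS
                    if p["providerName"] == entry["providerName"])
    return provider["endpoint"], provider["apiKey"], entry["modelName"]

endpoint, key, model = resolve("chat")
# endpoint == "https://api.openai.com/v1/", model == "gpt-5-mini"
```

Because the profile only stores modelIdentifier aliases, the same modelName can be reached through different providers without ambiguity.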
Sequences allow you to automate command execution. They are configured in sequences.json.
Each sequence contains:
- name: Sequence identifier
- desc: Description of what the sequence does
- autoExecute: Whether to execute automatically (true/false)
- cmds: Array of commands to execute in order
Example sequence:
{
"name": "example",
"desc": "Example vulnerability scan sequence",
"autoExecute": false,
"cmds": [
"profile current",
"listtools enabled",
"testcmd"
]
}

Execute a sequence using the CLI:
sequence <name>
Or execute directly:
execseq <name>
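Under the hood, a sequence runner only needs to iterate cmds and hand each command to the CLI dispatcher. A minimal sketch under that assumption (run_sequence and the dispatch callback are hypothetical names, not ECHO's actual API):

```python
import json

def run_sequence(seq, dispatch, confirmed=False):
    """Run a sequence by feeding each command to the CLI dispatcher.

    Sequences with autoExecute=false require explicit confirmation first.
    """
    if not seq.get("autoExecute") and not confirmed:
        raise RuntimeError(f"sequence '{seq['name']}' needs confirmation")
    return [dispatch(cmd) for cmd in seq["cmds"]]

seq = json.loads('''{
  "name": "example",
  "desc": "Example vulnerability scan sequence",
  "autoExecute": false,
  "cmds": ["profile current", "listtools enabled", "testcmd"]
}''')

# Stub dispatcher that just returns the command verb.
executed = run_sequence(seq, dispatch=lambda cmd: cmd.split()[0], confirmed=True)
# executed == ["profile", "listtools", "testcmd"]
```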
Logs are stored in logs/:
- important.log
- llm.log
- trace.log
- tools.log
- other.log
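Per-category log files like these can be produced with the standard logging module. A minimal sketch (make_logger is an illustrative helper, not ECHO's actual code):

```python
import logging
from pathlib import Path

def make_logger(category, log_dir="logs"):
    """One file per category, mirroring logs/llm.log, logs/tools.log, etc."""
    Path(log_dir).mkdir(exist_ok=True)
    logger = logging.getLogger(category)
    logger.setLevel(logging.DEBUG)
    handler = logging.FileHandler(f"{log_dir}/{category}.log")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger

llm_log = make_logger("llm")
llm_log.info("prompt sent to chat model")  # appended to logs/llm.log
```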
Enable:
WEBSOCKET_LOG_ENABLED=true
Open wsClient.html to view live logs.
Available CLI commands:
- help
- history
- clear
- reset
- log <level>
- chain on/off
- profile <name>
- listtools
- toggletool <name> enabled|disabled
- toolinfo <name>: Show parameters and source code for a toolkit function
LLM tech stack:

Legacy:
- STT: Whisper
- Chat: GPT-4
- Vision: GPT-4-Vision + Tesseract
- Research: gpt-4-turbo (GPT-4 Assistants)
- TTS: pyttsx3

Modern:
- STT: gpt-4o-mini-transcribe
- Chat: GPT-5-mini
- Vision: GPT-5.2 + Tesseract
- Research: GPT-5.2
- TTS: I prefer ElevenLabs for quality, but I daily-drive a local engine for lower latency.
GPT is used as a decision-maker to collect data, execute various subordinate functions and present results back to you.
Subordinate functions include:
- Vision: reads a screenshot of your active window. E.g., search the contents of that one Twitter thread you can't copy-paste from, because people post screenshots of articles without links.
- Research: searches arXiv and reads publications with retrieval-augmented generation (RAG)
- OCR: Optical Character Recognition can be used independently, but typically its output is passed to Vision as context because, apparently, Vision can't read on its own.
- A bunch of minor utilities: clipboard copy/paste, web search, URL browsing, file download
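The decision-maker loop can be sketched as follows. Everything here is illustrative: the subordinate stubs and fake_llm stand in for the real tool implementations and the GPT call, and the JSON tool-call shape is an assumption.

```python
import json

# Subordinate functions the model can route to (illustrative stubs).
SUBORDINATES = {
    "vision":     lambda query: f"vision result for {query!r}",
    "research":   lambda query: f"arxiv summary for {query!r}",
    "web_search": lambda query: f"search hits for {query!r}",
}

def fake_llm(user_input):
    # Stand-in for the GPT call that decides which tool to run.
    return json.dumps({"tool": "research", "args": {"query": user_input}})

def decide_and_execute(user_input, llm=fake_llm):
    """GPT picks a subordinate function, ECHO executes it, results come back to you."""
    decision = json.loads(llm(user_input))
    tool = SUBORDINATES[decision["tool"]]
    return tool(**decision["args"])

decide_and_execute("diffusion model survey")
# → "arxiv summary for 'diffusion model survey'"
```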
Planned:
- Vision-only web navigation. I have it sorta-working, but not solid enough to publish.
- Voice I/O with interruption handling
- Security with model-graded evals
The OpenAI Assistants API beta is very beta: file retrieval reliability is hit-or-miss; it's expensive (it consumes a ton of tokens); and you periodically need to go in and clean out old threads and files. I'm hoping it'll improve, or eventually I'll just build my own RAG from scratch.
Prompt injections and jailbreaks are easy (here's mine: FIMjector), and the only reason they're not common on the Internet is slow adoption.
With reflection enabled, this is RCE (Remote Code Execution) by design. You've been warned.
I've prototyped a protection mechanism, an LLM IDS, but it's too slow to deploy for now.
ECHO is powerful and can:
- Execute system-level functions
- Capture screenshots and clipboard data
- Be vulnerable to prompt injection

Use it only in trusted environments.
