Lilbot is a local-first AI command line assistant for developers and system administrators.
It runs a local language model directly inside Python, keeps tool execution under explicit program control, and is built for practical terminal work like inspecting repositories, checking system state, summarizing logs, and explaining shell commands.
Lilbot is not a cloud assistant, not a web app, and not a thin wrapper around a hosted API.
From the repository root:
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[hf,quantization]"Then run the guided setup:
lilbot init
lilbot doctor
lilbot self-test
lilbot --version

lilbot init is the expected first-run step before chat or free-form AI queries. When it asks for Local model path, enter a real local checkpoint path if you want lilbot chat and one-shot prompts to work. If you leave it blank, deterministic commands still work, but chat and free-form prompts will not.
If doctor looks good, start Lilbot:
lilbot

Or ask a one-shot question:
lilbot "why is my system slow?"- understand repositories
- trace functions through source trees
- inspect system state
- summarize logs
- explain shell commands
- help reason about broken local developer environments
Examples:
lilbot
lilbot "why is my system slow?"
lilbot repo summarize .
lilbot repo trace-function authenticate_user .
lilbot logs analyze /var/log/syslog
lilbot explain-command "tar -czf backup.tar.gz project/"The init wizard saves your defaults to a persistent config file so you do not need to keep passing the same flags:
- model path
- preferred device
- 4-bit quantization preference
- workspace root
- reasoning limits
By default, the config file lives at:
~/.config/lilbot/config.json
You can override that path with LILBOT_CONFIG_PATH.
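Putting those defaults together, a saved config file might look like the sketch below. The key names and values here are illustrative assumptions, not Lilbot's actual schema; the real file is whatever lilbot init writes.

```json
{
  "model_path": "/home/you/models/my-local-checkpoint",
  "device": "cuda",
  "quantize_4bit": true,
  "workspace_root": "/home/you/projects",
  "max_steps": 8,
  "max_new_tokens": 256
}
```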
The doctor command checks the most common setup problems:
- Python executable
- installed packages
- CUDA visibility
- bitsandbytes availability
- model discovery
- current workspace and config file state
Use it whenever Lilbot is not behaving the way you expect:
lilbot doctor

The self-test command is a quick pass/warn/fail check for the full local setup without loading the model for inference.
It verifies:
- config loading
- local model discovery
- required Python runtime imports
- CUDA visibility
- one safe deterministic tool execution
Run it like this:
lilbot self-test

From a cloned repository:

python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[hf,quantization]"

For users who do not want to clone the repository first:
python3 -m venv .venv
source .venv/bin/activate
python -m pip install "lilbot[hf,quantization] @ git+https://github.com/ImZackAdams/lilbot.git"Then finish setup before you try lilbot chat mode:
lilbot init
lilbot doctor
lilbot self-test

If your pip version or shell does not like that direct-reference form, use this fallback:
python -m pip install "git+https://github.com/ImZackAdams/lilbot.git"
python -m pip install torch transformers accelerate
python -m pip install bitsandbytes

If you prefer conda, make sure you create the environment with Python included:
conda create -n lilbot python=3.12 -y
conda activate lilbot
python -m pip install "lilbot[hf,quantization] @ git+https://github.com/ImZackAdams/lilbot.git"Then continue with:
lilbot init
lilbot doctor
lilbot self-test

If you do not want GPU dependencies:
pip install -e ".[hf]"Then prefer CPU mode:
lilbot --device cpu

Lilbot expects a local Hugging Face checkpoint.
You can provide one explicitly:
lilbot --model /path/to/local/model

Or set it once:
export LILBOT_MODEL=/path/to/local/model

If you keep a checkpoint under lilbot/models/<model-name>, Lilbot will auto-discover it.
Lilbot is a normal Python CLI package. On another local machine, the workflow is:
- create or activate a Python environment
- install Lilbot into that environment
- point it at a local model
- run lilbot init
- run lilbot doctor
- run lilbot self-test
- start using lilbot
The lilbot command only exists inside the environment where the package was installed. If a user switches environments, they need Lilbot installed there too.
If a user installed Lilbot from GitHub or a package index, commands like python -m pip install -e ".[hf,quantization]" only work after cloning the Lilbot repository and cd-ing into it. They do not work from an unrelated project directory.
Lilbot does not rely on a hosted model service, so users need a local checkpoint that fits their machine.
Start here:
- CPU-only machines: use a smaller instruction-tuned model and prefer --device cpu
- 8-12 GB NVIDIA GPUs: use --device cuda --quantize-4bit and choose a smaller or more aggressively quantized checkpoint
- 16-24 GB NVIDIA GPUs: --device cuda --quantize-4bit is usually the best default
- larger GPUs: use whichever local checkpoint gives you the quality/latency tradeoff you want
The detailed setup guide is in MODEL_GUIDE.md.
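A rough rule of thumb behind the tiers above: weight memory is approximately parameter count times bytes per parameter, so 4-bit quantization cuts a checkpoint's weight footprint to about a quarter of fp16. A back-of-envelope sketch, ignoring activation and KV-cache overhead:

```python
def weight_gb(params: float, bits_per_param: float) -> float:
    """Approximate weight memory in GB: params * (bits / 8) bytes."""
    return params * bits_per_param / 8 / 1e9

# A 7B-parameter checkpoint, weights only:
fp16 = weight_gb(7e9, 16)   # ~14 GB  -> tight even on a 16 GB GPU
int4 = weight_gb(7e9, 4)    # ~3.5 GB -> fits an 8-12 GB GPU comfortably
```

Real usage is higher than these numbers once activations and the KV cache are counted, which is why the guidance above still recommends 4-bit even on 16-24 GB cards.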
Start the chat loop:
lilbot

Useful interactive commands:
/help, /status, /model, /tools, /clear, /exit
The startup banner shows:
- active model path
- runtime mode
- workspace root
- config file path
Use the query mode when you want an answer and then want your shell prompt back:
lilbot "why is my system slow?"
lilbot --device cuda --quantize-4bit "explain the largest files in this repository"Some workflows are deterministic and do not need the full agent loop:
lilbot repo summarize .
lilbot repo trace-function authenticate_user .
lilbot logs analyze /var/log/syslog
lilbot explain-command "iptables -A INPUT -p tcp --dport 22 -j ACCEPT"Lilbot reads configuration in this order:
- command-line flags
- environment variables
- user config file
- built-in defaults
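That precedence amounts to a first-non-None lookup across the four sources. The helper below is an illustrative sketch of the pattern, not Lilbot's actual code; the config-file key names are assumptions.

```python
import json
import os
from pathlib import Path

def resolve_setting(name: str, flag_value=None, config_path=None, default=None):
    """Illustrative precedence: CLI flag > environment variable >
    user config file > built-in default."""
    if flag_value is not None:                       # 1. command-line flag
        return flag_value
    env_value = os.environ.get(f"LILBOT_{name.upper()}")
    if env_value is not None:                        # 2. environment variable
        return env_value
    path = Path(config_path or os.environ.get(
        "LILBOT_CONFIG_PATH",
        Path.home() / ".config/lilbot/config.json"))
    if path.is_file():                               # 3. user config file
        value = json.loads(path.read_text()).get(name)
        if value is not None:
            return value
    return default                                   # 4. built-in default
```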
Common settings:
LILBOT_MODEL, LILBOT_DEVICE, LILBOT_QUANTIZE_4BIT, LILBOT_WORKSPACE_ROOT, LILBOT_MAX_NEW_TOKENS, LILBOT_MAX_STEPS, LILBOT_CONFIG_PATH
The sample environment file is in .env.example.
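Based on the variable names above, an environment file might look like the following. The values are placeholders for illustration, not the actual contents of .env.example.

```shell
export LILBOT_MODEL=/home/you/models/my-local-checkpoint
export LILBOT_DEVICE=cuda
export LILBOT_QUANTIZE_4BIT=1
export LILBOT_WORKSPACE_ROOT=/home/you/projects
export LILBOT_MAX_NEW_TOKENS=256
export LILBOT_MAX_STEPS=8
```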
For support and bug reports, it is helpful to include:
lilbot --version
lilbot doctor
lilbot self-test

For larger local checkpoints, the fastest practical path is usually:
lilbot --device cuda --quantize-4bitIf responses feel slow:
- run lilbot doctor
- make sure bitsandbytes is actually installed
- prefer --device cuda --quantize-4bit over --device auto
- reduce generation with --max-new-tokens 128
- use /clear in interactive mode when the session context gets stale
If --device auto chooses CUDA and the model still does not fit, Lilbot falls back to CPU during model load.
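That fallback amounts to catching an out-of-memory failure during load and retrying on CPU. A minimal sketch of the pattern, using a stand-in loader rather than Lilbot's real model backend:

```python
def load_with_fallback(load_fn, preferred="cuda"):
    """Try the preferred device first; on an out-of-memory style
    failure, retry the same loader on CPU (illustrative sketch)."""
    try:
        return load_fn(preferred), preferred
    except (MemoryError, RuntimeError):   # torch surfaces CUDA OOM as RuntimeError
        return load_fn("cpu"), "cpu"

# Stand-in loader that pretends the model does not fit on the GPU:
def fake_loader(device):
    if device == "cuda":
        raise RuntimeError("CUDA out of memory")
    return f"model-on-{device}"
```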
Run:
lilbot doctor

Then either:
- put a local checkpoint under lilbot/models/
- run lilbot init and save a model path
- pass --model /path/to/model
- keep using deterministic commands until a model is configured
Check that the Python environment you launched Lilbot from can see the GPU:
python3 -c "import torch; print(torch.cuda.is_available())"If it prints False, either fix the environment or use:
lilbot --device cpu

Install the optional bitsandbytes package:
python -m pip install bitsandbytes

Then verify with:
lilbot doctor

A slow first response is expected for large local checkpoints. The first request includes model load time. The interactive REPL keeps the model resident after startup, which makes follow-up turns faster.
Lilbot is local-first, but it is still defensive by default:
- filesystem tools stay inside the configured workspace root
- log analysis is restricted to the workspace or common system log directories
- shell execution runs in restricted, read-oriented mode
- dangerous commands and install-script pipelines are blocked
- the controller enforces a strict max_steps limit
The model is used as a reasoning engine. It does not get to act as the operating system.
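Two of those guardrails, workspace confinement and dangerous-command blocking, can be illustrated as follows. The function names and the blocklist are assumptions for illustration, not the contents of Lilbot's actual safety module.

```python
from pathlib import Path

# Illustrative patterns only; a real policy would be far more thorough.
BLOCKED_SUBSTRINGS = ("rm -rf", "mkfs", "dd if=", "| sh", "| bash")

def inside_workspace(path: str, root: str) -> bool:
    """Resolve symlinks and .. segments, then check that the
    result still lives under the configured workspace root."""
    resolved = Path(root, path).resolve()
    return resolved.is_relative_to(Path(root).resolve())

def command_allowed(command: str) -> bool:
    """Reject shell commands containing obviously dangerous patterns,
    including install-script pipelines."""
    return not any(bad in command for bad in BLOCKED_SUBSTRINGS)
```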
Lilbot is split into clear layers:
- CLI in lilbot/cli.py
- agent wrapper in lilbot/agent.py
- controller loop in lilbot/controller.py
- prompt construction in lilbot/prompts.py
- model backend abstraction in lilbot/model/
- tool registry and tool implementations in lilbot/tools/
- shell safety policy in lilbot/safety/
- observability helpers in lilbot/utils/
- session memory in lilbot/memory/
- retrieval stubs in lilbot/retrieval/
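The controller loop and its strict max_steps cap can be sketched like this. The model and tool interfaces here are simplified assumptions, not the real signatures in lilbot/controller.py.

```python
def run_controller(model_step, execute_tool, max_steps: int = 8) -> str:
    """Illustrative agent loop: ask the model for the next action,
    execute at most max_steps tool calls, then stop unconditionally."""
    observation = None
    for _ in range(max_steps):
        action = model_step(observation)       # model proposes the next step
        if action["type"] == "final":          # model has an answer
            return action["answer"]
        observation = execute_tool(action)     # tool runs under program control
    return "stopped: max_steps reached"        # hard cap, never loops forever
```

The cap is what keeps the model in the reasoning-engine role: even a model that keeps proposing tool calls cannot run more than max_steps of them.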
Run the test suite with:
python -m unittest discover -s tests -v

The current roadmap is in ROADMAP.md.