Local AI services (LLM inference, chat, audio) and infrastructure on a Mac Mini M4 Pro.
| Service | Runtime | Purpose |
|---|---|---|
| llamactl | launchd | LLM model management / routing (llama.cpp, MLX, vLLM) |
| mikoshi | Docker (Colima) | Chat UI with tools and skills |
| glances | launchd | System monitoring dashboard |
| logview | launchd | Log viewer UI (ttyd + tmux + lnav) |
| audio | launchd | OpenAI-compatible STT + TTS API |
Nginx reverse-proxies all services with optional authentication. Audio is internal only.
Install all system dependencies:

```sh
cd homebrew
brew bundle install
```

Secrets go in `.env` files (gitignored); never commit them. Python projects use uv with `pyproject.toml`.
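To illustrate the `.env` convention, here is a minimal stdlib-only loader sketch (the key name `API_KEY` is a placeholder, not from this repo; real projects often use python-dotenv instead):

```python
import os
import tempfile

def load_dotenv(path):
    """Minimal .env parser: KEY=VALUE lines, '#' comments, no quoting rules."""
    env = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env

# Write an example .env to a temp file (API_KEY is a hypothetical name)
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as f:
    f.write("# secrets\nAPI_KEY=abc123\n")
    path = f.name

secrets = load_dotenv(path)
os.environ.update(secrets)   # expose secrets to the process, not to git
print(secrets["API_KEY"])    # → abc123
os.unlink(path)
```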
Llamactl provides unified management and routing for llama.cpp, MLX, and vLLM models, with a web dashboard. Config is generated from `config.template.yaml` via `setup.sh` (uses envsubst).
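The envsubst step can be pictured with a small Python analogue: `string.Template` expands `${VAR}` references from the environment much like envsubst does. The template keys `LLAMACTL_PORT` and `LLAMACTL_API_KEY` are invented for illustration; see `config.template.yaml` for the real ones.

```python
import os
from string import Template

# Hypothetical snippet in the style of config.template.yaml
template = (
    "host: 127.0.0.1\n"
    "port: ${LLAMACTL_PORT}\n"
    "api_key: ${LLAMACTL_API_KEY}\n"
)

# In the real setup these come from the .env file
os.environ["LLAMACTL_PORT"] = "8080"
os.environ["LLAMACTL_API_KEY"] = "changeme"

# substitute() raises if a placeholder is missing, which catches typos early
rendered = Template(template).substitute(os.environ)
print(rendered)
```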
```sh
cd llamactl
./setup.sh   # Initial setup (generates config from template)
./start.sh   # Start service
./stop.sh    # Stop service
```

Mikoshi is a flexible chat client with a Web UI that integrates multiple AI providers, tools, and agent frameworks through a unified plugin architecture.
```sh
cd mikoshi
docker compose up -d --build   # Build and start
docker compose down            # Stop
```

Plugins live in `mikoshi/plugins/` and are volume-mounted into the container:
```
plugins/
  tools/<name>/    # Toolset plugin (extends ToolSetHandler, auto-discovered on startup)
  skills/<name>/   # Agent skill (SKILL.md, auto-discovered on startup)
```
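The auto-discovery pattern can be sketched roughly as follows. This is only an illustration of the idea: `ToolSetHandler` below is a stand-in base class, and mikoshi's actual discovery code and plugin API are not shown in this repo excerpt.

```python
import importlib.util
import pathlib
import tempfile

class ToolSetHandler:
    """Stand-in base class; mikoshi's real ToolSetHandler API will differ."""
    name = "unnamed"

def discover_tools(plugins_dir):
    """Import plugins_dir/tools/<name>/__init__.py and collect ToolSetHandler subclasses."""
    handlers = []
    for pkg in sorted(pathlib.Path(plugins_dir, "tools").glob("*/__init__.py")):
        spec = importlib.util.spec_from_file_location(pkg.parent.name, pkg)
        mod = importlib.util.module_from_spec(spec)
        # Pre-seed the base class so the plugin source can subclass it in this sketch
        mod.ToolSetHandler = ToolSetHandler
        spec.loader.exec_module(mod)
        for obj in vars(mod).values():
            if isinstance(obj, type) and issubclass(obj, ToolSetHandler) and obj is not ToolSetHandler:
                handlers.append(obj)
    return handlers

# Demo: fabricate a plugin tree on disk, then discover it
with tempfile.TemporaryDirectory() as root:
    tool_dir = pathlib.Path(root, "tools", "echo")
    tool_dir.mkdir(parents=True)
    tool_dir.joinpath("__init__.py").write_text(
        "class EchoTool(ToolSetHandler):\n    name = 'echo'\n"
    )
    found = discover_tools(root)
    print([h.name for h in found])  # → ['echo']
```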
Mikoshi connects to the audio service at `host.docker.internal:9100`.
The audio service exposes an OpenAI-compatible audio API built on mlx-audio. Models are lazy-loaded and auto-unloaded after 60 minutes of inactivity.
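The lazy-load/idle-unload behavior can be sketched like this (a simplified illustration; the service's actual bookkeeping around mlx-audio may differ):

```python
import time

class LazyModel:
    """Load on first use, drop after a period of inactivity."""
    def __init__(self, loader, idle_seconds=60 * 60):  # 60 minutes by default
        self._loader = loader
        self._model = None
        self._last_used = 0.0
        self._idle = idle_seconds

    def get(self):
        if self._model is None:
            self._model = self._loader()   # load weights on first request
        self._last_used = time.monotonic()
        return self._model

    def maybe_unload(self, now=None):
        """Called periodically; frees the model once idle long enough."""
        now = time.monotonic() if now is None else now
        if self._model is not None and now - self._last_used > self._idle:
            self._model = None             # release memory
        return self._model is None

# Demo with a fake loader and a simulated clock
m = LazyModel(loader=lambda: "whisper-weights", idle_seconds=10)
assert m.get() == "whisper-weights"
print(m.maybe_unload(now=time.monotonic() + 11))  # → True (unloaded)
```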
| Endpoint | Method | Description |
|---|---|---|
| `/v1/audio/transcriptions` | POST | STT via Whisper (mlx-community/whisper-large-v3-turbo-asr-fp16) |
| `/v1/audio/speech` | POST | TTS via Chatterbox (mlx-community/chatterbox-fp16, 23 languages) |
```sh
cd audio
./start.sh   # Start service
./stop.sh    # Stop service
```

Test scripts: `test_stt.py` (file → transcript) and `test_tts.py` (text/file → WAV; supports `-l` for language).
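A speech request might look like the following sketch, assuming the standard OpenAI request shape (the exact fields the service accepts, e.g. `voice` or a language parameter, are not confirmed here). The request is only constructed, not sent:

```python
import json
import urllib.request

# Internal-only service: reachable from containers via host.docker.internal:9100
BASE = "http://host.docker.internal:9100"

# TTS payload in the OpenAI speech-API shape (field names assumed from the spec)
payload = {
    "model": "mlx-community/chatterbox-fp16",
    "input": "Hello from the Mac Mini",
}
req = urllib.request.Request(
    BASE + "/v1/audio/speech",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.full_url, req.method)
# Actually sending it would be: urllib.request.urlopen(req).read() -> WAV bytes
```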
Nginx acts as the reverse proxy, with optional Authelia authentication. Routes are declared in `nginx/config.yaml` and rendered via `nginx/setup.py` (Jinja2 template).
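The config.yaml → template flow amounts to turning route entries into `location` blocks. A rough sketch, with hypothetical route entries standing in for the parsed YAML (the real schema, include paths, and ports live in the repo; only logview's 9011 is documented here):

```python
# Hypothetical entries standing in for parsed nginx/config.yaml
routes = [
    {"path": "/glances/", "upstream": "http://127.0.0.1:61208", "auth": True},
    {"path": "/logs/", "upstream": "http://127.0.0.1:9011", "auth": False},
]

def render(routes):
    """Emit nginx location blocks, roughly what the Jinja2 template would produce."""
    out = []
    for r in routes:
        lines = [f"location {r['path']} {{"]
        if r["auth"]:
            # Hypothetical include path for Authelia's forward-auth snippet
            lines.append("    include snippets/authelia.conf;")
        lines.append(f"    proxy_pass {r['upstream']};")
        lines.append("}")
        out.append("\n".join(lines))
    return "\n\n".join(out)

conf = render(routes)
print(conf)
```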
```sh
python nginx/setup.py     # Regenerate config, test, and reload Nginx
brew services stop nginx  # Stop Nginx
```

Glances provides a real-time system-monitoring dashboard: a web-based UI for CPU, memory, disk, and network stats.
```sh
cd glances
./start.sh   # Start service
./stop.sh    # Stop service
```

Logview is a web-based log viewer built on ttyd + tmux + lnav. Each service gets its own tmux window with lnav following its logs. Accessible on port 9011.
```sh
cd logview
./start.sh   # Start service
./stop.sh    # Stop service
```