FishVision

Prometheus, Alertmanager, Loki, Tempo, and Grafana observability stack with IRC alerting via alertmanager-irc-relay and LLM-powered alert analysis.

This project provides an end-to-end monitoring, logging, tracing, and alerting stack where Prometheus alerts are routed to IRC in real time, and an LLM-based IRC bot provides automated alert analysis.

Audience

- DevOps
- SRE
- Infrastructure Engineers
- Incident Response Teams

Objective

Deliver critical Prometheus alerts (e.g., high CPU, disk full) to a designated IRC channel to improve visibility and reduce mean time to response. Provide centralized logging (Loki), distributed tracing (Tempo), and LLM-assisted alert triage (Ollama + IRC Bot).

Project Structure

```
FishVision/
├── alertmanager/
│   └── alertmanager.yml              # Alertmanager config with webhook receiver
├── alertmanager-irc-relay.yaml       # IRC relay deployment config
├── docker-compose.yml                # Compose stack for all services
├── docs/
│   ├── planned-implementation.md     # Adaptation/rollout plan
│   └── security-audit.md             # Security audit notes
├── grafana/
│   ├── dashboards/
│   │   ├── andon-alert-observability.json
│   │   └── factory.json
│   └── provisioning/
│       ├── dashboards/dashboards.yml
│       └── datasources/datasources.yml
├── irc-bot/
│   ├── Dockerfile                    # IRC bot container
│   ├── bot.py                        # LLM-powered alert analysis bot
│   ├── tools.py                      # Bot tool functions
│   └── requirements.txt
├── irc-deamon/
│   ├── Dockerfile.irc                # IRC server container
│   └── config.yml                    # IRC server configuration
├── k8s/
│   ├── base/                         # Kustomize base manifests
│   │   ├── kustomization.yaml
│   │   ├── namespace.yaml
│   │   ├── prometheus.yaml
│   │   ├── alertmanager.yaml
│   │   ├── grafana.yaml
│   │   ├── loki.yaml
│   │   ├── tempo.yaml
│   │   ├── irc-relay.yaml
│   │   └── ingress.yaml
│   └── overlays/
│       ├── dev/kustomization.yaml
│       ├── staging/kustomization.yaml
│       └── prod/kustomization.yaml
├── loki/
│   └── loki-config.yaml              # Loki log aggregation config
├── prometheus/
│   ├── alert.rules.yml               # Prometheus alerting rules
│   └── prometheus.yml                # Prometheus scrape + rule config
├── promtail/
│   └── promtail-config.yaml          # Promtail log collection config
├── tempo/
│   └── tempo-config.yaml             # Tempo tracing config
└── utils/
    └── node-exporter-installer.sh    # Helper script to install Node Exporter
```

Architecture

| Component | Description |
| --- | --- |
| Node Exporter | Exposes host-level metrics from Linux systems |
| Prometheus | Scrapes metrics and evaluates alert rules |
| Alertmanager | Routes alerts and sends notifications |
| IRC Relay | Receives webhooks and relays alerts to IRC |
| IRC Server | Hosts the target IRC channel (e.g., `#alerts`) |
| Grafana | Visualizes metrics, logs, and traces |
| Loki | Log aggregation and querying |
| Promtail | Collects and ships container/host logs to Loki |
| Tempo | Distributed tracing backend |
| Ollama | Local LLM inference for alert analysis |
| IRC Bot | LLM-powered bot that analyzes alerts in IRC |
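
The Ollama and IRC Bot components form the triage path: the bot sees an alert line in IRC and asks the local model for an analysis. The following is a minimal sketch of such a request, not the code in `irc-bot/bot.py`. It assumes Ollama's `/api/generate` endpoint on its default port 11434; the model name `llama3` and the prompt wording are placeholders chosen for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port
MODEL = "llama3"  # assumption: replace with a model you have pulled


def build_triage_request(alert_line: str, model: str = MODEL) -> dict:
    """Build the JSON body for a non-streaming Ollama generate call."""
    prompt = (
        "You are an on-call assistant. Explain the likely cause of this "
        "alert and suggest first checks:\n" + alert_line
    )
    # stream=False asks Ollama to return one JSON object instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}


def analyze(alert_line: str, url: str = OLLAMA_URL, timeout: float = 60.0) -> str:
    """Send the request and return the model's text answer."""
    body = json.dumps(build_triage_request(alert_line)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # The non-streaming reply carries the generated text in "response".
        return json.loads(resp.read())["response"]
```

Keeping the payload builder separate from the network call makes the prompt easy to adjust and test without a running model.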

Prerequisites

- Docker & Docker Compose installed
- Outbound IRC traffic allowed from relay host
- Working IRC server (local or external)
- Ollama (included in stack) for LLM-powered alert analysis

Optional: Node Exporter installed on monitored hosts (script provided in utils/).

Quick Start

Install Node Exporter (optional)
```
./utils/node-exporter-installer.sh
```
Start the stack
```
docker-compose up -d
```
Access services
- Prometheus: http://localhost:9090
- Alertmanager: http://localhost:9093
- Grafana: http://localhost:3030
- Loki: http://localhost:3100
- Tempo: http://localhost:3200
- Ollama: http://localhost:11434
- IRC server: configured from irc-deamon/
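
To confirm the HTTP services came up, a small probe script can help. The sketch below uses the ports listed above; the health paths are the readiness endpoints these projects normally expose and are an assumption about this stack's configuration. The IRC server is not covered because it does not speak HTTP.

```python
import urllib.error
import urllib.request

# Ports are taken from the service list; paths assume default configs.
HEALTH_URLS = {
    "prometheus": "http://localhost:9090/-/healthy",
    "alertmanager": "http://localhost:9093/-/healthy",
    "grafana": "http://localhost:3030/api/health",
    "loki": "http://localhost:3100/ready",
    "tempo": "http://localhost:3200/ready",
    "ollama": "http://localhost:11434/",
}


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as down.
        return False


def check_stack(probe=http_ok) -> dict:
    """Probe every service and return a {name: reachable} mapping."""
    return {name: probe(url) for name, url in HEALTH_URLS.items()}

# Usage: print(check_stack())
```

Loki and Tempo can report not ready for a short while after startup, so a failed probe right after `docker-compose up -d` is not necessarily an error.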

Trigger a test alert

```
stress-ng --cpu 4 --timeout 180s
```

Expected message in #alerts:

```
[FIRING] HighCPUUsage: server1.example.com has high CPU
```
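
To illustrate how an Alertmanager webhook payload could map to a line like the one above, here is a small sketch. It is not the relay's implementation (the relay renders messages from its own configured template). The field names `status`, `labels`, and `annotations` follow Alertmanager's webhook format; using the `summary` annotation as the message text is an assumption about the rules in `prometheus/alert.rules.yml`.

```python
def format_alert(alert: dict) -> str:
    """Render one alert from a webhook payload as a single IRC line."""
    status = alert.get("status", "unknown").upper()
    name = alert.get("labels", {}).get("alertname", "UnknownAlert")
    # Assumption: the alert rule sets a human-readable "summary" annotation.
    summary = alert.get("annotations", {}).get("summary", "")
    return f"[{status}] {name}: {summary}".rstrip()


def format_webhook(payload: dict) -> list:
    """Render every alert in an Alertmanager webhook payload."""
    return [format_alert(alert) for alert in payload.get("alerts", [])]


# Hypothetical payload, trimmed to the fields used here.
sample = {
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {
                "alertname": "HighCPUUsage",
                "instance": "server1.example.com",
            },
            "annotations": {"summary": "server1.example.com has high CPU"},
        }
    ],
}

lines = format_webhook(sample)
# lines == ["[FIRING] HighCPUUsage: server1.example.com has high CPU"]
```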

Kubernetes Deployment

Kustomize manifests are provided in k8s/ for deploying to Kubernetes clusters.

```
# Dev environment
kubectl apply -k k8s/overlays/dev/

# Staging
kubectl apply -k k8s/overlays/staging/

# Production
kubectl apply -k k8s/overlays/prod/
```

Maintenance

| Task | Frequency | Notes |
| --- | --- | --- |
| Test alert delivery | Monthly | Simulate CPU load and verify IRC |
| Update container images | Quarterly | Check for new versions in `docker-compose.yml` |
| Rotate bot nick/channel | As needed | Update relay flags in config |
| Update alert rules | As needed | Edit `prometheus/alert.rules.yml` + restart Prometheus |
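
For the monthly delivery test, an alternative to generating real CPU load is to post a synthetic alert straight to Alertmanager. The sketch below assumes Alertmanager's v2 API at `/api/v2/alerts` on the port used by this stack; the label and annotation values are hypothetical, and whether the alert reaches IRC depends on the routes defined in `alertmanager/alertmanager.yml`.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

ALERTMANAGER_URL = "http://localhost:9093/api/v2/alerts"


def build_test_alert(minutes=5, now=None):
    """Build a one-alert payload that expires after `minutes`."""
    now = now or datetime.now(timezone.utc)
    return [
        {
            # Hypothetical labels; adjust so they match your routing rules.
            "labels": {
                "alertname": "DeliveryTest",
                "severity": "info",
                "instance": "manual-test",
            },
            "annotations": {"summary": "Synthetic alert to verify IRC delivery"},
            "startsAt": now.isoformat(),
            "endsAt": (now + timedelta(minutes=minutes)).isoformat(),
        }
    ]


def send(alerts, url=ALERTMANAGER_URL, timeout=5.0):
    """POST the alerts to Alertmanager and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status

# Usage: send(build_test_alert())
```

This exercises the Alertmanager, relay, and IRC hops only. It does not verify that Prometheus rules fire, so the CPU load test remains useful for end-to-end coverage.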

Security Notes

- Run relay on a private network or behind a reverse proxy (NGINX, Caddy)
- Enable logging for relay HTTP traffic
- Restrict IRC server access as appropriate
- See docs/security-audit.md for a detailed security audit

Documentation

- Planned Implementation (docs/planned-implementation.md): adaptation and rollout plan
- Security Audit (docs/security-audit.md): security review and recommendations
