Edge ML Control Plane + Pi Inference

Small edge-to-control-plane setup: the Pi runs an image inference service, and the laptop runs a control plane that stores telemetry and manages model deploy/rollback.

Description

The Pi serves /infer behind a bounded queue, so overload becomes a clear 503 instead of random timeouts. The control plane stores telemetry in Postgres, keeps model versions (sha256 + artifact files), and can deploy a new model to the Pi over ssh, then roll back if post-deploy metrics look worse.
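To make the bounded-queue idea concrete, here is a minimal sketch (not the repo's actual implementation): a semaphore caps in-flight requests, and anything past the cap is rejected immediately with a 503. `MAX_IN_FLIGHT` and `run_model` are placeholders.

```python
# Hypothetical sketch: bounded concurrency on the edge /infer endpoint.
# A semaphore caps in-flight requests; when full, we fail fast with 503
# instead of queueing until clients time out.
import threading

from flask import Flask, jsonify

app = Flask(__name__)
MAX_IN_FLIGHT = 4  # assumed capacity, tuned to the Pi
slots = threading.BoundedSemaphore(MAX_IN_FLIGHT)


def run_model():
    # Placeholder for the actual model inference call.
    return {"label": "ok"}


@app.route("/infer", methods=["POST"])
def infer():
    if not slots.acquire(blocking=False):
        # Explicit reject: overload shows up as a countable 503,
        # visible to both callers and telemetry.
        return jsonify(error="overloaded"), 503
    try:
        return jsonify(result=run_model())
    finally:
        slots.release()
```

The point is that rejection is a deliberate, observable outcome rather than a timeout the client has to infer.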

Live Demo Link & Screenshots

  • TODO: add later

Tech Stack

  • Python (FastAPI)
  • Postgres (docker compose)
  • Raspberry Pi + systemd (edge service)
  • bash + small Python scripts for the demo

Features

  • telemetry ingest + simple summaries (p50/p95, error/reject rate)
  • model registry w/ sha256 + local artifact storage
  • deploy model to Pi (scp/ssh) + verify + restart
  • rollback on regression (latency/error/reject thresholds; sketched below)
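A sketch of what a regression trigger like this can look like; the threshold values and field names here are illustrative assumptions, not the repo's actual numbers:

```python
# Hypothetical rollback trigger: compare a post-deploy telemetry window
# against fixed thresholds. Deliberately simple and absolute, so the
# decision is easy to explain when it fires.
from dataclasses import dataclass


@dataclass
class WindowStats:
    p95_ms: float       # 95th-percentile latency in the window
    error_rate: float   # fraction of failed responses
    reject_rate: float  # fraction of 503 queue rejections


def should_rollback(stats: WindowStats,
                    max_p95_ms: float = 500.0,
                    max_error_rate: float = 0.05,
                    max_reject_rate: float = 0.10) -> bool:
    return (stats.p95_ms > max_p95_ms
            or stats.error_rate > max_error_rate
            or stats.reject_rate > max_reject_rate)


# e.g. should_rollback(WindowStats(720.0, 0.01, 0.02)) -> True (latency regressed)
```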

Setup

  • copy .env.example to .env if you want to change ports/tokens
  • docker compose up -d --build
  • register the Pi once (see demo scripts below)
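In shell form, the first two steps are:

```bash
cp .env.example .env          # optional: change ports/tokens
docker compose up -d --build  # starts the control plane + Postgres
```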

Architecture

```
operator scripts -> control plane (FastAPI) -> Postgres + artifacts/
                      ^
                      | telemetry batches (HTTP)
                      |
                 Pi edge service (Flask)
```
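The telemetry arrow is plain batched HTTP from the Pi to the control plane. For illustration only (the endpoint path, auth scheme, and field names below are assumptions, not the real API), a batch might look like:

```python
# Hypothetical telemetry batch from the edge service to the control plane.
# This only illustrates the batched-HTTP shape of the data flow; the real
# endpoint and schema live in the repo.
import requests

batch = {
    "device_id": "pi-01",  # assumed identifier
    "events": [
        {"ts": "2024-01-01T00:00:00Z", "latency_ms": 41.0, "status": 200},
        {"ts": "2024-01-01T00:00:01Z", "latency_ms": 388.5, "status": 503},
    ],
}
requests.post(
    "http://control-plane:8000/telemetry",  # placeholder URL
    json=batch,
    headers={"Authorization": "Bearer <device-token>"},  # placeholder token
    timeout=5,
)
```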

Demo scripts

  • ./demo/demo_up.sh
  • ./demo/demo_register_device.sh (needs SSH_HOST=...)
  • ./demo/demo_upload_models.sh
  • ./demo/demo_deploy.sh (set MODEL_VERSION=v1 / v2)
  • ./demo/demo_deploy_and_rollback.sh (runs the full story)
  • python demo/load_pi.py (load generator)
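A typical end-to-end run looks like this (the SSH_HOST value is a hypothetical stand-in for your Pi):

```bash
./demo/demo_up.sh
SSH_HOST=pi@raspberrypi.local ./demo/demo_register_device.sh  # host is hypothetical
./demo/demo_upload_models.sh
./demo/demo_deploy_and_rollback.sh  # the full deploy -> observe -> rollback story
```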

Future Improvements

  • token rotation + tighter auth
  • artifact store to S3/MinIO
  • store telemetry aggregates instead of per-event rows
  • make deployments async (right now the deploy waits inside the request; one possible shape is sketched below)
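One way the deploy could move out of the request path, sketched with FastAPI's BackgroundTasks; this is an assumption about a future shape, not current code, and `run_deploy` is a placeholder:

```python
# Hypothetical async-deploy shape: the request records intent and returns
# 202 immediately; the ssh/scp work happens after the response is sent.
from fastapi import BackgroundTasks, FastAPI

app = FastAPI()


def run_deploy(device_id: str, version: str) -> None:
    # Placeholder for the real scp/ssh + checksum verify + restart sequence.
    print(f"deploying {version} to {device_id}")


@app.post("/devices/{device_id}/deploy/{version}", status_code=202)
def deploy(device_id: str, version: str, tasks: BackgroundTasks):
    tasks.add_task(run_deploy, device_id, version)
    return {"status": "accepted", "device": device_id, "version": version}
```

BackgroundTasks runs in-process after the response is sent, which is enough to unblock the request; surviving control-plane restarts would take a real job queue.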

Challenges and What I Learned

  • Overload behavior: the “nice” thing is fast responses; the correct thing is predictable failure (bounded queue + explicit reject).
  • Deployment was mostly about safety checks and boring edges (checksum, restart, timeouts). That stuff is what breaks in real life.
  • Telemetry design: if you store everything you drown; if you store nothing you can’t debug. Keeping a small schema helped a lot.
  • Rollback logic is subtle because “numbers got worse” doesn’t always mean “bad model”, so I kept the trigger simple and obvious.

Credits

Solo project built by Zane Hensley
