Skip to content

Latest commit

 

History

History
69 lines (56 loc) · 3.08 KB

File metadata and controls

69 lines (56 loc) · 3.08 KB

Changelog

All notable changes to this project are documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

0.3.0 — 2026-05-27

Added

  • Operator worker ON/OFF control plane: a worker_controls table + a per-(host, queue) WorkerControlWatcher that hard-stops (kill in-flight + free RAM/VRAM) or parks a worker on command (migration 0012). See docs/worker_control.md.
  • Durable workflow_node_events store with lifecycle/trip/terminal event writers emitted from the worker, plus a NodePool retention sweep (migration 0011).
  • Orchestrator-side dead-worker sweep that flags a worker whose heartbeat is stale while it still holds a running job; health-gated stall trip with re-queue-and-retry (migrations 0009/0010).
  • No-progress StallWatchdog for GPU nodes; re-queue all running jobs on restart and worker self-kill when reassigned; always re-queue orphaned runs.
  • invoke_context host hook — a per-node setup/teardown context manager whose finalize(context_delta) is applied only on success.
  • MIT LICENSE and PEP 639 license metadata, keywords, and trove classifiers in pyproject.toml.
  • README.md: a "local-cluster swift management" purpose note, an Architecture diagram (Mermaid), an "Example dashboard" section, a "Turning workers on/off" section, and a Docs index.
  • AGENTS.md as a byte-identical mirror of CLAUDE.md, and a documented changelog convention.
  • This CHANGELOG.md.

Changed

  • GPU guard is now health-driven (GPU-idle and static-RAM) instead of a fixed wall-clock cap.

Fixed

  • GPU claim: restored the capability-routing gate.
  • __input__ nodes are parked via the dispatch outbox instead of importing the sentinel.

Removed

  • uv.lock is no longer tracked — this is a library; consumers resolve against pyproject.toml. The lockfile stays local for dev/CI reproducibility.

0.2.0 — 2026-05-25

Added

  • Multi-tenant ingest path: host-defined ingest queues and per-job args, with host-side task_name/queue validation (migration 0008). A second consumer (e.g. lm_flood) can route its own queue names and carry per-job arguments without forking the schema.

0.1.0 — 2026-05-25

Added

  • Initial standalone release — a Postgres-as-queue workflow engine extracted from the ai_leads stack (Phase 6): a SELECT … FOR UPDATE SKIP LOCKED claim loop woken by LISTEN/NOTIFY, lease reclaim, a DAG dispatcher with a durable dispatch-event outbox, a GPU warm-model cache, periodic ingest work + a PG-native scheduler, and per-host hw-metrics telemetry. Migrations 0001–0007.