
Spec 025 — Overview uptime KPI + Reverb live updates + perf chart (closes phase 5)#76

Merged
Copxer merged 3 commits into main from spec/025-overview-and-realtime
May 1, 2026

Copxer commented May 1, 2026

Closes #75

Spec: specs/phase-5-monitoring/025-overview-and-realtime.md

Last spec of phase 5. Replaces the long-standing MOCK_KPIS['uptime'] placeholder on Overview with a real volume-weighted cross-website aggregate, broadcasts every persisted check via Reverb so the per-website Show page reflects live data, and renders a response-time Sparkline of the last 50 checks.

Summary

  • GetMonitoringUptimeKpiQuery: volume-weighted 24h uptime % across all websites (decision B: every check counts equally, so a busy site with one failure outweighs a quiet 100% site, which makes this the truest system-wide measure). Returns {overall, change, sparkline (12 days, oldest-first), status}. An empty 24h window yields a null overall and the muted status. Days with no checks default to 100% on the sparkline (a documented "no failures observed" framing, which reads better than a 0% flatline on fresh accounts).
  • Wired into GetOverviewDashboardQuery; MOCK_KPIS['uptime'] removed; docblock graduates uptime to "real today."
  • New WebsiteCheckRecorded ShouldBroadcastNow event with pre-resolved owner id (mirrors spec 021's WorkflowRunUpserted pattern). Broadcasts on users.{ownerUserId}.monitoring. Light pulse { check_id, website_id }.
  • routes/channels.php authorizes the new channel (per-user gate matching activity / deployments).
  • RecordWebsiteCheckAction dispatches the event after every persisted check (steady-state runs included). Spec 024's transition activity events still fire separately on healthy↔failed swings.
  • Show.vue subscribes via Echo on mount, filters client-side by website_id, partial-reloads website + summary + checks on matching pulses. realtimeConnected ref drives an offline pill. Response-time Sparkline of last 50 response_time_ms with leading-null skip + carry-forward fill; <2 data points renders the "not enough data" placeholder.
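To illustrate what "volume-weighted" means in the first bullet, here is a minimal TypeScript sketch of the arithmetic. The real query lives in the Laravel backend and is not shown in this PR body, so the `SiteWindow` shape, the function names, and the sample numbers are all assumptions made for illustration.

```typescript
// Hypothetical shape: per-website check counts inside one 24h window.
interface SiteWindow {
  websiteId: number;
  totalChecks: number;
  healthyChecks: number;
}

// Volume-weighted uptime: pool every check across sites, then divide once.
// Returns null for an empty window, mirroring the "null overall" case.
function volumeWeightedUptime(sites: SiteWindow[]): number | null {
  const total = sites.reduce((sum, s) => sum + s.totalChecks, 0);
  if (total === 0) return null;
  const healthy = sites.reduce((sum, s) => sum + s.healthyChecks, 0);
  return (healthy / total) * 100;
}

// For contrast: the unweighted mean of per-site percentages,
// where a quiet site counts as much as a busy one.
function meanOfSiteUptimes(sites: SiteWindow[]): number | null {
  const active = sites.filter((s) => s.totalChecks > 0);
  if (active.length === 0) return null;
  const sum = active.reduce(
    (acc, s) => acc + (s.healthyChecks / s.totalChecks) * 100,
    0,
  );
  return sum / active.length;
}

// Made-up sample data: one busy site at 90% and one quiet site at 100%.
const sample: SiteWindow[] = [
  { websiteId: 1, totalChecks: 1440, healthyChecks: 1296 },
  { websiteId: 2, totalChecks: 24, healthyChecks: 24 },
];

console.log(volumeWeightedUptime(sample)); // close to the busy site's 90%
console.log(meanOfSiteUptimes(sample)); // halfway between the two sites
```

With this sample the pooled figure stays near the busy site's 90%, while the unweighted mean lands at about 95%. That difference is the trade-off decision B accepts.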

Test plan

  • vendor/bin/pint --test passes.
  • php artisan test: 18 net new passing tests across 3 new files (KPI query 10, event 5, dispatch + transition extensions 3) + 2 extended (record action + dashboard query). Full suite: 357 passed (was 339). The 50 local failures are the existing env-CSRF baseline and pass in CI.
  • npm run build clean.
  • Manual smoke (post-merge): create a monitor, confirm Overview's Uptime KPI shows real % once probes accumulate; visit the Show page and confirm a php artisan schedule:work tick triggers a partial reload + the Sparkline updates without a manual refresh.
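The partial reload that the manual smoke test looks for can be sketched as a small handler. This is a hedged illustration, not the code in Show.vue: `makePulseHandler` and the injected `reload` callback are hypothetical names, chosen so the filter can be exercised without Echo or Inertia. In the component, `reload` would presumably wrap Inertia's `router.reload({ only: [...] })`.

```typescript
// Pulse payload as described in the summary: a trigger, not a source of truth.
interface CheckPulse {
  check_id: number;
  website_id: number;
}

// Props to refresh on a matching pulse (names taken from the summary).
const RELOAD_PROPS = ['website', 'summary', 'checks'] as const;

// Hypothetical factory: returns the listener to register on the
// users.{ownerUserId}.monitoring channel.
function makePulseHandler(
  currentWebsiteId: number,
  reload: (only: readonly string[]) => void,
): (pulse: CheckPulse) => void {
  return (pulse) => {
    // The channel is per user, so pulses for the user's other websites
    // arrive here too and must be ignored.
    if (pulse.website_id !== currentWebsiteId) return;
    reload(RELOAD_PROPS);
  };
}

// Usage: only the pulse for website 7 triggers a reload.
const reloaded: string[][] = [];
const onPulse = makePulseHandler(7, (only) => reloaded.push([...only]));
onPulse({ check_id: 101, website_id: 3 });
onPulse({ check_id: 102, website_id: 7 });
console.log(reloaded.length);
```

Filtering on the client keeps the server side to one channel per user, at the cost of delivering pulses the current page does not need.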

Self-review notes

Self-review pass via superpowers:code-reviewer flagged 3 recommendations, all addressed:

  • usePage() hoisted to setup top-level. It was called inside onMounted, which is idiomatically wrong (Inertia could start validating the call site in any release).
  • responseTimeSeries now skips leading null Error checks rather than carry-forwarding a 0. The Sparkline's min/max-normalized rendering was pulling the line to the 0ms floor when the first point was a transport-error row.
  • WebsiteCheckRecorded::broadcastOn() docblock documents the per-user-channel-vs-per-website-channel trade-off for future scaling (~1k monitors / sub-30s intervals).
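The leading-null skip and carry-forward fill can be sketched as follows. This is a minimal illustration rather than the code in Show.vue: the `CheckRow` shape, the oldest-first ordering of the input, and the `hasEnoughData` helper are assumptions.

```typescript
// Hypothetical row shape; null marks a check with no response time
// (for example a transport error).
interface CheckRow {
  response_time_ms: number | null;
}

// Build the Sparkline series from checks ordered oldest-first:
// 1. drop leading nulls so the line never starts from an invented 0,
// 2. replace later nulls with the last known value (carry-forward).
function responseTimeSeries(checks: CheckRow[]): number[] {
  const series: number[] = [];
  let last: number | null = null;
  for (const check of checks) {
    const value = check.response_time_ms;
    if (value !== null) {
      last = value;
      series.push(value);
    } else if (last !== null) {
      series.push(last);
    }
    // Otherwise this is a leading null and is skipped.
  }
  return series;
}

// Fewer than two points cannot form a line, so the placeholder is shown.
function hasEnoughData(series: number[]): boolean {
  return series.length >= 2;
}

const rows: CheckRow[] = [
  { response_time_ms: null },
  { response_time_ms: 120 },
  { response_time_ms: null },
  { response_time_ms: 95 },
];
console.log(responseTimeSeries(rows)); // [120, 120, 95]
```

Because the Sparkline normalizes between the series minimum and maximum, a synthetic 0 at the start would become the minimum and flatten every real value toward the top of the chart. Skipping the leading nulls avoids that.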

Phase-1 acknowledgements (PR-body-only)

  • Empty-day uptime sparkline = 100%. This is honest framing for fresh accounts but misleading for an active monitor whose dispatcher died and stopped reporting. That failure mode is surfaced indirectly via last_checked_at going stale on the index/show pages and via the right rail's lost activity heartbeat. Future polish: switch to null + Sparkline gap rendering.
  • Broadcast volume. 100 monitors × 60s ≈ 1.67 broadcasts/sec — Reverb-friendly. Revisit if monitor count crosses ~1k or interval drops below 30s.
  • Service location. GetOverviewDashboardQuery::handle() resolves GetMonitoringUptimeKpiQuery via app(...), matching the existing class's no-constructor pattern. Refactor opportunity for a follow-up if more queries graduate.
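The broadcast-volume estimate above is simple arithmetic: one broadcast per persisted check, so the steady-state rate is the monitor count divided by the check interval. The sketch below reproduces the figure and evaluates the two revisit thresholds. It assumes every monitor shares the same interval and that checks are spread evenly over time; `broadcastsPerSecond` is a hypothetical helper.

```typescript
// Steady-state broadcast rate, assuming one broadcast per check and
// the same interval for every monitor.
function broadcastsPerSecond(
  monitorCount: number,
  intervalSeconds: number,
): number {
  if (intervalSeconds <= 0) {
    throw new RangeError('intervalSeconds must be positive');
  }
  return monitorCount / intervalSeconds;
}

const current = broadcastsPerSecond(100, 60); // the estimate quoted above
const manyMonitors = broadcastsPerSecond(1000, 60); // ~1k monitors
const fastInterval = broadcastsPerSecond(1000, 30); // ~1k monitors at 30s

console.log(current.toFixed(2), manyMonitors.toFixed(2), fastInterval.toFixed(2));
// prints "1.67 16.67 33.33"
```

Under these assumptions, crossing both thresholds at once raises the rate about twentyfold over the current estimate.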

Phase 5 status

| # | Spec | Status |
|---|---|---|
| 023 | Website monitor MVP | 🟢 |
| 024 | Scheduled checks + uptime + activity events | 🟢 |
| 025 | Overview integration + Reverb live updates | 🟢 (this PR) |

Phase complete. The only remaining MOCK_KPIS slices are services (phase 6) and alerts (phase 7) — they ride with their own future phases.

After merge, what's next:

  • Phase 6 — Docker Host Agent MVP (host telemetry ingestion endpoint, agent token system, container metrics).
  • Phase 7 — Alerts Engine.
  • Phase 8 — Analytics & Health Scores.

Copxer added 3 commits April 30, 2026 23:37
Last spec of phase 5. Replaces MOCK_KPIS['uptime'] with a real volume-
weighted aggregate across all user's websites, broadcasts every
persisted check via Reverb so the Show page reflects live data, and
adds a response-time Sparkline of the last 50 checks.
…art (spec 025)

Closes phase 5. Replaces MOCK_KPIS['uptime'] with a real cross-website
aggregate, broadcasts every persisted check via Reverb so the per-
website Show page reflects live data, and renders a response-time
Sparkline of the last 50 checks.

- GetMonitoringUptimeKpiQuery: volume-weighted 24h uptime % across all
  websites (locked decision B — busy site with one failure dominates
  a quiet 100% site, the truest system-wide measure). Returns
  {overall, change, sparkline (12 days oldest-first), status}.
  - overall: float|null; null on empty 24h window.
  - change: 24h delta vs prior 24h; 0 when either window is empty.
  - sparkline: daily uptime %; days with no checks default to 100.0
    ("no failures observed" reads better than "everything down" on
    a fresh account; documented limitation).
  - status: muted (null) | success (≥99) | warning (≥95) | danger.
- GetOverviewDashboardQuery::handle() calls the new query for the
  uptime slice; MOCK_KPIS['uptime'] removed; docblock updated.
- New WebsiteCheckRecorded ShouldBroadcastNow event:
  - Constructor takes pre-resolved (checkId, websiteId, ownerUserId)
    ints — mirrors spec 021's WorkflowRunUpserted pattern, avoids
    broadcast-time relation walking.
  - Broadcasts on users.{ownerUserId}.monitoring with light pulse
    {check_id, website_id}; client uses it as a trigger, not a
    source of truth.
  - Per-user channel + client-side filter trade-off documented;
    revisit at ~1k monitors or sub-30s intervals.
- routes/channels.php authorizes users.{userId}.monitoring (per-user
  gate matching the activity / deployments channels).
- RecordWebsiteCheckAction dispatches the event after every persisted
  check (steady-state runs included). Spec 024's transition activity
  events still fire separately on healthy↔failed swings.
- Show.vue subscribes via Echo on mount, filters by website_id,
  partial-reloads website + summary + checks on matching pulses;
  realtimeConnected ref drives an offline pill. Sparkline of last
  50 response_time_ms with leading-null skip + carry-forward fill;
  <2 data points renders the "not enough data" placeholder.
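
The status thresholds and the change rule listed above can be expressed as two small functions. This is a TypeScript sketch of the rules as written in this commit message, not the PHP implementation; the function names are hypothetical.

```typescript
type UptimeStatus = 'muted' | 'success' | 'warning' | 'danger';

// muted (null) | success (>= 99) | warning (>= 95) | danger.
function uptimeStatus(overall: number | null): UptimeStatus {
  if (overall === null) return 'muted';
  if (overall >= 99) return 'success';
  if (overall >= 95) return 'warning';
  return 'danger';
}

// 24h delta vs the prior 24h; 0 when either window is empty (null).
function uptimeChange(
  current: number | null,
  previous: number | null,
): number {
  if (current === null || previous === null) return 0;
  return current - previous;
}

console.log(uptimeStatus(99.5), uptimeStatus(97), uptimeStatus(80), uptimeStatus(null));
console.log(uptimeChange(99.5, 98), uptimeChange(99.5, null));
```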

Tests: 18 net new passing tests across 3 new files + 2 extended.
Full suite 357 passed (was 339); 0 regressions.

Self-review pass via superpowers:code-reviewer flagged 3
recommendations, all addressed:
- usePage() hoisted to setup top-level (was inside onMounted —
  idiomatically wrong, Inertia could validate the call site).
- responseTimeSeries skips leading-null Error checks so the line
  doesn't anchor at the 0ms floor (Sparkline's min/max normalizer
  was pulling the chart to the baseline).
- broadcastOn() docblock documents the per-user-channel-vs-per-
  website-channel trade-off for future scaling.

PHASE 5 COMPLETE (3/3 specs). The only remaining MOCK_KPIS slices
are 'services' (phase 6) and 'alerts' (phase 7).
Spec frontmatter + tracker bookkeeping that didn't make the previous
commit (Edit-without-Read guard). Trailing companion to the
implementation commit.
Copxer merged commit 153228e into main on May 1, 2026
1 check passed
Copxer deleted the spec/025-overview-and-realtime branch on May 1, 2026 07:49
