feat(observability): Prometheus /metrics endpoint (P4-J) #83

Merged
CryptoJones merged 1 commit into master from feat/prometheus-metrics on May 18, 2026


@CryptoJones
Owner

Part of architect audit issue #73 — iteration P4-J.

Summary

  • New GET /metrics Prometheus scrape endpoint via prom-client.
  • Default Node.js metrics (event-loop, heap, GC) + per-request http_requests_total Counter + http_request_duration_seconds Histogram.
  • Route labels use Express patterns (/v1/customer/:id) so cardinality stays bounded; 404s roll up to <unknown>.
  • No authKey-derived labels (would blow up the time-series count).
  • Auth is optional: with METRICS_BEARER_TOKEN unset, the endpoint is open to any scraper (intended for private-network deploys); with it set, Authorization: Bearer <t> is enforced using timingSafeEqual.

Test plan

  • tests/api/metrics.test.js (7 cases): route mounting, text-format, default-open, full bearer-token gate matrix.
  • Full suite: 365 pass / 4 skip (was 358/4).

This code proudly made in Nebraska. GO BIG RED! 🌽 https://xkcd.com/2347/

Architect audit P4-J. New `GET /metrics` Prometheus scrape endpoint
plus per-request instrumentation middleware.

Exposed metrics:
- prom-client default Node.js metrics (event-loop lag, heap, GC,
  open FDs, etc.) auto-registered.
- `http_requests_total{method,route,status}` Counter — one bump per
  HTTP request, regardless of which controller served it.
- `http_request_duration_seconds{method,route,status}` Histogram with
  buckets [5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s,
  10s] — sized for a JSON API where the bulk of requests are sub-50ms
  and the long tail rarely exceeds 2s.
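
As an illustration of how those bucket bounds behave, here is a minimal sketch of Prometheus-style cumulative bucket counting in plain Node.js. It does not use prom-client and is not code from this PR; `cumulativeBuckets` and the sample durations are hypothetical. The bounds are the list above converted to seconds.

```javascript
// Bucket upper bounds in seconds, transcribed from the list above.
const BUCKETS = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10];

// Hypothetical helper: builds the cumulative `le` bucket counts that a
// Prometheus histogram exposes for a set of observed durations (seconds).
function cumulativeBuckets(durations, bounds = BUCKETS) {
  const counts = new Map(bounds.map((le) => [String(le), 0]));
  counts.set('+Inf', 0);
  for (const d of durations) {
    // Buckets are cumulative: an observation counts toward every bucket
    // whose upper bound is greater than or equal to it.
    for (const le of bounds) {
      if (d <= le) counts.set(String(le), counts.get(String(le)) + 1);
    }
    // The +Inf bucket counts every observation.
    counts.set('+Inf', counts.get('+Inf') + 1);
  }
  return counts;
}

// Made-up durations: one fast request, two typical, one slow, one beyond 10s.
const sample = cumulativeBuckets([0.003, 0.03, 0.03, 1.8, 12]);
```

With these made-up durations, the 12s request lands only in `+Inf`, which is why the top bound needs to sit above the expected long tail.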

Cardinality protection:
- The `route` label uses Express's route PATTERN (`/v1/customer/:id`),
  not the rendered URL. Hitting `/v1/customer/42`, `/v1/customer/43`,
  ... all roll up to the same series.
- 404s have no matched route; we bucket those under `<unknown>` so
  requests for nonexistent URLs can't each create a new series and blow
  up the metric store.
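
The roll-up described above could be sketched roughly as follows. This is an assumption about the shape of the logic, not the PR's actual middleware: `routeLabel` is a hypothetical helper, and the request objects are plain stand-ins carrying the `req.route.path` and `req.baseUrl` fields that Express sets on a matched request.

```javascript
// Hypothetical helper, not taken from the PR: derives a bounded `route`
// label from an Express-style request object.
function routeLabel(req) {
  // req.route is only set once a route has matched; its `path` holds the
  // pattern (e.g. '/customer/:id'), not the concrete URL. Non-string
  // patterns (regex, array) are also folded into '<unknown>' here.
  if (!req.route || typeof req.route.path !== 'string') return '<unknown>';
  // req.baseUrl carries the mount path of the enclosing router, if any.
  return (req.baseUrl || '') + req.route.path;
}

// Stand-in request objects (assumed shapes, for illustration only).
const first = routeLabel({
  baseUrl: '/v1',
  route: { path: '/customer/:id' },
  originalUrl: '/v1/customer/42',
});
const second = routeLabel({
  baseUrl: '/v1',
  route: { path: '/customer/:id' },
  originalUrl: '/v1/customer/43',
});
// No matched route, as with a 404.
const miss = routeLabel({ baseUrl: '', originalUrl: '/v1/custmer/42' });
```

Both customer requests yield the same label, and the unmatched request falls back to `<unknown>`, so the number of distinct `route` values is capped by the number of registered routes plus one.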

Labels deliberately omitted:
- No `authKey` label. Per-key cardinality would explode the time-series
  count. Per-tenant breakdowns belong in structured logs aggregated
  server-side, not in Prometheus labels.

Auth:
- Default: /metrics is unauthenticated. The intended deployment puts
  Prometheus on the same private network and lets the reverse proxy
  gate exposure.
- Setting `METRICS_BEARER_TOKEN` turns on `Authorization: Bearer <t>`
  enforcement. Tokens are compared with `crypto.timingSafeEqual` so the
  token can't be guessed byte by byte by timing the response.
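
A minimal sketch of such a gate, assuming the check is a pure function of the `Authorization` header and the configured token. `isAuthorized` is a hypothetical name and this is not the PR's code. One detail worth noting: `crypto.timingSafeEqual` throws when its inputs differ in byte length, so this sketch compares fixed-length SHA-256 digests instead of the raw strings.

```javascript
const crypto = require('crypto');

// Hypothetical helper, not taken from the PR. Returns true when the scrape
// may proceed: either no token is configured, or the header carries it.
function isAuthorized(authorizationHeader, expectedToken) {
  // METRICS_BEARER_TOKEN unset: the endpoint is open.
  if (!expectedToken) return true;

  const prefix = 'Bearer ';
  if (
    typeof authorizationHeader !== 'string' ||
    !authorizationHeader.startsWith(prefix)
  ) {
    return false;
  }
  const presented = authorizationHeader.slice(prefix.length);

  // Hash both sides so the buffers always have equal length, then compare
  // them in constant time.
  const a = crypto.createHash('sha256').update(presented).digest();
  const b = crypto.createHash('sha256').update(expectedToken).digest();
  return crypto.timingSafeEqual(a, b);
}

// The gate matrix from the description, with a made-up token value.
const open = isAuthorized(undefined, undefined);
const noHeader = isAuthorized(undefined, 's3cret');
const wrongToken = isAuthorized('Bearer wrong', 's3cret');
const rightToken = isAuthorized('Bearer s3cret', 's3cret');
```

In an Express app, a check like this would typically sit in a small middleware in front of the `/metrics` handler and respond with 401 when it returns false.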

Tests:
- `tests/api/metrics.test.js` (7 cases): route mounting, text-format
  content-type, default-no-auth pass-through, full
  METRICS_BEARER_TOKEN gate matrix (no header / wrong token / right
  token), metrics-registered scrape assertion.
- Full suite: 365 pass / 4 skip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@CryptoJones CryptoJones merged commit 84e9751 into master May 18, 2026
3 checks passed
@CryptoJones CryptoJones deleted the feat/prometheus-metrics branch May 18, 2026 04:58