A lightweight service registry explorer for on-call engineers and small platform teams.
Green Room takes its name from the room just offstage where performers wait before going on — a nod to Backstage, and a metaphor for what this tool is meant to be: everything you need to know about what’s running, right before it matters.
I built it after being tasked with drawing a large system’s architecture diagram and hating every minute of it — too big to read, too brittle to stay current, too much friction for anyone to bother maintaining. Backstage solves this well, but adopting it means convincing a platform team to host it and every other team to onboard — a real multi-quarter effort. Green Room is the lightweight middle ground: a single YAML file as the source of truth, a browser-based explorer for dependency and data flow visualization, and live schema validation to keep the registry honest. Versioning is left to you; a Git repo or a Confluence page is enough for most small teams.
Deploy in a day. Get out of the way.
- Dependency impact analysis — select any service and see everything downstream (or upstream) that breaks with it
- Business flow overlay — map services to the user journeys they power, filtered by stakeholder
- Data lineage — trace how a dataset or event travels through your system stage by stage
- On-call quick-links — surface runbooks, dashboards, incident channels, and SLOs directly from the service graph
- Live validation — paste or edit your registry in-browser; schema and cross-reference errors are highlighted instantly
- Schema hints in editor — move the cursor inside service/business/data flow entries to see required fields and allowed values
- Mermaid export — copy the current graph as a Mermaid diagram with one click
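
As a rough illustration of the dependency impact analysis above, the sketch below inverts each service's `upstream` list and walks the result breadth-first to collect everything that transitively depends on a selected service. The type names, the `downstreamOf` helper, and the `checkout_web` service are assumptions made for this example, not the app's actual code.

```typescript
// Hypothetical, trimmed-down shapes; the real registry carries more fields.
type Upstream = { service: string; criticality?: "hard" | "soft" };
type ServiceNode = { upstream: Upstream[] };
type ServiceMap = Record<string, ServiceNode>;

// Invert the `upstream` edges: for each service, which services depend on it directly?
function buildDependents(services: ServiceMap): Map<string, string[]> {
  const dependents = new Map<string, string[]>();
  for (const [key, svc] of Object.entries(services)) {
    for (const dep of svc.upstream) {
      const list = dependents.get(dep.service) ?? [];
      list.push(key);
      dependents.set(dep.service, list);
    }
  }
  return dependents;
}

// Breadth-first walk over the inverted edges, collecting every transitive dependent.
function downstreamOf(services: ServiceMap, start: string): string[] {
  const dependents = buildDependents(services);
  const seen = new Set<string>([start]);
  const queue = [start];
  const impacted: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const next of dependents.get(current) ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        impacted.push(next);
        queue.push(next);
      }
    }
  }
  return impacted;
}

const impactDemo: ServiceMap = {
  payments_db: { upstream: [] },
  payments_api: { upstream: [{ service: "payments_db", criticality: "hard" }] },
  checkout_web: { upstream: [{ service: "payments_api", criticality: "hard" }] },
};

console.log(downstreamOf(impactDemo, "payments_db")); // [ 'payments_api', 'checkout_web' ]
```

The `seen` set keeps the walk finite even if the registry contains a dependency cycle. Walking the original `upstream` edges instead of the inverted ones would give the upstream view.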
npm install
npm run dev

Drop a service_registry.yaml into public/ and the app loads it automatically. The file has four top-level sections:
| Section | What goes here |
|---|---|
| metadata | Team name, canonical team_id, maintainers, last-updated date |
| business_flows | Named user journeys with priority and stakeholder lists |
| data_flows | Ordered stage pipelines: which service produces, queues, processes, stores, serves, and consumes each dataset |
| services | One entry per deployable unit: type, status, upstream deps, business flows, and on-call links |
business_flows, data_flows, and services can be empty objects ({}) while you scaffold incrementally.
A minimal service entry:
services:
  payments_api:
    name: Payments API
    description: Processes checkout transactions and refunds.
    type: backend
    status: active
    upstream:
      - service: payments_db
        protocol: PostgreSQL
        criticality: hard
    business_flows: [checkout]
    owner: payments_team
    runbook: https://wiki.example.com/runbooks/payments-api
    health_check: https://payments.example.com/health
    dashboard: https://grafana.example.com/d/payments-api
    on_call: Payments API - PagerDuty
    incident_channel: "#incidents-payments"
    slo: "99.9%"

| Field | Required | Description |
|---|---|---|
| name | yes | Human-readable display name |
| description | yes | What the service does and why it exists (1–2 sentences) |
| type | yes | frontend · backend · worker · datastore · infrastructure |
| status | yes | active · experimental · migrating · deprecated |
| upstream | yes | Direct runtime dependencies (service key, protocol, hard/soft criticality) |
| business_flows | yes | Keys of the business flows this service participates in |
| owner | yes | Registry key of the owning team; compared to metadata.team_id to distinguish your services from external ones |
| runbook | yes | URL to the on-call runbook (triage steps, failure modes, escalation) |
| health_check | yes | URL to the health/readiness endpoint |
| port | no | Primary listening port, for local development reference |
| dashboard | no | Observability dashboard URL (Grafana, Datadog, etc.); the first stop when paged |
| on_call | no | PagerDuty service name, OpsGenie integration, or escalation policy URL |
| incident_channel | no | Primary Slack/Teams channel for incidents (e.g. #incidents-payments) |
| slo | no | Availability target as a percentage or URL to the SLO doc (e.g. 99.9%) |
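
The `owner` comparison described in the table can be sketched as follows. This is a minimal illustration that assumes a registry already parsed into an object; `splitByOwnership`, the type names, and the `fraud_scoring` / `risk_team` entries are hypothetical.

```typescript
// Hypothetical minimal shapes: only the fields needed for the ownership check.
type OwnedService = { name: string; owner: string };
type RegistryDoc = {
  metadata: { team_id: string };
  services: Record<string, OwnedService>;
};

// A service counts as "yours" when its owner matches metadata.team_id;
// everything else is treated as external.
function splitByOwnership(registry: RegistryDoc): { own: string[]; external: string[] } {
  const own: string[] = [];
  const external: string[] = [];
  for (const [key, svc] of Object.entries(registry.services)) {
    (svc.owner === registry.metadata.team_id ? own : external).push(key);
  }
  return { own, external };
}

const ownershipDemo: RegistryDoc = {
  metadata: { team_id: "payments_team" },
  services: {
    payments_api: { name: "Payments API", owner: "payments_team" },
    fraud_scoring: { name: "Fraud Scoring", owner: "risk_team" },
  },
};

console.log(splitByOwnership(ownershipDemo));
```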
Full descriptions for every field, enum value, and constraint are embedded in service_registry.schema.json as description properties. Editors with JSON Schema support (VS Code, JetBrains) surface these as hover text and autocomplete.
The schema lives in service_registry.schema.json (JSON Schema draft-2020-12). Services allow additional fields out of the box (additionalProperties: true), so you can attach team-specific metadata without touching the schema at all.
To enforce a custom field — for example, to require every service to declare a tier — edit the $defs/service definition in the schema: add the property to properties and its key to required. See the JSON Schema docs for the full vocabulary.
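
To make the shape of that schema change concrete, the sketch below applies it to an in-memory schema object instead of editing the file by hand. It assumes only what is stated above (a `$defs/service` definition with `properties` and `required`); the `requireServiceField` helper, the stand-in base schema, and the tier values are invented for illustration.

```typescript
// Loose, hypothetical typing for the parts of the schema this sketch touches.
type SchemaNode = {
  properties?: Record<string, unknown>;
  required?: string[];
  [keyword: string]: unknown;
};
type RegistrySchema = { $defs: { service: SchemaNode }; [keyword: string]: unknown };

// Return a copy of the schema where `key` is declared under
// $defs/service/properties and listed in $defs/service/required.
function requireServiceField(
  schema: RegistrySchema,
  key: string,
  definition: SchemaNode,
): RegistrySchema {
  const service = schema.$defs.service;
  const required = service.required ?? [];
  return {
    ...schema,
    $defs: {
      ...schema.$defs,
      service: {
        ...service,
        properties: { ...service.properties, [key]: definition },
        required: required.includes(key) ? required : [...required, key],
      },
    },
  };
}

// Stand-in for the real schema file; only the relevant keywords are shown.
const baseSchema: RegistrySchema = {
  $defs: {
    service: {
      type: "object",
      additionalProperties: true,
      properties: { name: { type: "string" } },
      required: ["name"],
    },
  },
};

// The tier values here are placeholders; pick whatever your team uses.
const withTier = requireServiceField(baseSchema, "tier", {
  type: "string",
  enum: ["tier-1", "tier-2", "tier-3"],
  description: "Criticality tier of the service",
});

console.log(JSON.stringify(withTier.$defs.service, null, 2));
```

The same two additions (one entry under `properties`, one key in `required`) are what a manual edit of the schema file would contain.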
Validation runs in two tiers: JSON Schema checks structural correctness and enum values, then validateCrossReferences in src/domain/registry.ts ensures every referenced service, business flow, and data flow stage resolves to a real key. Errors are pinned to source locations (line and column) in the editor pane.
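
The second tier can be pictured with the sketch below. It is not the `validateCrossReferences` implementation from src/domain/registry.ts; it is a simplified stand-in named `findDanglingReferences` that checks only upstream service keys and business flow keys, and it reports a dotted path instead of a line and column.

```typescript
// Hypothetical minimal shapes for the cross-reference check.
type CheckedService = { upstream: { service: string }[]; business_flows: string[] };
type CheckedRegistry = {
  business_flows: Record<string, unknown>;
  services: Record<string, CheckedService>;
};
type RefError = { path: string; message: string };

// Own-property lookup, so keys such as "toString" are not mistaken for registry entries.
function hasKey(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Report every upstream service and business flow reference that does not
// resolve to a key in the registry.
function findDanglingReferences(reg: CheckedRegistry): RefError[] {
  const errors: RefError[] = [];
  for (const [key, svc] of Object.entries(reg.services)) {
    svc.upstream.forEach((dep, i) => {
      if (!hasKey(reg.services, dep.service)) {
        errors.push({
          path: `services.${key}.upstream[${i}].service`,
          message: `unknown service "${dep.service}"`,
        });
      }
    });
    svc.business_flows.forEach((flow, i) => {
      if (!hasKey(reg.business_flows, flow)) {
        errors.push({
          path: `services.${key}.business_flows[${i}]`,
          message: `unknown business flow "${flow}"`,
        });
      }
    });
  }
  return errors;
}

const brokenRegistry: CheckedRegistry = {
  business_flows: { checkout: {} },
  services: {
    payments_api: {
      upstream: [{ service: "payments_db" }], // payments_db is not defined below
      business_flows: ["checkout", "refunds"], // refunds is not a declared flow
    },
  },
};

console.log(findDanglingReferences(brokenRegistry));
```

Running the structural schema check first means a pass like this can assume `upstream` and `business_flows` are present and well-typed.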
- src/app: app bootstrap and shell wiring
- src/features: feature UI modules (catalog, editor)
- src/domain: pure domain logic (graphing, validation, export)
- src/shared: reusable UI components and browser utilities
- src/styles: global design tokens and base styles
- tests: unit and component test suites
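
As an example of the kind of pure function that fits in src/domain, here is a hedged sketch of a Mermaid export for the service graph. The `toMermaid` name, the edge direction (dependent points at its dependency), and the choice to draw soft dependencies as dotted arrows are assumptions for illustration, not a description of the shipped export.

```typescript
// Hypothetical minimal shapes for the export.
type GraphDep = { service: string; criticality: "hard" | "soft" };
type GraphService = { name: string; upstream: GraphDep[] };

// Render the service graph as a Mermaid flowchart: one node per service,
// one arrow per upstream dependency.
function toMermaid(services: Record<string, GraphService>): string {
  const lines = ["flowchart LR"];
  for (const [key, svc] of Object.entries(services)) {
    // Display names containing double quotes would need escaping; omitted here.
    lines.push(`  ${key}["${svc.name}"]`);
  }
  for (const [key, svc] of Object.entries(services)) {
    for (const dep of svc.upstream) {
      const arrow = dep.criticality === "hard" ? "-->" : "-.->";
      lines.push(`  ${key} ${arrow} ${dep.service}`);
    }
  }
  return lines.join("\n");
}

const graphDemo: Record<string, GraphService> = {
  payments_api: {
    name: "Payments API",
    upstream: [{ service: "payments_db", criticality: "hard" }],
  },
  payments_db: { name: "Payments DB", upstream: [] },
};

console.log(toMermaid(graphDemo));
```

Because the function takes plain data and returns a string, it can be unit-tested without rendering anything in the browser.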
Imports use aliases (@app, @features, @domain, @shared, @styles) to avoid fragile deep relative paths.
Note: TypeScript currently requires baseUrl when using paths aliases in this setup; ignoreDeprecations is temporarily enabled until alias resolution is migrated to a TS7-safe pattern.
npm run lint
npm run typecheck
npm run format:check
npm run test:coverage
npm run build

Coverage thresholds currently enforce 70% for lines/functions/statements and 55% for branches. The lower branch threshold is temporary while branch-heavy UI paths are incrementally hardened.
Found a bug or have an idea? Open an issue using one of the templates.
