Pin env-var surface parity between docker-compose ci backend and GHA workflow (BS#958) by jakebromberg · Pull Request #964 · WXYC/Backend-Service

jakebromberg · 2026-05-19T19:11:47Z

Closes #958.

Summary

tests/unit/scripts/ci-env-surface-parity.test.ts — source-grep test that extracts env-var KEYS from both CI surfaces and asserts any divergence falls in an explicit allowlist. Adding a new env var to one surface now fails CI unless mirrored or justified.

Why

Per #164, the GHA integration suite deliberately runs the backend as a host node process with env from .github/workflows/test.yml. npm run ci:testmock runs it as a docker container with env from dev_env/docker-compose.yml. The two env surfaces drift independently, and a new var landing in one but not the other can silently break the integration suite.

#955 was the most recent symptom: G4's LML_CLIENT_MAX_CONCURRENT + LML_CLIENT_RATE_PER_MIN overrides made it into the unit-test harness but neither CI surface, so the integration suite's bucket drained and metadata-lml.spec.js started timing out. The manual fix in #957 wired the limiter env vars to both sides with a per-PR source-grep test pinning the values. This PR generalizes the pattern so the next limiter knob (or any other new env var) doesn't need its own per-PR pin.

Scope choice

Of the three options listed in #958:

Mechanical guard ← this PR. Smallest change, preserves the dual-path model.
Shared source of truth (factor into .env.ci consumed by both).
Retire ci:testmock entirely.

(2) and (3) are deferred — both are larger investments and the mechanical guard catches the bug class today.

How it works

The test asserts:

actualOnlyInCompose === EXPECTED_ONLY_IN_COMPOSE
actualOnlyInWorkflow === EXPECTED_ONLY_IN_WORKFLOW

The two allowlist arrays capture today's intentional divergence (18 compose-only keys, 9 workflow-only keys), each with a one-line // Why: comment — the deliberate-thought receipt. When the test fails, the dev either:

Mirrors the new var to the other surface (preferred), or
Adds it to the appropriate allowlist with a Why: justification.

What this does not do

Doesn't enforce value equality. LML_CLIENT_RATE_PER_MIN=60000 in workflow vs =10000 in compose would both pass this guard. The existing lml-limiter-test-env.test.ts value-pin therefore stays useful and is not retired.
Doesn't fix the existing divergences. Some entries on the allowlist (e.g. CATALOG_TRACK_SEARCH_* only in compose, BETTER_AUTH_AUDIENCE/ISSUER/JWKS_URL only in compose) may be genuine coverage gaps — but resolving each is its own ticket.

Test plan

Test passes on main (npx jest --testPathPatterns ci-env-surface-parity) — 2/2 pass.
Full unit suite — 1978/1978 pass.
Negative test: temporarily added FAKE_GUARD_TEST_KEY=value to the compose backend env; test failed with a clean diff showing the unauthorized key. File restored.
Typecheck + format:check clean.

#164 — the deliberate-divergence decision this PR makes mechanically enforceable.
#955 — the BS-prod-affecting incident this prevents recurrence of.
#957 — the manual fix whose pattern this PR generalizes.
#959 — the other point patch from the Integration tests fail deterministically in metadata-lml.spec.js (post-#949 PR run) #955 debugging session (DB_PORT conflation); separate PR.

…workflow (BS#958) Per BS#164, the GHA integration suite runs the backend as a host node process with env from `.github/workflows/test.yml`; `npm run ci:testmock` runs it as a docker container with env from `dev_env/docker-compose.yml`. The two env surfaces drift independently — BS#955 was the most recent symptom (G4's `LML_CLIENT_*` overrides made it into the unit harness but neither CI surface) and the manual fix in BS#957 pinned the limiter env vars on both sides with a per-PR source-grep test. This commit generalizes that pattern. `tests/unit/scripts/ci-env-surface-parity.test.ts` extracts the env-key set from both surfaces and asserts that any divergence falls within an explicit allowlist. Each allowlist entry carries a one-line `Why:` justifying the divergence — the deliberate-thought receipt. Adding a new env var to one surface now fails CI unless either (a) the var is mirrored to the other or (b) the allowlist is amended with a justification. The allowlist captures today's intentional divergence (18 compose-only keys, 9 workflow-only keys). The lml-limiter-test-env.test.ts value-pin remains useful and is not retired — this guard enforces key presence, not value equality. Verified the test catches a regression: temporarily added a `FAKE_GUARD_TEST_KEY` to the compose backend env, ran the test, observed a clean diff failure showing the unauthorized key.

…eedback) The first cut missed the `Integration-Tests:` job-level `env:` block (lines 266-281), which sets DB_HOST, DB_PORT, BETTER_AUTH_AUDIENCE/ISSUER/JWKS_URL, AUTH_PORT, BETTER_AUTH_SECRET, RUN_TESTS on the host-process backend. As a result the prior allowlist marked five vars as "only in compose" with `Why:` comments asserting that the workflow relied on backend defaults — but the workflow does set them, just in the job scope. That misinformation undermined the deliberate-thought-receipt purpose of the `Why:` lines. Fixes: - Split `extractWorkflowEnvKeys()` into three named extractors (`extractTopLevelEnv`, `extractIntegrationTestsJobEnv`, `extractStartServicesStepEnv`), unioned at the call site. Anchored the top-level extractor to the pre-`jobs:` slice so it can't accidentally match a job-level `env:` if file order changes. - Fixed a slice-off-by-one in the job-env extractor: `slice(0, stepsIdx)` excluded the trailing newline of the last env line, silently dropping `BETTER_AUTH_SECRET`. Adjusted to `slice(0, stepsIdx + 1)` and documented the gotcha inline. - Removed BETTER_AUTH_AUDIENCE/ISSUER/JWKS_URL, DB_HOST, DB_PORT from EXPECTED_ONLY_IN_COMPOSE (they're in the workflow job env). - Added AUTH_PORT, BETTER_AUTH_SECRET, RUN_TESTS to EXPECTED_ONLY_IN_WORKFLOW (genuinely workflow-only, with accurate `Why:` lines). - Header docstring updated to describe the three-scope union and dropped the stale reference to the BS#959 sibling test (not in this branch).

…int rule

jakebromberg added 3 commits May 19, 2026 12:10

Use { n } quantifier for spaces in regex to satisfy no-regex-spaces l…

493eb9a

…int rule

jakebromberg merged commit 16dede6 into main May 19, 2026
5 checks passed

jakebromberg deleted the fix-958-env-surface-guard branch May 19, 2026 22:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pin env-var surface parity between docker-compose ci backend and GHA workflow (BS#958)#964

Pin env-var surface parity between docker-compose ci backend and GHA workflow (BS#958)#964
jakebromberg merged 3 commits into
mainfrom
fix-958-env-surface-guard

jakebromberg commented May 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jakebromberg commented May 19, 2026

Summary

Why

Scope choice

How it works

What this does not do

Test plan

Related

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant