
Drop bail from integration-test jest config #960

Merged: jakebromberg merged 1 commit into main from drop-integration-bail on May 19, 2026


@jakebromberg
Member

Summary

  • Remove "bail": 1 from jest.config.json (integration suite).
  • Update CLAUDE.md doc line that referenced the bail behavior.

Why

bail: 1 masked the BS#955 cascade pattern. When G4's process-wide TokenBucket(50/min) drained, the suite reported only the first failing spec (metadata-lml.spec.js) and exited, hiding that subsequent integration specs were also queueing on the same exhausted bucket. The post-incident analysis showed several specs that would have failed visibly under a non-bailing run, which would have pointed at the rate limiter as the root cause hours earlier instead of letting it look like a metadata-lml-specific bug.
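The masking effect can be illustrated with a minimal sketch. This is a toy model, not the project's limiter or Jest itself: the `TokenBucket` class, the `runSpecs` helper, and every spec name other than metadata-lml.spec.js are hypothetical, and a spec that cannot get a token is treated as an immediate failure rather than a queued request that times out.

```javascript
// Toy model (assumption): a shared bucket that is already drained, plus a
// serial runner that optionally stops after `bail` failing specs.
class TokenBucket {
  constructor(capacity) {
    this.tokens = capacity;
  }

  // Returns true if a token was available, false if the bucket is empty.
  tryTake() {
    if (this.tokens > 0) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Runs specs in order; a spec fails when no token is available.
// bail = 0 means "never stop early", mirroring how the option is described.
function runSpecs(specs, bucket, bail) {
  const failures = [];
  for (const spec of specs) {
    if (!bucket.tryTake()) {
      failures.push(spec);
      if (bail > 0 && failures.length >= bail) break;
    }
  }
  return failures;
}

// Only the first name comes from the PR; the others are placeholders.
const specs = ['metadata-lml.spec.js', 'spec-b.spec.js', 'spec-c.spec.js'];

const withBail = runSpecs(specs, new TokenBucket(0), 1);
const withoutBail = runSpecs(specs, new TokenBucket(0), 0);

console.log('bail: 1 reports', withBail);
console.log('no bail reports', withoutBail);
```

With bail set to 1 the report contains a single spec, which reads like a spec-specific bug. Without bail, every spec that hit the empty bucket shows up, which points at the shared limiter.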

With the limiter env vars now correct in both CI surfaces (BS#957 — workflow Start services env block + dev_env/docker-compose.yml), the integration suite is stable enough that the diagnostic value of full failure visibility outweighs the CI-minute savings from short-circuiting on first failure. The suite runs --runInBand with a 30s per-test timeout — even a worst-case 6-failure run adds only ~3 minutes of CI time.
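The ~3 minute figure follows from simple arithmetic, sketched below. It assumes each failing test consumes its full 30s timeout and that tests run serially under --runInBand. The variable names are illustrative only.

```javascript
// Upper bound on extra CI time when every failing test runs to its timeout.
// Assumption: serial execution, one full timeout per failing test.
const perTestTimeoutSeconds = 30;
const worstCaseFailures = 6;

const extraSeconds = perTestTimeoutSeconds * worstCaseFailures;
const extraMinutes = extraSeconds / 60;

console.log(`${extraSeconds}s = ${extraMinutes} min`); // prints "180s = 3 min"
```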

Test plan

  • CI Integration-Tests job runs to completion (it should pass on green; this PR only affects what happens when a failing test exists).
  • No other consumers of the bail config — confirmed via grep -rn 'bail' --include='*.md' --include='*.json' --include='*.ts' --include='*.js'; only jest.config.json and CLAUDE.md mention it.

Related

  • BS#957 — the limiter env-var fix this builds on.
  • BS#955 — the incident that exposed bail's diagnostic cost.

bail: 1 masked the BS#955 cascade pattern — when G4's process-wide TokenBucket drained, the suite reported only the first failing spec (metadata-lml.spec.js) and exited, hiding that subsequent integration specs were also queueing on the same exhausted bucket. With the limiter env vars now correct in both CI surfaces (workflow + compose, BS#957), the suite is stable enough that the diagnostic value of full failure visibility outweighs the CI-minute savings from short-circuiting on first failure.

CLAUDE.md doc line updated to match.
@jakebromberg jakebromberg merged commit 982058d into main May 19, 2026
5 checks passed
@jakebromberg jakebromberg deleted the drop-integration-bail branch May 19, 2026 22:31
