Scope chaos pod watch to stop OOM loops #1067

Closed
aymericDD wants to merge 1 commit into main from dd/fix-chaos-pod-watch-memory


@aymericDD
Contributor

What does this PR do?

Bits AI SRE Investigation

  • Adds new functionality
  • Alters existing functionality
  • Fixes a bug
  • Improves documentation or testing

Please briefly describe your changes as well as the motivation behind them:

  • Stops the disruption controller from receiving every pod event through an unfiltered raw watch. That path could enqueue reconcile requests for non-chaos pods and contribute to high memory usage and OOM crash loops.
  • Switches pod event wiring to the namespace-scoped informer from kubeInformerFactory and applies chaosEventsPredicate() directly on that source.
  • Adds mapPodToDisruptionRequests so pods missing disruption labels never enqueue reconcile requests, even if predicate behavior changes.
  • Adds focused unit tests for pod-to-disruption request mapping and reconcile trigger predicate behavior.
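
The label guard described for mapPodToDisruptionRequests can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Pod` and `Request` types are stand-ins for `corev1.Pod` and controller-runtime's `reconcile.Request` so the snippet compiles on its own, and the two label keys are assumptions about how chaos pods point back to their owning disruption.

```go
package main

// Assumed label keys; check the controller source for the real constants.
const (
	disruptionNameLabel      = "chaos.datadoghq.com/disruption-name"
	disruptionNamespaceLabel = "chaos.datadoghq.com/disruption-namespace"
)

// Pod is a stand-in for the metadata of a corev1.Pod (hypothetical type).
type Pod struct {
	Name      string
	Namespace string
	Labels    map[string]string
}

// Request is a stand-in for reconcile.Request (hypothetical type).
type Request struct {
	Namespace string
	Name      string
}

// mapPodToDisruptionRequests returns one reconcile request for the
// disruption that owns the pod, or nil when either disruption label is
// missing, so non-chaos pods never reach the work queue.
func mapPodToDisruptionRequests(pod *Pod) []Request {
	if pod == nil {
		return nil
	}

	// Reading from a nil map is safe in Go and yields the empty string.
	name := pod.Labels[disruptionNameLabel]
	namespace := pod.Labels[disruptionNamespaceLabel]

	if name == "" || namespace == "" {
		return nil
	}

	return []Request{{Namespace: namespace, Name: name}}
}
```

Returning nil for unlabeled pods keeps the guard in the mapping step itself, so it still holds if the predicate on the event source is later loosened.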

Code Quality Checklist

  • The documentation is up to date.
  • My code is sufficiently commented and passes continuous integration checks.
  • I have signed my commit (see Contributing Docs).

Testing

  • I leveraged continuous integration testing
    • by depending on existing unit tests or end-to-end tests.
    • by adding new unit tests or end-to-end tests.
  • I manually tested the following steps:
    • go test ./controllers -run 'TestMapPodToDisruptionRequests|TestShouldTriggerReconcile' -count=1
    • GOOS=linux /tmp/bin/golangci-lint run ./controllers
    • locally.
    • as a canary deployment to a cluster.

PR by Bits - View session in Datadog


Co-authored-by: aymericDD <8859832+aymericDD@users.noreply.github.com>
datadog-datadog-prod-us1-2 Bot commented Apr 21, 2026


Bits Dev status: ✅ Done


@datadog-datadog-prod-us1-2

I can only run on private repositories.

@datadog-datadog-prod-us1-2

Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 39.09% (+0.00%)

🔗 Commit SHA: ab9b952

aymericDD closed this Apr 29, 2026