fix: reconciliation loop in taskspawner controller#864
Open
Conversation
…ciler reflect.DeepEqual on init containers and volumes always returned false because Kubernetes adds defaulted fields (securityContext, terminationMessagePath, Secret.DefaultMode) that the builder does not set. This caused an infinite reconciliation loop where every reconcile updated the deployment, which triggered another reconcile. The constant pod restarts exhausted the GitHub API rate limit. Replace reflect.DeepEqual with initContainersEqual and volumesEqual that compare only the fields the builder explicitly sets.
When TokenRefresherImagePullPolicy is left at its default empty value (the Helm chart default), the API server persists a concrete policy (Always/IfNotPresent). The verbatim comparison would still detect drift on every reconcile. Skip the comparison when either side is empty.
Collaborator
Author
|
/kelow review |
Collaborator
Author
|
/kelos review |
|
🤖 Kelos Task Status Task |
There was a problem hiding this comment.
🤖 Kelos Reviewer Agent @gjkim42
Review Summary
Verdict: APPROVE
Scope: Replace reflect.DeepEqual with semantic comparisons for init containers and volumes to break reconciliation loop
Findings
Correctness
- The fix correctly addresses the root cause: Kubernetes-defaulted fields (securityContext, terminationMessagePath, terminationMessagePolicy, Secret.DefaultMode) causing
reflect.DeepEqualto always report false. initContainersEqual(taskspawner_controller.go:970) compares exactly the fields the builder sets (Name, Image, ImagePullPolicy, RestartPolicy, Env, VolumeMounts, Resources) — verified againsttaskspawner_deployment_builder.go:159-176.pullPolicyEqual(taskspawner_controller.go:992) correctly treats empty as a wildcard. This avoids a loop when the builder omits ImagePullPolicy and the API server fills in a default, while still detecting intentional policy changes when both sides are non-empty.volumesEqual(taskspawner_controller.go:1002) correctly compares only Name and the key identifier fields (SecretName, ConfigMap.Name), matching what the builder sets attaskspawner_deployment_builder.go:105-118.- The fix is applied consistently to both
updateDeploymentandupdateCronJobpaths.
Tests
- Good coverage: ignore-defaults tests (
TestInitContainersEqual_IgnoresKubernetesDefaults,TestVolumesEqual_IgnoresSecretDefaultMode) and detect-changes tests (TestInitContainersEqual_DetectsRealChanges,TestVolumesEqual_DetectsRealChanges) exercise the key scenarios. TestInitContainersEqual_IgnoresAPIDefaultedPullPolicyspecifically covers the empty-policy edge case.- All unit tests pass.
Conventions
- Commit messages are concise with no PR links.
- No unrelated changes or unnecessary refactoring.
- Follows the existing comparison pattern established by
resourceRequirementsEqualandequalEnvVars.
Suggestions (optional)
initContainersEqualandvolumesEqualare tightly coupled to the set of fields the builder populates. If the builder gains new fields (e.g.,Command,Args, or a new volume type beyond EmptyDir/Secret/ConfigMap), the comparison functions would silently ignore them. This mirrors howequalEnvVarsworks elsewhere so it's consistent, but worth keeping in mind during future builder changes.
/kelos needs-input
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What type of PR is this?
/kind bug
What this PR does / why we need it:
The taskspawner controller was in an infinite reconciliation loop, constantly updating deployments and cronjobs even when nothing changed. This caused repeated pod restarts which exhausted the GitHub API rate limit, breaking all spawner discovery.
Root cause:
reflect.DeepEqualwas used to compare init containers and volumes between the desired spec (from the builder) and the live object (from the API server). Kubernetes normalizes objects by adding defaulted fields that the builder doesn't set:securityContext,terminationMessagePath,terminationMessagePolicySecret.DefaultMode(defaults to 420)This meant every reconciliation detected a "change" and called Update, which triggered another reconciliation.
Fix: Replace
reflect.DeepEqualwith semantic comparison functions (initContainersEqual,volumesEqual) that only compare the fields the builder explicitly sets — matching the pattern already used for main container comparison andresourceRequirementsEqual.Which issue(s) this PR is related to:
N/A
Special notes for your reviewer:
The comparison bug was pre-existing but was amplified by #859's new Task watcher, which added more reconciliation triggers. The fix applies to both
updateDeployment()andupdateCronJob().Does this PR introduce a user-facing change?
Summary by cubic
Fixes an infinite reconciliation loop in the taskspawner controller by using semantic comparisons for init containers and volumes. This stops unnecessary Deployment/CronJob updates, prevents pod restarts, and avoids GitHub API rate limit exhaustion.
ImagePullPolicyas matching any API-defaulted value to avoid false drift.Written for commit a17fcdd. Summary will update on new commits.