fix:return evaluation errors from processAllNodesForRule #237

Open
dorodb-web22 wants to merge 2 commits into kubernetes-sigs:main from dorodb-web22:fix/propagate-rule-evaluation-errors

Conversation

@dorodb-web22

Description

processAllNodesForRule was catching errors from evaluateRuleForNode but always returning nil. This prevented RuleReconciler.Reconcile from seeing any failures, so controller-runtime never triggered a requeue on transient errors like API conflicts or patch failures.

This fix accumulates errors across the node loop and returns them via errors.Join, using the same pattern applied to processNodeAgainstAllRules in #222. The existing behavior of continuing evaluation across all nodes is preserved.
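
The accumulate-and-join pattern described above can be sketched in isolation as follows. This is a minimal illustration, not the controller's actual code: `evaluate`, `processAll`, and the node names are hypothetical stand-ins for evaluateRuleForNode and the node loop. The only real API assumed is errors.Join from the Go standard library (Go 1.20 or later).

```go
package main

import (
	"errors"
	"fmt"
)

// evaluate is a hypothetical stand-in for evaluateRuleForNode:
// it fails for any node listed in failing.
func evaluate(node string, failing map[string]bool) error {
	if failing[node] {
		return fmt.Errorf("evaluate node %q: simulated failure", node)
	}
	return nil
}

// processAll visits every node, collects failures instead of dropping them,
// and returns them as one error. It returns a nil error when nothing failed,
// because errors.Join returns nil when it receives no non-nil errors.
func processAll(nodes []string, failing map[string]bool) (visited int, err error) {
	var errs []error
	for _, n := range nodes {
		visited++
		if e := evaluate(n, failing); e != nil {
			errs = append(errs, e) // keep going; do not return early
		}
	}
	return visited, errors.Join(errs...)
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	visited, err := processAll(nodes, map[string]bool{"node-b": true})
	fmt.Println("visited:", visited)
	fmt.Println("error:", err)
}
```

Every node is still visited even when one fails, which matches the stated goal of preserving evaluation across all nodes while no longer hiding the failure from the caller.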

Related Issue

Fixes #234

Type of Change

/kind bug

Testing

make test
make lint

Checklist

  • make test passes
  • make lint passes

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label May 10, 2026
@netlify

netlify Bot commented May 10, 2026

Deploy Preview for node-readiness-controller canceled.

Name Link
🔨 Latest commit 7348a02
🔍 Latest deploy log https://app.netlify.com/projects/node-readiness-controller/deploys/6a04b423e23f8f00082881af

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dorodb-web22
Once this PR has been reviewed and has the lgtm label, please assign mrunalp for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 10, 2026
@k8s-ci-robot
Contributor

Hi @dorodb-web22. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label May 10, 2026
@ajaysundark
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 11, 2026
Contributor

@AvineshTripathi AvineshTripathi left a comment

cc @ajaysundark wdyt about this


  log.Info("Completed processing nodes for rule", "rule", rule.Name, "processedCount", len(appliedNodes))
- return nil
+ return errors.Join(errs...)
Contributor

Returning an error will keep the caller function (in this case, this) in an infinite loop. Other rules will not be evaluated either. It also will not update the status (here) until the error is fixed.

Author

@dorodb-web22 dorodb-web22 May 13, 2026

The early return was skipping updateRuleStatus and cleanupDeletedNodes. I fixed it by saving the error, letting the status update complete, then returning the error at the end to trigger a requeue.
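
The ordering described in this reply can be sketched roughly as follows. The names here (`steps`, `processNodes`, `updateStatus`, `cleanup`, `reconcile`) are hypothetical placeholders, not the controller's real signatures. The sketch only shows that the processing error is held in a variable, the status and cleanup steps still run, and the saved error is returned last so that a caller such as controller-runtime can requeue.

```go
package main

import (
	"errors"
	"fmt"
)

// steps records which phases ran, so the ordering is observable.
type steps struct {
	ran []string
}

// processNodes is a hypothetical stand-in for the node evaluation phase.
func (s *steps) processNodes(fail bool) error {
	s.ran = append(s.ran, "process")
	if fail {
		return errors.New("simulated node evaluation failure")
	}
	return nil
}

// updateStatus is a hypothetical stand-in for the status update phase.
func (s *steps) updateStatus() error {
	s.ran = append(s.ran, "status")
	return nil
}

// cleanup is a hypothetical stand-in for the deleted-node cleanup phase.
func (s *steps) cleanup() {
	s.ran = append(s.ran, "cleanup")
}

// reconcile saves the processing error instead of returning early, lets the
// later phases complete, and reports all failures at the end.
func reconcile(s *steps, fail bool) error {
	procErr := s.processNodes(fail)
	statusErr := s.updateStatus()
	s.cleanup()
	// errors.Join skips nil values and returns nil if all of them are nil.
	return errors.Join(procErr, statusErr)
}

func main() {
	s := &steps{}
	err := reconcile(s, true)
	fmt.Println("phases run:", s.ran)
	fmt.Println("returned error:", err)
}
```

Joining the saved error with any status error is one possible design choice; returning only the processing error after the later steps finish would serve the same ordering goal.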

@AvineshTripathi
Contributor

/assign

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels May 13, 2026
@dorodb-web22 dorodb-web22 force-pushed the fix/propagate-rule-evaluation-errors branch from 513df2b to 7348a02 Compare May 13, 2026 17:25
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 13, 2026
if r.ruleAppliesTo(ctx, rule, &node) {
	appliedNodes = append(appliedNodes, node.Name)
	log.Info("Processing node for rule", "rule", rule.Name, "node", node.Name)
	if err := r.evaluateRuleForNode(ctx, rule, &node); err != nil {
Contributor

Can you add a test case for this where evaluateRuleForNode returns an error?


  log.Info("Completed processing nodes for rule", "rule", rule.Name, "processedCount", len(appliedNodes))
- return nil
+ return errors.Join(errs...)
Contributor

"requeue on transient errors like API conflicts or patch failures."

Can you also add unit tests that capture the gaps this PR aims to address?


// processAllNodesForRule processes all nodes when a rule changes.
//
//nolint:unparam // Keep error return for future extensibility and API stability.
Contributor

Can we remove this comment as it is no longer true?

@ajaysundark
Contributor

@dorodb-web22 Thanks for your PR. Could you rebase and address the test failures?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 15, 2026
@k8s-ci-robot
Contributor

PR needs rebase.

@AvineshTripathi
Contributor

/lgtm

Please rebase

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 15, 2026

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
kind/bug Categorizes issue or PR as related to a bug.
lgtm "Looks good to me", indicates that a PR is ready to be merged.
needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.
ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: processAllNodesForRule swallows node evaluation errors, preventing reconcile retries

4 participants