What happened
When a NodeReadinessRule using enforcementMode: bootstrap-only is deleted, the controller removes the taints managed by the rule, but the bootstrap completion annotations written on nodes are left behind.
The annotation key is generated using the rule name:
```go
annotationKey := fmt.Sprintf("readiness.k8s.io/bootstrap-completed-%s", ruleName)
```
isBootstrapCompleted() later checks this annotation to decide whether node evaluation should be skipped.
Because the annotation key depends only on the rule name, recreating a rule with the same name causes previously bootstrapped nodes to be treated as already completed, even if the new rule has different conditions.
As a result, node evaluation is skipped and the new taint may never be applied.
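To make the collision concrete, here is a minimal in-memory sketch. It is not the controller's code: the `node` type and the `bootstrapKey` helper are hypothetical, `isBootstrapCompleted` is a simplified stand-in for the real function, and treating the value `"true"` as the completion marker is an assumption. Only the key format is taken from the snippet above.

```go
package main

import "fmt"

// node is a hypothetical in-memory stand-in for a Kubernetes Node object.
type node struct {
	name        string
	annotations map[string]string
}

// bootstrapKey mirrors the name-based key format quoted above.
func bootstrapKey(ruleName string) string {
	return fmt.Sprintf("readiness.k8s.io/bootstrap-completed-%s", ruleName)
}

// isBootstrapCompleted is a simplified stand-in for the controller's check.
// It looks only at the annotation keyed by the rule name.
// Assumption: the value "true" marks completion.
func isBootstrapCompleted(n node, ruleName string) bool {
	return n.annotations[bootstrapKey(ruleName)] == "true"
}

func main() {
	n := node{name: "node-a", annotations: map[string]string{}}

	// The first rule named "network-readiness" completes bootstrap on the node.
	n.annotations[bootstrapKey("network-readiness")] = "true"

	// The rule is deleted and the annotation is left behind. A new rule with
	// the same name but different conditions is then created.
	// The stale annotation still matches, so this prints "true".
	fmt.Println(isBootstrapCompleted(n, "network-readiness"))
}
```

The check has no way to tell the old rule from the new one, because the rule name is the only input to the key.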
I traced this to the delete path in reconcileDelete(). It currently cleans up taints through cleanupTaintsForRule() but does not clean up bootstrap annotations before removing the finalizer.
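A rough sketch of the delete-path ordering I have in mind, using in-memory nodes rather than the real client. The `node` type, the function bodies, and `cleanupBootstrapAnnotationsForRule` are hypothetical; only the names `reconcileDelete` and `cleanupTaintsForRule` come from the existing code, and their real signatures will differ.

```go
package main

import "fmt"

const bootstrapPrefix = "readiness.k8s.io/bootstrap-completed-"

// node is a hypothetical in-memory stand-in for a Kubernetes Node object.
type node struct {
	name        string
	taints      []string // taint keys only, for brevity
	annotations map[string]string
}

// cleanupTaintsForRule is a stand-in for the existing taint cleanup.
func cleanupTaintsForRule(nodes []*node, taintKey string) {
	for _, n := range nodes {
		kept := n.taints[:0]
		for _, t := range n.taints {
			if t != taintKey {
				kept = append(kept, t)
			}
		}
		n.taints = kept
	}
}

// cleanupBootstrapAnnotationsForRule is the hypothetical missing step:
// it removes the rule's completion annotation and reports how many nodes changed.
func cleanupBootstrapAnnotationsForRule(nodes []*node, ruleName string) int {
	key := bootstrapPrefix + ruleName
	removed := 0
	for _, n := range nodes {
		if _, ok := n.annotations[key]; ok {
			delete(n.annotations, key)
			removed++
		}
	}
	return removed
}

// reconcileDelete sketches the proposed ordering: taints, then annotations,
// and only afterwards the finalizer (represented here by the return value).
func reconcileDelete(nodes []*node, ruleName, taintKey string) (finalizerRemoved bool) {
	cleanupTaintsForRule(nodes, taintKey)
	cleanupBootstrapAnnotationsForRule(nodes, ruleName)
	return true
}

func main() {
	n := &node{
		name:        "node-a",
		taints:      []string{"readiness.k8s.io/network-not-ready"},
		annotations: map[string]string{bootstrapPrefix + "network-readiness": "true"},
	}
	reconcileDelete([]*node{n}, "network-readiness", "readiness.k8s.io/network-not-ready")
	fmt.Println(len(n.taints), len(n.annotations)) // both are 0 after cleanup
}
```

In the real controller each step would be an API call that can fail, so the finalizer should only be removed once both cleanups have succeeded or been retried.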
What you expected to happen
Deleting a bootstrap-only rule should also remove the bootstrap completion state associated with that rule.
If a rule is recreated with the same name, nodes should be evaluated from scratch.
How to reproduce
```bash
# Create bootstrap-only rule
kubectl apply -f - <<EOF
apiVersion: readiness.node.x-k8s.io/v1alpha1
kind: NodeReadinessRule
metadata:
  name: network-readiness
spec:
  enforcementMode: bootstrap-only
  conditions:
    - type: NetworkUnavailable
      requiredStatus: "False"
  taint:
    key: readiness.k8s.io/network-not-ready
    effect: NoSchedule
  nodeSelector:
    matchLabels:
      kubernetes.io/os: linux
EOF

# Wait for bootstrap completion

# Delete the rule
kubectl delete nodereadinessrule network-readiness

# Verify annotations still exist
kubectl get nodes -o json | \
  jq '.items[] | select(
    .metadata.annotations["readiness.k8s.io/bootstrap-completed-network-readiness"] != null
  ) | .metadata.name'

# Recreate rule with same name but different condition
kubectl apply -f - <<EOF
apiVersion: readiness.node.x-k8s.io/v1alpha1
kind: NodeReadinessRule
metadata:
  name: network-readiness
spec:
  enforcementMode: bootstrap-only
  conditions:
    - type: DiskPressure
      requiredStatus: "False"
  taint:
    key: readiness.k8s.io/network-not-ready
    effect: NoSchedule
  nodeSelector:
    matchLabels:
      kubernetes.io/os: linux
EOF
```
At this point, nodes that previously completed bootstrap can skip evaluation because the old annotation is still present.
Suggested fix
A few possible directions:
- Remove bootstrap annotations during finalizer cleanup
- Use rule UID instead of rule name in the annotation key
- Combine UID-based keys with best-effort cleanup during deletion
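As a sketch of the UID-based direction, assuming the rule's UID is available wherever the key is built: the `rule` type, `bootstrapKeyForUID`, and `cleanupBootstrapAnnotation` are hypothetical names, the UIDs are made up, and the `"true"` value is an assumption.

```go
package main

import "fmt"

// rule is a hypothetical stand-in holding only the fields that matter here.
type rule struct {
	name string
	uid  string
}

// bootstrapKeyForUID keys the completion annotation by rule UID instead of name.
func bootstrapKeyForUID(uid string) string {
	return fmt.Sprintf("readiness.k8s.io/bootstrap-completed-%s", uid)
}

// isBootstrapCompleted checks completion for one specific rule instance.
func isBootstrapCompleted(annotations map[string]string, r rule) bool {
	return annotations[bootstrapKeyForUID(r.uid)] == "true"
}

// cleanupBootstrapAnnotation is the best-effort removal on rule deletion.
// With UID keys, a missed cleanup only leaves an orphan annotation behind.
func cleanupBootstrapAnnotation(annotations map[string]string, r rule) {
	delete(annotations, bootstrapKeyForUID(r.uid))
}

func main() {
	annotations := map[string]string{}
	old := rule{name: "network-readiness", uid: "11111111-1111-1111-1111-111111111111"}
	recreated := rule{name: "network-readiness", uid: "22222222-2222-2222-2222-222222222222"}

	// The old rule completes bootstrap on the node.
	annotations[bootstrapKeyForUID(old.uid)] = "true"

	fmt.Println(isBootstrapCompleted(annotations, old))       // true
	fmt.Println(isBootstrapCompleted(annotations, recreated)) // false: same name, new UID

	// Best-effort cleanup when the old rule is deleted.
	cleanupBootstrapAnnotation(annotations, old)
	fmt.Println(len(annotations)) // 0
}
```

With UID keys a recreated rule never matches old state, so cleanup during deletion becomes housekeeping rather than a correctness requirement. That is the reason for listing the combined option.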
Happy to work on a fix if one of these approaches makes sense.
/kind bug