Summary
RuleReadinessController maintains an in-memory ruleCache that is populated
lazily as RuleReconciler processes each NodeReadinessRule. NodeReconciler
derives all applicable rules exclusively from this cache via
getApplicableRulesForNode. Because both controllers start draining their work
queues concurrently after informer sync, there is a race window on startup and
after leader re-election where node events are processed against an empty or
partially-warm cache.
For continuous enforcement this is operationally benign: the controller
re-evaluates the node on the next condition, taint, or label change and
converges. For bootstrap-only rules it is a permanent correctness gap: if a
node is evaluated before its applicable rule reaches the cache, the taint is
never applied, the bootstrap completion annotation is never written, and the
intended admission gate is silently bypassed with no error surfaced.
Current Behavior
Startup sequence:
- mgr.Start() starts informers and waits for cache synchronization; all
NodeReadinessRule objects are now readable from the local informer cache.
- RuleReconciler and NodeReconciler begin draining their respective work
queues concurrently.
- RuleReconciler reconciles existing rules one by one
(MaxConcurrentReconciles: 1), calling updateRuleCache for each.
- NodeReconciler immediately processes queued Node events, including events
for nodes that joined during or just before startup, against whatever rules
happen to be in the cache at that moment.
With a non-trivial rule count, the ruleCache reaches full coverage only after
all rules have been individually reconciled. Any node event processed before a
given rule is cached is evaluated without that rule.
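The window can be illustrated with a minimal, dependency-free model. The types below (Rule, RuleCache, ApplicableRules) are hypothetical stand-ins for NodeReadinessRule, ruleCache, and getApplicableRulesForNode, not the controller's actual code; the sketch only assumes that lookups are answered purely from what has been cached so far.

```go
package main

import (
	"fmt"
	"sync"
)

// Rule is a simplified, hypothetical stand-in for NodeReadinessRule.
type Rule struct {
	Name     string
	Selector map[string]string // node labels the rule applies to
}

// RuleCache models an in-memory rule cache that is filled one rule at a time.
type RuleCache struct {
	mu    sync.RWMutex
	rules map[string]Rule
}

func NewRuleCache() *RuleCache {
	return &RuleCache{rules: make(map[string]Rule)}
}

// Update stores a single rule, as a per-rule reconcile would.
func (c *RuleCache) Update(r Rule) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.rules[r.Name] = r
}

// ApplicableRules returns the names of cached rules whose selector matches
// the node labels. Rules that are not cached yet cannot be returned.
func (c *RuleCache) ApplicableRules(nodeLabels map[string]string) []string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	var out []string
	for _, r := range c.rules {
		if matches(r.Selector, nodeLabels) {
			out = append(out, r.Name)
		}
	}
	return out
}

// matches reports whether every selector key/value is present in labels.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	cache := NewRuleCache()
	node := map[string]string{"pool": "workers"}
	rule := Rule{Name: "network-readiness", Selector: map[string]string{"pool": "workers"}}

	// Node event processed before the rule has been reconciled.
	fmt.Println("cold cache, applicable rules:", len(cache.ApplicableRules(node)))

	// The rule reconcile catches up later.
	cache.Update(rule)
	fmt.Println("warm cache, applicable rules:", len(cache.ApplicableRules(node)))
}
```

The rule exists in the cluster the whole time; only the order of the two calls decides whether the node sees it.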
Why Bootstrap-Only Mode Is Specifically Vulnerable
The bootstrap completion path in evaluateRuleForNode is:
conditions not satisfied → taint added
conditions satisfied → taint removed → markBootstrapCompleted → annotation written
subsequent reconciles → isBootstrapCompleted returns true → skip
markBootstrapCompleted is reached only inside the
shouldRemoveTaint && currentlyHasTaint branch. If a node is first evaluated
while its applicable rule is absent from the cache:
- getApplicableRulesForNode returns an empty slice, so no taint is applied.
- The node's conditions are satisfied (CNI ready, security agent healthy, etc.).
- When the rule eventually enters the cache and the node is re-evaluated,
evaluateRuleForNode sees shouldRemoveTaint=true, currentlyHasTaint=false
and falls through to the default branch: no action, no call to
markBootstrapCompleted.
- The bootstrap annotation is never written. The rule re-evaluates the node on
every subsequent event indefinitely, but the intended gate is never enforced.
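The branch structure described above can be sketched as a small state machine. This is a simplified model under stated assumptions, not the real evaluateRuleForNode: NodeState, Action, and evaluate are hypothetical names, and shouldRemoveTaint is reduced to "conditions satisfied".

```go
package main

import "fmt"

// NodeState is a hypothetical, simplified view of what one evaluation sees.
type NodeState struct {
	ConditionsSatisfied bool
	HasTaint            bool
	BootstrapCompleted  bool // models the bootstrap completion annotation
}

// Action names the outcome of one evaluation.
type Action string

const (
	ActionSkip        Action = "skip"
	ActionAddTaint    Action = "add-taint"
	ActionRemoveTaint Action = "remove-taint-and-mark-completed"
	ActionNone        Action = "none"
)

// evaluate models a bootstrap-only rule: completion is recorded only when a
// taint that is currently present gets removed.
func evaluate(n *NodeState) Action {
	if n.BootstrapCompleted {
		return ActionSkip
	}
	shouldRemoveTaint := n.ConditionsSatisfied
	switch {
	case !shouldRemoveTaint && !n.HasTaint:
		n.HasTaint = true
		return ActionAddTaint
	case shouldRemoveTaint && n.HasTaint:
		n.HasTaint = false
		n.BootstrapCompleted = true
		return ActionRemoveTaint
	default:
		// Includes shouldRemoveTaint && !HasTaint: nothing happens and
		// completion is never recorded.
		return ActionNone
	}
}

func main() {
	// Normal path: the rule is cached before the node becomes ready.
	normal := &NodeState{}
	fmt.Println(evaluate(normal))
	normal.ConditionsSatisfied = true
	fmt.Println(evaluate(normal))
	fmt.Println(evaluate(normal))

	// Race path: the first evaluation with the rule present happens only
	// after the node's conditions are already satisfied.
	raced := &NodeState{ConditionsSatisfied: true}
	fmt.Println(evaluate(raced), "completed:", raced.BootstrapCompleted)
}
```

In the race path every later call returns the same "none" result, which matches the indefinite re-evaluation described in the last bullet.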
Steps to Reproduce / Concrete Scenario
- Cluster undergoes a rolling restart of the controller (upgrade, crash, or
leader re-election).
- Within the startup window a replacement node joins. Its CNI plugin reports
NetworkReady=True within ~8 seconds (typical for a pre-warmed agent).
- NodeReconciler picks up the node CREATE event; the network-readiness
bootstrap-only rule has not yet been reconciled by RuleReconciler (queue
still has earlier items ahead of it) and is absent from ruleCache.
- getApplicableRulesForNode returns empty, so no taint is applied.
- The rule is cached ~30 seconds later. Re-evaluation: NetworkReady=True,
no taint present → default branch → no action, no annotation.
- The node accepts workload scheduling without the readiness gate ever having
been enforced.
This reproduces on any controller restart in a cluster actively scaling out and
on every leader re-election when the replacement leader starts with a cold cache.
Expected Behavior
Nodes should never be evaluated by NodeReconciler against an empty or
incomplete cache when NodeReadinessRules already exist in the cluster.
Bootstrap-only rules must apply their taint reliably regardless of the ordering
in which the two controllers drain their queues on startup.
Root Cause
The ruleCache is populated entirely as a side effect of
RuleReconciler.Reconcile() via updateRuleCache. There is no mechanism to
seed it from the already-synced informer cache before NodeReconciler begins
processing events. mgr.Start() guarantees informers are synced before
controllers run, so all NodeReadinessRules are available locally the moment
both controllers start; the gap is that ruleCache is not initialized from
this data.
Possible Implementation Direction
A WarmCache method on RuleReadinessController that lists existing rules from
the already-synced informer cache (no live API call) and calls updateRuleCache
for each would close the gap:
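	// WarmCache seeds ruleCache from the already-synced informer cache so that
	// NodeReconciler does not evaluate nodes against a cold cache.
	func (r *RuleReadinessController) WarmCache(ctx context.Context) error {
		ruleList := &readinessv1alpha1.NodeReadinessRuleList{}
		// Served from the local informer cache, not a live API call.
		if err := r.List(ctx, ruleList); err != nil {
			return fmt.Errorf("failed to warm rule cache: %w", err)
		}
		for i := range ruleList.Items {
			// Index into the slice rather than taking the address of a loop variable.
			r.updateRuleCache(ctx, &ruleList.Items[i])
		}
		return nil
	}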
This can be invoked via a manager.Runnable registered with mgr.Add() that
executes after cache sync and completes before the manager signals readiness
and before NodeReconciler processes its first event.
Because the list is served from the local informer cache, cost is bounded by
rule count rather than node count and adds negligible startup latency. The
warm-up should be re-run on each leadership term so that leader re-election does
not reintroduce the window.
An alternative is a lazy fallback inside NodeReconciler.Reconcile: on the
first pass, detect that len(ruleCache) == 0 while rules exist in the informer
cache, and perform an inline warm-up before evaluating the node.
Acceptance Criteria
- A test (integration or e2e) covers nodes joining during the rule cache warm-up window and verifies that bootstrap-only taints are correctly applied.
- No node event is processed against an empty cache when NodeReadinessRules exist in the cluster.
- Cache warm-up cost is bounded by rule count, not node count (served from local informer cache, not live API).
- Leader re-election does not reintroduce the race window: the replacement leader warms its cache before declaring ready.
- No regression in continuous enforcement mode behavior.
Additional Context
The ruleCache also drives the DeletionTimestamp guard in
processNodeAgainstAllRules that prevents taint operations on rules being
deleted. A fully-warm cache on startup is equally important for that invariant.
The recent hardening of bootstrap annotation cleanup on rule deletion
(ea74209) addresses the deletion side of the bootstrap lifecycle; this issue
addresses the complementary startup-time gap in the same semantic guarantee.
/kind bug