Describe the issue
The node_readiness_evaluation_duration_seconds metric in
internal/metrics/metrics.go is defined as a plain prometheus.Histogram
with no labels. All other per-rule metrics (node_readiness_taint_operations_total,
node_readiness_failures_total, node_readiness_bootstrap_completed_total)
include a rule label, allowing operators to observe behaviour per rule.
Because the evaluation duration metric has no rule label, the durations of
all rules are aggregated into a single histogram, so operators cannot
identify which specific rule is slow or causing latency issues.
Expected behavior
node_readiness_evaluation_duration_seconds should be a HistogramVec
with a rule label, consistent with the other metrics.
File
internal/metrics/metrics.go and
internal/controller/nodereadinessrule_controller.go line 295
Are you able to fix this issue?
Yes (I will propose a PR)