Add k8s event reporting for exporter, client, and operator controllers#321
Add k8s event reporting for exporter, client, and operator controllers#321bkhizgiy wants to merge 6 commits intojumpstarter-dev:mainfrom
Conversation
✅ Deploy Preview for jumpstarter-docs ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
📝 WalkthroughWalkthroughAdded Kubernetes event recording across controllers: new Changes
Sequence Diagram(s)sequenceDiagram
participant Reconciler as Reconciler
participant Recorder as EventRecorder
participant API as Kubernetes API
Reconciler->>Reconciler: detect state change & capture prev state
Reconciler->>API: patch resource status
API-->>Reconciler: status patch success
Reconciler->>Recorder: emitEventf(reason, message)
alt Recorder present
Recorder->>API: Create/Patch Event
API->>API: persist event
else Recorder nil
Recorder-->>Reconciler: no-op
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
controller/internal/controller/lease_controller.go (1)
220-222: Consider gatingLeasePendingevent for scheduled leases to reduce potential noise.Unlike other pending/warning events in this controller (e.g., lines 277-280, 316-319), this
LeasePendingevent is not gated by a condition-reason change. WhileRequeueAfterprevents immediate re-reconciliation, external triggers (e.g., spec updates, informer re-syncs) could cause repeated emissions of this event for the same lease while waiting.For consistency with the noise-reduction pattern used elsewhere:
♻️ Proposed gating for scheduled lease pending event
if lease.Spec.BeginTime.After(now) { // Requested BeginTime is in the future, wait until then waitDuration := lease.Spec.BeginTime.Sub(now) logger.Info("Lease is scheduled for the future, waiting", "lease", lease.Name, "requestedBeginTime", lease.Spec.BeginTime, "waitDuration", waitDuration) + prevPendingReason := leaseConditionReason(lease, jumpstarterdevv1alpha1.LeaseConditionTypePending) + lease.SetStatusPending("Scheduled", "Waiting for scheduled start time") + if leaseConditionReason(lease, jumpstarterdevv1alpha1.LeaseConditionTypePending) != prevPendingReason { r.emitEventf(lease, corev1.EventTypeNormal, "LeasePending", "Lease waiting for scheduled start: client=%s beginTime=%s waitDuration=%s", lease.Spec.ClientRef.Name, lease.Spec.BeginTime.String(), waitDuration.String()) + } result.RequeueAfter = waitDuration return nil }Note: This assumes a
SetStatusPendingcall should be added here to set the condition, similar to other pending paths. If the condition isn't being set here, the gating pattern won't work and this may be intentional behavior.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controller/internal/controller/lease_controller.go` around lines 220 - 222, The LeasePending event emission for scheduled leases should be gated the same way other pending events are to avoid noise: call the status updater (e.g., SetStatusPending or equivalent) for the lease with the appropriate reason/message first, and only call r.emitEventf(lease, corev1.EventTypeNormal, "LeasePending", ...) if that status update indicates the condition/reason actually changed; integrate this around the existing scheduled-lease branch that uses RequeueAfter so repeated reconciles (spec updates/informer resyncs) don't repeatedly emit LeasePending.controller/internal/controller/event_emitters_test.go (1)
57-80: Consider adding a test case forConditionUnknownstatus.The test covers
ConditionTrue(returns reason) andConditionFalse(returns empty string), but doesn't verify behavior whenStatusismetav1.ConditionUnknown. This is a valid condition status and should be tested to confirm it also returns an empty string.🧪 Optional: Add test case for ConditionUnknown
// False-status condition should not be treated as active. lease.Status.Conditions[0].Status = metav1.ConditionFalse if got := leaseConditionReason(lease, jumpstarterdevv1alpha1.LeaseConditionTypePending); got != "" { t.Fatalf("expected empty reason for false condition, got %q", got) } + + // Unknown-status condition should not be treated as active. + lease.Status.Conditions[0].Status = metav1.ConditionUnknown + if got := leaseConditionReason(lease, jumpstarterdevv1alpha1.LeaseConditionTypePending); got != "" { + t.Fatalf("expected empty reason for unknown condition, got %q", got) + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controller/internal/controller/event_emitters_test.go` around lines 57 - 80, Extend TestLeaseConditionReason to include a case where the condition Status is set to metav1.ConditionUnknown and assert leaseConditionReason returns an empty string; specifically, after setting lease.Status.Conditions[0].Status = metav1.ConditionFalse add (or modify) a step that sets it to metav1.ConditionUnknown and verify got == "" using the same lease and leaseConditionReason function to ensure Unknown is treated like False.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@controller/deploy/operator/test/e2e/e2e_test.go`:
- Around line 625-640: The test currently checks for any "ExporterOnline" event
for the Exporter object (hasEventReasonForObject(dynamicTestNamespace,
"Exporter", exporterName, "ExporterOnline")), which can match an earlier event
and produce a false positive; before the "setting exporter back to online state"
step, record a baseline (e.g., count/timestamp) of matching events for the
Exporter via hasEventReasonForObject or a new helper that returns event
count/lastTimestamp, then after you update current.Status and call
k8sClient.Status().Update(...) assert that a new event was emitted by verifying
the count increased or that the lastTimestamp is newer than the baseline for
exporterName in dynamicTestNamespace; update the Eventually assertion to check
for that delta rather than existence only.
---
Nitpick comments:
In `@controller/internal/controller/event_emitters_test.go`:
- Around line 57-80: Extend TestLeaseConditionReason to include a case where the
condition Status is set to metav1.ConditionUnknown and assert
leaseConditionReason returns an empty string; specifically, after setting
lease.Status.Conditions[0].Status = metav1.ConditionFalse add (or modify) a step
that sets it to metav1.ConditionUnknown and verify got == "" using the same
lease and leaseConditionReason function to ensure Unknown is treated like False.
In `@controller/internal/controller/lease_controller.go`:
- Around line 220-222: The LeasePending event emission for scheduled leases
should be gated the same way other pending events are to avoid noise: call the
status updater (e.g., SetStatusPending or equivalent) for the lease with the
appropriate reason/message first, and only call r.emitEventf(lease,
corev1.EventTypeNormal, "LeasePending", ...) if that status update indicates the
condition/reason actually changed; integrate this around the existing
scheduled-lease branch that uses RequeueAfter so repeated reconciles (spec
updates/informer resyncs) don't repeatedly emit LeasePending.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f6c6725c-2865-4434-a18e-bb8177a60b0d
📒 Files selected for processing (11)
controller/cmd/main.gocontroller/deploy/helm/jumpstarter/charts/jumpstarter-controller/templates/rbac/role.yamlcontroller/deploy/operator/cmd/main.gocontroller/deploy/operator/internal/controller/jumpstarter/event_emitters_test.gocontroller/deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.gocontroller/deploy/operator/internal/controller/jumpstarter/status.gocontroller/deploy/operator/test/e2e/e2e_test.gocontroller/internal/controller/client_controller.gocontroller/internal/controller/event_emitters_test.gocontroller/internal/controller/exporter_controller.gocontroller/internal/controller/lease_controller.go
There was a problem hiding this comment.
♻️ Duplicate comments (1)
controller/deploy/operator/test/e2e/e2e_test.go (1)
545-561:⚠️ Potential issue | 🟠 MajorAssert on a new
ControllerDeploymentUpdatedevent.This can still pass on a stale event if the same
Jumpstarteralready emittedControllerDeploymentUpdatedduring an earlier reconcile. Capture the baseline beforeUpdateand assert the count increases after the spec change.Suggested change
It("should emit controller update events when controller spec changes", func() { + updatedEventsBefore := countEventReasonForObject( + dynamicTestNamespace, + "Jumpstarter", + "jumpstarter", + "ControllerDeploymentUpdated", + ) + By("updating Jumpstarter controller replicas to trigger a deployment update") jumpstarter := &operatorv1alpha1.Jumpstarter{} err := k8sClient.Get(ctx, types.NamespacedName{ Name: "jumpstarter", Namespace: dynamicTestNamespace, @@ By("verifying ControllerDeploymentUpdated event was emitted on Jumpstarter resource") Eventually(func(g Gomega) { - eventList := &corev1.EventList{} - listErr := k8sClient.List(ctx, eventList, client.InNamespace(dynamicTestNamespace)) - g.Expect(listErr).NotTo(HaveOccurred()) - - found := false - for _, event := range eventList.Items { - if event.InvolvedObject.Kind == "Jumpstarter" && - event.InvolvedObject.Name == "jumpstarter" && - event.Reason == "ControllerDeploymentUpdated" { - found = true - break - } - } - g.Expect(found).To(BeTrue(), "expected ControllerDeploymentUpdated event for jumpstarter") + g.Expect(countEventReasonForObject( + dynamicTestNamespace, + "Jumpstarter", + "jumpstarter", + "ControllerDeploymentUpdated", + )).To(BeNumerically(">", updatedEventsBefore), + "expected a new ControllerDeploymentUpdated event for jumpstarter") }, 2*time.Minute).Should(Succeed()) })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controller/deploy/operator/test/e2e/e2e_test.go` around lines 545 - 561, The test currently checks for any ControllerDeploymentUpdated event and can pass on stale events; before performing the spec Update, list events (using k8sClient.List with ctx and client.InNamespace(dynamicTestNamespace)) and count events whose InvolvedObject.Kind == "Jumpstarter", InvolvedObject.Name == "jumpstarter" and Reason == "ControllerDeploymentUpdated" to establish a baseline, then perform the Update, re-list and assert that the new count is greater than the baseline (instead of just asserting existence); update the block around eventList/event scanning in the test to capture baselineCount, run the change, then Eventually check that currentCount > baselineCount.
🧹 Nitpick comments (1)
controller/deploy/operator/test/e2e/e2e_test.go (1)
2339-2359: Surface event-list failures from the helper.Returning
false/0onk8sClient.Listerrors turns API/RBAC problems into slow “event not found” timeouts in the new specs. Returning the error, or asserting inside the helper, makes these failures much easier to diagnose.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controller/deploy/operator/test/e2e/e2e_test.go` around lines 2339 - 2359, The helpers swallow k8sClient.List errors causing API/RBAC failures to appear as missing events; update countEventReasonForObject to surface the List error instead of returning 0 (e.g., change signature to return (int, error) and return the List error), and update hasEventReasonForObject to propagate/handle that error (e.g., change to return (bool, error) or assert/fail the test immediately). Ensure references to k8sClient.List, countEventReasonForObject, and hasEventReasonForObject are updated wherever they are called so tests stop masking List failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@controller/deploy/operator/test/e2e/e2e_test.go`:
- Around line 545-561: The test currently checks for any
ControllerDeploymentUpdated event and can pass on stale events; before
performing the spec Update, list events (using k8sClient.List with ctx and
client.InNamespace(dynamicTestNamespace)) and count events whose
InvolvedObject.Kind == "Jumpstarter", InvolvedObject.Name == "jumpstarter" and
Reason == "ControllerDeploymentUpdated" to establish a baseline, then perform
the Update, re-list and assert that the new count is greater than the baseline
(instead of just asserting existence); update the block around eventList/event
scanning in the test to capture baselineCount, run the change, then Eventually
check that currentCount > baselineCount.
---
Nitpick comments:
In `@controller/deploy/operator/test/e2e/e2e_test.go`:
- Around line 2339-2359: The helpers swallow k8sClient.List errors causing
API/RBAC failures to appear as missing events; update countEventReasonForObject
to surface the List error instead of returning 0 (e.g., change signature to
return (int, error) and return the List error), and update
hasEventReasonForObject to propagate/handle that error (e.g., change to return
(bool, error) or assert/fail the test immediately). Ensure references to
k8sClient.List, countEventReasonForObject, and hasEventReasonForObject are
updated wherever they are called so tests stop masking List failures.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ea045af4-763f-4114-9de6-5b724c2086a7
📒 Files selected for processing (2)
controller/deploy/operator/test/e2e/e2e_suite_test.gocontroller/deploy/operator/test/e2e/e2e_test.go
|
Ohh this is so cool!!! :D I think we may want to be conservative with what we post to the events, I was looking at a couple of live clusters and I don't see a lot of activity. Here is my proposal: I would keep out anything related to leases, which is the noisiest. So:
WDYT? |
mangelajo
left a comment
There was a problem hiding this comment.
see my comment about filtering some events
|
I was chatting with gemini to understand this better, look at section "3" Gemini said |
Cool, yeah that totally makes sense. We don’t want to add too much noise to the events or it kind of defeats the purpose, so I’m in favor of removing the lease-related ones. |
Wire recorders for client/exporter controllers and emit lifecycle events after successful status updates. Assisted-by: OpenAI Codex Signed-off-by: Bella Khizgiyaev <bkhizgiy@redhat.com>
Emit lease lifecycle events and gate repeated pending/unsatisfiable events to reduce noise. Assisted-by: OpenAI Codex Signed-off-by: Bella Khizgiyaev <bkhizgiy@redhat.com>
Emit Jumpstarter operator deployment and readiness events with safer helper-based emission. Assisted-by: OpenAI Codex Signed-off-by: Bella Khizgiyaev <bkhizgiy@redhat.com>
Cover nil-safe event emitters and lease condition-reason helper behavior in controller and operator packages. Assisted-by: OpenAI Codex Signed-off-by: Bella Khizgiyaev <bkhizgiy@redhat.com>
Validate exporter offline/online transitions and lease assignment/ready event emission in e2e scenarios. Assisted-by: OpenAI Codex Signed-off-by: Bella Khizgiyaev <bkhizgiy@redhat.com>
Keep event reporting focused on operator and exporter status transitions by removing lease-related event emission and lease-event e2e assertions. Assisted-by: OpenAI Codex Signed-off-by: Bella Khizgiyaev <bkhizgiy@redhat.com>
There was a problem hiding this comment.
🧹 Nitpick comments (2)
controller/internal/controller/exporter_controller.go (1)
100-124: Consider simplifying event messages by removing redundant object identifiers.The event messages include
exporter=%swithexporter.Name, but Kubernetes events already have aninvolvedObjectreference pointing to the exporter. This makes the name redundant in the message body.That said, having the name inline can aid log aggregation workflows, so this is a minor stylistic preference.
♻️ Optional: Simplified event messages
if !prevOnline && newOnline { r.emitEventf(&exporter, corev1.EventTypeNormal, "ExporterOnline", - "Exporter is online: exporter=%s lastSeen=%s", - exporter.Name, exporter.Status.LastSeen.String()) + "Exporter is online (lastSeen=%s)", exporter.Status.LastSeen.String()) } else if prevOnline && !newOnline { if exporter.Status.ExporterStatusValue == jumpstarterdevv1alpha1.ExporterStatusOffline && time.Since(exporter.Status.LastSeen.Time) <= time.Minute { r.emitEventf(&exporter, corev1.EventTypeWarning, "ExporterOffline", - "Exporter reported offline (graceful shutdown): exporter=%s message=%s", - exporter.Name, exporter.Status.StatusMessage) + "Graceful shutdown: %s", exporter.Status.StatusMessage) } else { r.emitEventf(&exporter, corev1.EventTypeWarning, "ExporterOffline", - "Exporter went offline (connection lost): exporter=%s lastSeen=%s", - exporter.Name, exporter.Status.LastSeen.String()) + "Connection lost (lastSeen=%s)", exporter.Status.LastSeen.String()) } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controller/internal/controller/exporter_controller.go` around lines 100 - 124, The event messages currently include redundant "exporter=%s" with exporter.Name in the calls to r.emitEventf (see emitEventf usages inside the Exporter status-change block); update those message formats to remove the "exporter=%s" fragments and corresponding exporter.Name parameters for all four events ("ExporterRegistered", "ExporterUnregistered", "ExporterOnline", "ExporterOffline") so the involvedObject reference is relied on instead—ensure you only change the message strings and argument lists passed to r.emitEventf (and keep the rest of the logic, conditions, and fields like lastSeen and StatusMessage intact).controller/internal/controller/event_emitters_test.go (1)
30-45: Consider verifying event content and adding symmetric test for ClientReconciler.The test confirms an event is emitted but only checks it's non-empty. Asserting on the event string content (e.g., contains "ExporterOnline" and "test-exporter") would catch regressions in event formatting.
Also, for symmetry, consider adding a
TestClientEmitEventfWritesEventtest to validateClientReconcilerevent emission with a real recorder.♻️ Optional: Enhanced event content assertion
select { case event := <-recorder.Events: if event == "" { t.Fatal("expected non-empty event payload") } + if !strings.Contains(event, "ExporterOnline") { + t.Errorf("expected event to contain reason 'ExporterOnline', got: %s", event) + } + if !strings.Contains(event, "test-exporter") { + t.Errorf("expected event to contain 'test-exporter', got: %s", event) + } case <-time.After(2 * time.Second): t.Fatal("expected event to be emitted") }Note: This would require adding
"strings"to the imports.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controller/internal/controller/event_emitters_test.go` around lines 30 - 45, Update TestExporterEmitEventfWritesEvent to assert the emitted event string contains the expected pieces by checking recorder.Events value includes "ExporterOnline" and "test-exporter" (use the existing ExporterReconciler and emitEventf). Also add a symmetric TestClientEmitEventfWritesEvent that constructs a ClientReconciler with a record.NewFakeRecorder, calls its emitEventf with a Client object and similar args, and asserts the recorded event contains the expected event type/name and client identifier. Add the "strings" import and use strings.Contains to perform the assertions against the value read from recorder.Events.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@controller/internal/controller/event_emitters_test.go`:
- Around line 30-45: Update TestExporterEmitEventfWritesEvent to assert the
emitted event string contains the expected pieces by checking recorder.Events
value includes "ExporterOnline" and "test-exporter" (use the existing
ExporterReconciler and emitEventf). Also add a symmetric
TestClientEmitEventfWritesEvent that constructs a ClientReconciler with a
record.NewFakeRecorder, calls its emitEventf with a Client object and similar
args, and asserts the recorded event contains the expected event type/name and
client identifier. Add the "strings" import and use strings.Contains to perform
the assertions against the value read from recorder.Events.
In `@controller/internal/controller/exporter_controller.go`:
- Around line 100-124: The event messages currently include redundant
"exporter=%s" with exporter.Name in the calls to r.emitEventf (see emitEventf
usages inside the Exporter status-change block); update those message formats to
remove the "exporter=%s" fragments and corresponding exporter.Name parameters
for all four events ("ExporterRegistered", "ExporterUnregistered",
"ExporterOnline", "ExporterOffline") so the involvedObject reference is relied
on instead—ensure you only change the message strings and argument lists passed
to r.emitEventf (and keep the rest of the logic, conditions, and fields like
lastSeen and StatusMessage intact).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a11edd89-6a96-4862-809e-8077b4dc67f8
📒 Files selected for processing (5)
controller/cmd/main.gocontroller/deploy/operator/test/e2e/e2e_test.gocontroller/internal/controller/event_emitters_test.gocontroller/internal/controller/exporter_controller.gocontroller/internal/controller/lease_controller.go
🚧 Files skipped from review as they are similar to previous changes (3)
- controller/cmd/main.go
- controller/internal/controller/lease_controller.go
- controller/deploy/operator/test/e2e/e2e_test.go
There was a problem hiding this comment.
🧹 Nitpick comments (1)
controller/deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)
273-282: Consider extracting event reason strings as constants.The event reason strings (e.g.,
"ControllerDeploymentUpdated") are inline literals here and also hardcoded in the e2e tests. Defining them as package-level constants would prevent silent mismatches if one location is updated but not the other.Example constant definition
const ( EventReasonControllerDeploymentCreated = "ControllerDeploymentCreated" EventReasonControllerDeploymentUpdated = "ControllerDeploymentUpdated" // ... other event reasons )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controller/deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go` around lines 273 - 282, Extract the inline event reason literals into package-level constants and replace the hardcoded strings in the emitEventf calls; define constants like EventReasonControllerDeploymentCreated and EventReasonControllerDeploymentUpdated and use them in the switch cases where emitEventf(jumpstarter, corev1.EventTypeNormal, ...) is invoked (the lines referencing existingDeployment.Name / existingDeployment.Namespace), and update any tests that reference the string literals to use the new constants to avoid mismatches.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@controller/deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go`:
- Around line 273-282: Extract the inline event reason literals into
package-level constants and replace the hardcoded strings in the emitEventf
calls; define constants like EventReasonControllerDeploymentCreated and
EventReasonControllerDeploymentUpdated and use them in the switch cases where
emitEventf(jumpstarter, corev1.EventTypeNormal, ...) is invoked (the lines
referencing existingDeployment.Name / existingDeployment.Namespace), and update
any tests that reference the string literals to use the new constants to avoid
mismatches.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 628e2382-fdc2-49a6-ae78-83d744cfa0c9
📒 Files selected for processing (12)
controller/cmd/main.gocontroller/deploy/helm/jumpstarter/charts/jumpstarter-controller/templates/rbac/role.yamlcontroller/deploy/operator/cmd/main.gocontroller/deploy/operator/internal/controller/jumpstarter/event_emitters_test.gocontroller/deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.gocontroller/deploy/operator/internal/controller/jumpstarter/status.gocontroller/deploy/operator/test/e2e/e2e_suite_test.gocontroller/deploy/operator/test/e2e/e2e_test.gocontroller/internal/controller/client_controller.gocontroller/internal/controller/event_emitters_test.gocontroller/internal/controller/exporter_controller.gocontroller/internal/controller/lease_controller.go
💤 Files with no reviewable changes (1)
- controller/internal/controller/lease_controller.go
🚧 Files skipped from review as they are similar to previous changes (4)
- controller/deploy/operator/internal/controller/jumpstarter/status.go
- controller/deploy/operator/test/e2e/e2e_suite_test.go
- controller/internal/controller/client_controller.go
- controller/deploy/operator/cmd/main.go
This PR adds Kubernetes event reporting across Jumpstarter controllers to improve runtime observability, which can be retrieved with
oc get events,oc describe ....Events by Category
Exporter Controller (Exporter resources)
Client Controller (Client resources)
Operator Controller (Jumpstarter resources)
Related to #102
Summary by CodeRabbit
New Features
Tests