Skip to content

STOR-2873: Implement metrics for mismatch symlinks#597

Open
gnufied wants to merge 7 commits intoopenshift:mainfrom
gnufied:implement-metrics-for-mismatch-symlinks
Open

STOR-2873: Implement metrics for mismatch symlinks#597
gnufied wants to merge 7 commits intoopenshift:mainfrom
gnufied:implement-metrics-for-mismatch-symlinks

Conversation

@gnufied
Copy link
Member

@gnufied gnufied commented Mar 23, 2026

@coderabbitai
Copy link

coderabbitai bot commented Mar 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3da1adbc-cfed-46f6-b6f7-a8f52138861d

📥 Commits

Reviewing files that changed from the base of the PR and between c662ab9 and 2ef63a1.

📒 Files selected for processing (5)
  • assets/templates/localmetrics/prometheus-rule.yaml
  • cmd/diskmaker-manager/manager.go
  • go.mod
  • pkg/localmetrics/custom_metrics_collector.go
  • pkg/localmetrics/custom_metrics_collector_test.go
✅ Files skipped from review due to trivial changes (3)
  • go.mod
  • pkg/localmetrics/custom_metrics_collector_test.go
  • pkg/localmetrics/custom_metrics_collector.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • assets/templates/localmetrics/prometheus-rule.yaml

Walkthrough

Adds LocalVolumeDeviceLink (LVDL) management: new internal handler and metrics collector, status reconciliation for ignored/rejected devices, PV provisioning refactor to structured args with device-link creation/status updates, CRD/RBAC/docs updates, many unit and e2e tests, and Prometheus alert rules.

Changes

Cohort / File(s) Summary
API & CRD
api/v1/localvolumedevicelink_types.go, config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml
Made status.preferredLinkTarget/status.validLinkTargets optional (removed minLength/required), added init() scheme registration for LVDL types.
Device Link Implementation
pkg/internal/device_link_handler.go, pkg/internal/device_link_handler_test.go
New DeviceLinkHandler type with Create and ApplyStatus to manage LVDL objects and status; comprehensive unit tests covering creation, status updates, owner refs, and failure cases.
Internal Types & Disk Utilities
pkg/internal/disk_location.go, pkg/internal/diskutil.go, pkg/internal/diskutil_test.go
Added internal.DiskLocation; switched command execution to utilexec via CmdExecutor; added GetUncachedPathID() and centralized symlink-matching; updated tests to use FakeExec.
Provisioning Refactor
pkg/common/provisioner_utils.go, pkg/diskmaker/controllers/lv/create_pv_test.go, pkg/diskmaker/controllers/lvset/create_pv_test.go
Replaced positional CreateLocalPV with CreateLocalPVArgs plus ctx; CreateLocalPV now uses BlockDevice.KName, creates/applies LVDL, and updated tests/call sites accordingly.
Reconciliation: LV & LVSet
pkg/diskmaker/controllers/lv/reconcile.go, pkg/diskmaker/controllers/lv/reconcile_test.go, pkg/diskmaker/controllers/lvset/reconcile.go, pkg/diskmaker/controllers/lvset/reconcile_test.go, pkg/diskmaker/controllers/lvset/device_age_test.go
Switched to internal.DiskLocation, collect rejected devices, added processRejectedDevicesForDeviceLinks to reconcile/update LVDL status for ignored/rejected devices, adjusted signatures to accept ctx, and updated tests (including status-subresource fake client).
Discovery & Diskmaker Tests
pkg/diskmaker/discovery/discovery_test.go, pkg/diskmaker/controllers/lv/reconcile_test.go
Removed subprocess helper; introduced FakeExec-based executor injection for lsblk/blkid in tests; updated tests to use new executor and disk-location types.
Metrics & Manager
pkg/localmetrics/custom_metrics_collector.go, pkg/localmetrics/custom_metrics_collector_test.go, cmd/diskmaker-manager/manager.go, assets/templates/localmetrics/prometheus-rule.yaml
Added DeviceLinkCollector Prometheus collector and tests; register collector in manager; added two alert rules for missing stable path and device-link mismatch.
Events & RBAC
pkg/diskmaker/event_reporter.go, config/manifests/stable/local-storage-operator.clusterserviceversion.yaml
Added ErrorCreatingLVDL event reason; extended operator RBAC to include localvolumedevicelinks and localvolumedevicelinks/status.
LocalVolume Controller Minor Change
pkg/controllers/localvolume/localvolume_controller.go
Always append DiskMaker DaemonSet generation status to children (removed prior nil conditional).
E2E Tests & Docs
test/e2e/localvolume_test.go, test/e2e/localvolumeset_test.go, HACKING.md
E2E tests now wait for and validate LVDL existence/status and filesystem UUIDs, verify LVDL deletion; HACKING.md added steps to override diskmaker/operator image post-deploy.
Module Metadata
go.mod
Prometheus client_model dependency moved from indirect to direct require (removed // indirect).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 23, 2026
@openshift-ci openshift-ci bot requested review from jsafrane and tsmetana March 23, 2026 14:54
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 23, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 23, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/common/provisioner_utils.go (1)

215-215: ⚠️ Potential issue | 🟡 Minor

Use the provided ctx instead of context.TODO().

The function receives ctx as a parameter but uses context.TODO() for controllerutil.CreateOrUpdate. This could miss cancellation signals.

🐛 Proposed fix
-	opRes, err := controllerutil.CreateOrUpdate(context.TODO(), client, existingPV, func() error {
+	opRes, err := controllerutil.CreateOrUpdate(ctx, client, existingPV, func() error {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/common/provisioner_utils.go` at line 215, The call to
controllerutil.CreateOrUpdate uses context.TODO() which ignores the function's
provided cancellation/timeout; update the call to pass the function parameter
ctx instead (i.e., replace context.TODO() with ctx) so
controllerutil.CreateOrUpdate(client, existingPV, ...) receives the passed-in
context and cancellation is propagated; ensure this change is applied where
controllerutil.CreateOrUpdate is invoked with client and existingPV.
🧹 Nitpick comments (6)
pkg/diskmaker/controllers/lvset/device_age_test.go (1)

69-71: Assert the new rejected-device bucket too.

This test adapts to the new three-value return signature, but it still discards the new classification outcome. An explicit expectation here would keep regressions in getValidDevices from slipping through.

Proposed test update
-		validDevices, delayedDevices, _ := r.getValidDevices(nil, blockDevices)
+		validDevices, delayedDevices, rejectedDevices := r.getValidDevices(nil, blockDevices)
 		assert.Lenf(t, validDevices, expectedValid[run], "validDevices")
 		assert.Lenf(t, delayedDevices, len(blockDevices)-expectedValid[run], "delayedDevices")
+		assert.Empty(t, rejectedDevices, "rejectedDevices")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lvset/device_age_test.go` around lines 69 - 71,
Test currently ignores the third return value from r.getValidDevices; update the
call to capture the rejected bucket (e.g., validDevices, delayedDevices,
rejectedDevices := r.getValidDevices(nil, blockDevices)) and add an assertion
for its length (e.g., assert.Lenf(t, rejectedDevices, expectedRejected[run],
"rejectedDevices")). Add or compute expectedRejected for each run (recommended:
introduce an expectedRejected slice alongside expectedValid) so the test
verifies all three classification outcomes from getValidDevices.
pkg/internal/diskutil.go (1)

19-23: Consider consolidating the new test seam instead of another package-global override.

CmdExecutor now joins FilePathGlob and FilePathEvalSymLinks as process-wide mutable dependencies. That works, but it also makes command stubbing leak across tests and become order-sensitive once any related test runs in parallel. A small injected dependency bundle would keep the abstraction without adding more shared global state.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/internal/diskutil.go` around lines 19 - 23, CmdExecutor, FilePathGlob and
FilePathEvalSymLinks are introduced as package-global mutable seams which leak
state across tests; replace them with a single injected dependency bundle (e.g.,
a small struct like diskDeps or DiskUtil with fields CmdExecutor
utilexec.Interface, FilePathGlob func(string) ([]string, error),
FilePathEvalSymLinks func(string) (string, error)) and refactor functions that
currently read those globals to accept that deps struct (or be methods on a
DiskUtil instance) so production code can use a package-default instance while
tests construct isolated instances with stubbed executors/glob/eval functions,
and update constructors and tests to inject mocks rather than mutating package
globals.
pkg/internal/diskutil_test.go (1)

26-44: Keep coverage for blkid's exit-status-2 path.

GetDeviceFSMap treats blkid exit code 2 as the normal blank-disk case, but newFakeExecutor can only model successful CombinedOutput calls. Please let the fake script (output, err) pairs and add one case for that path, otherwise regressions around unformatted devices will slip past both these tests and the downstream discovery tests.

As per coding guidelines, Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/internal/diskutil_test.go` around lines 26 - 44, newFakeExecutor
currently only models successful CombinedOutput calls, so tests can't simulate
blkid returning exit status 2 which GetDeviceFSMap treats as a blank-disk path;
update newFakeExecutor to accept pairs of (output string, err error) (or a small
struct slice) instead of only outputs, populate each
testingexec.FakeCmd.CombinedOutputScript to return the provided ([]byte(output),
nil or error) and ensure one test case supplies an error that represents blkid
exit status 2 (the same kind of error GetDeviceFSMap expects) so the
blkid-exit-2 path is covered; adjust any call sites that create newFakeExecutor
to pass the (output, err) pairs accordingly.
pkg/common/symlink_utils.go (1)

52-67: Keep the new symlink scan mockable.

This helper goes straight to filepath.Glob, unlike the rest of the disk-path utilities that already route through internal.FilePathGlob / internal.PathEvalsToDiskLabel. That makes the new LVSet symlink lookup harder to unit-test and easier to drift from the existing filesystem hooks.

♻️ Suggested change
 func GetSymlinkedForCurrentSC(symlinkDir string, currentDevice internal.BlockDevice) (string, error) {
-	paths, err := filepath.Glob(filepath.Join(symlinkDir, "/*"))
+	paths, err := internal.FilePathGlob(filepath.Join(symlinkDir, "*"))
 	if err != nil {
 		return "", err
 	}

As per coding guidelines, Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/common/symlink_utils.go` around lines 52 - 67, GetSymlinkedForCurrentSC
currently calls filepath.Glob directly which breaks test hooks; change it to use
the existing testable wrapper internal.FilePathGlob(symlinkDir, "/*") (or the
equivalent internal.FilePathGlob call used elsewhere) while keeping the rest of
the logic (iterating paths, calling internal.PathEvalsToDiskLabel(path,
currentDevice.KName), returning filepath.Base(path) on match and propagating
errors) so the symlink scan remains mockable and consistent with other disk-path
utilities.
test/e2e/localvolume_test.go (1)

283-309: Consider using sets.New for consistency.

This function manually builds a map while eventuallyFindLVDLsForPVs uses sets.New. Consider using the same pattern for consistency.

♻️ Suggested refactor for consistency
 func verifyLVDLFilesystemUUIDForPVs(t *testing.T, f *framework.Framework, namespace string, pvNames []string) {
 	matcher := gomega.NewWithT(t)
-	pvNameSet := map[string]struct{}{}
-	for _, pvName := range pvNames {
-		pvNameSet[pvName] = struct{}{}
-	}
+	pvNameSet := sets.New(pvNames...)
 	matcher.Eventually(func() bool {
 		lvdlList := &localv1.LocalVolumeDeviceLinkList{}
 		err := f.Client.List(goctx.TODO(), lvdlList, client.InNamespace(namespace))
 		if err != nil {
 			t.Logf("error listing LocalVolumeDeviceLink objects while verifying filesystemUUID: %v", err)
 			return false
 		}
 		uuidFoundCount := 0
 		for _, lvdl := range lvdlList.Items {
-			if _, ok := pvNameSet[lvdl.Spec.PersistentVolumeName]; !ok {
+			if !pvNameSet.Has(lvdl.Spec.PersistentVolumeName) {
 				continue
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/localvolume_test.go` around lines 283 - 309, Replace the manual
map-based set in verifyLVDLFilesystemUUIDForPVs with k8s sets for consistency
with eventuallyFindLVDLsForPVs: import k8s.io/apimachinery/pkg/util/sets, create
pvNameSet using sets.NewString(pvNames...), use
pvNameSet.Has(lvdl.Spec.PersistentVolumeName) instead of map lookup, and compare
uuidFoundCount to pvNameSet.Len() rather than len(pvNameSet); remove the manual
population loop and adjust references accordingly.
pkg/diskmaker/controllers/lvset/reconcile.go (1)

267-302: Consider extracting shared processRejectedDevicesForDeviceLinks logic.

This function is nearly identical to the one in pkg/diskmaker/controllers/lv/reconcile.go. Consider extracting to a shared helper in pkg/common or pkg/internal to reduce duplication.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lvset/reconcile.go` around lines 267 - 302, Extract
the duplicated logic in processRejectedDevicesForDeviceLinks into a shared
helper (e.g., pkg/common or pkg/internal) that accepts context, the
node/namespace/runtimeConfig pieces needed, symLinkDir, storageClassName, lvset
and the rejectedDevices slice and returns errors/events; reuse helper from both
LocalVolumeSetReconciler.processRejectedDevicesForDeviceLinks and the
near-identical function in pkg/diskmaker/controllers/lv/reconcile.go. Inside the
helper keep the same calls to common.GetSymlinkedForCurrentSC,
common.GetSymLinkSourceAndTarget, common.GeneratePVName and
internal.NewDeviceLinkHandler/ApplyStatus and surface any failures so callers
can call r.eventReporter.Report/newDiskEvent and klog.ErrorS as before; update
both reconciler methods to call the new helper and remove duplicated loops and
logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@HACKING.md`:
- Around line 9-12: The heading "Overriding diskmaker image or operator image
directly after deployment" is misleading because the steps only show deleting
the subscription object and editing the CSV to override the diskmaker image;
either update the heading to only mention "diskmaker image" or add the missing
operator-image instructions: describe how to edit the CSV object or the operator
Deployment (or ClusterServiceVersion) to change the operator image after
removing the subscription, include exact kubectl commands and which field to
patch (e.g., spec.install.spec.template.spec.containers[*].image) for the
operator, and keep the existing steps for "subscription object" and "CSV object"
referring to the "diskmaker image" override.

---

Outside diff comments:
In `@pkg/common/provisioner_utils.go`:
- Line 215: The call to controllerutil.CreateOrUpdate uses context.TODO() which
ignores the function's provided cancellation/timeout; update the call to pass
the function parameter ctx instead (i.e., replace context.TODO() with ctx) so
controllerutil.CreateOrUpdate(client, existingPV, ...) receives the passed-in
context and cancellation is propagated; ensure this change is applied where
controllerutil.CreateOrUpdate is invoked with client and existingPV.

---

Nitpick comments:
In `@pkg/common/symlink_utils.go`:
- Around line 52-67: GetSymlinkedForCurrentSC currently calls filepath.Glob
directly which breaks test hooks; change it to use the existing testable wrapper
internal.FilePathGlob(symlinkDir, "/*") (or the equivalent internal.FilePathGlob
call used elsewhere) while keeping the rest of the logic (iterating paths,
calling internal.PathEvalsToDiskLabel(path, currentDevice.KName), returning
filepath.Base(path) on match and propagating errors) so the symlink scan remains
mockable and consistent with other disk-path utilities.

In `@pkg/diskmaker/controllers/lvset/device_age_test.go`:
- Around line 69-71: Test currently ignores the third return value from
r.getValidDevices; update the call to capture the rejected bucket (e.g.,
validDevices, delayedDevices, rejectedDevices := r.getValidDevices(nil,
blockDevices)) and add an assertion for its length (e.g., assert.Lenf(t,
rejectedDevices, expectedRejected[run], "rejectedDevices")). Add or compute
expectedRejected for each run (recommended: introduce an expectedRejected slice
alongside expectedValid) so the test verifies all three classification outcomes
from getValidDevices.

In `@pkg/diskmaker/controllers/lvset/reconcile.go`:
- Around line 267-302: Extract the duplicated logic in
processRejectedDevicesForDeviceLinks into a shared helper (e.g., pkg/common or
pkg/internal) that accepts context, the node/namespace/runtimeConfig pieces
needed, symLinkDir, storageClassName, lvset and the rejectedDevices slice and
returns errors/events; reuse helper from both
LocalVolumeSetReconciler.processRejectedDevicesForDeviceLinks and the
near-identical function in pkg/diskmaker/controllers/lv/reconcile.go. Inside the
helper keep the same calls to common.GetSymlinkedForCurrentSC,
common.GetSymLinkSourceAndTarget, common.GeneratePVName and
internal.NewDeviceLinkHandler/ApplyStatus and surface any failures so callers
can call r.eventReporter.Report/newDiskEvent and klog.ErrorS as before; update
both reconciler methods to call the new helper and remove duplicated loops and
logic.

In `@pkg/internal/diskutil_test.go`:
- Around line 26-44: newFakeExecutor currently only models successful
CombinedOutput calls, so tests can't simulate blkid returning exit status 2
which GetDeviceFSMap treats as a blank-disk path; update newFakeExecutor to
accept pairs of (output string, err error) (or a small struct slice) instead of
only outputs, populate each testingexec.FakeCmd.CombinedOutputScript to return
the provided ([]byte(output), nil or error) and ensure one test case supplies an
error that represents blkid exit status 2 (the same kind of error GetDeviceFSMap
expects) so the blkid-exit-2 path is covered; adjust any call sites that create
newFakeExecutor to pass the (output, err) pairs accordingly.

In `@pkg/internal/diskutil.go`:
- Around line 19-23: CmdExecutor, FilePathGlob and FilePathEvalSymLinks are
introduced as package-global mutable seams which leak state across tests;
replace them with a single injected dependency bundle (e.g., a small struct like
diskDeps or DiskUtil with fields CmdExecutor utilexec.Interface, FilePathGlob
func(string) ([]string, error), FilePathEvalSymLinks func(string) (string,
error)) and refactor functions that currently read those globals to accept that
deps struct (or be methods on a DiskUtil instance) so production code can use a
package-default instance while tests construct isolated instances with stubbed
executors/glob/eval functions, and update constructors and tests to inject mocks
rather than mutating package globals.

In `@test/e2e/localvolume_test.go`:
- Around line 283-309: Replace the manual map-based set in
verifyLVDLFilesystemUUIDForPVs with k8s sets for consistency with
eventuallyFindLVDLsForPVs: import k8s.io/apimachinery/pkg/util/sets, create
pvNameSet using sets.NewString(pvNames...), use
pvNameSet.Has(lvdl.Spec.PersistentVolumeName) instead of map lookup, and compare
uuidFoundCount to pvNameSet.Len() rather than len(pvNameSet); remove the manual
population loop and adjust references accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 468c6b2f-d35b-45d8-b2ac-ccaa6851f532

📥 Commits

Reviewing files that changed from the base of the PR and between 9d54043 and 1cf07bd.

⛔ Files ignored due to path filters (2)
  • vendor/k8s.io/utils/exec/testing/fake_exec.go is excluded by !vendor/**, !**/vendor/**
  • vendor/modules.txt is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (26)
  • HACKING.md
  • api/v1/localvolumedevicelink_types.go
  • cmd/diskmaker-manager/manager.go
  • config/manifests/stable/local-storage-operator.clusterserviceversion.yaml
  • config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml
  • pkg/common/provisioner_utils.go
  • pkg/common/symlink_utils.go
  • pkg/controllers/localvolume/localvolume_controller.go
  • pkg/diskmaker/controllers/lv/create_pv_test.go
  • pkg/diskmaker/controllers/lv/reconcile.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go
  • pkg/diskmaker/controllers/lvset/create_pv_test.go
  • pkg/diskmaker/controllers/lvset/device_age_test.go
  • pkg/diskmaker/controllers/lvset/reconcile.go
  • pkg/diskmaker/controllers/lvset/reconcile_test.go
  • pkg/diskmaker/discovery/discovery_test.go
  • pkg/diskmaker/event_reporter.go
  • pkg/internal/device_link_handler.go
  • pkg/internal/device_link_handler_test.go
  • pkg/internal/disk_location.go
  • pkg/internal/diskutil.go
  • pkg/internal/diskutil_test.go
  • pkg/localmetrics/custom_metrics_collector.go
  • pkg/localmetrics/custom_metrics_collector_test.go
  • test/e2e/localvolume_test.go
  • test/e2e/localvolumeset_test.go
💤 Files with no reviewable changes (1)
  • config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml

Comment on lines +9 to +12
## Overriding diskmaker image or operator image directly after deployment

1. Delete the subscription object so as OLM is no longer managing the operator.
2. Edit the CSV object to override diskmaker image.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Align the heading with the actual steps.

The section promises both diskmaker and operator image overrides, but the instructions only describe changing the diskmaker image in the CSV. Either document the operator-image edit path too or narrow the heading so this workflow is not misleading.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@HACKING.md` around lines 9 - 12, The heading "Overriding diskmaker image or
operator image directly after deployment" is misleading because the steps only
show deleting the subscription object and editing the CSV to override the
diskmaker image; either update the heading to only mention "diskmaker image" or
add the missing operator-image instructions: describe how to edit the CSV object
or the operator Deployment (or ClusterServiceVersion) to change the operator
image after removing the subscription, include exact kubectl commands and which
field to patch (e.g., spec.install.spec.template.spec.containers[*].image) for
the operator, and keep the existing steps for "subscription object" and "CSV
object" referring to the "diskmaker image" override.

gnufied added 6 commits March 23, 2026 11:45
Fix rbac rules and bugs to enable creation of lvdl objects
refactor code to findexisting lvldl
Change UpdateStatus to ApplyStatus
Refactor the code to directly use BlockDevice
Add comments about processRejectedDevicesForDeviceLinks
@gnufied gnufied force-pushed the implement-metrics-for-mismatch-symlinks branch from 1cf07bd to 7b34b7c Compare March 23, 2026 16:47
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 23, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
pkg/diskmaker/controllers/lv/create_pv_test.go (1)

252-254: ⚠️ Potential issue | 🟠 Major

Bug: return exits the entire test function instead of continuing to the next test case.

Using return here will exit the entire TestCreatePV function after the first test case that has shouldErr: true, skipping all subsequent test cases in testTable.

🐛 Proposed fix
 		if tc.shouldErr {
-			return
+			continue
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lv/create_pv_test.go` around lines 252 - 254, In
TestCreatePV the test loop uses "return" when tc.shouldErr is true which exits
the entire TestCreatePV function and prevents subsequent test cases in testTable
from running; replace the "return" with "continue" inside the loop iterating
over testTable so only the current case is skipped, locating the conditional
that checks tc.shouldErr in the TestCreatePV test helper or loop and change the
control flow accordingly.
pkg/common/provisioner_utils.go (1)

215-215: ⚠️ Potential issue | 🟡 Minor

Use the passed ctx parameter instead of context.TODO().

The function receives a ctx parameter but controllerutil.CreateOrUpdate uses context.TODO(). This should use the provided context for proper cancellation propagation and observability.

🔧 Proposed fix
-	opRes, err := controllerutil.CreateOrUpdate(context.TODO(), client, existingPV, func() error {
+	opRes, err := controllerutil.CreateOrUpdate(ctx, client, existingPV, func() error {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/common/provisioner_utils.go` at line 215, Replace the hard-coded
context.TODO() call in the controllerutil.CreateOrUpdate invocation with the
function's passed-in ctx so cancellation and deadlines propagate; locate the
CreateOrUpdate call (opRes, err := controllerutil.CreateOrUpdate(context.TODO(),
client, existingPV, func() error { ... }) and swap context.TODO() for ctx,
ensuring the surrounding function signature indeed accepts ctx and that
imports/usage remain consistent.
🧹 Nitpick comments (2)
pkg/diskmaker/discovery/discovery_test.go (1)

17-35: Consider extracting newFakeExecutor to a shared test utility.

This helper is duplicated verbatim in pkg/internal/diskutil_test.go. Extracting it to a shared location (e.g., pkg/internal/testutil/exec.go or similar) would reduce duplication and ensure consistent behavior across tests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/discovery/discovery_test.go` around lines 17 - 35, The helper
newFakeExecutor (which constructs a *testingexec.FakeExec with
CommandScript/CombinedOutputScript) is duplicated; extract it into a shared test
utility package (e.g., create pkg/internal/testutil with a file exec.go) and
move the function there so both discovery.newFakeExecutor and the copy in
pkg/internal/diskutil_test.go import testutil.NewFakeExecutor (or similar);
update callers to use the shared symbol and keep the same signature (outputs
...string) and behavior using testingexec.FakeExec/ FakeCmd so tests remain
unchanged.
test/e2e/localvolume_test.go (1)

297-323: Consider using sets.New for consistency.

This function manually constructs map[string]struct{} while eventuallyFindLVDLsForPVs uses sets.New. Consider using the same pattern for consistency across helper functions.

♻️ Suggested refactor for consistency
 func verifyLVDLFilesystemUUIDForPVs(t *testing.T, f *framework.Framework, namespace string, pvNames []string) {
 	matcher := gomega.NewWithT(t)
-	pvNameSet := map[string]struct{}{}
-	for _, pvName := range pvNames {
-		pvNameSet[pvName] = struct{}{}
-	}
+	pvNameSet := sets.New(pvNames...)
 	matcher.Eventually(func() bool {
 		lvdlList := &localv1.LocalVolumeDeviceLinkList{}
 		err := f.Client.List(goctx.TODO(), lvdlList, client.InNamespace(namespace))
@@ ... @@
 		uuidFoundCount := 0
 		for _, lvdl := range lvdlList.Items {
-			if _, ok := pvNameSet[lvdl.Spec.PersistentVolumeName]; !ok {
+			if !pvNameSet.Has(lvdl.Spec.PersistentVolumeName) {
 				continue
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/localvolume_test.go` around lines 297 - 323, Replace the manual map
construction in verifyLVDLFilesystemUUIDForPVs with a kubernetes sets.String for
consistency with eventuallyFindLVDLsForPVs: create pvNameSet using
sets.NewString(pvNames...) (from k8s.io/apimachinery/pkg/util/sets), change
membership checks from "if _, ok := pvNameSet[pvName]" to pvNameSet.Has(pvName),
and use pvNameSet.Len() when comparing counts; keep the rest of the function
logic the same (references: verifyLVDLFilesystemUUIDForPVs,
eventuallyFindLVDLsForPVs).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@pkg/common/provisioner_utils.go`:
- Line 215: Replace the hard-coded context.TODO() call in the
controllerutil.CreateOrUpdate invocation with the function's passed-in ctx so
cancellation and deadlines propagate; locate the CreateOrUpdate call (opRes, err
:= controllerutil.CreateOrUpdate(context.TODO(), client, existingPV, func()
error { ... }) and swap context.TODO() for ctx, ensuring the surrounding
function signature indeed accepts ctx and that imports/usage remain consistent.

In `@pkg/diskmaker/controllers/lv/create_pv_test.go`:
- Around line 252-254: In TestCreatePV the test loop uses "return" when
tc.shouldErr is true which exits the entire TestCreatePV function and prevents
subsequent test cases in testTable from running; replace the "return" with
"continue" inside the loop iterating over testTable so only the current case is
skipped, locating the conditional that checks tc.shouldErr in the TestCreatePV
test helper or loop and change the control flow accordingly.

---

Nitpick comments:
In `@pkg/diskmaker/discovery/discovery_test.go`:
- Around line 17-35: The helper newFakeExecutor (which constructs a
*testingexec.FakeExec with CommandScript/CombinedOutputScript) is duplicated;
extract it into a shared test utility package (e.g., create
pkg/internal/testutil with a file exec.go) and move the function there so both
discovery.newFakeExecutor and the copy in pkg/internal/diskutil_test.go import
testutil.NewFakeExecutor (or similar); update callers to use the shared symbol
and keep the same signature (outputs ...string) and behavior using
testingexec.FakeExec/ FakeCmd so tests remain unchanged.

In `@test/e2e/localvolume_test.go`:
- Around line 297-323: Replace the manual map construction in
verifyLVDLFilesystemUUIDForPVs with a kubernetes sets.String for consistency
with eventuallyFindLVDLsForPVs: create pvNameSet using
sets.NewString(pvNames...) (from k8s.io/apimachinery/pkg/util/sets), change
membership checks from "if _, ok := pvNameSet[pvName]" to pvNameSet.Has(pvName),
and use pvNameSet.Len() when comparing counts; keep the rest of the function
logic the same (references: verifyLVDLFilesystemUUIDForPVs,
eventuallyFindLVDLsForPVs).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f30f2bf2-6e52-4046-980d-082344ff54ad

📥 Commits

Reviewing files that changed from the base of the PR and between 1cf07bd and 7b34b7c.

⛔ Files ignored due to path filters (2)
  • vendor/k8s.io/utils/exec/testing/fake_exec.go is excluded by !vendor/**, !**/vendor/**
  • vendor/modules.txt is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (25)
  • HACKING.md
  • api/v1/localvolumedevicelink_types.go
  • cmd/diskmaker-manager/manager.go
  • config/manifests/stable/local-storage-operator.clusterserviceversion.yaml
  • config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml
  • pkg/common/provisioner_utils.go
  • pkg/controllers/localvolume/localvolume_controller.go
  • pkg/diskmaker/controllers/lv/create_pv_test.go
  • pkg/diskmaker/controllers/lv/reconcile.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go
  • pkg/diskmaker/controllers/lvset/create_pv_test.go
  • pkg/diskmaker/controllers/lvset/device_age_test.go
  • pkg/diskmaker/controllers/lvset/reconcile.go
  • pkg/diskmaker/controllers/lvset/reconcile_test.go
  • pkg/diskmaker/discovery/discovery_test.go
  • pkg/diskmaker/event_reporter.go
  • pkg/internal/device_link_handler.go
  • pkg/internal/device_link_handler_test.go
  • pkg/internal/disk_location.go
  • pkg/internal/diskutil.go
  • pkg/internal/diskutil_test.go
  • pkg/localmetrics/custom_metrics_collector.go
  • pkg/localmetrics/custom_metrics_collector_test.go
  • test/e2e/localvolume_test.go
  • test/e2e/localvolumeset_test.go
💤 Files with no reviewable changes (1)
  • config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml
✅ Files skipped from review due to trivial changes (6)
  • pkg/diskmaker/controllers/lvset/device_age_test.go
  • HACKING.md
  • pkg/diskmaker/event_reporter.go
  • config/manifests/stable/local-storage-operator.clusterserviceversion.yaml
  • pkg/internal/disk_location.go
  • pkg/localmetrics/custom_metrics_collector_test.go
🚧 Files skipped from review as they are similar to previous changes (7)
  • pkg/diskmaker/controllers/lvset/reconcile_test.go
  • pkg/controllers/localvolume/localvolume_controller.go
  • test/e2e/localvolumeset_test.go
  • pkg/diskmaker/controllers/lvset/create_pv_test.go
  • cmd/diskmaker-manager/manager.go
  • pkg/internal/diskutil.go
  • pkg/internal/device_link_handler.go

@gnufied gnufied force-pushed the implement-metrics-for-mismatch-symlinks branch from 7b34b7c to 15b9838 Compare March 23, 2026 20:08
@gnufied gnufied changed the title Implement metrics for mismatch symlinks STOR-2873: Implement metrics for mismatch symlinks Mar 23, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 23, 2026
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Mar 23, 2026

@gnufied: This pull request references STOR-2873 which is a valid jira issue.

Details

In response to this:

Fixes https://redhat.atlassian.net/browse/STOR-2873

https://redhat.atlassian.net/browse/STOR-2872

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@assets/templates/localmetrics/prometheus-rule.yaml`:
- Line 49: Replace the misspelled phrase "Local Storage oprator" with "Local
Storage operator" in the Prometheus alert template where the alert summary
contains that text (the summary strings that include "Local Storage oprator" in
prometheus-rule.yaml); update both occurrences (the summary around pv {{
$labels.persistent_volume }} and the other summary occurrence) so both alert
summaries read "Local Storage operator".

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 637cc55c-8687-410d-a7f8-27835d9c2338

📥 Commits

Reviewing files that changed from the base of the PR and between 7b34b7c and 15b9838.

📒 Files selected for processing (4)
  • assets/templates/localmetrics/prometheus-rule.yaml
  • cmd/diskmaker-manager/manager.go
  • pkg/localmetrics/custom_metrics_collector.go
  • pkg/localmetrics/custom_metrics_collector_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • cmd/diskmaker-manager/manager.go
  • pkg/localmetrics/custom_metrics_collector_test.go

@gnufied gnufied force-pushed the implement-metrics-for-mismatch-symlinks branch from 15b9838 to c662ab9 Compare March 23, 2026 20:35
Ship alertmanager rule for emitting alerts for LSO
@gnufied gnufied force-pushed the implement-metrics-for-mismatch-symlinks branch from c662ab9 to 2ef63a1 Compare March 23, 2026 20:41
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 23, 2026

@gnufied: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants