STOR-2870: Implement creation of LocalVolumeDeviceLink objects#592
STOR-2870: Implement creation of LocalVolumeDeviceLink objects#592gnufied wants to merge 6 commits intoopenshift:mainfrom
Conversation
|
@gnufied: This pull request references STOR-2870 which is a valid jira issue. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: gnufied The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
| // +optional | ||
| // +listType=set | ||
| // +kubebuilder:validation:MaxItems=256 | ||
| ValidLinkTargets []string `json:"validLinkTargets,omitempty"` |
There was a problem hiding this comment.
I had to relax these validations, because I realized that in some environments, we may not get any preferredLink targers in /dev/disk/by-id at all... So rather than not creating any objects in those environments, we still create LVDL object
|
@coderabbitai start review process |
| // +optional | ||
| // +kubebuilder:validation:MaxLength=4096 | ||
| PreferredLinkTarget string `json:"preferredLinkTarget,omitempty"` |
There was a problem hiding this comment.
Should it be a pointer, if it's allowed to be empty?
There was a problem hiding this comment.
According to - https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md
"In custom resource based APIs specifically, we advise to avoid making fields pointers unless there is an absolute need to do so. An absolute need being the need to distinguish between the zero value and an unset value", it should be fine to keep it optional and still a string type i think.
| "paths", noMatch, "directory", symLinkDir) | ||
| } | ||
|
|
||
| r.processRejectedDevicesForDeviceLinks(ctx, lvset, rejectedDevices, symLinkDir, storageClassName) |
There was a problem hiding this comment.
Why no error handling and exponential backoff? Add a comment that this is called periodically.
There was a problem hiding this comment.
Error handling, I can add. we can add some events. But I am unsure about exponential backoff. This reconciler loop runs every 1 minute, if we were to backoff on creation of individual LVDL objects, we may have to add separate queue. I am not sure if complexity is worth it.
There was a problem hiding this comment.
It's probably fine to retry once per minute. There should be a comment about that.
| } | ||
|
|
||
| if len(ignoredButMatchingDeviceMap) != 0 { | ||
| r.processRejectedDevicesForDeviceLinks(ctx, ignoredButMatchingDeviceMap) |
There was a problem hiding this comment.
Why no error handling and exponential backoff? Add a comment that this is called periodically.
There was a problem hiding this comment.
added error handling. same comment about exponential backoff.
pkg/internal/device_link_handler.go
Outdated
| if isNilOwnerObject(ownerObj) { | ||
| klog.Warningf("missing lvdl object %s during status update, but owner is nil; skipping creation for device: %s", pvName, devicePath) | ||
| return nil, nil | ||
| } |
There was a problem hiding this comment.
I would like to understand when is this function called with nil owner. Please add a comment about that.
There was a problem hiding this comment.
It should be pretty rare/impossible for owner to be nil. May be when LV or LVSet is being deleted? The code here merely guards against such a scenario
| }, | ||
| existing: &v1.LocalVolumeDeviceLink{ | ||
| ObjectMeta: metav1.ObjectMeta{Name: "local-pv-statustest", Namespace: "default"}, | ||
| Spec: v1.LocalVolumeDeviceLinkSpec{PersistentVolumeName: "local-pv-statustest"}, |
There was a problem hiding this comment.
Please add a case where LSO actually updates existing PreferredLinkTarget from say /dev/disk/by-id/wwn-current to /dev/disk/by-id/wwn-preferred. All existing unit test have the status empty when calling UpdateStatus.
There was a problem hiding this comment.
In general I agree, but I am not sure if device_handler is the right place for the test right? It has to be some higher level controller that should have the test, because all device_handler code does is - updates whatever was supplied to it.
|
https://github.com/coderabbitai review this please |
|
@coderabbitai review this please |
|
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/common/provisioner_utils.go (1)
222-222:⚠️ Potential issue | 🟡 MinorUse passed
ctxinstead ofcontext.TODO().The function signature now accepts
ctx context.Context, butcontrollerutil.CreateOrUpdatestill usescontext.TODO(). This should be updated to usectxfor consistent context propagation and proper cancellation/timeout handling.Proposed fix
- opRes, err := controllerutil.CreateOrUpdate(context.TODO(), client, existingPV, func() error { + opRes, err := controllerutil.CreateOrUpdate(ctx, client, existingPV, func() error {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/common/provisioner_utils.go` at line 222, The call to controllerutil.CreateOrUpdate is using context.TODO() instead of the function's passed ctx; update the call site so controllerutil.CreateOrUpdate(ctx, client, existingPV, func() error { ... }) uses the existing ctx parameter to ensure proper cancellation and timeout propagation (look for controllerutil.CreateOrUpdate, existingPV and opRes in this function).
🧹 Nitpick comments (1)
pkg/diskmaker/controllers/lvset/reconcile.go (1)
471-477: TODO comment indicates potential over-strict error handling.The comment notes this may not be a fatal error, but the code currently returns an error if
GetPathByIDfails. For devices without stable by-id paths (common in some virtualized environments), this could unnecessarily block PV provisioning. Consider deferring this fix to a follow-up if it's not blocking the current use cases.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/diskmaker/controllers/lvset/reconcile.go` around lines 471 - 477, The current code treats a failure from dev.GetPathByID("") as fatal and returns the error; change this to non-fatal handling so devices without by-id paths don't block reconciliation: in the reconcile logic around dev.GetPathByID("") (the symbol preferredSymLink and its err), catch the error, log it at warning/verbose level with context (include disk/devLabelPath and the error) and set preferredSymLink to an empty string or nil so processing continues; only return an error if subsequent logic actually requires a by-id path (leave a TODO comment referencing this change so it can be revisited), ensuring the function does not return early on GetPathByID failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@HACKING.md`:
- Around line 9-12: Update the section title and steps to accurately describe
that deleting the Subscription only stops future upgrades (OLM will continue to
run the currently installed CSV), and instruct to edit the CSV to override
either the operator image or the diskmaker image as needed; specifically mention
the Subscription and CSV objects by name, clarify that you must edit the CSV's
container image(s) (operator image or diskmaker image) to apply the override,
and note that if you need to revert OLM entirely you must also remove the
installed CSV or recreate a replacement CSV.
In `@pkg/diskmaker/controllers/lv/reconcile.go`:
- Around line 521-525: The code currently calls
getSymlinkSourceAndTarget(diskLocation, symLinkDirPath) and uses break on error,
which aborts processing all remaining devices in the storage class; change the
error handling to use continue instead so a failed device is skipped but the
loop keeps processing other devices (replace the break in the err block with
continue and preserve the klog.ErrorS call). Ensure this change is made in the
loop that iterates devices where diskLocation and symLinkDirPath are used.
In `@pkg/internal/device_link_handler.go`:
- Around line 56-70: The early return when existing.Spec.PersistentVolumeName ==
pvName skips reconciling metadata.ownerReferences on preexisting
LocalVolumeDeviceLink objects; instead, detect when the PV name matches and
still reconcile ownerReferences on the existing object (compare desired
ownerRefs vs existing.ObjectMeta.OwnerReferences), and if they differ create a
deep copy (existing.DeepCopy()), set the updated OwnerReferences, and call
dl.client.Update(ctx, copied) (same error handling as the create/update path).
Ensure you reference and reuse the same ownerRefs computation used for new
objects so owner-based cleanup is applied to preexisting LocalVolumeDeviceLink
instances.
---
Outside diff comments:
In `@pkg/common/provisioner_utils.go`:
- Line 222: The call to controllerutil.CreateOrUpdate is using context.TODO()
instead of the function's passed ctx; update the call site so
controllerutil.CreateOrUpdate(ctx, client, existingPV, func() error { ... })
uses the existing ctx parameter to ensure proper cancellation and timeout
propagation (look for controllerutil.CreateOrUpdate, existingPV and opRes in
this function).
---
Nitpick comments:
In `@pkg/diskmaker/controllers/lvset/reconcile.go`:
- Around line 471-477: The current code treats a failure from
dev.GetPathByID("") as fatal and returns the error; change this to non-fatal
handling so devices without by-id paths don't block reconciliation: in the
reconcile logic around dev.GetPathByID("") (the symbol preferredSymLink and its
err), catch the error, log it at warning/verbose level with context (include
disk/devLabelPath and the error) and set preferredSymLink to an empty string or
nil so processing continues; only return an error if subsequent logic actually
requires a by-id path (leave a TODO comment referencing this change so it can be
revisited), ensuring the function does not return early on GetPathByID failures.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: eb0d32cc-ef07-489c-95ad-bf2d493c7701
📒 Files selected for processing (20)
HACKING.mdapi/v1/localvolumedevicelink_types.goconfig/manifests/stable/local-storage-operator.clusterserviceversion.yamlconfig/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yamlpkg/common/provisioner_utils.gopkg/common/symlink_utils.gopkg/controllers/localvolume/localvolume_controller.gopkg/diskmaker/controllers/lv/create_pv_test.gopkg/diskmaker/controllers/lv/reconcile.gopkg/diskmaker/controllers/lv/reconcile_test.gopkg/diskmaker/controllers/lvset/create_pv_test.gopkg/diskmaker/controllers/lvset/device_age_test.gopkg/diskmaker/controllers/lvset/reconcile.gopkg/diskmaker/controllers/lvset/reconcile_test.gopkg/internal/device_link_handler.gopkg/internal/device_link_handler_test.gopkg/internal/disk_location.gopkg/internal/diskutil.gotest/e2e/localvolume_test.gotest/e2e/localvolumeset_test.go
💤 Files with no reviewable changes (1)
- config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml
| // Keep user-configured policy and only reconcile the pv name. | ||
| if existing.Spec.PersistentVolumeName == pvName { | ||
| return existing, nil | ||
| } | ||
|
|
||
| klog.Infof("updating lvdl object: %s", pvName) | ||
|
|
||
| existingCopy := existing.DeepCopy() | ||
| existingCopy.Spec.PersistentVolumeName = pvName | ||
|
|
||
| err = dl.client.Update(ctx, existingCopy) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("error updating localvolumedevicelink object: %w", err) | ||
| } | ||
| return existingCopy, nil |
There was a problem hiding this comment.
Reconcile owner refs on existing LVDLs too.
The early return on Line 57 preserves user policy, but it also leaves any preexisting LocalVolumeDeviceLink without refreshed metadata.ownerReferences. That means manually precreated objects, or objects carried over from earlier revisions, won't participate in owner-based cleanup even after this PR adds owner refs for newly created ones.
🛠️ Proposed fix
- // Keep user-configured policy and only reconcile the pv name.
- if existing.Spec.PersistentVolumeName == pvName {
- return existing, nil
- }
-
- klog.Infof("updating lvdl object: %s", pvName)
-
- existingCopy := existing.DeepCopy()
- existingCopy.Spec.PersistentVolumeName = pvName
+ existingCopy := existing.DeepCopy()
+ needsUpdate := false
+ if existingCopy.Spec.PersistentVolumeName != pvName {
+ existingCopy.Spec.PersistentVolumeName = pvName
+ needsUpdate = true
+ }
+ if !isNilOwnerObject(ownerObj) {
+ desiredOwnerRefs, err := buildOwnerRefs(ownerObj)
+ if err != nil {
+ return nil, err
+ }
+ if !equality.Semantic.DeepEqual(existingCopy.OwnerReferences, desiredOwnerRefs) {
+ existingCopy.OwnerReferences = desiredOwnerRefs
+ needsUpdate = true
+ }
+ }
+ if !needsUpdate {
+ return existing, nil
+ }
+
+ klog.Infof("updating lvdl object: %s", pvName)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/internal/device_link_handler.go` around lines 56 - 70, The early return
when existing.Spec.PersistentVolumeName == pvName skips reconciling
metadata.ownerReferences on preexisting LocalVolumeDeviceLink objects; instead,
detect when the PV name matches and still reconcile ownerReferences on the
existing object (compare desired ownerRefs vs
existing.ObjectMeta.OwnerReferences), and if they differ create a deep copy
(existing.DeepCopy()), set the updated OwnerReferences, and call
dl.client.Update(ctx, copied) (same error handling as the create/update path).
Ensure you reference and reuse the same ownerRefs computation used for new
objects so owner-based cleanup is applied to preexisting LocalVolumeDeviceLink
instances.
|
@gnufied: This pull request references STOR-2870 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@CodeRabbit please cross-check this PR with the enhancement: https://github.com/openshift/enhancements/blob/master/enhancements/storage/lso-recreate-symlinks.md |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chainLet me fetch the enhancement document and then compare it against the PR implementation. 🏁 Script executed: Unrecoverable error: Request timed out after 300000ms (requestId=8ceba741-de8a-4825-9ffd-b7a5f3b61db9) Let me try fetching the enhancement doc via the GitHub CLI instead. 🏁 Script executed: Length of output: 75 Let me try another approach to fetch the document. 🏁 Script executed: Length of output: 16680
✅ What the PR implements (matching the KEP)
❌ What's missing (KEP requires, PR doesn't implement)1.
|
| PR Addition | Notes |
|---|---|
processRejectedDevicesForDeviceLinks |
Creates LVDL entries for devices that matched but were rejected by device filters. KEP only mentions LVDL creation for provisioned PVs. |
CreateLocalPVArgs struct refactoring |
Implementation detail, not API-level. |
GetSymlinkedForCurrentSC utility |
Internal helper, not described. |
Exported internal.DiskLocation struct |
Refactoring to support the new code paths. |
Summary
This PR is clearly scoped to phase 1 — LVDL object creation and status population. The following significant KEP requirements remain unimplemented:
DefaultDeviceLinkPolicyonLocalVolume/LocalVolumeSet- Alert rules (two conditions)
- Symlink recreation when
policy == PreferredLinkTarget(the headline feature) - Udev-triggered reconcile with rate limiting
- E2e tests for policy-driven scenarios
|
@CodeRabbit, does the PR do anything wrong, against what was agreed in the KEP? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chainLet me re-fetch the KEP and cross-check specific implementation details for correctness. 🏁 Script executed: Length of output: 16625 🏁 Script executed: Length of output: 7882 🏁 Script executed: Length of output: 5914 🏁 Script executed: Length of output: 1376 🏁 Script executed: Length of output: 191
1. 🔴 LVDLs created for rejected devices — contradicts KEPThe KEP states:
The PR introduces This directly contradicts the KEP's stated scope. It also creates a tension with 2. 🟡
|
| Issue | Severity | Contradicts KEP? |
|---|---|---|
| LVDLs created for rejected (no-PV) devices | High | ✅ Yes — KEP scopes LVDL to provisioned PVs only |
omitzero JSON tag on Status field |
Medium | No — implementation bug |
+required + omitempty on CurrentLinkTarget |
Medium | Yes — KEP shows it as omitempty only |
policy is omitempty vs non-optional in KEP |
Low | Minor |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds LocalVolumeDeviceLink (LVDL) support: CRD/type changes making some status fields optional, RBAC updates, a new internal DeviceLinkHandler, refactors to context+args PV creation and DiskLocation usage, reconciler logic to populate LVDL status, many unit/e2e test additions, and a HACKING.md note. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
api/v1/localvolumedevicelink_types.go (1)
44-44:⚠️ Potential issue | 🟠 MajorResolve the
+requiredandomitemptycontradiction at line 78.Line 78 marks
CurrentLinkTargetwith+requiredbut usesjson:"currentLinkTarget,omitempty", which allows the field to be omitted from JSON serialization. This conflicts with the required constraint.Line 44 (Status field) is correctly marked
+optionaland usesjson:"status,omitzero", which is supported by Go 1.20+.Suggested fix
// currentLinkTarget is the current devicePath being used by local storage operator // ... // +required // +kubebuilder:validation:MinLength=1 // +kubebuilder:validation:MaxLength=4096 - CurrentLinkTarget string `json:"currentLinkTarget,omitempty"` + CurrentLinkTarget string `json:"currentLinkTarget"`🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/v1/localvolumedevicelink_types.go` at line 44, The field CurrentLinkTarget is declared as +required but its JSON tag uses omitempty which contradicts the required marker; update the struct field CurrentLinkTarget to remove the omitempty from the json tag (e.g. change json:"currentLinkTarget,omitempty" to json:"currentLinkTarget") so the +required constraint is honored, ensuring the generated CRD/schema treats CurrentLinkTarget as mandatory; alternatively, if the intent is optional, remove the +required marker instead—apply the change to the CurrentLinkTarget declaration in the localvolumedevicelink types.
♻️ Duplicate comments (1)
HACKING.md (1)
9-12:⚠️ Potential issue | 🟡 MinorAlign the section title with the actual steps and clarify OLM behavior.
The title mentions both "diskmaker image or operator image" but step 2 only covers the diskmaker image. Additionally, deleting the Subscription stops future upgrades but does not stop OLM from using the already-installed CSV.
✏️ Suggested alignment
-1. Delete the subscription object so as OLM is no longer managing the operator. -2. Edit the CSV object to override diskmaker image. +1. Delete the Subscription to stop OLM from replacing the installed CSV during upgrades. +2. Edit the installed CSV to override the operator image and/or the `DISKMAKER_IMAGE` value.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@HACKING.md` around lines 9 - 12, The section title and steps are inconsistent and the OLM behavior is unclear: change the heading to reflect that you're overriding the CSV-installed images (e.g., "Overriding CSV images after deployment") and update the steps to (1) explain that deleting the Subscription only stops future upgrades but does not revert the currently-installed CSV, (2) instruct to edit the ClusterServiceVersion (CSV) to override both the diskmaker image and the operator image fields as needed, and (3) note that OLM will continue to use the installed CSV until it's replaced or removed; reference Subscription and CSV in the updated prose so readers know which resource to modify.
🧹 Nitpick comments (2)
pkg/internal/device_link_handler.go (1)
56-70: Consider reconciling owner references on existing LVDLs.The current implementation preserves user policy but doesn't update
OwnerReferenceson pre-existing LVDLs. Objects created before this PR or manually will miss owner-based garbage collection. This could leave orphaned LVDLs when the parent LocalVolume/LocalVolumeSet is deleted.♻️ Proposed fix to reconcile owner refs
// Keep user-configured policy and only reconcile the pv name. - if existing.Spec.PersistentVolumeName == pvName { - return existing, nil - } - - klog.InfoS("updating lvdl object", "pvName", pvName) - existingCopy := existing.DeepCopy() - existingCopy.Spec.PersistentVolumeName = pvName + needsUpdate := false + + if existingCopy.Spec.PersistentVolumeName != pvName { + existingCopy.Spec.PersistentVolumeName = pvName + needsUpdate = true + } + + // Reconcile owner references for proper GC + if !isNilOwnerObject(ownerObj) && len(existingCopy.OwnerReferences) == 0 { + ownerRefs, err := buildOwnerRefs(ownerObj) + if err == nil { + existingCopy.OwnerReferences = ownerRefs + needsUpdate = true + } + } + + if !needsUpdate { + return existing, nil + } + + klog.InfoS("updating lvdl object", "pvName", pvName) err = dl.client.Update(ctx, existingCopy)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/internal/device_link_handler.go` around lines 56 - 70, When updating an existing LocalVolumeDeviceLink (the block that checks existing.Spec.PersistentVolumeName and then creates existingCopy), also reconcile the ObjectMeta.OwnerReferences on existingCopy so owner-based GC is preserved: compute the desired owner refs (e.g., from whatever creates LVDLs in this package — use the same source of truth you use when creating new LVDLs), merge/replace existingCopy.ObjectMeta.OwnerReferences if they differ, and then call dl.client.Update(ctx, existingCopy). Update the logic around existing, existingCopy, and dl.client.Update to compare and set OwnerReferences in addition to Spec.PersistentVolumeName before performing the update.pkg/diskmaker/controllers/lv/reconcile.go (1)
411-420: Add a comment clarifying error handling strategy and periodic reconciliation.Errors from
findMatchingDisksandprocessRejectedDevicesForDeviceLinksare logged but not propagated. This appears intentional for best-effort processing, but a brief comment would clarify that:
- This is called periodically via reconcile loop
- Failures are non-fatal since the next reconcile will retry
Suggested documentation
+ // Process ignored but matching devices for LVDL status updates. + // This is best-effort: errors are logged and retried on the next periodic reconcile. ignoredButMatchingDeviceMap, err := r.findMatchingDisks(diskConfig, ignoredDevices) if err != nil { klog.ErrorS(err, "error finding matching devices from ignored list") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/diskmaker/controllers/lv/reconcile.go` around lines 411 - 420, Add a concise comment above the block that calls findMatchingDisks and processRejectedDevicesForDeviceLinks explaining the error-handling strategy: note that this code runs as part of the periodic Reconcile loop, failures from findMatchingDisks and processRejectedDevicesForDeviceLinks are intentionally logged (best-effort) and not returned to allow the controller to continue, and any transient errors will be retried on the next reconcile; place the comment immediately above the existing ignoredButMatchingDeviceMap / processRejectedDevicesForDeviceLinks calls and reference the functions by name in the comment for clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/diskmaker/controllers/lv/reconcile.go`:
- Around line 529-533: Replace the malformed klog.ErrorS call that currently
passes an empty message string and pushes the message into the key/value list;
locate the call near the blockDevice.GetPathByID("") handling and change the
invocation from klog.ErrorS(err, "", "failed to get preferred device link",
"disk", blockDevice.Name) to klog.ErrorS(err, "failed to get preferred device
link", "disk", blockDevice.Name) so the human-readable message is passed as the
msg argument and disk metadata remains in the key/value pairs.
- Around line 441-445: The failure branch after calling
getSymlinkSourceAndTarget(deviceNameLocation, symLinkDirPath) currently uses
break which aborts processing the rest of the devices in the current loop;
change that break to continue so a single symlink lookup error skips only the
current device and allows the loop to continue processing other devices (update
the error handling at the source, target, idExists, err :=
getSymlinkSourceAndTarget(...) block to use continue instead of break).
In `@pkg/internal/device_link_handler_test.go`:
- Around line 87-118: Add a TestHelperProcess helper function to this test file
to satisfy the subprocess mocks used by helperCommandBlkid and
helperCommandBlkidEmpty: implement TestHelperProcess to check the
GO_WANT_HELPER_PROCESS env var, read the COMMAND env var, and for "blkid" write
the BLKIDOUT env value to stdout and exit 0; for "blkid_noop" exit 0 with empty
stdout; for any unexpected COMMAND write an error to stderr and exit non‑zero.
Ensure TestHelperProcess returns immediately when GO_WANT_HELPER_PROCESS is not
set so normal tests run.
In `@test/e2e/localvolumeset_test.go`:
- Around line 369-371: The test currently hard-fails when optional LVDL status
fields are empty; change the assertions so only CurrentLinkTarget is required
and PreferredLinkTarget and ValidLinkTargets are checked conditionally (i.e.,
only assert non-empty when the field is present/defined). Locate the assertions
using matcher.Expect and the symbols lvdl.Status.CurrentLinkTarget,
lvdl.Status.PreferredLinkTarget, and lvdl.Status.ValidLinkTargets and replace
the unconditional Expect calls for PreferredLinkTarget and ValidLinkTargets with
an if-check that only calls matcher.Expect(...).ToNot(gomega.BeEmpty(), ...)
when the field is non-nil/non-empty indicator (for strings check != "" or for
slices check len(...)>0/presence) to avoid hard failures in environments without
by-id links.
---
Outside diff comments:
In `@api/v1/localvolumedevicelink_types.go`:
- Line 44: The field CurrentLinkTarget is declared as +required but its JSON tag
uses omitempty which contradicts the required marker; update the struct field
CurrentLinkTarget to remove the omitempty from the json tag (e.g. change
json:"currentLinkTarget,omitempty" to json:"currentLinkTarget") so the +required
constraint is honored, ensuring the generated CRD/schema treats
CurrentLinkTarget as mandatory; alternatively, if the intent is optional, remove
the +required marker instead—apply the change to the CurrentLinkTarget
declaration in the localvolumedevicelink types.
---
Duplicate comments:
In `@HACKING.md`:
- Around line 9-12: The section title and steps are inconsistent and the OLM
behavior is unclear: change the heading to reflect that you're overriding the
CSV-installed images (e.g., "Overriding CSV images after deployment") and update
the steps to (1) explain that deleting the Subscription only stops future
upgrades but does not revert the currently-installed CSV, (2) instruct to edit
the ClusterServiceVersion (CSV) to override both the diskmaker image and the
operator image fields as needed, and (3) note that OLM will continue to use the
installed CSV until it's replaced or removed; reference Subscription and CSV in
the updated prose so readers know which resource to modify.
---
Nitpick comments:
In `@pkg/diskmaker/controllers/lv/reconcile.go`:
- Around line 411-420: Add a concise comment above the block that calls
findMatchingDisks and processRejectedDevicesForDeviceLinks explaining the
error-handling strategy: note that this code runs as part of the periodic
Reconcile loop, failures from findMatchingDisks and
processRejectedDevicesForDeviceLinks are intentionally logged (best-effort) and
not returned to allow the controller to continue, and any transient errors will
be retried on the next reconcile; place the comment immediately above the
existing ignoredButMatchingDeviceMap / processRejectedDevicesForDeviceLinks
calls and reference the functions by name in the comment for clarity.
In `@pkg/internal/device_link_handler.go`:
- Around line 56-70: When updating an existing LocalVolumeDeviceLink (the block
that checks existing.Spec.PersistentVolumeName and then creates existingCopy),
also reconcile the ObjectMeta.OwnerReferences on existingCopy so owner-based GC
is preserved: compute the desired owner refs (e.g., from whatever creates LVDLs
in this package — use the same source of truth you use when creating new LVDLs),
merge/replace existingCopy.ObjectMeta.OwnerReferences if they differ, and then
call dl.client.Update(ctx, existingCopy). Update the logic around existing,
existingCopy, and dl.client.Update to compare and set OwnerReferences in
addition to Spec.PersistentVolumeName before performing the update.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e3894c0b-0423-46fa-9309-767082dedec7
📒 Files selected for processing (20)
HACKING.mdapi/v1/localvolumedevicelink_types.goconfig/manifests/stable/local-storage-operator.clusterserviceversion.yamlconfig/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yamlpkg/common/provisioner_utils.gopkg/common/symlink_utils.gopkg/controllers/localvolume/localvolume_controller.gopkg/diskmaker/controllers/lv/create_pv_test.gopkg/diskmaker/controllers/lv/reconcile.gopkg/diskmaker/controllers/lv/reconcile_test.gopkg/diskmaker/controllers/lvset/create_pv_test.gopkg/diskmaker/controllers/lvset/device_age_test.gopkg/diskmaker/controllers/lvset/reconcile.gopkg/diskmaker/controllers/lvset/reconcile_test.gopkg/internal/device_link_handler.gopkg/internal/device_link_handler_test.gopkg/internal/disk_location.gopkg/internal/diskutil.gotest/e2e/localvolume_test.gotest/e2e/localvolumeset_test.go
💤 Files with no reviewable changes (1)
- config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml
| preferredSymLink, err := blockDevice.GetPathByID("") | ||
| if err != nil { | ||
| klog.ErrorS(err, "", "failed to get preferred device link", "disk", blockDevice.Name) | ||
| continue | ||
| } |
There was a problem hiding this comment.
Fix malformed klog.ErrorS call - empty message string.
The error message is placed in the key-value pairs instead of the message argument. This will produce confusing log output.
Proposed fix
preferredSymLink, err := blockDevice.GetPathByID("")
if err != nil {
- klog.ErrorS(err, "", "failed to get preferred device link", "disk", blockDevice.Name)
+ klog.ErrorS(err, "failed to get preferred device link", "disk", blockDevice.Name)
continue
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| preferredSymLink, err := blockDevice.GetPathByID("") | |
| if err != nil { | |
| klog.ErrorS(err, "", "failed to get preferred device link", "disk", blockDevice.Name) | |
| continue | |
| } | |
| preferredSymLink, err := blockDevice.GetPathByID("") | |
| if err != nil { | |
| klog.ErrorS(err, "failed to get preferred device link", "disk", blockDevice.Name) | |
| continue | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/diskmaker/controllers/lv/reconcile.go` around lines 529 - 533, Replace
the malformed klog.ErrorS call that currently passes an empty message string and
pushes the message into the key/value list; locate the call near the
blockDevice.GetPathByID("") handling and change the invocation from
klog.ErrorS(err, "", "failed to get preferred device link", "disk",
blockDevice.Name) to klog.ErrorS(err, "failed to get preferred device link",
"disk", blockDevice.Name) so the human-readable message is passed as the msg
argument and disk metadata remains in the key/value pairs.
| matcher.Expect(lvdl.Status.CurrentLinkTarget).ToNot(gomega.BeEmpty(), "expected CurrentLinkTarget for LVDL %q", lvdl.Name) | ||
| matcher.Expect(lvdl.Status.PreferredLinkTarget).ToNot(gomega.BeEmpty(), "expected PreferredLinkTarget for LVDL %q", lvdl.Name) | ||
| matcher.Expect(lvdl.Status.ValidLinkTargets).ToNot(gomega.BeEmpty(), "expected ValidLinkTargets for LVDL %q", lvdl.Name) |
There was a problem hiding this comment.
Don’t hard-fail on optional LVDL status fields in e2e.
These assertions require PreferredLinkTarget and ValidLinkTargets to be non-empty for every PV, but the API defines both as optional. This can fail in valid environments where by-id links are unavailable.
Suggested test adjustment
for _, lvdl := range fsLVDLs {
matcher.Expect(lvdl.Status.CurrentLinkTarget).ToNot(gomega.BeEmpty(), "expected CurrentLinkTarget for LVDL %q", lvdl.Name)
- matcher.Expect(lvdl.Status.PreferredLinkTarget).ToNot(gomega.BeEmpty(), "expected PreferredLinkTarget for LVDL %q", lvdl.Name)
- matcher.Expect(lvdl.Status.ValidLinkTargets).ToNot(gomega.BeEmpty(), "expected ValidLinkTargets for LVDL %q", lvdl.Name)
+ if lvdl.Status.PreferredLinkTarget != "" {
+ matcher.Expect(lvdl.Status.ValidLinkTargets).To(
+ gomega.ContainElement(lvdl.Status.PreferredLinkTarget),
+ "expected PreferredLinkTarget to be listed in ValidLinkTargets for LVDL %q", lvdl.Name,
+ )
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| matcher.Expect(lvdl.Status.CurrentLinkTarget).ToNot(gomega.BeEmpty(), "expected CurrentLinkTarget for LVDL %q", lvdl.Name) | |
| matcher.Expect(lvdl.Status.PreferredLinkTarget).ToNot(gomega.BeEmpty(), "expected PreferredLinkTarget for LVDL %q", lvdl.Name) | |
| matcher.Expect(lvdl.Status.ValidLinkTargets).ToNot(gomega.BeEmpty(), "expected ValidLinkTargets for LVDL %q", lvdl.Name) | |
| matcher.Expect(lvdl.Status.CurrentLinkTarget).ToNot(gomega.BeEmpty(), "expected CurrentLinkTarget for LVDL %q", lvdl.Name) | |
| if lvdl.Status.PreferredLinkTarget != "" { | |
| matcher.Expect(lvdl.Status.ValidLinkTargets).To( | |
| gomega.ContainElement(lvdl.Status.PreferredLinkTarget), | |
| "expected PreferredLinkTarget to be listed in ValidLinkTargets for LVDL %q", lvdl.Name, | |
| ) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/localvolumeset_test.go` around lines 369 - 371, The test currently
hard-fails when optional LVDL status fields are empty; change the assertions so
only CurrentLinkTarget is required and PreferredLinkTarget and ValidLinkTargets
are checked conditionally (i.e., only assert non-empty when the field is
present/defined). Locate the assertions using matcher.Expect and the symbols
lvdl.Status.CurrentLinkTarget, lvdl.Status.PreferredLinkTarget, and
lvdl.Status.ValidLinkTargets and replace the unconditional Expect calls for
PreferredLinkTarget and ValidLinkTargets with an if-check that only calls
matcher.Expect(...).ToNot(gomega.BeEmpty(), ...) when the field is
non-nil/non-empty indicator (for strings check != "" or for slices check
len(...)>0/presence) to avoid hard failures in environments without by-id links.
84f8c6c to
54da887
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/common/provisioner_utils.go (1)
215-215:⚠️ Potential issue | 🟡 MinorInconsistent context usage in CreateOrUpdate call.
The function receives
ctxas a parameter and uses it for other operations (lines 131, 259), butcontrollerutil.CreateOrUpdateon line 215 still usescontext.TODO(). This should use the passedctxfor consistency and proper cancellation propagation.🔧 Proposed fix
- opRes, err := controllerutil.CreateOrUpdate(context.TODO(), client, existingPV, func() error { + opRes, err := controllerutil.CreateOrUpdate(ctx, client, existingPV, func() error {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/common/provisioner_utils.go` at line 215, The CreateOrUpdate call uses context.TODO() instead of the function's passed ctx, causing inconsistent context/cancellation handling; update the controllerutil.CreateOrUpdate invocation to pass the existing ctx parameter (keep using the same client, existingPV, and update func) so opRes and err are produced under the correct context, matching the other calls that use ctx.
🧹 Nitpick comments (1)
pkg/common/provisioner_utils.go (1)
122-127: LVDL creation before PV ensures tracking object exists first.Creating the
LocalVolumeDeviceLinkbefore checking for existing PVs ensures that for any new PVs, a tracking LVDL already exists. This is the correct ordering per the past review discussion. Consider adding a brief comment explaining this intentional sequencing.📝 Suggested comment
deviceHandler := internal.NewDeviceLinkHandler(args.CurrentSymlink, client) + // Create LVDL before PV to ensure tracking object exists for any new PVs. + // For upgraded clusters, ApplyStatus below also creates LVDLs if missing. _, err = deviceHandler.Create(ctx, pvName, runtimeConfig.Namespace, obj) if err != nil { return fmt.Errorf("error creating localvolumedevicelink object: %w", err) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/common/provisioner_utils.go` around lines 122 - 127, The LocalVolumeDeviceLink creation is intentionally done before verifying existing PVs to ensure a tracking LVDL exists for any new PVs; update the code around internal.NewDeviceLinkHandler and deviceHandler.Create (which creates the LocalVolumeDeviceLink for pvName) to include a short comment stating this deliberate ordering and why (e.g., "create LVDL before PV check so tracking object exists for new PVs"), leaving the Create call and error handling unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@pkg/common/provisioner_utils.go`:
- Line 215: The CreateOrUpdate call uses context.TODO() instead of the
function's passed ctx, causing inconsistent context/cancellation handling;
update the controllerutil.CreateOrUpdate invocation to pass the existing ctx
parameter (keep using the same client, existingPV, and update func) so opRes and
err are produced under the correct context, matching the other calls that use
ctx.
---
Nitpick comments:
In `@pkg/common/provisioner_utils.go`:
- Around line 122-127: The LocalVolumeDeviceLink creation is intentionally done
before verifying existing PVs to ensure a tracking LVDL exists for any new PVs;
update the code around internal.NewDeviceLinkHandler and deviceHandler.Create
(which creates the LocalVolumeDeviceLink for pvName) to include a short comment
stating this deliberate ordering and why (e.g., "create LVDL before PV check so
tracking object exists for new PVs"), leaving the Create call and error handling
unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 88c95db0-7308-42ec-a5ac-6723980626b3
📒 Files selected for processing (21)
HACKING.mdapi/v1/localvolumedevicelink_types.goconfig/manifests/stable/local-storage-operator.clusterserviceversion.yamlconfig/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yamlpkg/common/provisioner_utils.gopkg/common/symlink_utils.gopkg/controllers/localvolume/localvolume_controller.gopkg/diskmaker/controllers/lv/create_pv_test.gopkg/diskmaker/controllers/lv/reconcile.gopkg/diskmaker/controllers/lv/reconcile_test.gopkg/diskmaker/controllers/lvset/create_pv_test.gopkg/diskmaker/controllers/lvset/device_age_test.gopkg/diskmaker/controllers/lvset/reconcile.gopkg/diskmaker/controllers/lvset/reconcile_test.gopkg/diskmaker/event_reporter.gopkg/internal/device_link_handler.gopkg/internal/device_link_handler_test.gopkg/internal/disk_location.gopkg/internal/diskutil.gotest/e2e/localvolume_test.gotest/e2e/localvolumeset_test.go
💤 Files with no reviewable changes (1)
- config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml
✅ Files skipped from review due to trivial changes (3)
- HACKING.md
- pkg/diskmaker/event_reporter.go
- pkg/internal/disk_location.go
🚧 Files skipped from review as they are similar to previous changes (10)
- pkg/diskmaker/controllers/lvset/reconcile_test.go
- pkg/diskmaker/controllers/lvset/device_age_test.go
- config/manifests/stable/local-storage-operator.clusterserviceversion.yaml
- test/e2e/localvolumeset_test.go
- test/e2e/localvolume_test.go
- pkg/common/symlink_utils.go
- api/v1/localvolumedevicelink_types.go
- pkg/internal/device_link_handler_test.go
- pkg/internal/diskutil.go
- pkg/diskmaker/controllers/lv/reconcile.go
| } | ||
|
|
||
| if len(ignoredButMatchingDeviceMap) != 0 { | ||
| r.processRejectedDevicesForDeviceLinks(ctx, ignoredButMatchingDeviceMap) |
There was a problem hiding this comment.
When I have a Pod that uses a PV, its LocalVolumeDeviceLink gets updated when udev creates a new symlink. But when the PV is unused, it's not updated at all.
Should be processRejectedDevicesForDeviceLinks (with a better name) called on all devices?
(The code location is arbitrary. This code path actually works - ignored (=used) devices are updated, just the unused are not)
There was a problem hiding this comment.
This should be fixed now.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
pkg/internal/device_link_handler.go (1)
256-258: Consider reducing log verbosity forgetFilesystemUUID.The
klog.InfoSat Line 257 runs on every status reconciliation. For frequently reconciled devices, this could generate excessive logs. Consider usingklog.V(4).InfoSfor this diagnostic message.Proposed fix
func getFilesystemUUID(devicePath string) (string, error) { - klog.InfoS("trying to get filesystem information", "devicePath", devicePath) + klog.V(4).InfoS("trying to get filesystem information", "devicePath", devicePath) cmd := ExecCommand("blkid", "-s", "UUID", "-o", "value", devicePath)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/internal/device_link_handler.go` around lines 256 - 258, The getFilesystemUUID function logs a diagnostic message at Info level on every reconciliation which can be noisy; change the klog.InfoS call inside getFilesystemUUID to klog.V(4).InfoS so the message is emitted only when verbosity >=4, preserving the same message and structured key ("devicePath") to reduce log verbosity for frequently reconciled devices.pkg/diskmaker/controllers/lv/reconcile.go (1)
546-554: Function never returns an error.The function signature includes an
errorreturn value, but it always returnsnil. Either remove the error return or add validation that could produce an error (e.g., checking for empty paths).Option 1: Remove error return
-func getSymlinkSourceAndTarget(devLocation internal.DiskLocation, symlinkDir string) (string, string, bool, error) { +func getSymlinkSourceAndTarget(devLocation internal.DiskLocation, symlinkDir string) (string, string, bool) { if devLocation.DiskID != "" { target := path.Join(symlinkDir, filepath.Base(devLocation.DiskID)) - return devLocation.DiskID, target, true, nil + return devLocation.DiskID, target, true } else { target := path.Join(symlinkDir, filepath.Base(devLocation.UserProvidedPath)) - return devLocation.UserProvidedPath, target, false, nil + return devLocation.UserProvidedPath, target, false } }Option 2: Add validation
func getSymlinkSourceAndTarget(devLocation internal.DiskLocation, symlinkDir string) (string, string, bool, error) { if devLocation.DiskID != "" { target := path.Join(symlinkDir, filepath.Base(devLocation.DiskID)) return devLocation.DiskID, target, true, nil - } else { + } + if devLocation.UserProvidedPath != "" { target := path.Join(symlinkDir, filepath.Base(devLocation.UserProvidedPath)) return devLocation.UserProvidedPath, target, false, nil } + return "", "", false, fmt.Errorf("no DiskID or UserProvidedPath for device %s", devLocation.DiskNamePath) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/diskmaker/controllers/lv/reconcile.go` around lines 546 - 554, The function getSymlinkSourceAndTarget declares an error return but never returns a non-nil error; either remove the error from the signature or add validation and return errors when inputs are invalid. Fix option A: remove the error return and update all callers of getSymlinkSourceAndTarget to handle the new (string, string, bool) signature. Fix option B: add input validation inside getSymlinkSourceAndTarget (check that symlinkDir is not empty and that either devLocation.DiskID or devLocation.UserProvidedPath is non-empty), and return a descriptive error (e.g., fmt.Errorf("invalid symlink inputs: ...")) when validation fails, keeping the existing signature; update callers to handle the possible error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/diskmaker/controllers/lv/reconcile.go`:
- Around line 546-554: The function getSymlinkSourceAndTarget declares an error
return but never returns a non-nil error; either remove the error from the
signature or add validation and return errors when inputs are invalid. Fix
option A: remove the error return and update all callers of
getSymlinkSourceAndTarget to handle the new (string, string, bool) signature.
Fix option B: add input validation inside getSymlinkSourceAndTarget (check that
symlinkDir is not empty and that either devLocation.DiskID or
devLocation.UserProvidedPath is non-empty), and return a descriptive error
(e.g., fmt.Errorf("invalid symlink inputs: ...")) when validation fails, keeping
the existing signature; update callers to handle the possible error.
In `@pkg/internal/device_link_handler.go`:
- Around line 256-258: The getFilesystemUUID function logs a diagnostic message
at Info level on every reconciliation which can be noisy; change the klog.InfoS
call inside getFilesystemUUID to klog.V(4).InfoS so the message is emitted only
when verbosity >=4, preserving the same message and structured key
("devicePath") to reduce log verbosity for frequently reconciled devices.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 014cd4ad-3dc8-4cdf-9656-9ce3830560ac
📒 Files selected for processing (4)
pkg/diskmaker/controllers/lv/reconcile.gopkg/diskmaker/controllers/lv/reconcile_test.gopkg/diskmaker/controllers/lvset/reconcile.gopkg/internal/device_link_handler.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/diskmaker/controllers/lvset/reconcile.go
- pkg/diskmaker/controllers/lv/reconcile_test.go
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
pkg/diskmaker/controllers/lv/reconcile_test.go (1)
375-380:⚠️ Potential issue | 🟠 MajorThis “does not create LVDL” case is only proving the nil-owner shortcut.
r.localVolumeis never populated here, so a missing LVDL is skipped becauseownerObj == nilinpkg/internal/device_link_handler.go, not because rejected devices are actually exempt from creation. If the intent is to lock down real reconciler behavior, this case needs to run with a real LocalVolume owner; otherwise it should be renamed to match what it is testing.Also applies to: 440-443
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/diskmaker/controllers/lv/reconcile_test.go` around lines 375 - 380, The test "does not create lvdl while processing rejected devices" in reconcile_test.go is exercising the nil-owner shortcut because r.localVolume is never set; either populate r.localVolume with a real LocalVolume object so the handler in pkg/internal/device_link_handler.go evaluates rejected-device logic with a non-nil ownerObj, or rename the test to state it verifies the nil-owner shortcut. Specifically, set r.localVolume (or construct a LocalVolume and assign as owner) before calling the reconcile flow referenced by the test, and apply the same fix to the duplicated case around the lines corresponding to the other occurrence (the case at 440-443). Ensure references to ownerObj in device_link_handler.go are exercised by the non-nil owner when asserting that rejected devices are exempt from LVDL creation.pkg/internal/device_link_handler.go (1)
53-67:⚠️ Potential issue | 🟠 MajorReconcile
OwnerReferenceson existing LVDLs too.Both paths return an existing object unchanged. That leaves preexisting/manual
LocalVolumeDeviceLinkobjects without the LocalVolume/LocalVolumeSet owner, andApplyStatuscannot fix that later because it only updates the status subresource. The GC-based deletion model added in this PR will not work for those objects.Also applies to: 213-232
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/internal/device_link_handler.go` around lines 53 - 67, When the code returns an existing LocalVolumeDeviceLink unchanged (the branch that checks existing.Spec.PersistentVolumeName == pvName and the other similar return), also reconcile and persist OwnerReferences on that existing object before returning: compute the desired ownerRefs for the LocalVolume/LocalVolumeSet (use the same owner-ref construction logic you use when creating a new LVDL), compare/merge with existing.OwnerReferences, and if they differ, DeepCopy the existing object, set the merged OwnerReferences and call dl.client.Update(ctx, updatedCopy) (handle and wrap the error similarly). Apply the same owner-ref reconciliation to the other return path that currently returns without updating so preexisting/manual LVDLs get the correct owners.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/internal/diskutil.go`:
- Around line 447-448: GetPVCreationLock builds the find invocation incorrectly
by joining dirs into one pathname; change the CmdExecutor.Command call to pass
each search directory as a separate argument instead of strings.Join(dirs, " "),
i.e. construct the args slice as ["-L", <each dir from dirs...>, "-samefile",
path] and call CmdExecutor.Command with that variadic args list so
executeCmdWithCombinedOutput runs find over all roots correctly.
---
Duplicate comments:
In `@pkg/diskmaker/controllers/lv/reconcile_test.go`:
- Around line 375-380: The test "does not create lvdl while processing rejected
devices" in reconcile_test.go is exercising the nil-owner shortcut because
r.localVolume is never set; either populate r.localVolume with a real
LocalVolume object so the handler in pkg/internal/device_link_handler.go
evaluates rejected-device logic with a non-nil ownerObj, or rename the test to
state it verifies the nil-owner shortcut. Specifically, set r.localVolume (or
construct a LocalVolume and assign as owner) before calling the reconcile flow
referenced by the test, and apply the same fix to the duplicated case around the
lines corresponding to the other occurrence (the case at 440-443). Ensure
references to ownerObj in device_link_handler.go are exercised by the non-nil
owner when asserting that rejected devices are exempt from LVDL creation.
In `@pkg/internal/device_link_handler.go`:
- Around line 53-67: When the code returns an existing LocalVolumeDeviceLink
unchanged (the branch that checks existing.Spec.PersistentVolumeName == pvName
and the other similar return), also reconcile and persist OwnerReferences on
that existing object before returning: compute the desired ownerRefs for the
LocalVolume/LocalVolumeSet (use the same owner-ref construction logic you use
when creating a new LVDL), compare/merge with existing.OwnerReferences, and if
they differ, DeepCopy the existing object, set the merged OwnerReferences and
call dl.client.Update(ctx, updatedCopy) (handle and wrap the error similarly).
Apply the same owner-ref reconciliation to the other return path that currently
returns without updating so preexisting/manual LVDLs get the correct owners.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 402063dc-1b9b-45ab-93f1-96ea1b50ac10
⛔ Files ignored due to path filters (2)
vendor/k8s.io/utils/exec/testing/fake_exec.gois excluded by!vendor/**,!**/vendor/**vendor/modules.txtis excluded by!vendor/**,!**/vendor/**
📒 Files selected for processing (8)
pkg/diskmaker/controllers/lv/create_pv_test.gopkg/diskmaker/controllers/lv/reconcile_test.gopkg/diskmaker/controllers/lvset/create_pv_test.gopkg/diskmaker/discovery/discovery_test.gopkg/internal/device_link_handler.gopkg/internal/device_link_handler_test.gopkg/internal/diskutil.gopkg/internal/diskutil_test.go
✅ Files skipped from review due to trivial changes (1)
- pkg/internal/device_link_handler_test.go
0e101d2 to
49e5f17
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
pkg/diskmaker/controllers/lv/reconcile_test.go (1)
440-443:⚠️ Potential issue | 🟠 MajorInitialize the reconciler owner in this test setup.
The
"does not create lvdl while processing rejected devices"case currently passes becausegetFakeDiskMakerleavesr.localVolumeunset, soprocessRejectedDevicesForDeviceLinksnever exercises the real owner-backed path. That can hide rejected-device LVDL creation in normal reconcile flow.Patch sketch
+ lv := &localv1.LocalVolume{ + ObjectMeta: metav1.ObjectMeta{ + Name: "test-lv", + Namespace: testNamespace, + }, + } + objs = append(objs, lv) + r, tcCtx := getFakeDiskMaker(t, tmpRoot, objs...) + r.localVolume = lv r.runtimeConfig.Node = &corev1.Node{ObjectMeta: metav1.ObjectMeta{Name: testNodeName}} r.runtimeConfig.Namespace = testNamespaceAlso applies to: 523-526
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/diskmaker/controllers/lv/reconcile_test.go` around lines 440 - 443, The test's reconciler isn't setting r.localVolume so processRejectedDevicesForDeviceLinks skips the owner-backed path; in the test case that checks "does not create lvdl while processing rejected devices" (and similarly at the other affected lines) initialize the reconciler owner by assigning a LocalVolume object to r.localVolume after calling getFakeDiskMaker (e.g., create a minimal core object with the test namespace/name) so processRejectedDevicesForDeviceLinks executes the real owner-backed logic; update the test setup where getFakeDiskMaker is used to set r.localVolume and ensure its metadata (Name/Namespace/UID) matches expectations used by processRejectedDevicesForDeviceLinks.pkg/internal/diskutil.go (1)
447-450:⚠️ Potential issue | 🔴 CriticalBuild the
findargv slice instead of joining the directories.Line 447 turns all search roots into one pathname.
utilexec.Commanddoes not shell-split that string, so multi-directory checks can miss existing symlinks and let the same device be claimed twice.Patch sketch
func GetMatchingSymlinksInDirs(path string, dirs ...string) ([]string, error) { - cmd := CmdExecutor.Command("find", "-L", strings.Join(dirs, " "), "-samefile", path) + args := append([]string{"-L"}, dirs...) + args = append(args, "-samefile", path) + cmd := CmdExecutor.Command("find", args...) output, err := executeCmdWithCombinedOutput(cmd)Run this to verify the current argv construction versus the fixed one. Expected result:
joinedfails with “No such file or directory”, whileseparatefinds both symlinks.#!/bin/bash set -euo pipefail sed -n '444,450p' pkg/internal/diskutil.go tmpdir="$(mktemp -d)" mkdir -p "$tmpdir/foo" "$tmpdir/bar" touch "$tmpdir/dev" ln -s "$tmpdir/dev" "$tmpdir/foo/link1" ln -s "$tmpdir/dev" "$tmpdir/bar/link2" python - "$tmpdir" <<'PY' import os import subprocess import sys root = sys.argv[1] dev = os.path.join(root, "dev") joined = os.path.join(root, "foo") + " " + os.path.join(root, "bar") cases = { "joined": ["find", "-L", joined, "-samefile", dev], "separate": ["find", "-L", os.path.join(root, "foo"), os.path.join(root, "bar"), "-samefile", dev], } for name, argv in cases.items(): proc = subprocess.run(argv, capture_output=True, text=True) print(f"== {name} ==") print("argv:", argv) print("rc:", proc.returncode) print("stdout:", proc.stdout.strip()) print("stderr:", proc.stderr.strip()) PY🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/internal/diskutil.go` around lines 447 - 450, The code incorrectly joins dirs into one argument when calling CmdExecutor.Command, so build the argv slice instead: create an args []string starting with "-L", append each directory from dirs separately, then append "-samefile" and path, and pass those as separate arguments to CmdExecutor.Command (used with executeCmdWithCombinedOutput). Update the call around CmdExecutor.Command and ensure executeCmdWithCombinedOutput(cmd) receives the command constructed from the args slice rather than a single joined-directory string.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/common/provisioner_utils.go`:
- Around line 122-127: The code currently calls deviceHandler.Create(...) to
persist a LocalVolumeDeviceLink (LVDL) too early (deviceHandler.Create, variable
deviceHandler), before performing final mountpoint/capacity validations and
before writing the PersistentVolume (PV); move the Create call so it executes
only after all remaining PV validation checks succeed and after the PV write
completes to avoid leaving stale LVDL objects referencing non-existent PVs.
Locate every early Create usage (the shown deviceHandler.Create call and the
other similar Create calls later in the file around the referenced blocks) and
refactor so that: perform mountpoint and capacity checks, write the PV, and only
then call deviceHandler.Create(ctx, pvName, runtimeConfig.Namespace, obj);
ensure error handling remains intact and that rollback/cleanup is considered if
PV write fails.
---
Duplicate comments:
In `@pkg/diskmaker/controllers/lv/reconcile_test.go`:
- Around line 440-443: The test's reconciler isn't setting r.localVolume so
processRejectedDevicesForDeviceLinks skips the owner-backed path; in the test
case that checks "does not create lvdl while processing rejected devices" (and
similarly at the other affected lines) initialize the reconciler owner by
assigning a LocalVolume object to r.localVolume after calling getFakeDiskMaker
(e.g., create a minimal core object with the test namespace/name) so
processRejectedDevicesForDeviceLinks executes the real owner-backed logic;
update the test setup where getFakeDiskMaker is used to set r.localVolume and
ensure its metadata (Name/Namespace/UID) matches expectations used by
processRejectedDevicesForDeviceLinks.
In `@pkg/internal/diskutil.go`:
- Around line 447-450: The code incorrectly joins dirs into one argument when
calling CmdExecutor.Command, so build the argv slice instead: create an args
[]string starting with "-L", append each directory from dirs separately, then
append "-samefile" and path, and pass those as separate arguments to
CmdExecutor.Command (used with executeCmdWithCombinedOutput). Update the call
around CmdExecutor.Command and ensure executeCmdWithCombinedOutput(cmd) receives
the command constructed from the args slice rather than a single
joined-directory string.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b3a2a073-d6a2-412a-8701-8b16cdd63aab
⛔ Files ignored due to path filters (1)
vendor/k8s.io/utils/exec/testing/fake_exec.gois excluded by!vendor/**,!**/vendor/**
📒 Files selected for processing (4)
pkg/common/provisioner_utils.gopkg/diskmaker/controllers/lv/reconcile_test.gopkg/internal/device_link_handler_test.gopkg/internal/diskutil.go
✅ Files skipped from review due to trivial changes (1)
- pkg/internal/device_link_handler_test.go
| deviceHandler := internal.NewDeviceLinkHandler(args.CurrentSymlink, client) | ||
|
|
||
| _, err = deviceHandler.Create(ctx, pvName, runtimeConfig.Namespace, obj) | ||
| if err != nil { | ||
| return fmt.Errorf("error creating localvolumedevicelink object: %w", err) | ||
| } |
There was a problem hiding this comment.
Create the LVDL only after the remaining PV validations pass.
Lines 124-127 persist the LocalVolumeDeviceLink before the last mountpoint/capacity checks and before the PV write. Any later return leaves an LVDL whose persistentVolumeName never exists as a PV, so retries can accumulate stale device-link objects.
Patch sketch
deviceHandler := internal.NewDeviceLinkHandler(args.CurrentSymlink, client)
-
- _, err = deviceHandler.Create(ctx, pvName, runtimeConfig.Namespace, obj)
- if err != nil {
- return fmt.Errorf("error creating localvolumedevicelink object: %w", err)
- }
// Do not attempt to create or update existing PV's that have been released
existingPV := &corev1.PersistentVolume{ObjectMeta: metav1.ObjectMeta{Name: pvName}}
err = client.Get(ctx, types.NamespacedName{Name: pvName}, existingPV)
@@
newPV := provCommon.CreateLocalPVSpec(localPVConfig)
// Add finalizer for diskmaker symlink cleanup
controllerutil.AddFinalizer(newPV, LSOSymlinkDeleterFinalizer)
+
+ _, err = deviceHandler.Create(ctx, pvName, runtimeConfig.Namespace, obj)
+ if err != nil {
+ return fmt.Errorf("error creating localvolumedevicelink object: %w", err)
+ }
klog.InfoS("creating PV", "pvName", pvName)Also applies to: 129-165, 255-257
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/common/provisioner_utils.go` around lines 122 - 127, The code currently
calls deviceHandler.Create(...) to persist a LocalVolumeDeviceLink (LVDL) too
early (deviceHandler.Create, variable deviceHandler), before performing final
mountpoint/capacity validations and before writing the PersistentVolume (PV);
move the Create call so it executes only after all remaining PV validation
checks succeed and after the PV write completes to avoid leaving stale LVDL
objects referencing non-existent PVs. Locate every early Create usage (the shown
deviceHandler.Create call and the other similar Create calls later in the file
around the referenced blocks) and refactor so that: perform mountpoint and
capacity checks, write the PV, and only then call deviceHandler.Create(ctx,
pvName, runtimeConfig.Namespace, obj); ensure error handling remains intact and
that rollback/cleanup is considered if PV write fails.
49e5f17 to
a36f021
Compare
a36f021 to
0c70945
Compare
Fix rbac rules and bugs to enable creation of lvdl objects
Also setowner ref
refactor code to findexisting lvldl Change UpdateStatus to ApplyStatus
Refactor the code to directly use BlockDevice Add comments about processRejectedDevicesForDeviceLinks
0c70945 to
88063ca
Compare
Add utilexec libraries
88063ca to
2b33356
Compare
|
@gnufied: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
pkg/common/provisioner_utils.go
Outdated
| // DevicePath is the full path to the device (e.g. /dev/sda). | ||
| // filepath.Base is applied internally where only the kernel name is needed. | ||
| DevicePath string | ||
| SymLinkPath string |
There was a problem hiding this comment.
Indentation to be corrected:
$ gofmt -d pkg/common/provisioner_utils.go
diff pkg/common/provisioner_utils.go.orig pkg/common/provisioner_utils.go
--- pkg/common/provisioner_utils.go.orig
+++ pkg/common/provisioner_utils.go
@@ -50,9 +50,9 @@
StorageClass storagev1.StorageClass
MountPointMap sets.String
Client client.Client
- SymLinkPath string
- IDExists bool
- ExtraLabelsForPV map[string]string
+ SymLinkPath string
+ IDExists bool
+ ExtraLabelsForPV map[string]string
// CurrentSymlink points to source to which SymLinkPath points to.
CurrentSymlink string
I see you do it in the next commit, but it's better to fix it here (because the next commit has nothing to do with such fix)
| type DiskLocation struct { | ||
| // DiskNamePath stores full device name path - "/dev/sda" | ||
| DiskNamePath string | ||
| // UserProvidedPath is the path supplied by the user in the LocalVolume CR. Empty in LocalVolumeSet |
There was a problem hiding this comment.
This struct (DiskLocation) is essentially copy/pasta of what we had in pkg/diskmaker/controllers/lv/reconcile.go before this PR. This makes me think that this struct must be used somewhere beyond LV Reconcile. And this comment hints to LVS. But in fact, with all commits of this PR applied, this struct is still used only in LV reconcile.go. Please clarify what actually justifies this movement of this struct to this new file and give some details about the comment (Empty in LocalVolumeSet). I attempted to find answers myself, but couldn't.
There was a problem hiding this comment.
You are right. So a version of this PR used DiskLocation from other places and hence I had to make it public outside its own package, but then later on I think I scrapped that change but type was still exported publicly.
I can undo that if it helps.
|
|
||
| _, err = deviceHandler.ApplyStatus(ctx, pvName, runtimeConfig.Namespace, args.BlockDevice, obj) | ||
| if err != nil { | ||
| return fmt.Errorf("error updating localvolumedevicelink object: %w", err) |
There was a problem hiding this comment.
It seems to be logical to do r.eventSync.Report(r.localVolume, newDiskEvent(diskmaker.ErrorCreatingLVDL, ...) before returning here. Is this a bug or you omitted such Report() intentionally here?
There was a problem hiding this comment.
The event already gets reported up the chain from place where CreateLocalPV gets called. reporting another event here will simply cause duplicated events.
There was a problem hiding this comment.
@gnufied , sorry, I cannot find where it gets reported up the chain from place where CreateLocalPV gets called. Please give me file name and line of code.
I also made some debugging adding artificial error in the end of CreateLocalPV():
// _, err = deviceHandler.ApplyStatus(ctx, pvName, runtimeConfig.Namespace, args.BlockDevice, obj)
err = fmt.Errorf("ZZZZZ: MY ARTIFICIAL ERRROR: error updating localvolumedevicelink ZZZZZ")
if err != nil {
return fmt.Errorf("error updating localvolumedevicelink object: %w", err)
}
The result is expected:
E0326 05:36:58.837106 469396 reconcile.go:486] "could not create local PV" err="error updating localvolumedevicelink object: ZZZZZ: MY ARTIFICIAL ERRROR: error updating localvolumedevicelink ZZZZZ" deviceNameLocation={"DiskNamePath":"/dev/loop1","UserProvidedPath":"/dev/loop1","DiskID":"","BlockDevice":{"name":"loop1","kname":"loop1","type":"loop","fsType":"","size":"1073741824","rota":"0","ro":"0","rm":"0"},"ForceWipe":false}
And LVDL object is crippled (no Status part at all):
$ oc get -n openshift-local-storage LocalVolumeDeviceLink local-pv-ff140cb4 -o yaml
apiVersion: local.storage.openshift.io/v1
kind: LocalVolumeDeviceLink
metadata:
creationTimestamp: "2026-03-26T05:36:58Z"
generation: 1
name: local-pv-ff140cb4
namespace: openshift-local-storage
ownerReferences:
- apiVersion: local.storage.openshift.io/v1
controller: true
kind: LocalVolume
name: local-disks1
uid: d0d4ce6a-e8b2-4f86-b3f3-f367fac84e1e
resourceVersion: "360213"
uid: 574c3ed3-302d-4ffa-a9f1-098d005a063d
spec:
persistentVolumeName: local-pv-ff140cb4
policy: None
But processRejectedDevicesForDeviceLinks() was never called and I cannot see any other places where we report an event for diskmaker.ErrorCreatingLVDL from LV (not LVS) code-paths. Please give me some details backing up your words about "reporting another event here will simply cause duplicated events".
There was a problem hiding this comment.
processRejectedDevicesForDeviceLinks is only called for devices which were already mounted and consumed by previously created LSO PV. It won't be called for devices which has a backing PV, but it isn't used.
In - Reconcile function of lvset reconciler, where we call provisionPV:
err = r.provisionPV(ctx, lvset, blockDevice, *storageClass, mountPointMap, symlinkSourcePath, symlinkPath, idExists)
if err == common.ErrTryAgain {
requeueTime = fastRequeueTime
} else if err != nil {
msg := fmt.Sprintf("provisioning failed for %s: %v", blockDevice.Name, err)
r.eventReporter.Report(lvset, newDiskEvent(diskmaker.ErrorProvisioningDisk, msg, blockDevice.KName, corev1.EventTypeWarning))
klog.Error(msg)
continue
}
This will log any errors coming from provisionPV which in turn calls CreateLocalPV to be reported as event.
It looks like for lv/reconciler.go though, we are simply logging the error and not regenerating an event. I could fix it, but a lot of those code has changed in #598 which actually implements device relinking.
There was a problem hiding this comment.
So, in new version in #598. All errors from provisionPV call report events in both lv and lvset reconciler. See - https://github.com/gnufied/local-storage-operator/blob/implement-device-relinking/pkg/diskmaker/controllers/lv/reconcile.go#L500 and https://github.com/gnufied/local-storage-operator/blob/implement-device-relinking/pkg/diskmaker/controllers/lvset/reconcile.go#L546
Hope that answers your question.
| Namespace: diskMakerDS.Namespace, | ||
| Name: diskMakerDS.Name, | ||
| LastGeneration: diskMakerDS.Generation, | ||
| }) |
There was a problem hiding this comment.
How is this change relevant to the goal of this PR? Assuming we get rid of this change, would anything become broken?
Fixes https://issues.redhat.com/browse/STOR-2870
Created with: AI assistance for tests.