OADP-4855 Add cachePVC support for Kopia cache volume#2090
Conversation
WalkthroughThis PR introduces support for configurable cache PVCs for Kopia repository caching during restore operations. It adds a CachePVCConfig field to the DataProtectionApplication CRD, implements automatic default StorageClass resolution, updates RBAC permissions to read storageclasses, and includes comprehensive tests and documentation. Changes
Sequence DiagramsequenceDiagram
participant DPA as DPA Reconciliation
participant SC as StorageClass API
participant CM as NodeAgent ConfigMap
participant DS as NodeAgent DaemonSet
DPA->>DPA: Check if NodeAgent ConfigMap required<br/>(CachePVCConfig present?)
alt CachePVCConfig has no StorageClass
DPA->>SC: Query default StorageClass<br/>(via annotation)
SC-->>DPA: Return default SC name or not found
alt Default StorageClass found
DPA->>DPA: Inject default into<br/>CachePVCConfig copy
else No default found
DPA->>DPA: Nil or disable CachePVCConfig
end
end
DPA->>CM: Create/Update NodeAgent ConfigMap<br/>with CachePVCConfig
CM-->>DPA: ConfigMap ready with cache settings
DPA->>DS: Build NodeAgent DaemonSet<br/>with backup repo config map name
DS-->>DPA: DaemonSet configured
DPA->>DS: Deploy DaemonSet to cluster
DS-->>DPA: Deployment complete
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
|
/hold needs some more testing. |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/controller/nodeagent.go (1)
400-418:⚠️ Potential issue | 🟡 MinorAdd explicit guard for empty backup-repository config map name to match pattern in velero.go.
The code passes
backupRepoConfigMapNamedirectly toinstall.WithBackupRepoConfigMap()without verifying it's non-empty. While the name is conditionally populated, velero.go (line 435) shows the pattern used elsewhere in the codebase:if cmName.Name != ""before constructing the flag. Add the same guard before callinginstall.WithBackupRepoConfigMap()to ensure consistency and protect against unintended empty flag injection.
🤖 Fix all issues with AI agents
In `@bundle/manifests/oadp-operator.clusterserviceversion.yaml`:
- Around line 1055-1061: The RBAC rule incorrectly lists the resource
"persistentvolumerclaims" (typo) so it grants no access; update the resources
array used with the apiGroup "" to include the correct Kubernetes resource name
"persistentvolumeclaims" (and keep "persistentvolumes"), ensuring the
resources/verbs block around the rule that contains "persistentvolumes" and
"get" is updated to use "persistentvolumeclaims" instead of the misspelled
token.
In `@config/crd/bases/_.yaml`:
- Around line 1-13: The file _.yaml is an unused CRD scaffold containing empty
placeholders (apiVersion, kind, names, scope, versions) and should be deleted;
remove config/crd/bases/_.yaml from the repo (or move it to an archival
location) so dead scaffold CRD content is not kept alongside real CRDs
referenced by config/crd/kustomization.yaml.
🧹 Nitpick comments (1)
internal/controller/nodeagent.go (1)
130-145: Verify default StorageClass detection covers legacy annotation if required.
If any supported clusters still mark defaults with the beta annotation, this helper could miss them and cachePVC could be disabled. Confirm GA-only is acceptable or add a fallback check.🔧 Optional fallback for legacy default annotation
- if sc.Annotations["storageclass.kubernetes.io/is-default-class"] == "true" { + if sc.Annotations["storageclass.kubernetes.io/is-default-class"] == "true" || + sc.Annotations["storageclass.beta.kubernetes.io/is-default-class"] == "true" { return sc.Name, nil }
| --- | ||
| apiVersion: apiextensions.k8s.io/v1 | ||
| kind: CustomResourceDefinition | ||
| metadata: | ||
| annotations: | ||
| controller-gen.kubebuilder.io/version: v0.16.5 | ||
| spec: | ||
| group: "" | ||
| names: | ||
| kind: "" | ||
| plural: "" | ||
| scope: "" | ||
| versions: null |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, verify the file exists and see its content
find . -name "_.yaml" -path "*/crd/bases/*" | head -20Repository: openshift/oadp-operator
Length of output: 92
🏁 Script executed:
# Search for references to this file in kustomization.yaml files
rg "_.yaml" -g "kustomization.yaml" -B 2 -A 2Repository: openshift/oadp-operator
Length of output: 49
🏁 Script executed:
# Search for references to config/crd/bases directory in kustomization files
rg "config/crd/bases" -g "kustomization.yaml" -B 2 -A 2Repository: openshift/oadp-operator
Length of output: 49
🏁 Script executed:
# Check if config/crd/bases directory is included in any kustomization
rg "config/crd" -g "kustomization.yaml" -lRepository: openshift/oadp-operator
Length of output: 49
🏁 Script executed:
# Check kustomization directory structure to understand build patterns
find . -name "kustomization.yaml" -type f | head -20Repository: openshift/oadp-operator
Length of output: 566
🏁 Script executed:
# Search for any reference to bases/ directory in kustomization files more broadly
rg "bases/" -g "kustomization.yaml" -B 2 -A 2 | head -50Repository: openshift/oadp-operator
Length of output: 2596
🏁 Script executed:
# Check if there's any .gitignore or build configuration that might exclude this
cat config/crd/.gitignore 2>/dev/null || echo "no .gitignore in crd directory"Repository: openshift/oadp-operator
Length of output: 97
🏁 Script executed:
# Verify the file content matches what's in the review
cat -n config/crd/bases/_.yamlRepository: openshift/oadp-operator
Length of output: 388
Remove unused placeholder CRD file.
The _.yaml file is an empty scaffold that should either be populated with a real CRD or removed entirely. It is currently not referenced in config/crd/kustomization.yaml or any build configuration, so it poses no immediate risk; however, it should be cleaned up as dead code.
🤖 Prompt for AI Agents
In `@config/crd/bases/_.yaml` around lines 1 - 13, The file _.yaml is an unused
CRD scaffold containing empty placeholders (apiVersion, kind, names, scope,
versions) and should be deleted; remove config/crd/bases/_.yaml from the repo
(or move it to an archival location) so dead scaffold CRD content is not kept
alongside real CRDs referenced by config/crd/kustomization.yaml.
3081300 to
153dbff
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@docs/cache_volume_pvc_configuration.md`:
- Around line 32-34: Add a language tag to the fenced code block containing
"cacheVolumeSize = cacheLimitMB * 1.2 (rounded up to the nearest GB)" to satisfy
MD040; change the opening fence from ``` to something like ```text (or ```none)
so the block reads ```text followed by the existing line and the closing ```,
ensuring the fenced block has an explicit language tag.
In `@internal/controller/nodeagent.go`:
- Around line 130-145: The getDefaultStorageClass function currently only checks
the "storageclass.kubernetes.io/is-default-class" annotation; update it to also
consider the legacy "storageclass.beta.kubernetes.io/is-default-class"
annotation when deciding a default StorageClass. In the loop over scList.Items
(function getDefaultStorageClass) test both
sc.Annotations["storageclass.kubernetes.io/is-default-class"] and
sc.Annotations["storageclass.beta.kubernetes.io/is-default-class"] for the value
"true" (treat either as indicating default) and return sc.Name when either is
present; keep error handling and the rest of the function unchanged.
| ``` | ||
| cacheVolumeSize = cacheLimitMB * 1.2 (rounded up to the nearest GB) | ||
| ``` |
There was a problem hiding this comment.
Add a language tag to the fenced code block.
Markdownlint (MD040) flags fenced blocks without a language.
Suggested fix
-```
+```text
cacheVolumeSize = cacheLimitMB * 1.2 (rounded up to the nearest GB)
-```
+```📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ``` | |
| cacheVolumeSize = cacheLimitMB * 1.2 (rounded up to the nearest GB) | |
| ``` |
🧰 Tools
🪛 markdownlint-cli2 (0.20.0)
[warning] 32-32: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
In `@docs/cache_volume_pvc_configuration.md` around lines 32 - 34, Add a language
tag to the fenced code block containing "cacheVolumeSize = cacheLimitMB * 1.2
(rounded up to the nearest GB)" to satisfy MD040; change the opening fence from
``` to something like ```text (or ```none) so the block reads ```text followed
by the existing line and the closing ```, ensuring the fenced block has an
explicit language tag.
| // getDefaultStorageClass returns the name of the cluster's default StorageClass, if one exists. | ||
| // A StorageClass is considered default if it has the annotation | ||
| // "storageclass.kubernetes.io/is-default-class" set to "true". | ||
| func (r *DataProtectionApplicationReconciler) getDefaultStorageClass() (string, error) { | ||
| scList := &storagev1.StorageClassList{} | ||
| if err := r.Client.List(r.Context, scList); err != nil { | ||
| return "", fmt.Errorf("failed to list storage classes: %w", err) | ||
| } | ||
| for i := range scList.Items { | ||
| sc := &scList.Items[i] | ||
| if sc.Annotations["storageclass.kubernetes.io/is-default-class"] == "true" { | ||
| return sc.Name, nil | ||
| } | ||
| } | ||
| return "", nil | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for default StorageClass annotation handling to keep behavior consistent.
rg -n "is-default-class" -SRepository: openshift/oadp-operator
Length of output: 2092
🏁 Script executed:
#!/bin/bash
# Also search for StorageClass handling in general
rg -n "StorageClass" --type go -A 2 -B 2 | head -100Repository: openshift/oadp-operator
Length of output: 7344
🏁 Script executed:
#!/bin/bash
# Check the exact file and context
cat -n internal/controller/nodeagent.go | sed -n '125,150p'Repository: openshift/oadp-operator
Length of output: 1356
🌐 Web query:
Kubernetes StorageClass default annotations storageclass.beta.kubernetes.io/is-default-class
💡 Result:
storageclass.beta.kubernetes.io/is-default-class is the legacy (beta) annotation that was used to mark a StorageClass as the cluster default (so PVCs that omit storageClassName get this class). It appears in older Kubernetes examples/docs. [3]
Today, the supported/current annotation is:
metadata:
annotations:
storageclass.kubernetes.io/is-default-class: "true"Kubernetes treats any other value (or absence) as not default. [2]
To switch defaults:
# remove default from an existing class
kubectl patch storageclass standard -p \
'{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
# set a new default
kubectl patch storageclass gold -p \
'{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'[2]
Notes:
- If more than one
StorageClassis marked default, Kubernetes uses the most recently created default class for PVCs that omitstorageClassName. [1] - Defaulting is applied by the DefaultStorageClass admission controller. [4]
Sources: Kubernetes StorageClasses concept doc [1], “Change the default StorageClass” task doc [2], older Kubernetes blog showing the beta annotation [3], dynamic provisioning doc [4].
Consider supporting the legacy StorageClass default annotation for compatibility.
While Kubernetes currently uses storageclass.kubernetes.io/is-default-class, the legacy annotation storageclass.beta.kubernetes.io/is-default-class was historically supported. Some older clusters may still use it, causing the default StorageClass lookup to fail and unexpectedly disabling cache PVC. Adding support for both annotations is a simple, non-invasive defensive measure. Note that similar code elsewhere in this project (e.g., tests/e2e/lib/virt_storage_helpers.go) also uses only the stable annotation.
Proposed fix
- if sc.Annotations["storageclass.kubernetes.io/is-default-class"] == "true" {
+ if sc.Annotations["storageclass.kubernetes.io/is-default-class"] == "true" ||
+ sc.Annotations["storageclass.beta.kubernetes.io/is-default-class"] == "true" {
return sc.Name, nil
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // getDefaultStorageClass returns the name of the cluster's default StorageClass, if one exists. | |
| // A StorageClass is considered default if it has the annotation | |
| // "storageclass.kubernetes.io/is-default-class" set to "true". | |
| func (r *DataProtectionApplicationReconciler) getDefaultStorageClass() (string, error) { | |
| scList := &storagev1.StorageClassList{} | |
| if err := r.Client.List(r.Context, scList); err != nil { | |
| return "", fmt.Errorf("failed to list storage classes: %w", err) | |
| } | |
| for i := range scList.Items { | |
| sc := &scList.Items[i] | |
| if sc.Annotations["storageclass.kubernetes.io/is-default-class"] == "true" { | |
| return sc.Name, nil | |
| } | |
| } | |
| return "", nil | |
| } | |
| // getDefaultStorageClass returns the name of the cluster's default StorageClass, if one exists. | |
| // A StorageClass is considered default if it has the annotation | |
| // "storageclass.kubernetes.io/is-default-class" set to "true". | |
| func (r *DataProtectionApplicationReconciler) getDefaultStorageClass() (string, error) { | |
| scList := &storagev1.StorageClassList{} | |
| if err := r.Client.List(r.Context, scList); err != nil { | |
| return "", fmt.Errorf("failed to list storage classes: %w", err) | |
| } | |
| for i := range scList.Items { | |
| sc := &scList.Items[i] | |
| if sc.Annotations["storageclass.kubernetes.io/is-default-class"] == "true" || | |
| sc.Annotations["storageclass.beta.kubernetes.io/is-default-class"] == "true" { | |
| return sc.Name, nil | |
| } | |
| } | |
| return "", nil | |
| } |
🤖 Prompt for AI Agents
In `@internal/controller/nodeagent.go` around lines 130 - 145, The
getDefaultStorageClass function currently only checks the
"storageclass.kubernetes.io/is-default-class" annotation; update it to also
consider the legacy "storageclass.beta.kubernetes.io/is-default-class"
annotation when deciding a default StorageClass. In the loop over scList.Items
(function getDefaultStorageClass) test both
sc.Annotations["storageclass.kubernetes.io/is-default-class"] and
sc.Annotations["storageclass.beta.kubernetes.io/is-default-class"] for the value
"true" (treat either as indicating default) and return sc.Name when either is
present; keep error handling and the rest of the function unchanged.
153dbff to
d47c738
Compare
|
/unhold |
|
Below is the 100% AI driven manual test plan and execution performed by the Claude OADP-4855 Manual Test Plan: cachePVC Support for Kopia Cache VolumeOverviewThis test plan validates the Cluster Environment
Prerequisites
Execution Summary
Execution Date: 2026-02-23 Key Findings
Test Case 1: DPA with cachePVC using explicit StorageClassObjective: Verify that specifying DPA Configuration ( apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
name: nac-testing-dpa
namespace: openshift-adp
spec:
configuration:
velero:
defaultPlugins:
- openshift
- aws
resourceTimeout: 10m
nodeAgent:
enable: true
uploaderType: kopia
cachePVC:
storageClass: ibmc-vpc-block-5iops-tier
backupLocations:
- name: default
velero:
provider: aws
default: true
objectStorage:
bucket: mpryc-nac-first
prefix: default-nac-bucket
config:
region: eu-central-1
credential:
key: default
name: cloud-credentials-1Steps:
Expected Results:
Cleanup: Result: PASSED
Test Case 2: DPA with cachePVC using default StorageClass (empty object)Objective: Verify that specifying DPA Configuration ( apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
name: nac-testing-dpa
namespace: openshift-adp
spec:
configuration:
velero:
defaultPlugins:
- openshift
- aws
resourceTimeout: 10m
nodeAgent:
enable: true
uploaderType: kopia
cachePVC: {}
backupLocations:
- name: default
velero:
provider: aws
default: true
objectStorage:
bucket: mpryc-nac-first
prefix: default-nac-bucket
config:
region: eu-central-1
credential:
key: default
name: cloud-credentials-1Steps:
Expected Results:
Cleanup: Result: PASSED
Test Case 3: DPA with cachePVC including residentThresholdInMBObjective: Verify that DPA Configuration ( apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
name: nac-testing-dpa
namespace: openshift-adp
spec:
configuration:
velero:
defaultPlugins:
- openshift
- aws
resourceTimeout: 10m
nodeAgent:
enable: true
uploaderType: kopia
cachePVC:
storageClass: ibmc-vpc-block-5iops-tier
residentThresholdInMB: 1024
backupLocations:
- name: default
velero:
provider: aws
default: true
objectStorage:
bucket: mpryc-nac-first
prefix: default-nac-bucket
config:
region: eu-central-1
credential:
key: default
name: cloud-credentials-1Steps:
Expected Results:
Cleanup: Result: PASSED
Test Case 4: DPA without cachePVC (backward compatibility)Objective: Verify that omitting DPA Configuration ( apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
name: nac-testing-dpa
namespace: openshift-adp
spec:
configuration:
velero:
defaultPlugins:
- openshift
- aws
resourceTimeout: 10m
nodeAgent:
enable: true
uploaderType: kopia
backupLocations:
- name: default
velero:
provider: aws
default: true
objectStorage:
bucket: mpryc-nac-first
prefix: default-nac-bucket
config:
region: eu-central-1
credential:
key: default
name: cloud-credentials-1Steps:
Expected Results:
Cleanup: Result: PASSED
Test Case 5: DPA update - add cachePVC to existing DPAObjective: Verify that adding Steps:
Expected Results:
Cleanup: Result: PASSED
Test Case 6: DPA update - remove cachePVC from existing DPAObjective: Verify that removing Steps:
Expected Results:
Cleanup: Result: PASSED
Test Case 7: DPA with cachePVC and non-existent StorageClassObjective: Verify behavior when a non-existent StorageClass is specified. The operator should still reconcile (the StorageClass is only used at restore time by Velero, not validated by the operator at DPA creation time). DPA Configuration ( apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
name: nac-testing-dpa
namespace: openshift-adp
spec:
configuration:
velero:
defaultPlugins:
- openshift
- aws
resourceTimeout: 10m
nodeAgent:
enable: true
uploaderType: kopia
cachePVC:
storageClass: nonexistent-storage-class
backupLocations:
- name: default
velero:
provider: aws
default: true
objectStorage:
bucket: mpryc-nac-first
prefix: default-nac-bucket
config:
region: eu-central-1
credential:
key: default
name: cloud-credentials-1Steps:
Expected Results:
Cleanup: Result: PASSED
Test Case 8: End-to-end backup and restore with cachePVCObjective: Perform an actual backup and restore cycle with cachePVC enabled to verify the full flow including cache PVC creation during restore. Steps:
Expected Results:
Cleanup: Result: PASSED
Note: To fully exercise dynamic cache PVC creation, a test workload with PVC-backed persistent volumes would be needed. The cache PVC feature targets DataDownload and PodVolumeRestore operations that involve Kopia repository cache, which only occurs during data movement of actual persistent volume data. |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
docs/cache_volume_pvc_configuration.md (1)
32-34: Add a language tag to the fenced code block.Markdownlint (MD040) flags fenced blocks without a language. Use
textornonefor non-code formulas.Suggested fix
-``` +```text cacheVolumeSize = cacheLimitMB * 1.2 (rounded up to the nearest GB)</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@docs/cache_volume_pvc_configuration.mdaround lines 32 - 34, The fenced code
block containing the formula "cacheVolumeSize = cacheLimitMB * 1.2 (rounded up
to the nearest GB)" should include a language tag to satisfy markdownlint MD040;
update that block to use a non-code language tag such as text or none (e.g.,
change the opening fence to ```text) so the snippet is properly tagged—look for
the fenced block with the formula in docs/cache_volume_pvc_configuration.md and
add the language identifier.</details> </blockquote></details> <details> <summary>internal/controller/nodeagent.go (1)</summary><blockquote> `130-145`: **Consider supporting the legacy StorageClass default annotation for compatibility.** The function only checks `storageclass.kubernetes.io/is-default-class`. Some older clusters may still use the legacy annotation `storageclass.beta.kubernetes.io/is-default-class`, causing the default StorageClass lookup to fail unexpectedly. <details> <summary>Proposed fix</summary> ```diff for i := range scList.Items { sc := &scList.Items[i] - if sc.Annotations["storageclass.kubernetes.io/is-default-class"] == "true" { + if sc.Annotations["storageclass.kubernetes.io/is-default-class"] == "true" || + sc.Annotations["storageclass.beta.kubernetes.io/is-default-class"] == "true" { return sc.Name, nil } } ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@internal/controller/nodeagent.go` around lines 130 - 145, The getDefaultStorageClass function only checks the modern annotation key; update getDefaultStorageClass to also consider the legacy key "storageclass.beta.kubernetes.io/is-default-class" when iterating scList.Items (sc variable) and ensure sc.Annotations is nil-safe before reading keys; return sc.Name when either annotation equals "true" (case-insensitive if desired) so older clusters that use the legacy annotation are supported. ``` </details> </blockquote></details> </blockquote></details> <details> <summary>🧹 Nitpick comments (1)</summary><blockquote> <details> <summary>config/crd/bases/oadp.openshift.io_dataprotectionapplications.yaml (1)</summary><blockquote> `261-275`: **Consider adding `minimum: 0` constraint to `residentThresholdInMB` for consistency.** The sibling field `cacheLimitMB` (line 259) includes a `minimum: 0` constraint, but `residentThresholdInMB` does not. Negative thresholds are semantically invalid, and adding this constraint would provide consistent schema-level validation and better API documentation. <details> <summary>💡 Suggested change</summary> ```diff cachePVC: description: |- CachePVCConfig configures a dedicated PVC for Kopia repository cache during restore operations. When set, cache data is stored on a provisioned PVC instead of the pod's root filesystem, preventing ephemeral storage exhaustion on nodes during concurrent restores. If storageClass is omitted, the cluster's default StorageClass is used. properties: residentThresholdInMB: description: ResidentThresholdInMB specifies the minimum size of the backup data to create cache PVC format: int64 + minimum: 0 type: integer storageClass: description: StorageClass specifies the storage class for cache PVC type: string type: object ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@config/crd/bases/oadp.openshift.io_dataprotectionapplications.yaml` around lines 261 - 275, The schema for cachePVC's residentThresholdInMB allows negative values; add a JSON schema constraint "minimum: 0" to residentThresholdInMB to match the sibling cacheLimitMB and prevent negative thresholds. Locate the cachePVC object (properties -> residentThresholdInMB) in the CRD and add the minimum: 0 key alongside type/format so the field is validated as non-negative. ``` </details> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against the current code and only fix it if needed.
Inline comments:
In@bundle/manifests/oadp.openshift.io_dataprotectionapplications.yaml:
- Around line 261-275: Add a non-negative validation for the
residentThresholdInMB field in the cachePVC schema: update the cachePVC
properties so that residentThresholdInMB (under cachePVC) includes "minimum: 0"
alongside its existing type/format to enforce >=0 validation for the
ResidentThresholdInMB value.
Duplicate comments:
In@docs/cache_volume_pvc_configuration.md:
- Around line 32-34: The fenced code block containing the formula
"cacheVolumeSize = cacheLimitMB * 1.2 (rounded up to the nearest GB)" should
include a language tag to satisfy markdownlint MD040; update that block to use a
non-code language tag such as text or none (e.g., change the opening fence toformula in docs/cache_volume_pvc_configuration.md and add the language identifier. In `@internal/controller/nodeagent.go`: - Around line 130-145: The getDefaultStorageClass function only checks the modern annotation key; update getDefaultStorageClass to also consider the legacy key "storageclass.beta.kubernetes.io/is-default-class" when iterating scList.Items (sc variable) and ensure sc.Annotations is nil-safe before reading keys; return sc.Name when either annotation equals "true" (case-insensitive if desired) so older clusters that use the legacy annotation are supported. --- Nitpick comments: In `@config/crd/bases/oadp.openshift.io_dataprotectionapplications.yaml`: - Around line 261-275: The schema for cachePVC's residentThresholdInMB allows negative values; add a JSON schema constraint "minimum: 0" to residentThresholdInMB to match the sibling cacheLimitMB and prevent negative thresholds. Locate the cachePVC object (properties -> residentThresholdInMB) in the CRD and add the minimum: 0 key alongside type/format so the field is validated as non-negative.
ℹ️ Review info
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum📒 Files selected for processing (14)
api/v1alpha1/dataprotectionapplication_types.gobundle/manifests/oadp-operator.clusterserviceversion.yamlbundle/manifests/oadp.openshift.io_dataprotectionapplications.yamlbundle/manifests/oadp.openshift.io_dataprotectiontests.yamlbundle/manifests/velero.io_backupstoragelocations.yamlconfig/crd/bases/oadp.openshift.io_dataprotectionapplications.yamlconfig/crd/bases/oadp.openshift.io_dataprotectiontests.yamlconfig/crd/bases/velero.io_backupstoragelocations.yamlconfig/rbac/role.yamldocs/cache_volume_pvc_configuration.mdgo.modinternal/controller/dataprotectionapplication_controller.gointernal/controller/nodeagent.gointernal/controller/nodeagent_test.go🚧 Files skipped from review as they are similar to previous changes (3)
- internal/controller/dataprotectionapplication_controller.go
- bundle/manifests/oadp-operator.clusterserviceversion.yaml
- bundle/manifests/velero.io_backupstoragelocations.yaml
| cachePVC: | ||
| description: |- | ||
| CachePVCConfig configures a dedicated PVC for Kopia repository cache during restore operations. | ||
| When set, cache data is stored on a provisioned PVC instead of the pod's root filesystem, | ||
| preventing ephemeral storage exhaustion on nodes during concurrent restores. | ||
| If storageClass is omitted, the cluster's default StorageClass is used. | ||
| properties: | ||
| residentThresholdInMB: | ||
| description: ResidentThresholdInMB specifies the minimum size of the backup data to create cache PVC | ||
| format: int64 | ||
| type: integer | ||
| storageClass: | ||
| description: StorageClass specifies the storage class for cache PVC | ||
| type: string | ||
| type: object |
There was a problem hiding this comment.
Add non-negative validation for residentThresholdInMB.
A negative threshold is nonsensical and could lead to unexpected controller behavior; add minimum: 0 for schema validation.
🛠️ Suggested schema tweak
cachePVC:
description: |-
CachePVCConfig configures a dedicated PVC for Kopia repository cache during restore operations.
When set, cache data is stored on a provisioned PVC instead of the pod's root filesystem,
preventing ephemeral storage exhaustion on nodes during concurrent restores.
If storageClass is omitted, the cluster's default StorageClass is used.
properties:
residentThresholdInMB:
description: ResidentThresholdInMB specifies the minimum size of the backup data to create cache PVC
format: int64
+ minimum: 0
type: integer📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| cachePVC: | |
| description: |- | |
| CachePVCConfig configures a dedicated PVC for Kopia repository cache during restore operations. | |
| When set, cache data is stored on a provisioned PVC instead of the pod's root filesystem, | |
| preventing ephemeral storage exhaustion on nodes during concurrent restores. | |
| If storageClass is omitted, the cluster's default StorageClass is used. | |
| properties: | |
| residentThresholdInMB: | |
| description: ResidentThresholdInMB specifies the minimum size of the backup data to create cache PVC | |
| format: int64 | |
| type: integer | |
| storageClass: | |
| description: StorageClass specifies the storage class for cache PVC | |
| type: string | |
| type: object | |
| cachePVC: | |
| description: |- | |
| CachePVCConfig configures a dedicated PVC for Kopia repository cache during restore operations. | |
| When set, cache data is stored on a provisioned PVC instead of the pod's root filesystem, | |
| preventing ephemeral storage exhaustion on nodes during concurrent restores. | |
| If storageClass is omitted, the cluster's default StorageClass is used. | |
| properties: | |
| residentThresholdInMB: | |
| description: ResidentThresholdInMB specifies the minimum size of the backup data to create cache PVC | |
| format: int64 | |
| minimum: 0 | |
| type: integer | |
| storageClass: | |
| description: StorageClass specifies the storage class for cache PVC | |
| type: string | |
| type: object |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bundle/manifests/oadp.openshift.io_dataprotectionapplications.yaml` around
lines 261 - 275, Add a non-negative validation for the residentThresholdInMB
field in the cachePVC schema: update the cachePVC properties so that
residentThresholdInMB (under cachePVC) includes "minimum: 0" alongside its
existing type/format to enforce >=0 validation for the ResidentThresholdInMB
value.
|
/retest |
|
@mpryc LGTM but now needs a rebase.. :( sorry.. |
Add cachePVC configuration to the DPA CRD under spec.configuration.nodeAgent, enabling dedicated PVC-backed cache volumes for Kopia repository cache during data movement restore operations. This prevents ephemeral storage exhaustion on nodes with limited root filesystem space during concurrent restores. - Add CachePVCConfig field to NodeAgentConfigMapSettings using upstream types.CachePVC directly to prevent field name drift with Velero - Serialize cachePVC config into the node-agent ConfigMap so Velero provisions per-restore cache PVCs instead of using pod root filesystem - Add default StorageClass resolution when cachePVC is configured without explicit storageClass, falling back to the cluster's default SC - Pass backup-repository-configmap to node-agent DaemonSet for cache volume size calculation (cacheLimitMB * 1.2) - Add RBAC for StorageClass list/get/watch - Add unit tests for ConfigMap assembly, default SC resolution, isNodeAgentCMRequired, and DaemonSet build with backup-repo configmap - Add user documentation with configuration examples Signed-off-by: Michal Pryc <mpryc@redhat.com> Co-authored-by: Claude <noreply@anthropic.com>
d47c738 to
4da4e85
Compare
|
To be 100% AI driven, rebase process was also performed by the Claude. |
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
docs/cache_volume_pvc_configuration.md (1)
32-34:⚠️ Potential issue | 🟡 MinorAdd a language tag to the fenced code block.
📝 Suggested fix
-``` +```text cacheVolumeSize = cacheLimitMB * 1.2 (rounded up to the nearest GB) -``` +```🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/cache_volume_pvc_configuration.md` around lines 32 - 34, The fenced code block containing the expression "cacheVolumeSize = cacheLimitMB * 1.2 (rounded up to the nearest GB)" should include a language tag (e.g., "text" or "bash") after the opening ``` so the block renders correctly; update the opening fence from ``` to ```text (or another appropriate tag) around that line in docs/cache_volume_pvc_configuration.md.
🧹 Nitpick comments (2)
internal/controller/nodeagent.go (1)
130-145: Make default StorageClass selection deterministic when multiple defaults exist.List order isn’t guaranteed, so “first match” can be non‑deterministic. Consider choosing the most recently created default (or surfacing a warning).
🔧 Possible deterministic selection
- for i := range scList.Items { - sc := &scList.Items[i] - if sc.Annotations["storageclass.kubernetes.io/is-default-class"] == "true" { - return sc.Name, nil - } - } - return "", nil + var latest *storagev1.StorageClass + for i := range scList.Items { + sc := &scList.Items[i] + if sc.Annotations["storageclass.kubernetes.io/is-default-class"] == "true" { + if latest == nil || sc.CreationTimestamp.Time.After(latest.CreationTimestamp.Time) { + latest = sc + } + } + } + if latest != nil { + return latest.Name, nil + } + return "", nil🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/nodeagent.go` around lines 130 - 145, The getDefaultStorageClass function currently returns the "first" default StorageClass which is non-deterministic; change it to collect all StorageClasses whose annotation "storageclass.kubernetes.io/is-default-class" == "true" and pick the most recently created one deterministically (e.g., compare each matching StorageClass's CreationTimestamp and return the one with the latest timestamp). Update the logic inside getDefaultStorageClass (use scList.Items, sc.Annotations, and sc.CreationTimestamp) to choose the newest default and return its Name, keeping the same error handling for r.Client.List; optionally surface a warning if multiple defaults were found before returning the chosen one.internal/controller/nodeagent_test.go (1)
2050-2084: Consider adding a test case fordisableFsBackup: nil.The function signature takes
*bool, and anilpointer is a distinct state fromfalse. Currently all three cases pass an explicitptr.To(true/false). Adding a nil-pointer case (and an empty-config+nil case) would close that coverage gap.✅ Suggested additional cases
{ name: "CachePVCConfig set, fs-backup enabled, CM should be required", config: oadpv1alpha1.NodeAgentConfigMapSettings{CachePVCConfig: &velerotypes.CachePVC{StorageClass: "gp3-csi"}}, disableFsBackup: ptr.To(false), want: true, }, + { + name: "CachePVCConfig set, disableFsBackup nil, CM should be required", + config: oadpv1alpha1.NodeAgentConfigMapSettings{CachePVCConfig: &velerotypes.CachePVC{StorageClass: "gp3-csi"}}, + disableFsBackup: nil, + want: true, + }, + { + name: "No config set, disableFsBackup nil, CM should not be required", + config: oadpv1alpha1.NodeAgentConfigMapSettings{}, + disableFsBackup: nil, + want: false, + },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/nodeagent_test.go` around lines 2050 - 2084, Add two table-driven test cases in Test_isNodeAgentCMRequired_CachePVCConfig to cover a nil disableFsBackup pointer: one where config has CachePVCConfig set and disableFsBackup is nil (expect want: true), and one with an empty config and disableFsBackup nil (expect want: false); update the tests slice in Test_isNodeAgentCMRequired_CachePVCConfig so the new entries reference isNodeAgentCMRequired and use ptr.Nil (or simply nil) for disableFsBackup to ensure the function behavior for a nil *bool is exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/crd/bases/oadp.openshift.io_dataprotectionapplications.yaml`:
- Around line 268-272: The CRD schema for residentThresholdInMB allows negative
values; update the OpenAPI schema entry for residentThresholdInMB to include a
minimum: 0 (same style as cacheLimitMB) so it is validated as a non-negative
integer (keep type: integer and format: int64). Locate the residentThresholdInMB
property in the CRD and add the minimum field to prevent invalid CRs.
In `@internal/controller/nodeagent_test.go`:
- Around line 2263-2267: The test currently does a bare type assertion
cachePVC.(map[string]interface{}) which can panic on unexpected JSON shapes;
change it to the comma-ok form: cachePVCMap, ok :=
cachePVC.(map[string]interface{}), then assert require.True(t, ok, "cachePVC
should be a JSON object in ConfigMap") (immediately after the existing presence
check) before accessing cachePVCMap["storageClass"] so the test fails cleanly
instead of panicking.
- Around line 2086-2153: Register the storagev1 types in the fake client scheme
by adding storagev1.AddToScheme(scheme.Scheme) inside getSchemeForFakeClient (in
bsl_test.go) so StorageClass objects deserialize properly, and add a new test
case to Test_getDefaultStorageClass (nodeagent_test.go) that creates two
storagev1.StorageClass objects both annotated
"storageclass.kubernetes.io/is-default-class":"true" to assert
getDefaultStorageClass on DataProtectionApplicationReconciler returns the first
matching StorageClass name (documenting current non-deterministic ordering).
In `@internal/controller/nodeagent.go`:
- Around line 163-171: The code currently clears
configWithPrivileged.CachePVCConfig when r.getDefaultStorageClass() returns an
error, silently disabling the cache PVC; instead, modify the reconciliation to
return the error so the controller retries (do not set CachePVCConfig = nil on
getDefaultStorageClass error). Specifically, in the block that calls
r.getDefaultStorageClass() (refer to configWithPrivileged.CachePVCConfig and
r.getDefaultStorageClass()), change the error branch to return the error (or
wrap it with context) rather than logging and nil’ing the config; keep the
existing behavior only when defaultSC is empty after a successful call.
---
Duplicate comments:
In `@docs/cache_volume_pvc_configuration.md`:
- Around line 32-34: The fenced code block containing the expression
"cacheVolumeSize = cacheLimitMB * 1.2 (rounded up to the nearest GB)" should
include a language tag (e.g., "text" or "bash") after the opening ``` so the
block renders correctly; update the opening fence from ``` to ```text (or
another appropriate tag) around that line in
docs/cache_volume_pvc_configuration.md.
---
Nitpick comments:
In `@internal/controller/nodeagent_test.go`:
- Around line 2050-2084: Add two table-driven test cases in
Test_isNodeAgentCMRequired_CachePVCConfig to cover a nil disableFsBackup
pointer: one where config has CachePVCConfig set and disableFsBackup is nil
(expect want: true), and one with an empty config and disableFsBackup nil
(expect want: false); update the tests slice in
Test_isNodeAgentCMRequired_CachePVCConfig so the new entries reference
isNodeAgentCMRequired and use ptr.Nil (or simply nil) for disableFsBackup to
ensure the function behavior for a nil *bool is exercised.
In `@internal/controller/nodeagent.go`:
- Around line 130-145: The getDefaultStorageClass function currently returns the
"first" default StorageClass which is non-deterministic; change it to collect
all StorageClasses whose annotation
"storageclass.kubernetes.io/is-default-class" == "true" and pick the most
recently created one deterministically (e.g., compare each matching
StorageClass's CreationTimestamp and return the one with the latest timestamp).
Update the logic inside getDefaultStorageClass (use scList.Items,
sc.Annotations, and sc.CreationTimestamp) to choose the newest default and
return its Name, keeping the same error handling for r.Client.List; optionally
surface a warning if multiple defaults were found before returning the chosen
one.
ℹ️ Review info
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (11)
api/v1alpha1/dataprotectionapplication_types.gobundle/manifests/oadp-operator.clusterserviceversion.yamlbundle/manifests/oadp.openshift.io_dataprotectionapplications.yamlbundle/manifests/velero.io_backupstoragelocations.yamlconfig/crd/bases/oadp.openshift.io_dataprotectionapplications.yamlconfig/crd/bases/velero.io_backupstoragelocations.yamlconfig/rbac/role.yamldocs/cache_volume_pvc_configuration.mdinternal/controller/dataprotectionapplication_controller.gointernal/controller/nodeagent.gointernal/controller/nodeagent_test.go
🚧 Files skipped from review as they are similar to previous changes (5)
- bundle/manifests/oadp.openshift.io_dataprotectionapplications.yaml
- bundle/manifests/velero.io_backupstoragelocations.yaml
- config/rbac/role.yaml
- bundle/manifests/oadp-operator.clusterserviceversion.yaml
- config/crd/bases/velero.io_backupstoragelocations.yaml
| residentThresholdInMB: | ||
| description: ResidentThresholdInMB specifies the minimum size of the backup data to create cache PVC | ||
| format: int64 | ||
| type: integer | ||
| storageClass: |
There was a problem hiding this comment.
Validate residentThresholdInMB as non‑negative.
Negative thresholds don’t make sense here; add a minimum of 0 to prevent invalid CRs and align with cacheLimitMB.
✅ Suggested schema tweak
residentThresholdInMB:
description: ResidentThresholdInMB specifies the minimum size of the backup data to create cache PVC
format: int64
+ minimum: 0
type: integer📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| residentThresholdInMB: | |
| description: ResidentThresholdInMB specifies the minimum size of the backup data to create cache PVC | |
| format: int64 | |
| type: integer | |
| storageClass: | |
| residentThresholdInMB: | |
| description: ResidentThresholdInMB specifies the minimum size of the backup data to create cache PVC | |
| format: int64 | |
| minimum: 0 | |
| type: integer | |
| storageClass: |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@config/crd/bases/oadp.openshift.io_dataprotectionapplications.yaml` around
lines 268 - 272, The CRD schema for residentThresholdInMB allows negative
values; update the OpenAPI schema entry for residentThresholdInMB to include a
minimum: 0 (same style as cacheLimitMB) so it is validated as a non-negative
integer (keep type: integer and format: int64). Locate the residentThresholdInMB
property in the CRD and add the minimum field to prevent invalid CRs.
| func Test_getDefaultStorageClass(t *testing.T) { | ||
| tests := []struct { | ||
| name string | ||
| objects []client.Object | ||
| want string | ||
| wantErr bool | ||
| }{ | ||
| { | ||
| name: "no storage classes exist", | ||
| objects: nil, | ||
| want: "", | ||
| wantErr: false, | ||
| }, | ||
| { | ||
| name: "default storage class exists", | ||
| objects: []client.Object{ | ||
| &storagev1.StorageClass{ | ||
| ObjectMeta: metav1.ObjectMeta{ | ||
| Name: "gp3-csi", | ||
| Annotations: map[string]string{ | ||
| "storageclass.kubernetes.io/is-default-class": "true", | ||
| }, | ||
| }, | ||
| }, | ||
| &storagev1.StorageClass{ | ||
| ObjectMeta: metav1.ObjectMeta{ | ||
| Name: "standard", | ||
| }, | ||
| }, | ||
| }, | ||
| want: "gp3-csi", | ||
| wantErr: false, | ||
| }, | ||
| { | ||
| name: "no default storage class", | ||
| objects: []client.Object{ | ||
| &storagev1.StorageClass{ | ||
| ObjectMeta: metav1.ObjectMeta{ | ||
| Name: "gp3-csi", | ||
| }, | ||
| }, | ||
| }, | ||
| want: "", | ||
| wantErr: false, | ||
| }, | ||
| } | ||
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| fakeClient, err := getFakeClientFromObjects(tt.objects...) | ||
| if err != nil { | ||
| t.Fatalf("error in creating fake client: %v", err) | ||
| } | ||
| r := &DataProtectionApplicationReconciler{ | ||
| Client: fakeClient, | ||
| Scheme: fakeClient.Scheme(), | ||
| Log: logr.Discard(), | ||
| Context: newContextForTest(), | ||
| } | ||
| got, err := r.getDefaultStorageClass() | ||
| if (err != nil) != tt.wantErr { | ||
| t.Errorf("getDefaultStorageClass() error = %v, wantErr %v", err, tt.wantErr) | ||
| } | ||
| if got != tt.want { | ||
| t.Errorf("getDefaultStorageClass() = %v, want %v", got, tt.want) | ||
| } | ||
| }) | ||
| } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check scheme registration for storagev1 in test setup
echo "=== Checking suite_test.go for storagev1 scheme registration ==="
rg -n "storagev1\|storage/v1\|AddToScheme\|SchemeBuilder" --type=go -C3 internal/controller/suite_test.go
echo -e "\n=== Checking for getFakeClientFromObjects definition ==="
rg -n "func getFakeClientFromObjects\|getFakeClientFromObjects" --type=go -B2 -A10 internal/controller/ | head -50Repository: openshift/oadp-operator
Length of output: 189
🏁 Script executed:
#!/bin/bash
# Check getDefaultStorageClass implementation in production code
echo "=== Checking nodeagent.go for getDefaultStorageClass implementation ==="
rg -n "func.*getDefaultStorageClass\|is-default-class\|is-default" --type=go -C5 internal/controller/nodeagent.goRepository: openshift/oadp-operator
Length of output: 138
🏁 Script executed:
#!/bin/bash
# Find all relevant files
echo "=== Looking for nodeagent files ==="
find . -name "*nodeagent*" -type f 2>/dev/null | head -20
echo -e "\n=== Looking for suite_test files ==="
find . -name "suite_test.go" -type f 2>/dev/null
echo -e "\n=== Files in internal/controller ==="
ls -la internal/controller/ 2>/dev/null | head -30Repository: openshift/oadp-operator
Length of output: 2475
🏁 Script executed:
#!/bin/bash
# Search for getFakeClientFromObjects definition across the repo
echo "=== Searching for getFakeClientFromObjects ==="
rg -n "getFakeClientFromObjects" --type=go -B2 -A10
echo -e "\n=== Searching for getDefaultStorageClass ==="
rg -n "getDefaultStorageClass" --type=go -B2 -A10Repository: openshift/oadp-operator
Length of output: 41244
🏁 Script executed:
#!/bin/bash
# Check getSchemeForFakeClient implementation
echo "=== Checking getSchemeForFakeClient definition ==="
rg -n "func getSchemeForFakeClient" --type=go -A20 internal/controller/bsl_test.go | head -40Repository: openshift/oadp-operator
Length of output: 655
🏁 Script executed:
#!/bin/bash
# Check if storagev1 is imported and check the full scheme setup
echo "=== Checking imports in bsl_test.go for storagev1 ==="
head -50 internal/controller/bsl_test.go | grep -i "storagev1\|storage/v1"
echo -e "\n=== Checking all scheme additions in getSchemeForFakeClient ==="
rg -n "getSchemeForFakeClient" --type=go -A30 internal/controller/bsl_test.go | head -50
echo -e "\n=== Checking nodeagent.go imports ==="
head -50 internal/controller/nodeagent.go | grep -i "storagev1\|storage/v1"
echo -e "\n=== Checking if storagev1 is added anywhere in test setup ==="
rg -n "storagev1\|storage\.v1" --type=go internal/controller/ | grep -i "addtoscheme\|scheme"Repository: openshift/oadp-operator
Length of output: 1749
🏁 Script executed:
#!/bin/bash
# Check if storagev1 is registered anywhere in the codebase
echo "=== Searching for storagev1 AddToScheme registrations ==="
rg -n "storagev1.*AddToScheme\|storage\.v1.*AddToScheme" --type=go
echo -e "\n=== Checking suite_test.go for scheme setup ==="
cat internal/controller/suite_test.go | head -100
echo -e "\n=== Checking if there's any other scheme setup in nodeagent_test.go ==="
rg -n "getScheme\|AddToScheme" --type=go internal/controller/nodeagent_test.go | head -20Repository: openshift/oadp-operator
Length of output: 3232
🏁 Script executed:
#!/bin/bash
# Check the imports at the top of nodeagent_test.go
echo "=== Checking imports in nodeagent_test.go (first 100 lines) ==="
head -100 internal/controller/nodeagent_test.go
echo -e "\n=== Check if nodeagent_test uses its own getSchemeForFakeClient or imports it ==="
rg -n "import\|getSchemeForFakeClient\|getScheme" --type=go internal/controller/nodeagent_test.go | head -20Repository: openshift/oadp-operator
Length of output: 2768
🏁 Script executed:
#!/bin/bash
# Verify full getDefaultStorageClass implementation to check annotation handling
echo "=== Full getDefaultStorageClass implementation ==="
rg -n "func.*getDefaultStorageClass\|is-default-class" --type=go -A15 internal/controller/nodeagent.go
echo -e "\n=== Check for beta annotation usage anywhere ==="
rg -n "beta\.kubernetes\.io.*is-default\|is-default.*beta" --type=go internal/controller/Repository: openshift/oadp-operator
Length of output: 167
🏁 Script executed:
#!/bin/bash
# Find getDefaultStorageClass with a different pattern
echo "=== Searching for getDefaultStorageClass implementation ==="
rg "getDefaultStorageClass" --type=go -A15 internal/controller/nodeagent.go
echo -e "\n=== Check entire nodeagent.go around line 130-150 ==="
sed -n '128,150p' internal/controller/nodeagent.goRepository: openshift/oadp-operator
Length of output: 2818
🏁 Script executed:
#!/bin/bash
# Check if storagev1.AddToScheme is available to add
echo "=== Check K8s storagev1 availability ==="
rg "k8s\.io.*storagev1" --type=go internal/controller/nodeagent.go
echo -e "\n=== Check if any test file adds storagev1 to scheme ==="
rg "storagev1.*AddToScheme\|AddToScheme.*storage" --type=go internal/controller/ | grep -v ".swp"
echo -e "\n=== Full getSchemeForFakeClient to see all additions ==="
sed -n '34,51p' internal/controller/bsl_test.goRepository: openshift/oadp-operator
Length of output: 561
Add test case for multiple StorageClasses marked as default and fix missing storagev1 scheme registration.
The test file passes storagev1.StorageClass objects to getFakeClientFromObjects, but the underlying getSchemeForFakeClient() function does not register storagev1 in its scheme. This will cause the fake client to silently fail to deserialize these objects. Add storagev1.AddToScheme(scheme.Scheme) to getSchemeForFakeClient() in bsl_test.go.
Additionally, the production code's getDefaultStorageClass() returns the first matching default StorageClass without deterministic ordering. Add a test case for multiple StorageClasses simultaneously marked as default to document this behavior:
Suggested additional test case
{
name: "no default storage class",
...
},
+ {
+ name: "multiple storage classes marked as default",
+ objects: []client.Object{
+ &storagev1.StorageClass{
+ ObjectMeta: metav1.ObjectMeta{
+ Name: "gp3-csi",
+ Annotations: map[string]string{
+ "storageclass.kubernetes.io/is-default-class": "true",
+ },
+ },
+ },
+ &storagev1.StorageClass{
+ ObjectMeta: metav1.ObjectMeta{
+ Name: "standard",
+ Annotations: map[string]string{
+ "storageclass.kubernetes.io/is-default-class": "true",
+ },
+ },
+ },
+ },
+ want: "gp3-csi",
+ wantErr: false,
+ },📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func Test_getDefaultStorageClass(t *testing.T) { | |
| tests := []struct { | |
| name string | |
| objects []client.Object | |
| want string | |
| wantErr bool | |
| }{ | |
| { | |
| name: "no storage classes exist", | |
| objects: nil, | |
| want: "", | |
| wantErr: false, | |
| }, | |
| { | |
| name: "default storage class exists", | |
| objects: []client.Object{ | |
| &storagev1.StorageClass{ | |
| ObjectMeta: metav1.ObjectMeta{ | |
| Name: "gp3-csi", | |
| Annotations: map[string]string{ | |
| "storageclass.kubernetes.io/is-default-class": "true", | |
| }, | |
| }, | |
| }, | |
| &storagev1.StorageClass{ | |
| ObjectMeta: metav1.ObjectMeta{ | |
| Name: "standard", | |
| }, | |
| }, | |
| }, | |
| want: "gp3-csi", | |
| wantErr: false, | |
| }, | |
| { | |
| name: "no default storage class", | |
| objects: []client.Object{ | |
| &storagev1.StorageClass{ | |
| ObjectMeta: metav1.ObjectMeta{ | |
| Name: "gp3-csi", | |
| }, | |
| }, | |
| }, | |
| want: "", | |
| wantErr: false, | |
| }, | |
| } | |
| for _, tt := range tests { | |
| t.Run(tt.name, func(t *testing.T) { | |
| fakeClient, err := getFakeClientFromObjects(tt.objects...) | |
| if err != nil { | |
| t.Fatalf("error in creating fake client: %v", err) | |
| } | |
| r := &DataProtectionApplicationReconciler{ | |
| Client: fakeClient, | |
| Scheme: fakeClient.Scheme(), | |
| Log: logr.Discard(), | |
| Context: newContextForTest(), | |
| } | |
| got, err := r.getDefaultStorageClass() | |
| if (err != nil) != tt.wantErr { | |
| t.Errorf("getDefaultStorageClass() error = %v, wantErr %v", err, tt.wantErr) | |
| } | |
| if got != tt.want { | |
| t.Errorf("getDefaultStorageClass() = %v, want %v", got, tt.want) | |
| } | |
| }) | |
| } | |
| } | |
| func Test_getDefaultStorageClass(t *testing.T) { | |
| tests := []struct { | |
| name string | |
| objects []client.Object | |
| want string | |
| wantErr bool | |
| }{ | |
| { | |
| name: "no storage classes exist", | |
| objects: nil, | |
| want: "", | |
| wantErr: false, | |
| }, | |
| { | |
| name: "default storage class exists", | |
| objects: []client.Object{ | |
| &storagev1.StorageClass{ | |
| ObjectMeta: metav1.ObjectMeta{ | |
| Name: "gp3-csi", | |
| Annotations: map[string]string{ | |
| "storageclass.kubernetes.io/is-default-class": "true", | |
| }, | |
| }, | |
| }, | |
| &storagev1.StorageClass{ | |
| ObjectMeta: metav1.ObjectMeta{ | |
| Name: "standard", | |
| }, | |
| }, | |
| }, | |
| want: "gp3-csi", | |
| wantErr: false, | |
| }, | |
| { | |
| name: "no default storage class", | |
| objects: []client.Object{ | |
| &storagev1.StorageClass{ | |
| ObjectMeta: metav1.ObjectMeta{ | |
| Name: "gp3-csi", | |
| }, | |
| }, | |
| }, | |
| want: "", | |
| wantErr: false, | |
| }, | |
| { | |
| name: "multiple storage classes marked as default", | |
| objects: []client.Object{ | |
| &storagev1.StorageClass{ | |
| ObjectMeta: metav1.ObjectMeta{ | |
| Name: "gp3-csi", | |
| Annotations: map[string]string{ | |
| "storageclass.kubernetes.io/is-default-class": "true", | |
| }, | |
| }, | |
| }, | |
| &storagev1.StorageClass{ | |
| ObjectMeta: metav1.ObjectMeta{ | |
| Name: "standard", | |
| Annotations: map[string]string{ | |
| "storageclass.kubernetes.io/is-default-class": "true", | |
| }, | |
| }, | |
| }, | |
| }, | |
| want: "gp3-csi", | |
| wantErr: false, | |
| }, | |
| } | |
| for _, tt := range tests { | |
| t.Run(tt.name, func(t *testing.T) { | |
| fakeClient, err := getFakeClientFromObjects(tt.objects...) | |
| if err != nil { | |
| t.Fatalf("error in creating fake client: %v", err) | |
| } | |
| r := &DataProtectionApplicationReconciler{ | |
| Client: fakeClient, | |
| Scheme: fakeClient.Scheme(), | |
| Log: logr.Discard(), | |
| Context: newContextForTest(), | |
| } | |
| got, err := r.getDefaultStorageClass() | |
| if (err != nil) != tt.wantErr { | |
| t.Errorf("getDefaultStorageClass() error = %v, wantErr %v", err, tt.wantErr) | |
| } | |
| if got != tt.want { | |
| t.Errorf("getDefaultStorageClass() = %v, want %v", got, tt.want) | |
| } | |
| }) | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/controller/nodeagent_test.go` around lines 2086 - 2153, Register the
storagev1 types in the fake client scheme by adding
storagev1.AddToScheme(scheme.Scheme) inside getSchemeForFakeClient (in
bsl_test.go) so StorageClass objects deserialize properly, and add a new test
case to Test_getDefaultStorageClass (nodeagent_test.go) that creates two
storagev1.StorageClass objects both annotated
"storageclass.kubernetes.io/is-default-class":"true" to assert
getDefaultStorageClass on DataProtectionApplicationReconciler returns the first
matching StorageClass name (documenting current non-deterministic ordering).
| } else { | ||
| require.True(t, ok, "cachePVC should be present in ConfigMap") | ||
| cachePVCMap := cachePVC.(map[string]interface{}) | ||
| require.Equal(t, tt.wantStorageClass, cachePVCMap["storageClass"]) | ||
| } |
There was a problem hiding this comment.
Use the comma-ok form for the type assertion to avoid a panic on unexpected JSON shapes.
If cachePVC is not a JSON object (edge case from a future serialization change), the bare assertion on Line 2265 panics the test instead of failing it cleanly.
🛡️ Proposed fix
- cachePVCMap := cachePVC.(map[string]interface{})
- require.Equal(t, tt.wantStorageClass, cachePVCMap["storageClass"])
+ cachePVCMap, ok := cachePVC.(map[string]interface{})
+ require.True(t, ok, "cachePVC value should be a JSON object")
+ require.Equal(t, tt.wantStorageClass, cachePVCMap["storageClass"])📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| } else { | |
| require.True(t, ok, "cachePVC should be present in ConfigMap") | |
| cachePVCMap := cachePVC.(map[string]interface{}) | |
| require.Equal(t, tt.wantStorageClass, cachePVCMap["storageClass"]) | |
| } | |
| } else { | |
| require.True(t, ok, "cachePVC should be present in ConfigMap") | |
| cachePVCMap, ok := cachePVC.(map[string]interface{}) | |
| require.True(t, ok, "cachePVC value should be a JSON object") | |
| require.Equal(t, tt.wantStorageClass, cachePVCMap["storageClass"]) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/controller/nodeagent_test.go` around lines 2263 - 2267, The test
currently does a bare type assertion cachePVC.(map[string]interface{}) which can
panic on unexpected JSON shapes; change it to the comma-ok form: cachePVCMap, ok
:= cachePVC.(map[string]interface{}), then assert require.True(t, ok, "cachePVC
should be a JSON object in ConfigMap") (immediately after the existing presence
check) before accessing cachePVCMap["storageClass"] so the test fails cleanly
instead of panicking.
| // If CachePVCConfig is set but StorageClass is empty, resolve the cluster's default StorageClass. | ||
| // This prevents restore failures when local storage is limited by ensuring a StorageClass is available | ||
| // for cache PVC provisioning, even when the user doesn't explicitly specify one. | ||
| if configWithPrivileged.CachePVCConfig != nil && configWithPrivileged.CachePVCConfig.StorageClass == "" { | ||
| defaultSC, err := r.getDefaultStorageClass() | ||
| if err != nil { | ||
| r.Log.Info("Failed to resolve default StorageClass for cache PVC, cache volume will be disabled", "error", err) | ||
| configWithPrivileged.CachePVCConfig = nil | ||
| } else if defaultSC != "" { |
There was a problem hiding this comment.
Avoid silently disabling cache PVC on StorageClass list errors.
A transient API failure will currently remove cachePVC from the config map and can regress restores. Prefer returning an error so reconciliation retries instead of downgrading behavior.
💡 Suggested change
defaultSC, err := r.getDefaultStorageClass()
if err != nil {
- r.Log.Info("Failed to resolve default StorageClass for cache PVC, cache volume will be disabled", "error", err)
- configWithPrivileged.CachePVCConfig = nil
+ return fmt.Errorf("failed to resolve default StorageClass for cache PVC: %w", err)
} else if defaultSC != "" {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // If CachePVCConfig is set but StorageClass is empty, resolve the cluster's default StorageClass. | |
| // This prevents restore failures when local storage is limited by ensuring a StorageClass is available | |
| // for cache PVC provisioning, even when the user doesn't explicitly specify one. | |
| if configWithPrivileged.CachePVCConfig != nil && configWithPrivileged.CachePVCConfig.StorageClass == "" { | |
| defaultSC, err := r.getDefaultStorageClass() | |
| if err != nil { | |
| r.Log.Info("Failed to resolve default StorageClass for cache PVC, cache volume will be disabled", "error", err) | |
| configWithPrivileged.CachePVCConfig = nil | |
| } else if defaultSC != "" { | |
| // If CachePVCConfig is set but StorageClass is empty, resolve the cluster's default StorageClass. | |
| // This prevents restore failures when local storage is limited by ensuring a StorageClass is available | |
| // for cache PVC provisioning, even when the user doesn't explicitly specify one. | |
| if configWithPrivileged.CachePVCConfig != nil && configWithPrivileged.CachePVCConfig.StorageClass == "" { | |
| defaultSC, err := r.getDefaultStorageClass() | |
| if err != nil { | |
| return fmt.Errorf("failed to resolve default StorageClass for cache PVC: %w", err) | |
| } else if defaultSC != "" { |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/controller/nodeagent.go` around lines 163 - 171, The code currently
clears configWithPrivileged.CachePVCConfig when r.getDefaultStorageClass()
returns an error, silently disabling the cache PVC; instead, modify the
reconciliation to return the error so the controller retries (do not set
CachePVCConfig = nil on getDefaultStorageClass error). Specifically, in the
block that calls r.getDefaultStorageClass() (refer to
configWithPrivileged.CachePVCConfig and r.getDefaultStorageClass()), change the
error branch to return the error (or wrap it with context) rather than logging
and nil’ing the config; keep the existing behavior only when defaultSC is empty
after a successful call.
|
/test 4.22-e2e-test-aws |
1 similar comment
|
/test 4.22-e2e-test-aws |
|
@mpryc merge method is not allowed in repo settings on this. we had only enabled squash here. |
|
I configured it when we had admin. Assuming this setting is outside of openshift/release, and you want to bring back merge method: merge, then we have to request admin access. lmk if you absolutely need merge method in this repo. Easier is to set tide to squash for this repo via ocp/release pr |
|
@kaovilai I think we can't use squash when rebasebot works, here rebasebot also works to get the go mods updated. Why it's not a problem for other repos? |
|
other repos we do rebases, force pushes.. here we don't. |
|
@kaovilai we do use rebasebot to update all go modules and links to other deps in similar way, it's not strictly rebase, but it's using rebasebot to do very similar work than in other repos https://github.com/oadp-rebasebot/oadp-rebase/blob/oadp-dev/rebase-configs/openshift_oadp-operator_oadp-dev.env.sh#L10-L13 |
|
which is different, and is not strictly something that required rewriting history of like we were doing in other repos, so we were able to add more protections here. |
|
I have been a fan of squashing on merge, instead of strictly enforcing having only single commit pre-PR. Allows you to explain your steps in PR, but squash to 1 on merge. |
|
We can add merge allowance, just gotta request admin. |
|
Just in case: https://issues.redhat.com/browse/DPP-19573 |
|
@mpryc: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kaovilai, mpryc, weshayutin The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Add cachePVC configuration to the DPA CRD under spec.configuration.nodeAgent, enabling dedicated PVC-backed cache volumes for Kopia repository cache during data movement restore operations. This prevents ephemeral storage exhaustion on nodes with limited root filesystem space during concurrent restores.
Why the changes were made
Fix OADP-4855
Summary by CodeRabbit
Release Notes
New Features
Documentation