Add HA tests by mmakwana30 · Pull Request #2146 · openshift/lvm-operator

mmakwana30 · 2026-03-24T15:52:53Z

Logs: https://privatebin.corp.redhat.com/?a28906199661f4a1#DbpjsRYRXvWY9PYuyeSsuVEeXuBMP9n3raN2aBESmc3C

openshift-ci · 2026-03-24T15:53:18Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mmakwana30
Once this PR has been reviewed and has the lgtm label, please assign jerpeter1 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2026-03-24T15:53:19Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Integration test entrypoint and Makefile were split into separate SNO and MNO runs; integration test Docker build context expanded; go.mod dependency set was adjusted (removed k8s.io/kubernetes, added k8s.io/klog/v2); the previous LVMS test and helper files were removed and a new, large qe_tests utility implementation was added.

Changes

Cohort / File(s)	Summary
Build Infrastructure `test/integration/Makefile`, `test/integration/Dockerfile`	Makefile: split `integration-run` into separate `integration-run-sno` and `integration-run-mno` targets and updated `integration-docker-run` to invoke both `run-suite -c 1` commands. Dockerfile: copy full repo (`COPY . ./`), set builder workdir to `/workspace/test/integration`, and emit binary to `/workspace/integration-test`.
Dependency Management `test/integration/go.mod`	Replaced top-level `k8s.io/kubernetes` requirement with `k8s.io/klog/v2 v2.130.1` and pruned several transitive requires (OpenTelemetry, some Kubernetes internals); `replace` directives unchanged.
Test Entrypoint / Suites `test/integration/integration.go`	Updated imports to `.../qe_tests` and replaced the single `single-node` suite with two suites targeting `.../qe_tests/sno` and `.../qe_tests/mno`, each using label qualifiers for `SNO` and `MNO`.
LVMS Tests Removed `test/integration/qe-tests/lvms.go`, `test/integration/qe-tests/lvms_utils.go`	Deleted the original LVMS test spec and its large helper module (extensive test cases and many node/storage helper functions removed).
New LVMS Utilities (qe_tests) `test/integration/qe_tests/lvms_utils.go`	Added a large new utilities file under `qe_tests` implementing lvmCluster helpers, extensive node/backend operations via `oc debug` and pod exec, LVMCluster create/delete/repair flows, disk discovery, RAID/LV/ VG cleanup and recovery routines, and many test-oriented helper functions (many unexported helpers and methods).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/integration/Makefile`:
- Around line 34-35: The Makefile target integration-docker-run is invoking a
non-existent suite path openshift/lvm-operator/test/integration/qe_tests; update
the command in integration-docker-run to invoke the actual suites defined in
integration.go (qe_tests/sno and qe_tests/mno) — either run them both (e.g.,
call run-suite twice or pass both suite paths) or replace the single path with
the correct concrete suite path(s) so the docker run command executes the
existing qe_tests/sno and qe_tests/mno suites.

In `@test/integration/qe_tests/lvms_utils.go`:
- Around line 1579-1582: The error from cmd.CombinedOutput() is ignored; update
the error handling to return a non-nil error (and a safe zero value for the
bool) instead of an empty block. Inside the if err != nil { ... } handle, return
false and wrap the error with context (e.g., fmt.Errorf("running %v failed: %w",
cmd.Args, err)) and include trimmed output if helpful; add fmt to imports if
missing. This ensures the function (in lvms_utils.go where cmd.CombinedOutput()
is called) surfaces command failures instead of silently continuing.
- Around line 361-364: The getRandomNum function currently calls rand.Seed on
every invocation which is deprecated and unsafe; remove the per-call rand.Seed
and instead either rely on the auto-seeded global PRNG (drop the Seed call and
use rand.Int63n directly in getRandomNum) or create a package-level
math/rand.Rand (e.g., r := rand.New(rand.NewSource(time.Now().UnixNano()))
initialized once in init()) and call r.Int63n in getRandomNum so seeding happens
only once and not on every call.
- Around line 1856-1891: The YAML produced by
createLVMClusterWithPathsAndOptionalPaths is malformed: it omits metadata.name
and places paths/optionalPaths under thinPoolConfig instead of under
deviceSelector, and the fmt.Sprintf argument order must match the updated
placeholders; fix by adding a metadata.name: %s line (using the function's name
parameter), move the pathsJSON and optionalPathsJSON keys out of thinPoolConfig
into a deviceSelector block under the device class (deviceSelector:\n       
paths: %s\n        optionalPaths: %s), keep thinPoolConfig only for
thinPoolConfig fields, and update the fmt.Sprintf call in
createLVMClusterWithPathsAndOptionalPaths so the placeholder order is
metadata.name, namespace, deviceClass, pathsJSON, optionalPathsJSON before
calling createLVMClusterFromJSON.
- Around line 1551-1553: The empty error handling after calling
resourceExists("lvmcluster", name, namespace) should be fixed: handle the
returned err instead of ignoring it—either propagate it (e.g., return exists,
err or return false, err from the enclosing function) or log the error with
context using the existing logger and then decide on a fallback; update the code
around the resourceExists call (variables exists and err) to implement one of
these behaviors so the error is not silently dropped.
- Around line 1236-1246: The shell snippet in lvms_utils.go has an unclosed
outer if (if [ "$lv_count" -gt 1 ]) causing a syntax error; close that block by
inserting a matching fi immediately after the inner orphaned-check if block (the
block that echoes "HAS_ORPHANED_LVS" and exits) so the structure is: if [
"$lv_count" -gt 1 ]; then ... if [ orphaned ]; then ... fi fi before the final
echo "OK".

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 2256d975-16da-413b-9dbf-44fbeedf408f

📥 Commits

Reviewing files that changed from the base of the PR and between df26f96 and 52e32be.

⛔ Files ignored due to path filters (1)

test/integration/go.sum is excluded by !**/*.sum

📒 Files selected for processing (5)

test/integration/Makefile
test/integration/go.mod
test/integration/integration.go
test/integration/qe_tests/lvms.go
test/integration/qe_tests/lvms_utils.go

test/integration/Makefile

test/integration/qe_tests/lvms_utils.go

codecov-commenter · 2026-03-24T16:04:22Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 52.26%. Comparing base (b5e0e11) to head (074cacf).
⚠️ Report is 14 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2146      +/-   ##
==========================================
+ Coverage   51.73%   52.26%   +0.53%     
==========================================
  Files          52       52              
  Lines        3901     3901              
==========================================
+ Hits         2018     2039      +21     
+ Misses       1715     1698      -17     
+ Partials      168      164       -4

see 2 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

test/integration/go.mod (1)
16-24: ⚠️ Potential issue | 🟠 Major

Add k8s.io/kubernetes to the require block.

Integration tests import k8s.io/kubernetes/test/e2e/framework (confirmed in test/integration/qe-tests/lvms_utils.go and lvms.go), but k8s.io/kubernetes only appears in the replace block, not in require. Without a require entry, the replace directive won't be applied and the build will fail or behave unexpectedly. Add a require statement for k8s.io/kubernetes before the replace block.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/integration/go.mod` around lines 16 - 24, The go.mod is missing a
require entry for k8s.io/kubernetes, so the replace directive for
k8s.io/kubernetes won't be applied; add a require line for "k8s.io/kubernetes"
(e.g., require k8s.io/kubernetes v1.30.1) in the require block before the
replace block so imports like k8s.io/kubernetes/test/e2e/framework resolve and
the replace mapping is used.

♻️ Duplicate comments (1)

test/integration/Makefile (1)
35-35: ⚠️ Potential issue | 🟠 Major

Use concrete suite names in integration-docker-run.

run-suite is still called with .../qe_tests, while suites are registered as .../qe_tests/sno and .../qe_tests/mno. This can skip tests or fail in Docker runs.
Proposed fix
 integration-docker-run: integration-docker-build
-	docker run --rm $(FULL_IMAGE) run-suite openshift/lvm-operator/test/integration/qe_tests
+	docker run --rm $(FULL_IMAGE) run-suite -c 1 openshift/lvm-operator/test/integration/qe_tests/sno
+	docker run --rm $(FULL_IMAGE) run-suite -c 1 openshift/lvm-operator/test/integration/qe_tests/mno
As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/integration/Makefile` at line 35, The docker command in the
integration-docker-run target uses a generic suite path
"openshift/lvm-operator/test/integration/qe_tests" which doesn't match
registered suites (e.g., ".../qe_tests/sno" and ".../qe_tests/mno"); update the
docker run invocation to call the concrete suite names (referencing the existing
suite identifiers like "openshift/lvm-operator/test/integration/qe_tests/sno"
and "openshift/lvm-operator/test/integration/qe_tests/mno") so Docker runs the
specific suites (either run them separately or pass both concrete suite paths)
instead of the generic directory.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@test/integration/go.mod`:
- Around line 16-24: The go.mod is missing a require entry for
k8s.io/kubernetes, so the replace directive for k8s.io/kubernetes won't be
applied; add a require line for "k8s.io/kubernetes" (e.g., require
k8s.io/kubernetes v1.30.1) in the require block before the replace block so
imports like k8s.io/kubernetes/test/e2e/framework resolve and the replace
mapping is used.

---

Duplicate comments:
In `@test/integration/Makefile`:
- Line 35: The docker command in the integration-docker-run target uses a
generic suite path "openshift/lvm-operator/test/integration/qe_tests" which
doesn't match registered suites (e.g., ".../qe_tests/sno" and
".../qe_tests/mno"); update the docker run invocation to call the concrete suite
names (referencing the existing suite identifiers like
"openshift/lvm-operator/test/integration/qe_tests/sno" and
"openshift/lvm-operator/test/integration/qe_tests/mno") so Docker runs the
specific suites (either run them separately or pass both concrete suite paths)
instead of the generic directory.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 107274df-5538-4464-aed1-bc8018937e4a

📥 Commits

Reviewing files that changed from the base of the PR and between 52e32be and c510db0.

⛔ Files ignored due to path filters (1)

test/integration/go.sum is excluded by !**/*.sum

📒 Files selected for processing (5)

test/integration/Makefile
test/integration/go.mod
test/integration/integration.go
test/integration/qe_tests/lvms.go
test/integration/qe_tests/lvms_utils.go

✅ Files skipped from review due to trivial changes (1)

test/integration/qe_tests/lvms_utils.go

🚧 Files skipped from review as they are similar to previous changes (1)

test/integration/integration.go

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/integration/Dockerfile`:
- Line 7: The Dockerfile uses "COPY . ./" before "WORKDIR /workspace", which
places the repo contents in the image root and causes subsequent build steps
(the Go builds in the Dockerfile) to run from the wrong module context; fix by
either moving the "WORKDIR /workspace" line above the "COPY . ./" instruction or
change the COPY destination to the explicit workspace path (e.g., replace "COPY
. ./" with "COPY . /workspace") so that the repository is copied into the
intended /workspace module before the Go build steps run.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 7d74692c-68b2-4d28-a81d-39c4caac5c83

📥 Commits

Reviewing files that changed from the base of the PR and between c510db0 and d94476e.

⛔ Files ignored due to path filters (1)

test/integration/go.sum is excluded by !**/*.sum

📒 Files selected for processing (6)

test/integration/Dockerfile
test/integration/Makefile
test/integration/go.mod
test/integration/integration.go
test/integration/qe-tests/lvms.go
test/integration/qe-tests/lvms_utils.go

🚧 Files skipped from review as they are similar to previous changes (1)

test/integration/integration.go

test/integration/Dockerfile

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

test/integration/qe_tests/lvms_utils.go (2)

509-516: TLS verification disabled for pod exec.

Disabling TLS verification (config.Insecure = true) weakens security posture. While acceptable in test code, consider documenting why this is necessary (certificate chain issues with in-cluster exec).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/integration/qe_tests/lvms_utils.go` around lines 509 - 516, The
execCommandInPod function currently disables TLS verification on a copied kube
config (setting config.Insecure, config.TLSClientConfig.Insecure, CAData/CAFile)
which weakens security; update the code by adding a concise explanatory comment
above the config modifications in execCommandInPod (referencing
TestClient.Config and the copied config variable) stating that TLS verification
is intentionally disabled only for integration tests due to in-cluster
certificate/exec handshake issues, and document any assumptions or risks (e.g.,
not for production) so future readers understand why this exception exists.

2175-2191: Passphrase visible in process list.

The passphrase is passed via echo which exposes it in ps output and potentially shell history. For test code this is low risk, but worth noting.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/integration/qe_tests/lvms_utils.go` around lines 2175 - 2191, The
passphrase is exposed via shell piping (echo) in setDiskEncryptPassphrase;
instead create a temporary file on each worker with restrictive perms, write the
passphrase into it, use cryptsetup's --key-file option for both luksFormat and
luksOpen, then securely remove the temp file; update the calls that build
cmd/openCmd (and where execCommandInNode is invoked) to: create temp (mktemp),
chmod 600, write passphrase, run cryptsetup ... --key-file <tempfile>
<diskName>, then rm -f <tempfile>, using the existing diskName, workerName,
execCommandInNode and setDiskEncryptPassphrase identifiers to locate and change
the logic.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/integration/qe_tests/lvms_utils.go`:
- Around line 1507-1514: The loop over pvList ignores errors from the PV
update/delete calls; update the block that iterates pvList.Items (the pv
variable) to capture and handle the returned errors from
tc.Clientset.CoreV1().PersistentVolumes().Update(...) and Delete(...): check
each call's error, log a clear error message including pv.Name and the error,
and return or retry/collect failures as appropriate (e.g., continue on non-fatal
errors or fail the test on fatal ones) so orphaned PVs are not silently left
behind; ensure you reference the same pv.Finalizers modification and use the
returned objects/errors from Update and Delete for correct handling.

---

Nitpick comments:
In `@test/integration/qe_tests/lvms_utils.go`:
- Around line 509-516: The execCommandInPod function currently disables TLS
verification on a copied kube config (setting config.Insecure,
config.TLSClientConfig.Insecure, CAData/CAFile) which weakens security; update
the code by adding a concise explanatory comment above the config modifications
in execCommandInPod (referencing TestClient.Config and the copied config
variable) stating that TLS verification is intentionally disabled only for
integration tests due to in-cluster certificate/exec handshake issues, and
document any assumptions or risks (e.g., not for production) so future readers
understand why this exception exists.
- Around line 2175-2191: The passphrase is exposed via shell piping (echo) in
setDiskEncryptPassphrase; instead create a temporary file on each worker with
restrictive perms, write the passphrase into it, use cryptsetup's --key-file
option for both luksFormat and luksOpen, then securely remove the temp file;
update the calls that build cmd/openCmd (and where execCommandInNode is invoked)
to: create temp (mktemp), chmod 600, write passphrase, run cryptsetup ...
--key-file <tempfile> <diskName>, then rm -f <tempfile>, using the existing
diskName, workerName, execCommandInNode and setDiskEncryptPassphrase identifiers
to locate and change the logic.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: e0a8aebc-80da-4787-a394-d7bf20ab6603

📥 Commits

Reviewing files that changed from the base of the PR and between 533bce3 and 074cacf.

⛔ Files ignored due to path filters (1)

test/integration/go.sum is excluded by !**/*.sum

📒 Files selected for processing (8)

test/integration/Dockerfile
test/integration/Makefile
test/integration/go.mod
test/integration/integration.go
test/integration/qe-tests/lvms.go
test/integration/qe-tests/lvms_utils.go
test/integration/qe_tests/lvms.go
test/integration/qe_tests/lvms_utils.go

💤 Files with no reviewable changes (2)

test/integration/qe-tests/lvms.go
test/integration/qe-tests/lvms_utils.go

🚧 Files skipped from review as they are similar to previous changes (2)

test/integration/Dockerfile
test/integration/go.mod

test/integration/qe_tests/lvms_utils.go

openshift-ci · 2026-03-25T02:33:06Z

@mmakwana30: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/e2e-aws-single-node-integration-tests	`074cacf`	link	true	`/test e2e-aws-single-node-integration-tests`
ci/prow/images	`074cacf`	link	true	`/test images`
ci/prow/precommit-check	`074cacf`	link	true	`/test precommit-check`
ci/prow/snyk-deps	`074cacf`	link	false	`/test snyk-deps`
ci/prow/e2e-aws	`074cacf`	link	true	`/test e2e-aws`
ci/prow/e2e-aws-hypershift	`074cacf`	link	true	`/test e2e-aws-hypershift`
ci/prow/e2e-aws-single-node	`074cacf`	link	true	`/test e2e-aws-single-node`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Mar 24, 2026

openshift-ci bot requested review from jaypoulz and suleymanakbas91 March 24, 2026 15:53

mmakwana30 force-pushed the HA_tests branch from 52e32be to c510db0 Compare March 24, 2026 15:57

coderabbitai bot reviewed Mar 24, 2026

View reviewed changes

mmakwana30 force-pushed the HA_tests branch from c510db0 to 12cdfd3 Compare March 24, 2026 16:15

suleymanakbas91 mentioned this pull request Mar 24, 2026

OCPEDGE-2021: cases 61585, 61433 and 66320 #1695

Closed

mmakwana30 force-pushed the HA_tests branch from 12cdfd3 to d94476e Compare March 24, 2026 17:16

coderabbitai bot reviewed Mar 24, 2026

View reviewed changes

test/integration/Dockerfile Show resolved Hide resolved

mmakwana30 force-pushed the HA_tests branch from d94476e to 533bce3 Compare March 24, 2026 18:13

Add HA tests

074cacf

mmakwana30 force-pushed the HA_tests branch from 533bce3 to 074cacf Compare March 24, 2026 18:32

coderabbitai bot reviewed Mar 24, 2026

View reviewed changes

test/integration/qe_tests/lvms_utils.go Show resolved Hide resolved

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add HA tests#2146

Add HA tests#2146
mmakwana30 wants to merge 1 commit intoopenshift:mainfrom
mmakwana30:HA_tests

mmakwana30 commented Mar 24, 2026 •

edited

Loading

Uh oh!

openshift-ci bot commented Mar 24, 2026

Uh oh!

coderabbitai bot commented Mar 24, 2026 •

edited

Loading

Reviews paused

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov-commenter commented Mar 24, 2026 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

openshift-ci bot commented Mar 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

mmakwana30 commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

openshift-ci bot commented Mar 24, 2026

Uh oh!

coderabbitai bot commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov-commenter commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

openshift-ci bot commented Mar 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mmakwana30 commented Mar 24, 2026 •

edited

Loading

coderabbitai bot commented Mar 24, 2026 •

edited

Loading

codecov-commenter commented Mar 24, 2026 •

edited

Loading