Skip to content

Add HA tests#2146

Open
mmakwana30 wants to merge 1 commit intoopenshift:mainfrom
mmakwana30:HA_tests
Open

Add HA tests#2146
mmakwana30 wants to merge 1 commit intoopenshift:mainfrom
mmakwana30:HA_tests

Conversation

@mmakwana30
Copy link
Contributor

@mmakwana30 mmakwana30 commented Mar 24, 2026

@openshift-ci openshift-ci bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Mar 24, 2026
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 24, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mmakwana30
Once this PR has been reviewed and has the lgtm label, please assign jerpeter1 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link

coderabbitai bot commented Mar 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Integration test entrypoint and Makefile were split into separate SNO and MNO runs; integration test Docker build context expanded; go.mod dependency set was adjusted (removed k8s.io/kubernetes, added k8s.io/klog/v2); the previous LVMS test and helper files were removed and a new, large qe_tests utility implementation was added.

Changes

Cohort / File(s) Summary
Build Infrastructure
test/integration/Makefile, test/integration/Dockerfile
Makefile: split integration-run into separate integration-run-sno and integration-run-mno targets and updated integration-docker-run to invoke both run-suite -c 1 commands. Dockerfile: copy full repo (COPY . ./), set builder workdir to /workspace/test/integration, and emit binary to /workspace/integration-test.
Dependency Management
test/integration/go.mod
Replaced top-level k8s.io/kubernetes requirement with k8s.io/klog/v2 v2.130.1 and pruned several transitive requires (OpenTelemetry, some Kubernetes internals); replace directives unchanged.
Test Entrypoint / Suites
test/integration/integration.go
Updated imports to .../qe_tests and replaced the single single-node suite with two suites targeting .../qe_tests/sno and .../qe_tests/mno, each using label qualifiers for SNO and MNO.
LVMS Tests Removed
test/integration/qe-tests/lvms.go, test/integration/qe-tests/lvms_utils.go
Deleted the original LVMS test spec and its large helper module (extensive test cases and many node/storage helper functions removed).
New LVMS Utilities (qe_tests)
test/integration/qe_tests/lvms_utils.go
Added a large new utilities file under qe_tests implementing lvmCluster helpers, extensive node/backend operations via oc debug and pod exec, LVMCluster create/delete/repair flows, disk discovery, RAID/LV/ VG cleanup and recovery routines, and many test-oriented helper functions (many unexported helpers and methods).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/integration/Makefile`:
- Around line 34-35: The Makefile target integration-docker-run is invoking a
non-existent suite path openshift/lvm-operator/test/integration/qe_tests; update
the command in integration-docker-run to invoke the actual suites defined in
integration.go (qe_tests/sno and qe_tests/mno) — either run them both (e.g.,
call run-suite twice or pass both suite paths) or replace the single path with
the correct concrete suite path(s) so the docker run command executes the
existing qe_tests/sno and qe_tests/mno suites.

In `@test/integration/qe_tests/lvms_utils.go`:
- Around line 1579-1582: The error from cmd.CombinedOutput() is ignored; update
the error handling to return a non-nil error (and a safe zero value for the
bool) instead of an empty block. Inside the if err != nil { ... } handle, return
false and wrap the error with context (e.g., fmt.Errorf("running %v failed: %w",
cmd.Args, err)) and include trimmed output if helpful; add fmt to imports if
missing. This ensures the function (in lvms_utils.go where cmd.CombinedOutput()
is called) surfaces command failures instead of silently continuing.
- Around line 361-364: The getRandomNum function currently calls rand.Seed on
every invocation which is deprecated and unsafe; remove the per-call rand.Seed
and instead either rely on the auto-seeded global PRNG (drop the Seed call and
use rand.Int63n directly in getRandomNum) or create a package-level
math/rand.Rand (e.g., r := rand.New(rand.NewSource(time.Now().UnixNano()))
initialized once in init()) and call r.Int63n in getRandomNum so seeding happens
only once and not on every call.
- Around line 1856-1891: The YAML produced by
createLVMClusterWithPathsAndOptionalPaths is malformed: it omits metadata.name
and places paths/optionalPaths under thinPoolConfig instead of under
deviceSelector, and the fmt.Sprintf argument order must match the updated
placeholders; fix by adding a metadata.name: %s line (using the function's name
parameter), move the pathsJSON and optionalPathsJSON keys out of thinPoolConfig
into a deviceSelector block under the device class (deviceSelector:\n       
paths: %s\n        optionalPaths: %s), keep thinPoolConfig only for
thinPoolConfig fields, and update the fmt.Sprintf call in
createLVMClusterWithPathsAndOptionalPaths so the placeholder order is
metadata.name, namespace, deviceClass, pathsJSON, optionalPathsJSON before
calling createLVMClusterFromJSON.
- Around line 1551-1553: The empty error handling after calling
resourceExists("lvmcluster", name, namespace) should be fixed: handle the
returned err instead of ignoring it—either propagate it (e.g., return exists,
err or return false, err from the enclosing function) or log the error with
context using the existing logger and then decide on a fallback; update the code
around the resourceExists call (variables exists and err) to implement one of
these behaviors so the error is not silently dropped.
- Around line 1236-1246: The shell snippet in lvms_utils.go has an unclosed
outer if (if [ "$lv_count" -gt 1 ]) causing a syntax error; close that block by
inserting a matching fi immediately after the inner orphaned-check if block (the
block that echoes "HAS_ORPHANED_LVS" and exits) so the structure is: if [
"$lv_count" -gt 1 ]; then ... if [ orphaned ]; then ... fi fi before the final
echo "OK".

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 2256d975-16da-413b-9dbf-44fbeedf408f

📥 Commits

Reviewing files that changed from the base of the PR and between df26f96 and 52e32be.

⛔ Files ignored due to path filters (1)
  • test/integration/go.sum is excluded by !**/*.sum
📒 Files selected for processing (5)
  • test/integration/Makefile
  • test/integration/go.mod
  • test/integration/integration.go
  • test/integration/qe_tests/lvms.go
  • test/integration/qe_tests/lvms_utils.go

@codecov-commenter
Copy link

codecov-commenter commented Mar 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 52.26%. Comparing base (b5e0e11) to head (074cacf).
⚠️ Report is 14 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2146      +/-   ##
==========================================
+ Coverage   51.73%   52.26%   +0.53%     
==========================================
  Files          52       52              
  Lines        3901     3901              
==========================================
+ Hits         2018     2039      +21     
+ Misses       1715     1698      -17     
+ Partials      168      164       -4     

see 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/integration/go.mod (1)

16-24: ⚠️ Potential issue | 🟠 Major

Add k8s.io/kubernetes to the require block.

Integration tests import k8s.io/kubernetes/test/e2e/framework (confirmed in test/integration/qe-tests/lvms_utils.go and lvms.go), but k8s.io/kubernetes only appears in the replace block, not in require. Without a require entry, the replace directive won't be applied and the build will fail or behave unexpectedly. Add a require statement for k8s.io/kubernetes before the replace block.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/integration/go.mod` around lines 16 - 24, The go.mod is missing a
require entry for k8s.io/kubernetes, so the replace directive for
k8s.io/kubernetes won't be applied; add a require line for "k8s.io/kubernetes"
(e.g., require k8s.io/kubernetes v1.30.1) in the require block before the
replace block so imports like k8s.io/kubernetes/test/e2e/framework resolve and
the replace mapping is used.
♻️ Duplicate comments (1)
test/integration/Makefile (1)

35-35: ⚠️ Potential issue | 🟠 Major

Use concrete suite names in integration-docker-run.

run-suite is still called with .../qe_tests, while suites are registered as .../qe_tests/sno and .../qe_tests/mno. This can skip tests or fail in Docker runs.

Proposed fix
 integration-docker-run: integration-docker-build
-	docker run --rm $(FULL_IMAGE) run-suite openshift/lvm-operator/test/integration/qe_tests
+	docker run --rm $(FULL_IMAGE) run-suite -c 1 openshift/lvm-operator/test/integration/qe_tests/sno
+	docker run --rm $(FULL_IMAGE) run-suite -c 1 openshift/lvm-operator/test/integration/qe_tests/mno
As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/integration/Makefile` at line 35, The docker command in the
integration-docker-run target uses a generic suite path
"openshift/lvm-operator/test/integration/qe_tests" which doesn't match
registered suites (e.g., ".../qe_tests/sno" and ".../qe_tests/mno"); update the
docker run invocation to call the concrete suite names (referencing the existing
suite identifiers like "openshift/lvm-operator/test/integration/qe_tests/sno"
and "openshift/lvm-operator/test/integration/qe_tests/mno") so Docker runs the
specific suites (either run them separately or pass both concrete suite paths)
instead of the generic directory.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@test/integration/go.mod`:
- Around line 16-24: The go.mod is missing a require entry for
k8s.io/kubernetes, so the replace directive for k8s.io/kubernetes won't be
applied; add a require line for "k8s.io/kubernetes" (e.g., require
k8s.io/kubernetes v1.30.1) in the require block before the replace block so
imports like k8s.io/kubernetes/test/e2e/framework resolve and the replace
mapping is used.

---

Duplicate comments:
In `@test/integration/Makefile`:
- Line 35: The docker command in the integration-docker-run target uses a
generic suite path "openshift/lvm-operator/test/integration/qe_tests" which
doesn't match registered suites (e.g., ".../qe_tests/sno" and
".../qe_tests/mno"); update the docker run invocation to call the concrete suite
names (referencing the existing suite identifiers like
"openshift/lvm-operator/test/integration/qe_tests/sno" and
"openshift/lvm-operator/test/integration/qe_tests/mno") so Docker runs the
specific suites (either run them separately or pass both concrete suite paths)
instead of the generic directory.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 107274df-5538-4464-aed1-bc8018937e4a

📥 Commits

Reviewing files that changed from the base of the PR and between 52e32be and c510db0.

⛔ Files ignored due to path filters (1)
  • test/integration/go.sum is excluded by !**/*.sum
📒 Files selected for processing (5)
  • test/integration/Makefile
  • test/integration/go.mod
  • test/integration/integration.go
  • test/integration/qe_tests/lvms.go
  • test/integration/qe_tests/lvms_utils.go
✅ Files skipped from review due to trivial changes (1)
  • test/integration/qe_tests/lvms_utils.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/integration/integration.go

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/integration/Dockerfile`:
- Line 7: The Dockerfile uses "COPY . ./" before "WORKDIR /workspace", which
places the repo contents in the image root and causes subsequent build steps
(the Go builds in the Dockerfile) to run from the wrong module context; fix by
either moving the "WORKDIR /workspace" line above the "COPY . ./" instruction or
change the COPY destination to the explicit workspace path (e.g., replace "COPY
. ./" with "COPY . /workspace") so that the repository is copied into the
intended /workspace module before the Go build steps run.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 7d74692c-68b2-4d28-a81d-39c4caac5c83

📥 Commits

Reviewing files that changed from the base of the PR and between c510db0 and d94476e.

⛔ Files ignored due to path filters (1)
  • test/integration/go.sum is excluded by !**/*.sum
📒 Files selected for processing (6)
  • test/integration/Dockerfile
  • test/integration/Makefile
  • test/integration/go.mod
  • test/integration/integration.go
  • test/integration/qe-tests/lvms.go
  • test/integration/qe-tests/lvms_utils.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/integration/integration.go

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
test/integration/qe_tests/lvms_utils.go (2)

509-516: TLS verification disabled for pod exec.

Disabling TLS verification (config.Insecure = true) weakens security posture. While acceptable in test code, consider documenting why this is necessary (certificate chain issues with in-cluster exec).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/integration/qe_tests/lvms_utils.go` around lines 509 - 516, The
execCommandInPod function currently disables TLS verification on a copied kube
config (setting config.Insecure, config.TLSClientConfig.Insecure, CAData/CAFile)
which weakens security; update the code by adding a concise explanatory comment
above the config modifications in execCommandInPod (referencing
TestClient.Config and the copied config variable) stating that TLS verification
is intentionally disabled only for integration tests due to in-cluster
certificate/exec handshake issues, and document any assumptions or risks (e.g.,
not for production) so future readers understand why this exception exists.

2175-2191: Passphrase visible in process list.

The passphrase is passed via echo which exposes it in ps output and potentially shell history. For test code this is low risk, but worth noting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/integration/qe_tests/lvms_utils.go` around lines 2175 - 2191, The
passphrase is exposed via shell piping (echo) in setDiskEncryptPassphrase;
instead create a temporary file on each worker with restrictive perms, write the
passphrase into it, use cryptsetup's --key-file option for both luksFormat and
luksOpen, then securely remove the temp file; update the calls that build
cmd/openCmd (and where execCommandInNode is invoked) to: create temp (mktemp),
chmod 600, write passphrase, run cryptsetup ... --key-file <tempfile>
<diskName>, then rm -f <tempfile>, using the existing diskName, workerName,
execCommandInNode and setDiskEncryptPassphrase identifiers to locate and change
the logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/integration/qe_tests/lvms_utils.go`:
- Around line 1507-1514: The loop over pvList ignores errors from the PV
update/delete calls; update the block that iterates pvList.Items (the pv
variable) to capture and handle the returned errors from
tc.Clientset.CoreV1().PersistentVolumes().Update(...) and Delete(...): check
each call's error, log a clear error message including pv.Name and the error,
and return or retry/collect failures as appropriate (e.g., continue on non-fatal
errors or fail the test on fatal ones) so orphaned PVs are not silently left
behind; ensure you reference the same pv.Finalizers modification and use the
returned objects/errors from Update and Delete for correct handling.

---

Nitpick comments:
In `@test/integration/qe_tests/lvms_utils.go`:
- Around line 509-516: The execCommandInPod function currently disables TLS
verification on a copied kube config (setting config.Insecure,
config.TLSClientConfig.Insecure, CAData/CAFile) which weakens security; update
the code by adding a concise explanatory comment above the config modifications
in execCommandInPod (referencing TestClient.Config and the copied config
variable) stating that TLS verification is intentionally disabled only for
integration tests due to in-cluster certificate/exec handshake issues, and
document any assumptions or risks (e.g., not for production) so future readers
understand why this exception exists.
- Around line 2175-2191: The passphrase is exposed via shell piping (echo) in
setDiskEncryptPassphrase; instead create a temporary file on each worker with
restrictive perms, write the passphrase into it, use cryptsetup's --key-file
option for both luksFormat and luksOpen, then securely remove the temp file;
update the calls that build cmd/openCmd (and where execCommandInNode is invoked)
to: create temp (mktemp), chmod 600, write passphrase, run cryptsetup ...
--key-file <tempfile> <diskName>, then rm -f <tempfile>, using the existing
diskName, workerName, execCommandInNode and setDiskEncryptPassphrase identifiers
to locate and change the logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: e0a8aebc-80da-4787-a394-d7bf20ab6603

📥 Commits

Reviewing files that changed from the base of the PR and between 533bce3 and 074cacf.

⛔ Files ignored due to path filters (1)
  • test/integration/go.sum is excluded by !**/*.sum
📒 Files selected for processing (8)
  • test/integration/Dockerfile
  • test/integration/Makefile
  • test/integration/go.mod
  • test/integration/integration.go
  • test/integration/qe-tests/lvms.go
  • test/integration/qe-tests/lvms_utils.go
  • test/integration/qe_tests/lvms.go
  • test/integration/qe_tests/lvms_utils.go
💤 Files with no reviewable changes (2)
  • test/integration/qe-tests/lvms.go
  • test/integration/qe-tests/lvms_utils.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/integration/Dockerfile
  • test/integration/go.mod

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 25, 2026

@mmakwana30: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-single-node-integration-tests 074cacf link true /test e2e-aws-single-node-integration-tests
ci/prow/images 074cacf link true /test images
ci/prow/precommit-check 074cacf link true /test precommit-check
ci/prow/snyk-deps 074cacf link false /test snyk-deps
ci/prow/e2e-aws 074cacf link true /test e2e-aws
ci/prow/e2e-aws-hypershift 074cacf link true /test e2e-aws-hypershift
ci/prow/e2e-aws-single-node 074cacf link true /test e2e-aws-single-node

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants