Skip to content

Comments

storage: validate NFS secondary storage mount using SSVM during image store discovery#12678

Open
SURYAS1306 wants to merge 1 commit intoapache:4.22from
SURYAS1306:fix-nfs-secondary-validation-12674
Open

storage: validate NFS secondary storage mount using SSVM during image store discovery#12678
SURYAS1306 wants to merge 1 commit intoapache:4.22from
SURYAS1306:fix-nfs-secondary-validation-12674

Conversation

@SURYAS1306
Copy link

@SURYAS1306 SURYAS1306 commented Feb 21, 2026

###Description

This PR addresses issue #12674 where CloudStack allows adding a secondary storage (image store) even when the provided NFS mount path is invalid.

Currently during image store discovery, CloudStack persists the image store in the database without validating whether the Secondary Storage VM (SSVM) can access the NFS export.
The failure only appears later during system VM template seeding (setup-sysvm-tmplt), producing errors in logs while the API/UI reports successful storage addition.

This leads to an inconsistent state where unusable secondary storage is registered in the system.

Previously validation occurred only during System VM template seeding; this change introduces fail-fast validation during image store discovery.

This PR performs immediate validation using SSVM during image store discovery:

  • During discovery, the management server sends a setup command (generateSetupCommand) to an available SSVM in the same zone to verify the storage can be prepared/mounted.
  • If the SSVM setup command fails:

    • the created image store is rolled back
    • a CloudRuntimeException is thrown
    • the API returns failure instead of success

This prevents unusable secondary storage from being registered and avoids late failures during system VM template registration.

Fixes: #12674


Types of changes

  • Breaking change
  • New feature
  • Bug fix
  • Enhancement
  • Cleanup
  • Build/CI
  • Test

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

How Has This Been Tested?

Environment

  • Apache CloudStack 4.22
  • KVM hypervisor
  • NFS secondary storage

Test Scenarios

  1. Valid NFS mount

    • Secondary storage added successfully
    • SSVM mount successful
    • Templates seeded correctly
  2. Invalid NFS mount

    • API returns error
    • Image store not persisted in database
    • No template seeding attempted
  3. Regression check

    • Primary storage behavior unchanged
    • Existing valid secondary storages unaffected

Unit tests executed:

mvn -pl server -Dtest=StorageManagerImplTest,SecondaryStorageManagerImplTest test

How did you try to break this feature and the system with this change?

  • Wrong NFS export path
  • Non-exported directory
  • Unreachable NFS server IP

In all cases, storage creation failed correctly and no invalid store was registered.

Comment on lines +3997 to +4006
if (zoneId != null) {
List<HostVO> ssvmHosts = _hostDao.listByType(Host.Type.SecondaryStorageVM);
boolean mountSuccess = false;
String failureReason = "No Secondary Storage VM available to validate the NFS mount.";

for (HostVO ssvm : ssvmHosts) {
if (ssvm.getDataCenterId() != zoneId.longValue()) {
continue;
}

Copy link
Contributor

@shwstppr shwstppr Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole validation could be a separate method.
And could it be an issue for a newly deployed zone which will not have a SSVM until a secondary store is added?

HostDao can list by both type and datacenter ID. If a method is not present it can be added.

Copy link
Contributor

@shwstppr shwstppr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will need validation for a new zone

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to prevent registering unusable NFS secondary storage by performing fail-fast validation during image store discovery, using a Secondary Storage VM (SSVM) to verify the mount before proceeding.

Changes:

  • Injects SecondaryStorageVmManager into StorageManagerImpl.
  • Adds SSVM-based validation logic inside discoverImageStore and rolls back the image store on validation failure.
  • Returns an error to the API caller when SSVM setup/mount validation fails.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +3997 to +4026
if (zoneId != null) {
List<HostVO> ssvmHosts = _hostDao.listByType(Host.Type.SecondaryStorageVM);
boolean mountSuccess = false;
String failureReason = "No Secondary Storage VM available to validate the NFS mount.";

for (HostVO ssvm : ssvmHosts) {
if (ssvm.getDataCenterId() != zoneId.longValue()) {
continue;
}

try {
boolean result = _ssVmMgr.generateSetupCommand(ssvm.getId());
if (result) {
mountSuccess = true;
break;
} else {
failureReason = "Secondary Storage VM failed to mount the NFS secondary storage.";
}
} catch (Exception e) {
failureReason = e.getMessage();
}
}

if (!mountSuccess) {
// cleanup created store
_imageStoreDao.remove(store.getId());
throw new CloudRuntimeException("Invalid secondary storage mount: " + failureReason);
}
}

Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SSVM selection uses _hostDao.listByType(SecondaryStorageVM) and only filters by zone ID. This includes hosts in non-Up states and can result in false failures (or an unhelpful "invalid mount" error) when the zone has no Up/Connecting SSVM yet. Prefer using SecondaryStorageVmManager.listUpAndConnectingSecondaryStorageVmHost(zoneId) (or equivalent) and consider how to handle zones with no available SSVM (e.g., start one, retry, or skip validation with a clearer error).

Suggested change
if (zoneId != null) {
List<HostVO> ssvmHosts = _hostDao.listByType(Host.Type.SecondaryStorageVM);
boolean mountSuccess = false;
String failureReason = "No Secondary Storage VM available to validate the NFS mount.";
for (HostVO ssvm : ssvmHosts) {
if (ssvm.getDataCenterId() != zoneId.longValue()) {
continue;
}
try {
boolean result = _ssVmMgr.generateSetupCommand(ssvm.getId());
if (result) {
mountSuccess = true;
break;
} else {
failureReason = "Secondary Storage VM failed to mount the NFS secondary storage.";
}
} catch (Exception e) {
failureReason = e.getMessage();
}
}
if (!mountSuccess) {
// cleanup created store
_imageStoreDao.remove(store.getId());
throw new CloudRuntimeException("Invalid secondary storage mount: " + failureReason);
}
}
if (zoneId != null) {
List<HostVO> ssvmHosts = _ssVmMgr.listUpAndConnectingSecondaryStorageVmHost(zoneId);
if (CollectionUtils.isEmpty(ssvmHosts)) {
if (logger.isInfoEnabled()) {
logger.info("Skipping immediate secondary storage mount validation: no Up/Connecting Secondary Storage VM available in zone " + zoneId);
}
} else {
boolean mountSuccess = false;
String failureReason = "No Secondary Storage VM was able to validate the NFS mount.";
for (HostVO ssvm : ssvmHosts) {
try {
boolean result = _ssVmMgr.generateSetupCommand(ssvm.getId());
if (result) {
mountSuccess = true;
break;
} else {
failureReason = "Secondary Storage VM failed to mount the NFS secondary storage.";
}
} catch (Exception e) {
failureReason = e.getMessage();
}
}
if (!mountSuccess) {
// cleanup created store
_imageStoreDao.remove(store.getId());
throw new CloudRuntimeException("Invalid secondary storage mount: " + failureReason);
}
}
}

Copilot uses AI. Check for mistakes.
Comment on lines +3997 to +4026
if (zoneId != null) {
List<HostVO> ssvmHosts = _hostDao.listByType(Host.Type.SecondaryStorageVM);
boolean mountSuccess = false;
String failureReason = "No Secondary Storage VM available to validate the NFS mount.";

for (HostVO ssvm : ssvmHosts) {
if (ssvm.getDataCenterId() != zoneId.longValue()) {
continue;
}

try {
boolean result = _ssVmMgr.generateSetupCommand(ssvm.getId());
if (result) {
mountSuccess = true;
break;
} else {
failureReason = "Secondary Storage VM failed to mount the NFS secondary storage.";
}
} catch (Exception e) {
failureReason = e.getMessage();
}
}

if (!mountSuccess) {
// cleanup created store
_imageStoreDao.remove(store.getId());
throw new CloudRuntimeException("Invalid secondary storage mount: " + failureReason);
}
}

Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The added validation block isn’t indented consistently with the surrounding method body (the if (zoneId != null) starts at column 1). This is likely to violate the project’s formatting/checkstyle rules and makes the method harder to read; please align indentation with the surrounding code style.

Suggested change
if (zoneId != null) {
List<HostVO> ssvmHosts = _hostDao.listByType(Host.Type.SecondaryStorageVM);
boolean mountSuccess = false;
String failureReason = "No Secondary Storage VM available to validate the NFS mount.";
for (HostVO ssvm : ssvmHosts) {
if (ssvm.getDataCenterId() != zoneId.longValue()) {
continue;
}
try {
boolean result = _ssVmMgr.generateSetupCommand(ssvm.getId());
if (result) {
mountSuccess = true;
break;
} else {
failureReason = "Secondary Storage VM failed to mount the NFS secondary storage.";
}
} catch (Exception e) {
failureReason = e.getMessage();
}
}
if (!mountSuccess) {
// cleanup created store
_imageStoreDao.remove(store.getId());
throw new CloudRuntimeException("Invalid secondary storage mount: " + failureReason);
}
}
if (zoneId != null) {
List<HostVO> ssvmHosts = _hostDao.listByType(Host.Type.SecondaryStorageVM);
boolean mountSuccess = false;
String failureReason = "No Secondary Storage VM available to validate the NFS mount.";
for (HostVO ssvm : ssvmHosts) {
if (ssvm.getDataCenterId() != zoneId.longValue()) {
continue;
}
try {
boolean result = _ssVmMgr.generateSetupCommand(ssvm.getId());
if (result) {
mountSuccess = true;
break;
} else {
failureReason = "Secondary Storage VM failed to mount the NFS secondary storage.";
}
} catch (Exception e) {
failureReason = e.getMessage();
}
}
if (!mountSuccess) {
// cleanup created store
_imageStoreDao.remove(store.getId());
throw new CloudRuntimeException("Invalid secondary storage mount: " + failureReason);
}
}

Copilot uses AI. Check for mistakes.
Comment on lines +3996 to +4025
// Validate secondary storage mount immediately using SSVM
if (zoneId != null) {
List<HostVO> ssvmHosts = _hostDao.listByType(Host.Type.SecondaryStorageVM);
boolean mountSuccess = false;
String failureReason = "No Secondary Storage VM available to validate the NFS mount.";

for (HostVO ssvm : ssvmHosts) {
if (ssvm.getDataCenterId() != zoneId.longValue()) {
continue;
}

try {
boolean result = _ssVmMgr.generateSetupCommand(ssvm.getId());
if (result) {
mountSuccess = true;
break;
} else {
failureReason = "Secondary Storage VM failed to mount the NFS secondary storage.";
}
} catch (Exception e) {
failureReason = e.getMessage();
}
}

if (!mountSuccess) {
// cleanup created store
_imageStoreDao.remove(store.getId());
throw new CloudRuntimeException("Invalid secondary storage mount: " + failureReason);
}
}
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change introduces new behavior (fail-fast validation + rollback) in discoverImageStore, but there’s no accompanying unit test coverage for the new success/failure paths. Please add tests that mock SSVM host selection and generateSetupCommand results to assert: (1) invalid NFS causes an exception and the store is cleaned up, and (2) valid NFS proceeds without deleting the store.

Copilot uses AI. Check for mistakes.
Comment on lines +3996 to +4026
// Validate secondary storage mount immediately using SSVM
if (zoneId != null) {
List<HostVO> ssvmHosts = _hostDao.listByType(Host.Type.SecondaryStorageVM);
boolean mountSuccess = false;
String failureReason = "No Secondary Storage VM available to validate the NFS mount.";

for (HostVO ssvm : ssvmHosts) {
if (ssvm.getDataCenterId() != zoneId.longValue()) {
continue;
}

try {
boolean result = _ssVmMgr.generateSetupCommand(ssvm.getId());
if (result) {
mountSuccess = true;
break;
} else {
failureReason = "Secondary Storage VM failed to mount the NFS secondary storage.";
}
} catch (Exception e) {
failureReason = e.getMessage();
}
}

if (!mountSuccess) {
// cleanup created store
_imageStoreDao.remove(store.getId());
throw new CloudRuntimeException("Invalid secondary storage mount: " + failureReason);
}
}

Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new validation calls SecondaryStorageVmManager.generateSetupCommand(ssvmHostId), but that method programs all NFS image stores in the zone (iterates getImageStoresByScope and returns false if any setup fails). This can cause adding a new image store to fail due to an unrelated existing store problem, and it also doesn’t specifically validate the just-created store. Consider sending a setup/validation command targeted only at the newly initialized store (or extending SecondaryStorageVmManager with a method that validates a specific store/URL) so the result reflects only the store being added.

Suggested change
// Validate secondary storage mount immediately using SSVM
if (zoneId != null) {
List<HostVO> ssvmHosts = _hostDao.listByType(Host.Type.SecondaryStorageVM);
boolean mountSuccess = false;
String failureReason = "No Secondary Storage VM available to validate the NFS mount.";
for (HostVO ssvm : ssvmHosts) {
if (ssvm.getDataCenterId() != zoneId.longValue()) {
continue;
}
try {
boolean result = _ssVmMgr.generateSetupCommand(ssvm.getId());
if (result) {
mountSuccess = true;
break;
} else {
failureReason = "Secondary Storage VM failed to mount the NFS secondary storage.";
}
} catch (Exception e) {
failureReason = e.getMessage();
}
}
if (!mountSuccess) {
// cleanup created store
_imageStoreDao.remove(store.getId());
throw new CloudRuntimeException("Invalid secondary storage mount: " + failureReason);
}
}

Copilot uses AI. Check for mistakes.
Comment on lines +4021 to +4022
// cleanup created store
_imageStoreDao.remove(store.getId());
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rollback on mount validation failure only removes the image_store record. In this codebase, image store deletion also explicitly deletes image_store_details and other related records because delete-cascade won’t run (see deleteImageStore). This failure path should perform the same cleanup (ideally in a transaction) or call the appropriate lifecycle/service delete so the DB doesn’t retain orphaned details/refs when validation fails.

Suggested change
// cleanup created store
_imageStoreDao.remove(store.getId());

Copilot uses AI. Check for mistakes.
@codecov
Copy link

codecov bot commented Feb 23, 2026

Codecov Report

❌ Patch coverage is 0% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.61%. Comparing base (5caf6cd) to head (3019bec).

Files with missing lines Patch % Lines
...ain/java/com/cloud/storage/StorageManagerImpl.java 0.00% 20 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##               4.22   #12678   +/-   ##
=========================================
  Coverage     17.60%   17.61%           
- Complexity    15659    15660    +1     
=========================================
  Files          5917     5917           
  Lines        531394   531414   +20     
  Branches      64970    64975    +5     
=========================================
+ Hits          93575    93582    +7     
- Misses       427269   427282   +13     
  Partials      10550    10550           
Flag Coverage Δ
uitests 3.70% <ø> (ø)
unittests 18.68% <0.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants