Fix NPE in NASBackupProvider when no running KVM host is available#12680
Fix NPE in NASBackupProvider when no running KVM host is available#12680jmsperu wants to merge 1 commit intoapache:mainfrom
Conversation
ResourceManager.findOneRandomRunningHostByHypervisor() can return null when no KVM host in the zone has status=Up (e.g. during management server startup, brief agent disconnections, or host state transitions). NASBackupProvider.syncBackupStorageStats() and deleteBackup() call host.getId() without a null check, causing a NullPointerException that crashes the entire BackupSyncTask background job every sync interval. This adds null checks in both methods: - syncBackupStorageStats: log a warning and return early - deleteBackup: throw CloudRuntimeException with a descriptive message
|
Congratulations on your first Pull Request and welcome to the Apache CloudStack community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/cloudstack/blob/main/CONTRIBUTING.md)
|
There was a problem hiding this comment.
Pull request overview
This PR fixes a NullPointerException in NASBackupProvider that occurs when no running KVM host is available in a zone. The NPE was crashing the BackupSyncTask background job every sync interval and preventing backup storage stats from being updated.
Changes:
- Added null check in
syncBackupStorageStats()with early return and warning log when no running KVM host is found - Added null check in
deleteBackup()with descriptive exception when no running KVM host is available
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (CollectionUtils.isEmpty(repositories)) { | ||
| return; | ||
| } |
There was a problem hiding this comment.
The empty repositories check is placed after the method starts processing but before the host lookup. If repositories are empty, the host lookup at line 573 becomes unnecessary. Consider moving this check before the host lookup, or better yet, to the beginning of the method to avoid any unnecessary operations when there are no repositories to process.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #12680 +/- ##
============================================
- Coverage 17.92% 17.92% -0.01%
Complexity 16154 16154
============================================
Files 5939 5939
Lines 533181 533188 +7
Branches 65237 65240 +3
============================================
+ Hits 95585 95586 +1
- Misses 426856 426860 +4
- Partials 10740 10742 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@jmsperu , would you want this included in the next 22 release (in which case you might to want to base it of the 4.22 branch) |
Description
NASBackupProvider.syncBackupStorageStats()crashes with aNullPointerExceptionwhenResourceManager.findOneRandomRunningHostByHypervisor()returnsnull. This happens when no KVM host in the zone hasstatus=Upat the moment theBackupSyncTaskruns (e.g., during management server startup, brief agent disconnections, or host state transitions).The NPE kills the entire
BackupSyncTaskbackground job every sync interval (default 300s), flooding the management server log with stack traces and preventing backup storage stats from being updated.The same pattern exists in
deleteBackup()wherehostcan be null when the VM is removed and no running KVM host is available.Stack Trace
Changes
syncBackupStorageStats(): Add early return with warning log when no running KVM host is founddeleteBackup(): Add null check with descriptiveCloudRuntimeExceptionHow Has This Been Tested
Tested on a production CloudStack 4.22.0.0 deployment with 3 KVM hosts and NAS (NFS) backup provider. Before the fix, the NPE fired every 5 minutes after management server restart. After applying the patch, the
BackupSyncTaskruns cleanly — logging a warning during startup and resuming normally once hosts reconnect.Types of changes
Screenshot / Video (if applicable)
Before (every 5 minutes):
After: