Description
We recently added RAID alerts to check RAID health (see #7463), but one relevant scenario is not yet covered.
If an instance starts with an already degraded RAID1 array (for example, a disk is missing at boot time), the system boots with a "device missing" status but no alert is raised.
As a result, administrators may not be notified that the RAID is degraded.
Steps to reproduce
- Configure a RAID1 array.
- Stop the instance or node.
- Remove or detach one of the RAID devices.
- Start the instance again.
Expected behavior
An alert should be triggered when the system starts with a degraded RAID array (device missing).
Actual behavior
The system boots successfully with the RAID in a degraded state, but no alert is generated.
We need to add an alert rule that detects RAID arrays that are already degraded at boot time ("device missing" state).
The following Prometheus alert template should be integrated into the metrics app:
https://samber.github.io/awesome-prometheus-alerts/rules.html#rule-host-and-hardware-1-24
This rule would allow detection of RAID devices that are missing or degraded when the node starts.
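To make the "device missing" condition concrete, here is a minimal Python sketch of the check such an alert has to express. It is not the linked rule and not how the metrics app works; it only assumes the standard Linux `/proc/mdstat` layout, where a status such as `[2/1] [U_]` means 2 devices are expected and only 1 is active. The function name `find_degraded_arrays` and the sample text are hypothetical, for illustration only.

```python
import re

# An array header line looks like: "md0 : active raid1 sda1[0]"
ARRAY_RE = re.compile(r"^(md\S+)\s*:")
# A status line ends with e.g. "[2/1] [U_]" (expected/active devices, then per-slot flags)
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")

# Hypothetical sample: a RAID1 array that booted with one member missing.
SAMPLE_DEGRADED = """\
Personalities : [raid1]
md0 : active raid1 sda1[0]
      1048512 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""


def find_degraded_arrays(mdstat_text):
    """Return {array_name: missing_device_count} for every degraded array."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected = int(status.group(1))
            active = int(status.group(2))
            if active < expected:
                degraded[current] = expected - active
            current = None  # status line consumed for this array
    return degraded


print(find_degraded_arrays(SAMPLE_DEGRADED))  # {'md0': 1}
```

On a real node the input would come from reading `/proc/mdstat`. The point for this issue is that the condition is a state ("fewer active devices than expected") and not a transition, so an alert built on it should fire even when the array was already degraded at boot.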
Components
metrics:1.2.3