A town's pump station has equipment that is failing. When equipment fails, the entire process comes to a halt, and the failed equipment must be taken out of service and repaired before the station can restart. An unexpected failure also takes maintenance personnel longer to resolve than a proactive equipment check would.
To minimize overall process downtime, the maintenance supervisor would therefore like to address system issues proactively, before they bring down the entire pump station.
Machine learning is well suited to this problem because:
- There are complex patterns and non-linear relationships,
- There is a large amount of labeled data,
- There is high dimensionality,
- The environment is susceptible to changes over time, and
- This is a classification task.
Overall steps:
- Data Engineering & Exploratory Data Analysis
  - Sanitize and Prepare the Data for Modeling
  - Perform Feature Engineering
  - Use Machine Learning to Impute Missing Data
  - Analyze and Visualize for Machine Learning
- Modeling
  - Model Selection
  - Training
  - Hyperparameter Optimization
  - ML Model Evaluation
Using a tuned XGBoost model, we correctly identified every equipment malfunction event. This came at the expense of incorrectly labeling 7 instances as equipment malfunction events (false positives).
This trade-off is acceptable for this scenario, because checking equipment that turns out not to be faulty is less costly than failing to flag equipment that is truly faulty (approaching a breaking point).
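The trade-off between false alarms and missed failures can be made concrete with a small sketch. The labels, the two hypothetical models (`cautious` and `lenient`), and the cost values below are invented purely for illustration; they are not the project's data or results, and the helper names are assumptions rather than part of the original code.

```python
from typing import NamedTuple, Sequence


class ConfusionCounts(NamedTuple):
    tp: int  # malfunction predicted, malfunction occurred
    fp: int  # malfunction predicted, equipment was healthy (false alarm)
    fn: int  # healthy predicted, malfunction occurred (missed failure)
    tn: int  # healthy predicted, equipment was healthy


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionCounts:
    """Count outcomes for binary labels where 1 = malfunction, 0 = healthy."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return ConfusionCounts(tp, fp, fn, tn)


def recall(c: ConfusionCounts) -> float:
    """Share of real malfunctions that were flagged."""
    actual_positives = c.tp + c.fn
    return c.tp / actual_positives if actual_positives else 0.0


def precision(c: ConfusionCounts) -> float:
    """Share of flagged instances that were real malfunctions."""
    predicted_positives = c.tp + c.fp
    return c.tp / predicted_positives if predicted_positives else 0.0


def total_cost(c: ConfusionCounts, cost_false_alarm: float, cost_missed_failure: float) -> float:
    """Weigh false alarms against missed failures using assumed unit costs."""
    return c.fp * cost_false_alarm + c.fn * cost_missed_failure


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
cautious = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # flags every failure, plus 2 false alarms
lenient = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]   # no false alarms, but misses 1 failure

cautious_counts = confusion_counts(y_true, cautious)
lenient_counts = confusion_counts(y_true, lenient)

# Hypothetical costs: a missed failure is assumed 10x as costly as a needless check.
cautious_cost = total_cost(cautious_counts, cost_false_alarm=1.0, cost_missed_failure=10.0)
lenient_cost = total_cost(lenient_counts, cost_false_alarm=1.0, cost_missed_failure=10.0)

print(f"cautious: recall={recall(cautious_counts):.2f}, "
      f"precision={precision(cautious_counts):.2f}, cost={cautious_cost}")
print(f"lenient:  recall={recall(lenient_counts):.2f}, "
      f"precision={precision(lenient_counts):.2f}, cost={lenient_cost}")
```

Under these assumed costs, the cautious model is cheaper overall even though its precision is lower. This mirrors the reasoning above: accepting some false alarms is worthwhile when a missed failure is the more expensive error.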