A machine learning project that predicts credit card defaulters using several models and evaluation strategies. It focuses on domain-specific metrics such as Recall (Class 1) and Precision (Class 0) rather than accuracy alone.
- eda/: Exploratory data analysis notebook
- data/: Data loading utilities
- preprocessing/: Feature encoding, SMOTE, and other transformations
- modeling/: Model training and tuning scripts
- evaluation/: Evaluation and metric plotting
- Logistic Regression
- Random Forest
- XGBoost
- Hyperparameter tuning using:
  - Accuracy (default)
  - Domain-specific custom score
- ROC AUC Score (model's ability to distinguish between the 2 classes: default/non-default)
- Recall of Class 1 (catching defaulters)
- Precision of Class 0 (preserving creditworthy leads)
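
As a rough illustration of the two class-specific metrics listed above, the sketch below computes them from confusion-matrix counts in plain Python. The function names and the toy labels are made up for this example and are not taken from the project's code. ROC AUC is left out because it needs predicted scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn), treating class 1 (default) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def recall_class1(y_true, y_pred):
    """Share of actual defaulters that the model catches: tp / (tp + fn)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision_class0(y_true, y_pred):
    """Share of predicted non-defaulters that really are non-defaulters: tn / (tn + fn)."""
    _, _, tn, fn = confusion_counts(y_true, y_pred)
    return tn / (tn + fn) if (tn + fn) else 0.0


# Toy data (hypothetical): 8 non-defaulters followed by 2 defaulters.
y_true = [0] * 8 + [1] * 2
always_non_default = [0] * 10                 # never flags anyone
cautious_model = [0] * 6 + [1] * 4            # flags both defaulters plus 2 good customers

for name, y_pred in [("always_non_default", always_non_default),
                     ("cautious_model", cautious_model)]:
    print(name,
          "accuracy:", accuracy(y_true, y_pred),
          "recall(1):", recall_class1(y_true, y_pred),
          "precision(0):", precision_class0(y_true, y_pred))
```

Both toy predictors reach the same accuracy on this data, yet only the second one catches any defaulters, which is the kind of gap these metrics are meant to expose.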
- Accuracy can be misleading in imbalanced datasets.
- Business context determines the right evaluation metric.
- Hyperparameter tuning has to be aligned with the right evaluation metric. In this project, a custom metric was used during tuning, with a focus on AUC, Recall (1), and Precision (0).
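
One possible way to tune against such a custom metric is sketched below with scikit-learn. It is only an assumption about how the three metrics might be combined: the `domain_score` function, the `WEIGHTS` values, the synthetic data, and the parameter grid are all hypothetical and are not the project's actual scoring code. scikit-learn accepts any callable with the signature `(estimator, X, y)` as the `scoring` argument, which is what the sketch relies on.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV

# Hypothetical weights: the README does not say how the metrics are blended.
WEIGHTS = {"auc": 0.4, "recall_1": 0.4, "precision_0": 0.2}


def domain_score(estimator, X, y):
    """Weighted blend of ROC AUC, Recall (class 1), and Precision (class 0)."""
    proba_default = estimator.predict_proba(X)[:, 1]  # probability of class 1
    y_pred = estimator.predict(X)
    auc = roc_auc_score(y, proba_default)
    recall_1 = recall_score(y, y_pred, pos_label=1, zero_division=0)
    precision_0 = precision_score(y, y_pred, pos_label=0, zero_division=0)
    return (WEIGHTS["auc"] * auc
            + WEIGHTS["recall_1"] * recall_1
            + WEIGHTS["precision_0"] * precision_0)


# Synthetic, imbalanced stand-in for the credit card data (about 20% defaulters).
X, y = make_classification(n_samples=600, n_features=10,
                           weights=[0.8, 0.2], random_state=0)

search = GridSearchCV(
    LogisticRegression(max_iter=1000, class_weight="balanced"),
    param_grid={"C": [0.1, 1.0, 10.0]},
    scoring=domain_score,  # used instead of the default accuracy
    cv=3,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Because the weights sum to 1 and each component lies between 0 and 1, the blended score also stays in that range, which keeps it comparable across models.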
# Install dependencies (optionally inside a virtual environment)
pip install -r requirements.txt
# Run full pipeline
python main.py