
shuhbam199/Churn-Prediction


Churn-Prediction

🚖 Ola Driver Churn Prediction

This project aims to predict whether a driver will churn (leave) Ola using historical, demographic, and performance-related data. The goal is to help the organization reduce churn by identifying at-risk drivers and enabling proactive retention strategies.


📌 Problem Statement

Recruiting and retaining drivers is a major challenge for ride-sharing companies like Ola. High driver churn affects:

  • Operational stability
  • Customer experience
  • Driver acquisition cost (acquiring a new driver costs more than retaining an existing one)

By using machine learning, this project predicts driver churn probability based on:

  • Demographics (e.g., age, gender, education)
  • Performance metrics (e.g., ratings, business value)
  • Tenure (e.g., date of joining, last working date)
  • Behavioral patterns (e.g., income or rating changes)
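The tenure and behavioral inputs listed above can be turned into model features along these lines. This is a minimal sketch in plain Python, not the project's code: the record layout, the field names (joining_date, last_working_date, ratings, incomes), the snapshot date, and all values are assumptions made for illustration.

```python
from datetime import date

# Hypothetical per-driver records; field names and values are illustrative only.
drivers = [
    {
        "driver_id": 1,
        "joining_date": date(2019, 1, 1),
        "last_working_date": date(2019, 12, 31),  # driver has left
        "ratings": [2, 2, 1],                      # quarterly ratings, oldest first
        "incomes": [50000, 50000, 50000],
    },
    {
        "driver_id": 2,
        "joining_date": date(2019, 6, 1),
        "last_working_date": None,                 # still active
        "ratings": [1, 2, 3],
        "incomes": [40000, 40000, 45000],
    },
]


def engineer_features(record, snapshot_date):
    """Derive tenure and change features for one driver."""
    # Active drivers have no last working date, so measure up to the snapshot.
    end = record["last_working_date"] or snapshot_date
    rating_change = record["ratings"][-1] - record["ratings"][0]
    income_change = record["incomes"][-1] - record["incomes"][0]
    return {
        "driver_id": record["driver_id"],
        "days_served": (end - record["joining_date"]).days,
        "rating_change": rating_change,
        "income_change": income_change,
        "rating_improved": int(rating_change > 0),    # 1 if the rating went up
        "income_increased": int(income_change > 0),   # 1 if the income went up
    }


features = [engineer_features(d, snapshot_date=date(2020, 1, 1)) for d in drivers]
for row in features:
    print(row)
```

Each driver ends up as a single row with a tenure in days, signed changes in rating and income, and 0/1 improvement flags.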

🧠 Project Highlights

  • Data Cleaning & Imputation: Missing values are imputed with a KNN imputer.
  • Feature Engineering:
    • Change in rating and income over time
    • Flags for improvement in grade/rating
    • Time served in days
  • Models Built:
    • Decision Tree
    • Random Forest (GridSearch tuned)
    • XGBoost (best performing)
    • LightGBM (fastest, good baseline)
  • Class Imbalance: Addressed using SMOTE
  • Model Interpretation: SHAP values used to explain predictions at an individual level
  • Interactive App: Streamlit frontend for business users to test custom driver profiles
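To give a feel for the class-imbalance step named above, here is a dependency-free sketch of the core SMOTE idea: each synthetic minority sample is an interpolation between a real minority sample and one of its nearest minority neighbours. The function name smote_sketch, its parameters, and the sample points are hypothetical; the project most likely relies on a library implementation, which is an assumption rather than something stated here.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical scaled feature vectors for the minority class.
minority = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.7), (0.3, 0.85)]
new_points = smote_sketch(minority, n_new=6)
print(len(new_points), "synthetic samples")
```

Resampling of this kind is normally applied to the training split only, so that the evaluation data keeps its original class balance.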

📊 Dataset

The dataset contains ~19,000 driver records with the following fields (subset):

Column               | Description
Age                  | Driver's age
Gender               | 0 = Female, 1 = Male
Income               | Total income earned
Joining Designation  | Initial designation at the time of joining
Quarterly Rating     | Ola's internal rating of driver
Total Business Value | Revenue generated by the driver
Last Grade           | Last performance grade
Churn                | Target variable (1 = Churn, 0 = Stay)

🔍 SHAP-based Explainability

The app highlights top features driving churn prediction with direction:

  • ↑ Towards Churn → pushing model to predict churn
  • ↓ Away from Churn → reducing churn risk

Key drivers often include:

  • Last Rating
  • Change in Rating
  • Total Business Value
  • Income Increase
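The direction labels above can be derived from the sign of each feature's SHAP value. The sketch below assumes that a positive value pushes the prediction towards the churn class, which holds when the explainer output refers to class 1. The helper describe_contributions and the numbers in example are hypothetical and are not output from the trained model.

```python
def describe_contributions(contributions, top_n=3):
    """Rank features by absolute contribution and label their direction."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    lines = []
    for name, value in ranked[:top_n]:
        if value > 0:
            direction = "↑ Towards Churn"
        elif value < 0:
            direction = "↓ Away from Churn"
        else:
            direction = "no effect"
        lines.append(f"{name}: {value:+.2f} ({direction})")
    return lines


# Hypothetical per-feature SHAP values for one driver (not model output).
example = {
    "Last Rating": 0.42,
    "Change in Rating": 0.18,
    "Total Business Value": -0.35,
    "Income Increase": -0.05,
}
for line in describe_contributions(example):
    print(line)
```

Sorting by absolute value keeps strong protective features visible next to strong risk features, instead of listing only those that raise the churn score.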

🚀 Running the App

1. Install dependencies

2. Start Streamlit by launching app.py (typically: streamlit run app.py)

📁 Project Structure

Churn-Prediction/
├── ola_churn.py        # EDA + model building script
├── app.py              # Streamlit frontend
├── xgb_model.pkl       # Trained XGBoost model
├── scaler.pkl          # Scaler used in training
├── xgb_explainer.pkl   # SHAP explainer
├── ola.csv             # Dataset
└── README.md           # This file

📈 Model Performance (XGBoost)

Metric    | Score
Accuracy  | ~87%
F1 Score  | ~85%
Recall    | ~86%
Precision | ~83%
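For reference, the four metrics above are all functions of the confusion-matrix counts. The sketch below shows the formulas; the counts passed in are hypothetical and are not the project's confusion matrix. The last lines check that the reported precision and recall are consistent with the reported F1 score, assuming all three refer to the churn class.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted churners who really churn
    recall = tp / (tp + fn)      # share of actual churners that are caught
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts chosen for illustration only.
metrics = classification_metrics(tp=86, fp=18, fn=14, tn=82)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# F1 is the harmonic mean of precision and recall, so the reported
# precision (~0.83) and recall (~0.86) imply an F1 score.
reported_f1 = 2 * 0.83 * 0.86 / (0.83 + 0.86)
print(f"F1 implied by the reported precision and recall: {reported_f1:.3f}")
```

The implied F1 comes out at about 0.845, which agrees with the ~85% in the table.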

Streamlit App: [screenshot of the app]
