This project demonstrates how to segment customers by their buying behavior using the K-Means clustering algorithm in Python. It walks through the process step by step, from preparing the data to clustering it.
The data comes from the Online Retail dataset in the UCI Machine Learning Repository.
- Clean the dataset.
- Build a clustering model to segment customers.
- Fine-tune the model and compare metrics.
- Handle Missing Values: Identify and manage missing data.
- Create Attributes:
  - Monetary: Total amount spent by each customer.
  - Frequency: Number of purchases by each customer.
  - Recency: Days since the last purchase.
- Merge Data: Combine the Monetary, Frequency, and Recency attributes into a single table with one row per customer.
- Outlier Analysis: Identify and manage outliers in the data.
- K-Means Clustering: Apply the K-Means algorithm.
- Elbow Curve: Determine the optimal number of clusters.
- Visualization: Use boxplots to visualize clusters.
- Data Cleaning: Follow the steps in the notebook to clean the data.
- Model Training: Train the K-Means model and tune parameters.
- Evaluation: Visualize and evaluate the clusters.
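
The data preparation steps listed above (handling missing values, creating the Monetary, Frequency, and Recency attributes, and merging them) could look roughly like the sketch below. It is not the notebook's code. The column names (`InvoiceNo`, `CustomerID`, `Quantity`, `UnitPrice`, `InvoiceDate`) follow the Online Retail dataset, the six rows are made up for illustration, and two choices are assumptions: Frequency is counted as distinct invoices, and Recency is measured against the latest date in the data.

```python
import pandas as pd

# Tiny made-up sample using Online Retail column names (illustrative, not real data).
df = pd.DataFrame({
    "InvoiceNo": ["A1", "A1", "A2", "B1", "C1", "C2"],
    "CustomerID": [1.0, 1.0, 1.0, 2.0, None, 3.0],
    "Quantity": [2, 1, 5, 10, 3, 4],
    "UnitPrice": [3.0, 10.0, 2.0, 1.5, 4.0, 2.5],
    "InvoiceDate": pd.to_datetime([
        "2011-01-05", "2011-01-05", "2011-03-01",
        "2011-02-10", "2011-02-11", "2011-03-10",
    ]),
})

# Handle missing values: drop rows that cannot be tied to a customer.
df = df.dropna(subset=["CustomerID"]).copy()
df["Amount"] = df["Quantity"] * df["UnitPrice"]

# Monetary: total amount spent by each customer.
monetary = df.groupby("CustomerID")["Amount"].sum().rename("Monetary")

# Frequency: number of distinct invoices per customer (an assumption;
# counting invoice lines instead is another reasonable definition).
frequency = df.groupby("CustomerID")["InvoiceNo"].nunique().rename("Frequency")

# Recency: days between a customer's last purchase and the latest date in the data.
last_purchase = df.groupby("CustomerID")["InvoiceDate"].max()
recency = (df["InvoiceDate"].max() - last_purchase).dt.days.rename("Recency")

# Merge: one row per customer, indexed by CustomerID.
rfm = pd.concat([monetary, frequency, recency], axis=1)
print(rfm)
```

On the real dataset, `df` would instead be loaded from the downloaded file, and the same grouping logic would apply unchanged.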
This project helps you understand customer segmentation using K-Means clustering, providing insights into customer behavior to enhance marketing strategies.
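
As a rough illustration of the modelling steps (outlier handling, K-Means, and the elbow curve), the sketch below runs them on a synthetic RFM table. Everything specific here is an assumption rather than the notebook's actual setup: the data is randomly generated, outliers are removed with the common 1.5 × IQR rule, features are standardized with scikit-learn's `StandardScaler`, and the final `n_clusters=3` is chosen only because the synthetic data was built with three groups.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic RFM table (made up): three customer groups plus one extreme outlier.
rng = np.random.default_rng(0)
centers = np.array([[200.0, 3.0, 300.0], [1500.0, 10.0, 120.0], [5000.0, 40.0, 30.0]])
spread = np.array([50.0, 0.5, 10.0])
groups = [rng.normal(c, spread, size=(50, 3)) for c in centers]
outlier = np.array([[100000.0, 500.0, 0.0]])
rfm = pd.DataFrame(np.vstack(groups + [outlier]),
                   columns=["Monetary", "Frequency", "Recency"])

# Outlier analysis: keep rows within 1.5 * IQR of the quartiles on every column.
q1, q3 = rfm.quantile(0.25), rfm.quantile(0.75)
iqr = q3 - q1
within = ((rfm >= q1 - 1.5 * iqr) & (rfm <= q3 + 1.5 * iqr)).all(axis=1)
rfm_clean = rfm[within]

# K-Means is distance based, so put all three attributes on a comparable scale.
scaled = StandardScaler().fit_transform(rfm_clean)

# Elbow curve: record the within-cluster sum of squares (inertia) for each k.
inertias = {}
for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(scaled)
    inertias[k] = model.inertia_

# Fit the final model and attach a cluster label to each customer.
final = KMeans(n_clusters=3, n_init=10, random_state=42).fit(scaled)
rfm_clean = rfm_clean.assign(Cluster=final.labels_)
print(rfm_clean.groupby("Cluster").mean())
```

Plotting `inertias` against `k` gives the elbow curve; the point where the drop in inertia flattens suggests a reasonable number of clusters. With matplotlib installed, a call such as `rfm_clean.boxplot(column="Monetary", by="Cluster")` is one way to compare the resulting segments.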