
anaya33/transaction-pattern-analysis


Transaction Pattern Analysis

A Data-Driven Analysis of Temporal Sales Dynamics and Demand Peaks

Python Jupyter scikit-learn License: MIT Contributions Welcome

View Notebook · Dataset · Results · Contributing


Overview

This project investigates temporal purchasing behavior within a retail bakery environment using transactional sales data. The objective is to identify patterns in customer demand across time and evaluate how these patterns can inform operational decision-making.

The analysis provides insight into how temporal dynamics influence retail performance and demonstrates how data-driven approaches can support optimization of:

  • Inventory management
  • Staffing allocation
  • Production scheduling

Features

| Category | Techniques |
| --- | --- |
| Data Preprocessing | Cleaning, type conversion, missing value handling |
| Feature Engineering | Time-based feature extraction (hourly segmentation) |
| Exploratory Analysis | Distribution analysis, correlation matrices, trend visualization |
| Statistical Testing | Relationship analysis between price, quantity, and time |
| Clustering | K-Means (manual implementation + scikit-learn) |
| Classification | K-NN (manual implementation + scikit-learn) |
| Dimensionality Reduction | Principal Component Analysis (PCA) |
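
The feature list mentions a manual K-NN implementation alongside the scikit-learn one. As a rough illustration of what a hand-written version can look like, here is a minimal NumPy sketch. The function name `knn_predict`, the two features (quantity, unit price), and the "small"/"large" labels are all hypothetical and are not taken from the notebook.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, X_query, k=3):
    """Classify each query row by majority vote among its k nearest training rows."""
    predictions = []
    for q in X_query:
        # Euclidean distance from the query point to every training point
        dists = np.linalg.norm(X_train - q, axis=1)
        # Indices of the k closest training points
        nearest = np.argsort(dists)[:k]
        # Majority vote over the neighbours' labels
        votes = Counter(y_train[i] for i in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


# Made-up training data: columns are (quantity, unit price in euros)
X_train = np.array([[1, 0.9], [2, 1.0], [1, 1.2], [8, 4.5], [9, 5.0], [10, 4.8]])
y_train = ["small", "small", "small", "large", "large", "large"]

X_query = np.array([[2, 1.1], [9, 4.9]])
predictions = knn_predict(X_train, y_train, X_query, k=3)
```

Because K-NN relies on raw distances, features on very different scales would normally be standardized first; that step is omitted here to keep the sketch short.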

Project Structure

transaction-pattern-analysis/
├── transaction_pattern_analysis.ipynb  # Main analysis notebook
├── Bakery sales.csv                    # Raw dataset (234,005 transactions)
├── Bakery_Sales1.json.zip              # JSON export
└── README.md                           # Project documentation

Dataset

The dataset contains 234,005 transactional records from a French retail bakery spanning from January 2021 to September 2022.

| Field | Description |
| --- | --- |
| date | Transaction date |
| time | Transaction time (HH:MM) |
| ticket_number | Unique transaction identifier |
| article | Product name |
| Quantity | Number of items purchased |
| unit_price | Price per unit (€) |

Source: Kaggle - French Bakery Daily Sales

Note: Data cleaning is performed directly within the notebook to ensure transparency and reproducibility.
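
To give a sense of what the cleaning step might involve, here is a minimal pandas sketch. It assumes that `unit_price` can arrive as a string with a comma decimal separator and a euro sign (for example `"0,90 €"`); the helper name `clean_sales`, the derived `timestamp` column, and the sample rows are illustrative and do not come from the notebook.

```python
import pandas as pd


def clean_sales(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy with numeric prices/quantities and a combined timestamp."""
    out = df.copy()

    # Assumption: prices may be strings such as "0,90 €"; strip the symbol
    # and switch to a dot decimal separator before converting.
    price_text = (
        out["unit_price"]
        .astype(str)
        .str.replace("€", "", regex=False)
        .str.replace(",", ".", regex=False)
        .str.strip()
    )
    out["unit_price"] = pd.to_numeric(price_text, errors="coerce")
    out["Quantity"] = pd.to_numeric(out["Quantity"], errors="coerce")

    # Combine the separate date and time fields into one datetime column
    out["timestamp"] = pd.to_datetime(
        out["date"].astype(str) + " " + out["time"].astype(str), errors="coerce"
    )

    # Drop rows where any of the key fields could not be parsed
    out = out.dropna(subset=["unit_price", "Quantity", "timestamp"])
    return out.reset_index(drop=True)


# Made-up rows using the field names from the table above
sample = pd.DataFrame(
    {
        "date": ["2021-01-02", "2021-01-02", "2021-01-02"],
        "time": ["08:38", "08:38", "09:14"],
        "ticket_number": [150040, 150040, 150041],
        "article": ["BAGUETTE", "PAIN AU CHOCOLAT", "PAIN"],
        "Quantity": [1, 3, 2],
        "unit_price": ["0,90 €", "1,20 €", None],
    }
)
cleaned = clean_sales(sample)
```

With the real file, the same function could be applied to the result of `pd.read_csv("Bakery sales.csv")`, provided the column names match the table above.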


Results

Sales Distribution by Hour

The analysis reveals clear peaks in demand at specific times of day, which can inform targeted staffing and inventory decisions.

Peak Hours: Morning rush (8-10 AM) and lunchtime (12-2 PM)
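
One way to arrive at an hourly view like this is to extract the hour from the `time` field and sum the quantities per hour. The sketch below assumes `time` is an `HH:MM` string as described in the dataset table; the helper name `sales_by_hour` and the sample rows are made up for illustration.

```python
import pandas as pd


def sales_by_hour(df: pd.DataFrame) -> pd.Series:
    """Total quantity sold per hour of day, indexed by hour (0-23)."""
    # Parse "HH:MM" strings and keep only the hour component
    hours = pd.to_datetime(df["time"], format="%H:%M").dt.hour
    return df.groupby(hours)["Quantity"].sum().rename_axis("hour")


# Made-up transactions, not real figures from the dataset
sample = pd.DataFrame(
    {
        "time": ["08:05", "08:40", "09:15", "12:30", "16:10"],
        "Quantity": [2, 3, 4, 1, 1],
    }
)

hourly = sales_by_hour(sample)
peak_hour = int(hourly.idxmax())  # hour with the highest total quantity
```

Plotting `hourly` as a bar chart (for example with `hourly.plot(kind="bar")`) gives a quick visual check of where the peaks fall.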

Clustering Analysis

K-Means clustering segments transactions into distinct behavioral groups based on quantity and pricing patterns.
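
For readers curious about the manual K-Means variant, the following is a minimal NumPy sketch of the standard assign-and-update loop (Lloyd's algorithm). It is not the notebook's code: the function name `kmeans`, the random initialization, and the sample points in (quantity, unit price) space are assumptions made for this example.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Cluster the rows of X into k groups; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct, randomly chosen data points
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)

    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n_points, k)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # Move each centroid to the mean of its points; keep it if the cluster is empty
        new_centroids = np.array(
            [
                X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                for j in range(k)
            ]
        )
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    # Final assignment against the final centroids
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1), centroids


# Made-up points: columns are (quantity, unit price in euros)
X = np.array(
    [[1, 0.9], [1, 1.0], [2, 1.1], [10, 5.0], [12, 5.5], [11, 4.8]], dtype=float
)
labels, centroids = kmeans(X, k=2)
```

On real data, quantity and price would usually be standardized before clustering so that neither feature dominates the distance.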

PCA Insights

Principal Component Analysis projects the transaction features onto a smaller number of components while retaining most of the variance, revealing underlying structure in the transaction data.
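
As a sketch of the idea, PCA can be computed from the singular value decomposition of the mean-centered feature matrix. The example below uses NumPy only; the function name `pca`, the choice of features (quantity, unit price, hour), and the sample values are assumptions, and the notebook may rely on scikit-learn's `PCA` class instead.

```python
import numpy as np


def pca(X, n_components):
    """Project X onto its top principal components.

    Returns (projected data, component vectors, explained variance ratios).
    """
    # Center each feature at zero mean
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]

    # Variance along each direction and its share of the total variance
    explained_variance = S**2 / (len(X) - 1)
    ratio = explained_variance / explained_variance.sum()

    return Xc @ components.T, components, ratio[:n_components]


# Made-up rows: columns are (quantity, unit price in euros, hour of day)
X = np.array(
    [
        [1, 0.9, 8],
        [3, 1.2, 8],
        [2, 1.0, 9],
        [1, 2.5, 12],
        [4, 0.9, 17],
        [2, 1.5, 10],
    ],
    dtype=float,
)
projected, components, ratio = pca(X, n_components=2)
```

The `ratio` values indicate how much of the total variance each retained component explains, which helps when deciding how many components to keep.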


Tech Stack

| Tool | Purpose |
| --- | --- |
| Python | Core programming language |
| Pandas | Data manipulation & analysis |
| NumPy | Numerical computing |
| Matplotlib | Data visualization |
| Seaborn | Statistical visualization |
| scikit-learn | Machine learning |

Getting Started

Prerequisites

  • Python 3.8+
  • Jupyter Notebook or Google Colab

Installation

  1. Clone the repository

    git clone https://github.com/anaya33/transaction-pattern-analysis.git
    cd transaction-pattern-analysis
  2. Install dependencies

    pip install pandas numpy matplotlib seaborn scikit-learn
  3. Launch the notebook

    jupyter notebook transaction_pattern_analysis.ipynb

Using Google Colab

  1. Open Google Colab
  2. Upload transaction_pattern_analysis.ipynb
  3. Upload Bakery sales.csv or connect via GitHub
  4. Run cells sequentially

Contributing

Contributions are welcome! Here's how you can help:

Ways to Contribute

  • Bug Reports — Found an issue? Open a detailed bug report
  • Feature Requests — Have ideas for new analyses? Share them!
  • New Visualizations — Add compelling charts or dashboards
  • Additional ML Models — Implement other clustering/classification algorithms
  • Documentation — Improve explanations or add tutorials
  • Code Optimization — Enhance performance or code quality

Contribution Guidelines

  1. Fork the repository
  2. Create a feature branch
    git checkout -b feature/your-feature-name
  3. Commit your changes with clear messages
    git commit -m "Add: description of your changes"
  4. Push to your branch
    git push origin feature/your-feature-name
  5. Open a Pull Request with a detailed description

Ideas for Contributors

| Difficulty | Task |
| --- | --- |
| Easy | Add more visualizations (box plots, violin plots) |
| Easy | Improve code comments and documentation |
| Medium | Implement DBSCAN or hierarchical clustering |
| Medium | Add time series forecasting (ARIMA, Prophet) |
| Medium | Create an interactive dashboard (Plotly/Dash) |
| Advanced | Build a recommendation system for products |
| Advanced | Deploy as a web application |

License

This project is licensed under the MIT License — see the LICENSE file for details.


Acknowledgments


Star this repo if you found it helpful!

View Live Project · Download JSON Data

About

This project explores patterns in transactional data, focusing on how variables like time, quantity, and pricing interact. The goal was to understand how to clean, structure, and analyze imperfect data while identifying patterns that could be useful for real-world systems.
