This project analyzes Walmart sales data with the Python libraries pandas, numpy, matplotlib, and seaborn. The notebook generates visualizations and summary statistics that describe sales trends, customer behavior, and other key metrics. Below is a brief overview of the project's content, dependencies, and usage.
The project answers various analytical questions related to Walmart's sales data, such as:
- Identifying the most common product line, payment methods, and customer types.
- Understanding customer purchasing behavior based on gender, time, and location.
- Analyzing revenue by product line, city, and month.
- Visualizing sales trends over time and per weekday.
To run this project, the following Python libraries are required:
- numpy
- pandas
- matplotlib
- seaborn
Install the dependencies using:
pip install numpy pandas matplotlib seaborn
- The dataset used in this project is loaded from a CSV file, WalmartSalesData.csv. Make sure the file path is correct before running the notebook.
- The Date and Time columns are converted into datetime format, and additional columns such as weekday and hour are created to support the analysis.
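The loading and preprocessing step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper name `load_sales`, the `Total` column, the ISO date format, and the `%H:%M:%S` time format are all assumptions made for the example, and a small in-memory CSV stands in for WalmartSalesData.csv.

```python
import io

import pandas as pd


def load_sales(source):
    """Read the sales CSV and add the derived weekday and hour columns.

    `source` may be a file path (e.g. "WalmartSalesData.csv") or a file-like object.
    """
    df = pd.read_csv(source)
    # Convert the text columns to datetime values.
    df["Date"] = pd.to_datetime(df["Date"])
    # Assumed time format; adjust if the real file stores times differently.
    df["Time"] = pd.to_datetime(df["Time"], format="%H:%M:%S")
    # Derived columns used by the later analyses.
    df["weekday"] = df["Date"].dt.day_name()
    df["hour"] = df["Time"].dt.hour
    return df


# Hypothetical two-row sample standing in for the real dataset.
sample_csv = io.StringIO(
    "Date,Time,Total\n"
    "2019-01-05,13:08:00,548.97\n"
    "2019-03-08,10:29:00,80.22\n"
)
sample = load_sales(sample_csv)
print(sample[["Date", "weekday", "hour"]])
```

With the real file, the call would be `load_sales("WalmartSalesData.csv")`, assuming the file sits next to the notebook.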
- Unique Product Lines: Displays the number of unique product lines.
- Most Common Payment Method: Identifies the most frequent payment method used by customers.
- Best-Selling Product Line: Highlights which product line sells the most items.
- Total Revenue by Month: Shows the total revenue generated per month.
- Largest COGS (Cost of Goods Sold): Identifies which month had the highest COGS.
- Product Line with Largest Revenue: Analyzes which product line generates the most revenue.
- City with Largest Revenue: Determines which city contributes the most to revenue.
- Product Line with Largest VAT: Finds which product line generates the most VAT.
- Product Line Performance: Adds a "Good" or "Bad" label to each product line based on average sales.
- Branch Performance: Compares branches based on average product sales.
- Most Common Product Line by Gender: Identifies the most popular product line for each gender.
- Average Rating per Product Line: Displays the average customer rating for each product line.
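As an illustration of the product line performance label and the average rating per product line, here is a small sketch on made-up data. The column names `Product line`, `Quantity`, and `Rating`, and the rule "Good if a product line's average quantity is above the overall average", are assumptions; the notebook may define the threshold differently.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from WalmartSalesData.csv.
df = pd.DataFrame({
    "Product line": ["Food", "Food", "Sports", "Sports", "Fashion"],
    "Quantity": [10, 8, 2, 4, 6],
    "Rating": [9.0, 7.0, 6.0, 8.0, 5.0],
})

# Average quantity sold per product line.
avg_quantity = df.groupby("Product line")["Quantity"].mean()

# Assumed rule: a line is "Good" when it sells more than the overall average.
overall_avg = df["Quantity"].mean()
performance = avg_quantity.gt(overall_avg).map({True: "Good", False: "Bad"})

# Average customer rating per product line.
avg_rating = df.groupby("Product line")["Rating"].mean()

print(performance)
print(avg_rating)
```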
- Sales by Time of Day per Weekday: Visualizes the number of sales made at different times of the day for each weekday using a heatmap.
- Revenue by Customer Type: Analyzes which customer type brings the most revenue.
- City with Largest VAT: Shows which city has the highest average VAT.
- VAT Paid by Customer Type: Examines which customer type pays the most VAT.
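The revenue and VAT comparisons above are all group-and-aggregate operations. A minimal sketch, assuming columns named `Customer type`, `City`, `Total`, and `VAT` (the real column names and city values may differ):

```python
import pandas as pd

# Hypothetical sample rows with placeholder city names.
df = pd.DataFrame({
    "Customer type": ["Member", "Normal", "Member", "Normal"],
    "City": ["City A", "City A", "City B", "City B"],
    "Total": [105.0, 210.0, 315.0, 52.5],
    "VAT": [5.0, 10.0, 15.0, 2.5],
})

# Total revenue per customer type, largest first.
revenue_by_type = (
    df.groupby("Customer type")["Total"].sum().sort_values(ascending=False)
)

# Average VAT per city, and the city where that average is highest.
avg_vat_by_city = df.groupby("City")["VAT"].mean()
top_vat_city = avg_vat_by_city.idxmax()

print(revenue_by_type)
print(top_vat_city)
```

The same pattern, with a different grouping column, would cover revenue by month or VAT by customer type.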
- Unique Customer Types: Counts the number of unique customer types.
- Unique Payment Methods: Counts the unique payment methods used in transactions.
- Most Common Customer Type: Identifies the most frequent customer type.
- Purchasing Behavior by Customer Type: Compares total purchases made by each customer type.
- Most Common Customer Gender: Determines the gender of the majority of customers.
- Gender Distribution per Branch: Visualizes the distribution of customer gender across different branches.
- Customer Ratings by Time: Shows at which time of day customers give the most ratings.
- Ratings by Time and Branch: Analyzes customer ratings across different times of the day and branches.
- Best Day for Ratings: Identifies which day of the week receives the best average ratings.
- Ratings per Branch by Day: Shows the best-rated day per branch.
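The two day-of-week rating questions can be answered with a grouped mean followed by `idxmax`. The sketch below uses made-up ratings and assumes columns named `Branch`, `weekday`, and `Rating`; it is one possible approach, not necessarily the one used in the notebook.

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "Branch": ["A", "A", "A", "B", "B", "B"],
    "weekday": ["Monday", "Monday", "Friday", "Monday", "Friday", "Friday"],
    "Rating": [8.0, 6.0, 9.0, 9.0, 7.0, 8.0],
})

# Best day overall: the weekday with the highest mean rating.
avg_by_day = df.groupby("weekday")["Rating"].mean()
best_day = avg_by_day.idxmax()

# Best day per branch: mean rating per (branch, weekday), reshaped so each
# branch is a row and each weekday a column, then the best column per row.
avg_by_branch_day = df.groupby(["Branch", "weekday"])["Rating"].mean()
best_day_per_branch = avg_by_branch_day.unstack("weekday").idxmax(axis=1)

print(best_day)
print(best_day_per_branch)
```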
- Sales Heatmap: A heatmap showing the number of sales per hour for each weekday.
- Revenue Bar Charts: Multiple bar charts display revenue breakdowns by customer type, product line, and city.
- Ratings Line Plots: Line plots showing average customer ratings based on time of day and branch.
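A sales heatmap like the one described above could be built along these lines. This is a sketch under assumptions: it expects the `weekday` and `hour` columns created during preprocessing, and the helper names `sales_counts` and `plot_sales_heatmap` are hypothetical. The plotting libraries are imported inside the plotting function so the counting step can be used on its own.

```python
import pandas as pd

WEEKDAYS = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def sales_counts(df):
    """Count sales per weekday (rows) and hour (columns)."""
    counts = pd.crosstab(df["weekday"], df["hour"])
    # Put weekdays in calendar order; days with no sales become rows of zeros.
    return counts.reindex(WEEKDAYS, fill_value=0)


def plot_sales_heatmap(counts):
    """Draw the weekday-by-hour counts as an annotated heatmap."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(10, 4))
    sns.heatmap(counts, annot=True, fmt="d", cmap="Blues", ax=ax)
    ax.set_xlabel("Hour of day")
    ax.set_ylabel("Weekday")
    ax.set_title("Number of sales per hour and weekday")
    return fig


# Hypothetical three-row sample.
sample = pd.DataFrame({
    "weekday": ["Monday", "Monday", "Tuesday"],
    "hour": [10, 10, 13],
})
counts = sales_counts(sample)
print(counts)
```

Calling `plot_sales_heatmap(counts)` in a notebook cell would then render the figure.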
To run the notebook and view the results:
- Make sure all dependencies are installed.
- Place the CSV file in the appropriate directory.
- Run the cells in the notebook to generate results and visualizations.
This project summarizes Walmart sales trends, customer behavior, and product performance through data analysis and visualization. The results can support decision-making and help optimize business operations.