🍽️ Restaurant Review Sentiment Analysis System

A machine learning system that analyzes restaurant reviews to extract sentiment, identify service quality issues, and generate actionable business insights. It is trained on real customer feedback to help restaurants improve service quality and customer satisfaction.

📊 Dataset

This project was tested on the Restaurant Reviews Dataset from Kaggle:

🔗 https://www.kaggle.com/datasets/joebeachcapital/restaurant-reviews

⚠️ Note: Download the dataset and save it as Restaurant_reviews.csv in the project root directory before running analysis.
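A quick sanity check after downloading can catch a misnamed or truncated file early. The sketch below uses only the standard library. It assumes the columns Review, Rating and Restaurant, which are the names produced by the SQL aliases in the database workflow; the actual Kaggle file may use different headers, so adjust REQUIRED_COLUMNS if needed. The check_csv helper is hypothetical and not part of the repository.

```python
import csv
import io

# Assumed column names; adjust to match the headers in your copy of the CSV.
REQUIRED_COLUMNS = {"Review", "Rating", "Restaurant"}


def check_csv(handle):
    """Return (row_count, missing_columns) for an open CSV file handle."""
    reader = csv.DictReader(handle)
    header = set(reader.fieldnames or [])
    missing = REQUIRED_COLUMNS - header
    row_count = sum(1 for _ in reader)
    return row_count, missing


# Tiny in-memory sample standing in for Restaurant_reviews.csv.
sample = io.StringIO(
    "Restaurant,Review,Rating\n"
    "Cafe A,Great food and quick service,5\n"
    "Cafe B,Waited an hour for cold soup,1\n"
)
rows, missing = check_csv(sample)
print(rows, sorted(missing))
```

For the real file, open it with open("Restaurant_reviews.csv", newline="", encoding="utf-8") and pass the handle to check_csv.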

✨ Features

✅ Multi-model sentiment analysis (Random Forest, Naive Bayes, SVM)
✅ Automated data cleaning for noisy real-world reviews
✅ Business intelligence engine identifying key service issues:
- Slow service patterns
- Staff behavior problems
- Food quality complaints
- Ambience issues
✅ ROI analysis with investment payback calculations
✅ Interactive HTML report with visualizations and recommendations
✅ Database integration (MySQL/MariaDB) for production deployment
✅ Word frequency analysis to understand customer language patterns
✅ Feature importance visualization for model interpretability
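To make the service-issue categories above concrete, here is a minimal keyword-matching sketch. The keyword lists and the tag_issues function are hypothetical illustrations; the project itself may detect these issues differently, for example with a trained classifier.

```python
import re

# Hypothetical keyword lists; the real project may use other terms or a model.
ISSUE_KEYWORDS = {
    "slow_service": ["slow", "wait", "waited", "late", "delay"],
    "staff_behavior": ["rude", "unfriendly", "ignored", "impolite"],
    "food_quality": ["cold", "stale", "bland", "undercooked", "tasteless"],
    "ambience": ["noisy", "dirty", "crowded", "smelly"],
}


def tag_issues(review):
    """Return the sorted issue categories whose keywords appear in the review."""
    # Whole-word matching, so "waiter" does not trigger the "wait" keyword.
    words = set(re.findall(r"[a-z']+", review.lower()))
    return sorted(
        issue
        for issue, keywords in ISSUE_KEYWORDS.items()
        if words & set(keywords)
    )


print(tag_issues("We waited 40 minutes and the waiter was rude."))
print(tag_issues("Lovely evening, great pasta."))
```

Counting how often each tag appears across all negative reviews gives a rough ranking of which issue to address first.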

🛠️ Technologies Used

Category	Technologies
Core ML	scikit-learn, pandas, numpy
NLP	NLTK (stopwords, Porter stemmer), CountVectorizer, TF-IDF
Visualization	matplotlib, seaborn, HTML/CSS
Database	SQLAlchemy, mysql-connector-python
Deployment	Pickle (joblib) for model serialization

📁 Project Structure

project/
├── Restaurant_reviews.csv          # Source dataset (download from Kaggle)
├── ai_model.py                     # Main analysis script
├── analysis_report.html            # Generated HTML report
├── check_data.py                   # Database data validation
├── clear_all_tables.py             # Full database cleanup
├── clear_table.py                  # Single table cleanup
├── import_data.py                  # CSV → MySQL importer
├── test_db_connection.py           # Database connectivity test
├── results/                        # Analysis outputs
│   ├── *.png                       # Visualizations
│   ├── model_*.pkl                 # Trained models
│   ├── vectorizer_*.pkl            # Text vectorizers
│   ├── models_comparison.csv       # Model performance metrics
│   └── business_recommendations.txt # Actionable insights
└── requirements.txt                # Python dependencies

⚙️ Installation & Setup

1. Database Setup (MariaDB/MySQL)

# Install MariaDB server
sudo apt update && sudo apt install mariadb-server

# Secure installation
sudo mysql_secure_installation

# Open a MariaDB shell as root
sudo mariadb -u root -p

# Inside the MariaDB/MySQL shell, create the database and user:
CREATE DATABASE restaurant_reviews CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE USER 'project_user'@'localhost' IDENTIFIED BY 'your_password';
GRANT ALL PRIVILEGES ON restaurant_reviews.* TO 'project_user'@'localhost';
FLUSH PRIVILEGES;
EXIT;

🔒 Security Note: All database connection files in this repository use your_password as a placeholder. Before running, replace it with your actual password in:

ai_model.py
import_data.py
check_data.py
test_db_connection.py
clear_table.py
clear_all_tables.py

Best practice: Use environment variables instead:

import os
db_password = os.environ.get('DB_PASSWORD', 'your_password')
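The snippet above silently falls back to the placeholder when DB_PASSWORD is unset. A stricter variant, sketched below, fails loudly instead and URL-escapes the password before placing it in a connection string. The build_db_url helper is hypothetical and not part of the repository; the URL layout mirrors the one used in the database workflow.

```python
import os
from urllib.parse import quote_plus


def build_db_url(user="project_user", host="localhost", port=3306,
                 database="restaurant_reviews"):
    """Assemble a SQLAlchemy-style URL, reading the password from DB_PASSWORD."""
    password = os.environ.get("DB_PASSWORD")
    if not password:
        raise RuntimeError("Set the DB_PASSWORD environment variable first")
    # quote_plus escapes characters such as '@' or '/' that would break the URL.
    return (
        f"mysql+mysqlconnector://{user}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )
```

Set the variable in the shell first (for example, export DB_PASSWORD=... on Linux or macOS) so the password never appears in source files.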

2. Python Environment

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

▶️ Usage

Option 1: Analyze CSV file directly

# Basic analysis (CSV output)
python ai_model.py Restaurant_reviews.csv

# With Excel export
python ai_model.py Restaurant_reviews.csv excel

Option 2: Database workflow (recommended for production)

# 1. Import data into MySQL
python import_data.py

# 2. Verify data loaded correctly
python check_data.py

# 3. Run full analysis from database
python ai_model.py mysql \
  "mysql+mysqlconnector://project_user:your_password@localhost:3306/restaurant_reviews" \
  "SELECT review_text AS Review, rating AS Rating, restaurant_name AS Restaurant FROM restaurant_reviews"
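The AS aliases in the query rename the database columns to Review, Rating and Restaurant, presumably so that the database path yields the same column names as the CSV path. The sketch below demonstrates the aliasing with an in-memory SQLite database standing in for MySQL; the table schema and the sample row are assumptions inferred from the query, not taken from import_data.py.

```python
import sqlite3

QUERY = (
    "SELECT review_text AS Review, rating AS Rating, "
    "restaurant_name AS Restaurant FROM restaurant_reviews"
)

# In-memory SQLite stands in for MySQL here; the schema is an assumption
# inferred from the column names in the query above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE restaurant_reviews ("
    "restaurant_name TEXT, review_text TEXT, rating REAL)"
)
conn.execute(
    "INSERT INTO restaurant_reviews VALUES (?, ?, ?)",
    ("Cafe A", "Great food", 5.0),
)

cursor = conn.execute(QUERY)
columns = [col[0] for col in cursor.description]  # alias names, not raw names
rows = cursor.fetchall()
conn.close()

print(columns)
print(rows)
```

If your table uses different column names, only the left-hand side of each alias needs to change.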

Option 3: Database maintenance scripts

# Test database connectivity
python test_db_connection.py

# Clear a single table (e.g., model_metrics)
python clear_table.py

# Full database cleanup (all tables)
python clear_all_tables.py

📈 Sample Output

After running the analysis, you'll get:

analysis_report.html — Interactive dashboard featuring:

- Customer rating distribution
- Review length analysis
- Top-10 restaurants by review volume
- Word clouds for positive/negative sentiment
- Model comparison (accuracy, F1-score, ROC-AUC)
- Confusion matrices and ROC curves
- Business recommendations with ROI calculations
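The model comparison in the report can be reproduced in miniature with scikit-learn. This is a sketch under assumptions, not the code in ai_model.py: the toy reviews are invented, TF-IDF features and LinearSVC are illustrative choices for the vectorizer and the SVM variant, and ROC-AUC is left out to keep the example short.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented toy reviews; 1 = positive, 0 = negative.
train_texts = [
    "great food and friendly staff",
    "loved the ambience and the dessert",
    "excellent service, will come again",
    "tasty biryani and quick delivery",
    "slow service and cold food",
    "rude waiter and dirty tables",
    "terrible taste, never again",
    "waited an hour, food was stale",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]
test_texts = ["friendly staff and great dessert", "cold food and rude waiter"]
test_labels = [1, 0]

models = {
    "RandomForest": RandomForestClassifier(n_estimators=50, random_state=0),
    "NaiveBayes": MultinomialNB(),
    "SVM": LinearSVC(),
}

results = {}
for name, classifier in models.items():
    # Each model gets its own vectorizer so the pipelines stay independent.
    pipeline = make_pipeline(TfidfVectorizer(), classifier)
    pipeline.fit(train_texts, train_labels)
    predictions = pipeline.predict(test_texts)
    results[name] = {
        "accuracy": accuracy_score(test_labels, predictions),
        "f1": f1_score(test_labels, predictions, zero_division=0),
    }

for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.2f}, f1={scores['f1']:.2f}")
```

Scores on eight training sentences say nothing about real performance; the point is the shape of the comparison loop.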

results/ directory containing:

- Trained ML models (model_RandomForest.pkl, etc.)
- Classification reports
- Word frequency statistics
- Feature importance rankings
- Business insights in CSV/Excel format
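Word frequency statistics like those listed above boil down to cleaning each review and counting the remaining tokens. The sketch below uses only the standard library with a small hand-written stopword list; the project lists NLTK stopwords and a Porter stemmer, which this simplified version does not replicate. The clean and top_words helpers are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stopword list; the project itself lists NLTK stopwords.
STOPWORDS = {"the", "a", "an", "and", "was", "is", "to", "of", "for", "we",
             "it", "but"}


def clean(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOPWORDS]


def top_words(reviews, n=3):
    """Return the n most frequent cleaned tokens across all reviews."""
    counts = Counter()
    for review in reviews:
        counts.update(clean(review))
    return counts.most_common(n)


reviews = [
    "The service was slow and the food was cold.",
    "Slow service, but the dessert was great!",
    "Great food, great staff.",
]
print(top_words(reviews))
```

Running top_words separately on positive and negative reviews shows which words are characteristic of each group.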

💼 Business Value

This system transforms unstructured customer feedback into actionable business intelligence:

Insight Type	Business Impact
Service speed issues	40% reduction in wait times → higher satisfaction
Staff training needs	Targeted training → 25% fewer negative reviews
Food quality patterns	Kitchen process improvements → 0.5★ rating increase
Ambience optimization	Zone redesign → 15% longer guest stays
ROI analysis	8-month payback period on service improvements
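A payback period such as the one in the last row is typically a simple (undiscounted) ratio of upfront cost to monthly net gain. Whether ai_model.py computes it exactly this way is an assumption, and the figures below are hypothetical, chosen only to illustrate the arithmetic.

```python
def payback_months(investment, monthly_gain):
    """Months needed for cumulative gains to cover the upfront investment."""
    if monthly_gain <= 0:
        raise ValueError("monthly_gain must be positive")
    return investment / monthly_gain


# Hypothetical figures: 8000 spent on improvements, 1000 extra profit per month.
print(payback_months(investment=8000, monthly_gain=1000))
```

This ignores discounting and ramp-up time, so treat the result as a first estimate rather than a forecast.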

🔒 Security Recommendations

Never commit passwords to version control
Use .env files with python-dotenv:

from dotenv import load_dotenv
import os
load_dotenv()
password = os.getenv('DB_PASSWORD')

For production deployments, implement proper secrets management (HashiCorp Vault, AWS Secrets Manager, or environment variables)

🤝 Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create your feature branch (git checkout -b feature/AmazingFeature)
3. Commit your changes (git commit -m 'Add some AmazingFeature')
4. Push to the branch (git push origin feature/AmazingFeature)
5. Open a Pull Request

📜 License

Distributed under the MIT License. See LICENSE for details.

🙏 Acknowledgements

Dataset source: Kaggle Restaurant Reviews
NLTK Project for natural language processing resources
scikit-learn team for robust ML implementations

🔑 Critical Security Reminder

Before using this project:

Replace all your_password placeholders in Python files with your actual database password
Never commit credentials to Git repositories
For production deployments, implement proper secrets management (HashiCorp Vault, AWS Secrets Manager, or environment variables)
