A comprehensive, production-ready comparison of classical ML, deep learning, and transformer-based approaches for binary sentiment classification on the IMDB Movie Reviews dataset.
📊 Results • 🚀 Quick Start • 📁 Project Structure • 📓 Notebooks • 🌐 Demo
- Overview
- Dataset
- Models Compared
- Results
- Project Structure
- Quick Start
- Installation
- Usage
- Error Analysis
- Class Imbalance Handling
- Live Demo
- Contributing
- License
This project benchmarks three sentiment analysis approaches on the IMDB 50K Movie Reviews dataset:
| Approach | Type | Library |
|---|---|---|
| TF-IDF + Logistic Regression | Classical ML | Scikit-learn |
| LSTM | Deep Learning (RNN) | PyTorch |
| BERT (bert-base-uncased) | Transformer | 🤗 HuggingFace |
Key highlights:
- ✅ Full error analysis with misclassified sample inspection
- ✅ Class imbalance handled via class weights + SMOTE
- ✅ Confusion matrix, F1-score, ROC-AUC for every model
- ✅ Fully reproducible Jupyter Notebooks
- ✅ Interactive Gradio web demo
| Property | Value |
|---|---|
| Source | Kaggle / HuggingFace Datasets |
| Size | 50,000 reviews |
| Classes | Positive / Negative (Binary) |
| Balance | 25,000 positive + 25,000 negative |
| Split | 80% Train / 10% Val / 10% Test |
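The 80/10/10 stratified split above can be reproduced with two calls to scikit-learn's `train_test_split` — a sketch using a toy DataFrame in place of the real CSV (the `review`/`sentiment` column names are assumed from the Kaggle file):

```python
# Sketch: 80/10/10 stratified split, assuming "review"/"sentiment" columns.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "review": [f"review {i}" for i in range(100)],
    "sentiment": ["positive", "negative"] * 50,
})

# First carve off the 80% training portion, stratified on the label...
train_df, rest_df = train_test_split(
    df, test_size=0.2, stratify=df["sentiment"], random_state=42
)
# ...then split the remaining 20% evenly into validation and test.
val_df, test_df = train_test_split(
    rest_df, test_size=0.5, stratify=rest_df["sentiment"], random_state=42
)
print(len(train_df), len(val_df), len(test_df))  # 80 10 10
```

Stratifying both splits keeps the positive/negative ratio identical in all three subsets.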
Option A — Via Kaggle (Recommended)
- Go to https://www.kaggle.com/datasets/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
- Click Download
- Place `IMDB Dataset.csv` inside the `data/raw/` folder
Option B — Via Kaggle CLI (Automated)
```bash
pip install kaggle
# Place your kaggle.json API key in ~/.kaggle/
kaggle datasets download -d lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
unzip imdb-dataset-of-50k-movie-reviews.zip -d data/raw/
```

Option C — Via HuggingFace Datasets (No download needed)
```python
from datasets import load_dataset

dataset = load_dataset("imdb")
```

💡 The notebooks auto-detect which method to use — just run them!
- Feature extraction: TF-IDF with unigrams + bigrams (max 50,000 features)
- Model: Logistic Regression with L2 regularization
- Class imbalance: `class_weight='balanced'`
- Pros: Extremely fast, interpretable, strong baseline
- Cons: Loses word order and context
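The baseline above fits in a few lines of scikit-learn — a minimal sketch with the README's hyperparameters (unigrams + bigrams, 50,000 features, L2, balanced class weights) on an illustrative toy corpus:

```python
# Sketch of the TF-IDF + Logistic Regression baseline; the four-review
# corpus is illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

texts = [
    "a wonderful heartfelt film",
    "dull and painfully slow",
    "great acting and a wonderful story",
    "slow, dull, a waste of time",
]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), max_features=50_000)),
    ("lr", LogisticRegression(penalty="l2", class_weight="balanced", max_iter=1000)),
])
clf.fit(texts, labels)
print(clf.predict(["a wonderful story"]))
```

Bundling the vectorizer and classifier in a `Pipeline` ensures the same TF-IDF vocabulary is applied at inference time, which is also what makes pickling the pair straightforward.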
- Embedding: Pretrained GloVe 100d embeddings
- Architecture: Bidirectional LSTM (128 hidden units) → Dropout(0.5) → FC → Sigmoid
- Class imbalance: Weighted `BCELoss`
- Training: Adam optimizer, early stopping
- Pros: Captures sequential patterns
- Cons: Slower than TF-IDF, weaker than BERT on long text
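The architecture described above (embedding → bidirectional LSTM → dropout → linear → sigmoid) can be sketched in PyTorch as follows; dimensions match the README, while loading the pretrained GloVe vectors into the embedding layer is omitted:

```python
# Minimal Bi-LSTM classifier sketch; GloVe weight loading omitted.
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(2 * hidden_dim, 1)  # 2x: forward + backward states

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)            # (B, T, E)
        _, (h_n, _) = self.lstm(embedded)               # h_n: (2, B, H)
        h = torch.cat([h_n[0], h_n[1]], dim=1)          # concat both directions
        return torch.sigmoid(self.fc(self.dropout(h)))  # (B, 1) in [0, 1]

model = BiLSTMClassifier(vocab_size=10_000)
probs = model(torch.randint(1, 10_000, (4, 32)))  # batch of 4 sequences of 32 tokens
print(probs.shape)  # torch.Size([4, 1])
```

Because the head uses a sigmoid, it pairs with `BCELoss`; passing per-sample weights to the loss is how the class weighting mentioned above would be applied.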
- Model: `bert-base-uncased` from HuggingFace Transformers
- Fine-tuning: Last 4 transformer layers + classification head
- Class imbalance: Weighted cross-entropy loss
- Training: AdamW optimizer, linear warmup schedule, 3 epochs
- Pros: State-of-the-art contextual understanding
- Cons: Computationally expensive (GPU recommended)
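"Last 4 layers + head" fine-tuning amounts to toggling `requires_grad` by parameter name. A sketch of that logic, assuming HuggingFace's `bert.encoder.layer.<i>.` naming — demonstrated on a tiny stand-in module so it runs without downloading the model (in practice you would pass the real `AutoModelForSequenceClassification`):

```python
# Sketch: freeze all but the last 4 encoder layers + classification head.
# Parameter-name layout ("bert.encoder.layer.<i>.", "classifier.") is
# assumed to follow HuggingFace's bert-base-uncased.
import torch.nn as nn

def set_trainable(model: nn.Module, last_n: int = 4, total: int = 12):
    prefixes = tuple(
        f"bert.encoder.layer.{i}." for i in range(total - last_n, total)
    ) + ("classifier.",)
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(prefixes)

class StandIn(nn.Module):
    """Tiny stand-in mimicking the BERT parameter naming."""
    def __init__(self):
        super().__init__()
        self.bert = nn.Module()
        self.bert.encoder = nn.Module()
        self.bert.encoder.layer = nn.ModuleList(nn.Linear(4, 4) for _ in range(12))
        self.classifier = nn.Linear(4, 2)

model = StandIn()
set_trainable(model)
trainable_n = sum(p.requires_grad for p in model.parameters())
frozen_n = sum(not p.requires_grad for p in model.parameters())
print(trainable_n, frozen_n)  # 10 trainable (4 layers + head, weight+bias each), 16 frozen
```

Freezing the lower layers cuts memory and compute substantially while keeping most of BERT's quality on this task.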
| Model | Accuracy | Precision | Recall | F1-Score | ROC-AUC |
|---|---|---|---|---|---|
| TF-IDF + Log. Reg. | 89.4% | 89.2% | 89.4% | 89.3% | 0.964 |
| Bi-LSTM | 91.8% | 91.7% | 91.8% | 91.7% | 0.972 |
| BERT | 94.1% | 94.0% | 94.1% | 94.0% | 0.988 |
📌 Results are on the held-out test set. Full metrics and confusion matrices are in the notebooks.
See `notebooks/05_comparison_report.ipynb` for full visualizations.
```text
Sentiment_Analysis/
│
├── 📁 data/
│   ├── raw/                               # Raw IMDB dataset (.csv)
│   └── processed/                         # Cleaned, split datasets
│
├── 📁 notebooks/
│   ├── 01_EDA_preprocessing.ipynb         # Exploratory Data Analysis
│   ├── 02_tfidf_logistic_regression.ipynb # TF-IDF + LR model
│   ├── 03_lstm_model.ipynb                # LSTM model
│   ├── 04_bert_model.ipynb                # BERT fine-tuning
│   └── 05_comparison_report.ipynb         # Side-by-side comparison
│
├── 📁 src/
│   ├── preprocess.py                      # Text cleaning & preprocessing
│   ├── tfidf_model.py                     # TF-IDF + LR pipeline
│   ├── lstm_model.py                      # LSTM architecture
│   ├── bert_model.py                      # BERT fine-tuning code
│   ├── evaluate.py                        # Metrics, confusion matrix, error analysis
│   └── utils.py                           # Helper functions
│
├── 📁 models/
│   ├── tfidf_vectorizer.pkl               # Saved TF-IDF vectorizer
│   ├── lr_model.pkl                       # Saved Logistic Regression model
│   ├── lstm_model.pth                     # Saved LSTM weights
│   └── bert_finetuned/                    # Saved BERT model (HuggingFace format)
│
├── 📁 results/
│   ├── confusion_matrices/                # PNG outputs
│   ├── metrics_summary.csv                # All model metrics
│   └── error_analysis.csv                 # Misclassified samples
│
├── 📁 app/
│   └── demo.py                            # Gradio web demo
│
├── requirements.txt
├── environment.yml                        # Conda environment
├── setup.py
├── .gitignore
└── README.md
```
```bash
# 1. Clone the repo
git clone https://github.com/najahaja/Sentiment-Analysis.git
cd Sentiment-Analysis

# 2. Create conda environment
conda env create -f environment.yml
conda activate sentiment-env
# OR use pip
pip install -r requirements.txt

# 3. Download dataset (auto via HuggingFace — no Kaggle account needed)
python src/utils.py --download

# 4. Run all notebooks in order, OR run the full pipeline:
python src/train_all.py

# 5. Launch the demo
python app/demo.py
```

- Python 3.9+
- CUDA GPU (for BERT fine-tuning, optional but recommended)
- 8GB+ RAM
```bash
# Clone
git clone https://github.com/najahaja/Sentiment-Analysis.git
cd Sentiment-Analysis

# Option 1: Conda (recommended)
conda env create -f environment.yml
conda activate sentiment-env

# Option 2: pip virtual environment
python -m venv venv
venv\Scripts\activate      # Windows
source venv/bin/activate   # Mac/Linux
pip install -r requirements.txt
```

Run these notebooks in order for the full pipeline:
| # | Notebook | Description |
|---|---|---|
| 01 | `01_EDA_preprocessing.ipynb` | Load IMDB data, clean HTML/special chars, visualize class distribution, word clouds |
| 02 | `02_tfidf_logistic_regression.ipynb` | TF-IDF feature extraction, train LR, evaluate, confusion matrix |
| 03 | `03_lstm_model.ipynb` | Load GloVe, train Bi-LSTM, plot training curves, evaluate |
| 04 | `04_bert_model.ipynb` | Fine-tune bert-base-uncased, evaluate, save model |
| 05 | `05_comparison_report.ipynb` | Side-by-side metrics, error analysis, final conclusions |
The project includes a dedicated error analysis module in `src/evaluate.py` and `notebooks/05_comparison_report.ipynb`:
- False Positives: Reviews predicted as positive but actually negative
- False Negatives: Reviews predicted as negative but actually positive
- Confidence scores for misclassified samples
- Word importance via LIME for BERT predictions
- Common error patterns: Sarcasm, negation, domain-specific vocabulary
Example output:

```text
❌ Misclassified by BERT:
Text: "This film tries SO hard to be profound that it ends up being unintentionally hilarious."
True Label: Negative | Predicted: Positive | Confidence: 0.61
Pattern: Sarcasm / Mixed Sentiment
```
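The misclassified-sample table can be built from any model that exposes class probabilities — a sketch of the core step (the `error_table` helper and its column names are illustrative, not the actual `src/evaluate.py` API):

```python
# Sketch: collect misclassified reviews with the model's confidence.
import numpy as np
import pandas as pd

def error_table(texts, y_true, proba):
    """proba: (N, 2) array of [P(negative), P(positive)] per review."""
    y_pred = proba.argmax(axis=1)
    confidence = proba.max(axis=1)         # probability of the predicted class
    mask = y_pred != np.asarray(y_true)    # keep only the mistakes
    labels = np.array(["Negative", "Positive"])
    return pd.DataFrame({
        "text": np.asarray(texts)[mask],
        "true_label": labels[np.asarray(y_true)[mask]],
        "predicted": labels[y_pred[mask]],
        "confidence": confidence[mask].round(2),
    })

proba = np.array([[0.39, 0.61], [0.90, 0.10]])
errors = error_table(["so profound it's hilarious", "awful"], [0, 0], proba)
print(errors)  # one row: the sarcastic review, predicted Positive at 0.61
```

Sorting the resulting frame by confidence surfaces the most "confidently wrong" predictions first, which is where patterns like sarcasm and negation tend to cluster.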
Although IMDB is balanced (50/50), the project demonstrates techniques for imbalanced datasets:
| Technique | Applied To |
|---|---|
| `class_weight='balanced'` | Logistic Regression |
| Weighted `BCELoss` | LSTM |
| Weighted `CrossEntropyLoss` | BERT |
| SMOTE (oversampling demo) | TF-IDF features |
| Stratified train/val/test split | All models |
Launch the Gradio interactive demo locally:
```bash
python app/demo.py
```

Then open: http://localhost:7860
The demo lets you:
- Type any review text
- See predictions from all 3 models side-by-side
- View confidence scores and sentiment bars
```bash
# 1. Create account at huggingface.co/spaces
# 2. Create a new Space with Gradio SDK
# 3. Push your code
git remote add space https://huggingface.co/spaces/najahaja/Sentiment-Analysis
git push space main
```

Key packages (see requirements.txt for full list):
```text
transformers>=4.35.0
torch>=2.0.0
scikit-learn>=1.3.0
pandas>=2.0.0
numpy>=1.24.0
datasets>=2.14.0
gradio>=4.0.0
matplotlib>=3.7.0
seaborn>=0.12.0
nltk>=3.8.0
imbalanced-learn>=0.11.0
lime>=0.2.0.1
```
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch: `git checkout -b feature/add-roberta`
- Commit your changes: `git commit -m 'Add RoBERTa comparison'`
- Push to the branch: `git push origin feature/add-roberta`
- Open a Pull Request
© 2025 Ahamed Najah — All Rights Reserved.
This project is protected. You may view the code for learning purposes only. Redistribution, modification, or commercial use without explicit permission is prohibited. See the LICENSE file for full details.
Ahamed Najah
sentiment-analysis nlp bert lstm transformers huggingface scikit-learn machine-learning deep-learning python pytorch imdb text-classification