This project analyzes public sentiment towards foldable iPhones using YouTube comments, and extracts user experiences, feature demands, and pain points from Reddit discussions about existing foldable phones. The goal is to provide insights that could guide product development decisions for future foldable devices.
We collected and cleaned reviews highly relevant to foldable phones and used this text data to fine-tune a small pretrained language model (MiniLM). By comparing how strongly feature words (camera, battery, screen, processor) associate with sentiment words (good, bad, etc.) before and after fine-tuning, we aim to surface the best and worst parts of the user experience and provide business decision support.
Foldable_Phone_Analysis/
│
├── data/ # Raw & processed datasets
│ ├── youtube/ # YouTube comments (cleaned and sentiment-labeled)
│ └── reddit/ # Reddit posts and comments (topic-focused)
│
├── models/ # Fine-tuned sentiment analysis models
│
├── outputs/ # Visualizations: confusion matrix, sentiment distribution, word clouds
│
├── scripts/ # Data preprocessing, training, inference scripts
│
├── notebooks/ # Jupyter notebooks (EDA, model evaluation)
│
├── requirements.txt # Python dependencies
└── README.md # Project description and instructions
Sentiment Analysis (BERTweet-based):
- Trained on the IMDb dataset and applied to YouTube comments about foldable iPhones.
- Predicts positive or negative sentiment per comment.
User Experience Extraction (Reddit):
- Analyzes real user feedback on existing foldable phones.
- Extracts common demands (e.g., durability, battery life) and pain points (e.g., screen crease, software issues).
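A minimal sketch of this extraction step using only the Python standard library. The keyword lists and example comments below are invented for illustration, not taken from the project's data:

```python
import re
from collections import Counter

# Hypothetical tracked terms; the real lists are derived from the Reddit data.
DEMANDS = {"durability", "battery", "waterproof", "lighter"}
PAIN_POINTS = {"crease", "hinge", "software", "fragile"}

def count_mentions(comments, vocab):
    """Count how often each tracked term appears across all comments."""
    counts = Counter()
    for text in comments:
        for token in re.findall(r"[a-z]+", text.lower()):
            if token in vocab:
                counts[token] += 1
    return counts

comments = [
    "The crease is visible but battery life is great",
    "Hinge feels fragile, software still buggy",
    "I just want better durability and battery",
]
print(count_mentions(comments, DEMANDS | PAIN_POINTS).most_common(3))
```

Ranking the resulting counts gives a first-pass view of which demands and pain points dominate the discussion.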
Visualizations:
- Confusion Matrix (for model evaluation)
- Sentiment Distribution (YouTube comments)
- Word Clouds (highlighting key terms from positive and negative comments)
Data Collection:
- Scraped YouTube comments on foldable iPhone videos.
- Collected Reddit discussions from relevant subreddits (e.g., r/FoldablePhones).
Data Preprocessing:
- Cleaned text (preserved emojis; removed URLs and @mentions).
- Saved cleaned datasets (data_preprocessing.py).
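The cleaning rules can be sketched with two regular expressions; the actual logic lives in data_preprocessing.py, and the function name here is hypothetical:

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")

def clean_comment(text: str) -> str:
    """Remove URLs and @mentions, collapse whitespace; emojis are untouched."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()

print(clean_comment("@apple check this 🔥 https://youtu.be/abc fold!"))
# → check this 🔥 fold!
```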
Model Training:
- Fine-tuned BERTweet sentiment classifier on IMDb movie reviews.
- Saved model and tokenizer (models/saved_models/bertweet_imdb/).
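A minimal outline of the training step with the Hugging Face Trainer API. The config values mirror the hyperparameters listed later in this README; evaluation and column handling are omitted, so this is a sketch rather than the project's exact script:

```python
# Hyperparameters mirror those listed in this README.
TRAIN_CONFIG = {
    "base_model": "vinai/bertweet-base",
    "epochs": 3,
    "batch_size": 32,
    "output_dir": "models/saved_models/bertweet_imdb",
}

def fine_tune(cfg=TRAIN_CONFIG):
    # Heavy imports are deferred so the module loads without downloads.
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained(cfg["base_model"])
    model = AutoModelForSequenceClassification.from_pretrained(
        cfg["base_model"], num_labels=2)

    imdb = load_dataset("imdb")
    encoded = imdb.map(
        lambda b: tokenizer(b["text"], truncation=True,
                            padding="max_length", max_length=128),
        batched=True)

    args = TrainingArguments(
        output_dir=cfg["output_dir"],
        num_train_epochs=cfg["epochs"],
        per_device_train_batch_size=cfg["batch_size"],
    )
    Trainer(model=model, args=args, train_dataset=encoded["train"]).train()
    model.save_pretrained(cfg["output_dir"])
    tokenizer.save_pretrained(cfg["output_dir"])
```

Calling `fine_tune()` downloads the base model and the IMDb dataset, so it is kept behind a function rather than run at import time.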
Inference:
- Applied sentiment model to YouTube comments.
- Generated labeled dataset (youtube_foldable_apple_comments_with_sentiment.csv).
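Applying the saved classifier can be sketched as follows. The column names and the LABEL_0/LABEL_1 mapping are assumptions about the fine-tuned model's output, and the heavy transformers import is deferred so the pure helper stays testable:

```python
import csv

# Assumed mapping from the fine-tuned model's raw labels to readable ones.
LABEL_MAP = {"LABEL_0": "negative", "LABEL_1": "positive"}

def attach_sentiment(row, pred):
    """Merge one pipeline prediction into a CSV row (pure, easy to test)."""
    out = dict(row)
    out["sentiment"] = LABEL_MAP.get(pred["label"], pred["label"])
    out["score"] = round(pred["score"], 4)
    return out

def label_comments(in_csv, out_csv, text_column="comment"):
    from transformers import pipeline  # deferred: loads the saved model
    clf = pipeline("sentiment-analysis",
                   model="models/saved_models/bertweet_imdb")
    with open(in_csv, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    labeled = [attach_sentiment(r, p)
               for r, p in zip(rows, clf([r[text_column] for r in rows]))]
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=labeled[0].keys())
        writer.writeheader()
        writer.writerows(labeled)
```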
Semantic Transformation Analysis:
- Fine-tuned the MiniLM language model on Reddit comments.
- Computed the similarity between feature words and sentiment words before and after fine-tuning.
- Computed the incremental similarity (Δ) and ranked features by the change in sentiment association.
- Produced comparison tables and visualizations.
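The Δ computation reduces to cosine similarities between word embeddings before and after fine-tuning. The sketch below uses invented 2-D vectors; the real vectors are 384-dimensional MiniLM embeddings of the feature and sentiment words:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def delta_ranking(before, after, sentiment_word="good"):
    """Rank features by the change in similarity to a sentiment word."""
    s_before, s_after = before[sentiment_word], after[sentiment_word]
    deltas = {
        feat: cosine(after[feat], s_after) - cosine(before[feat], s_before)
        for feat in before if feat != sentiment_word
    }
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)

# Toy "embeddings", invented for the sketch.
before = {"good": [1.0, 0.0], "battery": [0.0, 1.0], "screen": [1.0, 1.0]}
after = {"good": [1.0, 0.0], "battery": [1.0, 0.2], "screen": [0.0, 1.0]}
print(delta_ranking(before, after))
```

A feature whose Δ toward "good" is large and positive moved closer to positive sentiment after exposure to the Reddit data; a large negative Δ marks a likely pain point.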
Visualization & Analysis:
- Created confusion matrix, sentiment distribution charts, and word clouds.
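The charts themselves come from standard plotting libraries; the confusion-matrix counts behind them can be sketched in plain Python:

```python
def confusion_matrix(y_true, y_pred, labels=("negative", "positive")):
    """Rows are true labels, columns are predicted labels."""
    idx = {lab: i for i, lab in enumerate(labels)}
    m = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        m[idx[t]][idx[p]] += 1
    return m

# Invented example labels for illustration.
y_true = ["positive", "positive", "negative", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "positive", "positive"]
print(confusion_matrix(y_true, y_pred))
# → [[1, 1], [1, 2]]
```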
View outputs:
- Charts & visualizations in outputs/
- Labeled YouTube comments in data/youtube/youtube_foldable_apple_comments_with_sentiment.csv
- Sentiment distribution
- Top positive and negative keywords
- Semantic shift ranking: see visualizations in the outputs/ folder
- Base model: vinai/bertweet-base
- Training dataset: IMDb movie reviews (binary sentiment)
- Fine-tuning epochs: 3
- Batch size: 32
- Semantic transformation model: MiniLM-L6-H384-uncased
- Fine-tuning dataset: Reddit reviews on foldable phones
- Task: Masked Language Modeling (MLM)
- Custom script for incremental similarity analysis
This project is for educational and research purposes.
- Hugging Face Transformers
- IMDb Dataset
- Reddit and YouTube for user-generated content
