Skip to content

UserNamedMartin/ai-sentiment-analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Sentiment Analysis Tool

A Python application that automatically analyzes comments from CSV files, categorizes them using AI, and generates comprehensive sentiment reports.

Limitation Note: This project is a proof of concept (POC) and, in some steps, processes the entire comments dataset at once. Supplying very large datasets may lead to excessive token usage and reduced output quality.

Features

  • Automatic Category Detection: AI-powered analysis to identify relevant categories from your comments
  • Batch Processing: Efficiently processes large volumes of comments in batches
  • Detailed Reporting: Generates comprehensive markdown reports with statistics and insights
  • CSV Support: Works with any CSV file containing a "Comment" column
  • Real-time Progress: Shows processing progress as comments are analyzed

Project Structure

├── main.py                 # Main application entry point
├── requirements.txt        # Python dependencies
├── source/                # Core modules
│   ├── define_categories.py    # AI-powered category definition
│   ├── classify_comments.py   # Comment classification logic
│   ├── generate_report.py     # Report generation
│   └── output_formats.py      # Output formatting utilities
├── data/                  # Input CSV files directory
│   └── ECLIPSE_ RISING.csv    # Sample data file
├── prompts/              # AI prompt templates (if any)
└── Sentiment Report.md   # Generated output report

Prerequisites

  • Python 3.8 or higher
  • OpenAI API key (for AI-powered categorization)

Installation

  1. Clone or download this repository

    git clone <repository-url>
    cd <project-directory>
  2. Create a virtual environment (recommended)

    python -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Set up your OpenAI API key

    Create a .env file in the project root:

    echo "OPENAI_API_KEY=your_api_key_here" > .env

    Replace your_api_key_here with your actual OpenAI API key.

Usage

Quick Start

To run the application with the sample data:

python main.py

This will process the included data/ECLIPSE_ RISING.csv file and generate a Sentiment Report.md.

Using Your Own Data

  1. Prepare your CSV file

    • Ensure your CSV has a column named "Comment"
    • Place the file in the data/ directory
  2. Update the file path in main.py

    # Edit line 42 in main.py
    report = main("data/YOUR_FILE.csv")
  3. Run the application

    python main.py

CSV Format Requirements

Your CSV file must contain at least one column named "Comment". Example:

ID,Comment
1,"This is a positive comment"
2,"This is a negative comment"
3,"This is a neutral comment"

How It Works

  1. Data Loading: Reads comments from the specified CSV file
  2. Category Definition: AI analyzes a sample of comments to automatically identify relevant categories
  3. Comment Classification: Each comment is classified into one of the identified categories or marked as an outlier
  4. Report Generation: Creates a detailed markdown report with:
    • Category definitions and descriptions
    • Comment distribution across categories
    • Statistical analysis
    • Key insights and trends

Output

The application generates a Sentiment Report.md file containing:

  • Executive Summary: Overview of the analysis
  • Category Breakdown: Detailed statistics for each category
  • Key Insights: AI-generated insights about the comment patterns
  • Recommendations: Actionable recommendations based on the analysis

Customization

Modifying Categories

The categories are automatically generated, but you can influence them by:

  • Modifying the prompts in the source/define_categories.py file
  • Adjusting the sample size used for category detection

Changing Output Format

  • Modify source/generate_report.py to change the report structure
  • Edit source/output_formats.py to customize the markdown formatting

Batch Size

Comments are processed in batches of 10 by default. To change this:

# In main.py, line 21, change the batch size:
for comments_batch in [comments[i:i+BATCH_SIZE] for i in range(0, len(comments), BATCH_SIZE)]:

API Usage

The application uses OpenAI's API for:

  • Automatic category detection from comment samples
  • Individual comment classification
  • Report generation and insight creation

Make sure you have sufficient API credits for your dataset size.

Sample Data

The included ECLIPSE_ RISING.csv contains 200 sample comments about a fictional TV show adaptation, including:

  • Fan reactions and excitement
  • Casting suggestions and preferences
  • Concerns about adaptation quality
  • Spam and promotional content
  • Social media interactions

This provides a good example of mixed sentiment social media data for testing the tool.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages