zenjahid/samsung-chatbot-api

Samsung Phone Information System

A system for looking up Samsung smartphone information. It combines a GSMArena web scraper, a conversational RAG chatbot, a multi-agent system, and a REST API.

Features

  • Web Scraping: Automated data collection from GSMArena for Samsung phone specifications
  • RAG Chatbot: Conversational AI using free open-source models with retrieval-augmented generation
  • Multi-Agent System: Specialized agents for phone specifications, comparisons, and review generation
  • REST API: FastAPI-based endpoints for system interaction
  • PostgreSQL Database: Structured storage for phone data and generated reviews
  • Vector Database: ChromaDB integration for semantic search and RAG capabilities

Technology Stack

  • Web Scraping: BeautifulSoup4, aiohttp, Selenium
  • Database: PostgreSQL with SQLAlchemy ORM
  • Chatbot: Hugging Face Transformers, Sentence Transformers
  • Multi-Agent: CrewAI framework
  • API: FastAPI with Pydantic models
  • Vector Store: ChromaDB
  • Models: Free open-source models (no OpenAI API required)

Installation

  1. Clone the repository:

    git clone https://github.com/zenjahid/samsung-chatbot-api.git
    cd samsung-chatbot-api
  2. Install dependencies:

    pip install -r requirements.txt
  3. Set up PostgreSQL database:

    # Install PostgreSQL (Ubuntu/Debian)
    sudo apt-get install postgresql postgresql-contrib
    
    # Create database
    sudo -u postgres createdb samsung_phones
    sudo -u postgres createuser --superuser $USER
  4. Configure environment:

    cp .env.example .env
    # Edit .env with your database credentials
  5. Initialize the system:

    python -c "from src.database.connection import create_tables; create_tables()"

Usage

1. Start the API Server

python main.py

The API will be available at http://localhost:8000

2. API Documentation

Visit http://localhost:8000/docs for interactive API documentation.

3. Scrape Samsung Phone Data

# Via API
curl -X POST "http://localhost:8000/scraper/run"

# Or directly
python src/scraper/gsmarena_scraper.py

4. Chat with the Bot

curl -X POST "http://localhost:8000/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "What are the camera specs of the Samsung Galaxy S23?"}'
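
The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the server is running at the default local address and that /chat replies with JSON; the response schema is not documented here, so the reply is returned as parsed JSON without further interpretation. The helper names `build_chat_request` and `send_chat` are hypothetical and not part of the project.

```python
import json
import urllib.request

# Assumption: the API server is running locally on the default port.
BASE_URL = "http://localhost:8000"


def build_chat_request(message: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /chat with the same JSON body as the curl call."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(message: str, timeout: float = 60.0):
    """Send the message and return the parsed JSON reply (schema assumed, not documented)."""
    with urllib.request.urlopen(build_chat_request(message), timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `send_chat("What are the camera specs of the Samsung Galaxy S23?")` sends the example query. Building the request separately from sending it makes the payload easy to inspect without a live server.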

5. Use Multi-Agent System

# Get specifications
curl -X POST "http://localhost:8000/agents/specifications?phone_name=Galaxy%20S23"

# Compare phones
curl -X POST "http://localhost:8000/agents/compare" \
  -H "Content-Type: application/json" \
  -d '{"phone_names": ["Galaxy S23", "Galaxy S22"]}'

# Generate review
curl -X POST "http://localhost:8000/agents/review" \
  -H "Content-Type: application/json" \
  -d '{"phone_name": "Galaxy S23 Ultra"}'
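
The specifications endpoint takes the phone name as a query parameter, so names containing spaces or symbols need to be percent-encoded before they go into the URL. A small sketch of one way to build that URL in Python follows; the helper name `specifications_url` and the base address are assumptions for illustration.

```python
from urllib.parse import quote, urlencode

# Assumption: the API server is running locally on the default port.
BASE_URL = "http://localhost:8000"


def specifications_url(phone_name: str, base_url: str = BASE_URL) -> str:
    """Return the /agents/specifications URL with phone_name percent-encoded."""
    # quote_via=quote encodes spaces as %20 instead of urlencode's default "+".
    query = urlencode({"phone_name": phone_name}, quote_via=quote)
    return f"{base_url}/agents/specifications?{query}"


print(specifications_url("Galaxy S23"))
# http://localhost:8000/agents/specifications?phone_name=Galaxy%20S23
```

The compare and review endpoints take their input as a JSON body instead, so no URL encoding is needed for those calls.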

API Endpoints

Chat Endpoints

  • POST /chat - Chat with RAG-enabled bot
  • GET /examples - Get example queries

Phone Information

  • GET /phones - List all phones
  • GET /phones/{phone_id} - Get phone by ID
  • GET /phones/search/{phone_name} - Search phones

Multi-Agent System

  • POST /agents/specifications - Get detailed specs
  • POST /agents/compare - Compare multiple phones
  • POST /agents/review - Generate comprehensive review

Data Management

  • POST /scraper/run - Run web scraper
  • GET /stats - System statistics
  • GET /health - Health check

Project Structure

src/
├── config.py                 # Configuration settings
├── api/
│   └── main.py              # FastAPI application
├── agents/
│   └── multi_agent_system.py # CrewAI multi-agent system
├── chatbot/
│   └── rag_chatbot.py       # RAG chatbot implementation
├── database/
│   ├── connection.py        # Database connection
│   └── models.py           # SQLAlchemy models
└── scraper/
    └── gsmarena_scraper.py  # Web scraper for GSMArena
data/                        # Data storage
tests/                       # Test files

Models Used

The system uses free, open-source models:

  • Language Model: microsoft/DialoGPT-medium (conversation)
  • Embedding Model: all-MiniLM-L6-v2 (text embeddings)
  • Agent Model: microsoft/DialoGPT-small (agent coordination)
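
To illustrate how the embedding model and the language model fit together in retrieval-augmented generation, here is a toy sketch of the retrieval step. It is not the project's implementation: a bag-of-words vector stands in for all-MiniLM-L6-v2 embeddings, an in-memory list stands in for ChromaDB, and the function names and placeholder documents are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real sentence-embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Prepend the retrieved context to the question before it goes to the language model."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Placeholder documents, not real scraped data.
docs = [
    "galaxy s23 camera notes",
    "galaxy s23 battery notes",
    "galaxy z fold 4 display notes",
]
print(build_prompt("galaxy s23 camera", docs, k=1))
```

In a real pipeline the prompt produced this way would be passed to the conversation model, which then answers with the retrieved phone data in view.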

Example Queries

Chat Examples

  • "What are the camera specs of the Samsung Galaxy S23?"
  • "Which Samsung phone has the best battery life?"
  • "How does the Galaxy S23 compare to the S22 in terms of performance?"
  • "What is the price of the Galaxy Z Fold 4?"

Agent Examples

  • Specifications: Get detailed specs for any Samsung phone
  • Comparisons: Compare two or more Samsung phones
  • Reviews: Generate comprehensive AI reviews

Development

Running Tests

pytest tests/

Code Formatting

black src/
isort src/
flake8 src/

Adding New Features

  1. Follow the existing project structure
  2. Use type hints and docstrings
  3. Add appropriate error handling
  4. Update API documentation

Configuration

Key configuration options in .env:

  • DATABASE_URL: PostgreSQL connection string
  • MODEL_NAME: Language model for chatbot
  • EMBEDDING_MODEL: Model for text embeddings
  • SCRAPING_DELAY: Delay between web scraping requests
  • CHROMA_DB_PATH: ChromaDB storage path
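
As a rough sketch of how these options might be read at startup, the snippet below loads them from environment variables into a typed settings object. It assumes the values from .env have already been exported into the environment. The `Settings` class and `load_settings` function are hypothetical and do not describe the project's actual src/config.py. The model defaults are taken from the Models Used section; the delay and path defaults, and the assumption that the delay is a number of seconds, are illustrative only.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    model_name: str
    embedding_model: str
    scraping_delay: float
    chroma_db_path: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables."""
    return Settings(
        database_url=env["DATABASE_URL"],  # required; raises KeyError if missing
        model_name=env.get("MODEL_NAME", "microsoft/DialoGPT-medium"),
        embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        # Hypothetical default; the unit (seconds) is an assumption.
        scraping_delay=float(env.get("SCRAPING_DELAY", "1.0")),
        # Hypothetical default path.
        chroma_db_path=env.get("CHROMA_DB_PATH", "./data/chroma"),
    )
```

Passing the mapping as a parameter keeps the function easy to test: a plain dictionary can be supplied in place of `os.environ`.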

Troubleshooting

Common Issues

  1. Database Connection Error:

    • Ensure PostgreSQL is running
    • Check database credentials in .env
  2. Model Loading Issues:

    • Ensure sufficient disk space for model cache
    • Check internet connection for model downloads
  3. Web Scraping Failures:

    • Check GSMArena website accessibility
    • Adjust scraping delays if needed
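
For transient scraping failures, one common approach is to retry with an increasing delay between attempts. The sketch below shows that pattern in general terms; it is not taken from the project's scraper, and `fetch_with_retry` and its parameters are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    delay: float = 1.0,
    attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(); on a network-style error wait delay, 2*delay, ... and retry."""
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:  # covers ConnectionError, timeouts, and urllib's URLError
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so the waiting behaviour can be checked without actually pausing, and the base `delay` plays the same role as a configurable scraping delay.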

Logs

Check logs for detailed error information:

tail -f logs/app.log

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make changes with tests
  4. Submit a pull request

License

This project is for educational and research purposes. Please respect GSMArena's terms of service when scraping data.

Generated from github/codespaces-blank