Skip to content

Tammyb2/sql-db-chat-agent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Atliq T-Shirts Database Q&A Agent

A Streamlit application using a Few-Shot LangChain SQL Agent (powered by Gemini-2.5-Flash) to query a MySQL database for T-shirt inventory data in natural language.

🌟 Overview

This project provides a robust, natural language interface for interacting with a hypothetical T-shirt inventory database (atliq_tshirts). It leverages the power of Large Language Models (LLMs) and LangChain's SQL Agent toolkit, specifically utilizing a Few-Shot Prompting technique to ensure high accuracy in converting user questions into correct SQL queries.

🚀 How It Works

  1. User Input: The user enters a question into the Streamlit app (e.g., "What is the total cost of all small size t-shirts?").
  2. Few-Shot Selection: An OpenAIEmbeddings model and a Chroma vector store are used with SemanticSimilarityExampleSelector to find the most relevant examples from fewshots.py.
  3. Prompt Construction: The selected examples are included in a FewShotPromptTemplate along with the database schema and the user's question, guiding the LLM (Gemini-2.5-Flash) to generate the correct SQL query.
  4. SQL Execution: The LangChain SQL Agent executes the generated SQL query against the MySQL database (atliq_tshirts).
  5. Answer Generation: The LLM receives the SQL result and formats a final, human-readable answer, which is then displayed in the Streamlit application.

🛠️ Technology Stack

Component Technology / Library Purpose
LLM ChatGoogleGenerativeAI(model="gemini-2.5-flash") Core reasoning and SQL generation.
Agent Framework LangChain SQL Agent Toolkit Manages the NL-to-SQL execution process.
Database MySQL (via SQLDatabase.from_uri) Stores the T-shirt inventory data.
Example Selector SemanticSimilarityExampleSelector Retrieves the most contextually similar few-shot examples.
Embeddings OpenAIEmbeddings Used for calculating semantic similarity between questions.
Frontend Streamlit Interactive web interface for the Q&A app.

📦 Local Setup and Installation

Prerequisites

  • Python 3.x
  • A running MySQL server (configured with a database named atliq_tshirts and a tshirts table that matches the schema implied by fewshots.py).
  • Google API Key for the LLM (set as GOOGLE_API_KEY in your environment or a .env file).
  • OpenAI API Key (used for OpenAIEmbeddings as configured in langchain_helper.py).

Steps

  1. Clone the Repository:

    git clone [your-repo-link]
    cd [your-repo-name]
  2. Set up Environment Variables: Create a file named .env and/or secret_key.py/openai_key.py to store your API keys and database credentials (as referenced by langchain_helper.py).

  3. Install Dependencies:

    pip install -r requirements.txt # (Assuming you generate this file)
    # Required libraries include: streamlit, langchain-google-genai, langchain-community, langchain-openai, python-dotenv, pymysql, chromadb
  4. Run the Streamlit App:

    streamlit run main.py

The application will open in your web browser, ready to accept questions.

📋 Few-Shot Examples (fewshots.py)

The project uses a small set of example question/SQL pairs to maximize the agent's performance.

Input Example Generated SQL Query Answer
"How many t-shirts do we have left for nike in extra small size and white color?" SELECT sum(stockquantity) FROM tshirts WHERE brand='Nike' AND color='White' AND size='XS' "We have 64 t-shirts left for Nike..."
"What is the total cost of all small size t-shirts?" SELECT SUM(price * stockquantity) FROM tshirts WHERE size='S' "The total value of all the T-shirts with size S in the database is 16,379."
"If we have to sell all the Levis T-shirts today. How much revenue our store will generate without discount?" SELECT SUM(price * stockquantity) FROM tshirts WHERE brand='Levi' "The total value of Levi brand t-shirts in stock is 24,421."

You can extend this list in fewshots.py to cover more complex or unique query patterns for improved model accuracy.

About

A Streamlit application using a Few-Shot LangChain SQL Agent (Gemini-2.5-Flash) to query a MySQL database for T-shirt inventory data.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages