Conversational Book Information Retrieval System

This repository implements a conversational Retrieval-Augmented Generation (RAG) system that answers questions about books scraped from books.toscrape.com. It uses Gemini 2.0 Flash for generation, Google Generative AI Embeddings for vectorization, and ChromaDB for storage. It features persistent storage, conversational context awareness, and error handling.

Key Capabilities

Conversational Context Awareness: Maintains a chat history to understand the flow of the conversation.
Web Scraping: Scrapes book data from books.toscrape.com, including title, price, description, category, and other details.
Persistent Vector Store: Uses ChromaDB to create and persist a vector store of book information, allowing for efficient retrieval.
Gemini Integration: Leverages Gemini 2.0 Flash for question answering and text generation.
Google Generative AI Embeddings: Uses Google Generative AI Embeddings to create vector embeddings for book data.
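
To make the retrieval step concrete, here is a minimal sketch of similarity search over stored vectors. It is not the code ChromaDB runs internally: the helper names (`cosine_similarity`, `top_k`), the placeholder titles, and the 3-dimensional toy vectors are all assumptions made for illustration, standing in for real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)

def top_k(query_vec, store, k=2):
    """Return the k titles whose vectors are most similar to query_vec."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:k]]

# Toy vectors with made-up numbers; real embeddings have far more dimensions.
toy_store = [
    ("A Light in the Attic", [0.9, 0.1, 0.0]),
    ("Example Book B", [0.1, 0.9, 0.0]),
    ("Example Book C", [0.0, 0.2, 0.9]),
]

nearest = top_k([1.0, 0.0, 0.0], toy_store, k=1)
```

In the real system the query vector would come from the same embedding model used to index the books, so that query and documents share one vector space.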

Architecture

```mermaid
graph LR
    A[User Query] --> B(Conversational RAG System);
    B --> C{ChromaDB Exists?};
    C -- Yes --> D[Load ChromaDB];
    C -- No --> E[Scrape Books];
    E --> F[Create Documents];
    F --> G[Create Embeddings];
    G --> H[Store in ChromaDB];
    D --> I[Retrieve Relevant Docs];
    H --> I;
    I --> J[Gemini 2.0 Flash];
    J --> K[Generate Response];
    K --> L[Display Response];
    L --> M[Update Conversation Memory];
    M --> A;
```
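
The "ChromaDB Exists?" branch in the diagram can be sketched as a load-or-build check. This is an illustration under assumptions, not the repository's code: `load_or_build_store`, `load_store`, and `build_store` are hypothetical names, and "exists" is assumed here to mean that the persistence directory is present and non-empty.

```python
import os

def load_or_build_store(persist_dir, load_store, build_store):
    """Load the vector store if persist_dir has content, otherwise build it.

    load_store and build_store are caller-supplied callables (hypothetical);
    build_store would scrape, embed, and persist in the real pipeline.
    Returns a (store, status) pair so callers can see which branch ran.
    """
    if os.path.isdir(persist_dir) and os.listdir(persist_dir):
        return load_store(persist_dir), "loaded"
    return build_store(persist_dir), "built"
```

Passing the two branches in as callables keeps the check testable without touching the network or the embedding API.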

Prerequisites

Python 3.6+
Google Cloud Project with Gemini API enabled
Google Cloud API key
.env file with GEMINI_API_KEY set

Installation Guide

Clone the repository:
```
git clone [repository_url]
cd [repository_directory]
```
Create a virtual environment (recommended):
```
python3 -m venv venv
source venv/bin/activate  # On macOS and Linux
venv\Scripts\activate  # On Windows
```

Install dependencies:
```
pip install -r requirements.txt
```
(See requirements.txt for the list of dependencies.)
Create a .env file:
```
GEMINI_API_KEY=YOUR_GEMINI_API_KEY
```
Replace YOUR_GEMINI_API_KEY with your actual Gemini API key.
Make sure the following files are in the project directory:
- bookquery.py (Main script)
- webscraper.py (Web scraping functions)
- rag_utils.py (RAG system utilities)
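
As a sanity check for the .env step above, the following sketch parses KEY=VALUE text and fails early when GEMINI_API_KEY is missing or still set to the placeholder. It uses only the standard library; `parse_env_text` and `require_api_key` are hypothetical helpers, and the repository may well load the file differently (for example with a dedicated dotenv package).

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

def require_api_key(env, name="GEMINI_API_KEY"):
    """Return the key, or raise a clear error if it is missing or a placeholder."""
    value = env.get(name, "")
    if not value or value == "YOUR_GEMINI_API_KEY":
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

Failing at startup with a clear message is usually easier to debug than an authentication error surfacing later, in the middle of an embedding or generation call.
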

Usage

Run the script:
```
python bookquery.py
```
Enter your queries:

The script will prompt you to enter queries about books. You can ask questions like:
- "Tell me about 'A Light in the Attic'."

Exit the application:

Type "exit" and press Enter.
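
The interaction described above can be sketched as a simple read-answer loop that stops on "exit" and keeps a running history. This is an assumption about the general shape, not the contents of bookquery.py: `chat_loop` and `answer_fn` are hypothetical names, the prompt text is made up, and `answer_fn` stands in for the call into the RAG chain.

```python
def chat_loop(answer_fn, read_line=input, write_line=print):
    """Prompt until the user types "exit"; return the accumulated history.

    answer_fn(query, history) is a hypothetical callable that produces a reply.
    read_line and write_line are injectable so the loop can be tested.
    """
    history = []  # list of (question, answer) pairs, oldest first
    while True:
        try:
            query = read_line("Ask about a book (or type 'exit'): ").strip()
        except EOFError:
            break  # input stream closed, treat it like "exit"
        if query.lower() == "exit":
            break
        if not query:
            continue  # ignore empty input and prompt again
        answer = answer_fn(query, history)
        write_line(answer)
        history.append((query, answer))
    return history
```

Passing the history into `answer_fn` on each turn is what would let a follow-up such as "How much does it cost?" be resolved against the book mentioned earlier.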

Code Explanation

bookquery.py: Main script that orchestrates the RAG system and handles user interaction.
webscraper.py: Contains functions for scraping book details (extract_book_details) and scraping all books (scrape_all_books).
rag_utils.py: Holds the build_rag_system function that sets up the conversational RAG system, including ChromaDB, embeddings, Gemini integration, and conversation memory.
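
For orientation, here is a rough sketch of what a detail extractor along the lines of extract_book_details might do, using only the standard library. Everything in it is an assumption: the function is deliberately given a different name (`extract_book_details_sketch`), the sample HTML is a simplified, made-up fragment rather than a real page from the site, and the actual module may use a different parser and return more fields.

```python
from html.parser import HTMLParser

class BookPageParser(HTMLParser):
    """Collects the first <h1> text as title and <p class="price_color"> as price."""

    def __init__(self):
        super().__init__()
        self.details = {}
        self._field = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "h1":
            self._field = "title"
        elif tag == "p" and "price_color" in classes:
            self._field = "price"

    def handle_data(self, data):
        if self._field and data.strip():
            # Keep only the first non-empty text seen for each field.
            self.details.setdefault(self._field, data.strip())

    def handle_endtag(self, tag):
        if tag in ("h1", "p"):
            self._field = None

def extract_book_details_sketch(html):
    """Return a dict with whatever title/price fields the markup contains."""
    parser = BookPageParser()
    parser.feed(html)
    parser.close()
    return parser.details

# Simplified, made-up markup; the price is an illustrative value.
SAMPLE_HTML = """
<div class="product_main">
  <h1>A Light in the Attic</h1>
  <p class="price_color">£10.00</p>
</div>
"""

details = extract_book_details_sketch(SAMPLE_HTML)
```

Each extracted dict would then be turned into a document and embedded, as in the Architecture section.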

Contributing

Contributions are welcome! Please submit a pull request or open an issue for any bugs or feature requests.
