YouTube RAG Chrome Extension

A full-stack Retrieval-Augmented Generation (RAG) application that lets you ask questions about a YouTube video from the video page itself. It pairs a Python backend (transcript extraction, local embeddings, and LLM inference) with a Chrome Extension frontend embedded directly in the YouTube interface.

Prerequisites

Python 3.9+
Ollama installed and running on your system.
Google Chrome browser.

Installation & Setup

1. Clone the Repository

git clone https://github.com/Vinit-007/Youtube-Rag-Chrome-Extension
cd Youtube-Rag-Chrome-Extension

2. Set Up the Python Environment

Create and activate a virtual environment:

python -m venv venv

# On Windows:
.\venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

4. Pull the Local LLM

Ensure Ollama is running, then download the model required for this project:

ollama pull llama3.2:1b
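
To confirm that the pull succeeded, you can ask Ollama which models it has locally. The sketch below is not part of this project. It assumes Ollama is serving its HTTP API at the default address http://localhost:11434 and uses the /api/tags endpoint, which lists local models. The helper names (fetch_tags, model_names, has_model) are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is running on its default local address and port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def fetch_tags(url=OLLAMA_TAGS_URL, timeout=5):
    """Fetch and parse the JSON list of locally available models."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def model_names(tags_payload):
    """Return the model names found in a parsed /api/tags response."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def has_model(tags_payload, wanted="llama3.2:1b"):
    """Return True if the wanted model tag appears in the response."""
    return wanted in model_names(tags_payload)
```

Calling has_model(fetch_tags()) should return True once the pull has finished. If Ollama is not running, fetch_tags raises an OSError subclass (urllib.error.URLError).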

Running the Application

1. Start the Backend Server

The Python backend handles transcript downloads, chunking, embedding generation, and question answering. Keep it running in the background while you use the extension.

# Ensure your virtual environment is active
python -m app.server

The server will start listening on http://127.0.0.1:8765.
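
If the extension cannot reach the backend, a quick first check is whether anything is accepting connections on that address. The following is a minimal sketch using only the standard library. The function name is_listening is hypothetical, and the check only confirms that the TCP port is open, not that the process behind it is this project's server.

```python
import socket


def is_listening(host="127.0.0.1", port=8765, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises an OSError subclass on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run is_listening() from a Python shell while the backend is running. A False result suggests the server has not started or is bound to a different port.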

2. Load the Chrome Extension

To interact with the RAG pipeline directly from your browser:

1. Open Google Chrome.
2. Go to chrome://extensions/ in the address bar.
3. Toggle Developer mode to ON (top right corner).
4. Click the Load unpacked button (top left).
5. Select the chrome-extension folder located inside this project directory.

How to Use

Go to YouTube and open any video that contains an English transcript or closed captions.
Look for the newly added AI Assistant UI on the page.
Type your question about the video's content into the chatbox and press Ask.
The extension will send the video URL and your question to the local backend, retrieve relevant chunks from the video transcript, and stream an AI-generated answer back to your screen.
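
Because the extension sends a video URL rather than a bare identifier, the backend presumably has to recover the video ID before it can look up a transcript. How this project does that is not shown here. The sketch below is one hedged way to do it with the standard library, and extract_video_id is a hypothetical helper name. It handles the common youtube.com/watch?v=..., youtu.be/..., /embed/..., and /shorts/... URL shapes and returns None for anything else.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Return the video ID from a common YouTube URL shape, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None

    if host == "youtube.com" or host.endswith(".youtube.com"):
        # Standard watch page: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        # Embed and Shorts pages: /embed/<id>, /shorts/<id>
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("embed", "shorts"):
            return parts[1]

    return None
```

For example, extract_video_id("https://www.youtube.com/watch?v=abc123XYZ_-&t=10s") yields the ID without the trailing timestamp parameter.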

Exploring the Pipeline

If you want to understand how the RAG pipeline operates without the server or UI, open the Main.ipynb Jupyter Notebook. It provides a step-by-step walkthrough of transcript loading, text splitting, embedding storage, and query retrieval.

Note: Embeddings are computed locally on the CPU (via FAISS and Sentence Transformers), so the initial load and processing time for a new video depends on the video's length and your system's hardware.
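
The stages the notebook walks through (splitting, embedding, retrieval) can be illustrated without any dependencies. The sketch below is a toy stand-in, not the project's code: it replaces Sentence Transformers with simple word counts and replaces the FAISS index with a brute-force cosine-similarity sort. The function names, chunk_size, and overlap values are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=40, overlap=10):
    """Split text into windows of chunk_size words, overlapping by `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (brute force, no index)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

In a real pipeline the retrieved chunks would be placed into the LLM prompt together with the question. The overlap between windows reduces the chance that a sentence relevant to the question is cut in half at a chunk boundary.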
