🤖 RAG Chatbot Agent

An end-to-end Retrieval-Augmented Generation (RAG) system built with n8n that transforms a Google Drive folder into a searchable AI-powered knowledge base. Ask questions in plain English and get accurate, document-grounded answers.

🧠 What Is RAG?

RAG stands for Retrieval-Augmented Generation. Instead of relying on an AI's general training data, RAG:

Retrieves the most relevant chunks from your own documents
Augments the user's question with that retrieved context
Generates a precise answer grounded in your actual files

This means the chatbot answers from your documents, not from general AI knowledge.

🏗️ Architecture

Google Drive Folder
        ↓
  n8n Ingestion Workflow
        ↓
  Extract Text → AI Metadata Extraction → Chunking → Embeddings
        ↓
  Qdrant Vector Database
        ↓
  User Question → Embed Query → Retrieve Top Chunks
        ↓
  Groq LLM (Llama 3.3 70B) → Grounded Answer
        ↓
  Chat Interface + Google Sheets Log

⚙️ Tech Stack

Layer	Tool
Workflow Orchestration	n8n
Document Storage	Google Drive
Vector Database	Qdrant
LLM — Generation & Metadata	Groq — Llama 3.3 70B
Embeddings	Hugging Face — `sentence-transformers/all-MiniLM-L6-v2`
Logging	Google Sheets
Notifications	Gmail

🔄 Workflow 1 — Document Ingestion Pipeline

Triggered manually or on schedule. Processes documents from Google Drive and stores them in Qdrant.

Flow:

Manual Trigger
→ Config (folder ID, collection name, chunk settings)
→ Google Drive — List PDF files
→ Filter PDFs only
→ Loop Over Files (batch size: 1)
  → Download File
  → Extract Text from PDF
  → Normalize & Clean Text
  → Groq — Extract Metadata (title, summary, keywords, topics, risks)
  → Merge text + metadata
  → Flatten Metadata
  → Chunk Text (1200 tokens, 200 overlap)
  → Generate Embeddings (HuggingFace)
  → Store in Qdrant with metadata payload
  → Log to Google Sheets
→ Gmail — Send completion notification

Metadata extracted per document:

Title
Summary
Main topics
Keywords
Document type
Audience
Important entities
Action items
Risks
Dates

💬 Workflow 2 — RAG Chat Interface

Runs every time a user sends a message via n8n's built-in chat UI.

Flow:

Chat Trigger
→ Master Agent (LangChain)
  → Simple Memory (last 30 messages)
  → Qdrant Vector Store Tool (Top-K: 8 chunks)
  → Groq Llama 3.3 70B
→ Return grounded answer
→ Log to Google Sheets

System behaviour:

Always searches Qdrant before answering
Mentions source file names in responses
Says "I don't know" if answer is not in documents
Maintains conversation memory across turns

🗂️ Repository Structure

rag_chatbot_agent/
├── README.md                        ← Project documentation
├── .gitignore                       ← Ignores secrets and env files
├── .env.example                     ← Required credentials template
└── workflows/
    └── RAG-CHATBOT-AGENT.json       ← n8n workflow (import this)

🚀 Getting Started

1. Prerequisites

n8n instance (cloud or self-hosted)
Qdrant cluster (free tier available)
Groq API key (free tier)
Hugging Face API key (free)
Google account (Drive + Sheets + Gmail)

2. Create a Qdrant Collection

In your Qdrant dashboard, create a collection with:

Vector size: 384
Distance: Cosine

3. Set Up Credentials in n8n

Go to n8n → Settings → Credentials and add:

Credential	Used For
Google Drive OAuth2	Reading files
Google Sheets OAuth2	Logging
Gmail OAuth2	Notifications
Qdrant API	Vector storage
Groq API	LLM generation
Hugging Face API	Embeddings

4. Import the Workflow

Open your n8n instance
Click New Workflow → ⋮ Menu → Import from file
Select workflows/RAG-CHATBOT-AGENT.json

5. Configure the Config Node

Open the Edit Fields node and update:

folder_id         → Your Google Drive folder ID
qdrant_collection → Your Qdrant collection name
qdrant_url        → Your Qdrant cluster URL

6. Run Ingestion

Add PDF files to your Google Drive folder
Click Execute Workflow on the ingestion workflow
Wait for the Gmail completion notification
Verify vectors appear in your Qdrant dashboard

7. Start Chatting

Open the Chat trigger in n8n
Click the chat icon to open the chat UI
Ask questions about your documents

🔑 Environment Variables

Copy .env.example to .env and fill in your values:

QDRANT_URL=YOUR_QDRANT_CLUSTER_URL
QDRANT_API_KEY=your_qdrant_api_key_here
GROQ_API_KEY=your_groq_api_key_here
HF_API_KEY=your_huggingface_api_key_here
GOOGLE_DRIVE_FOLDER_ID=your_google_drive_folder_id
QDRANT_COLLECTION=your_qdrant_collection_name
GOOGLE_SHEETS_ID=your_google_sheets_id
GMAIL_ADDRESS=your_gmail_address
N8N_INSTANCE_ID=your_n8n_instance_id
WEBHOOK_ID=your_webhook_id

📊 Google Sheets Logging

The workflow automatically logs every indexed document to Google Sheets with:

Column	Description
timestamp	When the file was indexed
file_id	Google Drive file ID
file_name	Name of the document
status	`indexed`
collection	Qdrant collection used
metadata	Extracted AI metadata
pageContent	Chunk text stored

💡 Key Concepts Demonstrated

RAG Architecture — retrieval-grounded answer generation
Vector Database Design — Qdrant with rich metadata payloads
Semantic Chunking — 1200 token chunks with 200 token overlap
AI Metadata Extraction — structured enrichment using Groq
LangChain Agent — tool-using agent with memory
Workflow Automation — end-to-end orchestration in n8n
Google Workspace Integration — Drive, Sheets, Gmail APIs

📄 License

MIT License — free to use, modify, and distribute.

👤 Author

MAHADEVAN-007 GitHub: @MAHADEVAN-007

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🤖 RAG Chatbot Agent

🧠 What Is RAG?

🏗️ Architecture

⚙️ Tech Stack

🔄 Workflow 1 — Document Ingestion Pipeline

💬 Workflow 2 — RAG Chat Interface

🗂️ Repository Structure

🚀 Getting Started

1. Prerequisites

2. Create a Qdrant Collection

3. Set Up Credentials in n8n

4. Import the Workflow

5. Configure the Config Node

6. Run Ingestion

7. Start Chatting

🔑 Environment Variables

📊 Google Sheets Logging

💡 Key Concepts Demonstrated

📄 License

👤 Author

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
workflows		workflows
.env.example		.env.example
.gitignore		.gitignore
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

🤖 RAG Chatbot Agent

🧠 What Is RAG?

🏗️ Architecture

⚙️ Tech Stack

🔄 Workflow 1 — Document Ingestion Pipeline

💬 Workflow 2 — RAG Chat Interface

🗂️ Repository Structure

🚀 Getting Started

1. Prerequisites

2. Create a Qdrant Collection

3. Set Up Credentials in n8n

4. Import the Workflow

5. Configure the Config Node

6. Run Ingestion

7. Start Chatting

🔑 Environment Variables

📊 Google Sheets Logging

💡 Key Concepts Demonstrated

📄 License

👤 Author

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages