Knowledge Bot

An internal knowledge-base RAG chatbot. It integrates multiple data sources (Confluence, Jira, Slack, GitHub, etc.) and answers questions with AI-generated responses grounded in those sources.

Features

  • RAG-based Q&A — Search internal documents and generate accurate answers via Claude API
  • Multi-source integration — Extensible plugin architecture for Confluence, Jira, Slack, GitHub, etc.
  • Incremental sync — Automatically update only changed documents (Celery batch)
  • Source attribution — Provide reference document links in answers
  • Streaming responses — Fast UX with real-time SSE streaming
  • Conversation history — Support follow-up questions using previous conversation context
  • Admin dashboard — Data source management, sync status, usage monitoring

Architecture

┌─────────────────┐     ┌──────────────────────────────────┐
│  Frontend       │     │  Backend (FastAPI)                │
│  (Next.js)      │────▶│                                  │
│                 │ SSE │  Chat API → RAG Engine → Claude  │
└─────────────────┘     │                ↕                 │
                        │           Vector DB (Qdrant)      │
                        └──────────────────────────────────┘
                                         ↑
                        ┌────────────────┴─────────────────┐
                        │  Ingestion Pipeline (Celery)      │
                        │  Confluence │ Jira │ Slack │ ...  │
                        └──────────────────────────────────┘

Tech Stack

| Layer       | Technology                    |
|-------------|-------------------------------|
| Backend     | Python 3.12+, FastAPI         |
| RAG         | LangChain                     |
| LLM         | Claude API (Anthropic)        |
| Embedding   | OpenAI text-embedding-3-small |
| Vector DB   | Qdrant                        |
| Database    | PostgreSQL 16                 |
| Cache/Queue | Redis 7                       |
| Task Queue  | Celery                        |
| Frontend    | Next.js 15, Tailwind CSS      |
| Auth        | Google OAuth 2.0              |
| Infra       | AWS ECS Fargate, Docker       |

Quick Start

Prerequisites

  • Python 3.12+
  • Node.js 20+
  • Docker & Docker Compose

1. Clone & Setup

git clone https://github.com/wonnx/knowledge-bot.git
cd knowledge-bot
cp .env.example .env
# Configure API keys in the .env file

2. Start Infrastructure

# PostgreSQL, Redis, Qdrant
docker compose -f docker-compose.dev.yml up -d

3. Backend

cd backend
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# DB migrations
alembic upgrade head

# Run server
uvicorn app.main:app --reload --port 8000

4. Frontend

# In a separate terminal, from the repository root
cd frontend
npm install
npm run dev

5. Open

Data Source Configuration

List the data sources to integrate in config.yaml and enable the ones you need:

sources:
  confluence:
    enabled: true
    url: "https://your-domain.atlassian.net"
    spaces: ["DEV", "PM"]
    sync_interval: "1h"

  jira:
    enabled: true
    url: "https://your-domain.atlassian.net"
    projects: ["PROJ"]
    sync_interval: "30m"

  slack:
    enabled: false
    channels: ["general", "dev"]
    sync_interval: "1h"

  github:
    enabled: false
    repos: ["org/repo"]
    sync_interval: "2h"
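The `sync_interval` values above are short duration strings. As an illustration of how such values might be turned into schedulable intervals, the sketch below parses them into `timedelta` objects. The helper names are hypothetical, and only the `m` and `h` suffixes appear in the example config; support for `s` and `d` is an assumption. The input dict mirrors the `sources` mapping as it would look after loading the YAML.

```python
import re
from datetime import timedelta

# Suffix -> timedelta keyword. Only "m" and "h" appear in the sample config;
# "s" and "d" are assumed here for completeness.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}
_PATTERN = re.compile(r"(\d+)([smhd])")


def parse_interval(value: str) -> timedelta:
    """Convert a string such as "30m" or "1h" into a timedelta."""
    match = _PATTERN.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unsupported sync_interval: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def enabled_intervals(sources: dict) -> dict[str, timedelta]:
    """Map each enabled source name to its parsed sync interval."""
    return {
        name: parse_interval(cfg["sync_interval"])
        for name, cfg in sources.items()
        if cfg.get("enabled")
    }


if __name__ == "__main__":
    sources = {
        "confluence": {"enabled": True, "sync_interval": "1h"},
        "jira": {"enabled": True, "sync_interval": "30m"},
        "slack": {"enabled": False, "sync_interval": "1h"},
    }
    for name, interval in enabled_intervals(sources).items():
        print(name, interval.total_seconds())
```

Rejecting unknown formats with `ValueError` surfaces a typo in the config at startup instead of silently skipping a sync.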

Adding a New Data Source

Extend BaseIngester to add a new data source:

# backend/app/ingestion/my_source.py
from app.ingestion.base import BaseIngester

class MySourceIngester(BaseIngester):
    async def fetch_documents(self):
        # Collect documents from the data source
        ...

    async def get_last_sync_cursor(self):
        # Cursor for incremental sync
        ...

Environment Variables

| Variable             | Description                  | Required   |
|----------------------|------------------------------|------------|
| ANTHROPIC_API_KEY    | Claude API key               | Yes        |
| OPENAI_API_KEY       | OpenAI API key (embedding)   | Yes        |
| DATABASE_URL         | PostgreSQL connection string | Yes        |
| REDIS_URL            | Redis connection string      | Yes        |
| QDRANT_URL           | Qdrant server URL            | Yes        |
| JWT_SECRET           | JWT signing secret           | Yes        |
| GOOGLE_CLIENT_ID     | Google OAuth client ID       | Yes        |
| GOOGLE_CLIENT_SECRET | Google OAuth client secret   | Yes        |
| CONFLUENCE_URL       | Atlassian instance URL       | Per source |
| CONFLUENCE_API_TOKEN | Atlassian API token          | Per source |
| JIRA_API_TOKEN       | Jira API token               | Per source |
| SLACK_BOT_TOKEN      | Slack bot token              | Per source |
| GITHUB_TOKEN         | GitHub personal access token | Per source |
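Since several of these variables are mandatory and others depend on which sources are enabled, a startup check can catch a misconfigured `.env` early. The sketch below is one possible approach, not code from this repository: the function name is hypothetical, and the grouping of per-source variables under source names is inferred from the variable names in the table.

```python
import os
from collections.abc import Iterable, Mapping

# Variables marked "Yes" in the table.
REQUIRED = (
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "DATABASE_URL",
    "REDIS_URL",
    "QDRANT_URL",
    "JWT_SECRET",
    "GOOGLE_CLIENT_ID",
    "GOOGLE_CLIENT_SECRET",
)

# Variables marked "Per source"; this source-to-variable grouping is an assumption.
PER_SOURCE = {
    "confluence": ("CONFLUENCE_URL", "CONFLUENCE_API_TOKEN"),
    "jira": ("JIRA_API_TOKEN",),
    "slack": ("SLACK_BOT_TOKEN",),
    "github": ("GITHUB_TOKEN",),
}


def missing_variables(
    env: Mapping[str, str], enabled_sources: Iterable[str] = ()
) -> list[str]:
    """Return the names of expected variables that are unset or empty."""
    names = list(REQUIRED)
    for source in enabled_sources:
        names.extend(PER_SOURCE.get(source, ()))
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables(os.environ, enabled_sources=["confluence", "jira"])
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("Environment looks complete.")
```

Taking the environment as a parameter instead of reading `os.environ` inside the function keeps the check easy to unit test.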

License

MIT License
