An internal knowledge-base chatbot built on retrieval-augmented generation (RAG). It integrates data sources such as Confluence, Jira, Slack, and GitHub to provide AI-powered answers.
- RAG-based Q&A — Search internal documents and generate accurate answers via Claude API
- Multi-source integration — Extensible plugin architecture for Confluence, Jira, Slack, GitHub, etc.
- Incremental sync — Automatically update only changed documents (Celery batch)
- Source attribution — Provide reference document links in answers
- Streaming responses — Fast UX with real-time SSE streaming
- Conversation history — Support follow-up questions using previous conversation context
- Admin dashboard — Data source management, sync status, usage monitoring
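
The incremental sync feature listed above can be pictured as a cursor comparison. The following is a minimal sketch, not the project's actual Celery task: `Doc`, `select_changed`, and the timestamp-based cursor are hypothetical names chosen for illustration, and a real source may expose a different cursor type (for example a page token).

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Doc:
    """Hypothetical document record; not the project's real model."""
    doc_id: str
    updated_at: datetime


def select_changed(docs, cursor):
    """Return docs modified after `cursor`, plus the cursor for the next run.

    `cursor` is the newest `updated_at` seen so far, or None on the first sync.
    """
    changed = [d for d in docs if cursor is None or d.updated_at > cursor]
    # Advance the cursor only when something new was seen.
    new_cursor = max((d.updated_at for d in changed), default=cursor)
    return changed, new_cursor


# Made-up sample data for illustration.
docs = [
    Doc("a", datetime(2024, 1, 1, tzinfo=timezone.utc)),
    Doc("b", datetime(2024, 1, 3, tzinfo=timezone.utc)),
]
first_batch, cursor = select_changed(docs, None)      # first run: everything
second_batch, cursor2 = select_changed(docs, cursor)  # nothing changed since
```

Persisting the returned cursor between runs is what lets each sync skip unchanged documents.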
┌─────────────────┐     ┌──────────────────────────────────┐
│    Frontend     │     │        Backend (FastAPI)         │
│    (Next.js)    │────▶│                                  │
│                 │ SSE │  Chat API → RAG Engine → Claude  │
└─────────────────┘     │                ↕                 │
                        │        Vector DB (Qdrant)        │
                        └──────────────────────────────────┘
                                         ↑
                        ┌────────────────┴─────────────────┐
                        │   Ingestion Pipeline (Celery)    │
                        │  Confluence │ Jira │ Slack │ ... │
                        └──────────────────────────────────┘
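
The Chat API → RAG Engine → Claude path in the diagram can be sketched as "retrieve, then assemble a prompt". This is an illustrative toy under stated assumptions: the hand-written 3-dimensional vectors stand in for real embeddings, the in-memory list stands in for Qdrant, the sample texts and example.com links are made up, and `Chunk`, `retrieve`, and `build_prompt` are hypothetical names. No embedding or LLM call is made.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical indexed chunk: text, source link, and a toy embedding."""
    text: str
    url: str
    vector: list


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector, index, k=2):
    """Return the k chunks most similar to the query (stand-in for a vector DB search)."""
    ranked = sorted(index, key=lambda c: cosine(query_vector, c.vector), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks):
    """Number each chunk so the generated answer can cite its source link."""
    context = "\n".join(f"[{i}] {c.text} ({c.url})" for i, c in enumerate(chunks, 1))
    return (
        "Answer using only the context below and cite sources by number.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


index = [
    Chunk("Deploys run on ECS Fargate.", "https://example.com/wiki/deploy", [1.0, 0.0, 0.0]),
    Chunk("Lunch menu for Friday.", "https://example.com/wiki/lunch", [0.0, 1.0, 0.0]),
    Chunk("Rollbacks reuse the previous release.", "https://example.com/wiki/rollback", [0.9, 0.1, 0.0]),
]
top = retrieve([1.0, 0.0, 0.0], index, k=2)
prompt = build_prompt("How do we deploy?", top)
```

Keeping the source link next to each numbered chunk is one simple way to support the source attribution feature, since the model's citations map straight back to URLs.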
| Layer | Technology |
|---|---|
| Backend | Python 3.12+, FastAPI |
| RAG | LangChain |
| LLM | Claude API (Anthropic) |
| Embedding | OpenAI text-embedding-3-small |
| Vector DB | Qdrant |
| Database | PostgreSQL 16 |
| Cache/Queue | Redis 7 |
| Task Queue | Celery |
| Frontend | Next.js 15, Tailwind CSS |
| Auth | Google OAuth 2.0 |
| Infra | AWS ECS Fargate, Docker |
- Python 3.12+
- Node.js 20+
- Docker & Docker Compose
git clone https://github.com/wonnx/knowledge-bot.git
cd knowledge-bot
cp .env.example .env
# Configure API keys in the .env file

# PostgreSQL, Redis, Qdrant
docker compose -f docker-compose.dev.yml up -d

cd backend
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# DB migrations
alembic upgrade head
# Run server
uvicorn app.main:app --reload --port 8000

cd frontend
npm install
npm run dev

- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
Configure data sources to integrate in config.yaml:
sources:
  confluence:
    enabled: true
    url: "https://your-domain.atlassian.net"
    spaces: ["DEV", "PM"]
    sync_interval: "1h"
  jira:
    enabled: true
    url: "https://your-domain.atlassian.net"
    projects: ["PROJ"]
    sync_interval: "30m"
  slack:
    enabled: false
    channels: ["general", "dev"]
    sync_interval: "1h"
  github:
    enabled: false
    repos: ["org/repo"]
    sync_interval: "2h"

Extend BaseIngester to add a new data source:
# backend/app/ingestion/my_source.py
from app.ingestion.base import BaseIngester


class MySourceIngester(BaseIngester):
    async def fetch_documents(self):
        # Collect documents from the data source
        ...

    async def get_last_sync_cursor(self):
        # Cursor for incremental sync
        ...

| Variable | Description | Required |
|---|---|---|
| ANTHROPIC_API_KEY | Claude API key | Yes |
| OPENAI_API_KEY | OpenAI API key (embedding) | Yes |
| DATABASE_URL | PostgreSQL connection string | Yes |
| REDIS_URL | Redis connection string | Yes |
| QDRANT_URL | Qdrant server URL | Yes |
| JWT_SECRET | JWT signing secret | Yes |
| GOOGLE_CLIENT_ID | Google OAuth client ID | Yes |
| GOOGLE_CLIENT_SECRET | Google OAuth client secret | Yes |
| CONFLUENCE_URL | Atlassian instance URL | Per source |
| CONFLUENCE_API_TOKEN | Atlassian API token | Per source |
| JIRA_API_TOKEN | Jira API token | Per source |
| SLACK_BOT_TOKEN | Slack bot token | Per source |
| GITHUB_TOKEN | GitHub personal access token | Per source |
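
Because most of the variables above are required, a startup check can fail fast with a clear message instead of erroring on the first API call. The sketch below is one possible approach, not the project's actual settings code: `missing_required` is a hypothetical helper, and the list of names is copied from the rows marked "Yes".

```python
import os

# Variables marked "Yes" in the Required column above.
REQUIRED_VARS = [
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "DATABASE_URL",
    "REDIS_URL",
    "QDRANT_URL",
    "JWT_SECRET",
    "GOOGLE_CLIENT_ID",
    "GOOGLE_CLIENT_SECRET",
]


def missing_required(env=None):
    """Return the required variable names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
sample_env = {name: "x" for name in REQUIRED_VARS if name != "JWT_SECRET"}
missing = missing_required(sample_env)
```

Calling `missing_required()` with no argument checks the real process environment; the per-source tokens are left out because they only matter when the matching source is enabled.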
MIT License