Graphen

GraphRAG Knowledge Graph Web App — Upload documents, automatically build knowledge graphs, visually explore, and chat with AI.

Document Parsing → Knowledge Graph Construction → Visual Exploration → AI Chat

✨ Features

📄 Document Parsing — Upload and automatically parse Markdown, PDF, and TXT documents
🧠 Knowledge Graph Construction — Extract entities and relationships from documents using LLM, automatically building a structured knowledge graph
🌐 Interactive Visualization — 2D force-directed graph powered by Reagraph with click, search, filter, and expand interactions
💬 GraphRAG Chat — Enhanced conversational AI using knowledge graph + vector retrieval, with accurate answers and source citations
🔌 Split Storage Architecture — PostgreSQL + pgvector as source of truth, Neo4j as graph projection

🏗️ Tech Stack

Layer	Technology
Frontend	React 19 + TypeScript + Vite
UI	Radix UI + Vanilla CSS + Framer Motion
Visualization	Reagraph (2D Force Layout)
State Management	Zustand
Backend	Node.js + Express + TypeScript
LLM	Gemini / Qwen / OpenAI (OpenAI-compatible)
Primary Storage	PostgreSQL 15+ + pgvector
Graph Database	Neo4j 5.x (graph traversal and subgraph query)
Monorepo	pnpm workspace

📁 Project Structure

graphen/
├── packages/
│   ├── frontend/          # Frontend (React + Vite)
│   ├── backend/           # Backend (Express + TypeScript)
│   └── shared/            # Shared type definitions
├── cases/                 # Sample test cases
├── data/                  # Local runtime data (.gitignore)
├── .env.example           # Environment variable template
├── pnpm-workspace.yaml    # pnpm workspace config
└── tsconfig.base.json     # Shared TypeScript config

🚀 Getting Started

1. Prerequisites

Tool	Version	Notes
Node.js	≥ 18.x	LTS version recommended (20.x / 22.x)
pnpm	≥ 10.0	Package manager (`npm install -g pnpm`)
PostgreSQL	≥ 15.x	Primary database (requires `pgvector` extension)
Neo4j	≥ 5.x	Graph database (local install or Neo4j Aura Free Tier)
LLM API Key	—	API key for Gemini, Qwen, or OpenAI

2. Install Neo4j

Choose one of the following methods:

Option A: Local Installation (Recommended)

Download Neo4j Community Edition from neo4j.com, then start:

# macOS (Homebrew)
brew install neo4j
neo4j start

Default address: bolt://localhost:7687, initial username/password: neo4j/neo4j (you'll be prompted to change the password on first login).

Option B: Neo4j Aura Free Tier (No Installation)

Visit Neo4j Aura to create a free instance and obtain the connection URI and password.

3. Clone & Install Dependencies

git clone <repo-url>
cd graphen

# Install all dependencies (frontend + backend + shared)
pnpm install

4. Configure Environment Variables

cp .env.example .env

Edit the .env file with the required settings (see Configuration below):

# ⚠️ Required
LLM_PROVIDER=gemini # or qwen, openai
GEMINI_API_KEY=sk-xxxxxxxxxxxxxxxx
NEO4J_PASSWORD=your-neo4j-password
PG_HOST=localhost
PG_DATABASE=graphen
PG_USER=graphen

# Other settings have default values, adjust as needed

If PG_AUTO_BOOTSTRAP=true (default), backend startup will automatically ensure the configured PG_USER and PG_DATABASE exist, then initialize the memory schema. It uses PG_BOOTSTRAP_DATABASE + PG_BOOTSTRAP_USER (or PGUSER/USER fallback) for admin connection.

5. Start the Development Server

# Start frontend and backend
pnpm dev

Once started, visit:

Service	URL
Frontend	http://localhost:5173
Backend API	http://localhost:3001
Health Check	http://localhost:3001/api/health

💡 Both frontend and backend support Hot Module Replacement (HMR) — changes are reflected immediately.

⚙️ Configuration

All configuration is managed via the .env file in the project root (based on the .env.example template).

General

Variable	Default	Description
`NODE_ENV`	`development`	Runtime environment
`PORT`	`3001`	Backend server port
`CORS_ORIGIN`	`http://localhost:5173`	Allowed frontend CORS origin
`LOG_LEVEL`	`info`	Log level (`debug` / `info` / `warn` / `error`)
`GRAPH_SYNC_ENABLED`	`true`	Enable/disable GraphSyncWorker
`RUNTIME_PG_REQUIRED`	`true`	When `NODE_ENV=test`, allow non-PG runtime mode if set to `false`

LLM

Variable	Default	Description
`LLM_PROVIDER`	`gemini`	LLM provider (`gemini`, `qwen`, `openai`)
`GEMINI_API_KEY`	(required)	Gemini API key
`QWEN_API_KEY`	(required)	Qwen API key
`OPENAI_API_KEY`	(required)	OpenAI API key
`EMBEDDING_DIMENSIONS`	`1024`	Embedding vector dimensions (must match the selected model)

💡 For additional settings like BASE_URL and MODEL_NAME, refer to .env.example.

LLM Rate Limiting & Resilience

Variable	Default	Description
`LLM_MAX_CONCURRENT`	`5`	Max concurrent LLM requests
`LLM_MAX_RETRIES`	`3`	Max retries on request failure
`LLM_RETRY_DELAY_MS`	`1000`	Initial retry delay in ms (exponential backoff)
`LLM_REQUESTS_PER_MINUTE`	`30`	Max requests per minute
`LLM_TIMEOUT_MS`	`120000`	Per-request timeout in ms

Neo4j

Variable	Default	Description
`NEO4J_URI`	`bolt://localhost:7687`	Neo4j connection URI
`NEO4J_USER`	`neo4j`	Database username
`NEO4J_PASSWORD`	(required)	Database password
`NEO4J_DATABASE`	`neo4j`	Database name

PostgreSQL

Variable	Default	Description
`PG_HOST`	`localhost`	PostgreSQL host
`PG_PORT`	`5432`	PostgreSQL port
`PG_DATABASE`	`graphen`	PostgreSQL database name
`PG_USER`	`graphen`	PostgreSQL user
`PG_PASSWORD`	`""`	PostgreSQL password
`PG_MAX_CONNECTIONS`	`20`	Connection pool max size
`PG_AUTO_BOOTSTRAP`	`true`	Auto-create runtime role/database and initialize memory schema at startup
`PG_BOOTSTRAP_DATABASE`	`postgres`	Admin connection database used for bootstrap
`PG_BOOTSTRAP_USER`	`""`	Admin user for bootstrap (`PGUSER/USER` fallback when empty)
`PG_BOOTSTRAP_PASSWORD`	`""`	Admin password for bootstrap
`PG_VECTOR_HNSW_M`	`16`	pgvector HNSW `m` parameter
`PG_VECTOR_HNSW_EF_CONSTRUCTION`	`200`	pgvector HNSW build parameter
`PG_VECTOR_EF_SEARCH`	`64`	pgvector query-time `ef_search`

Document Processing

Variable	Default	Description
`MAX_UPLOAD_SIZE`	`52428800`	Max file upload size in bytes (default 50 MB)
`CHUNK_SIZE`	`1500`	Text chunk size (tokens)
`CHUNK_OVERLAP`	`200`	Chunk overlap length (tokens)
`MAX_CHUNKS_PER_DOCUMENT`	`500`	Max chunks per document
`MAX_DOCUMENT_ESTIMATED_TOKENS`	`500000`	Max estimated tokens per document

API Rate Limiting

Variable	Default	Description
`RATE_LIMIT_WINDOW_MS`	`60000`	Rate limit time window in ms
`RATE_LIMIT_MAX`	`100`	Max requests per window

Data Storage Paths

Variable	Default	Description
`CACHE_DIR`	`data/cache`	Parsing intermediate result cache directory

🧪 Test Cases

The project provides sample test cases in the /cases directory for quick experimentation:

Simple Finance: A set of finance-related Markdown documents (relationship identification, market events, transaction networks), suitable for testing graph extraction accuracy and connectivity.

You can upload these files directly into the app to observe the knowledge graph being built.

📜 Available Scripts

Run from the project root:

# Start dev server (frontend + backend in parallel)
pnpm dev

# Build all packages
pnpm build

# Type checking
pnpm typecheck

Operate on individual packages:

# Frontend only
pnpm --filter @graphen/frontend dev

# Backend only
pnpm --filter @graphen/backend dev

# Run backend tests
pnpm --filter @graphen/backend test

# Run backend integration tests
pnpm --filter @graphen/backend test:integration

📄 License

MIT LICENSE

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
cases		cases
packages		packages
scripts		scripts
.env.example		.env.example
.gitignore		.gitignore
README.md		README.md
demo_ui.html		demo_ui.html
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-workspace.yaml		pnpm-workspace.yaml
tsconfig.base.json		tsconfig.base.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Graphen

✨ Features

🏗️ Tech Stack

📁 Project Structure

🚀 Getting Started

1. Prerequisites

2. Install Neo4j

3. Clone & Install Dependencies

4. Configure Environment Variables

5. Start the Development Server

⚙️ Configuration

General

LLM

LLM Rate Limiting & Resilience

Neo4j

PostgreSQL

Document Processing

API Rate Limiting

Data Storage Paths

🧪 Test Cases

📜 Available Scripts

📄 License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Graphen

✨ Features

🏗️ Tech Stack

📁 Project Structure

🚀 Getting Started

1. Prerequisites

2. Install Neo4j

3. Clone & Install Dependencies

4. Configure Environment Variables

5. Start the Development Server

⚙️ Configuration

General

LLM

LLM Rate Limiting & Resilience

Neo4j

PostgreSQL

Document Processing

API Rate Limiting

Data Storage Paths

🧪 Test Cases

📜 Available Scripts

📄 License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages