Document intelligence and OSINT platform for local corpus analysis.
Ingest documents, extract entities, build a knowledge graph, detect communities, generate LLM summaries, and export investigation bundles — all running locally, no cloud required.
- Document ingestion — txt, json, markdown, structured or unstructured
- Entity extraction — people, orgs, locations, dates with relationship mapping
- Knowledge graph — force-directed visualization, typed edges, community detection
- Timeline engine — chronological event reconstruction across documents
- Narrative analysis — LLM-assisted summarization via local or API-connected models
- Investigation bundles — export/import .osight packages for sharing findings
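To make the knowledge graph and community detection features concrete, here is a minimal sketch of the general idea, not the project's actual implementation. It links entities that appear in the same document and groups them with plain connected components, a much simpler stand-in for a real community detection algorithm. The function names and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(doc_entities):
    """Map each unordered entity pair to the number of documents mentioning both."""
    edges = defaultdict(int)
    for entities in doc_entities.values():
        # De-duplicate and sort so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(entities)), 2):
            edges[(a, b)] += 1
    return dict(edges)


def connected_components(edges):
    """Group entities that are linked by any chain of edges."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical extraction output: document id -> entities found in it.
docs = {
    "memo_01": ["Alice Park", "Northwind Ltd"],
    "memo_02": ["Northwind Ltd", "Lisbon"],
    "memo_03": ["Bob Reyes", "Harbor Trust"],
}
graph = build_cooccurrence_graph(docs)
communities = connected_components(graph)
```

With this sample input, the first two memos share an entity and so fall into one group, while the third memo forms a group of its own.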
FastAPI · SQLAlchemy · SQLite (FTS5) · Python 3.12 · Docker
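For readers unfamiliar with SQLite FTS5, the sketch below shows full-text search using only Python's standard sqlite3 module. The table name, column names, and sample rows are assumptions for illustration and are not taken from the project's schema. It also assumes the local SQLite build has FTS5 compiled in, which is common but not guaranteed.

```python
import sqlite3


def search_titles(term):
    """Return titles of documents whose indexed text matches `term`."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema: one FTS5 virtual table holding title and body.
        conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
        conn.executemany(
            "INSERT INTO docs (title, body) VALUES (?, ?)",
            [
                ("memo_01", "Wire transfer approved by the harbor office"),
                ("memo_02", "Quarterly staffing report for the Lisbon branch"),
            ],
        )
        # MATCH runs the full-text query; ORDER BY rank sorts best match first.
        rows = conn.execute(
            "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank",
            (term,),
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


hits = search_titles("transfer")
```

An in-memory database keeps the example self-contained; a real deployment would point at a file on disk.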
cp .env.example .env # add your API key if using LLM features
bash start_services.sh
The API runs on :8000, semantic service on :8010.
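After starting the services, a quick way to confirm both ports are accepting connections is a plain TCP check. This is a sketch using only the standard library. It assumes the services listen on localhost and it does not call any project endpoint, since none are documented here.

```python
import socket

# Ports as stated above; the host is an assumption.
SERVICES = {"API": 8000, "semantic service": 8010}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if port_open("127.0.0.1", port) else "not reachable"
        print(f"{name} on :{port} is {state}")
```

A successful connection only shows that something is listening on the port, not that the service behind it is healthy.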
The examples/ directory contains synthetic demo datasets:
- opensight_synthetic_legal_dataset/ — 400 structured legal documents
- opensight_demo_wow_json/ — 5,000 document JSON corpus
- opensight_demo_wow_txt/ — 5,000 document plain text corpus
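Before ingesting one of these datasets, it can help to check what it contains. The helper below is a hypothetical convenience, not part of the project. It walks a directory and counts files by extension, making no assumption about the internal layout of the datasets.

```python
from collections import Counter
from pathlib import Path


def count_documents(dataset_dir):
    """Count files under dataset_dir, grouped by lowercase file extension."""
    root = Path(dataset_dir)
    return Counter(p.suffix.lower() for p in root.rglob("*") if p.is_file())
```

For example, `count_documents("examples/opensight_demo_wow_json")` returns a Counter keyed by extension, which makes it easy to compare a dataset against the document counts listed above.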
See requirements.txt. Use a virtualenv.
apps/api/            FastAPI backend + routes
modules/             entity extraction, graph engine, timeline, analysis
semantic_service.py  semantic search microservice
ui/                  investigation dashboard
examples/            demo datasets
tests/               test suite
LANimals collective // badBANANA
