A multi-tenant private file host that turns into a searchable, askable knowledge base. Drop PDFs, DOCX, Markdown or text in and Filenergy extracts the text, embeds it, and lets you (or your team) chat with the library — with citations, a programmatic API, and Stripe billing built in.
Stack: Flask 3 + SQLite, Voyage embeddings, the Anthropic Claude API for answers, and Stripe Checkout for billing.
Chat with citations + vision attachment
File library with bulk multi-select
Answer-quality dashboard (thumbs feedback over time + triage queue)
- Reranking on top of vector search — every retrieval pulls a wider candidate set, then a fast Claude pass scores each chunk 0–10 against the query. Disable with `FILENERGY_RERANKER=noop`.
- SCIM 2.0 at `/scim/v2/*` — provision and de-provision users from Okta, Workspace, Auth0, Azure AD. Bearer-token auth + `X-Filenergy-Workspace-Slug` header.
- Email-to-ingest — every workspace gets a deterministic `inbox-<slug>-<token>@<domain>` address. Forward an email and the body + attachments become indexed files.
- Soft-delete with undo — bulk-delete shows a toast with an Undo button for 8 seconds. Files only get hard-deleted after a grace window (`flask purge-deleted-files` runs from cron).
- Conversation pinning + archiving — pinned threads float to the top of the sidebar; archived threads are hidden but kept for export.
- File rename, collection share links, command-palette arrow-key navigation, redesigned landing page.
- Command palette (⌘K / Ctrl+K / `/`) — quick-jump to any page, searchable, keyboard-driven.
- Chat power moves — Stop generation mid-stream, Regenerate the last answer, thumbs up/down per assistant message (feeds the eval signal), copy any message, inline rename of conversations, deep-linkable starter questions.
- File library — bulk multi-select with checkbox column, bulk delete + reindex, instant client-side filter, drag-drop upload zone with per-file progress and inline retry.
- Mobile-first nav — hamburger drawer on small screens; chat & settings layouts collapse cleanly on mobile.
- Toasts — flash messages render as dismissable toasts (auto-hide on success, manual dismiss on error). `window.fnToast(msg, kind)` is available everywhere for ad-hoc notifications.
- SVG icon sprite baked into `base.html` — every nav item, button, and empty state uses a consistent line-icon set.
- Workspaces (multi-tenant) — every user gets a personal workspace and can be invited to others. Roles (owner / admin / member). Members, invitations, and a switcher live under `/settings/workspace`.
- Collections — group files into folders ("notebooks") and ask questions scoped to one collection or one file. Per-doc Q&A like NotebookLM / Humata.
- Auto-summary + suggested questions — on first index every file gets a one-line summary and 3 suggested questions, generated by Claude with structured output (`output_config.format`). The "magic moment" of the demo.
- Streaming chat — Server-Sent Events feed tokens to the browser as Claude generates them. Markdown is rendered live; sources cite back to the originals. Multi-turn threads persist with the last 12 messages of context sent back to the model. Supports `collection_id` / `file_id` scoping.
- Background indexing — uploads return immediately; the embedding pipeline runs in a daemon thread.
- Outbound webhooks — customers register URLs + signing secrets, pick the events they care about (`file.uploaded`, `ask.answered`, etc.), and we POST signed JSON. Deliveries are logged with HTTP status + retry count. HMAC-SHA256 in `X-Filenergy-Signature: sha256=...`.
- Audit log UI — `/audit/` shows every event with type/user/date filters, paginates, and exports CSV. Required for any enterprise sale.
- Email service — pluggable adapter (`log` default, `smtp` for production). Used for invitations; ready for password resets and notification flows.
- Plan-based quotas — `free` / `pro` / `team` tiers with limits on storage, file count, members, and questions per month.
- Stripe billing — Checkout sessions, webhook reconciliation, and per-workspace customer/subscription state.
- API keys + `/api/v1` — bearer-token auth to `POST /api/v1/files`, `POST /api/v1/ask`, `GET /api/v1/files`.
- Public share links — per-file unguessable URLs with optional TTL and download caps.
- Sliding-window rate limit on `/ask` (per user) layered on top of the same Event log.
- Event log — every meaningful action lands in a row. Substrate for rate-limiter, billing usage, audit log, and webhooks.
- Health endpoints — `/healthz` (cheap liveness) and `/readyz` (DB + config check) for k8s and uptime monitors.
- Settings UI — profile, security, workspace, API keys, webhooks, audit log, billing.
- Indexing badges — the file list shows `indexed` / `pending` / `error` per row with one-click reindex.
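
A receiver of the outbound webhooks would typically recompute the HMAC and compare it with the header. The sketch below assumes the signature is the hex HMAC-SHA256 of the raw request body keyed with the signing secret; the function names are illustrative and not part of Filenergy.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Build a header value in the `sha256=<hex>` form.

    Assumption: the digest covers the raw body bytes and is hex-encoded.
    """
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Compare the received header against a fresh signature in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, header_value)
```

Verify against the exact bytes received, before any JSON parsing, since re-serialising the payload can change whitespace and break the digest.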
- 2FA (TOTP) — `pyotp` + QR code. 8 single-use recovery codes generated on enable. Login flow defers `login_user` until the OTP succeeds, so a stolen password alone is not enough.
- WebAuthn / passkeys — full FIDO2 ceremony backed by `py_webauthn`: `navigator.credentials.create` / `get` from the browser, challenge stash in Flask session, attestation + assertion verification server-side. Also ships a stub fallback for tests / pre-JS deploys. Multiple keys per user; per-key labels and last-used timestamps in security settings.
- Workspace-wide 2FA enforcement — owners can flip "Require 2FA"; any member without TOTP or a passkey is bounced to `/settings/security` on the next request until they enroll.
- Google SSO — `Authlib` OpenID Connect. Auto-creates the user (with default workspace) on first login; links to existing accounts by email. Hidden when `GOOGLE_OAUTH_CLIENT_ID` is unset.
- Self-serve account deletion — wipes owned workspaces, files, conversations, API keys, and memberships. Anonymizes events to keep audit trails. Required for GDPR/CCPA compliance.
- Bulk file delete — `POST /file/bulk_delete/` with `ids[]`.
- Conversation export — `GET /ask/c/<id>/export.md` returns a Markdown transcript with sources.
- Audit log CSV export — `GET /audit/export.csv` (admin only).
- Workspace export (GDPR-grade) — `GET /w/export` returns a ZIP with every file's bytes, conversation transcripts as Markdown, members CSV, events CSV, and metadata JSON. Owner / admin only.
- Personal data export — `GET /settings/account/export` returns a ZIP with the user's own data + every workspace they own (each as a nested ZIP). Self-serve GDPR portability.
- Weekly email digests — Monday recap of last week's uploads, asks, and new members per workspace. Per-user opt-out at `/settings/workspace`. Wired as `flask send-digests` for cron / k8s job runners.
- `/healthz` — liveness, no DB hit.
- `/readyz` — DB `SELECT 1` + reports configuration of every external dependency.
- `/metrics` — Prometheus exposition format. Per-endpoint request counters and a duration histogram with the standard SRE buckets.
- Structured request logs — every response emits an `INFO` log with `endpoint`, `status`, `duration_ms`, `user_id`, `workspace_id`. Pipe through any log shipper to get user-facing analytics for free.
- 404 / 500 error pages so the brand stays intact when something breaks.
- OpenAPI 3 spec at `/api/v1/openapi.json`.
- Swagger UI rendered at `/api/v1/docs` via the public CDN.
- Dockerfile with a multi-stage build: a Node 20 stage compiles Tailwind into `static/css/app.css`, the Python 3.11 stage copies it into the runtime image. `base.html` links the bundle when present and falls back to the Play CDN in dev. CSP tightens automatically (drops `'unsafe-eval'` and the CDN host) when the bundle is in use.
- `docker-compose.yml` with a persistent SQLite volume.
- Background jobs with retries — webhook deliveries (5xx + network errors), digest emails (SMTP failures). Exponential backoff (2/4/8/16s) via `jobs.enqueue(..., retries=4)`. RQ-backed when Redis is configured; threading fallback otherwise.
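
The 2/4/8/16s schedule can be illustrated with a small retry loop. This is a minimal sketch of exponential backoff, not the code in `jobs.py`; `backoff_delays` and `run_with_retries` are hypothetical names, and the injectable `sleep` is there only so the behaviour is easy to check without waiting.

```python
import time


def backoff_delays(retries: int, base: float = 2.0) -> list:
    """Delays in seconds before each retry: base, 2*base, 4*base, ..."""
    return [base * (2 ** i) for i in range(retries)]


def run_with_retries(fn, retries: int = 4, sleep=time.sleep):
    """Call fn(); on failure wait per the backoff schedule and try again.

    Re-raises the last exception once all retries are used up.
    """
    delays = backoff_delays(retries)
    attempt = 0
    while True:
        try:
            return fn()
        except Exception:
            if attempt >= retries:
                raise
            sleep(delays[attempt])
            attempt += 1
```

With `retries=4` a job runs at most five times, and the total wait before giving up is 30 seconds.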
- URL ingestion — paste a URL on the upload page; we fetch, strip HTML, extract title/text, and index it like any other file. Caps page size at 10 MB and rejects non-text content types.
- OCR fallback — when local extractors return nothing (scanned PDFs, screenshots), we send the file to Claude as a `document` or `image` block and ask it to transcribe verbatim. Image MIME types (png/jpg/gif/webp) are now first-class indexable as a result.
- Onboarding wizard — new users land on `/onboarding/` after registration: name your workspace, optionally seed three sample files so the demo is non-empty.
- `/dashboard/` (owner/admin only) — stat cards (files, indexed, collections, conversations, messages, members, API keys, plan), a 30-day uploads timeseries, a 30-day asks timeseries, and the top-cited files. Plain inline SVG bars — no chart libraries.
- Alembic via Flask-Migrate — `flask db upgrade` applies pending migrations in production. Local dev / tests still use `db.create_all()` so the suite is self-contained. Set `FILENERGY_SKIP_CREATE_ALL=1` when running `flask db migrate` against an empty schema.
- `/ask/c/<id>/export.md` — Markdown
- `/ask/c/<id>/export.pdf` — PDF (via `fpdf2`, no Cairo dependency)
- `/ask/c/<id>/export.docx` — DOCX (via `python-docx`, already a dep)
Pluggable third-party sources via filenergy/services/connectors.py.
Manage connections at /connectors/. Each connector implements
authorize_url / complete_oauth / sync and runs on demand from the
UI or via the scheduler.
- Google Drive — full OAuth dance, refresh tokens, native Google docs exported as text/CSV; PDFs and `text/*` MIME types pulled directly.
- Notion — OAuth + `search` + recursive block flattening into Markdown. Stores each page as `<title>.md`.
- Dropbox — OAuth (offline access), token refresh, `list_folder` + `download` for indexable extensions (PDF, DOCX, MD, TXT, CSV, JSON, HTML, log).
- Slack — OAuth read-only scopes, fetches the most recent channels and writes per-channel transcripts as `slack-<channel>.txt`.
filenergy/services/connector_scheduler.py runs a single daemon thread
that wakes every FILENERGY_SYNC_INTERVAL_MIN minutes (default 60),
walks every ConnectorAccount whose last_synced_at is stale, and
re-runs its sync() as the workspace owner. Errors land in
account.last_error instead of crashing the loop. Disabled in TESTING
and when FILENERGY_SYNC_SCHEDULER=false.
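
One tick of that loop can be sketched as follows. This is an illustration under stated assumptions rather than the code in `connector_scheduler.py`: accounts are plain dicts with hypothetical `sync`, `last_synced_at`, and `last_error` keys, and a failed sync is assumed to leave `last_synced_at` unchanged so the account is retried on the next tick.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(last_synced_at: Optional[datetime], now: datetime,
             interval_min: int = 60) -> bool:
    """An account is stale if it never synced or last synced >= interval ago."""
    if last_synced_at is None:
        return True
    return now - last_synced_at >= timedelta(minutes=interval_min)


def run_tick(accounts: list, now: datetime, interval_min: int = 60) -> None:
    """Sync every stale account; record failures instead of raising."""
    for account in accounts:
        if not is_stale(account["last_synced_at"], now, interval_min):
            continue
        try:
            account["sync"]()
            account["last_synced_at"] = now
            account["last_error"] = None
        except Exception as exc:  # one bad account must not stop the loop
            account["last_error"] = str(exc)
```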
- `browser-extension/` — Chrome MV3 popup. Paste your server URL + an API key, click "Save page". POSTs the current tab's URL to `/file/from_url/` with `Authorization: Bearer …`. The endpoint accepts either session cookies or API keys.
- `filenergy/services/jobs.py` — `enqueue(target_path, *args)`.
- Default backend is `thread`: a daemon thread per job (zero-config).
- Set `FILENERGY_JOBS_BACKEND=rq` + `REDIS_URL=redis://...` to push to a Redis queue (run an `rq` worker against it). Falls back to thread when the dep or URL is missing.
- Tests force synchronous execution via `app.config["TESTING"]`.
- DB engine is whatever URI you put in `FILENERGY_DB_URI`. Set `postgresql://user:pass@host/db`, then `flask db upgrade`.
- For real-scale RAG retrieval, install `pgvector>=0.2` and run:

  ```python
  from filenergy.services import pgvector_store

  pgvector_store.enable_pgvector(dim=512)   # creates extension + col + ivfflat index
  pgvector_store.reembed_existing()         # back-fill from JSON column
  ```

  After that, `embeddings.search` automatically uses an `ORDER BY cosine` SQL query instead of pulling every chunk into Python. SQLite callers stay on the JSON+numpy path with no code change.
- `filenergy/services/saml_sso.py` is a real `python3-saml` wrapper (not a stub). Set `SAML_ENABLED=true`, `SAML_IDP_METADATA_URL`, `SAML_SP_ENTITY_ID`, and (optionally) `SAML_SP_X509_CERT` / `SAML_SP_PRIVATE_KEY`. The Dockerfile installs the required `libxml2` / `xmlsec1` system libraries.
- `/saml/login` redirects the browser to the IdP.
- `/saml/acs` validates the SAMLResponse, provisions the user (or links by email), creates a default workspace, and logs them in.
- `/saml/status` reports configuration + whether `python3-saml` is importable, so you can verify env wiring before flipping it on.
- `services/crypto.py` exposes a Fernet-backed `EncryptedText` SQLAlchemy type. Sensitive columns — `File.text_content`, `Chunk.embedding`, `ConnectorAccount.access_token` / `refresh_token`, `User.totp_secret` — go through it.
- Activate by setting `FILENERGY_ENCRYPTION_KEY` (mint one with `python manage.py generate-encryption-key`). Without the key, columns round-trip as plaintext — backwards compatible with old data and dev environments.
- New writes get an `enc:` prefix; old plaintext rows coexist with new encrypted ones. Run `python manage.py reencrypt` once after enabling to back-fill.
Every assistant turn now creates MessageCitation rows linking
the Message to each retrieved Chunk + the cosine score. The dashboard
surfaces a "Most-cited chunks" panel with the snippet inline, and the
"Most-cited files" panel now uses real citation counts instead of the
prior coarse ask.answered event proxy.
ConnectorAccount.sync_cursor is the resume token; each connector uses
its native cursor primitive:
- Drive — `q=modifiedTime > '<RFC3339>'`, cursor = newest `modifiedTime` from the latest page.
- Notion — `start_cursor` / `next_cursor`. Cleared when the result set is exhausted so the next tick starts over.
- Dropbox — first call hits `/2/files/list_folder`, subsequent calls hit `/list_folder/continue` with the saved cursor (delta-native).
- Slack — `oldest=<ts>`. New deltas are appended to the existing per-channel transcript file rather than written as fresh files.
- CSRF protection on every browser POST via Flask-WTF. The `/api/v1`, `/webhooks/stripe`, and `/saml/acs` blueprints opt out because they authenticate with Bearer / HMAC / SAML signature, not session cookies. JSON Ajax requests pick up the token from a `<meta>` tag and send it as `X-CSRFToken`.
- Failed-login rate limit keyed on the email being attempted. Configurable via `FILENERGY_LOGIN_RATE_LIMIT` / `FILENERGY_LOGIN_RATE_WINDOW`. Bucket per email so one attacker can't lock out a real user.
- Security headers on every response: `Strict-Transport-Security`, `X-Content-Type-Options`, `X-Frame-Options: DENY`, `Referrer-Policy`, `Permissions-Policy`, and a `Content-Security-Policy` that allows Bootstrap-3 inline scripts + the Swagger UI CDN. Set `FILENERGY_DISABLE_HSTS=true` for plain-HTTP dev deploys.
- Session management via a `UserSession` row per active browser. `/settings/security` lists them with browser / IP / last-seen, lets the user revoke any one, or "Log out of all other sessions". The middleware refuses cookies whose row was revoked.
- `POST /ask/c/<id>/share` (or `POST /api/v1/conversations/<id>/share-links`) mints a public read-only URL at `/sc/<token>`.
- TTL via `ttl_hours`; revocable; `view_count` increments on every successful landing render.
- Indexer now stores `char_offset_start` / `char_offset_end` on every Chunk so we can recover the exact span inside `File.text_content`.
- `/file/chunk/<id>/context` returns the cited span + ~280 chars of surrounding context. The dashboard's "Most-cited chunks" panel uses it to expand a citation inline with the span highlighted.
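
To show how stored offsets make the span recoverable, here is a minimal sketch of a fixed-size chunker that records offsets, plus a context lookup. It assumes the 1200/150 chunk settings are a character size and a character overlap and that the end offset is exclusive; `ChunkSpan`, `chunk_with_offsets`, and `context_window` are hypothetical names, not the indexer's API.

```python
from dataclasses import dataclass


@dataclass
class ChunkSpan:
    text: str
    char_offset_start: int
    char_offset_end: int  # exclusive, so text == source[start:end]


def chunk_with_offsets(text: str, size: int = 1200, overlap: int = 150) -> list:
    """Split text into overlapping windows and remember where each one came from."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    spans = []
    step = size - overlap
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        spans.append(ChunkSpan(text[start:end], start, end))
        if end == len(text):
            break
        start += step
    return spans


def context_window(text: str, span: ChunkSpan, padding: int = 280) -> str:
    """Return the cited span plus up to `padding` characters on each side."""
    lo = max(0, span.char_offset_start - padding)
    hi = min(len(text), span.char_offset_end + padding)
    return text[lo:hi]
```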
- `ApiKey.scopes` (e.g. `["files:read", "ask:write"]`) restricts which endpoints a key can hit. Empty = full access (back-compat).
- The `/settings/keys` mint form has checkboxes for every recognised scope. `files:write` implies `files:read`.
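
The scope rules above (empty list means full access, `files:write` implies `files:read`) could be checked roughly like this. It is a sketch of the described behaviour, not the actual enforcement code; `IMPLIED_SCOPES` and `key_allows` are hypothetical names.

```python
# Scopes that are granted automatically by holding another scope.
IMPLIED_SCOPES = {"files:write": {"files:read"}}


def effective_scopes(scopes: list) -> set:
    """Expand a key's stored scopes with everything they imply."""
    granted = set(scopes)
    for scope in scopes:
        granted |= IMPLIED_SCOPES.get(scope, set())
    return granted


def key_allows(scopes: list, required: str) -> bool:
    """Empty scope list keeps legacy full access; otherwise require a match."""
    if not scopes:
        return True
    return required in effective_scopes(scopes)
```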
In addition to /files and /ask:
- `GET/POST/DELETE /api/v1/files/<id>`
- `GET/POST/DELETE /api/v1/collections`, `PUT /api/v1/collections/<id>/files/<id>`
- `GET /api/v1/conversations`, `GET/DELETE /api/v1/conversations/<id>`
- `POST /api/v1/files/<id>/share-links`, `DELETE /api/v1/share-links/<id>`
- `POST /api/v1/conversations/<id>/share-links`
- `GET/POST/DELETE /api/v1/webhooks`
- `GET /api/v1/members`, `POST /api/v1/invitations`
Every endpoint enforces the relevant scope.
718 tests, 97.3% coverage (pytest + pytest-cov).
| Capability | Filenergy | NotebookLM | Humata | Mendable | Glean |
|---|---|---|---|---|---|
| Self-host (Docker) | ✅ | ❌ | ❌ | ❌ | ❌ |
| Multi-tenant workspaces | ✅ | ❌ | ❌ | ✅ | ✅ |
| Collections / per-doc chat | ✅ | ✅ | ✅ | ❌ | ❌ |
| Streaming RAG with citations | ✅ | ✅ | ✅ | ✅ | ✅ |
| Auto-summary + suggested Qs | ✅ | ✅ | ✅ | ❌ | ❌ |
| OCR for scanned PDFs / images | ✅ | ✅ | ✅ | ❌ | ❌ |
| URL ingestion | ✅ | ✅ | ❌ | ❌ | ❌ |
| Conversation export (MD/PDF/DOCX) | ✅ | partial | ❌ | ❌ | ❌ |
| GDrive / Notion / Dropbox / Slack connectors | ✅ | ❌ | ❌ | ✅ | ✅ |
| Connector sync scheduler | ✅ | ❌ | ❌ | ✅ | ✅ |
| Browser extension | ✅ | ❌ | ❌ | ❌ | ❌ |
| Background job queue (RQ) | ✅ | n/a | n/a | n/a | n/a |
| pgvector retrieval | ✅ | n/a | n/a | n/a | n/a |
| Outbound webhooks | ✅ | ❌ | ❌ | ✅ | ✅ |
| Audit log UI + CSV export | ✅ | ❌ | ❌ | ❌ | ✅ |
| Activity dashboard | ✅ | ❌ | ❌ | ✅ | ✅ |
| Stripe-billed plan tiers | ✅ | n/a | ✅ | ✅ | ✅ |
| Public REST API + OpenAPI | ✅ | ❌ | ✅ | ✅ | ✅ |
| Public share links with TTL | ✅ | ❌ | ❌ | ❌ | ❌ |
| 2FA (TOTP) | ✅ | ❌ | ❌ | ❌ | ✅ |
| SSO (Google OIDC) | ✅ | ✅ | ✅ | ✅ | ✅ |
| Self-serve account deletion | ✅ | ✅ | ❌ | ❌ | ❌ |
| Prometheus `/metrics` | ✅ | ❌ | ❌ | ❌ | ✅ |
| Onboarding wizard | ✅ | ✅ | ✅ | ❌ | ✅ |
| Alembic migrations | ✅ | n/a | n/a | n/a | n/a |
| SAML SSO | ✅ | ❌ | ❌ | ✅ | ✅ |
```bash
pip install -r requirements.txt
```

```bash
# Required for /ask
ANTHROPIC_API_KEY=sk-ant-...
# Required for indexing + retrieval
VOYAGE_API_KEY=pa-...
# Stripe (optional — billing UI degrades gracefully without it)
STRIPE_SECRET_KEY=sk_test_...
STRIPE_PUBLIC_KEY=pk_test_...
STRIPE_WEBHOOK_SECRET=whsec_...
STRIPE_PRICE_PRO=price_...
STRIPE_PRICE_TEAM=price_...
FILENERGY_BASE_URL=https://your-domain.com
# Google OAuth (optional — "Sign in with Google" hides without these)
GOOGLE_OAUTH_CLIENT_ID=...
GOOGLE_OAUTH_CLIENT_SECRET=...
# Email (optional — log adapter prints to stderr; smtp sends real mail)
FILENERGY_EMAIL_ADAPTER=log        # or smtp
FILENERGY_EMAIL_FROM=filenergy@yourdomain.com
FILENERGY_SMTP_HOST=smtp.example.com
FILENERGY_SMTP_PORT=587
FILENERGY_SMTP_USER=...
FILENERGY_SMTP_PASSWORD=...
FILENERGY_SMTP_TLS=true
# Misc
FILENERGY_SECRET_KEY=change-me
FILENERGY_DB_PATH=filenergy.db
FILENERGY_UPLOAD_DIR=files
CLAUDE_MODEL=claude-opus-4-7
VOYAGE_EMBED_MODEL=voyage-3-lite
FILENERGY_SYNC_INDEXING=false      # true = inline indexing (helpful in tests)
FILENERGY_ASK_RATE_LIMIT=30        # per window
FILENERGY_ASK_RATE_WINDOW=60       # seconds
```

```bash
python manage.py
```

For the Stripe webhook locally:

```bash
stripe listen --forward-to localhost:5000/webhooks/stripe
```

```bash
python manage.py create-superuser admin@example.com 'a-good-password'
python manage.py reindex
```

| | Free | Pro $19/mo | Team $99/mo |
|---|---|---|---|
| Questions / month | 100 | 2,000 | 20,000 |
| Storage | 100 MB | 5 GB | 100 GB |
| Files | 25 | 1,000 | 25,000 |
| Members | 1 | 1 | 25 |
Adjust in filenergy/settings.py:PLAN_LIMITS.
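
For illustration, the table could be expressed as a limits mapping with a pre-write check. The numbers come from the table above; the dictionary shape, the metric key names, binary MB/GB units, and the `would_exceed` helper are assumptions, not the actual layout of `PLAN_LIMITS`.

```python
MB = 1024 * 1024
GB = 1024 * MB

# Hypothetical shape; values mirror the plan table above.
PLAN_LIMITS = {
    "free": {"asks_per_month": 100, "storage_bytes": 100 * MB, "files": 25, "members": 1},
    "pro": {"asks_per_month": 2_000, "storage_bytes": 5 * GB, "files": 1_000, "members": 1},
    "team": {"asks_per_month": 20_000, "storage_bytes": 100 * GB, "files": 25_000, "members": 25},
}


def would_exceed(plan: str, metric: str, current: int, delta: int = 1) -> bool:
    """True if adding `delta` to the current usage would pass the plan's limit."""
    return current + delta > PLAN_LIMITS[plan][metric]
```

A caller would run the check before accepting an upload or a question and answer with HTTP 402 when it returns true.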
```bash
# Mint a key in /settings/keys, then:
curl -X POST http://localhost:5000/api/v1/ask \
  -H "Authorization: Bearer $FILENERGY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"question": "What do my contracts say about termination?"}'

curl -X POST http://localhost:5000/api/v1/files \
  -F "files[]=@./report.pdf" \
  -H "Authorization: Bearer $FILENERGY_TOKEN"

curl -H "Authorization: Bearer $FILENERGY_TOKEN" \
  http://localhost:5000/api/v1/files
```

Errors:
| Code | Meaning |
|---|---|
| 401 | Invalid / missing API key |
| 402 | Plan quota exceeded (file count, storage, asks/month) |
| 429 | Rate limit hit (per-user sliding window on /ask) |
| 503 | Anthropic / Voyage / Stripe not configured on server |
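
The same `/api/v1/ask` call can be made from Python with only the standard library. This is a minimal sketch: `build_ask_request` and `describe_error` are hypothetical helpers, and the response body format is not assumed here.

```python
import json
import urllib.request

# Status meanings taken from the error table above.
ERROR_MEANINGS = {
    401: "Invalid / missing API key",
    402: "Plan quota exceeded",
    429: "Rate limit hit",
    503: "Upstream provider not configured on server",
}


def build_ask_request(base_url: str, token: str, question: str) -> urllib.request.Request:
    """Prepare the POST /api/v1/ask request without sending it."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/ask",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def describe_error(status: int) -> str:
    """Map an HTTP status to the meaning documented for the API."""
    return ERROR_MEANINGS.get(status, "Unexpected status")
```

Sending the request with `urllib.request.urlopen(req)` raises `urllib.error.HTTPError` on the statuses above; its `code` attribute can be passed to `describe_error`.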
```
upload → text extraction → chunk (1200/150) → Voyage embed
                                                   │
                                                   ▼
                                                SQLite

ask → Voyage embed → cosine top-K (workspace scope) → Claude (RAG, streaming)
                                                          │
                                                          ▼
                                                    SSE → browser
                                                    persisted to thread

billing → webhook → workspace.plan flip → next quota check sees the new plan
```
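
The "cosine top-K (workspace scope)" step can be sketched in pure Python. This stands in for the JSON+numpy path and is not the code in `embeddings.py`; the row shape (dicts with `chunk_id`, `workspace_id`, and a JSON-encoded `embedding`) is an assumption made for the example.

```python
import json
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list, rows: list, workspace_id: int, k: int = 5) -> list:
    """Score only the workspace's chunks and return the k best (score, chunk_id)."""
    scored = []
    for row in rows:
        if row["workspace_id"] != workspace_id:
            continue  # tenant boundary: never score another workspace's chunks
        embedding = json.loads(row["embedding"])
        scored.append((cosine(query_vec, embedding), row["chunk_id"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Every chunk in the workspace is decoded and scored on each query, which is why this approach only suits small libraries.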
- Tenant boundary is `workspace_id` on `File`, `Conversation`, `Event`, `ApiKey`. All queries filter on it.
- Embeddings are JSON-encoded float arrays in SQLite. Fine up to ~10K chunks per workspace; swap in `sqlite-vec` or pgvector for more.
- Background indexing runs in a daemon thread with its own app context. Tests force it inline via `app.config["TESTING"]`.
- Rate limiter is a sliding window over the `Event` table — same source of truth as billing usage gauges.
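
The sliding-window idea can be shown with an in-memory stand-in for the `Event` table. This sketch uses the default 30 requests per 60 seconds from the configuration above; the class and its per-user timestamp lists are illustrative, not the DB-backed implementation in `rate_limit.py`.

```python
class SlidingWindowLimiter:
    """Allow at most `limit` hits per user within the trailing window."""

    def __init__(self, limit: int = 30, window_seconds: float = 60.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self.events = {}  # user_id -> list of hit timestamps (stand-in for Event rows)

    def hit(self, user_id: str, now: float) -> bool:
        """Record a hit and return True if allowed, or return False if over the limit."""
        cutoff = now - self.window_seconds
        # Keep only hits that still fall inside the window.
        recent = [t for t in self.events.get(user_id, []) if t > cutoff]
        allowed = len(recent) < self.limit
        if allowed:
            recent.append(now)
        self.events[user_id] = recent
        return allowed
```

Because the window slides with each request, a user regains capacity as soon as their oldest hit ages out rather than at a fixed reset time.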
filenergy/
├── settings.py # env config + PLAN_LIMITS
├── __init__.py # app, db, login_manager
├── middleware.py # g.user / g.workspace
├── admin.py # superuser-only Flask-Admin
├── models/
│ └── __init__.py # User, Workspace, WorkspaceMember,
│ # WorkspaceInvitation, File, Chunk,
│ # Conversation, Message, Event, ApiKey,
│ # ShareLink
├── services/
│ ├── base.py # generic SQLAlchemy service
│ ├── user.py # auth + register/login (auto-creates ws)
│ ├── workspaces.py # tenancy, invitations, switching
│ ├── api_keys.py # token mint/verify/revoke
│ ├── share_links.py # public share TTL + cap
│ ├── billing.py # Stripe + plan-quota checks
│ ├── file.py # upload, async index, search
│ ├── extraction.py # pdf/docx/txt extractors + chunker
│ ├── embeddings.py # Voyage client + cosine retrieval
│ ├── chat.py # RAG + Claude streaming
│ ├── conversations.py # threads + messages
│ ├── events.py # analytics + audit
│ └── rate_limit.py # DB-backed sliding window
├── views/
│ ├── index.py
│ ├── user.py
│ ├── file.py # CRUD, reindex, share
│ ├── ask.py # JSON ask + SSE stream
│ ├── workspace.py # switch, invite, accept, members
│ ├── settings_views.py # profile, keys, ws, billing
│ ├── share.py # /s/<token>
│ ├── api_v1.py # bearer-auth API
│ └── billing.py # /webhooks/stripe
├── templates/ # bootstrap-3 templates
└── static/
tests/ # pytest suite, 272 tests, 97.6% coverage
```bash
pip install pytest pytest-cov stripe
python -m pytest              # run the suite
python -m pytest --cov        # with coverage report
```

Stubs in `tests/conftest.py` replace Voyage and Anthropic clients; Stripe is faked per-test via `sys.modules`. Zero network.

```bash
docker compose up --build
```

Then point a reverse proxy (nginx, Caddy, Cloudflare) at port 5000 with TLS terminated.
- Per-connector OAuth scope tightening (Slack still asks `groups:*` scopes; Notion + Dropbox could be narrower).
- KEK rotation via `MultiFernet` so secrets can rotate without a full re-encrypt pass.
- Citation drill-through — click a chunk on the dashboard and jump to the source paragraph in the file detail view.
- Cron / RQ-driven scheduler (today's daemon thread is fine for one Filenergy process; scale-out needs a real worker).