PDFlens(RAG)

Offline PDF Reading Assistant: convert PDFs to Markdown, build a local vector index, and chat with your documents (with image understanding) on your machine.

Features

- PDF → Markdown via Docling (OCR + tables)
- Optional image understanding (Moondream via Ollama)
- Local FAISS vector index (per folder), incremental updates and persistence
- Simple UI (Tkinter) with collapsible sidebar and favorites
- Fully offline: uses local models via Ollama

Quickstart (macOS)

1) Prerequisites

- Python 3.10+
- Ollama installed and running

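# Install Ollama (macOS)
brew install ollama
# Start the server; it runs in the foreground, so keep this terminal open
# and use a second terminal for the remaining steps
ollama serve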

2) Clone & Setup

git clone https://github.com/<your-user>/PDFassistant.git
cd PDFassistant

python3 -m venv .venv
source .venv/bin/activate

# Core deps
pip install -U pip
pip install customtkinter pillow docling langchain-community langchain-huggingface langchain-ollama faiss-cpu

3) Pull Models

# LLM for answering
ollama pull qwen2.5:7b

# Vision model for image descriptions
ollama pull moondream

4) Run

python3 app.py

Click “Import Folder” and choose a folder containing PDFs (non‑recursive: reads PDFs in the folder root).
The app will:
1. Convert PDFs to Markdown (next to each PDF, same filename with .md)
2. Build/update a FAISS index in <folder>/.rag_storage/
3. Let you ask questions; answers quote snippets and show source files
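
The file layout described in the steps above can be sketched with the standard library alone. This is a minimal illustration, not the code in app.py: the function names are hypothetical, and the rule "reconvert when the .md is missing or older than the PDF" is an assumption about how incremental updates might be decided. Only the non-recursive scan, the sibling .md file, and the .rag_storage folder come from this README.

```python
from pathlib import Path

INDEX_DIR_NAME = ".rag_storage"  # index folder name taken from the steps above


def find_pdfs(folder: Path) -> list[Path]:
    """Return PDFs in the folder root only (non-recursive), sorted by name."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def markdown_path(pdf: Path) -> Path:
    """Markdown output sits next to the PDF, same filename with .md."""
    return pdf.with_suffix(".md")


def needs_conversion(pdf: Path) -> bool:
    """Assumed rule: convert if the .md is missing or older than the PDF."""
    md = markdown_path(pdf)
    return not md.exists() or md.stat().st_mtime < pdf.stat().st_mtime


def plan_import(folder: Path) -> dict:
    """Summarize what an import of `folder` would touch, without doing any work."""
    pdfs = find_pdfs(folder)
    return {
        "index_dir": folder / INDEX_DIR_NAME,
        "to_convert": [p for p in pdfs if needs_conversion(p)],
        "up_to_date": [p for p in pdfs if not needs_conversion(p)],
    }
```

Calling plan_import(Path("/path/to/folder")) before an import shows which PDFs would be converted and where the index would be written. PDFs in subfolders are deliberately ignored, matching the non-recursive behavior described above.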

Tips & Troubleshooting

Chinese text appears garbled in Markdown
- Many PDFs use CID fonts without proper ToUnicode mapping. In such cases, rely on OCR.
- Ensure Docling can use an OCR engine with Chinese language data (e.g., EasyOCR/Tesseract).
- If downloads are blocked, preinstall OCR models or configure Docling OCR options.

No PDFs found
- The app scans only the selected folder’s root for *.pdf. Move files to the root or extend scanning logic.

Model not found
- Make sure ollama serve is running and you’ve pulled qwen2.5:7b and moondream.
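
For the "Model not found" case, one way to check both conditions at once is to ask the local Ollama server which models it has. The sketch below is not part of app.py and its function names are hypothetical. It assumes Ollama is listening on its default address (http://localhost:11434) and uses the /api/tags endpoint, which lists locally available models.

```python
import json
from urllib.request import urlopen

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # assumes the default local address
REQUIRED_MODELS = ("qwen2.5:7b", "moondream")  # the two models pulled in step 3


def missing_models(tags_payload: dict, required=REQUIRED_MODELS) -> list[str]:
    """Return required models that are absent from an /api/tags response."""
    installed = {m.get("name", "") for m in tags_payload.get("models", [])}
    missing = []
    for name in required:
        # A model pulled without a tag, such as "moondream", is listed as "moondream:latest".
        wanted = name if ":" in name else f"{name}:latest"
        if wanted not in installed:
            missing.append(name)
    return missing


def check_ollama(url: str = OLLAMA_TAGS_URL, timeout: float = 3.0) -> list[str]:
    """Fetch the local model list; raises URLError/OSError if the server is unreachable."""
    with urlopen(url, timeout=timeout) as resp:
        return missing_models(json.load(resp))
```

If check_ollama() raises a connection error, the server is probably not running. If it returns a non-empty list, pull the listed models with ollama pull.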
