Document Data Extractor

DocExtract extracts structured data from documents (images and PDFs) using Google's Gemini multimodal AI. Users define custom extraction schemas and validation rules for each document type.

Technical Stack

Backend: Python, FastAPI
AI Model: Google Gemini 2.0 Flash (via google-generativeai)
Frontend: HTML5, Vanilla JavaScript, TailwindCSS
Persistence: JSON-based storage for configuration
Validation: Regex, Fuzzy Matching (TheFuzz), and LLM-based validation
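
As a rough illustration of the two non-LLM validation styles listed above, the sketch below checks one value against a regex and matches another against a list of allowed values. It uses the standard library's difflib.SequenceMatcher as a stand-in for TheFuzz so that it runs without extra dependencies. The function names, the invoice-number pattern, and the 0.8 threshold are assumptions made for illustration, not DocExtract's actual rules.

```python
import re
from difflib import SequenceMatcher
from typing import List, Optional


def validate_regex(value: str, pattern: str) -> bool:
    """Return True only when the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


def best_fuzzy_match(value: str, allowed: List[str], threshold: float = 0.8) -> Optional[str]:
    """Return the closest allowed value, or None if nothing reaches the threshold.

    SequenceMatcher.ratio() lies in [0, 1]; TheFuzz reports comparable
    scores on a 0-100 scale, so a real rule would likely use e.g. 80.
    """
    if not allowed:
        return None
    scored = [
        (SequenceMatcher(None, value.lower(), candidate.lower()).ratio(), candidate)
        for candidate in allowed
    ]
    score, candidate = max(scored)
    return candidate if score >= threshold else None


# Hypothetical values, as they might come back from an extraction
print(validate_regex("INV-2024-0042", r"INV-\d{4}-\d{4}"))
print(best_fuzzy_match("Acme Corp", ["ACME Corp.", "Globex"]))
```

Fuzzy matching is useful here because text read from scans often differs slightly from the reference value (case, punctuation, a misread character), which an exact comparison would reject.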

Usage

Prerequisites:
- Python 3.8+
- A Google Cloud API key with access to the Gemini API.

Installation:

cd backend
pip install -r requirements.txt

Configuration:
- Create a .env file in the backend directory.
- Add your API key: GOOGLE_API_KEY=your_api_key_here
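
To show how a key stored this way can be picked up at startup, here is a minimal sketch. Projects often use the python-dotenv package for this; the version below is a standard-library stand-in, and both helper names (load_env_file, get_api_key) are hypothetical rather than taken from the DocExtract code.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path: str = ".env") -> str:
    """Prefer a real environment variable, then fall back to the .env file."""
    key = os.environ.get("GOOGLE_API_KEY") or load_env_file(path).get("GOOGLE_API_KEY")
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to backend/.env")
    return key
```

Failing early with a clear message when the key is missing tends to be easier to debug than an authentication error on the first extraction request.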

Running the Application:
```
uvicorn main:app --reload
```
- Open your browser and navigate to http://127.0.0.1:8000.

Using the App:
- Go to Configuration to create a new "Document Type" (e.g., Invoice, ID Card).
- Define fields to extract (e.g., "Total Amount", "Name") and add descriptions to help the AI.
- Add validation rules (Regex, etc.) to ensure data quality.
- Go to Dashboard, select your document type, and upload a file to extract data.
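
The steps above can be pictured as a small data model: a document type holds named fields, each with a description for the AI and an optional validation rule, and the whole definition is stored as JSON. The sketch below is an assumption about how such a configuration could look; the class names, the JSON layout, and the prompt wording are hypothetical and not DocExtract's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class FieldSpec:
    name: str
    description: str              # hint that helps the model find the value
    regex: Optional[str] = None   # optional validation rule


@dataclass
class DocumentType:
    name: str
    fields: List[FieldSpec] = field(default_factory=list)


def build_prompt(doc_type: DocumentType) -> str:
    """Turn the field definitions into a plain-text extraction instruction."""
    lines = [f"Extract the following fields from this {doc_type.name} and reply with JSON only:"]
    for spec in doc_type.fields:
        lines.append(f'- "{spec.name}": {spec.description}')
    return "\n".join(lines)


invoice = DocumentType(
    name="Invoice",
    fields=[
        FieldSpec("Total Amount", "Grand total including tax", regex=r"\d+\.\d{2}"),
        FieldSpec("Name", "Customer name as printed on the document"),
    ],
)

# Round trip through JSON, mirroring the JSON-based configuration storage
stored = json.dumps(asdict(invoice), indent=2)
data = json.loads(stored)
restored = DocumentType(
    name=data["name"],
    fields=[FieldSpec(**item) for item in data["fields"]],
)

print(build_prompt(restored))
```

Keeping the field descriptions next to the validation rules means a single definition drives both the prompt sent to the model and the checks applied to its answer.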