Document Data Extractor

DocExtract extracts structured data from documents (images and PDFs) using Google's Gemini multimodal AI. Users define custom extraction schemas and validation rules for each document type.

Technical Stack

  • Backend: Python, FastAPI
  • AI Model: Google Gemini 2.0 Flash (via google-generativeai)
  • Frontend: HTML5, Vanilla JavaScript, TailwindCSS
  • Persistence: JSON-based storage for configuration
  • Validation: Regex, Fuzzy Matching (TheFuzz), and LLM-based validation
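
To illustrate the regex and fuzzy-matching checks listed under Validation, here is a minimal sketch. It is not the project's actual code: the function names and the threshold of 85 are assumptions, and the standard library's `difflib.SequenceMatcher` stands in for TheFuzz so the example runs without extra dependencies.

```python
import re
from difflib import SequenceMatcher
from typing import List


def regex_ok(value: str, pattern: str) -> bool:
    """Return True when the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


def fuzzy_score(a: str, b: str) -> int:
    """Similarity from 0 to 100, ignoring case and surrounding whitespace.

    difflib is used here as a stand-in for TheFuzz (assumption).
    """
    ratio = SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()
    return round(ratio * 100)


def fuzzy_ok(value: str, allowed: List[str], threshold: int = 85) -> bool:
    """Return True when the value is close enough to any allowed string."""
    return any(fuzzy_score(value, candidate) >= threshold for candidate in allowed)


if __name__ == "__main__":
    print(regex_ok("2024-01-31", r"\d{4}-\d{2}-\d{2}"))
    print(fuzzy_ok("Acme Corp.", ["ACME Corp"]))
```

A regex rule suits fields with a fixed format (dates, amounts), while a fuzzy rule suits fields that should match a known list of values despite OCR noise or small spelling differences.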

Usage

  1. Prerequisites:

    • Python 3.8+
    • Google Cloud API Key with access to Gemini API.
  2. Installation:

    cd backend
    pip install -r requirements.txt
  3. Configuration:

    • Create a .env file in the backend directory.
    • Add your API key: GOOGLE_API_KEY=your_api_key_here
  4. Running the Application:

    uvicorn main:app --reload
    • Open your browser and navigate to http://127.0.0.1:8000.
  5. Using the App:

    • Go to Configuration to create a new "Document Type" (e.g., Invoice, ID Card).
    • Define fields to extract (e.g., "Total Amount", "Name") and add descriptions to help the AI.
    • Add validation rules (Regex, etc.) to ensure data quality.
    • Go to Dashboard, select your document type, and upload a file to extract data.
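
The "Document Type" described in step 5 can be pictured as a small JSON-serializable schema. The sketch below is only an illustration under stated assumptions: `FieldSpec`, `DocumentType`, `to_json`, `from_json`, and `build_prompt` are hypothetical names, and the JSON layout is not necessarily the one the application stores.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class FieldSpec:
    """One field to extract (hypothetical structure)."""
    name: str
    description: str = ""        # hint that helps the model find the value
    regex: Optional[str] = None  # optional validation pattern


@dataclass
class DocumentType:
    """A named set of fields, e.g. Invoice or ID Card (hypothetical structure)."""
    name: str
    fields: List[FieldSpec] = field(default_factory=list)


def to_json(doc_type: DocumentType) -> str:
    """Serialize a document type for JSON-based storage."""
    return json.dumps(asdict(doc_type), indent=2)


def from_json(text: str) -> DocumentType:
    """Rebuild a document type from its JSON form."""
    raw = json.loads(text)
    return DocumentType(
        name=raw["name"],
        fields=[FieldSpec(**item) for item in raw["fields"]],
    )


def build_prompt(doc_type: DocumentType) -> str:
    """Turn field names and descriptions into a plain-text extraction prompt."""
    lines = [f"Extract the following fields from this {doc_type.name} as JSON:"]
    for spec in doc_type.fields:
        lines.append(f'- "{spec.name}": {spec.description}')
    return "\n".join(lines)


if __name__ == "__main__":
    invoice = DocumentType(
        "Invoice",
        [
            FieldSpec("Total Amount", "Grand total including tax", r"\d+\.\d{2}"),
            FieldSpec("Name", "Customer name"),
        ],
    )
    print(to_json(invoice))
    print(build_prompt(invoice))
```

Keeping descriptions next to field names reflects the advice in step 5: a clearer description gives the model more context for locating each value.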

Screenshots

Credits

This application was developed (quickly) by Antigravity AI, with a little help from me.

License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.