A web application using Google's Gemini API to clean and optimize PDF text for text-to-speech applications.
- AI-powered text cleaning (fixes line breaks, removes headers/footers, corrects spacing)
- Drag-and-drop PDF upload
- Download cleaned PDF
- Modern web interface
- Python 3.8+
- Gemini API key
- Clone/download the repository
- Create virtual environment: `python -m venv venv`
- Activate: `venv\Scripts\activate` (Windows) or `source venv/bin/activate` (Mac/Linux)
- Install: `pip install -r requirements.txt`
- Create `.env` and add your Gemini API key: `GEMINI_API_KEY=your_key_here`
- Run: `python app.py`
- Open: `http://localhost:5000`
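
Before running the app, it can help to confirm that the key from the `.env` step is actually readable. The following is a minimal standard-library sketch, not the app's own loader: `parse_env_text` and `read_api_key` are hypothetical helper names, and the app may well load `.env` differently (for example through a dedicated dotenv package).

```python
from pathlib import Path


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional surrounding quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def read_api_key(env_path: str = ".env") -> str:
    """Return GEMINI_API_KEY from the given file, or raise a clear error."""
    path = Path(env_path)
    if not path.is_file():
        raise FileNotFoundError(f"{env_path} not found; create it first")
    key = parse_env_text(path.read_text(encoding="utf-8")).get("GEMINI_API_KEY", "")
    if not key or key == "your_key_here":
        raise ValueError("GEMINI_API_KEY is missing or still the placeholder")
    return key
```

Calling `read_api_key()` from the project root either returns the key or fails with a message that points at the likely cause.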
- Extract text from PDF (PyPDF2)
- Clean with Google's Gemini API
- Generate new PDF (ReportLab)
- Download cleaned PDF
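
The steps above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the code in `app.py`: the function names, the prompt wording, the use of the `google-generativeai` SDK, and the model name `gemini-1.5-flash` are all assumptions. Only PyPDF2, Gemini, and ReportLab themselves come from the list above.

```python
from xml.sax.saxutils import escape

# Assumed prompt wording; the real app's prompt is not documented here.
CLEANING_PROMPT = (
    "Clean the following text extracted from a PDF so it reads well aloud: "
    "fix broken line breaks, remove repeated headers and footers, and "
    "correct spacing. Return only the cleaned text.\n\n"
)


def build_prompt(raw_text: str) -> str:
    """Combine the (assumed) cleaning instructions with the extracted text."""
    return CLEANING_PROMPT + raw_text


def extract_text(pdf_path: str) -> str:
    """Step 1: pull text from every page with PyPDF2."""
    from PyPDF2 import PdfReader  # third-party: PyPDF2

    reader = PdfReader(pdf_path)
    # extract_text() can yield nothing for image-only (scanned) pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def clean_text(raw_text: str, api_key: str, model_name: str = "gemini-1.5-flash") -> str:
    """Step 2: ask Gemini to clean the text (SDK and model name are assumptions)."""
    import google.generativeai as genai  # third-party: google-generativeai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    return model.generate_content(build_prompt(raw_text)).text


def write_pdf(cleaned_text: str, out_path: str) -> None:
    """Step 3: lay the cleaned text out as paragraphs with ReportLab."""
    from reportlab.lib.pagesizes import letter
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import Paragraph, SimpleDocTemplate, Spacer

    body_style = getSampleStyleSheet()["BodyText"]
    story = []
    for block in cleaned_text.split("\n\n"):
        if block.strip():
            # Paragraph parses XML-like markup, so escape &, < and >.
            story.append(Paragraph(escape(block.strip()), body_style))
            story.append(Spacer(1, 8))
    SimpleDocTemplate(out_path, pagesize=letter).build(story)


def clean_pdf(in_path: str, out_path: str, api_key: str) -> None:
    """Run the steps in order; the file at out_path is what gets downloaded."""
    write_pdf(clean_text(extract_text(in_path), api_key), out_path)
```

The third-party imports sit inside the functions so the module can be imported, and `build_prompt` tried out, before the dependencies are installed.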
- API key error: Make sure the `.env` file exists and contains the correct `GEMINI_API_KEY`
- PDF extraction fails: Some scanned PDFs need OCR first
- File upload fails: Check that the file is a valid PDF and under 16 MB
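
To rule out the upload problem quickly, a file can be checked locally before it is submitted. The sketch below is a hypothetical helper (`check_upload` is not part of the app), and it assumes the 16 MB limit means 16 × 1024 × 1024 bytes. It only tests the size and the `%PDF-` marker that a well-formed PDF begins with, so passing it does not guarantee that text extraction will succeed.

```python
MAX_UPLOAD_BYTES = 16 * 1024 * 1024  # assumption: "16 MB" means 16 MiB


def check_upload(data: bytes) -> str:
    """Return "" if the upload looks acceptable, else a short reason."""
    if len(data) > MAX_UPLOAD_BYTES:
        return "file is larger than 16 MB"
    # A well-formed PDF begins with the marker %PDF- followed by a version.
    if not data.startswith(b"%PDF-"):
        return "file does not start with a PDF header"
    return ""
```

For example, `check_upload(open("paper.pdf", "rb").read())` returns an empty string when both checks pass (`paper.pdf` is a placeholder file name).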
- Never commit `.env` to version control
- Keep your API key secret
- Temporary files are deleted after processing
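
One common way to get the cleanup behavior described in the last point is to do all the work inside a temporary directory that is removed automatically. The sketch below illustrates that pattern with the standard library; it is not necessarily how the app does it, and `process_in_temp_dir` and its `process` callback are hypothetical names.

```python
import tempfile
from pathlib import Path


def process_in_temp_dir(pdf_bytes: bytes, process) -> bytes:
    """Write the upload to a temp dir, run `process`, and return the output bytes.

    `process(in_path, out_path)` is a hypothetical callable that writes the
    cleaned PDF to out_path.
    """
    # The directory and everything in it is deleted when the with-block
    # exits, whether process() succeeds or raises.
    with tempfile.TemporaryDirectory() as tmp:
        in_path = Path(tmp) / "upload.pdf"
        out_path = Path(tmp) / "cleaned.pdf"
        in_path.write_bytes(pdf_bytes)
        process(str(in_path), str(out_path))
        # Read the result into memory before the directory is removed.
        return out_path.read_bytes()
```

Returning the bytes rather than a path means nothing on disk outlives the request.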
MIT License