- Python 3.8 or higher
- pip package manager
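The two prerequisites above can be checked from the interpreter you plan to use. This is a minimal sketch using only the standard library; the helper name `check_prerequisites` is illustrative and not part of the project.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the prerequisites


def check_prerequisites(version_info=None):
    """Return a list of human-readable problems; an empty list means OK."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    problems = []
    if major_minor < MIN_PYTHON:
        problems.append(
            "Python %d.%d found, 3.8 or higher required" % major_minor
        )
    # find_spec returns None when the module cannot be located.
    if importlib.util.find_spec("pip") is None:
        problems.append("pip is not available in this interpreter")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Problem:", problem)
```

Running the file prints nothing when both prerequisites are met.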
Install all required packages from requirements.txt:
pip install -r requirements.txt
Important: This includes PyMuPDF (fitz), which is required for the PDF highlighting feature.
Verify that PyMuPDF is installed correctly:
python -c "import fitz; print('PyMuPDF version:', fitz.__version__)"
If this fails with ModuleNotFoundError: No module named 'fitz', install it manually:
pip install "PyMuPDF>=1.23.0"
The backend server must be running for PDF highlighting to work:
python harvest_be.py
By default, the backend runs on port 5001. Check the console output to verify.
In a separate terminal, start the frontend:
python harvest_fe.py
By default, the frontend runs on port 8050.
The application supports two deployment modes. Edit config.py:
# Internal mode (default) - simple setup, no reverse proxy needed
DEPLOYMENT_MODE = "internal"
# Nginx mode - production deployment with reverse proxy
DEPLOYMENT_MODE = "nginx"
BACKEND_PUBLIC_URL = "https://api.yourdomain.com" # Required for nginx mode
For detailed deployment configuration, see DEPLOYMENT.md.
Edit config.py and set:
ENABLE_PDF_HIGHLIGHTING = True # Enable highlighting feature
# or
ENABLE_PDF_HIGHLIGHTING = False # Disable highlighting feature
Symptom: Getting HTTP 500 errors when trying to save PDF highlights.
Cause: PyMuPDF (fitz) is not installed.
Solution:
pip install "PyMuPDF>=1.23.0"
Then restart the backend server.
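To confirm that the installed PyMuPDF meets the 1.23.0 minimum from the pip command above, the package metadata can be queried without importing fitz. This is a sketch that relies only on the standard library's `importlib.metadata`; the helper names `parse_version` and `pymupdf_status` are illustrative and not part of the project.

```python
from importlib import metadata

MIN_PYMUPDF = (1, 23, 0)  # minimum version from the pip command above


def parse_version(text):
    """Turn '1.23.5' into (1, 23, 5); non-numeric suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def pymupdf_status(installed=None):
    """Return 'missing', 'too old', or 'ok' for the installed PyMuPDF."""
    if installed is None:
        try:
            installed = metadata.version("PyMuPDF")
        except metadata.PackageNotFoundError:
            return "missing"
    return "ok" if parse_version(installed) >= MIN_PYMUPDF else "too old"


if __name__ == "__main__":
    print("PyMuPDF:", pymupdf_status())
```

A result of `missing` or `too old` suggests rerunning the pip command in the same environment that starts the backend.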
Symptom: Getting HTTP 502 errors when accessing PDF viewer or saving highlights.
Cause: Backend server is not running or not accessible.
Solution:
- Start the backend server:
  python harvest_be.py
- Verify it's running on port 5001
- Check backend logs for errors
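One way to verify that something is listening on the backend port is a plain TCP connection attempt. The sketch below assumes the backend runs on the local machine at `127.0.0.1` (an assumption; adjust the host for your setup), and `is_port_open` is an illustrative helper, not part of the project.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 5001 is the default backend port; the host is an assumption.
    state = "reachable" if is_port_open("127.0.0.1", 5001) else "not reachable"
    print("Backend on port 5001 is", state)
```

A successful connection only shows that some process accepts connections on that port, so the backend logs are still worth checking.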
Symptom: Cannot select text in PDF viewer.
Cause: PDF.js text layer not rendering properly.
Solution:
- Clear browser cache and refresh
- Check browser console for JavaScript errors
- Verify PDF.js CDN is accessible
Symptom: Cross-Origin Request Blocked errors in browser console.
Cause: Incorrect deployment mode or proxy configuration.
Solution:
- Check your DEPLOYMENT_MODE setting in config.py
- In internal mode: All API calls use /proxy/ routes (default behavior)
- In nginx mode: Direct backend URLs are used; ensure BACKEND_PUBLIC_URL is correct
- Restart both backend and frontend after changing deployment mode
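The difference between the two modes can be pictured as a choice of URL prefix. The following is a hypothetical sketch, not the application's actual code: the function `api_url` and the example path `/api/highlights` are invented for illustration, and it assumes the `/proxy/` routes are served from the frontend's own origin.

```python
def api_url(path, deployment_mode, backend_public_url=""):
    """Build the URL a browser request would target (illustrative only)."""
    path = "/" + path.lstrip("/")
    if deployment_mode == "internal":
        # Internal mode: a relative /proxy/ URL resolves against the page's
        # own origin, so the browser does not treat it as cross-origin.
        return "/proxy" + path
    if deployment_mode == "nginx":
        if not backend_public_url:
            raise ValueError("BACKEND_PUBLIC_URL must be set in nginx mode")
        # Nginx mode: the browser calls the backend's public URL directly.
        return backend_public_url.rstrip("/") + path
    raise ValueError("unknown DEPLOYMENT_MODE: %r" % (deployment_mode,))


if __name__ == "__main__":
    print(api_url("/api/highlights", "internal"))
    print(api_url("/api/highlights", "nginx", "https://api.yourdomain.com"))
```

Under this picture, a cross-origin error in nginx mode points at a `BACKEND_PUBLIC_URL` that does not match the host the reverse proxy actually serves.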
Symptom: Application fails to start with configuration errors.
Cause: Invalid or missing deployment configuration.
Solution:
- Ensure DEPLOYMENT_MODE is either "internal" or "nginx"
- If using nginx mode, BACKEND_PUBLIC_URL must be set
- Check launch_harvest.py output for specific configuration issues
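The first two rules in the list above can be expressed as a small check to run against your own values before starting the application. This is a sketch of those rules only; `validate_deployment` is an illustrative name, and the project's real startup validation may check more than this.

```python
VALID_MODES = ("internal", "nginx")


def validate_deployment(mode, backend_public_url=""):
    """Return a list of configuration problems; an empty list means valid."""
    problems = []
    if mode not in VALID_MODES:
        problems.append(
            'DEPLOYMENT_MODE must be "internal" or "nginx", got %r' % (mode,)
        )
    elif mode == "nginx" and not backend_public_url:
        problems.append("BACKEND_PUBLIC_URL must be set in nginx mode")
    return problems


if __name__ == "__main__":
    # Substitute the values from your own config.py here.
    for problem in validate_deployment("nginx", ""):
        print("Config problem:", problem)
```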
Run the test suite to verify installation:
python -m pytest test_pdf_annotation.py -v
All tests should pass if PyMuPDF is installed correctly.