A production-ready document processing API powered by Docling.
Convert PDF, DOCX, PPTX, and more to Markdown, JSON, or plain text with table extraction, OCR, and vector embeddings.
- Document Conversion - PDF, DOCX, PPTX, XLSX, HTML, Images → Markdown/JSON/Text
- Table Extraction - Preserve table structure from documents
- OCR Support - Process scanned documents and images
- Vector Embeddings - Generate embeddings for RAG applications
- Async Processing - Background task processing with Celery
- Batch Processing - Convert multiple documents in parallel
- Monitoring - Flower dashboard + Prometheus metrics
- SSL/TLS - Automatic certificates with Let's Encrypt
# 1. Configure
cp app/.env.example .env
export DOCLING_API_TOKEN=$(openssl rand -hex 32)
nano .env # Add your token and domain
# 2. Deploy
make init
# 3. Verify
curl https://yourdomain.com/health

Full guide: docs/DEPLOYMENT.md
# Start dev environment
make dev-up
# Check status
make dev-status
# Test API
curl -X POST http://localhost:8080/convert \
-H "X-API-Key: dev-token-123" \
-H "Content-Type: application/json" \
-d '{"url": "https://arxiv.org/pdf/2408.09869"}'

Full guide: docs/DEVELOPMENT.md
| Document | Description |
|---|---|
| Deployment Guide | Production setup, SSL, scaling, maintenance |
| Development Guide | Local setup, testing, debugging, contributing |
| API Reference | Endpoints, request/response formats, examples |
| Configuration | Environment variables, nginx, Celery settings |
| Security Guide | Authentication, secrets, security checklist |
| AGENT.md | Guidelines for AI assistants and contributors |
Internet → Nginx (SSL) → FastAPI → Celery (Redis) → Workers
                                         ↓
                                       Flower
| Service | Purpose |
|---|---|
| Nginx | Reverse proxy, SSL, rate limiting |
| API | FastAPI REST endpoints |
| Worker | Document processing (Celery) |
| Redis | Message broker & result backend |
| Flower | Task monitoring dashboard |
# Convert document
curl -X POST https://api.example.com/convert \
-H "X-API-Key: your-token" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com/doc.pdf"}'
# Check status
curl https://api.example.com/tasks/{task_id} \
-H "X-API-Key: your-token"

| Endpoint | Method | Description |
|---|---|---|
| `/convert` | POST | Convert from URL |
| `/convert/upload` | POST | Upload & convert |
| `/convert/batch` | POST | Batch conversion |
| `/tasks/{id}` | GET | Get results |
| `/health` | GET | Health check |
| `/docs` | GET | API documentation |
Full reference: docs/API.md
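For scripted use, the two curl calls above can be translated into a small Python client. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the `status` field with Celery-style state names is an assumption, so check docs/API.md for the actual response schema.

```python
import json
import time
import urllib.request

API_BASE = "https://api.example.com"  # placeholder host from the curl examples
API_KEY = "your-token"                # placeholder token

# ASSUMPTION: Celery-style state names; the real API may report status differently.
PENDING_STATES = {"PENDING", "RECEIVED", "STARTED", "RETRY"}


def build_convert_request(doc_url, base=API_BASE, key=API_KEY):
    """Build the POST /convert request from the curl example (nothing is sent yet)."""
    body = json.dumps({"url": doc_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/convert",
        data=body,
        method="POST",
        headers={"X-API-Key": key, "Content-Type": "application/json"},
    )


def fetch_json(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_task(task_id, fetch=fetch_json, interval=2.0, max_polls=30,
                  base=API_BASE, key=API_KEY):
    """Poll GET /tasks/{id} until the task leaves a pending state."""
    request = urllib.request.Request(f"{base}/tasks/{task_id}",
                                     headers={"X-API-Key": key})
    for _ in range(max_polls):
        result = fetch(request)
        status = str(result.get("status", "")).upper()  # hypothetical field name
        if status not in PENDING_STATES:
            return result
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} still pending after {max_polls} polls")
```

A typical flow would send `build_convert_request(...)` through `fetch_json`, read the task identifier from whatever field the response actually uses, and pass it to `wait_for_task`. The `fetch` parameter is injectable so the polling logic can be exercised without a running server.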
Key settings in .env:
DOCLING_API_TOKEN=your-secure-token # Required
WORKERS=2 # API workers
CELERY_CONCURRENCY=2 # Tasks per worker
EMBEDDING_MODEL=all-MiniLM-L6-v2 # Embedding model

All options: docs/CONFIGURATION.md
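To illustrate how these keys might be consumed, here is a rough sketch that parses `.env`-style text into a typed settings object. It is not the project's actual loader. The `Settings` class and function names are hypothetical, and the defaults simply mirror the example values above, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_token: str
    workers: int = 2                             # ASSUMPTION: default mirrors the example
    celery_concurrency: int = 2                  # ASSUMPTION: default mirrors the example
    embedding_model: str = "all-MiniLM-L6-v2"    # ASSUMPTION: default mirrors the example


def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments.

    Simplification: everything after '#' is treated as a comment, so values
    that contain '#' are not supported by this sketch.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_settings(text):
    """Build Settings from .env text; the API token is the only required key."""
    env = parse_env(text)
    token = env.get("DOCLING_API_TOKEN", "")
    if not token:
        raise ValueError("DOCLING_API_TOKEN is required")
    return Settings(
        api_token=token,
        workers=int(env.get("WORKERS", 2)),
        celery_concurrency=int(env.get("CELERY_CONCURRENCY", 2)),
        embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
    )
```

Failing fast on a missing token matches the "Required" note next to `DOCLING_API_TOKEN`; the other keys fall back to defaults.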
- API key authentication required
- Startup validation rejects weak tokens
- Flower bound to localhost only
- TLS 1.2/1.3 with strong ciphers
- Rate limiting per IP
# Generate secure token
openssl rand -hex 32

Security checklist: docs/SECURITY.md
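A token of the same shape can also be generated from Python, and a startup check that rejects weak tokens could look roughly like the sketch below. The length threshold, the blocklist, and the character-variety rule are all hypothetical and chosen for illustration; the project's actual validation rules are described in docs/SECURITY.md.

```python
import secrets

# ASSUMPTION: illustrative rules only, not the project's actual startup check.
MIN_TOKEN_LENGTH = 32
MIN_DISTINCT_CHARS = 8
KNOWN_WEAK_TOKENS = {"dev-token-123", "your-token", "your-secure-token", "changeme"}


def generate_token():
    """Return 32 random bytes as 64 hex characters, like `openssl rand -hex 32`."""
    return secrets.token_hex(32)


def is_weak_token(token):
    """Flag tokens that are too short, well-known placeholders, or low in variety."""
    if len(token) < MIN_TOKEN_LENGTH:
        return True
    if token.lower() in KNOWN_WEAK_TOKENS:
        return True
    if len(set(token)) < MIN_DISTINCT_CHARS:  # e.g. "aaaa...a"
        return True
    return False
```

With these rules, the development token `dev-token-123` is rejected, while anything produced by `generate_token()` passes.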
# Production
make up # Start services
make down # Stop services
make status # Check status
make logs # View logs
make monitoring # Enable Flower dashboard
# Development
make dev-up # Start dev environment
make dev-down # Stop dev environment
make dev-test # Test endpoints
make dev-logs # View logs
# Upgrades
make upgrade-check # Check for updates
make upgrade-docling # Upgrade Docling only
make upgrade # Upgrade all dependencies
make rollback # Rollback to previous version
# Maintenance
make ssl-renew # Renew certificates
make scale N=3 # Scale workers
make backup-certs # Backup SSL certs

Contributions are welcome! Please read our Contributing Guide and Code of Conduct first.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'feat: add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Documentation: docs/
- Bug Reports: GitHub Issues
- Feature Requests: GitHub Issues
- Security Issues: vineeth.nk@locaboo.com (private)
This project is licensed under the MIT License - see the LICENSE file for details.
- Docling - Document processing engine (MIT)
- FastAPI - Web framework
- Celery - Task queue
- Sentence Transformers - Embeddings