This is a production-ready distributed task queue system built with Python, Redis, and Kubernetes. It includes enterprise-grade features for reliability, observability, and monitoring.
- ✅ Distributed Task Processing - Multiple workers processing tasks in parallel
- ✅ Task Retry with Exponential Backoff - Automatic retries with configurable delays
- ✅ Dead Letter Queue (DLQ) - Failed tasks moved to DLQ for manual handling
- ✅ Task Priority - High and normal priority task queues
- ✅ Prometheus Metrics - Comprehensive metrics for monitoring
- ✅ Jaeger Distributed Tracing - End-to-end request tracing
- ✅ Web Dashboard - Real-time monitoring interface
- ✅ Kubernetes Ready - Production deployment manifests included
- ✅ Enhanced Logging - Visual task tracking with emojis
- Docker & Docker Compose
- Python 3.11+ (for local development)
- kubectl (for Kubernetes deployment)
```bash
# Clone and navigate
cd distributed-task-queue-python

# Start all services
docker-compose up -d --build

# Open dashboard
open http://localhost:5001
```

- Gateway API - http://localhost:8000
- Dashboard - http://localhost:5001
- Prometheus - http://localhost:9090
- Jaeger - http://localhost:16686
| Document | Purpose |
|---|---|
| FEATURES_SUMMARY.md | Complete feature overview |
| ADVANCED_FEATURES.md | Detailed feature documentation & deployment |
| QUICK_START_FEATURES.md | Quick start guide with examples |
| COMMAND_REFERENCE.md | CLI commands & cheat sheet |
Failed tasks automatically retry with increasing delays:
- Initial delay: 5 seconds (configurable)
- Backoff multiplier: 2.0x (configurable)
- Max delay: 1 hour (configurable)
- Max retries: 3 (configurable)
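The schedule above can be sketched in a few lines of Python. The constants mirror the documented defaults; this is an illustration, not the project's actual worker code.

```python
# Illustrative backoff schedule; constants mirror the documented defaults.
INITIAL_RETRY_DELAY = 5.0          # seconds
RETRY_BACKOFF_MULTIPLIER = 2.0
MAX_RETRY_DELAY = 3600.0           # 1 hour cap
MAX_RETRIES = 3

def retry_delay(attempt: int) -> float:
    """Delay before retry `attempt` (1-based), capped at MAX_RETRY_DELAY."""
    delay = INITIAL_RETRY_DELAY * RETRY_BACKOFF_MULTIPLIER ** (attempt - 1)
    return min(delay, MAX_RETRY_DELAY)

print([retry_delay(n) for n in range(1, MAX_RETRIES + 1)])  # [5.0, 10.0, 20.0]
```

With the defaults, the three attempts wait 5, 10, and 20 seconds; the cap only matters for much higher retry counts.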
```bash
# Configure in docker-compose.yml
MAX_RETRIES=3
INITIAL_RETRY_DELAY=5
RETRY_BACKOFF_MULTIPLIER=2.0
MAX_RETRY_DELAY=3600
```

Tasks that fail after the maximum number of retries go to the DLQ for analysis and manual retry:
```bash
# View failed tasks
curl http://localhost:8000/dlq/failed-tasks

# Retry a failed task
curl -X POST http://localhost:8000/dlq/failed-tasks/{TASK_ID}/retry

# Remove from DLQ
curl -X DELETE http://localhost:8000/dlq/failed-tasks/{TASK_ID}
```

Comprehensive metrics tracking all operations:
- Task submissions/completions/failures
- Processing duration
- Retry attempts
- Queue sizes
- Worker status
- DLQ size
Access: http://localhost:8000/metrics
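The scrape endpoint serves Prometheus' plain-text exposition format. A minimal sketch of what a single counter looks like (illustrative only; the real endpoint is generated by a Prometheus client library, not by hand):

```python
# Minimal illustration of the plain-text exposition format served at /metrics.
def render_counter(name: str, help_text: str, value: int) -> str:
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} counter\n"
        f"{name} {value}\n"
    )

print(render_counter("task_submissions_total", "Total tasks submitted", 42))
```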
Full request tracing across all services:
- Request flow visualization
- Operation timing
- Error tracking
- Redis operation tracing
Access: http://localhost:16686
Real-time monitoring interface showing:
- Live statistics
- Active workers
- Recent tasks
- Dead Letter Queue
- Auto-refresh every 5 seconds
Access: http://localhost:5001
Complete K8s manifests for production:
- Redis with persistent storage
- Gateway with load balancer
- Multiple workers with scaling
- Prometheus & Jaeger for monitoring
- Dashboard for visualization
Deploy: `kubectl apply -f k8s/deployment.yaml`
```text
┌───────────────────────────────────────────────────────────┐
│                   Client Applications                     │
└────────────────────┬──────────────────────────────────────┘
                     │
        ┌────────────┼────────────┐
        ▼            ▼            ▼
   ┌────────┐   ┌────────┐   ┌──────────┐
   │Gateway │   │Metrics │   │Dashboard │
   │ :8000  │   │ :8000  │   │  :5001   │
   └────┬───┘   └───┬────┘   └────┬─────┘
        │           │             │
        └───────────┼─────────────┘
                    │
         ┌──────────▼──────────┐
         │        Redis        │
         │  (Queues, Storage)  │
         └────┬──────┬────┬────┘
              │      │    │
        ┌─────┘      │    └──────────┐
        │            │               │
   ┌────▼────┐  ┌────▼────┐    ┌────▼────┐
   │ Worker1 │  │ Worker2 │    │ Worker3 │
   └─────────┘  └─────────┘    └─────────┘

   ┌─────────────┐  ┌──────────┐
   │ Prometheus  │  │  Jaeger  │
   │   Metrics   │  │  Tracing │
   │    :9090    │  │  :16686  │
   └─────────────┘  └──────────┘
```
```text
1. SUBMIT   → Task submitted via API
              📤 SUBMITTED log message
              Stored in Redis
              Added to pending queue

2. ASSIGN   → Worker picks up task
              📋 ASSIGNED log message
              Status → "processing"

3. PROCESS  → Task is executing
              ⚙️ STARTED log message
              ⏱️ Simulated processing

4. COMPLETE → Task finished
              ✅ COMPLETED log message
              Status → "completed"

   OR

4. FAIL     → Task failed
              ❌ ERROR log message
              Status → "retrying"
              Scheduled for retry

5. RETRY    → Retry after delay
              🔄 REQUEUED log message
              Status → "pending"
              (repeats from step 2)

6. DLQ      → Max retries exhausted
              💀 DLQ log message
              Status → "failed"
              Stored in Dead Letter Queue
```
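The transitions above can be condensed into a small state function. Field names here are illustrative, not the project's actual task schema.

```python
# Toy model of the lifecycle: one processing attempt either completes the
# task, schedules a retry, or sends it to the Dead Letter Queue.
def finish_attempt(task: dict, succeeded: bool) -> dict:
    if succeeded:
        task["status"] = "completed"        # step 4: COMPLETE
    elif task["retries"] < task["max_retries"]:
        task["retries"] += 1
        task["status"] = "pending"          # steps 4-5: FAIL, then RETRY requeues
    else:
        task["status"] = "failed"           # step 6: DLQ
    return task

task = {"status": "processing", "retries": 0, "max_retries": 3}
print(finish_attempt(task, succeeded=False)["status"])  # pending
```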
```bash
# Submit a task
curl -X POST http://localhost:8000/tasks \
  -H "Content-Type: application/json" \
  -d '{
    "type": "process",
    "data": "hello-world",
    "priority": "high"
  }'

# Check task status
curl http://localhost:8000/tasks/{TASK_ID}

# Health check
curl http://localhost:8000/health
```

```bash
# Raw Prometheus format
curl http://localhost:8000/metrics

# In Prometheus UI
open http://localhost:9090
# Query: task_submissions_total
```

```bash
# Open Jaeger UI
open http://localhost:16686
# Select service: "gateway"
# View traces showing request flow
```

```bash
# Run the stress test
python tests/stress_test.py
# Submits 100 tasks and monitors completion
```
| Variable | Default | Purpose |
|---|---|---|
| MAX_RETRIES | 3 | Maximum retry attempts |
| INITIAL_RETRY_DELAY | 5 | Initial retry delay (seconds) |
| RETRY_BACKOFF_MULTIPLIER | 2.0 | Backoff multiplier per retry |
| MAX_RETRY_DELAY | 3600 | Maximum retry delay (seconds) |
| REDIS_HOST | localhost | Redis server hostname |
| REDIS_PORT | 6379 | Redis server port |
| JAEGER_HOST | localhost | Jaeger agent hostname |
| JAEGER_PORT | 6831 | Jaeger agent port |
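A service would typically read these settings from the environment with the defaults above; the following is an assumed pattern, not the project's actual config code.

```python
# Read the documented settings from the environment, falling back to defaults.
import os

MAX_RETRIES = int(os.environ.get("MAX_RETRIES", "3"))
INITIAL_RETRY_DELAY = float(os.environ.get("INITIAL_RETRY_DELAY", "5"))
RETRY_BACKOFF_MULTIPLIER = float(os.environ.get("RETRY_BACKOFF_MULTIPLIER", "2.0"))
MAX_RETRY_DELAY = float(os.environ.get("MAX_RETRY_DELAY", "3600"))
REDIS_HOST = os.environ.get("REDIS_HOST", "localhost")
REDIS_PORT = int(os.environ.get("REDIS_PORT", "6379"))
JAEGER_HOST = os.environ.get("JAEGER_HOST", "localhost")
JAEGER_PORT = int(os.environ.get("JAEGER_PORT", "6831"))
```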
Edit docker-compose.yml:

```yaml
environment:
  - MAX_RETRIES=5
  - INITIAL_RETRY_DELAY=10
  - RETRY_BACKOFF_MULTIPLIER=1.5
  - MAX_RETRY_DELAY=7200
```

Exported metrics:

- `task_submissions_total` - Total tasks submitted
- `task_completions_total` - Completed tasks
- `task_duration_seconds` - Processing time (histogram)
- `task_retries_total` - Total retries
- `tasks_in_dlq_total` - Failed tasks
- `active_workers_total` - Active workers
- `pending_tasks_total` - Pending tasks
- Request flow visualization
- Service dependencies
- Operation timing
- Error tracking
- Redis operation tracing
- Real-time task counts
- Worker status
- Queue depths
- DLQ management
```bash
docker-compose up -d --build
```

```bash
# Apply all manifests
kubectl apply -f k8s/deployment.yaml

# Scale workers
kubectl scale deployment worker -n task-queue --replicas=5

# Port forward services
kubectl port-forward -n task-queue svc/gateway 8000:5000
kubectl port-forward -n task-queue svc/dashboard 5001:5001
```

```bash
# Check logs
docker logs task-queue-gateway
docker logs task-queue-worker-1

# Verify Redis
redis-cli PING
```

```bash
# Verify endpoint
curl http://localhost:8000/metrics

# Check Prometheus targets
open http://localhost:9090/targets
```

```bash
# Check worker logs
docker logs -f task-queue-worker-1

# Check queue status
curl http://localhost:8000/health

# Verify Redis
redis-cli LLEN tasks:pending
```

```text
distributed-task-queue-python/
├── src/
│   ├── gateway/
│   │   └── server.py             # REST API gateway
│   ├── worker/
│   │   └── worker.py             # Task worker with retry logic
│   ├── dashboard/
│   │   ├── app.py                # Dashboard API
│   │   └── templates/index.html  # Dashboard UI
│   └── shared/
│       ├── metrics.py            # Prometheus metrics
│       └── tracing.py            # Jaeger tracing
├── k8s/
│   └── deployment.yaml           # Kubernetes manifests
├── tests/
│   ├── stress_test.py            # Load test
│   └── test_client.py            # Interactive CLI client
├── docker-compose.yml            # Docker Compose config
├── Dockerfile.*                  # Container definitions
├── prometheus.yml                # Prometheus config
├── requirements.txt              # Python dependencies
└── *.md                          # Documentation files
```
- FEATURES_SUMMARY.md - Complete feature overview
- ADVANCED_FEATURES.md - Detailed deployment guide
- QUICK_START_FEATURES.md - Quick examples
- COMMAND_REFERENCE.md - CLI command cheat sheet
- docker-compose.yml - Full stack configuration
- k8s/deployment.yaml - Kubernetes manifests
- prometheus.yml - Prometheus configuration
1. Explore the Dashboard
   - Open http://localhost:5001
   - Submit some tasks
   - Monitor in real-time
2. Check the Metrics
   - Open http://localhost:9090
   - Query `task_submissions_total`
   - View graphs
3. Trace a Request
   - Open http://localhost:16686
   - Select "gateway" service
   - Click "Find Traces"
4. Run Stress Test
   - `python tests/stress_test.py`
   - Watch workers process 100 tasks
5. Deploy to Kubernetes
   - Build images for your cluster
   - `kubectl apply -f k8s/deployment.yaml`
   - Scale workers as needed
Feel free to extend this system with:
- Custom task types
- Additional metrics
- Custom dashboard widgets
- Integration with other systems
This project is provided as-is for educational and production use.
Last Updated: December 24, 2025
Status: ✅ All Features Implemented & Tested
For detailed information, see the documentation files listed above.