This document outlines the architectural decisions, concurrency model, and deployment rationale for the Django WebSocket Blue/Green deployment etc.
- Django + Channels: Core application built on Django, using Django Channels to handle WebSocket connections.
- Uvicorn ASGI Server: Runs the Channels application, providing high-performance async I/O. The startup commands can be found in entrypoint.sh.
- Nginx: Reverse proxy for HTTP/WebSocket traffic, and routes to active app instances.
- Docker Compose: Orchestrates multi-container environment, including two
app instances (
app_blue&app_green), monitoring, metrics, and load testing. - Blue/Green Deployment: Ensures zero-downtime releases by maintaining two parallel environments and switching traffic via Nginx proxy.
-
ASGI Workers
- Uvicorn is launched with multiple worker processes (
--workers 8) to utilize multi-core CPUs. - Each worker handles asynchronous event loops via
uvloopfor fast network I/O.
- Uvicorn is launched with multiple worker processes (
-
Channel Layers
- Currently, we're dumping everything in memory.
- (Optimization): Redis (or alternative) can be used as the channel layer for pub/sub and group messaging.
- Ensures messages broadcasted (e.g., chat rooms) are propagated across all Uvicorn workers.
-
Session & Connection Management
- Connections are tracked in an in-memory map (
SESSIONS) and metrics counters (active connections, total messages).
- Connections are tracked in an in-memory map (
-
Prometheus Metrics exposed at
/metrics:- Active Connections: Gauge of current live WebSocket sessions.
- Total Messages: Counter of messages received.
- Error Count: Counter for exceptions in consumer logic.
-
Grafana Dashboards:
- Visualize real-time metrics, trends, and alert thresholds.
- Preconfigured via
docker/compose.ymland mounted to/etc/grafana/dashboards.
-
Health Endpoints:
/healthz: Liveness probe — returns 200 if the server process is running./readyz: Readiness probe — returns 200 if the application has started. Tracked via globa variables + shutdown singal (SIGTERM)
-
Zero-Downtime Releases
- Maintain two identical environments (
app_blue&app_green). - Deploy new code to the idle environment, run health checks, then flip traffic by updating Nginx.
- Maintain two identical environments (
-
Rollback Capability
- If smoke tests or production monitoring detect issues, the deployment doesn't go through and older stuff keeps on running.
-
Automation
promote.shencapsulates build, smoke tests, proxy update, and teardown logic.- Ensures consistent and repeatable deployment process.
-
Nginx Proxy Settings:
- Upgrade and Connection headers for WebSocket handshake.
- Timeouts tuned for long-lived connections.
- Autoscaling: Use docker swarm to run replicas of application. Ensuring application is horizontally scalable.
- Channel Layer Backend: Integrate Redis or other in-memory cache.