Prototype and stress-test implementation of a distributed sharded like counter in Python.
This project demonstrates how large-scale platforms handle “likes” efficiently, using sharding, deduplication, and O(1) read paths. It includes both the counter implementation (`counters.py`) and a stress-test harness (`test_counters.py`) to simulate viral traffic.
- Sharded Counters: Likes are distributed across multiple shards to prevent global lock contention.
- Worker Threads: Background threads increment shard counters concurrently.
- Event Queue: Buffers incoming like events to handle bursts safely.
- O(1) Reads: A live `running_total` counter allows instant read access under high concurrency.
- Deduplication: Prevents duplicate likes using a thread-safe in-memory set.
- Batch Flush: Aggregator commits shard totals to “database” in batches.
- Stress Test: Multi-phase simulation with normal, viral, and extreme (tsunami) traffic patterns.
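To make the feature list concrete, here is a minimal sketch of the core idea: likes spread across several shards (each with its own lock), a thread-safe dedup set, and a live running total for O(1) reads. Class and method names are illustrative and may differ from the actual `counters.py` implementation.

```python
import threading
import random

class ShardedLikeCounter:
    """Sketch: per-shard locks avoid a single global lock; a running
    total gives O(1) reads; a set blocks duplicate (user, post) likes."""

    def __init__(self, num_shards=8):
        self.shards = [0] * num_shards
        self.locks = [threading.Lock() for _ in range(num_shards)]
        self.seen = set()                  # dedup keys: (user_id, post_id)
        self.seen_lock = threading.Lock()
        self.running_total = 0
        self.total_lock = threading.Lock()

    def like(self, user_id, post_id):
        with self.seen_lock:
            if (user_id, post_id) in self.seen:
                return False               # duplicate like, rejected
            self.seen.add((user_id, post_id))
        shard = random.randrange(len(self.shards))  # spread contention
        with self.locks[shard]:
            self.shards[shard] += 1
        with self.total_lock:
            self.running_total += 1
        return True

    def total(self):
        return self.running_total          # O(1): no shard traversal
```

Because writers pick shards at random, two concurrent likes rarely contend on the same lock, which is the point of sharding the counter.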
- Python 3.x
- Windows/Linux/macOS
- No external database required
`python counters.py`
- Initializes shards, workers, and aggregator.
- Accepts likes and maintains a live O(1) running total.
`python test_counters.py`
- Simulates multiple writer threads sending likes to multiple posts.
- Measures throughput, duplicate rate, queue depth, flush accuracy, and read/write latency.
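The batch-flush step is the main write-amplification win: many buffered like events collapse into a handful of database writes. A self-contained sketch of that idea follows; the class name and the dict standing in for the database are illustrative, not the harness's actual API.

```python
import queue

class BatchAggregator:
    """Sketch: buffer like events in a queue, then flush per-post totals
    to a 'database' (a dict here) in one batch, cutting write volume."""

    def __init__(self):
        self.events = queue.Queue()
        self.database = {}                 # stand-in for a real datastore

    def record(self, post_id):
        self.events.put(post_id)           # cheap enqueue on the hot path

    def flush(self):
        """Drain the queue, aggregate in memory, write once per post."""
        pending = {}
        while True:
            try:
                post_id = self.events.get_nowait()
            except queue.Empty:
                break
            pending[post_id] = pending.get(post_id, 0) + 1
        for post_id, count in pending.items():
            self.database[post_id] = self.database.get(post_id, 0) + count
        return len(pending)                # DB writes issued this batch
```

With this shape, 800 buffered events for two posts become exactly two database writes per flush cycle.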
distributed-like-counter/
│
├── counters.py # Core distributed counter implementation
├── test_counters.py # Stress test harness simulating viral traffic
├── README.md # This file
└── like counter report.pdf # Optional system report
- Correctly blocks duplicate likes (65–85% duplicate rate observed on viral posts).
- O(1) reads remain fast (<100 μs) even with 200 concurrent writers.
- Batch flush reduces writes to the “database” drastically.
- Queue safely absorbs spikes; no phantom likes observed.
- Local machine is fast, so shard rescaling may not trigger — production systems would handle rescaling dynamically.
| Gap | Prototype | Production Solution |
|---|---|---|
| Database | Python int | PostgreSQL / DynamoDB |
| Deduplication | In-memory set | Redis SETNX with TTL |
| Multi-server | Single machine | Kafka or Redis Streams |
| Data loss window | ~10 seconds | Write-ahead log (WAL) or Kafka |
| Workers | Fixed threads | Auto-scaling worker pool |
| Shard scaling | Not triggered | Dynamic scaling based on queue depth |
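The "dynamic scaling based on queue depth" row could be as simple as a policy function that doubles the shard count whenever the event queue backs up beyond a per-shard budget. The sketch below is one possible shape; the thresholds and parameter names are illustrative, not taken from the prototype.

```python
def target_shard_count(queue_depth, base_shards=8, max_shards=128,
                       events_per_shard=1000):
    """Sketch of queue-depth-driven scaling: double the shard count
    while the backlog exceeds the per-shard budget, capped at
    max_shards. All thresholds here are illustrative assumptions."""
    needed = base_shards
    while queue_depth > needed * events_per_shard and needed < max_shards:
        needed *= 2
    return needed
```

A production controller would apply hysteresis (scale down slowly, scale up fast) so the shard count does not thrash around the threshold.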
Garv Jain – Engineering student, experimenting with distributed counters and high-concurrency systems.