Prototype and stress-test implementation of a distributed sharded like counter in Python.
This project demonstrates how large-scale platforms handle “likes” efficiently, using sharding, deduplication, and O(1) read paths. It includes both the counter implementation (`counters.py`) and a stress-test harness (`test_counters.py`) to simulate viral traffic.
- Sharded Counters: Likes are distributed across multiple shards to prevent global lock contention.
- Worker Threads: Background threads increment shard counters concurrently.
- Event Queue: Buffers incoming like events to handle bursts safely.
- O(1) Reads: A live `running_total` counter allows instant read access under high concurrency.
- Deduplication: Prevents duplicate likes using a thread-safe in-memory set.
- Batch Flush: Aggregator commits shard totals to “database” in batches.
- Stress Test: Multi-phase simulation with normal, viral, and extreme (tsunami) traffic patterns.
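To make the feature list concrete, here is a minimal sketch of the core idea: likes spread across several shards (each with its own lock), a thread-safe dedup set, and a live running total for O(1) reads. Class and method names are illustrative and may differ from the actual `counters.py` implementation.

```python
import threading
import random

class ShardedLikeCounter:
    """Sketch: per-shard locks avoid a single global lock; a running
    total gives O(1) reads; a set blocks duplicate (user, post) likes."""

    def __init__(self, num_shards=8):
        self.shards = [0] * num_shards
        self.locks = [threading.Lock() for _ in range(num_shards)]
        self.seen = set()                  # dedup keys: (user_id, post_id)
        self.seen_lock = threading.Lock()
        self.running_total = 0
        self.total_lock = threading.Lock()

    def like(self, user_id, post_id):
        with self.seen_lock:
            if (user_id, post_id) in self.seen:
                return False               # duplicate like, rejected
            self.seen.add((user_id, post_id))
        shard = random.randrange(len(self.shards))  # spread contention
        with self.locks[shard]:
            self.shards[shard] += 1
        with self.total_lock:
            self.running_total += 1
        return True

    def total(self):
        return self.running_total          # O(1): no shard traversal
```

Because writers pick shards at random, two concurrent likes rarely contend on the same lock, which is the point of sharding the counter.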
- Python 3.x
- Windows/Linux/macOS
- No external database required
`python counters.py`
- Initializes shards, workers, and aggregator.
- Accepts likes and maintains a live O(1) running total.
`python test_counters.py`
- Simulates multiple writer threads sending likes to multiple posts.
- Measures throughput, duplicate rate, queue depth, flush accuracy, and read/write latency.
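The batch-flush step is the main write-amplification win: many buffered like events collapse into a handful of database writes. A self-contained sketch of that idea follows; the class name and the dict standing in for the database are illustrative, not the harness's actual API.

```python
import queue

class BatchAggregator:
    """Sketch: buffer like events in a queue, then flush per-post totals
    to a 'database' (a dict here) in one batch, cutting write volume."""

    def __init__(self):
        self.events = queue.Queue()
        self.database = {}                 # stand-in for a real datastore

    def record(self, post_id):
        self.events.put(post_id)           # cheap enqueue on the hot path

    def flush(self):
        """Drain the queue, aggregate in memory, write once per post."""
        pending = {}
        while True:
            try:
                post_id = self.events.get_nowait()
            except queue.Empty:
                break
            pending[post_id] = pending.get(post_id, 0) + 1
        for post_id, count in pending.items():
            self.database[post_id] = self.database.get(post_id, 0) + count
        return len(pending)                # DB writes issued this batch
```

With this shape, 800 buffered events for two posts become exactly two database writes per flush cycle.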
distributed-like-counter/
│
├── counters.py # Core distributed counter implementation
├── test_counters.py # Stress test harness simulating viral traffic
├── README.md # This file
└── like counter report.pdf # Optional system report
- Correctly blocks duplicate likes (65–85% duplicate rate observed on viral posts).
- O(1) reads remain fast (<100 μs) even with 200 concurrent writers.
- Batch flush reduces writes to the “database” drastically.
- Queue safely absorbs spikes; no phantom likes observed.
- Local machine is fast, so shard rescaling may not trigger — production systems would handle rescaling dynamically.
| Gap | Prototype | Production Solution |
|---|---|---|
| Database | Python int | PostgreSQL / DynamoDB |
| Deduplication | In-memory set | Redis SETNX with TTL |
| Multi-server | Single machine | Kafka or Redis Streams |
| Data loss window | ~10 seconds | Write-ahead log (WAL) or Kafka |
| Workers | Fixed threads | Auto-scaling worker pool |
| Shard scaling | Not triggered | Dynamic scaling based on queue depth |
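The "dynamic scaling based on queue depth" row could be as simple as a policy function that doubles the shard count whenever the event queue backs up beyond a per-shard budget. The sketch below is one possible shape; the thresholds and parameter names are illustrative, not taken from the prototype.

```python
def target_shard_count(queue_depth, base_shards=8, max_shards=128,
                       events_per_shard=1000):
    """Sketch of queue-depth-driven scaling: double the shard count
    while the backlog exceeds the per-shard budget, capped at
    max_shards. All thresholds here are illustrative assumptions."""
    needed = base_shards
    while queue_depth > needed * events_per_shard and needed < max_shards:
        needed *= 2
    return needed
```

A production controller would apply hysteresis (scale down slowly, scale up fast) so the shard count does not thrash around the threshold.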
Garv Jain – Engineering student, experimenting with distributed counters and high-concurrency systems.