Distributed Like Counter

Prototype and stress-test implementation of a distributed sharded like counter in Python. This project demonstrates how large-scale platforms handle “likes” efficiently, using sharding, deduplication, and O(1) read paths. It includes both the counter implementation (counters.py) and a stress test harness (test_counters.py) to simulate viral traffic.


Features

  • Sharded Counters: Likes are distributed across multiple shards to prevent global lock contention.
  • Worker Threads: Background threads increment shard counters concurrently.
  • Event Queue: Buffers incoming like events to handle bursts safely.
  • O(1) Reads: Live running_total counter allows instant read access under high concurrency.
  • Deduplication: Prevents duplicate likes using a thread-safe in-memory set.
  • Batch Flush: Aggregator commits shard totals to “database” in batches.
  • Stress Test: Multi-phase simulation with normal, viral, and extreme (tsunami) traffic patterns.
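The features above can be sketched in a few lines. This is an illustrative toy (hypothetical names; the actual counters.py may be structured differently): each like is hashed to a shard with its own lock, a thread-safe set rejects duplicates, and a live running total gives O(1) reads without scanning shards.

```python
import threading

class ShardedLikeCounter:
    """Toy sharded like counter: per-shard locks, dedup set, O(1) total."""

    def __init__(self, num_shards=8):
        self.shards = [0] * num_shards
        self.locks = [threading.Lock() for _ in range(num_shards)]
        self.seen = set()                     # (user_id, post_id) dedup set
        self.seen_lock = threading.Lock()
        self.running_total = 0                # live counter for O(1) reads
        self.total_lock = threading.Lock()

    def like(self, user_id, post_id):
        key = (user_id, post_id)
        with self.seen_lock:
            if key in self.seen:              # duplicate like: rejected
                return False
            self.seen.add(key)
        shard = hash(key) % len(self.shards)
        with self.locks[shard]:               # contention limited to one shard
            self.shards[shard] += 1
        with self.total_lock:
            self.running_total += 1
        return True

    def total(self):
        return self.running_total             # O(1): no shard scan needed
```

Spreading increments across shard locks is what prevents a single global lock from becoming the bottleneck under concurrent writers.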

Getting Started

Prerequisites

  • Python 3.x
  • Windows/Linux/macOS
  • No external database required

Running the Counter

python counters.py
  • Initializes shards, workers, and aggregator.
  • Accepts likes and maintains a live O(1) running total.

Running the Stress Test

python test_counters.py
  • Simulates multiple writer threads sending likes to multiple posts.
  • Measures throughput, duplicate rate, queue depth, flush accuracy, and read/write latency.
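A simplified version of what such a harness does (illustrative only; test_counters.py has its own phases and metrics): several writer threads send likes with deliberately overlapping user/post pairs, and throughput plus duplicate rate are computed afterwards.

```python
import random
import threading
import time

def stress(counter, writers=8, likes_per_writer=1000, users=500, posts=5):
    """Hammer `counter.like(user, post)` from many threads; report metrics."""
    accepted = [0] * writers

    def writer(i):
        rng = random.Random(i)                # deterministic per-thread stream
        for _ in range(likes_per_writer):
            user = rng.randrange(users)       # small key space forces duplicates
            post = rng.randrange(posts)
            if counter.like(user, post):
                accepted[i] += 1

    start = time.perf_counter()
    threads = [threading.Thread(target=writer, args=(i,)) for i in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start

    sent = writers * likes_per_writer
    ok = sum(accepted)
    return {
        "throughput": sent / elapsed,         # offered events per second
        "dupe_rate": 1 - ok / sent,           # share of likes rejected as dupes
        "accepted": ok,
    }
```

Shrinking `users * posts` relative to the number of likes sent is a simple way to simulate the viral phase, where most traffic is repeat likes.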

Project Structure

distributed-like-counter/
│
├── counters.py                         # Core distributed counter implementation
├── test_counters.py                    # Stress test harness simulating viral traffic
├── README.md                           # This file
└── like counter report.pdf             # System report / write-up (PDF)

Key Insights from Stress Test

  • Correctly blocks duplicate likes (65–85% dupe rate on viral posts).
  • O(1) reads remain fast (<100 μs) even with 200 concurrent writers.
  • Batch flush reduces writes to the “database” drastically.
  • Queue safely absorbs spikes; no phantom likes observed.
  • Local machine is fast, so shard rescaling may not trigger — production systems would handle rescaling dynamically.

Notes for Production

Gap                 Prototype        Production Solution
------------------  ---------------  ------------------------------------
Database            Python int       PostgreSQL / DynamoDB
Deduplication       In-memory set    Redis SETNX with TTL
Multi-server        Single machine   Kafka or Redis Streams
Data-loss window    ~10 seconds      Write-ahead log (WAL) or Kafka
Workers             Fixed threads    Auto-scaling worker pool
Shard scaling       Not triggered    Dynamic scaling based on queue depth

Author

Garv Jain – Engineering student, experimenting with distributed counters and high-concurrency systems.
