Distributed model inference server with a router and multiple workers.
On macOS, keep the protobuf ABI aligned with the grpc and onnxruntime packages installed from Homebrew.
```bash
cmake -S . -B build
cmake --build build
```

Use the helper script after the build:

```bash
chmod +x scripts/start_cluster.sh
scripts/start_cluster.sh 2 resnet50 models/resnet50-v1-7.onnx
```

This starts:
- workers on `127.0.0.1:50052`, `127.0.0.1:50053`, ...
- router on `127.0.0.1:50051`
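As a quick sanity check that the cluster came up, a small TCP probe can confirm each port is accepting connections. This is a minimal sketch assuming the default addresses above; adjust the ports if you started a different number of workers:

```python
import socket

def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Router on 50051, workers on 50052+ (defaults from start_cluster.sh)
    for port in (50051, 50052, 50053):
        status = "up" if is_listening("127.0.0.1", port) else "down"
        print(f"127.0.0.1:{port} -> {status}")
```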
Install Python dependencies:

```bash
python -m pip install -r requirements.txt
```

Generate Python gRPC stubs (if needed):

```bash
python -m grpc_tools.protoc -I proto --python_out=. --grpc_python_out=. proto/inference.proto
```

Run the client:

```bash
python client.py --router localhost:50051 --model_id resnet50 --tokens 1.0,2.0,3.0,4.0
```

Workload used for comparison:
- model: `resnet50`
- input shape: `1 x 3 x 224 x 224`
- total requests: 200
- concurrency: 50 threads
Measured results from the latest run:
| Workers | Throughput (req/s) | Success | Errors |
|---|---|---|---|
| 1 | 10.512 | 200 | 0 |
| 3 | 22.259 | 200 | 0 |
| 5 | 36.883 | 200 | 0 |
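Speedup and per-worker efficiency follow directly from the table; a short sketch of the arithmetic:

```python
# Throughput figures from the table above (workers -> req/s).
results = {1: 10.512, 3: 22.259, 5: 36.883}
base = results[1]

for workers, rps in results.items():
    speedup = rps / base          # relative to the single-worker run
    efficiency = speedup / workers  # how much of ideal linear scaling is kept
    print(f"{workers} workers: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

Both multi-worker runs land around 70% per-worker efficiency: scaling is sublinear but consistent between the 3- and 5-worker configurations.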
Highlights:
- Request handling is stable (0 errors in all scenarios).
- Throughput now increases with worker count in this environment.
- Router dispatch is balanced across workers in multi-worker runs.
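The balanced-dispatch behavior can be illustrated with a minimal round-robin picker. This is a sketch of the policy only, not the router's actual implementation:

```python
from itertools import cycle

class RoundRobinPicker:
    """Cycle through worker addresses so each gets an equal share of requests."""

    def __init__(self, workers):
        self._workers = cycle(workers)

    def next_worker(self) -> str:
        return next(self._workers)

picker = RoundRobinPicker(["127.0.0.1:50052", "127.0.0.1:50053"])
print([picker.next_worker() for _ in range(4)])
# -> ['127.0.0.1:50052', '127.0.0.1:50053', '127.0.0.1:50052', '127.0.0.1:50053']
```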
To reproduce this benchmark:

```bash
python scripts/throughput_benchmark.py
```

To evaluate scaling under a latency SLO/SLA:

```bash
python scripts/scaling_benchmark.py --workers 1,3,5 --concurrency-levels 20,40,60,80 --total-requests 180 --warmup 6 --sla-p95-ms 1500
```
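For context on the `--sla-p95-ms` threshold, here is one way a p95 check can be computed from raw per-request latencies (nearest-rank percentile; the script's internals may differ):

```python
import math

def p95_ms(latencies_ms):
    """Nearest-rank 95th percentile of a list of latencies in milliseconds."""
    ordered = sorted(latencies_ms)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest-rank position
    return ordered[rank - 1]

# Illustrative latencies only, not measured data.
samples = [120, 135, 150, 180, 240, 260, 310, 420, 980, 1600]
print(p95_ms(samples), p95_ms(samples) <= 1500)  # 1600 False: this run misses the SLO
```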