
Add CPU and GPU deployment modes#88

Merged
wilke0818 merged 2 commits into main from codex/gpu-or-cpu-main
Apr 15, 2026
Conversation

@satra
Collaborator

@satra satra commented Mar 11, 2026

Summary

  • add explicit CPU vs GPU deployment controls for the server build and runtime
  • document the GPU-sensitive functions and deployment workflow
  • keep CPU deployments working when ONNX pose export dependencies are missing
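The third point, keeping CPU deployments working without the ONNX export stack, can be sketched as a try/fall-back loader. This is a minimal illustration, not the actual `_load_pose_inferencer` from `processor.py`: the `load_onnx` and `load_pytorch` callables are hypothetical stand-ins for the real model-loading code.

```python
import logging

logger = logging.getLogger(__name__)


def load_pose_inferencer(load_onnx, load_pytorch):
    """Try the ONNX-exported pose model first; fall back to the PyTorch
    YOLO model when export dependencies (e.g. onnx/onnxruntime) are missing.

    `load_onnx` and `load_pytorch` are hypothetical loader callables that
    stand in for the real model-loading logic in processor.py.
    """
    try:
        return load_onnx()
    except (ImportError, RuntimeError) as exc:
        # CPU-only images omit requirements.gpu.txt, so the ONNX path may
        # raise ImportError; degrade gracefully instead of crashing.
        logger.warning("ONNX pose export unavailable (%s); using PyTorch YOLO", exc)
        return load_pytorch()
```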

Testing

  • python3 -m unittest discover -s tests

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the deployment flexibility of the Riverst server by introducing explicit support for both CPU and GPU environments. It provides clear mechanisms for configuring the build and runtime for each target, ensuring that users can leverage hardware acceleration when available while maintaining full functionality on CPU-only systems. The changes include updated documentation, Docker integration, and a robust fallback strategy for GPU-dependent components, making the system more adaptable to various deployment scenarios.

Highlights

  • Flexible Deployment Modes: Introduced explicit controls for CPU and GPU deployment targets, allowing the server to be built and run optimally for either environment.
  • Enhanced Documentation: Added comprehensive documentation detailing GPU-sensitive functions, the deployment workflow, and configuration instructions for both CPU and GPU setups across README.md, src/server/README.md, notes/first_steps_to_deploy.md, and a new docs/gpu-cpu-deployment-plan.md.
  • Robust CPU Fallback: Ensured that CPU deployments remain fully functional even when ONNX pose export dependencies are unavailable, with the system gracefully falling back to PyTorch YOLO models.
  • Docker Integration: Implemented Docker Compose overrides (docker-compose.gpu.yaml) and Dockerfile build arguments (RIVERST_DEPLOYMENT_TARGET) to streamline GPU-accelerated container builds and runtime.
  • Centralized Device Management: Refactored device_utils.py to centralize runtime device selection using the RIVERST_COMPUTE_DEVICE environment variable, supporting 'auto' (prefer accelerators) and 'cpu' (force CPU) policies.
  • New GPU Dependency Management: Created a dedicated requirements.gpu.txt file to manage GPU-specific Python dependencies, separating them from standard CPU requirements.
  • Device Utility Testing: Added a new test suite (test_device_utils.py) to validate the logic for compute device policy and selection.
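The centralized device-selection policy described above can be sketched as follows. This is an illustrative reconstruction from the changelog, not the actual `device_utils.py`: the real `get_best_device` presumably probes `torch.cuda`/`torch.backends.mps`, which are replaced here by plain boolean parameters.

```python
import os

# Environment variable name from the PR (see the device_utils.py changelog).
COMPUTE_DEVICE_ENV_VAR = "RIVERST_COMPUTE_DEVICE"


def get_compute_device_policy() -> str:
    """Read the compute-device policy: 'auto' prefers accelerators, 'cpu' forces CPU."""
    policy = os.environ.get(COMPUTE_DEVICE_ENV_VAR, "auto").strip().lower()
    if policy not in ("auto", "cpu"):
        raise ValueError(f"Invalid {COMPUTE_DEVICE_ENV_VAR} value: {policy!r}")
    return policy


def get_best_device(cuda_available: bool = False, mps_available: bool = False) -> str:
    """Pick a device string, honoring the policy before probing accelerators.

    The availability flags stand in for the torch runtime checks used in
    the real implementation.
    """
    if get_compute_device_policy() == "cpu":
        return "cpu"
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```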


Changelog
  • README.md
    • Added instructions for GPU-enabled Docker Compose setup.
    • Included guidance for manual GPU-oriented Python package installation.
    • Documented the RIVERST_COMPUTE_DEVICE environment variable for forcing CPU inference.
  • docker-compose.gpu.yaml
    • Added a new Docker Compose override file for GPU deployments, configuring build arguments and GPU access.
  • docker-compose.yaml
    • Modified the server build section to accept RIVERST_DEPLOYMENT_TARGET as a build argument.
    • Added RIVERST_COMPUTE_DEVICE environment variable to the server service configuration.
  • docs/gpu-cpu-deployment-plan.md
    • Added a new document outlining the GPU and non-GPU deployment plan, including goals, findings on GPU-sensitive functions, decisions, and implementation notes.
  • notes/first_steps_to_deploy.md
    • Updated EC2 instance type recommendations to differentiate between GPU and non-GPU deployments.
    • Clarified that NVIDIA driver installation is only required for GPU deployments.
    • Added a note to configure RIVERST_COMPUTE_DEVICE=cpu for CPU-only deployments in the .env file.
  • src/server/Dockerfile
    • Added RIVERST_DEPLOYMENT_TARGET as a build argument with a default of 'cpu'.
    • Modified the dependency installation step to conditionally install requirements.txt or requirements.gpu.txt based on RIVERST_DEPLOYMENT_TARGET.
  • src/server/README.md
    • Updated virtual environment setup instructions to use python -m venv.
    • Added instructions for installing GPU-oriented Python dependencies.
    • Included guidance on setting RIVERST_COMPUTE_DEVICE=cpu to disable GPU/MPS usage.
    • Added Docker commands for building and running GPU-oriented images.
  • src/server/bot/processors/video/processor.py
    • Refactored YOLO pose model initialization into a new _load_pose_inferencer method.
    • Implemented a fallback mechanism to use the PyTorch YOLO model if ONNX export fails or is unavailable.
  • src/server/bot/utils/device_utils.py
    • Introduced COMPUTE_DEVICE_ENV_VAR and DEPLOYMENT_TARGET_ENV_VAR constants.
    • Added get_compute_device_policy and get_deployment_target functions to retrieve environment variables.
    • Rewrote get_best_device to respect the RIVERST_COMPUTE_DEVICE policy, allowing explicit CPU-only operation or automatic detection.
  • src/server/env.example
    • Added RIVERST_COMPUTE_DEVICE environment variable with a default value of 'auto' and a description.
  • src/server/requirements.gpu.txt
    • Added a new file listing GPU-specific Python dependencies, including onnx, onnxslim, and onnxruntime-gpu.
  • src/server/tests/test_device_utils.py
    • Added a new test file containing unit tests for the device_utils module, covering default policy, CPU policy enforcement, and invalid policy handling.
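The build-time switch described in the `src/server/Dockerfile` entry can be sketched as a conditional install. The PR does not show the file's contents, so this is a minimal illustration assuming only the `RIVERST_DEPLOYMENT_TARGET` build argument and the two requirements files named in the changelog.

```dockerfile
# Build-time deployment target; defaults to a CPU-only image.
ARG RIVERST_DEPLOYMENT_TARGET=cpu

COPY requirements.txt requirements.gpu.txt ./

# Install GPU-specific dependencies (onnx, onnxslim, onnxruntime-gpu, ...)
# only when the target is "gpu"; otherwise use the standard CPU requirements.
RUN if [ "$RIVERST_DEPLOYMENT_TARGET" = "gpu" ]; then \
        pip install --no-cache-dir -r requirements.gpu.txt; \
    else \
        pip install --no-cache-dir -r requirements.txt; \
    fi
```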
Activity
  • The author has indicated that the changes were tested using python3 -m unittest discover -s tests.
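The Docker Compose override mentioned in the changelog could look roughly like the fragment below. This is a hypothetical sketch of `docker-compose.gpu.yaml` (the PR does not show the actual file), assuming a `server` service name and the standard Compose syntax for NVIDIA GPU reservations.

```yaml
# Hypothetical docker-compose.gpu.yaml override; exact upstream contents may differ.
services:
  server:
    build:
      args:
        RIVERST_DEPLOYMENT_TARGET: gpu
    environment:
      RIVERST_COMPUTE_DEVICE: auto
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

Such an override would typically be layered on top of the base file, e.g. `docker compose -f docker-compose.yaml -f docker-compose.gpu.yaml up --build`.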

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a well-structured mechanism for selecting between CPU and GPU deployments, both at build time (via Docker build args and different requirements files) and at runtime (via an environment variable). The changes are consistently applied across Docker configurations, documentation, and application code. The fallback for ONNX-dependent features on CPU builds is a nice touch for robustness. The addition of a new test suite for the device selection logic is also a great improvement. I have one suggestion to improve the reliability of the new tests.

Comment thread on src/server/tests/test_device_utils.py (outdated).
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Contributor

@wilke0818 wilke0818 left a comment


looks reasonable

@wilke0818 wilke0818 merged commit 69a9279 into main Apr 15, 2026
