Modernize CUDA build workflow: remove PowerShell, inline matrix, drop conda#43
Conversation
Co-authored-by: chamalgomes <106123134+chamalgomes@users.noreply.github.com>
Pull request overview
Modernizes the CUDA wheel build GitHub Actions workflow by consolidating the matrix into a single job, switching from PowerShell to bash, and building inside an NVIDIA CUDA container to remove the conda/mamba toolchain setup.
Changes:
- Removes the separate `define_matrix` job and inlines the build matrix into `build_wheels`.
- Replaces PowerShell scripting with bash (including version tag rewriting via `sed`).
- Uses an `nvidia/cuda` devel container image and adds a CUDA stub linker flag for builds.
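The version-tag rewriting mentioned above could look like the following bash sketch. The `version.txt` file name, the base version, and the variable names are illustrative assumptions, not taken from the actual workflow:

```shell
# Hypothetical sketch: append a CUDA suffix to a package version with sed.
# Assumes the version lives in a file named version.txt as "1.2.3".
echo "1.2.3" > version.txt
cuda_suffix="cu131"   # e.g. from matrix.cuda_config.short
# Append "+cu131" to the end of the version line, yielding "1.2.3+cu131"
sed -i "s/$/+${cuda_suffix}/" version.txt
cat version.txt
```

This is the kind of per-matrix-entry rewrite that bash plus `sed` handles in one line, where the old PowerShell version needed separate string-manipulation logic.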
```yaml
cuda_config:
  - ver: 13.1.1
    short: cu131
    arch: 75;80;
```
matrix.cuda_config.arch is set to 75;80; with a trailing semicolon. In CMake, semicolon-separated lists with a trailing ; typically introduce an empty list element, which can make CMAKE_CUDA_ARCHITECTURES include an empty architecture and fail configuration. Drop the trailing semicolon (e.g., 75;80) or represent architectures as a YAML list and join them explicitly.
Suggested change:

```diff
-    arch: 75;80;
+    arch: 75;80
```
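As a sketch of the reviewer's alternative, the architectures could be kept as a space-separated string in the matrix and joined into a CMake-style list in the build step. The variable names below are illustrative, not from the workflow:

```shell
# Illustrative: join architectures into a semicolon-separated CMake list
# without producing a trailing (empty) element.
archs="75 80"                         # hypothetical matrix value
cmake_archs=$(echo "$archs" | tr ' ' ';')
echo "$cmake_archs"
```

Building the list by joining elements guarantees there is never a dangling `;`, so `CMAKE_CUDA_ARCHITECTURES` cannot pick up an empty architecture entry.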
Fixed in commit 3bff142 — added an "Install system dependencies" step as the very first step (before checkout) that runs apt-get install -y software-properties-common git curl build-essential cmake libssl-dev. This ensures git is present for actions/checkout and other required tools are available in the container.
```yaml
name: Build Wheel CUDA ${{ matrix.cuda_config.ver }} Py ${{ matrix.pyver }}
runs-on: ubuntu-latest
container:
  image: nvidia/cuda:${{ matrix.cuda_config.ver }}-devel-ubuntu24.04
```
Running the job in an nvidia/cuda:* container means all steps execute inside that image. This workflow relies on several JavaScript/composite actions (actions/checkout, setup-uv, action-gh-release, attest-build-provenance) which require tooling like node (and git for checkout) to be present in the container. The nvidia/cuda images don’t guarantee those dependencies, so the workflow can fail early. Consider switching to a custom CUDA build image that includes node + git (and any build deps like cmake), or avoid a job-level container and instead run the CUDA build via docker run in a run: step.
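A minimal sketch of the `docker run` alternative the comment suggests, where the job stays on the plain `ubuntu-latest` runner (so node and git are available to the JavaScript actions) and only the build itself enters the CUDA container. The image tag, mount path, and `build_wheel.sh` script are assumptions for illustration:

```shell
# Hypothetical run: step — the actions execute on the host runner;
# only the compilation happens inside the nvidia/cuda container.
docker run --rm \
  -v "$GITHUB_WORKSPACE:/src" -w /src \
  nvidia/cuda:13.1.1-devel-ubuntu24.04 \
  bash -c "apt-get update && apt-get install -y cmake build-essential \
           && ./build_wheel.sh"
```

This keeps `actions/checkout`, `setup-uv`, and the release/attestation actions on a runner image that already ships their dependencies, at the cost of managing the container invocation manually.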
Fixed in commit 3bff142 — changed arch: 75;80; to arch: 75;80 to remove the trailing semicolon.
@copilot I think we need to install essential system runtime dependencies; otherwise even the checkout action fails, stating git is required. So maybe something like this? software-properties-common git curl build-essential cmake libssl-dev? I'm not sure if setup-uv will work; if it does not, you can install uv inline with the shell command `curl -LsSf https://astral.sh/uv/install.sh | sh`.
… install

Co-authored-by: chamalgomes <106123134+chamalgomes@users.noreply.github.com>
Done in commit 3bff142.
Original prompt
This pull request was created from Copilot chat.