
Modernize CUDA build workflow: remove PowerShell, inline matrix, drop conda #43

Closed
Copilot wants to merge 3 commits into main from copilot/modernize-llama-build-workflow

Conversation

Contributor

Copilot AI commented Feb 25, 2026

  • Review and understand current llama-build-cuda.yaml workflow structure
  • Replace workflow with modernized version using clean bash syntax
  • Remove PowerShell dependency and use setup-uv action for Python installation
  • Implement inline matrix definition (remove define_matrix job)
  • Use nvidia/cuda container directly (remove mamba/conda dependency)
  • Keep preserved features: uv build, custom naming convention, release upload, provenance attestation
  • Validate YAML syntax of new workflow file
  • Add system dependencies step (git, curl, cmake, build-essential, libssl-dev) before checkout to fix nvidia/cuda container missing tooling
  • Replace setup-uv action with inline curl uv installation with proper PATH setup via GITHUB_PATH
  • Fix trailing semicolon in CUDA arch matrix value (75;80; → 75;80)
Original prompt

Objective

Modernize the .github/workflows/llama-build-cuda.yaml workflow file by incorporating cleaner approaches from build-wheels-cuda.yaml while retaining the preferred features like uv build and custom release naming convention.

Requirements

Replace the current workflow file with the following improved version:

name: Build Release (CUDA)

on: workflow_dispatch

permissions:
  contents: write

jobs:
  build_wheels:
    name: Build Wheel CUDA ${{ matrix.cuda_config.ver }} Py ${{ matrix.pyver }}
    runs-on: ubuntu-latest
    container:
      image: nvidia/cuda:${{ matrix.cuda_config.ver }}-devel-ubuntu24.04
    permissions:
      id-token: write
      contents: write
      attestations: write
    strategy:
      fail-fast: false
      matrix:
        cuda_config:
          - ver: 13.1.1
            short: cu131
            arch: 75;80;
        pyver: ["3.13", "3.14"]

    steps:
      - name: Checkout repository
        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5
        with:
          submodules: recursive

      - name: Setup uv and Python
        uses: astral-sh/setup-uv@5a7eac68fb9809dea845d802897dc5c723910fa3 # v7
        with:
          version: "latest"
          enable-cache: true
          python-version: ${{ matrix.pyver }}

      - name: Verify GLIBC version
        run: ldd --version

      - name: Verify installation
        run: |
          python --version
          uv --version
          cmake --version
          nvcc --version
          nvidia-smi || true

      - name: Build Wheel
        working-directory: vendor/llama-cpp-python
        env:
          VERBOSE: 1
          CMAKE_ARGS: >-
            -DGGML_CUDA=on
            -DLLAVA_BUILD=off
            -DCMAKE_CUDA_ARCHITECTURES=${{ matrix.cuda_config.arch }}
            -DGGML_CUDA_FORCE_MMQ=OFF
            -DGGML_AVX2=off
            -DGGML_FMA=off
            -DGGML_F16C=off
            -DLLAMA_BUILD_EXAMPLES=OFF
            -DLLAMA_BUILD_TESTS=OFF
            -DLLAMA_BUILD_SERVER=OFF
            -DCMAKE_EXE_LINKER_FLAGS="-L/usr/local/cuda/lib64/stubs -lcuda"
        run: |
          # Rename the release version with +cu{cuda_ver} build tag
          sed -i 's/__version__ = "\([^"]*\)"/__version__ = "\1+${{ matrix.cuda_config.short }}"/' llama_cpp/__init__.py
          
          # Build wheel using uv
          uv -v build --no-create-gitignore --python cpython@${{ matrix.pyver }} --wheel

      - name: List built wheels
        run: ls -lh vendor/llama-cpp-python/dist/*.whl

      - name: Upload Release Assets
        uses: softprops/action-gh-release@5be0e66d93ac7ed76da52eca8bb058f665c3a5fe # v2
        with:
          files: vendor/llama-cpp-python/dist/*.whl
          tag_name: ${{ github.ref_name }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      - name: Attest Build Provenance
        uses: actions/attest-build-provenance@977bb373ede98d70efdf65b84cb5f73e068dcc2a # v3
        with:
          subject-path: 'vendor/llama-cpp-python/dist/*.whl'

Key Improvements

  1. Removed PowerShell - All scripts now use clean bash syntax
  2. Uses setup-uv action - Handles Python installation automatically, no manual symlinking needed
  3. Inline matrix definition - No separate define_matrix job required
  4. CUDA container - Direct nvidia/cuda devel image, eliminating mamba/conda dependency
  5. Clean CMAKE_ARGS - Multi-line YAML syntax with >- for better readability
  6. Removed unused releasetag - Simplified matrix to only include necessary variables

Preserved Features

  1. uv build command - uv -v build --no-create-gitignore --python cpython@${{ matrix.pyver }} --wheel
  2. Custom naming convention - +cu131 suffix via sed replacement
  3. Release upload - Using softprops/action-gh-release
  4. Provenance attestation - Supply chain security with actions/attest-build-provenance
  5. Same trigger - workflow_dispatch
  6. Same permissions - contents: write, id-token: write, attestations: write
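
The version rewrite listed under Preserved Features can be tried in isolation. The following is a minimal shell sketch, assuming GNU sed (for `-i` without a suffix argument). The temporary file stands in for llama_cpp/__init__.py, and the version 1.2.3 and the variable names `init_file` and `cuda_short` are placeholders for illustration, not values taken from the repository.

```shell
#!/bin/sh
set -e

# Hypothetical stand-in for llama_cpp/__init__.py with a placeholder version.
workdir="$(mktemp -d)"
init_file="$workdir/__init__.py"
printf '__version__ = "1.2.3"\n' > "$init_file"

# In the workflow this value comes from matrix.cuda_config.short.
cuda_short="cu131"

# Capture the existing version string and append +<cuda_short> as a local version tag.
sed -i "s/__version__ = \"\([^\"]*\)\"/__version__ = \"\1+${cuda_short}\"/" "$init_file"

cat "$init_file"
```

Running the sketch should leave the file reading `__version__ = "1.2.3+cu131"`, which is the same shape of suffix the workflow applies before `uv build`.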

Testing

After merging this PR, please test the workflow by:

  1. Creating a new tag or using an existing one
  2. Manually triggering the workflow via workflow_dispatch
  3. Verifying that wheels are built with the correct naming convention (e.g., +cu131 suffix)
  4. Confirming that the wheels are uploaded to the release
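
Step 3 above can be automated with a small filename check. This is only a sketch: the function name `check_wheel_names` and the example wheel filename are hypothetical, and it assumes the `+cu131` local version appears in the wheel filename immediately before the first `-` that follows the version.

```shell
#!/bin/sh

# Succeeds only if every given wheel filename carries the +<tag> local version.
check_wheel_names() {
  tag="$1"
  shift
  for whl in "$@"; do
    case "$(basename "$whl")" in
      *"+${tag}-"*.whl) ;;  # tag present, keep checking
      *) echo "missing +${tag} suffix: $whl" >&2; return 1 ;;
    esac
  done
  return 0
}

# Hypothetical filename used purely for illustration.
check_wheel_names cu131 "dist/llama_cpp_python-1.2.3+cu131-cp313-cp313-linux_x86_64.whl" \
  && echo "wheel names ok"
```

In the workflow, a check along these lines could run as `check_wheel_names cu131 vendor/llama-cpp-python/dist/*.whl` right after the build step, so that a missing suffix fails the job before anything is uploaded.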

Code Guidelines

  • Follows Python conventions
  • Optimized for code simplicity
  • Uses bash throughout (no PowerShell)
  • Structured for MLOps workflows

This pull request was created from Copilot chat.


Co-authored-by: chamalgomes <106123134+chamalgomes@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Modernize llama-build-cuda workflow using cleaner approaches" to "Modernize CUDA build workflow: remove PowerShell, inline matrix, drop conda" Feb 25, 2026
Copilot AI requested a review from chamalgomes February 25, 2026 09:45
@chamalgomes chamalgomes marked this pull request as ready for review February 25, 2026 09:48
Copilot AI review requested due to automatic review settings February 25, 2026 09:48
Contributor

Copilot AI left a comment

Pull request overview

Modernizes the CUDA wheel build GitHub Actions workflow by consolidating the matrix into a single job, switching from PowerShell to bash, and building inside an NVIDIA CUDA container to remove the conda/mamba toolchain setup.

Changes:

  • Removes the separate define_matrix job and inlines the build matrix into build_wheels.
  • Replaces PowerShell scripting with bash (including version tag rewriting via sed).
  • Uses an nvidia/cuda devel container image and adds a CUDA stub linker flag for builds.

        cuda_config:
          - ver: 13.1.1
            short: cu131
            arch: 75;80;
Copilot AI Feb 25, 2026

matrix.cuda_config.arch is set to 75;80; with a trailing semicolon. In CMake, semicolon-separated lists with a trailing ; typically introduce an empty list element, which can make CMAKE_CUDA_ARCHITECTURES include an empty architecture and fail configuration. Drop the trailing semicolon (e.g., 75;80) or represent architectures as a YAML list and join them explicitly.

Suggested change
-           arch: 75;80;
+           arch: 75;80
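
As an alternative to relying on the matrix value being typed correctly, the architecture list could be normalized before it reaches CMAKE_ARGS. The following is a small sketch of that idea; the function name `normalize_arch_list` is hypothetical and not part of the workflow.

```shell
#!/bin/sh

# Collapse repeated semicolons and drop leading/trailing ones,
# so a value like "75;80;" cannot yield an empty list element.
normalize_arch_list() {
  printf '%s\n' "$1" | tr -s ';' | sed -e 's/^;//' -e 's/;$//'
}

normalize_arch_list '75;80;'   # prints 75;80
```

The result could then be passed as `-DCMAKE_CUDA_ARCHITECTURES="$(normalize_arch_list "$ARCH")"`, where `ARCH` is an assumed environment variable holding the matrix value.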

Contributor Author

Fixed in commit 3bff142 — added an "Install system dependencies" step as the very first step (before checkout) that runs apt-get install -y software-properties-common git curl build-essential cmake libssl-dev. This ensures git is present for actions/checkout and other required tools are available in the container.

Comment on lines +10 to +13
    name: Build Wheel CUDA ${{ matrix.cuda_config.ver }} Py ${{ matrix.pyver }}
    runs-on: ubuntu-latest
    container:
      image: nvidia/cuda:${{ matrix.cuda_config.ver }}-devel-ubuntu24.04
Copilot AI Feb 25, 2026

Running the job in an nvidia/cuda:* container means all steps execute inside that image. This workflow relies on several JavaScript/composite actions (actions/checkout, setup-uv, action-gh-release, attest-build-provenance) which require tooling like node (and git for checkout) to be present in the container. The nvidia/cuda images don’t guarantee those dependencies, so the workflow can fail early. Consider switching to a custom CUDA build image that includes node + git (and any build deps like cmake), or avoid a job-level container and instead run the CUDA build via docker run in a run: step.

Contributor Author

Fixed in commit 3bff142 — changed arch: 75;80; to arch: 75;80 to remove the trailing semicolon.

@chamalgomes
Owner

@copilot I think we need to install the essential system runtime dependencies; otherwise even the checkout action fails, stating that git is required. Maybe something like this: software-properties-common git curl build-essential cmake libssl-dev? I'm not sure whether the uv setup will work; if it does not, you could use the following shell command to install uv inline: "curl -LsSf https://astral.sh/uv/install.sh | sh"

… install

Co-authored-by: chamalgomes <106123134+chamalgomes@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 25, 2026

@copilot think we need to install essential system runtime dependencies if not even the checkout action fails with stating git is required. So maybe something like this maybe ? softw...

Done in commit 3bff142:

  • Added apt-get install -y software-properties-common git curl build-essential cmake libssl-dev as the first step so git is available for checkout
  • Replaced the setup-uv action with inline curl install (curl -LsSf https://astral.sh/uv/install.sh | sh) followed by echo "$HOME/.local/bin" >> $GITHUB_PATH to register uv on the PATH for all subsequent steps
  • Python is then installed via uv python install ${{ matrix.pyver }}
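
The three changes above can be read together as one setup script. The following is a hedged sketch rather than the committed workflow: the function names and the `RUN_SETUP` and `PYVER` variables are hypothetical, and it assumes a Debian/Ubuntu-based image running as root, with uv installed to `$HOME/.local/bin` as the PATH line in the comment implies.

```shell
#!/bin/sh
set -e

# System packages named in the discussion; git is needed before checkout.
install_system_deps() {
  apt-get update
  apt-get install -y software-properties-common git curl build-essential cmake libssl-dev
}

# Inline uv install, using the command quoted in the thread.
install_uv() {
  curl -LsSf https://astral.sh/uv/install.sh | sh
}

# Append uv's install directory to the file named by GITHUB_PATH,
# which GitHub Actions reads to extend PATH for all later steps.
register_uv_on_path() {
  echo "$HOME/.local/bin" >> "$GITHUB_PATH"
}

# RUN_SETUP is a hypothetical guard so that loading this file does not
# install anything by accident; PYVER stands in for matrix.pyver.
if [ "${RUN_SETUP:-0}" = "1" ]; then
  install_system_deps
  install_uv
  register_uv_on_path
  "$HOME/.local/bin/uv" python install "${PYVER:-3.13}"
fi
```

Calling uv by its full path in the last line reflects that entries written to GITHUB_PATH only take effect in subsequent steps, not in the step that writes them.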
