Support Data Center precompiled driver container for Arm (Ubuntu 24.04)#533

Draft
shivakunv wants to merge 10 commits into main from precompiled-arm-support

@shivakunv
Contributor

@shivakunv shivakunv commented Jan 6, 2026

Code Changes Summary:

  • Platform Support
    Added support for the ARM64 platform.
    AMD64 remains the default architecture.

  • Artifacts Update
    ARM64 build artifacts are now uploaded with the -arm64 suffix.

  • Instance Type and Region Mapping
    g4dn.xlarge:
    Architecture: AMD64
    Supported Region: us-west-1
    Used for AMD64 builds.

    g5g.xlarge:
    Architecture: ARM64
    Supported Region: us-west-2
    Used for ARM64 builds.
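
The mapping above could be expressed as a small lookup, shown here as a minimal shell sketch rather than code from this PR. The function names (`instance_type`, `instance_region`, `artifact_suffix`) are hypothetical, and the empty suffix for AMD64 is an assumption based on AMD64 remaining the default architecture.

```shell
# Hypothetical helpers; names are illustrative and not taken from the PR.

# instance_type ARCH: instance type used for builds of that architecture.
instance_type() {
  case "$1" in
    amd64) echo "g4dn.xlarge" ;;
    arm64) echo "g5g.xlarge" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# instance_region ARCH: region in which that instance type is used.
instance_region() {
  case "$1" in
    amd64) echo "us-west-1" ;;
    arm64) echo "us-west-2" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# artifact_suffix ARCH: suffix appended to uploaded artifact names.
artifact_suffix() {
  case "$1" in
    amd64) echo "" ;;        # assumption: the default architecture keeps the unsuffixed name
    arm64) echo "-arm64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}
```

For example, `instance_type arm64` prints `g5g.xlarge`, and an unknown architecture returns a non-zero status so a calling CI step fails early.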

Passed pipeline: https://github.com/NVIDIA/gpu-driver-container/actions/runs/22180871853

Passed pipeline: https://github.com/NVIDIA/gpu-driver-container/actions/runs/22337833186

@shivakunv shivakunv changed the title Precompiled arm support Support Data Center precompiled driver container for Arm (Ubuntu 24.04) Jan 6, 2026
@shivakunv shivakunv force-pushed the precompiled-arm-support branch 2 times, most recently from 6405d48 to 574ce43 Compare January 14, 2026 17:22
@shivakunv shivakunv force-pushed the precompiled-arm-support branch 4 times, most recently from 20726a8 to 46aa0d1 Compare February 12, 2026 12:07
@shivakunv shivakunv force-pushed the precompiled-arm-support branch 3 times, most recently from c008150 to b684015 Compare February 19, 2026 13:11
@shivakunv shivakunv marked this pull request as ready for review February 19, 2026 13:12

  - name: Set up Holodeck
-   uses: NVIDIA/holodeck@v0.2.18
+   uses: NVIDIA/holodeck@main
Contributor Author

I will update it and specify the actual version once @ArangoGutierrez releases the new version of Holodeck.

@shivakunv shivakunv self-assigned this Feb 19, 2026
Copilot AI left a comment

Pull request overview

This pull request adds ARM64 (aarch64) platform support to the Ubuntu 24.04 precompiled driver container builds, while maintaining AMD64 as the default architecture. The changes enable multi-platform Docker builds and update the CI/CD pipeline to handle both architectures.

Changes:

  • Added ARM64 platform support for Ubuntu 24.04 precompiled driver containers with architecture-specific package handling
  • Updated CI workflow to build, test, and publish both AMD64 and ARM64 artifacts with platform-specific suffixes
  • Modified Holodeck test infrastructure to support ARM64 instances (g5g.xlarge in us-west-2) and Ubuntu 24.04 OS specification

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| ubuntu24.04/precompiled/nvidia-driver | Added conditional installation of libnvidia-fbc1 package (AMD64 only) |
| ubuntu24.04/precompiled/local-repo.sh | Added conditional downloads for ARM64-incompatible packages (linux-signatures-nvidia, libnvidia-fbc1) |
| ubuntu24.04/precompiled/Dockerfile | Made i386 architecture and CUDA repository URLs conditional based on target architecture |
| tests/scripts/findkernelversion.sh | Added optional PLATFORM_SUFFIX parameter for artifact matching and platform-specific manifest inspection |
| tests/scripts/ci-precompiled-helpers.sh | Added PLATFORM_SUFFIX parameter support for kernel version testing |
| tests/holodeck_ubuntu24.04.yaml | Removed file (merged into holodeck_ubuntu.yaml) |
| tests/holodeck_ubuntu.yaml | Removed hardcoded ingressIpRanges and AMI, added OS specification support |
| multi-arch.mk | Removed AMD64-only platform restriction for ubuntu24.04 builds |
| Makefile | Added DOCKER_BUILD_PLATFORM_OPTIONS to base image build targets |
| .github/workflows/precompiled.yaml | Added platform matrix dimension, platform-aware artifact naming, ARM64 e2e testing with appropriate instance types, and Holodeck version update |
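
The architecture-conditional package handling summarized for `nvidia-driver` and `local-repo.sh` might look roughly like the sketch below. This is an illustration under stated assumptions, not the PR's code: the function name `amd64_only_packages` is hypothetical, and the exact package name suffixes (driver branch, kernel version) are assumed for the example.

```shell
# Hypothetical helper; not taken from the PR.
# amd64_only_packages ARCH DRIVER_BRANCH KERNEL_VERSION
# Prints the packages that are only requested on amd64, and nothing on arm64.
amd64_only_packages() {
  case "$1" in
    amd64)
      # assumption: package names follow the <name>-<branch> / <name>-<kernel> pattern
      echo "libnvidia-fbc1-$2 linux-signatures-nvidia-$3"
      ;;
    arm64)
      # these packages are described as ARM64-incompatible, so request none
      echo ""
      ;;
    *)
      echo "unsupported architecture: $1" >&2
      return 1
      ;;
  esac
}

# Possible usage inside a container build, where dpkg reports the architecture:
#   extra=$(amd64_only_packages "$(dpkg --print-architecture)" "$DRIVER_BRANCH" "$KERNEL_VERSION")
```

Keeping the list in one function means the install script and the local-repo download script would skip the same packages on ARM64.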

  - name: Set up Holodeck
-   uses: NVIDIA/holodeck@v0.2.18
+   uses: NVIDIA/holodeck@main
Copilot AI Feb 19, 2026

The Holodeck version has been changed from a pinned version (v0.2.18) to @main, which is not a best practice for CI/CD workflows. Using @main introduces unpredictability as the main branch could contain breaking changes at any time. The other workflow file (.github/workflows/ci.yaml) uses NVIDIA/holodeck@v0.2.18. Consider using a specific pinned version or tag instead of @main for stability and reproducibility.

Suggested change
-   uses: NVIDIA/holodeck@main
+   uses: NVIDIA/holodeck@v0.2.18

@shivakunv shivakunv force-pushed the precompiled-arm-support branch 3 times, most recently from ee1265d to 49429dd Compare February 21, 2026 08:03
@shivakunv shivakunv marked this pull request as draft February 23, 2026 15:33
@shivakunv shivakunv force-pushed the precompiled-arm-support branch from 32e68a1 to cdbfe9a Compare February 24, 2026 05:21
@shivakunv shivakunv marked this pull request as ready for review February 24, 2026 06:48
@shivakunv shivakunv force-pushed the precompiled-arm-support branch from cdbfe9a to e224399 Compare February 25, 2026 04:14
- dist: ubuntu24.04
  driver_branch: 535
- dist: ubuntu24.04
  driver_branch: 570
Contributor Author

Refer to .common.yaml.

BASE_IMAGE_TAG="${PRIVATE_REGISTRY}/nvidia/driver:base-${BASE_TARGET}-${LTS_KERNEL}-${KERNEL_FLAVOR}-${{ matrix.driver_branch }}"
docker tag ${BASE_IMAGE_TAG} ${BASE_IMAGE_TAG}-${{ env.PLATFORM_NAME }}
docker push "${BASE_IMAGE_TAG}-${{ env.PLATFORM_NAME }}"
docker buildx imagetools create -t "${BASE_IMAGE_TAG}" --append "${BASE_IMAGE_TAG}-${{ env.PLATFORM_NAME }}" || docker buildx imagetools create -t "${BASE_IMAGE_TAG}" "${BASE_IMAGE_TAG}-${{ env.PLATFORM_NAME }}"
Contributor Author

Multi-arch support.
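
The tag, push, and append-or-create sequence quoted above can be read as the following sketch. It restates the same three steps as a reusable function; the name `publish_platform_image` and the `DOCKER` override are hypothetical additions for illustration, while `docker buildx imagetools create` with `-t` and `--append` is the same command the diff uses.

```shell
# Hypothetical helper; restates the quoted workflow steps, not the PR's code.
# publish_platform_image IMAGE_TAG PLATFORM_NAME
# Set DOCKER=echo to print the commands instead of running them.
publish_platform_image() {
  ppi_tag="$1"
  ppi_platform_tag="$1-$2"

  # Give the locally built image a platform-specific tag and push it.
  "${DOCKER:-docker}" tag "$ppi_tag" "$ppi_platform_tag" || return 1
  "${DOCKER:-docker}" push "$ppi_platform_tag" || return 1

  # Append the platform image to an existing manifest list under the shared tag.
  # If the append fails (for example, the shared tag does not exist yet),
  # fall back to creating the manifest list from this image alone.
  "${DOCKER:-docker}" buildx imagetools create -t "$ppi_tag" --append "$ppi_platform_tag" \
    || "${DOCKER:-docker}" buildx imagetools create -t "$ppi_tag" "$ppi_platform_tag"
}
```

Because each platform job appends to the same shared tag, the result depends on the jobs not racing each other, which is one practical argument for the single multi-arch build requested later in the review.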

DRIVER_IMAGE_TAG="${PRIVATE_REGISTRY}/nvidia/driver:${{ matrix.driver_branch }}-${{ env.KERNEL_VERSION }}"
docker tag ${DRIVER_IMAGE_TAG} ${DRIVER_IMAGE_TAG}-${{ env.PLATFORM_NAME }}
docker push "${DRIVER_IMAGE_TAG}-${{ env.PLATFORM_NAME }}"
docker buildx imagetools create -t "${DRIVER_IMAGE_TAG}" --append "${DRIVER_IMAGE_TAG}-${{ env.PLATFORM_NAME }}" || docker buildx imagetools create -t "${DRIVER_IMAGE_TAG}" "${DRIVER_IMAGE_TAG}-${{ env.PLATFORM_NAME }}"
Contributor Author

Multi-arch support.

Contributor

As mentioned in my earlier comment, let's do a single multiarch build instead of building individual platform-specific images and then merging them together.

@shivakunv shivakunv force-pushed the precompiled-arm-support branch from e224399 to 6c1bb37 Compare February 25, 2026 04:21
@tariq1890
Contributor

Note that we don't want separate images for arm. For the precompiled driver packages that have arm variants, we want to start building multi-arch images so that they support arm64 along with amd64.

@shivakunv
Contributor Author

> Note that we don't want separate images for arm. For the precompiled driver packages that have arm variants, we want to start building multi-arch images so that they support arm64 along with amd64

Yes, this PR already includes this feature.

Confirmed with the following commands:

docker pull ghcr.io/nvidia/driver:580-6.8.0-101-generic-ubuntu24.04 --platform=linux/amd64

docker pull ghcr.io/nvidia/driver:580-6.8.0-101-generic-ubuntu24.04 --platform=linux/arm64
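
Besides pulling each platform, the manifest list itself can be inspected. The sketch below is an assumption-labeled illustration, not part of the PR: `list_manifest_archs` is a hypothetical helper that extracts the `architecture` fields from the JSON printed by `docker manifest inspect`.

```shell
# Hypothetical helper; not taken from the PR.
# list_manifest_archs: read manifest-list JSON on stdin and print the
# distinct "architecture" values, one per line, sorted.
list_manifest_archs() {
  grep -o '"architecture"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed 's/.*"\([^"]*\)"$/\1/' \
    | sort -u
}

# Possible usage (requires Docker and access to the registry):
#   docker manifest inspect ghcr.io/nvidia/driver:580-6.8.0-101-generic-ubuntu24.04 \
#     | list_manifest_archs
```

For a multi-arch tag this should list both `amd64` and `arm64`. Images built with attestations may also show an `unknown` entry, which does not indicate a missing platform.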

kernel_flavors: ${{ steps.extract_driver_branch.outputs.kernel_flavors }}
dist: ${{ steps.extract_driver_branch.outputs.dist }}
lts_kernel: ${{ steps.extract_driver_branch.outputs.lts_kernel }}
platforms: ${{ steps.extract_driver_branch.outputs.platforms }}
Contributor

@tariq1890 tariq1890 Feb 25, 2026

Is it possible to avoid expanding the matrix? Adding a new matrix column increases the complexity of the CI manifests considerably.

Let's look at alternatives, please.

Signed-off-by: Shiva Kumar (SW-CLOUD) <shivaku@nvidia.com>
@shivakunv shivakunv force-pushed the precompiled-arm-support branch from 6c1bb37 to f694c5e Compare February 25, 2026 05:12
@tariq1890
Contributor

@shivakunv On putting more thought into this, can we do a multiarch build of the precompiled image instead of building the arm64 and amd64 images separately and then stitching them together?
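
For reference, a single multi-arch build of the kind requested here is usually done with one `docker buildx build` invocation that names both platforms. The sketch below is a generic illustration, not this repository's build: the function name `build_multiarch`, the `DOCKER` override, and the default build context are assumptions, and the repository's Makefile targets would likely wrap such a command differently.

```shell
# Hypothetical helper; a generic buildx invocation, not the repository's Makefile target.
# build_multiarch IMAGE_TAG [CONTEXT_DIR]
# Set DOCKER=echo to print the command instead of running it.
build_multiarch() {
  # One invocation builds both platforms and pushes a single manifest list,
  # so no per-platform tags or imagetools stitching are needed afterwards.
  "${DOCKER:-docker}" buildx build \
    --platform linux/amd64,linux/arm64 \
    --tag "$1" \
    --push \
    "${2:-.}"
}
```

Building a non-native platform this way requires either emulation (for example QEMU via binfmt) or a builder with a native node for each architecture, so the CI runner setup would need to account for that.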

DRIVER_BRANCHES=($(echo "$driver_branch_json" | jq -r '.[]'))
echo "DRIVER_BRANCHES=${DRIVER_BRANCHES[*]}" >> $GITHUB_ENV
-   - name: Set kernel version in holodeck_${{ env.DIST }}.yaml
+   - name: Configure Holodeck e2e test config (kernel, OS, instance)
Contributor

This PR already has a large diff. Let's revisit the Holodeck changes in a follow-up PR.

Contributor Author

These changes are required for arm64 (please check the yq replacement).

I will create a separate PR for holodeck changes that should be merged before this one, so that this PR will only include the arm64 changes.

Contributor Author

Holodeck PR: #620

@shivakunv shivakunv marked this pull request as draft February 26, 2026 09:40
@shivakunv shivakunv force-pushed the precompiled-arm-support branch 2 times, most recently from c0dd5f8 to eb9eebd Compare February 26, 2026 10:17
Signed-off-by: Shiva Kumar (SW-CLOUD) <shivaku@nvidia.com>
@shivakunv shivakunv force-pushed the precompiled-arm-support branch from eb9eebd to abcbd47 Compare February 26, 2026 11:20
Shiva Kumar added 8 commits February 26, 2026 17:37
Signed-off-by: Shiva Kumar (SW-CLOUD) <shivaku@nvidia.com>
Signed-off-by: Shiva Kumar (SW-CLOUD) <shivaku@nvidia.com>
Signed-off-by: Shiva Kumar (SW-CLOUD) <shivaku@nvidia.com>
Signed-off-by: Shiva Kumar (SW-CLOUD) <shivaku@nvidia.com>
Signed-off-by: Shiva Kumar (SW-CLOUD) <shivaku@nvidia.com>
Signed-off-by: Shiva Kumar (SW-CLOUD) <shivaku@nvidia.com>
Signed-off-by: Shiva Kumar (SW-CLOUD) <shivaku@nvidia.com>
Signed-off-by: Shiva Kumar (SW-CLOUD) <shivaku@nvidia.com>