Feat-Infra: Add Cloud/RunPod Docker Support with Automated Builds #1569
Open
FNGarvin wants to merge 6 commits into deepbeepmeep:main
Conversation
Author
Reviewer's Guide

Containerization and infrastructure overhaul for Wan2GP: introduces a multi-stage Docker build targeting CUDA 12.8 and 13.0, runtime GPU/profile auto-detection with SageAttention 2++ wheel selection, optional SSH and filebrowser services, and CI workflows to build and publish Docker images and SageAttention/Blackwell wheels for cloud (RunPod) deployments.

Sequence diagram for container startup and Wan2GP launch:

```mermaid
sequenceDiagram
    actor Operator
    participant Pod as RunPod_container
    participant Entrypoint as entrypoint_sh
    participant SSHD as sshd
    participant FB as filebrowser
    participant GPU as nvidia_smi
    participant Sage as Sage_wheels_installer
    participant BW as Blackwell_kernel_installer
    participant App as wgp_py
    Operator->>Pod: Start Wan2GP image
    Pod->>Entrypoint: Invoke /entrypoint.sh
    Entrypoint->>Entrypoint: Sanitize env
    Entrypoint->>Entrypoint: Configure cache and telemetry
    Entrypoint->>SSHD: Start sshd (if SSH_PORT set)
    Entrypoint->>FB: Start filebrowser (if FILEBROWSER_PORT set)
    Entrypoint->>GPU: Query GPU name and VRAM
    GPU-->>Entrypoint: GPU_NAME, VRAM_GB
    Entrypoint->>Entrypoint: Derive PROFILE and ATTN
    Entrypoint->>Entrypoint: Apply WGP_PROFILE and WGP_ATTENTION overrides
    Entrypoint->>Sage: Detect compute_capability via nvidia-smi
    Sage-->>Entrypoint: Matching SageAttention wheel path
    Entrypoint->>Sage: pip install selected wheel
    Entrypoint->>BW: Check Blackwell and CUDA 13.0
    BW-->>Entrypoint: NVFP4 kernel wheel (if available)
    Entrypoint->>BW: pip install NVFP4 kernel
    Entrypoint->>App: python3 wgp.py --listen --profile PROFILE --attention ATTN WGP_ARGS "$@"
    App-->>Operator: Serve UI on port 7860
```
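The "Derive PROFILE and ATTN" step above can be sketched in shell. This is a hypothetical sketch only: the VRAM thresholds, profile numbers, and backend names below are illustrative assumptions, not the PR's actual entrypoint.sh logic.

```shell
#!/bin/sh
# Hypothetical sketch of the PROFILE/ATTN derivation in entrypoint.sh.
# Thresholds, profile numbers, and backend names are illustrative guesses.

derive_profile() {
    vram_gb="$1"
    # Lower WanGP profile numbers assume more VRAM headroom (illustrative mapping).
    if [ "$vram_gb" -ge 40 ]; then
        echo 1
    elif [ "$vram_gb" -ge 24 ]; then
        echo 3
    else
        echo 4
    fi
}

derive_attention() {
    gpu_name="$1"
    # Choose an attention backend by GPU family (illustrative).
    case "$gpu_name" in
        *"RTX 50"*|*B200*|*H100*|*H200*) echo sage2 ;;  # Blackwell / Hopper
        *"RTX 30"*|*"RTX 40"*|*A100*)    echo sage  ;;  # Ampere / Ada
        *)                               echo sdpa  ;;  # safe fallback
    esac
}

# In the real container, GPU_NAME and VRAM_GB would come from nvidia-smi queries;
# explicit WGP_PROFILE / WGP_ATTENTION overrides win over auto-detection.
PROFILE="${WGP_PROFILE:-$(derive_profile "${VRAM_GB:-16}")}"
ATTN="${WGP_ATTENTION:-$(derive_attention "${GPU_NAME:-unknown}")}"
echo "profile=$PROFILE attention=$ATTN"
```

The override-last pattern mirrors the diagram: auto-detection supplies defaults, and the operator's environment variables always take precedence.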
Flow diagram for SageAttention wheel factory and Blackwell kernel workflows:

```mermaid
graph TD
    A[workflow_dispatch<br>sage-wheels or sage-wheels-cu13] --> B[initialize-release job]
    B --> B1[Checkout repo]
    B1 --> B2[Parse Dockerfile to get SAGE_VERSION]
    B2 --> B3[Compute release tag<br>sage-vX.Y.Z-cu128_or_cu130-cp312]
    B3 --> B4[Create or refresh prerelease<br>on GitHub Releases]
    B4 --> C[build-wheels matrix job]
    subgraph MatrixBuilds
        C --> C1[Variant ampere-ada-rtx-30-40<br>CUDA_ARCHITECTURES 8.0;8.6;8.9]
        C --> C2[Variant hopper-h100-h200<br>CUDA_ARCHITECTURES 9.0;10.0]
        C --> C3[Variant blackwell-rtx-50<br>CUDA_ARCHITECTURES 12.0+PTX]
    end
    C1 --> D1[Build Docker target sage-tools<br>with docker-build-push-action]
    C2 --> D2[Build Docker target sage-tools]
    C3 --> D3[Build Docker target sage-tools]
    D1 --> E1[Run container and copy /tmp/sa_dist/*.whl]
    D2 --> E2[Run container and copy /tmp/sa_dist/*.whl]
    D3 --> E3[Run container and copy /tmp/sa_dist/*.whl]
    E1 --> F1[Rename wheel with suffix ampere.ada.rtx30.40]
    E2 --> F2[Rename wheel with suffix hopper.h100.h200]
    E3 --> F3[Rename wheel with suffix blackwell.rtx50]
    F1 --> G[Upload wheel assets to release tag]
    F2 --> G
    F3 --> G
    subgraph BlackwellKernelsWorkflow
        H[workflow_dispatch<br>blackwell-kernels] --> I[Build Docker target blackwell-tools<br>from Dockerfile.cu13]
        I --> J[Extract /tmp/bw_dist/*.whl<br>NVFP4 kernels]
        J --> K[Ensure blackwell-kernels release exists]
        K --> L[Upload kernel wheels to blackwell-kernels release]
    end
```
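The matrix portion of this diagram might be expressed roughly as follows. This is a condensed, illustrative sketch, not the PR's actual workflow YAML; job names, action versions, and paths are assumptions:

```yaml
# Illustrative sketch of the wheel-factory matrix (not the PR's actual file)
name: sage-wheels
on: workflow_dispatch
jobs:
  build-wheels:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        include:
          - variant: ampere-ada-rtx-30-40
            cuda_archs: "8.0;8.6;8.9"
          - variant: hopper-h100-h200
            cuda_archs: "9.0;10.0"
          - variant: blackwell-rtx-50
            cuda_archs: "12.0+PTX"
    steps:
      - uses: actions/checkout@v4
      - name: Build sage-tools stage
        uses: docker/build-push-action@v6
        with:
          target: sage-tools
          build-args: |
            CUDA_ARCHITECTURES=${{ matrix.cuda_archs }}
          load: true
          tags: sage-tools:${{ matrix.variant }}
      - name: Extract wheels
        run: |
          mkdir -p dist
          docker run --rm -v "$PWD/dist:/out" "sage-tools:${{ matrix.variant }}" \
            sh -c 'cp /tmp/sa_dist/*.whl /out/'
```

Each matrix leg then renames its wheel with an architecture suffix and uploads it to the computed release tag, as the diagram shows.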
Author
Hello from Reddit.
Add Cloud/RunPod Docker Support with Automated Builds
Hiya, Deep. Long-time fan of your amazing project and the motivation behind it. I've been recommending it frequently for quite some time. It's amazing how much leading edge tech you've packed into it. Thanks.
Summary
This PR adds first-class container support for local or cloud deployment, including CUDA 13.0 Blackwell (sm_12x) support. It features an automated build pipeline that compiles and publishes architecture-specific SageAttention wheels and Blackwell NVFP4 kernels as release assets. The final container image is ultra-mobile, bundling an SSH daemon and a web-based file manager, and is automatically published to GHCR on every commit. A step-by-step RunPod Deployment Guide is now included to help users get started quickly. The two images (CUDA 12.8 and CUDA 13.0) both have the appropriate Nunchaku kernels baked in.
The changes are entirely additive: nothing in the existing codebase, install scripts, or local workflow is touched. Users who run WanGP locally or via the existing run-docker-cuda-deb.sh script are completely unaffected. But should the PR be accepted, there will be an "Official" container image available at ghcr.io/deepbeepmeep/wan2gp, and it will always be up-to-date with almost zero extra maintenance effort. Installing it will be as simple as:

```
docker run --gpus all -p 7860:7860 -v /my/wangp_storage:/workspace ghcr.io/deepbeepmeep/wan2gp
```

I recognize that this PR seems like A LOT. But it basically boils down to merging, clicking a few action buttons to run the initial wheel builds, and from then on always having automatic Docker images. You can even use the image builds as a way of sanity-checking future commits or merges. I tried very hard to make this one come with no downsides.
Motivation
The existing Dockerfile is a solid foundation, but it has a few limitations that make cloud deployment awkward:

- Hopper (H100) and Blackwell (RTX 50xx) users get no pre-compiled CUDA kernels and fall back to slower paths.
- The run-docker-cuda-deb.sh script (which has great GPU-detection logic) is designed for local use: it assumes the project is bind-mounted from the host. Cloud platforms like RunPod require the code to be copied into the image.
- Building the image from scratch takes 30–60 minutes on a laptop due to the SageAttention CUDA compilation.

This PR fixes all three.
What Changed
New: Automated Build Workflow (.github/workflows/docker-build.yml)

A GitHub Actions workflow that automatically builds and publishes the Docker image to the GitHub Container Registry (ghcr.io/deepbeepmeep/wan2gp) whenever code is pushed: a push to main publishes the :latest tag.

What this means in practice: once merged, users can pull a pre-built, ready-to-run image with a single command rather than compiling for an hour themselves.
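Concretely, using the image name from this PR (the volume path is a placeholder for your own storage directory, and requires Docker with the NVIDIA container toolkit):

```
docker pull ghcr.io/deepbeepmeep/wan2gp:latest
docker run --gpus all -p 7860:7860 -v /my/wangp_storage:/workspace ghcr.io/deepbeepmeep/wan2gp:latest
```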
Modified: Dockerfile (Three-Stage Optimized Architecture)

The Dockerfile has been refactored into a three-stage build to solve the "45-minute recompile" problem while ensuring maximum GPU compatibility.

Stage 1: base (Shared Foundation)
Contains the common environment (Ubuntu 24.04 + CUDA 12.8 + PyTorch 2.10.0+cu128 + the uv installer). Both local and CI builds start here.

Stage 2: sage-tools (The Compiler, Robot Only)
A heavy-duty compiler stage used strictly by the automated GitHub "Sage Wheels" workflow. It compiles SageAttention for all four GPU generations. This stage is never run by the user or the primary CI build.

Stage 3: sage-compile (The Production Stage)
The actual production image. It downloads the three architecture-specific wheels produced by the wheel factory directly from your GitHub Releases. This keeps the primary build time under 10 minutes.
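The three stages can be visualized as a skeleton. The stage names come from the PR; the base image tag and stage bodies below are illustrative placeholders, not the PR's actual Dockerfile:

```dockerfile
# Illustrative skeleton only
FROM nvidia/cuda:12.8.0-devel-ubuntu24.04 AS base
# Shared foundation: Python, PyTorch 2.10.0+cu128, uv installer, ...

FROM base AS sage-tools
# CI-only compiler stage: builds SageAttention wheels for every GPU generation.
# Never built by end users or by the primary image workflow.

FROM base AS sage-compile
# Production stage: installs prebuilt wheels from GitHub Releases
# instead of compiling, keeping builds under ~10 minutes.
COPY . /workspace/Wan2GP
ENTRYPOINT ["/entrypoint.sh"]
```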
Native coverage expanded from Ampere to Blackwell:

- A lightx2v_kernel factory for high-performance FP4 inference on 50-series GPUs.
- entrypoint.sh installs the optimal one at boot.

New: Step-by-Step RunPod Guide
To make cloud deployment as frictionless as possible, I've added a visual guide (docs/RUNPOD-HOWTO.md) that walks users through the whole setup. I would be happy to set up a public "Official Wan2GP" template as pictured in the docs here, but would want your blessing.
The SageAttention Version Pin — and How to Upgrade It
SageAttention is pinned to a specific release (v2.2.0, commit eb615cf6) in the Dockerfile:

```
git clone --branch v2.2.0 --depth 1 \
    https://github.com/thu-ml/SageAttention.git /tmp/SageAttention
```

Why pin it? Without a pin, every automated build pulls the latest commit from SageAttention's main branch. In a fast-moving repo, that means a breaking change upstream can silently make our image fail to build, potentially days after it was introduced, mid-release, with no obvious cause. Pinning gives us a reproducible, auditable build: the image built today and the image built six months from now will compile the same SageAttention code.
How to upgrade it:

Upgrading SageAttention is a one-line change in the Dockerfile. When a new release of SageAttention is published at https://github.com/thu-ml/SageAttention/releases:

1. Note the new tag (e.g. v2.3.0) and its full commit SHA (shown on the releases page).
2. Open the Dockerfile and find the git clone line above.
3. Change v2.2.0 to the new tag.
4. Push the change; CI will build and publish a fresh image. The old cached layers are reused up to that point.
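Assuming a hypothetical v2.3.0 release, the one-line change would look like:

```diff
-git clone --branch v2.2.0 --depth 1 \
+git clone --branch v2.3.0 --depth 1 \
     https://github.com/thu-ml/SageAttention.git /tmp/SageAttention
```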
What about automatic upgrade notifications?
This PR also includes a .github/dependabot.yml file. Dependabot is a free GitHub service that opens pull requests to bump dependency versions automatically. It is configured here to check GitHub Actions references weekly and open a PR if any of the CI action versions have newer releases: completely automated, zero configuration required, one click to merge.

Dependabot cannot currently bump git clone refs inside Dockerfiles, so the SageAttention tag remains a manual one-liner as described above. Upgrading it takes about 60 seconds of human effort.
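For reference, a github-actions Dependabot configuration of the kind described is only a few lines in the standard format (the PR's actual file may differ slightly):

```yaml
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
```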
Caching: How Builds Stay Fast After the First Run
The first automated build is slow, primarily because of the SageAttention CUDA compilation, which takes ~45 minutes on a standard CI runner. Every build after that is fast because of BuildKit layer caching:

BuildKit (the Docker build engine) saves each completed build stage as a snapshot. On the next build, it checks whether anything that affects a stage has changed. If not, it restores the stage from cache in seconds rather than re-running it.

Here is what triggers a rebuild of each stage:

- Changing any .py or .sh file (including entrypoint.sh) re-runs only the final source-copy step.
- Changing requirements.txt re-runs the dependency install.
- Changing the Dockerfile can invalidate any stage, up to and including the expensive compile steps.

In practice, the vast majority of commits are source-code changes: those hit the cache on the expensive stage and only re-run the trivial copy step.
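This cache behavior follows from layer ordering: rarely-changing inputs come first, frequently-changing source last. An illustrative arrangement (not the PR's actual Dockerfile):

```dockerfile
# Rarely-changing layers first, so frequent source commits only invalidate the last step
COPY requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt
# Frequently-changing source last; cheap to re-run on every commit
COPY . /workspace/Wan2GP
```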
New: sage-wheels.yml (The Wheel Factory)

This workflow compiles SageAttention and publishes it to GitHub Releases.
One-Time Setup Required (Post-Merge)
To ensure the Docker image can pull the necessary optimized binaries, the following GitHub Actions must be run manually once after merging this PR:
1. Blackwell Static Kernel (cu130): builds lightx2v_kernel (NVFP4) for RTX 50-series and publishes it to the blackwell-kernels release.
2. SageAttention Factory (cu130): publishes wheels to the sage-v<version>-cu130-cp312 release.
3. SageAttention Pre-built Wheel (cu128): publishes wheels to the sage-v<version>-cu128-cp312 release.

Tip: Run these via the Actions tab on GitHub by selecting the workflow and clicking Run workflow. Once these releases are populated, the main Docker builds will complete in ~3 minutes.
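If you prefer the command line, the same dispatches can likely be triggered with the GitHub CLI. The workflow file names come from this PR's Files Changed list; this assumes gh is installed and authenticated against the repo:

```
gh workflow run blackwell-kernels.yml
gh workflow run sage-wheels-cu13.yml
gh workflow run sage-wheels.yml
```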
Testing
To test locally before pulling the published image, build and run the image yourself from the repository root.
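A local smoke test might look like this (illustrative tag and volume path; requires Docker with the NVIDIA container toolkit):

```
docker build -t wan2gp:local .
docker run --rm --gpus all -p 7860:7860 -v "$PWD/wangp_storage:/workspace" wan2gp:local
# then open http://localhost:7860
```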
Files Changed
- Dockerfile
- Dockerfile.cu13
- entrypoint.sh
- .github/workflows/docker-build.yml
- .github/workflows/docker-build-cu13.yml
- .github/workflows/blackwell-kernels.yml
- .github/workflows/sage-wheels.yml
- .github/workflows/sage-wheels-cu13.yml
- .github/dependabot.yml
- .dockerignore
- .gitignore
- docs/RUNPOD-HOWTO.md
- docs/images/*.jpg
- .yamllint
- requirements.txt (adds the scipy and safetensors dependencies)