
CUDA: only init NCCL for setups with multi GPU#21761

Open
EldarBorge wants to merge 1 commit into ggml-org:master from EldarBorge:master

@EldarBorge

Overview

Skips NCCL initialization on single-GPU setups, i.e. NCCL is only initialized if the GPU count is greater than 1.
This removes the additional VRAM usage on single-GPU setups that was introduced in b8738.

Fixes #21759

Tested locally: on my setup, VRAM usage both at idle and under load is back to pre-b8738 levels.
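
For illustration only, the guard described above can be sketched as follows. This is not the diff from this PR: `fake_comm`, `needs_nccl` and `init_comms` are hypothetical names, and the NCCL and CUDA calls are stubbed out so the snippet compiles without either library. In a real CUDA backend the device count would presumably come from `cudaGetDeviceCount()` and the communicators from something like `ncclCommInitAll()`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a communicator handle (the real type would be ncclComm_t).
struct fake_comm {
    int rank;
};

// Hypothetical helper: collectives are only useful when there is a peer GPU.
bool needs_nccl(int device_count) {
    assert(device_count >= 0);
    return device_count > 1;
}

// Hypothetical init routine. Assumption: creating communicators is what costs
// the extra VRAM, so with a single GPU it returns early and creates none.
std::vector<fake_comm> init_comms(int device_count) {
    std::vector<fake_comm> comms;
    if (!needs_nccl(device_count)) {
        return comms; // single GPU: skip initialization entirely
    }
    comms.reserve(static_cast<std::size_t>(device_count));
    for (int rank = 0; rank < device_count; ++rank) {
        comms.push_back(fake_comm{rank}); // stub for per-device communicator setup
    }
    return comms;
}
```

With this shape, a caller on a single-GPU machine gets an empty communicator list and has to treat that as "no collectives available", while multi-GPU setups behave as before.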

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: Yes, AI was used to bisect the bad commit and to investigate what in that commit caused higher VRAM usage.

@EldarBorge EldarBorge requested a review from a team as a code owner April 11, 2026 07:24
@JohannesGaessler
Contributor

#21746 should be a better fix.

@EldarBorge
Author

EldarBorge commented Apr 11, 2026

You are probably right, your fix is too advanced for me to understand! Let's wait for @ggerganov, but feel free to close my PR if it's not a good fix. 🙂

@EldarBorge EldarBorge changed the title from "fix: only init NCCL for setups with multi GPU" to "CUDA: only init NCCL for setups with multi GPU" Apr 11, 2026
@github-actions github-actions bot added the labels "Nvidia GPU" (Issues specific to Nvidia GPUs) and "ggml" (changes relating to the ggml tensor library for machine learning) Apr 11, 2026
@eelgaev

eelgaev commented Apr 11, 2026

Is NCCL always initialized now, or is it just for --split-mode tensor?

@EldarBorge
Author

> Is NCCL always initialized now? or is it just for --split-mode tensor?

My understanding is that it is always initialized after b8738.

Development

Successfully merging this pull request may close these issues.

Misc. bug: Higher VRAM usage after b8738
