Any chance to have the device plugin working on containerd without nvidia-docker2?
I have rebuilt my cluster with containerd, and on my worker nodes
the following are installed:
libnvidia-container
nvidia-container-toolkit
nvidia-container-runtime
but the device plugin raises the error:
I0425 10:34:29.375414 1 main.go:18] Start gpushare device plugin
I0425 10:34:29.382160 1 gpumanager.go:28] Loading NVML
I0425 10:34:29.382601 1 gpumanager.go:31] Failed to initialize NVML: could not load NVML library.
I0425 10:34:29.382616 1 gpumanager.go:32] If this is a GPU node, did you set the docker default runtime to nvidia?
The default runtime has been set to nvidia-container-runtime:
[plugins."io.containerd.runtime.v1.linux"]
no_shim = false
runtime = "nvidia-container-runtime"
runtime_root = ""
shim = "containerd-shim"
shim_debug = false
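
For reference, Kubernetes talks to containerd through the CRI plugin, which has its own runtime settings separate from the `io.containerd.runtime.v1.linux` section above. A minimal sketch of what that section might look like follows. It assumes a version 2 containerd config and that the binary lives at /usr/bin/nvidia-container-runtime; neither has been confirmed on these nodes, and the runtime name `nvidia` is only a label chosen for illustration.

```toml
# Sketch only: assumes containerd config version 2 (not confirmed on these nodes).
version = 2

[plugins."io.containerd.grpc.v1.cri".containerd]
  # "nvidia" is an illustrative label; it must match the runtimes.<name> key below.
  default_runtime_name = "nvidia"

  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
    runtime_type = "io.containerd.runc.v2"

    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
      # Assumed install path; adjust to wherever nvidia-container-runtime is installed.
      BinaryName = "/usr/bin/nvidia-container-runtime"
```

containerd would need to be restarted after a change like this, and the device plugin pod recreated, before the NVML check runs again.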
Has anyone found a workaround?
Is there any plan to replace nvidia-docker2 with nvidia-container-runtime?
Thanks