Labels: bug (Something isn't working)
Description
Is this a duplicate?
- I confirmed there appear to be no duplicate issues for this bug and that I agree to the Code of Conduct
Type of Bug
Silent Failure
Component
cuda.bindings
Describe the bug
cuda-python/examples/0_Introduction/simpleP2P_test.py contains the following peer-access check:
```python
p2pCapableGPUs = [-1, -1]
for i in range(gpu_n):
    p2pCapableGPUs[0] = i
    for j in range(gpu_n):
        if i == j:
            continue
        i_access_j = checkCudaErrors(cudart.cudaDeviceCanAccessPeer(i, j))
        j_access_i = checkCudaErrors(cudart.cudaDeviceCanAccessPeer(j, i))
        print(
            "> Peer access from {} (GPU{}) -> {} (GPU{}) : {}\n".format(
                prop[i].name, i, prop[j].name, j, "Yes" if i_access_j else "No"
            )
        )
        print(
            "> Peer access from {} (GPU{}) -> {} (GPU{}) : {}\n".format(
                prop[j].name, j, prop[i].name, i, "Yes" if i_access_j else "No"  # bug: wrong direction
            )
        )
        if i_access_j and j_access_i:
            p2pCapableGPUs[1] = j
            break
    if p2pCapableGPUs[1] != -1:
        break
```
For the GPU[j] -> GPU[i] direction, the second print should use the variable `j_access_i` rather than `i_access_j`. The example extra/isoFDModelling_test.py appears to have the same issue; please double-check this file at https://github.com/NVIDIA/cuda-python/blob/main/cuda_bindings/examples/extra/isoFDModelling_test.py#L645 when you have a chance.
How to Reproduce
This is a logging-only issue: disable peer access from GPU[j] to GPU[i] (while leaving GPU[i] -> GPU[j] enabled) and run the example; the second message reports the result for the wrong direction.
Expected behavior
The second message should use `j_access_i`, since it describes the GPU[j] -> GPU[i] direction.
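A minimal runnable sketch of the corrected logging logic, with a hypothetical `format_peer_access` helper standing in for the real `cudart.cudaDeviceCanAccessPeer` calls so it can run without a GPU (device names and indices are made up):

```python
def format_peer_access(name_src, idx_src, name_dst, idx_dst, can_access):
    # Build the same message the example prints; the flag must describe
    # access in the src -> dst direction being reported.
    return "> Peer access from {} (GPU{}) -> {} (GPU{}) : {}".format(
        name_src, idx_src, name_dst, idx_dst, "Yes" if can_access else "No"
    )

# Suppose GPU0 can reach GPU1 but not vice versa:
i_access_j, j_access_i = True, False

line_ij = format_peer_access("A100", 0, "A100", 1, i_access_j)
line_ji = format_peer_access("A100", 1, "A100", 0, j_access_i)  # fixed: j_access_i, not i_access_j
print(line_ij)
print(line_ji)
```

With the fix, the second line correctly reports "No" for the disabled GPU1 -> GPU0 direction instead of echoing the GPU0 -> GPU1 result.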
Operating System
No response
nvidia-smi output
No response