[FEA]: Trigger add_dll_directory(found_path) logic even if was_already_loaded_from_elsewhere is true #821

@gmarkall

Is this a duplicate?

Type of Bug

Runtime Error

Component

cuda.bindings

Describe the bug

Attempting to use NVRTC on Windows with the NVRTC wheels from cuda-python fails with:

nvrtc: error: failed to open nvrtc-builtins64_129.dll.
  Make sure that nvrtc-builtins64_129.dll is installed correctly.

I can work around this by copying nvrtc-builtins64_129.dll to a folder where other loaded DLLs exist, e.g.

copy "$SYS_PREFIX\Lib\site-packages\nvidia\cuda_nvrtc\bin\nvrtc-builtins64_*.dll" "$SYS_PREFIX"
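As an alternative to copying the DLL, the wheel's bin directory could be registered with the Windows DLL search path from Python. The following is a minimal, untested sketch: it assumes the wheel layout shown in the copy command above, the helper names `nvrtc_bin_dir` and `add_nvrtc_dll_directory` are hypothetical, and whether NVRTC's internal load of the builtins DLL honors directories added this way is an assumption rather than something verified here.

```python
import os
import sys
from pathlib import Path


def nvrtc_bin_dir(prefix=sys.prefix):
    """Return the expected NVRTC wheel bin directory under `prefix`.

    The layout is taken from the workaround above; other install
    layouts are not handled (assumption).
    """
    return Path(prefix) / "Lib" / "site-packages" / "nvidia" / "cuda_nvrtc" / "bin"


def add_nvrtc_dll_directory(prefix=sys.prefix):
    """Add the NVRTC wheel's bin directory to the DLL search path.

    Returns the handle from os.add_dll_directory, or None if nothing
    was added (non-Windows platform or directory not present).
    """
    bin_dir = nvrtc_bin_dir(prefix)
    # os.add_dll_directory exists only on Windows (Python 3.8+).
    if not hasattr(os, "add_dll_directory") or not bin_dir.is_dir():
        return None
    # The function requires an absolute path.
    return os.add_dll_directory(str(bin_dir.resolve()))


if __name__ == "__main__":
    handle = add_nvrtc_dll_directory()
    print("added" if handle is not None else "nothing added")
```

The call would need to happen before the first NVRTC compilation. The directory stays registered until `close()` is called on the returned handle.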

How to Reproduce

I will have to create a smaller reproducer, but I observed this by pip-installing Numba-CUDA in CI and running its test suite: https://github.com/NVIDIA/numba-cuda/pull/353/files#diff-9753d897094baa6cf320a53ee98e7059f8477dbf3c6d467078ce35bcf3354f57R31-R57

Example run: https://github.com/NVIDIA/numba-cuda/actions/runs/16526296179/job/46740959886#step:8:8351

I did something similar on my local machine and observed the same issue. I will update this issue once I have a shorter reproducer.

Expected behavior

Using NVRTC should not emit an NVRTC error about loading the builtins.

Operating System

Windows 11, amd64

nvidia-smi output

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 572.13                 Driver Version: 572.13         CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                     TCC   |   00000001:00:00.0 Off |                  Off |
| N/A   38C    P8              9W /   70W |       9MiB /  16384MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Metadata

Labels

cuda.pathfinder: Everything related to the cuda.pathfinder module
enhancement: Any code-related improvements

Projects

Status

Done
