
Vulkan not recognized with nvidia-toolkit 1.18, Debian 13 Trixie #1559

Description

@snowgoon88

Hello,

I'm trying to use Vulkan through Docker + nvidia-container-toolkit, but it fails because of a faulty initialisation/configuration.

A manual fix is described below.
It might be related to issues #1472, #1021, and possibly #1517.

Problem: Vulkan not recognized ==============================================

Inside Docker, the GPU is not seen by Vulkan:

[host]$ docker run -it --privileged --net=host --env="DISPLAY" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" --gpus all --runtime=nvidia nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04 bash -c "apt update && apt install vulkan-tools -y && nvidia-smi && vulkaninfo --summary"

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.163.01             Driver Version: 550.163.01     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti     On  |   00000000:07:00.0  On |                  N/A |
|  0%   34C    P8             15W /  165W |    1214MiB /  16380MiB |      7%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
ERROR: [Loader Message] Code 0 : libnvidia-glsi.so.550.163.01: cannot open shared object file: No such file or directory
WARNING: [Loader Message] Code 0 : terminator_CreateInstance: Failed to CreateInstance in ICD 2.  Skipping ICD.
error: XDG_RUNTIME_DIR not set in the environment.
error: XDG_RUNTIME_DIR not set in the environment.
error: XDG_RUNTIME_DIR not set in the environment.
error: XDG_RUNTIME_DIR not set in the environment.
error: XDG_RUNTIME_DIR not set in the environment.
==========
VULKANINFO
==========

Vulkan Instance Version: 1.3.204


Instance Extensions: count = 20
-------------------------------
VK_EXT_acquire_drm_display             : extension revision 1
VK_EXT_acquire_xlib_display            : extension revision 1
VK_EXT_debug_report                    : extension revision 10
VK_EXT_debug_utils                     : extension revision 2
VK_EXT_direct_mode_display             : extension revision 1
VK_EXT_display_surface_counter         : extension revision 1
VK_EXT_swapchain_colorspace            : extension revision 4
VK_KHR_device_group_creation           : extension revision 1
VK_KHR_display                         : extension revision 23
VK_KHR_external_fence_capabilities     : extension revision 1
VK_KHR_external_memory_capabilities    : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2         : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2       : extension revision 1
VK_KHR_surface                         : extension revision 25
VK_KHR_surface_protected_capabilities  : extension revision 1
VK_KHR_wayland_surface                 : extension revision 6
VK_KHR_xcb_surface                     : extension revision 6
VK_KHR_xlib_surface                    : extension revision 6

Instance Layers: count = 4
--------------------------
VK_LAYER_INTEL_nullhw       INTEL NULL HW                1.1.73   version 1
VK_LAYER_MESA_device_select Linux device selection layer 1.3.211  version 1
VK_LAYER_MESA_overlay       Mesa Overlay layer           1.3.211  version 1
VK_LAYER_NV_optimus         NVIDIA Optimus layer         1.3.277  version 1

Devices:
========
GPU0:
	apiVersion         = 4206847 (1.3.255)
	driverVersion      = 1 (0x0001)
	vendorID           = 0x10005
	deviceID           = 0x0000
	deviceType         = PHYSICAL_DEVICE_TYPE_CPU
	deviceName         = llvmpipe (LLVM 15.0.7, 256 bits)
	driverID           = DRIVER_ID_MESA_LLVMPIPE
	driverName         = llvmpipe
	driverInfo         = Mesa 23.2.1-1ubuntu3.1~22.04.3 (LLVM 15.0.7)
	conformanceVersion = 1.3.1.1
	deviceUUID         = 6d657361-3233-2e32-2e31-2d3175627500
	driverUUID         = 6c6c766d-7069-7065-5555-494400000000

Directly on the host machine:

[host]$ nvidia-smi 
Tue Jan  6 13:49:56 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.163.01             Driver Version: 550.163.01     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti     Off |   00000000:07:00.0  On |                  N/A |
|  0%   34C    P8             15W /  165W |    1370MiB /  16380MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1534      G   /usr/lib/xorg/Xorg                            295MiB |
|    0   N/A  N/A      3495      G   /usr/lib/thunderbird/thunderbird              170MiB |
|    0   N/A  N/A    522193      G   firefox-esr                                   891MiB |
+-----------------------------------------------------------------------------------------+

[host]$ vulkaninfo --summary
==========
VULKANINFO
==========

Vulkan Instance Version: 1.4.309


Instance Extensions: count = 24
-------------------------------
VK_EXT_acquire_drm_display             : extension revision 1
VK_EXT_acquire_xlib_display            : extension revision 1
VK_EXT_debug_report                    : extension revision 10
VK_EXT_debug_utils                     : extension revision 2
VK_EXT_direct_mode_display             : extension revision 1
VK_EXT_display_surface_counter         : extension revision 1
VK_EXT_headless_surface                : extension revision 1
VK_EXT_surface_maintenance1            : extension revision 1
VK_EXT_swapchain_colorspace            : extension revision 5
VK_KHR_device_group_creation           : extension revision 1
VK_KHR_display                         : extension revision 23
VK_KHR_external_fence_capabilities     : extension revision 1
VK_KHR_external_memory_capabilities    : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2         : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2       : extension revision 1
VK_KHR_portability_enumeration         : extension revision 1
VK_KHR_surface                         : extension revision 25
VK_KHR_surface_protected_capabilities  : extension revision 1
VK_KHR_wayland_surface                 : extension revision 6
VK_KHR_xcb_surface                     : extension revision 6
VK_KHR_xlib_surface                    : extension revision 6
VK_LUNARG_direct_driver_loading        : extension revision 1

Instance Layers: count = 4
--------------------------
VK_LAYER_INTEL_nullhw       INTEL NULL HW                1.1.73   version 1
VK_LAYER_MESA_device_select Linux device selection layer 1.4.303  version 1
VK_LAYER_MESA_overlay       Mesa Overlay layer           1.4.303  version 1
VK_LAYER_NV_optimus         NVIDIA Optimus layer         1.3.277  version 1

Devices:
========
GPU0:
	apiVersion         = 1.3.277
	driverVersion      = 550.163.1.0
	vendorID           = 0x10de
	deviceID           = 0x2805
	deviceType         = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
	deviceName         = NVIDIA GeForce RTX 4060 Ti
	driverID           = DRIVER_ID_NVIDIA_PROPRIETARY
	driverName         = NVIDIA
	driverInfo         = 550.163.01
	conformanceVersion = 1.3.7.2
	deviceUUID         = 895229b1-52bf-a326-6565-cbd4d20a56e3
	driverUUID         = 2485a78d-e39e-5aba-a0e9-b1139a1e6395
GPU1:
	apiVersion         = 1.4.305
	driverVersion      = 0.0.1
	vendorID           = 0x10005
	deviceID           = 0x0000
	deviceType         = PHYSICAL_DEVICE_TYPE_CPU
	deviceName         = llvmpipe (LLVM 19.1.7, 256 bits)
	driverID           = DRIVER_ID_MESA_LLVMPIPE
	driverName         = llvmpipe
	driverInfo         = Mesa 25.0.7-2 (LLVM 19.1.7)
	conformanceVersion = 1.3.1.1
	deviceUUID         = 6d657361-3235-2e30-2e37-2d3200000000
	driverUUID         = 6c6c766d-7069-7065-5555-494400000000

Context: ===========================================================
OS: Debian 13 Trixie
NVIDIA driver: 550.163.01, installed with apt-get

[host]$ ls /usr/lib/x86_64-linux-gnu/libnvidia*   

/usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-container-go.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-container-go.so.1.18.1
/usr/lib/x86_64-linux-gnu/libnvidia-container.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-container.so.1.18.1
/usr/lib/x86_64-linux-gnu/libnvidia-egl-gbm.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-egl-gbm.so.1.1.2
/usr/lib/x86_64-linux-gnu/libnvidia-egl-wayland.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-egl-wayland.so.1.1.18
/usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.550.163.01
/usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.550.163.01
/usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.550.163.01
/usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.550.163.01
/usr/lib/x86_64-linux-gnu/libnvidia-gpucomp.so.550.163.01
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.4
/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.550.163.01
/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.550.163.01
/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.550.163.01
/usr/lib/x86_64-linux-gnu/libnvidia-tls.so.550.163.01

[host]$ ls /usr/lib/x86_64-linux-gnu/nvidia/current 
libcuda.so                         libGLX_nvidia.so.0                 libnvidia-encode.so             libnvidia-opencl.so.550.163.01
libcuda.so.1                       libGLX_nvidia.so.550.163.01        libnvidia-encode.so.1           libnvidia-ptxjitcompiler.so.1
libcuda.so.550.163.01              libnvcuvid.so                      libnvidia-encode.so.550.163.01  libnvidia-ptxjitcompiler.so.550.163.01
libEGL_nvidia.so.0                 libnvcuvid.so.1                    libnvidia-ml.so                 libvdpau_nvidia.so.1
libEGL_nvidia.so.550.163.01        libnvcuvid.so.550.163.01           libnvidia-ml.so.1               libvdpau_nvidia.so.550.163.01
libGLESv1_CM_nvidia.so.1           libnvidia-allocator.so.1           libnvidia-ml.so.550.163.01      nvidia-drm_gbm.so
libGLESv1_CM_nvidia.so.550.163.01  libnvidia-allocator.so.550.163.01  libnvidia-nvvm.so.4
libGLESv2_nvidia.so.2              libnvidia-cfg.so.1                 libnvidia-nvvm.so.550.163.01
libGLESv2_nvidia.so.550.163.01     libnvidia-cfg.so.550.163.01        libnvidia-opencl.so.1


[host]$ docker --version
Docker version 29.1.3, build f52814d

[host]$ nvidia-container-cli --version 
cli-version: 1.18.1
lib-version: 1.18.1
build date: 2025-11-24T14:45+00:00
build revision: 889a3bb5408c195ed7897ba2cb8341c7d249672f
build compiler: x86_64-linux-gnu-gcc-7 7.5.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections

[host]$ less /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}

Manual solution: fixing the configuration ============================================

In fact, I discovered that /etc/vulkan was not populated (it was still empty, even after restarting nvidia-cdi-refresh.service, which is supposed to do the job), and that /etc/cdi/nvidia.yaml was missing some mounts (even after running sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml).
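
To see whether a generated spec has this problem, a quick check is to count occurrences of the driver libraries the Vulkan ICD needs. The helper below is hypothetical (not part of nvidia-ctk); the library names are the ones that turned out to be missing on this machine.

```shell
#!/bin/sh
# check_cdi_libs FILE — for each driver library the NVIDIA Vulkan ICD
# dlopen()s at runtime, print how many times it appears in a CDI spec.
# A count of 0 means the library will be missing inside the container.
check_cdi_libs() {
    spec=$1
    for lib in libnvidia-glsi libnvidia-glcore libnvidia-gpucomp \
               libnvidia-glvkspirv libnvidia-tls; do
        printf '%s: %s\n' "$lib" "$(grep -c "$lib" "$spec")"
    done
}

# On the host:
# check_cdi_libs /etc/cdi/nvidia.yaml
```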

So I manually added the following symbolic links:

[host]$ tree -l /etc/vulkan
.
├── explicit_layer.d
├── icd.d
│   └── nvidia_icd.json -> /usr/share/vulkan/icd.d/nvidia_icd.json
└── implicit_layer.d
    └── nvidia_layers.json -> /usr/share/vulkan/implicit_layer.d/nvidia_layers.json
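
The layout above can be recreated with the commands below. This is a sketch based on this machine's paths: it assumes the NVIDIA JSON manifests live under /usr/share/vulkan, which may differ on other distributions. Run as root.

```shell
# Create the /etc/vulkan directories the Vulkan loader scans, then link
# the NVIDIA ICD and implicit-layer manifests into them.
mkdir -p /etc/vulkan/explicit_layer.d /etc/vulkan/icd.d /etc/vulkan/implicit_layer.d
ln -sf /usr/share/vulkan/icd.d/nvidia_icd.json \
       /etc/vulkan/icd.d/nvidia_icd.json
ln -sf /usr/share/vulkan/implicit_layer.d/nvidia_layers.json \
       /etc/vulkan/implicit_layer.d/nvidia_layers.json
```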

And I added the following lines at the end of /etc/cdi/nvidia.yaml
(note that these libraries are not in /usr/lib/x86_64-linux-gnu/nvidia/current but directly in /usr/lib/x86_64-linux-gnu):

== added to /etc/cdi/nvidia.yaml
        - hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.550.163.01
          containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.550.163.01
          options:
            - ro
            - nosuid
            - nodev
            - rbind
            - rprivate
        - hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.550.163.01
          containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.550.163.01
          options:
            - ro
            - nosuid
            - nodev
            - rbind
            - rprivate
        - hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.550.163.01
          containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.550.163.01
          options:
            - ro
            - nosuid
            - nodev
            - rbind
            - rprivate
        - hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-gpucomp.so.550.163.01
          containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-gpucomp.so.550.163.01
          options:
            - ro
            - nosuid
            - nodev
            - rbind
            - rprivate
        - hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.550.163.01
          containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.550.163.01
          options:
            - ro
            - nosuid
            - nodev
            - rbind
            - rprivate
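
Since the five stanzas are identical apart from the library name, they can also be generated instead of typed by hand. This is a hypothetical helper; the driver version and library directory are the ones from this machine, so adjust them to yours.

```shell
#!/bin/sh
# emit_cdi_mounts VERSION — print a CDI mount stanza for each driver
# library the Vulkan ICD needs, matching the indentation of the
# containerEdits.mounts section of /etc/cdi/nvidia.yaml.
emit_cdi_mounts() {
    ver=$1
    dir=/usr/lib/x86_64-linux-gnu
    for lib in glsi tls glcore gpucomp glvkspirv; do
        cat <<EOF
        - hostPath: $dir/libnvidia-$lib.so.$ver
          containerPath: $dir/libnvidia-$lib.so.$ver
          options:
            - ro
            - nosuid
            - nodev
            - rbind
            - rprivate
EOF
    done
}

# Append to the spec (as root):
# emit_cdi_mounts 550.163.01 >> /etc/cdi/nvidia.yaml
```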

Then I can run the container with --env NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all and Vulkan works:

[host]$ docker run -it --privileged --net=host --env="DISPLAY" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" -e NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all --runtime=nvidia ghcr.io/nvidia/k8s-samples:vulkan-86d426f7 vulkaninfo --summary

error: XDG_RUNTIME_DIR not set in the environment.
==========
VULKANINFO
==========

Vulkan Instance Version: 1.4.313


Instance Extensions: count = 23
-------------------------------
VK_EXT_acquire_drm_display             : extension revision 1
VK_EXT_acquire_xlib_display            : extension revision 1
VK_EXT_debug_report                    : extension revision 10
VK_EXT_debug_utils                     : extension revision 2
VK_EXT_direct_mode_display             : extension revision 1
VK_EXT_display_surface_counter         : extension revision 1
VK_EXT_surface_maintenance1            : extension revision 1
VK_EXT_swapchain_colorspace            : extension revision 4
VK_KHR_device_group_creation           : extension revision 1
VK_KHR_display                         : extension revision 23
VK_KHR_external_fence_capabilities     : extension revision 1
VK_KHR_external_memory_capabilities    : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2         : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2       : extension revision 1
VK_KHR_portability_enumeration         : extension revision 1
VK_KHR_surface                         : extension revision 25
VK_KHR_surface_protected_capabilities  : extension revision 1
VK_KHR_wayland_surface                 : extension revision 6
VK_KHR_xcb_surface                     : extension revision 6
VK_KHR_xlib_surface                    : extension revision 6
VK_LUNARG_direct_driver_loading        : extension revision 1

Instance Layers: count = 10
---------------------------
VK_LAYER_KHRONOS_profiles         Khronos Profiles layer                                                                                            1.4.313  version 1
VK_LAYER_KHRONOS_shader_object    Khronos Shader object layer                                                                                       1.4.313  version 1
VK_LAYER_KHRONOS_synchronization2 Khronos Synchronization2 layer                                                                                    1.4.313  version 1
VK_LAYER_KHRONOS_validation       Khronos Validation Layer                                                                                          1.4.313  version 1
VK_LAYER_LUNARG_api_dump          LunarG API dump layer                                                                                             1.4.313  version 2
VK_LAYER_LUNARG_crash_diagnostic  Crash Diagnostic Layer is a crash/hang debugging tool that helps determines GPU progress in a Vulkan application. 1.4.313  version 1
VK_LAYER_LUNARG_gfxreconstruct    GFXReconstruct Capture Layer Version 1.0.5-unknown                                                                1.4.313  version 4194309
VK_LAYER_LUNARG_monitor           Execution Monitoring Layer                                                                                        1.4.313  version 1
VK_LAYER_LUNARG_screenshot        LunarG image capture layer                                                                                        1.4.313  version 1
VK_LAYER_NV_optimus               NVIDIA Optimus layer                                                                                              1.3.277  version 1

Devices:
========
GPU0:
	apiVersion         = 1.3.277
	driverVersion      = 550.163.1.0
	vendorID           = 0x10de
	deviceID           = 0x2805
	deviceType         = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
	deviceName         = NVIDIA GeForce RTX 4060 Ti
	driverID           = DRIVER_ID_NVIDIA_PROPRIETARY
	driverName         = NVIDIA
	driverInfo         = 550.163.01
	conformanceVersion = 1.3.7.2
	deviceUUID         = 895229b1-52bf-a326-6565-cbd4d20a56e3
	driverUUID         = 2485a78d-e39e-5aba-a0e9-b1139a1e6395
