This repository was archived by the owner on Jan 22, 2024. It is now read-only.

"Failed to initialize NVML: Unknown Error" after random amount of time #1671

@iFede94

1. Issue or feature description

After a random amount of time (it could be hours or days), the GPUs become unavailable inside all running containers, and nvidia-smi returns "Failed to initialize NVML: Unknown Error".
Restarting all the containers fixes the issue, and the GPUs become available again.
Outside the containers, the GPUs keep working correctly.
I searched the open and closed issues but could not find a solution.
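
Because the failure shows up only after hours or days, a small check that can be run periodically inside a container may help pin down when it starts. The following is a minimal sketch, not part of the original report: the function names `nvml_failed` and `check_gpu` are hypothetical, and it assumes `nvidia-smi` is on the PATH inside the container and that the error text matches the message quoted above.

```python
import subprocess

# Error text quoted in the report; matched as a substring (assumption).
NVML_ERROR = "Failed to initialize NVML"


def nvml_failed(output: str) -> bool:
    """Return True if the given nvidia-smi output contains the NVML error."""
    return NVML_ERROR in output


def check_gpu(command=("nvidia-smi",)) -> bool:
    """Run nvidia-smi and report whether the GPUs still look usable.

    Returns False when the binary is missing, the call times out,
    the exit code is non-zero, or the NVML error text is printed.
    """
    try:
        result = subprocess.run(
            list(command), capture_output=True, text=True, timeout=30
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False
    combined = result.stdout + result.stderr
    return result.returncode == 0 and not nvml_failed(combined)
```

Calling `check_gpu()` from a cron job or a loop and logging the timestamp of the first `False` result would give an approximate time of failure to compare against host logs.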

2. Steps to reproduce the issue

All the containers are started with the following command:
docker run --gpus all -it tensorflow/tensorflow:latest-gpu /bin/bash

3. Information to attach

  • Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
-- WARNING, the following logs are for debugging purposes only --

I0831 10:36:45.129762 2174149 nvc.c:376] initializing library context (version=1.10.0, build=395fd41701117121f1fd04ada01e1d7e006a37ae)
I0831 10:36:45.129878 2174149 nvc.c:350] using root /
I0831 10:36:45.129892 2174149 nvc.c:351] using ldcache /etc/ld.so.cache
I0831 10:36:45.129906 2174149 nvc.c:352] using unprivileged user 1000:1000
I0831 10:36:45.129960 2174149 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0831 10:36:45.130411 2174149 nvc.c:395] dxcore initialization failed, continuing assuming a non-WSL environment
W0831 10:36:45.132458 2174150 nvc.c:273] failed to set inheritable capabilities
W0831 10:36:45.132555 2174150 nvc.c:274] skipping kernel modules load due to failure
I0831 10:36:45.133242 2174151 rpc.c:71] starting driver rpc service
I0831 10:36:45.141625 2174152 rpc.c:71] starting nvcgo rpc service
I0831 10:36:45.144941 2174149 nvc_info.c:766] requesting driver information with ''
I0831 10:36:45.146226 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.515.48.07
I0831 10:36:45.146379 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.515.48.07
I0831 10:36:45.146563 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.515.48.07
I0831 10:36:45.146792 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.515.48.07
I0831 10:36:45.146986 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.515.48.07
I0831 10:36:45.147178 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.515.48.07
I0831 10:36:45.147375 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.515.48.07
I0831 10:36:45.147400 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.515.48.07
I0831 10:36:45.147598 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.515.48.07
I0831 10:36:45.147777 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.515.48.07
I0831 10:36:45.147986 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.515.48.07
I0831 10:36:45.148258 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.515.48.07
I0831 10:36:45.148506 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.515.48.07
I0831 10:36:45.148699 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.515.48.07
I0831 10:36:45.148915 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.515.48.07
I0831 10:36:45.148942 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.515.48.07
I0831 10:36:45.149219 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.515.48.07
I0831 10:36:45.149467 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.515.48.07
I0831 10:36:45.149591 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.515.48.07
I0831 10:36:45.149814 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.515.48.07
I0831 10:36:45.149996 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.515.48.07
I0831 10:36:45.150224 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.515.48.07
I0831 10:36:45.150437 2174149 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.515.48.07
I0831 10:36:45.150772 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-tls.so.515.48.07
I0831 10:36:45.150978 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-ptxjitcompiler.so.515.48.07
I0831 10:36:45.151147 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-opticalflow.so.515.48.07
I0831 10:36:45.151335 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-opencl.so.515.48.07
I0831 10:36:45.151592 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-ml.so.515.48.07
I0831 10:36:45.151786 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glvkspirv.so.515.48.07
I0831 10:36:45.151970 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glsi.so.515.48.07
I0831 10:36:45.152225 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glcore.so.515.48.07
I0831 10:36:45.152480 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-fbc.so.515.48.07
I0831 10:36:45.152791 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-encode.so.515.48.07
I0831 10:36:45.152999 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-eglcore.so.515.48.07
I0831 10:36:45.153254 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-compiler.so.515.48.07
I0831 10:36:45.153580 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvcuvid.so.515.48.07
I0831 10:36:45.153853 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libcuda.so.515.48.07
I0831 10:36:45.154063 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLX_nvidia.so.515.48.07
I0831 10:36:45.154259 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLESv2_nvidia.so.515.48.07
I0831 10:36:45.154473 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLESv1_CM_nvidia.so.515.48.07
I0831 10:36:45.154696 2174149 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libEGL_nvidia.so.515.48.07
W0831 10:36:45.154723 2174149 nvc_info.c:399] missing library libnvidia-nscq.so
W0831 10:36:45.154726 2174149 nvc_info.c:399] missing library libcudadebugger.so
W0831 10:36:45.154729 2174149 nvc_info.c:399] missing library libnvidia-fatbinaryloader.so
W0831 10:36:45.154731 2174149 nvc_info.c:399] missing library libnvidia-pkcs11.so
W0831 10:36:45.154733 2174149 nvc_info.c:399] missing library libvdpau_nvidia.so
W0831 10:36:45.154735 2174149 nvc_info.c:399] missing library libnvidia-ifr.so
W0831 10:36:45.154737 2174149 nvc_info.c:399] missing library libnvidia-cbl.so
W0831 10:36:45.154739 2174149 nvc_info.c:403] missing compat32 library libnvidia-cfg.so
W0831 10:36:45.154741 2174149 nvc_info.c:403] missing compat32 library libnvidia-nscq.so
W0831 10:36:45.154743 2174149 nvc_info.c:403] missing compat32 library libcudadebugger.so
W0831 10:36:45.154746 2174149 nvc_info.c:403] missing compat32 library libnvidia-fatbinaryloader.so
W0831 10:36:45.154748 2174149 nvc_info.c:403] missing compat32 library libnvidia-allocator.so
W0831 10:36:45.154750 2174149 nvc_info.c:403] missing compat32 library libnvidia-pkcs11.so
W0831 10:36:45.154752 2174149 nvc_info.c:403] missing compat32 library libnvidia-ngx.so
W0831 10:36:45.154754 2174149 nvc_info.c:403] missing compat32 library libvdpau_nvidia.so
W0831 10:36:45.154756 2174149 nvc_info.c:403] missing compat32 library libnvidia-ifr.so
W0831 10:36:45.154758 2174149 nvc_info.c:403] missing compat32 library libnvidia-rtcore.so
W0831 10:36:45.154760 2174149 nvc_info.c:403] missing compat32 library libnvoptix.so
W0831 10:36:45.154762 2174149 nvc_info.c:403] missing compat32 library libnvidia-cbl.so
I0831 10:36:45.154919 2174149 nvc_info.c:299] selecting /usr/bin/nvidia-smi
I0831 10:36:45.154945 2174149 nvc_info.c:299] selecting /usr/bin/nvidia-debugdump
I0831 10:36:45.154954 2174149 nvc_info.c:299] selecting /usr/bin/nvidia-persistenced
I0831 10:36:45.154970 2174149 nvc_info.c:299] selecting /usr/bin/nvidia-cuda-mps-control
I0831 10:36:45.154980 2174149 nvc_info.c:299] selecting /usr/bin/nvidia-cuda-mps-server
W0831 10:36:45.155027 2174149 nvc_info.c:425] missing binary nv-fabricmanager
I0831 10:36:45.155044 2174149 nvc_info.c:343] listing firmware path /usr/lib/firmware/nvidia/515.48.07/gsp.bin
I0831 10:36:45.155058 2174149 nvc_info.c:529] listing device /dev/nvidiactl
I0831 10:36:45.155061 2174149 nvc_info.c:529] listing device /dev/nvidia-uvm
I0831 10:36:45.155063 2174149 nvc_info.c:529] listing device /dev/nvidia-uvm-tools
I0831 10:36:45.155065 2174149 nvc_info.c:529] listing device /dev/nvidia-modeset
I0831 10:36:45.155080 2174149 nvc_info.c:343] listing ipc path /run/nvidia-persistenced/socket
W0831 10:36:45.155092 2174149 nvc_info.c:349] missing ipc path /var/run/nvidia-fabricmanager/socket
W0831 10:36:45.155100 2174149 nvc_info.c:349] missing ipc path /tmp/nvidia-mps
I0831 10:36:45.155102 2174149 nvc_info.c:822] requesting device information with ''
I0831 10:36:45.161039 2174149 nvc_info.c:713] listing device /dev/nvidia0 (GPU-13fd0930-06c3-5975-8720-72c72ee7a823 at 00000000:01:00.0)
I0831 10:36:45.166471 2174149 nvc_info.c:713] listing device /dev/nvidia1 (GPU-a76d37d7-5ed0-58d9-6087-b18fee984570 at 00000000:02:00.0)
NVRM version:   515.48.07
CUDA version:   11.7

Device Index:   0
Device Minor:   0
Model:          NVIDIA GeForce RTX 2080 Ti
Brand:          GeForce
GPU UUID:       GPU-13fd0930-06c3-5975-8720-72c72ee7a823
Bus Location:   00000000:01:00.0
Architecture:   7.5

Device Index:   1
Device Minor:   1
Model:          NVIDIA GeForce RTX 2080 Ti
Brand:          GeForce
GPU UUID:       GPU-a76d37d7-5ed0-58d9-6087-b18fee984570
Bus Location:   00000000:02:00.0
Architecture:   7.5
I0831 10:36:45.166493 2174149 nvc.c:434] shutting down library context
I0831 10:36:45.166540 2174152 rpc.c:95] terminating nvcgo rpc service
I0831 10:36:45.166751 2174149 rpc.c:135] nvcgo rpc service terminated successfully
I0831 10:36:45.167790 2174151 rpc.c:95] terminating driver rpc service
I0831 10:36:45.167907 2174149 rpc.c:135] driver rpc service terminated successfully
  • Kernel version from uname -a
Linux wds-co-ml 5.15.0-43-generic #46-Ubuntu SMP Tue Jul 12 10:30:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • Driver information from nvidia-smi -a
==============NVSMI LOG==============

Timestamp                                 : Wed Aug 31 12:42:55 2022
Driver Version                            : 515.48.07
CUDA Version                              : 11.7

Attached GPUs                             : 2
GPU 00000000:01:00.0
    Product Name                          : NVIDIA GeForce RTX 2080 Ti
    Product Brand                         : GeForce
    Product Architecture                  : Turing
    Display Mode                          : Disabled
    Display Active                        : Disabled
    Persistence Mode                      : Disabled
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : N/A
    GPU UUID                              : GPU-13fd0930-06c3-5975-8720-72c72ee7a823
    Minor Number                          : 0
    VBIOS Version                         : 90.02.0B.00.C7
    MultiGPU Board                        : No
    Board ID                              : 0x100
    GPU Part Number                       : N/A
    Module ID                             : 0
    Inforom Version
        Image Version                     : G001.0000.02.04
        OEM Object                        : 1.1
        ECC Object                        : N/A
        Power Management Object           : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GSP Firmware Version                  : N/A
    GPU Virtualization Mode
        Virtualization Mode               : None
        Host VGPU Mode                    : N/A
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x01
        Device                            : 0x00
        Domain                            : 0x0000
        Device Id                         : 0x1E0710DE
        Bus Id                            : 00000000:01:00.0
        Sub System Id                     : 0x150319DA
        GPU Link Info
            PCIe Generation
                Max                       : 3
                Current                   : 1
            Link Width
                Max                       : 16x
                Current                   : 8x
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : 0
        Replay Number Rollovers           : 0
        Tx Throughput                     : 0 KB/s
        Rx Throughput                     : 0 KB/s
    Fan Speed                             : 0 %
    Performance State                     : P8
    Clocks Throttle Reasons
        Idle                              : Not Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Not Active
        Display Clock Setting             : Not Active
    FB Memory Usage
        Total                             : 11264 MiB
        Reserved                          : 244 MiB
        Used                              : 1 MiB
        Free                              : 11018 MiB
    BAR1 Memory Usage
        Total                             : 256 MiB
        Used                              : 3 MiB
        Free                              : 253 MiB
    Compute Mode                          : Default
    Utilization
        Gpu                               : 0 %
        Memory                            : 0 %
        Encoder                           : 0 %
        Decoder                           : 0 %
    Encoder Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    FBC Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    Ecc Mode
        Current                           : N/A
        Pending                           : N/A
    ECC Errors
        Volatile
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
        Aggregate
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
    Retired Pages
        Single Bit ECC                    : N/A
        Double Bit ECC                    : N/A
        Pending Page Blacklist            : N/A
    Remapped Rows                         : N/A
    Temperature
        GPU Current Temp                  : 30 C
        GPU Shutdown Temp                 : 94 C
        GPU Slowdown Temp                 : 91 C
        GPU Max Operating Temp            : 89 C
        GPU Target Temperature            : 84 C
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
    Power Readings
        Power Management                  : Supported
        Power Draw                        : 20.87 W
        Power Limit                       : 260.00 W
        Default Power Limit               : 260.00 W
        Enforced Power Limit              : 260.00 W
        Min Power Limit                   : 100.00 W
        Max Power Limit                   : 300.00 W
    Clocks
        Graphics                          : 300 MHz
        SM                                : 300 MHz
        Memory                            : 405 MHz
        Video                             : 540 MHz
    Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Default Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Max Clocks
        Graphics                          : 2160 MHz
        SM                                : 2160 MHz
        Memory                            : 7000 MHz
        Video                             : 1950 MHz
    Max Customer Boost Clocks
        Graphics                          : N/A
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Voltage
        Graphics                          : N/A
    Processes                             : None

GPU 00000000:02:00.0
    Product Name                          : NVIDIA GeForce RTX 2080 Ti
    Product Brand                         : GeForce
    Product Architecture                  : Turing
    Display Mode                          : Disabled
    Display Active                        : Disabled
    Persistence Mode                      : Disabled
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : N/A
    GPU UUID                              : GPU-a76d37d7-5ed0-58d9-6087-b18fee984570
    Minor Number                          : 1
    VBIOS Version                         : 90.02.17.00.58
    MultiGPU Board                        : No
    Board ID                              : 0x200
    GPU Part Number                       : N/A
    Module ID                             : 0
    Inforom Version
        Image Version                     : G001.0000.02.04
        OEM Object                        : 1.1
        ECC Object                        : N/A
        Power Management Object           : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GSP Firmware Version                  : N/A
    GPU Virtualization Mode
        Virtualization Mode               : None
        Host VGPU Mode                    : N/A
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x02
        Device                            : 0x00
        Domain                            : 0x0000
        Device Id                         : 0x1E0710DE
        Bus Id                            : 00000000:02:00.0
        Sub System Id                     : 0x150319DA
        GPU Link Info
            PCIe Generation
                Max                       : 3
                Current                   : 1
            Link Width
                Max                       : 16x
                Current                   : 8x
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : 0
        Replay Number Rollovers           : 0
        Tx Throughput                     : 0 KB/s
        Rx Throughput                     : 0 KB/s
    Fan Speed                             : 35 %
    Performance State                     : P8
    Clocks Throttle Reasons
        Idle                              : Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Not Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Not Active
        Display Clock Setting             : Not Active
    FB Memory Usage
        Total                             : 11264 MiB
        Reserved                          : 244 MiB
        Used                              : 1 MiB
        Free                              : 11018 MiB
    BAR1 Memory Usage
        Total                             : 256 MiB
        Used                              : 27 MiB
        Free                              : 229 MiB
    Compute Mode                          : Default
    Utilization
        Gpu                               : 0 %
        Memory                            : 0 %
        Encoder                           : 0 %
        Decoder                           : 0 %
    Encoder Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    FBC Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    Ecc Mode
        Current                           : N/A
        Pending                           : N/A
    ECC Errors
        Volatile
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
        Aggregate
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
    Retired Pages
        Single Bit ECC                    : N/A
        Double Bit ECC                    : N/A
        Pending Page Blacklist            : N/A
    Remapped Rows                         : N/A
    Temperature
        GPU Current Temp                  : 28 C
        GPU Shutdown Temp                 : 94 C
        GPU Slowdown Temp                 : 91 C
        GPU Max Operating Temp            : 89 C
        GPU Target Temperature            : 84 C
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
    Power Readings
        Power Management                  : Supported
        Power Draw                        : 6.66 W
        Power Limit                       : 260.00 W
        Default Power Limit               : 260.00 W
        Enforced Power Limit              : 260.00 W
        Min Power Limit                   : 100.00 W
        Max Power Limit                   : 300.00 W
    Clocks
        Graphics                          : 300 MHz
        SM                                : 300 MHz
        Memory                            : 405 MHz
        Video                             : 540 MHz
    Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Default Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Max Clocks
        Graphics                          : 2160 MHz
        SM                                : 2160 MHz
        Memory                            : 7000 MHz
        Video                             : 1950 MHz
    Max Customer Boost Clocks
        Graphics                          : N/A
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Voltage
        Graphics                          : N/A
    Processes                             : None
  • Docker version from docker version
Client: Docker Engine - Community
 Version:           20.10.17
 API version:       1.41
 Go version:        go1.17.11
 Git commit:        100c701
 Built:             Mon Jun  6 23:02:46 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.17
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.17.11
  Git commit:       a89b842
  Built:            Mon Jun  6 23:00:51 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.6
  GitCommit:        10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc:
  Version:          1.1.2
  GitCommit:        v1.1.2-0-ga916309
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

  • NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
ii  libnvidia-cfg1-515:amd64                   515.48.07-0ubuntu0.22.04.2 amd64        NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-515                       515.48.07-0ubuntu0.22.04.2 all          Shared files used by the NVIDIA libraries
ii  libnvidia-compute-515:amd64                515.48.07-0ubuntu0.22.04.2 amd64        NVIDIA libcompute package
ii  libnvidia-compute-515:i386                 515.48.07-0ubuntu0.22.04.2 i386         NVIDIA libcompute package
ii  libnvidia-container-tools                  1.10.0-1                   amd64        NVIDIA container runtime library (command-line tools)
ii  libnvidia-container1:amd64                 1.10.0-1                   amd64        NVIDIA container runtime library
ii  libnvidia-decode-515:amd64                 515.48.07-0ubuntu0.22.04.2 amd64        NVIDIA Video Decoding runtime libraries
ii  libnvidia-decode-515:i386                  515.48.07-0ubuntu0.22.04.2 i386         NVIDIA Video Decoding runtime libraries
ii  libnvidia-egl-wayland1:amd64               1:1.1.9-1.1                amd64        Wayland EGL External Platform library -- shared library
ii  libnvidia-encode-515:amd64                 515.48.07-0ubuntu0.22.04.2 amd64        NVENC Video Encoding runtime library
ii  libnvidia-encode-515:i386                  515.48.07-0ubuntu0.22.04.2 i386         NVENC Video Encoding runtime library
ii  libnvidia-extra-515:amd64                  515.48.07-0ubuntu0.22.04.2 amd64        Extra libraries for the NVIDIA driver
ii  libnvidia-fbc1-515:amd64                   515.48.07-0ubuntu0.22.04.2 amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-fbc1-515:i386                    515.48.07-0ubuntu0.22.04.2 i386         NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-515:amd64                     515.48.07-0ubuntu0.22.04.2 amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-gl-515:i386                      515.48.07-0ubuntu0.22.04.2 i386         NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  linux-modules-nvidia-515-5.15.0-43-generic 5.15.0-43.46               amd64        Linux kernel nvidia modules for version 5.15.0-43
ii  linux-modules-nvidia-515-generic-hwe-22.04 5.15.0-43.46               amd64        Extra drivers for nvidia-515 for the generic-hwe-22.04 flavour
ii  linux-objects-nvidia-515-5.15.0-43-generic 5.15.0-43.46               amd64        Linux kernel nvidia modules for version 5.15.0-43 (objects)
ii  linux-signatures-nvidia-5.15.0-43-generic  5.15.0-43.46               amd64        Linux kernel signatures for nvidia modules for version 5.15.0-43-generic
ii  nvidia-compute-utils-515                   515.48.07-0ubuntu0.22.04.2 amd64        NVIDIA compute utilities
ii  nvidia-container-toolkit                   1.10.0-1                   amd64        NVIDIA container runtime hook
ii  nvidia-docker2                             2.11.0-1                   all          nvidia-docker CLI wrapper
ii  nvidia-driver-515                          515.48.07-0ubuntu0.22.04.2 amd64        NVIDIA driver metapackage
ii  nvidia-kernel-common-515                   515.48.07-0ubuntu0.22.04.2 amd64        Shared files used with the kernel module
ii  nvidia-kernel-source-515                   515.48.07-0ubuntu0.22.04.2 amd64        NVIDIA kernel source package
ii  nvidia-prime                               0.8.17.1                   all          Tools to enable NVIDIA's Prime
ii  nvidia-settings                            510.47.03-0ubuntu1         amd64        Tool for configuring the NVIDIA graphics driver
ii  nvidia-utils-515                           515.48.07-0ubuntu0.22.04.2 amd64        NVIDIA driver support binaries
ii  xserver-xorg-video-nvidia-515              515.48.07-0ubuntu0.22.04.2 amd64        NVIDIA binary Xorg driver
  • NVIDIA container library version from nvidia-container-cli -V
cli-version: 1.10.0
lib-version: 1.10.0
build date: 2022-06-13T10:39+00:00
build revision: 395fd41701117121f1fd04ada01e1d7e006a37ae
build compiler: x86_64-linux-gnu-gcc-7 7.5.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
  • Docker command, image and tag used
docker run --gpus all -it tensorflow/tensorflow:latest-gpu /bin/bash
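
The nvidia-container-cli debug log above mixes a few warnings into a long run of informational lines. As a reading aid, here is a minimal sketch that pulls out only the warning lines. It is an illustration, not a tool from the NVIDIA packages: the name `warnings_from_log` is hypothetical, and it assumes the leading letter of each line marks the level (`I` for info, `W` for warning), as the log prefix suggests.

```python
import re

# Prefix seen in the log above: level letter, MMDD, time, PID, "file:line]".
LOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<pid>\d+) (?P<source>[\w.]+:\d+)\] (?P<message>.*)$"
)


def warnings_from_log(text: str) -> list:
    """Return 'source: message' for every warning-level line in the log text."""
    found = []
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        # Lines without the log prefix (e.g. the device summary) are skipped.
        if match and match.group("level") == "W":
            found.append(f"{match.group('source')}: {match.group('message')}")
    return found
```

Run against the log in this report, it would list the "failed to set inheritable capabilities" line along with the missing-library and missing-path warnings, which makes it easier to compare a healthy run with one captured after the failure.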
