Summary
On Jetson (aarch64, Tegra SoC) devices, version 1.17.1 fails to create containers when the environment variable NVIDIA_DRIVER_CAPABILITIES contains any of display, graphics, or all.
This can be mitigated by overriding the container environment, for example docker run -e NVIDIA_DRIVER_CAPABILITIES=compute nvcr.io/....
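The mitigation can be sketched as a small filter that strips the failing values from a requested capability string. `safe_caps` is a hypothetical helper name, not part of nvidia-container-toolkit, and the fallback to `compute,utility` is an assumption based only on the combinations I observed working. Values other than the ones I tested are passed through unchanged and are untested.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nvidia-container-toolkit): remove the
# capability values that trigger the failure described in this report.
safe_caps() {
    local requested="${1:-}" kept="" cap
    local IFS=','                      # split the argument on commas
    for cap in $requested; do
        case "$cap" in
            display|graphics|all|"") ;;            # observed to fail on 1.17.1
            *) kept="${kept:+$kept,}$cap" ;;       # keep everything else
        esac
    done
    # Assumption: fall back to a combination observed to work.
    printf '%s\n' "${kept:-compute,utility}"
}
```

Usage would look like docker run -e NVIDIA_DRIVER_CAPABILITIES="$(safe_caps "$WANTED_CAPS")" ..., where WANTED_CAPS is a placeholder for whatever the image or deployment would normally request.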
Steps to reproduce
- Get a Jetson device. I tested with Xavier AGX and Orin AGX DevKits as a reference.
- Install the Docker runtime and nvidia-container-runtime=1.17.1-1.
- Ensure the NVIDIA container runtime is configured. To configure it, run:
sudo nvidia-ctk runtime configure --set-as-default
- Try running a container. For example, the l4t-base image can be used:
docker run -it --rm \
-e NVIDIA_DRIVER_CAPABILITIES=all \
nvcr.io/nvidia/l4t-base:r36.2.0
Or, even with a non-Jetson base image:
docker run -it --rm \
-e NVIDIA_DRIVER_CAPABILITIES=display \
-e NVIDIA_VISIBLE_DEVICES=all \
ubuntu:22.04
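To check several capability values in one go, the steps above can be wrapped in a loop. This is only a sketch: `try_caps` and `run_matrix` are hypothetical helper names, and it assumes that a container which starts and runs `true` counts as "Good" while any non-zero exit from `docker run` counts as "Error".

```shell
#!/usr/bin/env bash
# Sketch only: try_caps and run_matrix are hypothetical helper names.

# Start a throwaway container with the given capability string and
# print "Good" if it starts, "Error" otherwise.
try_caps() {
    if docker run --rm \
        -e NVIDIA_DRIVER_CAPABILITIES="$1" \
        -e NVIDIA_VISIBLE_DEVICES=all \
        ubuntu:22.04 true >/dev/null 2>&1; then
        echo "Good"
    else
        echo "Error"
    fi
}

# Print one "<capabilities> <result>" line per argument.
run_matrix() {
    local caps
    for caps in "$@"; do
        printf '%s %s\n' "$caps" "$(try_caps "$caps")"
    done
}
```

After sourcing the file, run_matrix all compute compute,utility display graphics prints one line per value. The block only defines functions, so nothing is started until run_matrix is called.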
Result
Example of error message
$ docker run -it --rm -e NVIDIA_DRIVER_CAPABILITIES=display -e NVIDIA_VISIBLE_DEVICES=all ubuntu:22.04
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: time="2024-11-13T17:38:55+09:00" level=info msg="Symlinking /var/lib/docker/overlay2/8af1b1d84ee57db598be489bb9ad58fb2d139b77604aead77526787d18a02900/merged/etc/vulkan/icd.d/nvidia_icd.json to /usr/lib/aarch64-linux-gnu/tegra/nvidia_icd.json"
time="2024-11-13T17:38:55+09:00" level=error msg="failed to create link [/usr/lib/aarch64-linux-gnu/tegra/nvidia_icd.json /etc/vulkan/icd.d/nvidia_icd.json]: failed to create symlink: failed to remove existing file: remove /var/lib/docker/overlay2/8af1b1d84ee57db598be489bb9ad58fb2d139b77604aead77526787d18a02900/merged/etc/vulkan/icd.d/nvidia_icd.json: device or resource busy": unknown.
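When reproducing on other devices, it may help to confirm that a failure is this one and not an unrelated Docker error. A minimal sketch, assuming the two substrings from the message above ("failed to create symlink" and "nvidia_icd.json") are enough to identify it; `is_icd_symlink_failure` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the text on stdin looks like the
# failure shown above (the hook could not replace nvidia_icd.json).
is_icd_symlink_failure() {
    local log
    log="$(cat)"    # read the whole error output from stdin
    [[ "$log" == *"failed to create symlink"* && "$log" == *"nvidia_icd.json"* ]]
}
```

It is meant to be fed the combined output of a failing run, for example docker run ... 2>&1 | is_icd_symlink_failure && echo "same failure".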
| Hardware | Jetpack | nvidia-container-toolkit | NVIDIA_DRIVER_CAPABILITIES | Result |
| --- | --- | --- | --- | --- |
| Orin AGX | 6.1 | 1.14.2 | all | Good |
| Orin AGX | 6.1 | 1.17.1 | all | Error |
| Orin AGX | 6.1 | 1.17.1 | compute,utility | Good |
| Orin AGX | 6.1 | 1.17.1 | display | Error |
| Orin AGX | 6.1 | 1.17.1 | graphics | Error |
| Xavier AGX | 5.1.2 | 1.16.1 | all | Good |
| Xavier AGX | 5.1.2 | 1.16.1 | graphics | Good |
| Xavier AGX | 5.1.2 | 1.17.1 | all | Error |
| Xavier AGX | 5.1.2 | 1.17.1 | compute | Good |
| Xavier AGX | 5.1.2 | 1.17.1 | display | Error |
| Xavier AGX | 5.1.2 | 1.17.1 | graphics | Error |