Enable NVIDIA runtime for competitor container #30

Merged: caguero merged 1 commit into vorc from cuda_for_competitor on Jan 4, 2021

mabelzhang (Collaborator) commented Dec 17, 2020

Draft until I can run a solution that uses CUDA.

Add --runtime=nvidia and --privileged for the competitor container.

The server container already had these two flags, but the competitor container did not.
If the competitor's solution uses an NVIDIA CUDA-capable GPU, then the nvidia runtime flag is required.
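
For reference, a minimal sketch of what passing both flags to docker run could look like. This is a dry run that only prints the command. COMPETITOR_IMAGE and its default value are hypothetical placeholders, not the names used by the scripts in this repository, and --runtime=nvidia assumes the NVIDIA container runtime is installed on the host.

```shell
#!/bin/sh
# Sketch only: COMPETITOR_IMAGE is a hypothetical placeholder, not the image
# name used by this repository's scripts.
COMPETITOR_IMAGE="${COMPETITOR_IMAGE:-example/competitor:latest}"
CONTAINER_NAME="vorc-competitor-system"

# Keep the arguments as positional parameters so each flag stays one word.
#   --runtime=nvidia  selects the NVIDIA container runtime (driver, nvidia-smi)
#   --privileged      gives the container access to all host devices
set -- run --name "$CONTAINER_NAME" --runtime=nvidia --privileged "$COMPETITOR_IMAGE"

# Dry run: print the command instead of executing it.
cmd="docker $*"
echo "$cmd"

# To actually start the container, uncomment:
# docker "$@"
```

Building the argument list first makes it easy to print and review the flags before launching anything.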

Without --runtime=nvidia:

$ docker exec -it vorc-competitor-system bash
$ nvidia-smi
bash: nvidia-smi: command not found

With --runtime=nvidia, nvidia-smi is found and reports CUDA Version: 11.0 inside the Docker container:

$ docker exec -it vorc-competitor-system bash
$ nvidia-smi
Thu Dec 17 01:48:45 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.45.01    Driver Version: 455.45.01    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   73C    P0    N/A /  N/A |   1875MiB /  4042MiB |     74%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

On my host, I actually have CUDA Version 11.1. I don't know if that makes a difference.
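
To compare the version reported on the host with the one reported in the container, a small helper along these lines might help. It is only a sketch: cuda_version_from_smi is a hypothetical name, and it assumes the nvidia-smi header keeps the "CUDA Version: X.Y" format shown above.

```shell
#!/bin/sh
# Print the "CUDA Version" value found in nvidia-smi output read from stdin.
# Hypothetical helper; assumes the header format "CUDA Version: X.Y".
cuda_version_from_smi() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Example using the header line from the container output above.
sample='| NVIDIA-SMI 455.45.01    Driver Version: 455.45.01    CUDA Version: 11.0     |'
container_version="$(printf '%s\n' "$sample" | cuda_version_from_smi)"
echo "container CUDA version: $container_version"

# On a real machine, the two values could be gathered with:
#   nvidia-smi | cuda_version_from_smi
#   docker exec vorc-competitor-system nvidia-smi | cuda_version_from_smi
```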

Without --privileged:

$ docker exec -it vorc-competitor-system bash
$ ls /dev/nvidia*
/dev/nvidia0  /dev/nvidiactl

With --privileged, the ls result matches my host machine exactly:

$ docker exec -it vorc-competitor-system bash
$ ls /dev/nvidia* -1
/dev/nvidia-modeset
/dev/nvidia-uvm
/dev/nvidia-uvm-tools
/dev/nvidia0
/dev/nvidiactl

/dev/nvidia-caps:
nvidia-cap1
nvidia-cap2

Signed-off-by: Mabel Zhang <mabel@openrobotics.org>
mabelzhang requested a review from caguero December 17, 2020 10:04
mabelzhang marked this pull request as ready for review December 19, 2020 02:51

mabelzhang (Collaborator, Author) commented Dec 19, 2020

We no longer need this, and we cannot test for sure that it works for solutions using CUDA.

Do we still want to merge it, since it still gives the competitor container the /dev/nvidia* devices and CUDA driver?

caguero merged commit 8faec25 into vorc Jan 4, 2021
mabelzhang deleted the cuda_for_competitor branch January 7, 2021 07:10
mabelzhang added a commit that referenced this pull request Jan 13, 2021
Signed-off-by: Mabel Zhang <mabel@openrobotics.org>
mabelzhang added a commit that referenced this pull request Nov 2, 2021