Option --gpus fails with AMD and Intel GPUs #2063

@mviereck

Description

The new option --gpus, which provides the GPU to containers, fails with AMD and Intel GPUs.

It only works on systems with an NVIDIA GPU, NVIDIA's proprietary driver, and NVIDIA's
container runtime setup.

A CLI option should be general, not vendor-specific.

Steps to reproduce the issue:
On a system with an AMD GPU:

$ docker run --rm --gpus all debian echo
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Describe the results you received:
Option --gpus all fails on a system with an AMD GPU.
It likely also fails with Intel GPUs and with NVIDIA GPUs using the nouveau driver.

Describe the results you expected:
Option --gpus all should provide the GPU to the container.

Output of docker --version:
Docker version 19.03.1, build 74b1e89

Discussion
Coming from #1200 and #1714, I am opening a new ticket.
I want to analyze the current state of --gpus and make proposals:

  • Either make GPU support explicitly NVIDIA-specific via docker plugin install and drop the CLI option --gpus.
  • Or make --gpus work in general for all vendors. I would prefer that.

The current state:

  • --gpus works with a specific NVIDIA setup only. Dependencies:
    • NVIDIA GPU
    • NVIDIA proprietary driver on host
    • nvidia-container-toolkit on host
    • nvidia/nvidia-docker image.
  • --gpus fails with NVIDIA GPUs on setups that do not fulfill the above dependencies.
  • --gpus fails with AMD GPUs.
  • --gpus fails with Intel GPUs.
  • --gpus fails with NVIDIA GPUs and nouveau.
  • Unknown/to check: Does --gpus work with the combination:
    • NVIDIA GPU
    • NVIDIA proprietary driver on host
    • same NVIDIA proprietary driver in container (arbitrary image)
    • Without nvidia-container-toolkit on host

Desirable:

  • Support for other vendors, for nouveau, and for an NVIDIA driver that may already exist in the container. To do:
    • --gpus should provide /dev/dri to container.
    • --gpus should provide /dev/nvidia* to container.
    • --gpus should maybe provide /dev/vga_arbiter to container.
    • --gpus should maybe add the container user to the groups video and render to support unprivileged users in container.

At least for --gpus all or e.g. --gpus intel this should not be too hard.
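
For illustration, the to-do items above can be approximated today with existing docker run flags. The following is only a minimal sketch, not a proposal for the actual implementation: gpu_device_flags and gpu_group_flags are hypothetical helper names, and the optional directory argument exists only so the device lookup can be tried against a fake tree. It assumes a Linux host where getent is available.

```shell
#!/bin/sh
# Hypothetical helpers (not part of docker): print the docker run flags
# that a vendor-neutral --gpus all might imply.

# Emit one --device flag per GPU-related device node that exists.
# $1: device directory to inspect (assumption: defaults to /dev).
gpu_device_flags() {
    dev_root="${1:-/dev}"
    flags=""
    # /dev/dri         -> Mesa drivers (AMD, Intel, nouveau)
    # /dev/nvidia*     -> NVIDIA proprietary driver
    # /dev/vga_arbiter -> listed as "maybe" in the issue
    for node in "$dev_root"/dri "$dev_root"/nvidia* "$dev_root"/vga_arbiter; do
        # An unmatched glob stays literal, so check for existence.
        if [ -e "$node" ]; then
            flags="$flags --device $node"
        fi
    done
    printf '%s\n' "${flags# }"
}

# Emit --group-add flags for the host groups video and render, by
# numeric GID, so an unprivileged container user can open the devices.
gpu_group_flags() {
    flags=""
    for group in video render; do
        gid="$(getent group "$group" 2>/dev/null | cut -d: -f3)"
        if [ -n "$gid" ]; then
            flags="$flags --group-add $gid"
        fi
    done
    printf '%s\n' "${flags# }"
}

# Possible usage (word splitting of the output is intended here):
#   docker run --rm $(gpu_device_flags) $(gpu_group_flags) debian ls -l /dev/dri
```

Numeric GIDs are used because the groups video and render may not exist, or may have different IDs, inside an arbitrary image. This only shares device nodes; a matching user-space driver still has to be present in the container.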

Possibly in addition, or left to the user:

  • Driver check on host and in container by docker.
  • Driver installation in container by docker. (Probably going too far. However, it is possible e.g. with NVIDIA's proprietary driver. I am not sure how to accomplish this with Mesa drivers, but it is likely possible, too.)
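
As a rough idea of what a driver check on the host could look like, the sketch below reads the kernel driver bound to each DRM card from sysfs. host_gpu_drivers is a hypothetical helper name, and the optional root argument is an assumption added so the function can be tried against a fake tree. It assumes the usual Linux layout in which /sys/class/drm/cardN/device/driver is a symlink to the driver directory, and it says nothing about the user-space driver inside the container.

```shell
#!/bin/sh
# Hypothetical helper (not part of docker): print "<card> <kernel driver>"
# for every DRM card found below the given sysfs root.
# $1: sysfs root to inspect (assumption: defaults to /sys).
host_gpu_drivers() {
    sys_root="${1:-/sys}"
    for card in "$sys_root"/class/drm/card[0-9]*; do
        name="${card##*/}"
        # Skip connector entries such as card0-HDMI-A-1.
        case "$name" in
            *-*) continue ;;
        esac
        link="$card/device/driver"
        # The symlink target's basename is the driver name,
        # e.g. amdgpu, i915, nouveau or nvidia.
        if [ -L "$link" ]; then
            printf '%s %s\n' "$name" "$(basename "$(readlink "$link")")"
        fi
    done
}
```

A tool could use the reported driver name to decide which device nodes to share and which user-space driver the container needs.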
