
[Bug]: Non-privileged MIG_STRATEGY=mixed fails due to missing device-cgroup access for nvidia-cap1/nvidia-cap2 (nvidia-container-toolkit 1.18+ / 1.19.0) #1740

@sulixu

Description

Describe the bug

With MIG_STRATEGY=mixed on nvidia-container-toolkit 1.18+ (also reproducible on 1.19.0), the non-privileged NVIDIA device-plugin pod fails with:

error getting device memory info: Insufficient Permissions

Using privileged: true mitigates the issue, but the NVIDIA device-plugin is typically deployed non-privileged, and privileged mode is not documented as a requirement for mixed MIG discovery.

To Reproduce

Repro Matrix

  1. Non-privileged + MIG_STRATEGY=mixed => fails
  2. Privileged + MIG_STRATEGY=mixed => succeeds, MIG resources register
  3. Non-privileged + MIG_STRATEGY=none => succeeds
  4. Non-privileged + container-toolkit mode = "legacy" => succeeds
  5. Non-privileged + 1.17.x container-toolkit => succeeds
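
To step through this matrix on a live cluster, the privileged toggle can be flipped in place. A sketch, assuming the plugin runs as a DaemonSet named nvidia-device-plugin-daemonset in kube-system with a single container (both names are assumptions, not taken from this report):

# Switch the container to privileged (entry 2); set "value": false to
# return to the non-privileged cases.
kubectl -n kube-system patch daemonset nvidia-device-plugin-daemonset \
  --type=json \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/securityContext/privileged", "value": true}]'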

Observed Host State

/dev/nvidia-caps exists on the node and includes cap1 and cap2:

# ls -l /dev/nvidia-caps/
total 0
cr-------- 1 root root 237,   1 Mar 17 17:55 nvidia-cap1
cr--r--r-- 1 root root 237, 102 Mar 17 17:55 nvidia-cap102
cr--r--r-- 1 root root 237, 103 Mar 17 17:55 nvidia-cap103
cr--r--r-- 1 root root 237, 111 Mar 17 17:55 nvidia-cap111
cr--r--r-- 1 root root 237, 112 Mar 17 17:55 nvidia-cap112
cr--r--r-- 1 root root 237, 120 Mar 17 17:55 nvidia-cap120
cr--r--r-- 1 root root 237, 121 Mar 17 17:55 nvidia-cap121
cr--r--r-- 1 root root 237,   2 Mar 17 17:55 nvidia-cap2
cr--r--r-- 1 root root 237,  66 Mar 17 17:55 nvidia-cap66
cr--r--r-- 1 root root 237,  67 Mar 17 17:55 nvidia-cap67
cr--r--r-- 1 root root 237,  75 Mar 17 17:55 nvidia-cap75
cr--r--r-- 1 root root 237,  76 Mar 17 17:55 nvidia-cap76
cr--r--r-- 1 root root 237,  84 Mar 17 17:55 nvidia-cap84
cr--r--r-- 1 root root 237,  85 Mar 17 17:55 nvidia-cap85
cr--r--r-- 1 root root 237,  93 Mar 17 17:55 nvidia-cap93
cr--r--r-- 1 root root 237,  94 Mar 17 17:55 nvidia-cap94

In the non-privileged pod, however, device-cgroup access is not granted for cap1/cap2 (EPERM in devcgroup_check_permission), while a privileged pod bypasses those restrictions and works.
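
The denial can be confirmed from inside the running plugin container with a one-byte read of a capability node. A sketch, assuming /dev/nvidia-caps is visible in the container and using placeholder pod/container names:

# With no device-cgroup rule this fails with "Operation not permitted"
# (EPERM) even though the container runs as root.
kubectl -n kube-system exec <device-plugin-pod> -c nvidia-device-plugin-ctr -- \
  dd if=/dev/nvidia-caps/nvidia-cap2 of=/dev/null bs=1 count=1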

CDI State

nvidia-ctk cdi list on the node shows only 15 devices and does not include the MIG management classes:

# nvidia-ctk cdi list
INFO[0000] Found 15 CDI devices
nvidia.com/gpu=0:0
nvidia.com/gpu=0:1
nvidia.com/gpu=0:2
nvidia.com/gpu=0:3
nvidia.com/gpu=0:4
nvidia.com/gpu=0:5
nvidia.com/gpu=0:6
nvidia.com/gpu=MIG-07ad8e7c-0197-5973-bd05-7fcd18ae754e
nvidia.com/gpu=MIG-30171940-9037-5504-89d2-7b5acd71bf5d
nvidia.com/gpu=MIG-90825510-8f47-5052-8e15-ead851500769
nvidia.com/gpu=MIG-90f01f31-b6fe-54f6-acc8-0e228a8ca6e2
nvidia.com/gpu=MIG-9207551c-4419-5285-9621-19e99aec2afc
nvidia.com/gpu=MIG-93b1626d-9793-506e-81de-1b00f8f9edfc
nvidia.com/gpu=MIG-acec7971-611e-54f5-ae21-4a0f5a5eb074
nvidia.com/gpu=all

Missing:

  • nvidia.com/mig-config=all
  • nvidia.com/mig-monitor=all

Device Plugin Error

error starting plugins: unable to add default resources to config: error visiting devices: error visiting device: error visiting device: error getting device memory info: Insufficient Permissions

Device-plugin pod (non-privileged):

  env:
    - name: MPS_ROOT
      value: /run/nvidia/mps
    - name: MIG_STRATEGY
      value: mixed
    - name: NVIDIA_VISIBLE_DEVICES
      value: all
    - name: NVIDIA_DRIVER_CAPABILITIES
      value: compute,utility
    - name: NVIDIA_MIG_MONITOR_DEVICES
      value: all
    image: nvcr.io/nvidia/k8s-device-plugin:v0.18.0
    imagePullPolicy: IfNotPresent
    name: nvidia-device-plugin-ctr
    resources: {}
    securityContext:
      capabilities:
        add:
        - SYS_ADMIN
      privileged: false
      runAsGroup: 0
      runAsUser: 0

(This config works with nvidia-container-toolkit 1.17.x, but not with any 1.18.x release.)

Expected behavior

The device-plugin pod shouldn't need to be granted privileged: true. For non-privileged MIG_STRATEGY=mixed, what is the officially supported configuration?

Per the NVIDIA MIG docs (https://docs.nvidia.com/datacenter/tesla/mig-user-guide/device-nodes-and-capabilities.html), mig-config maps to nvidia-cap1 and requires cgroup-granted access in non-privileged containers. In our case the device nodes exist on the node, but the non-privileged container is missing device-cgroup permissions for cap1/cap2.
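
Those docs also describe the /proc capabilities tree; its DeviceFileMinor fields should line up with the minors in the /dev/nvidia-caps listing above (1 for mig/config, 2 for mig/monitor), which can be checked on the node:

# Each file reports, among other fields, the DeviceFileMinor of the
# corresponding /dev/nvidia-caps node.
cat /proc/driver/nvidia/capabilities/mig/config
cat /proc/driver/nvidia/capabilities/mig/monitor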

It appears the default CDI refresh only generates nvidia.com/gpu=* entries and does not expose the MIG management CDI classes that would inject the corresponding device nodes and cgroup rules into non-privileged management containers.
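
For illustration only (a hand-written fragment following the CDI spec schema, not actual toolkit output): a management spec for the missing mig-config class would be expected to carry containerEdits that inject the capability node and its device-cgroup rule, roughly:

cdiVersion: "0.5.0"
kind: nvidia.com/mig-config
devices:
  - name: all
    containerEdits:
      deviceNodes:
        # Major/minor copied from the host listing above; permissions is
        # the device-cgroup access to grant.
        - path: /dev/nvidia-caps/nvidia-cap1
          type: c
          major: 237
          minor: 1
          permissions: rw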

Environment

  • nvidia-container-toolkit version: 1.19.0-1 (nvidia-ctk --version: 1.19.0, commit ec7b4e2)
  • NVIDIA Driver Version: 580.126.09
  • Host OS: Ubuntu 22.04.5 LTS
  • Kernel Version: 5.15.0-1102-azure
  • Container Runtime Version: containerd 1.7.30-2 (containerd://1.7.30-2)
  • CPU Architecture: x86_64 (amd64 in Kubernetes node info)
  • GPU Model(s): NVIDIA H100 NVL

Question

Mitigation with a non-privileged device-plugin pod (requesting confirmation)

  1. Runtime config:

     nvidia-ctk config --config-file=/etc/nvidia-container-runtime/config.toml --in-place \
       --set accept-nvidia-visible-devices-as-volume-mounts=false \
       --set accept-nvidia-visible-devices-envvar-when-unprivileged=true

  2. Generate management CDI specs:

     nvidia-ctk cdi generate --mode=management --vendor=nvidia.com --class=mig-config \
       --output=/var/run/cdi/nvidia-mig-config.yaml
     nvidia-ctk cdi generate --mode=management --vendor=nvidia.com --class=mig-monitor \
       --output=/var/run/cdi/nvidia-mig-monitor.yaml
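
If this is the supported path, nvidia-ctk cdi list should then also report nvidia.com/mig-config=all and nvidia.com/mig-monitor=all, and (assuming the runtime resolves fully-qualified CDI names from the env var) the non-privileged pod could request them alongside the GPUs, e.g.:

    - name: NVIDIA_VISIBLE_DEVICES
      value: nvidia.com/gpu=all,nvidia.com/mig-config=all,nvidia.com/mig-monitor=all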
