Closed
7 changes: 6 additions & 1 deletion pkg/provisioner/templates/nv-driver.go
@@ -113,8 +113,13 @@ holodeck_progress "$COMPONENT" 3 5 "Adding CUDA repository"
 if [[ ! -f /etc/apt/sources.list.d/cuda*.list ]] || \
    [[ ! -f /usr/share/keyrings/cuda-archive-keyring.gpg ]]; then
     distribution=$(. /etc/os-release; echo "${ID}${VERSION_ID}" | sed -e 's/\.//g')
+    # Determine CUDA repo architecture (NVIDIA uses "sbsa" for arm64 servers)
+    CUDA_ARCH="$(uname -m)"
+    if [[ "$CUDA_ARCH" == "aarch64" ]]; then
+        CUDA_ARCH="sbsa"
+    fi
Comment on lines +118 to +120
Copilot AI Feb 14, 2026

The architecture detection maps aarch64 to sbsa and implicitly passes x86_64 through unchanged, but it has no fallback or error handling for unsupported architectures. Other templates in the codebase (e.g., container-toolkit.go:243-253) use case statements with explicit error handling for unsupported architectures. Consider adding an else clause for unexpected architecture values, or at least a comment explaining that x86_64 is used as-is.

Suggested change
-    if [[ "$CUDA_ARCH" == "aarch64" ]]; then
-        CUDA_ARCH="sbsa"
-    fi
+    case "$CUDA_ARCH" in
+        aarch64)
+            CUDA_ARCH="sbsa"
+            ;;
+        x86_64)
+            # use x86_64 as-is
+            ;;
+        *)
+            holodeck_log "ERROR" "$COMPONENT" "unsupported architecture for CUDA repository: $CUDA_ARCH"
+            exit 1
+            ;;
+    esac
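
For illustration only, the same mapping can be wrapped in a function so that each branch can be exercised without changing the host architecture. This is a minimal sketch, not part of the PR: the name `cuda_repo_arch` and its optional argument are hypothetical, and it writes to stderr instead of calling `holodeck_log` so that it runs standalone.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): map a `uname -m` value to the
# architecture directory used in the CUDA repository URL.
# The optional first argument overrides `uname -m`, which makes the
# mapping testable on any host.
cuda_repo_arch() {
    local machine="${1:-$(uname -m)}"
    case "$machine" in
        aarch64)
            # NVIDIA uses "sbsa" for arm64 servers
            echo "sbsa"
            ;;
        x86_64)
            # used as-is
            echo "x86_64"
            ;;
        *)
            echo "unsupported architecture for CUDA repository: $machine" >&2
            return 1
            ;;
    esac
}
```

A caller could then write `CUDA_ARCH="$(cuda_repo_arch)" || exit 1`, which keeps the fail-fast behavior of the suggested `case` statement.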

     holodeck_retry 3 "$COMPONENT" wget -q \
-        "https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/cuda-keyring_1.1-1_all.deb"
+        "https://developer.download.nvidia.com/compute/cuda/repos/$distribution/${CUDA_ARCH}/cuda-keyring_1.1-1_all.deb"
     sudo dpkg -i cuda-keyring_1.1-1_all.deb
     rm -f cuda-keyring_1.1-1_all.deb
     holodeck_retry 3 "$COMPONENT" sudo apt-get update
30 changes: 30 additions & 0 deletions pkg/provisioner/templates/nv-driver_test.go
@@ -178,3 +178,33 @@ func TestNVDriverTemplate(t *testing.T) {
 		})
 	}
 }
+
+func TestNVDriverTemplate_CUDARepoArch(t *testing.T) {
+	driver := &NvDriver{
+		Branch: defaultNVBranch,
+	}
+
+	var output bytes.Buffer
+	err := driver.Execute(&output, v1alpha1.Environment{})
+	require.NoError(t, err)
+
+	outStr := output.String()
+
+	// Must NOT contain hardcoded x86_64 in the CUDA repo URL
+	require.NotContains(t, outStr, "cuda/repos/$distribution/x86_64/",
+		"Template must not hardcode x86_64 in the CUDA repository URL")
+
+	// Must contain runtime architecture detection
+	require.Contains(t, outStr, `CUDA_ARCH="$(uname -m)"`,
+		"Template must detect architecture at runtime via uname -m")
+
+	// Must contain aarch64 -> sbsa mapping
+	require.Contains(t, outStr, `if [[ "$CUDA_ARCH" == "aarch64" ]]; then`,
+		"Template must check for aarch64 architecture")
+	require.Contains(t, outStr, `CUDA_ARCH="sbsa"`,
+		"Template must map aarch64 to sbsa for NVIDIA CUDA repos")
+
+	// Must use CUDA_ARCH variable in the wget URL
+	require.Contains(t, outStr, "${CUDA_ARCH}/cuda-keyring",
+		"Template must use CUDA_ARCH variable in the wget URL")
+}