The current injection of the `enable-cuda-compat` hook checks the version suffix of the `libcuda.so.RM_VERSION` file to determine the host driver version. On Tegra-based systems this is not correct: the file is named `libcuda.so.1.1`, so the suffix does not reflect the driver version and the compat libraries in a container are ALWAYS used.
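To illustrate the failure mode, here is a minimal sketch of suffix-based version detection. It is not the toolkit's actual code: `versionSuffix`, `majorOf`, and `compatIsNewer` are hypothetical helpers, the comparison rule (use compat when its major version is higher than the host's) is an assumption, and `535.104.05` and `550` are only example version numbers.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strconv"
	"strings"
)

// versionSuffix returns the text after "libcuda.so." in a library file name,
// or "" if the name does not start with that prefix. Hypothetical helper.
func versionSuffix(path string) string {
	const prefix = "libcuda.so."
	name := filepath.Base(path)
	if !strings.HasPrefix(name, prefix) {
		return ""
	}
	return strings.TrimPrefix(name, prefix)
}

// majorOf returns the leading numeric component of a dotted version string.
func majorOf(version string) (int, error) {
	first, _, _ := strings.Cut(version, ".")
	return strconv.Atoi(first)
}

// compatIsNewer models an ASSUMED decision rule: prefer the container's
// compat libraries when their major version exceeds the host's.
func compatIsNewer(hostMajor, compatMajor int) bool {
	return compatMajor > hostMajor
}

func main() {
	const compatMajor = 550 // example compat library major version
	for _, name := range []string{
		"libcuda.so.535.104.05", // example name carrying a driver version
		"libcuda.so.1.1",        // Tegra name from the description above
	} {
		hostMajor, err := majorOf(versionSuffix(name))
		if err != nil {
			fmt.Println(name, "error:", err)
			continue
		}
		fmt.Println(name, "host major:", hostMajor,
			"use compat:", compatIsNewer(hostMajor, compatMajor))
	}
}
```

Under these assumptions the Tegra file name parses to a host major version of 1, so any realistic compat library looks newer than the host driver.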
The CDI spec generation for Tegra-based systems (mode=csv) must be updated to:
- read the driver version from nvml if available
- update the generation of the `enable-cuda-compat` hook to handle the Orin case.
This means that if an Orin system is detected, alternative CUDA compat libraries (rooted at `/usr/local/cuda/compat-orin` in the container) are used. If these compat libraries are not present, the host drivers are always used, effectively disabling CUDA forward compat on Orin systems.
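The selection described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed implementation: `selectCompatDir` and `dirExists` are hypothetical names, the `isOrin` flag stands in for whatever detection the toolkit ends up using, and `/usr/local/cuda/compat` is assumed to be the default compat location for non-Orin systems.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

const (
	defaultCompatDir = "/usr/local/cuda/compat"      // ASSUMED default location
	orinCompatDir    = "/usr/local/cuda/compat-orin" // path from the description
)

// dirExists reports whether path exists and is a directory.
func dirExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

// selectCompatDir returns the in-container compat directory to consider, or ""
// when none is present and the host drivers should be used. The exists
// callback is injected so the logic can be exercised without a real container.
func selectCompatDir(containerRoot string, isOrin bool, exists func(string) bool) string {
	dir := defaultCompatDir
	if isOrin {
		dir = orinCompatDir
	}
	if !exists(filepath.Join(containerRoot, dir)) {
		return "" // no compat libraries: fall back to the host drivers
	}
	return dir
}

func main() {
	// "/path/to/rootfs" is a placeholder for the container root filesystem.
	dir := selectCompatDir("/path/to/rootfs", true, dirExists)
	if dir == "" {
		fmt.Println("no Orin compat libraries found; using host drivers")
		return
	}
	fmt.Println("using compat libraries from", dir)
}
```

In this sketch an Orin system never falls back to the default compat directory, which matches the behaviour stated above: without `compat-orin`, forward compat is effectively disabled.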
The following PRs are relevant: