Description
Describe the bug
Building Transformer Engine from source fails during CUDA compilation with:
transformer_engine/common/gemm/cublaslt_gemm.cu(936): error: "cuda" is ambiguous
/mnt/cfs/dev/TransformerEngine/transformer_engine/common/gemm/cublaslt_gemm.cu(936): error: "cuda" is ambiguous
do { if (!(cuda::cublas_version() >= 120205 && cuda::cublas_version() < 130000)) { do { throw ::std::runtime_error(::transformer_engine::concat_strings( "/mnt/cfs/dev/TransformerEngine/transformer_engine/common/gemm/cublaslt_gemm.cu" ":", 936, " in function ", __func__, ": ", ::transformer_engine::concat_strings("Assertion failed: " "cuda::cublas_version() >= 120205 && cuda::cublas_version() < 130000" ". ", ::transformer_engine::concat_strings("Atomic GEMM requires cuBLAS version >=12.2.5 and <13.0.0, but run-time cuBLAS version is ", cuda::cublas_version())))); } while (false); } } while (false)
The failure happens in nvte_cublas_atomic_gemm when compiling the runtime version check that calls cuda::cublas_version().
Steps/Code to reproduce bug
- Clone Transformer Engine.
- Build/install from source in editable mode:
  export NVTE_FRAMEWORK=pytorch
  pip install --no-build-isolation -v -e .
- Build fails with:
.../transformer_engine/common/gemm/cublaslt_gemm.cu(936): error: "cuda" is ambiguous
do { if (!(cuda::cublas_version() >= 120205 && cuda::cublas_version() < 130000)) { ...
Expected behavior
The build completes successfully.
Environment overview (please complete the following information)
Environment details
- PyTorch version: 2.7.1
- Python version: 3.12
- Transformer Engine version: b9f4013
- CUDA version: cuda_12.9.r12.9
- CUDNN version: 9.16.0
Device details
- GPU model: A800
Additional context
This appears to be a namespace collision in this translation unit: both transformer_engine::cuda and global ::cuda can be visible, so unqualified cuda::cublas_version() becomes ambiguous in some toolchains/environments.
The build works fine with v2.11.
Possibly related: #2631
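The suspected collision can be sketched in a few lines of plain C++. This is a minimal illustration, not the actual Transformer Engine code: the stand-in global `cuda` namespace, the `unrelated_helper` function, the hard-coded version value, the `atomic_gemm_supported` function, and the assumption that a `using namespace transformer_engine;` directive is in effect at the call site are all hypothetical. It shows why an unqualified `cuda::` can become ambiguous and one possible way to disambiguate it by fully qualifying the name.

```cpp
// Stand-in for a global ::cuda namespace that toolkit headers may declare
// (hypothetical contents, for illustration only).
namespace cuda {
inline int unrelated_helper() { return 0; }
}  // namespace cuda

namespace transformer_engine {
namespace cuda {
// Hypothetical stand-in for the run-time cuBLAS version query.
inline int cublas_version() { return 120900; }
}  // namespace cuda
}  // namespace transformer_engine

// Global-scope function, similar in spirit to an extern "C" entry point.
bool atomic_gemm_supported() {
  // Assumption: a using-directive like this is in effect at the call site.
  using namespace transformer_engine;

  // With the directive above, name lookup for `cuda` finds both ::cuda and
  // ::transformer_engine::cuda, so this line would be rejected as ambiguous:
  // const int version = cuda::cublas_version();

  // Fully qualifying the namespace removes the ambiguity.
  const int version = ::transformer_engine::cuda::cublas_version();
  return version >= 120205 && version < 130000;
}
```

Uncommenting the ambiguous line should reproduce the same class of error with a host compiler. Whether the global `::cuda` namespace is visible depends on which headers the translation unit pulls in, which may explain why the failure only appears in some toolchains/environments.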