Description
Background
I'm trying to compile ABACUS with ENABLE_CUSOLVERMP=1 using the latest NVIDIA HPC SDK 26.1 (released in early 2026), which includes cuSOLVERMp v0.7.2 and cuBLASMp v0.7.0.
However, the current CMake configuration fails because it assumes the presence of libcal.so (Communication Abstraction Library):
CMake Error: The following variables are used in this project, but they are set to NOTFOUND:
CAL_LIBRARY
CUSOLVERMP_LIBRARY
According to NVIDIA's official release notes (cuBLASMp v0.5.0, cuSOLVERMp v0.7.0), both libraries have transitioned away from libcal starting from these versions:
Breaking changes:
cuBLASMp has transitioned from using the Communication Abstraction Library (libcal) to using NCCL directly.
cuSOLVERMp has transitioned from using the Communication Abstraction Library (libcal) to using NCCL directly.
This means after August 2025:
- libcal.so is no longer shipped with HPC SDK ≥ 25.7 (for cuBLASMp) and ≥ 26.1 (for cuSOLVERMp).
- Initialization now uses NCCL communicators directly (and optionally NVSHMEM), not CAL handles.
- The current DiagoCusolverMp implementation and CMake logic (find_library(CAL_LIBRARY ...)) are incompatible with modern HPC SDK versions.
Describe the solution you'd like
ABACUS needs to be updated to support the new NCCL-based initialization model for cuSOLVERMp, in both the CMake configuration and the DiagoCusolverMp implementation.
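On the CMake side, one possible direction is to make CAL optional and fall back to NCCL when libcal.so is absent. The fragment below is only a rough sketch of that idea, not the actual ABACUS build logic: CUSOLVERMP_LIBRARY and CAL_LIBRARY are the variables from the error above, while NCCL_LIBRARY and CUSOLVERMP_LINK_LIBS are hypothetical names introduced for illustration.

```cmake
# Sketch only: make CAL optional so newer HPC SDK layouts still configure.
# NCCL_LIBRARY and CUSOLVERMP_LINK_LIBS are hypothetical names, not ABACUS's.
find_library(CUSOLVERMP_LIBRARY NAMES cusolverMp)
find_library(CAL_LIBRARY NAMES cal)    # legacy backend, may be absent
find_library(NCCL_LIBRARY NAMES nccl)  # backend used by newer releases

if(NOT CUSOLVERMP_LIBRARY)
  message(FATAL_ERROR "cuSOLVERMp not found; check the HPC SDK library paths")
endif()

if(CAL_LIBRARY)
  # Older SDKs: keep the existing CAL-based link line.
  set(CUSOLVERMP_LINK_LIBS ${CUSOLVERMP_LIBRARY} ${CAL_LIBRARY})
elseif(NCCL_LIBRARY)
  # Newer SDKs: libcal.so is gone, so link NCCL instead.
  set(CUSOLVERMP_LINK_LIBS ${CUSOLVERMP_LIBRARY} ${NCCL_LIBRARY})
else()
  message(FATAL_ERROR "Neither libcal nor NCCL was found for cuSOLVERMp")
endif()

message(STATUS "cuSOLVERMp link libraries: ${CUSOLVERMP_LINK_LIBS}")
```

Checking only for the presence of libcal.so is a simplification, since a stale copy from an older SDK could still be on the search path; testing the cuSOLVERMp version would probably be more robust. This also covers only the configure step, and the CAL-handle initialization in DiagoCusolverMp would still need a matching NCCL code path.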