Refactor: support modern cuSOLVERMp with NCCL #6951

Merged
mohanchen merged 12 commits into develop from transition-to-nccl on Feb 7, 2026

dzzz2001 (Collaborator) commented Feb 4, 2026

Summary

  • Add NCCL backend support for cuSOLVERMp initialization (version >= 0.7.0)
  • Retain libcal backend compatibility for older cuSOLVERMp versions (< 0.7.0)
  • CMake automatically selects the appropriate backend based on the cuSOLVERMp version
  • Update toolchain documentation
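
The version rule in the list above can be sketched as a small shell function. This only illustrates the threshold logic and is not the CMake code from this PR: `select_backend` is a hypothetical name, and the sketch assumes the version is given as a dotted numeric string such as `0.7.0`.

```shell
# Hypothetical helper (not part of ABACUS): prints the backend that the rule
# "NCCL for cuSOLVERMp >= 0.7.0, CAL for older versions" would pick.
select_backend() {
    # Split "major.minor.patch" on dots; missing components default to 0.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    major=${1:-0}
    minor=${2:-0}

    # Any 1.x or later release, or 0.7 and later, counts as NCCL-based.
    if [ "$major" -gt 0 ] || [ "$minor" -ge 7 ]; then
        echo "nccl"
    else
        echo "cal"
    fi
}

select_backend 0.7.0   # prints "nccl"
select_backend 0.6.0   # prints "cal"
```

Comparing the components numerically, rather than comparing the strings, keeps a version such as 0.10.1 on the NCCL side of the threshold.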

Closes #6940

Background

Starting with cuSOLVERMp v0.7.0, NVIDIA moved from the Communication Abstraction Library (CAL) to using NCCL directly. This PR adds NCCL support while keeping backward compatibility with older, CAL-based versions.

Build Instructions

To compile ABACUS with cuSOLVERMp support:

cmake -DENABLE_CUSOLVERMP=ON -DNVHPC_ROOT_DIR=/opt/nvidia/hpc_sdk/Linux_x86_64/xx.x/ ...

Note: If you have loaded the NVHPC module file, NVHPC_ROOT_DIR will be automatically inferred and does not need to be set manually.
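
As a rough illustration of the note above, a wrapper script could pass `NVHPC_ROOT_DIR` only when a path is given explicitly and otherwise leave it to be inferred. `nvhpc_cmake_arg` is a hypothetical helper, not something shipped with ABACUS, and the example path is a placeholder.

```shell
# Hypothetical helper: emit the -DNVHPC_ROOT_DIR argument only when an
# explicit SDK path is supplied; print nothing otherwise, so the value is
# left to be inferred (for example after loading the NVHPC module file).
nvhpc_cmake_arg() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "-DNVHPC_ROOT_DIR=$1"
    fi
}

# Example usage (commented out so the sketch has no side effects):
#   cmake -B build -DENABLE_CUSOLVERMP=ON $(nvhpc_cmake_arg "$MY_NVHPC_PATH")
# where MY_NVHPC_PATH is an assumed variable that may be empty or unset.
```

With an empty argument the command substitution expands to nothing, so the `cmake` call reduces to the module-based form.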

dzzz2001 changed the title from "Support modern cuSOLVERMp/cuBLASMp with NCCL (replaces libcal)" to "Refactor: support modern cuSOLVERMpwith NCCL (replaces libcal)" on Feb 4, 2026

dzzz2001 (Collaborator, Author) commented Feb 4, 2026

I updated some toolchain-related docs. @QuantumMisaka, could you please review them? The toolchain might need further changes.

dzzz2001 marked this pull request as draft on Feb 4, 2026, 16:46
dzzz2001 changed the title from "Refactor: support modern cuSOLVERMpwith NCCL (replaces libcal)" to "Refactor: support modern cuSOLVERMp with NCCL" on Feb 6, 2026
dzzz2001 marked this pull request as ready for review on Feb 6, 2026, 10:42
dzzz2001 marked this pull request as draft on Feb 6, 2026, 12:25
dzzz2001 marked this pull request as ready for review on Feb 6, 2026, 12:29

ZhouXY-PKU (Collaborator) commented

#6758 can also be closed.

mohanchen added the labels "GPU & DCU & HPC", "Refactor", and "Compile & CICD & Docs & Dependencies" on Feb 7, 2026
mohanchen merged commit 06fb968 into develop on Feb 7, 2026
25 checks passed
Labels

  • Compile & CICD & Docs & Dependencies: Issues related to compiling ABACUS
  • GPU & DCU & HPC: GPU and DCU and HPC related any issues
  • Refactor: Refactor ABACUS codes

Development

Successfully merging this pull request may close these issues.

Feature Request: Support for modern cuSOLVERMp/cuBLASMp (v0.5.0+ and v0.7.0+) without libcal dependency