test MPI GPU awareness using Send/Recv instead of MPI_Allreduce #310
For some obscure reason, MPI_Allreduce in OpenMPI 4.1.5 is broken when used on CUDA arrays. This led Idefix to conclude that it was not running a CUDA-aware MPI library on JeanZay with H100 GPUs (for which only OpenMPI 4.1.5 is available).
This issue might be related to open-mpi/ompi#9845.
Since Idefix does not use reduction operations on GPUs, but only Send/Recv, this PR patches the problem by testing the MPI library with Send/Recv instead of Allreduce to detect non-CUDA-aware libraries.
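Below is a minimal sketch of how such a Send/Recv-based probe could look. The function name, buffer layout, and error handling are illustrative assumptions, not the actual Idefix code: the idea is simply to exchange a device buffer with `MPI_Sendrecv` and check that the value arrives intact.

```cpp
// Illustrative sketch (not the Idefix implementation): probe CUDA awareness
// by exchanging a device buffer with MPI_Sendrecv instead of MPI_Allreduce.
#include <mpi.h>
#include <cuda_runtime.h>

bool MpiIsCudaAware(MPI_Comm comm) {
  int rank;
  MPI_Comm_rank(comm, &rank);

  // Return error codes instead of aborting, so a failed transfer can be
  // detected (note: a non-CUDA-aware MPI may still segfault outright).
  MPI_Comm_set_errhandler(comm, MPI_ERRORS_RETURN);

  double *dSend = nullptr, *dRecv = nullptr;
  cudaMalloc(&dSend, sizeof(double));
  cudaMalloc(&dRecv, sizeof(double));

  const double one = 1.0;
  cudaMemcpy(dSend, &one, sizeof(double), cudaMemcpyHostToDevice);

  // Each rank sends its device buffer to itself; MPI_Sendrecv handles both
  // sides of the exchange, so this cannot deadlock.
  int err = MPI_Sendrecv(dSend, 1, MPI_DOUBLE, rank, 0,
                         dRecv, 1, MPI_DOUBLE, rank, 0,
                         comm, MPI_STATUS_IGNORE);

  double result = 0.0;
  cudaMemcpy(&result, dRecv, sizeof(double), cudaMemcpyDeviceToHost);

  cudaFree(dSend);
  cudaFree(dRecv);

  // CUDA-aware if the transfer succeeded AND the data arrived unchanged.
  return (err == MPI_SUCCESS) && (result == one);
}
```

Such a probe would be called once after `MPI_Init`, before any GPU buffers are exchanged; it exercises the same point-to-point path Idefix actually uses, so it avoids tripping over the broken Allreduce path in OpenMPI 4.1.5.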