Describe the bug
Reported by our NRL colleagues:
Here is what I was referring to on the call.
module unload craype-network-ofi
module load craype-network-ucx
feiliu@narwhal08:~/ESMF/esmf> module list
Currently Loaded Modulefiles:
- craype/2.7.19 82) odc/1.4.6 163) libzmq/4.3.5
- craype-x86-rome 83) py-attrs/21.4.0 164) py-greenlet/2.0.2
- libfabric/1.12.1.2.2.1 84) py-pycparser/2.21 165) py-gevent/1.5.0
- craype-network-ofi 85) py-cffi/1.15.1 166) py-pyzmq/25.0.2
- cray-dsmml/0.2.2 86) py-findlibs/0.0.2 167) py-tomli/2.0.1
- perftools-base/22.09.0 87) py-eccodes/1.5.0 168) py-urwid/2.1.2
- cray-pals/1.2.12 88) py-f90nml/1.4.3 169) py-cylc-flow/8.2.3
- bct-env/0.1 89) py-h5py/3.7.0 170) py-aiofiles/0.7.0
- mpscp/1.3a 90) py-cftime/1.0.3.4 171) py-zipp/3.17.0
- PrgEnv-intel/8.3.2 91) py-netcdf4/1.5.8 172) py-importlib-metadata/6.6.0
- intel-classic/2021.4.0 92) py-bottleneck/1.3.7 173) py-more-itertools/9.1.0
- cray-mpich/8.1.14 93) py-numexpr/2.8.4 174) py-jaraco-classes/3.2.3
- cray-python/3.9.7.1 94) py-et-xmlfile/1.0.1 175) py-jeepney/0.8.0
- cray-libsci/22.08.1.1 95) py-openpyxl/3.1.2 176) py-secretstorage/3.3.3
- qt/5.15.2 96) py-six/1.16.0 177) py-keyring/23.13.1
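The listing above still shows craype-network-ofi as loaded, so it may be worth confirming which network module is actually active after the swap. A minimal sketch, assuming a POSIX shell with `grep` and `sort` available; `network_modules` is a hypothetical helper name, not part of the module system or ESMF:

```shell
# Hypothetical helper: reads `module list` output on stdin and prints each
# craype-network-* module name it finds, one per line, without duplicates.
network_modules() {
    grep -o 'craype-network-[A-Za-z0-9_]*' | sort -u
}

# Assumed usage on the target system. `module list` commonly writes to
# stderr, hence the 2>&1 redirection:
#   module list 2>&1 | network_modules
```

If the swap took effect, the helper should print only craype-network-ucx; seeing craype-network-ofi would suggest the unload did not apply in that shell.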
For your reference, this is the default list of modules we load for Navy ESPC. The performance difference between the ofi and ucx implementations is about 50%.
export ARCH=intel_cray
module purge
module use --append /app/projects/NEPTUNE/modulefiles
module load cylc/7.9.3
module use --append /app/projects/espc/software/ModuleFiles
module load conda/3
module load PrgEnv-intel/8.1.0
module load craype/2.7.14
module load intel/2021.4.0
module load craype-x86-rome
module unload craype-network-ofi
module load craype-network-ucx
module load libfabric
module load cray-libsci/21.08.1.2
module load cray-pals/1.0.17
module load gcc/10.2.0
module load bct-env
module load cray-fftw/3.3.8.11
module load cray-pmi/6.0.13
module load cray-pmi-lib/6.0.13
module load cray-libpals/1.0.17
module load crayintel/mpi/hdf5/1.10.5
module load crayintel/mpi/netcdf-c/4.6.3
module load crayintel/mpi/netcdf-fortran/4.4.4
module load crayintel/mpi/lapack/3.9.0
module load ksh/ksh93
module unload cray-mpich
module load cray-mpich-ucx/8.1.9
module load ncarg
module load crayintel/mpi/esmf/8.4.0bs15
#module load crayintel/mpi/esmf/7.1.0r
To Reproduce
See above
Expected behavior
Use ucx instead of ofi and finish faster.
System:
Narwhal
Additional context
n/a