Compilation with CUDA support fails due to out-of-memory error #46

@awvwgk

Description

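Is there a way to make the compilation of exchcxx more memory-efficient? At the moment, the build fails on GHA/ADO CI machines when CUDA is enabled: `ptxas` is killed with signal 9 while compiling src/cuda/builtin.cu, shortly after the runner warns that free memory has dropped below 5%.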

[7/11] Building CXX object src/CMakeFiles/exchcxx.dir/builtin_interface.cxx.o
[8/11] Building CXX object src/CMakeFiles/exchcxx.dir/cuda/libxc_device.cxx.o
##[warning]Free memory is lower than 5%; Currently used: 95.15%
##[warning]Free memory is lower than 5%; Currently used: 95.15%
[9/11] Building CUDA object src/CMakeFiles/exchcxx.dir/cuda/builtin.cu.o
FAILED: [code=9] src/CMakeFiles/exchcxx.dir/cuda/builtin.cu.o 
$BUILD_PREFIX/bin/nvcc -forward-unknown-to-host-compiler -DEXCHCXX_HAS_CONFIG_H=1 -DUSING_Libxc -Dexchcxx_EXPORTS -I$SRC_DIR/include -I$SRC_DIR/_build/include -I$SRC_DIR/src -isystem $PREFIX/include -isystem $BUILD_PREFIX/targets/x86_64-linux/include -O3 -DNDEBUG -std=c++17 -arch=all -Xcompiler=-fPIC -Xcudafe --diag_suppress=partial_override -Xptxas -v -MD -MT src/CMakeFiles/exchcxx.dir/cuda/builtin.cu.o -MF src/CMakeFiles/exchcxx.dir/cuda/builtin.cu.o.d -x cu -c $SRC_DIR/src/cuda/builtin.cu -o src/CMakeFiles/exchcxx.dir/cuda/builtin.cu.o
$SRC_DIR/include/exchcxx/impl/builtin/kernels/deorbitalized.hpp(119): warning #177-D: variable "vtau_k" was declared but never referenced
      double TAU_A, TAU_B, vrho_a_k, vrho_b_k, vsigma_aa_k, vsigma_bb_k, vlapl_a_k, vlapl_b_k, vtau_k, dummy;
                                                                                               ^
          detected during:
            instantiation of "void ExchCXX::detail::device_eval_exc_vxc_helper_polar_kernel<KernelType>(int, ExchCXX::const_host_buffer_type, ExchCXX::const_host_buffer_type, ExchCXX::const_host_buffer_type, ExchCXX::const_host_buffer_type, ExchCXX::host_buffer_type, ExchCXX::host_buffer_type, ExchCXX::host_buffer_type, ExchCXX::host_buffer_type, ExchCXX::host_buffer_type) [with KernelType=ExchCXX::BuiltinSCANL_C]" at line 1595 of $SRC_DIR/src/cuda/builtin.cu
            instantiation of "void ExchCXX::detail::device_eval_exc_vxc_helper_polar<KernelType>(int, ExchCXX::const_device_buffer_type, ExchCXX::const_device_buffer_type, ExchCXX::const_device_buffer_type, ExchCXX::const_device_buffer_type, ExchCXX::device_buffer_type, ExchCXX::device_buffer_type, ExchCXX::device_buffer_type, ExchCXX::device_buffer_type, ExchCXX::device_buffer_type, cudaStream_t) [with KernelType=ExchCXX::BuiltinSCANL_C]" at line 1039 of $SRC_DIR/include/exchcxx/impl/builtin/kernel_type.hpp

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

nvcc error   : 'ptxas' died due to signal 9 (Kill signal)

See Pipelines - Run 20260102.9 logs

Runner info
Agent name: 'Azure Pipelines 33'
Agent machine name: 'runnervm4y8n2'
Current agent version: '4.264.2'
Runner Image Provisioner
Hosted Compute Agent
Version: 20251211.462
Commit: 6cbad8c2bb55d58165063d031ccabf57e2d2db61
Build Date: 2025-12-11T16:28:49Z
Worker ID: {b5e20266-47c6-4bcd-b90b-d2b3cced9b97}
Operating System
Ubuntu
24.04.3
LTS
Runner Image
Image: ubuntu-24.04
Version: 20251215.174.1
Included Software: https://github.com/actions/runner-images/blob/ubuntu24/20251215.174/images/ubuntu/Ubuntu2404-Readme.md
Image Release: https://github.com/actions/runner-images/releases/tag/ubuntu24%2F20251215.174
Current image version: '20251215.174.1'
Agent running as: 'vsts'
Conda info
     active environment : base
    active env location : /opt/conda
            shell level : 1
       user config file : /home/conda/.condarc
 populated config files : /opt/conda/.condarc
                          /home/conda/.condarc
          conda version : 25.11.1
    conda-build version : 25.11.1
         python version : 3.12.12.final.0
                 solver : libmamba (default)
       virtual packages : __archspec=1=zen2
                          __conda=25.11.1=0
                          __cuda=12.6=0
                          __glibc=2.34=0
                          __linux=6.11.0=0
                          __unix=0=0
       base environment : /opt/conda  (writable)
      conda av data dir : /opt/conda/etc/conda
  conda av metadata url : None
           channel URLs : https://conda.anaconda.org/conda-forge/linux-64
                          https://conda.anaconda.org/conda-forge/noarch
          package cache : /home/conda/feedstock_root/build_artifacts/pkg_cache
                          /opt/conda/pkgs
       envs directories : /opt/conda/envs
                          /home/conda/.conda/envs
               platform : linux-64
             user-agent : conda/25.11.1 requests/2.32.5 CPython/3.12.12 Linux/6.11.0-1018-azure almalinux/9.7 glibc/2.34 solver/libmamba conda-libmamba-solver/25.11.0 libmambapy/2.4.0
                UID:GID : 1001:1001
             netrc file : None
           offline mode : False
