@moellh (Contributor) commented Dec 17, 2025

This PR adds macros that synchronize HPX futures and create APEX timers. With GPRAT_APEX_STEPS=ON, they measure the duration of the individual steps in the predict, predict_with_uncertainty, predict_with_full_cov, and cholesky functions. Additionally, with GPRAT_APEX_CHOLESKY=ON, the combined duration of assembly and right-looking Cholesky computation is measured.

The macros are defined in core/include/apex_utils.hpp, which is included in core/src/cpu/gp_functions.cpp and core/src/gpu/gp_functions.cpp. The header contains GPRAT_START_TIMER, which initializes an APEX timer, and GPRAT_STOP_TIMER, which synchronizes HPX futures and samples the APEX timer. Both are always available. The two other macros, GPRAT_START_STEP and GPRAT_END_STEP, are non-empty only when GPRAT_APEX_STEPS=ON, in which case they synchronize between the steps of the aforementioned GP functions. Because these macro calls are quite frequent, this PR aims to make the addition of APEX timers less obtrusive in the GP function implementations. In contrast, only a single timer is tracked if GPRAT_APEX_CHOLESKY=ON, so the cholesky implementation uses GPRAT_START_TIMER and GPRAT_STOP_TIMER only once. See below for why the cholesky function needs this separate mechanism for measuring its execution duration.
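As an illustration of the macro pattern described above, here is a minimal, self-contained sketch. It is not the content of apex_utils.hpp: the SKETCH_* names, the sketch_samples() store, and the macro parameters are hypothetical, std::future stands in for HPX futures, and a plain map of durations stands in for the APEX sampling call. The only point is to show how the step macros can compile to nothing when the flag is off, while the timer macros stay available.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for sampling an APEX timer: seconds stored per name.
inline std::map<std::string, double> &sketch_samples()
{
    static std::map<std::string, double> samples;
    return samples;
}

// Always available: remember the start time in a local variable.
#define SKETCH_START_TIMER(var) const auto var = std::chrono::steady_clock::now()

// Always available: wait for every future, then record the elapsed time.
#define SKETCH_STOP_TIMER(var, name, futures)                         \
    do {                                                              \
        for (auto &f : (futures)) {                                   \
            f.wait();                                                 \
        }                                                             \
        const std::chrono::duration<double> elapsed =                 \
            std::chrono::steady_clock::now() - (var);                 \
        sketch_samples()[(name)] = elapsed.count();                   \
    } while (false)

// Step macros do real work only when the (hypothetical) flag is defined.
#ifdef SKETCH_APEX_STEPS
#define SKETCH_START_STEP(var) SKETCH_START_TIMER(var)
#define SKETCH_END_STEP(var, name, futures) SKETCH_STOP_TIMER(var, name, futures)
#else
#define SKETCH_START_STEP(var) static_cast<void>(0)
#define SKETCH_END_STEP(var, name, futures) static_cast<void>(0)
#endif

// Toy two-step "predict": each step launches asynchronous work.
inline int sketch_predict()
{
    SKETCH_START_STEP(t_assemble);
    std::vector<std::future<int>> tiles;
    for (int i = 0; i < 4; ++i) {
        tiles.push_back(std::async(std::launch::async, [i] { return i * i; }));
    }
    SKETCH_END_STEP(t_assemble, "predict assemble", tiles);

    SKETCH_START_STEP(t_sum);
    std::vector<std::future<int>> sums;
    sums.push_back(std::async(std::launch::async, [&tiles] {
        int s = 0;
        for (auto &t : tiles) {
            s += t.get();
        }
        return s;
    }));
    SKETCH_END_STEP(t_sum, "predict sum", sums);

    const int total = sums.front().get();
    assert(total == 14);  // 0 + 1 + 4 + 9
    return total;
}
```

Compiled without SKETCH_APEX_STEPS, sketch_predict() runs with no synchronization points between its steps and records nothing. Compiled with -DSKETCH_APEX_STEPS, each step blocks on its futures and stores one duration, which is the trade-off that makes such a flag suitable only for dedicated measurement builds.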

Besides the header and source files, the GPRAT_APEX_STEPS and GPRAT_APEX_CHOLESKY definitions have been added to all configuration files with the default setting OFF, since they should only be enabled when explicitly replicating scaling measurements. They can be set to ON through a fourth argument to the compile_gprat.sh script, which is either empty (both OFF), steps, or cholesky. The GPRat library is then compiled accordingly, so that, for example, examples/gprat_cpp/ can be configured to show the APEX output, where the timers will then be visible.

Important note: The last two commits 9af7a06 and a39ea3e aim at fixing the CI pipeline: They enable instrumentation=apex for the CPU installations of HPX with Spack and remove the +static setting (otherwise the CI Install step fails). I don't know how much these two settings affect other parts of the GPRat codebase and previous scaling measurements. However, this seemed to fix the pipeline while also enabling the APEX features of this PR for the CPU build of GPRat.


Some other notes:

The option GPRAT_APEX_CHOLESKY is needed because the cholesky function, besides performing assembly and right-looking Cholesky computation, returns a large matrix, which adds significant overhead to the whole function call. The more relevant GP predict operations do not return this matrix, yet the (covariance matrix assembly and) Cholesky computation is the computationally most interesting part. GPRAT_APEX_CHOLESKY is therefore used to track only that part of the cholesky function.
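To make the scope of that single timer concrete, the following sketch times only assembly plus factorization and stops before the result matrix is built for the caller. Everything in it is an assumption for illustration: the sketch_* names, the squared-exponential kernel, and the unblocked, sequential right-looking Cholesky are stand-ins, not GPRat's tiled HPX implementation, and std::chrono replaces the APEX timer.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical assembly: squared-exponential kernel plus noise on the diagonal.
inline Matrix sketch_assemble(const std::vector<double> &x, double noise)
{
    const std::size_t n = x.size();
    Matrix K(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            const double d = x[i] - x[j];
            K[i][j] = std::exp(-0.5 * d * d) + (i == j ? noise : 0.0);
        }
    }
    return K;
}

// Unblocked right-looking Cholesky, in place: after the call the lower
// triangle of A holds L with A = L * L^T (the upper triangle is stale).
inline void sketch_cholesky(Matrix &A)
{
    const std::size_t n = A.size();
    for (std::size_t k = 0; k < n; ++k) {
        A[k][k] = std::sqrt(A[k][k]);
        for (std::size_t i = k + 1; i < n; ++i) {
            A[i][k] /= A[k][k];
        }
        // Update the trailing submatrix with the freshly computed column k.
        for (std::size_t j = k + 1; j < n; ++j) {
            for (std::size_t i = j; i < n; ++i) {
                A[i][j] -= A[i][k] * A[j][k];
            }
        }
    }
}

struct SketchResult
{
    Matrix L;             // factor handed back to the caller
    double core_seconds;  // assembly + factorization only
};

inline SketchResult sketch_timed_cholesky(const std::vector<double> &x, double noise)
{
    // Timed region: covariance assembly and factorization.
    const auto start = std::chrono::steady_clock::now();
    Matrix K = sketch_assemble(x, noise);
    sketch_cholesky(K);
    const std::chrono::duration<double> core = std::chrono::steady_clock::now() - start;

    // Untimed region: building the (potentially large) return value.
    const std::size_t n = K.size();
    Matrix L(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            L[i][j] = K[i][j];
        }
    }
    return {std::move(L), core.count()};
}
```

Timing the whole call from the outside would also include the copy in the untimed region, which is the overhead the paragraph above wants to exclude from the measurement.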

See the experiment branch of my fork, which I created for my bachelor's thesis, for the original implementation using APEX timers and some measurement scripts.

@moellh moellh force-pushed the apex-steps branch 2 times, most recently from 2328ebe to f6155d6 on December 17, 2025 17:11
@constracktor constracktor marked this pull request as ready for review December 23, 2025 10:23
@constracktor constracktor merged commit cdc21b0 into SC-SGS:main Dec 23, 2025
2 checks passed
2 participants