Measure Step Durations and Cholesky durations with APEX Timers #43
This PR adds macros for synchronizing HPX futures and creating APEX timers to measure the duration of steps in the `predict`, `predict_with_uncertainty`, `predict_with_full_cov`, and `cholesky` functions when `GPRAT_APEX_STEPS=ON`. Additionally, if `GPRAT_APEX_CHOLESKY=ON`, the combined duration of assembly and right-looking Cholesky computation is measured.

The added macros are in `core/include/apex_utils.hpp`, which is included in `core/src/cpu/gp_functions.cpp` and `core/src/gpu/gp_functions.cpp`. The header contains the macros `GPRAT_START_TIMER` and `GPRAT_STOP_TIMER`: the former initializes an APEX timer, and the latter synchronizes HPX futures and samples the APEX timer. When `GPRAT_APEX_STEPS=ON`, the two other macros `GPRAT_START_STEP` and `GPRAT_END_STEP` are not empty: they synchronize between steps of the aforementioned GP functions. These macro calls are quite frequent, so this PR aims to make the addition of APEX timers for this purpose less obtrusive in the GP function implementations. In contrast, only a single timer is tracked if `GPRAT_APEX_CHOLESKY=ON`, so the cholesky implementation uses the always-available macros `GPRAT_START_TIMER` and `GPRAT_STOP_TIMER` only once. See below for why the cholesky function needs such a separate mechanism for measuring its execution duration.

Besides the header and source files, the
`GPRAT_APEX_STEPS` and `GPRAT_APEX_CHOLESKY` definitions have been added to all configuration files with the default setting `OFF`, as they should only be enabled when explicitly replicating scaling measurements. They can be enabled (set to `ON`) with a 4th argument to the `compile_gprat.sh` script, which is either empty (both `OFF`), `steps`, or `cholesky`. The GPRat library is then compiled accordingly, so, for example, `examples/gprat_cpp/` can be configured to show APEX output, where the timers will then be visible.

Important note: The last two commits 9af7a06 and a39ea3e aim at fixing the CI pipeline: they enable `instrumentation=apex` for the CPU installations of HPX with Spack and remove the `+static` setting (otherwise the CI Install step fails). I don't know how much these two settings affect other parts of the GPRat codebase and previous scaling measurements. However, this seemed to fix the pipeline while also enabling the APEX features of this PR for the CPU build of GPRat.

Some other notes:
The option `GPRAT_APEX_CHOLESKY` is needed because the cholesky function returns a large matrix (in addition to performing assembly and right-looking Cholesky computation), which adds significant overhead to the whole function call. However, the matrix is not returned by the more relevant GP predict operations, yet the Cholesky computation (including covariance matrix assembly) is the computationally most interesting part. Thus, `GPRAT_APEX_CHOLESKY` is used to track only that part of the cholesky function.

See the experiment branch of my fork that I created for my bachelor's thesis for the original implementation of using APEX timers and some measurement scripts.
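
For readers who want a feel for the macro pattern described above without opening the header, here is a minimal sketch of compile-time gated timing macros. It is not the content of `core/include/apex_utils.hpp`: all `SKETCH_*` and `sketch_*` names are hypothetical, `std::chrono` stands in for APEX timers, and `std::shared_future` stands in for HPX futures so that the example compiles without either library. It only illustrates the idea of always-available start/stop macros plus per-step macros that expand to nothing unless a flag is defined.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for APEX: accumulates wall-clock seconds per label.
inline std::map<std::string, double>& sketch_timings()
{
    static std::map<std::string, double> timings;
    return timings;
}

// Always-available macros (assumed shape, not the PR's actual definitions):
// start a timer, then later wait on all futures before sampling it, so that
// asynchronous work is included in the measured duration.
#define SKETCH_START_TIMER(timer) \
    const auto timer = std::chrono::steady_clock::now()

#define SKETCH_STOP_TIMER(timer, label, futures)                      \
    do {                                                              \
        for (auto& f : (futures)) { f.wait(); }                       \
        const std::chrono::duration<double> elapsed =                 \
            std::chrono::steady_clock::now() - timer;                 \
        sketch_timings()[label] += elapsed.count();                   \
    } while (false)

// Per-step macros: empty unless SKETCH_APEX_STEPS is defined at compile time.
#ifdef SKETCH_APEX_STEPS
#define SKETCH_START_STEP(timer) SKETCH_START_TIMER(timer)
#define SKETCH_END_STEP(timer, label, futures) SKETCH_STOP_TIMER(timer, label, futures)
#else
#define SKETCH_START_STEP(timer)
#define SKETCH_END_STEP(timer, label, futures)
#endif

// Hypothetical two-step function instrumented the way a GP function might be.
inline double sketch_two_step_sum(const std::vector<double>& data)
{
    SKETCH_START_TIMER(total_timer);

    // Step 1: launch asynchronous work (squares each element).
    SKETCH_START_STEP(assembly_timer);
    std::vector<std::shared_future<double>> futures;
    for (double x : data)
    {
        futures.push_back(std::async(std::launch::async, [x] { return x * x; }).share());
    }
    SKETCH_END_STEP(assembly_timer, "assembly", futures);

    // Step 2: collect the results.
    SKETCH_START_STEP(reduce_timer);
    double sum = 0.0;
    for (auto& f : futures) { sum += f.get(); }
    SKETCH_END_STEP(reduce_timer, "reduce", futures);

    // The overall timer is recorded regardless of the per-step flag.
    SKETCH_STOP_TIMER(total_timer, "total", futures);
    return sum;
}
```

Compiled as-is, only the `"total"` entry is recorded and the step macros add no synchronization. Compiling with `-DSKETCH_APEX_STEPS` additionally records `"assembly"` and `"reduce"`, at the cost of waiting on the futures between steps, which is the same trade-off that motivates keeping the real options `OFF` by default.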