Reduce memory usage during GWDO stats estimation #1235
Tested on Eris (GNU) with a 24km grid, and on Derecho (GNU) with 15km and 24km grids. The Derecho NVHPC build and run are successful for the 15km grid.
Tested on a 5km grid as well.
mgduda
left a comment
I've done some testing with a 120-km global mesh and all seems good. I'll do further testing with a variable-resolution regional mesh. Meanwhile, I've left a few comments that mostly concern style.
Thanks for the additional review @mgduda! I have addressed your comments in the last two commits.
mgduda
left a comment
Other than an update to a routine name in a comment block (as noted), I think we're in good shape. After addressing the routine name issue, feel free to rework the commit history for this PR so that it has one or a few commits with a clear and distinct purpose.
Additionally -- though it can be put off to a future PR -- it could be worth looking for simple ways of optimizing the get_box and/or get_tile_from_box_point routines.
…on of GWDO statistics
This commit introduces code changes to reduce the memory footprint of the computation of Gravity Wave Drag Orography (GWDO) statistics in the init_atmosphere core. In the previous approach, each MPI rank read in the entirety of the topography and land use tiles, which could prevent fully subscribing all cores in a node. The new approach reads in only one tile at a time, and only when it encounters a pixel whose data is not already available in a linked list. The new algorithm is able to fully subscribe all ranks in a node, leading to better parallel performance.
The get_box subroutine is modified to call get_tile_from_box_point to check if the current pixel in the box is present in the linked list, and if not, it inserts this tile (after reading both topo and land use data) at the head of the list.
This commit also changes box, box_landuse, dxm, nx and ny to be local variables, instead of module variables. This provides better readability, along with the advantage of thread safety.
This commit also removes the extraneous conversion of the cell latitude from radians to degrees and back to radians, prior to the estimation of the zonal box size. The result is numerically more correct code, but it yields marginal differences from the previous approach.
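The last point of the commit message, that dropping a radians-to-degrees-to-radians round trip can change results marginally, follows from ordinary floating-point rounding. The sketch below is only an illustration in Python double precision; the MPAS code is Fortran and may use a different precision and different conversion constants, and the name `roundtrip` and the sample latitudes are assumptions made for this example.

```python
import math

def roundtrip(lat_rad):
    """Convert radians -> degrees -> radians, as the removed code path did."""
    return math.radians(math.degrees(lat_rad))

# Hypothetical sample of latitudes between 0 and pi/2 (not MPAS cell data).
lats = [i * (math.pi / 2) / 1000 for i in range(1, 1000)]

# Each conversion rounds to the nearest representable value, so the round
# trip is not guaranteed to return the input bit-for-bit.
mismatches = [lat for lat in lats if roundtrip(lat) != lat]
max_err = max(abs(roundtrip(lat) - lat) for lat in lats)

print(f"{len(mismatches)} of {len(lats)} sample latitudes change after the round trip")
print(f"largest absolute change: {max_err:.3e} rad")
```

Any change seen here is on the order of the rounding error of the conversions, which is consistent with the "marginal differences" the commit message describes.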
…last matching tile
This commit introduces an optimization for the lookup of tiles in subroutine get_tile_from_box_point by passing the most recent successful tile lookup to the next iteration of the search. These changes substantially improve the single-core performance of the compute_gwd_fields subroutine, and also improve the parallel performance of this computation.
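The idea of passing the most recent successful lookup into the next search can be sketched as follows. This is a minimal Python illustration, not the Fortran in the PR: `Tile`, `find_tile`, `count_checks`, and the tile geometry are hypothetical names and values chosen for the example. Because neighbouring pixels in a box usually fall in the same tile, checking the previous hit first avoids walking the list for most lookups.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Tile:
    """One node of a singly linked list of tiles (hypothetical layout)."""
    x0: int                      # first global pixel column covered
    y0: int                      # first global pixel row covered
    nx: int                      # tile width in pixels
    ny: int                      # tile height in pixels
    next_tile: Optional["Tile"] = None

    def contains(self, ix, iy):
        return (self.x0 <= ix < self.x0 + self.nx
                and self.y0 <= iy < self.y0 + self.ny)

def find_tile(head, ix, iy, hint=None):
    """Return (tile or None, number of tiles checked).

    The hint, if given, is tested before the list is walked from the head.
    """
    checked = 0
    if hint is not None:
        checked += 1
        if hint.contains(ix, iy):
            return hint, checked
    node = head
    while node is not None:
        checked += 1
        if node.contains(ix, iy):
            return node, checked
        node = node.next_tile
    return None, checked

def count_checks(head, pixels, use_hint):
    """Total tile checks needed to look up every pixel in order."""
    total = 0
    hint = None
    for ix, iy in pixels:
        tile, checked = find_tile(head, ix, iy, hint if use_hint else None)
        total += checked
        hint = tile  # remember the most recent successful lookup
    return total

# Four 10x10 tiles side by side; the list order is x0 = 0, 10, 20, 30.
head = None
for x0 in (30, 20, 10, 0):
    head = Tile(x0=x0, y0=0, nx=10, ny=10, next_tile=head)

# Ten neighbouring pixels that all fall in the last tile of the list.
pixels = [(ix, 0) for ix in range(30, 40)]
checks_without_hint = count_checks(head, pixels, use_hint=False)
checks_with_hint = count_checks(head, pixels, use_hint=True)
print(checks_without_hint, checks_with_hint)
```

In this toy setup the hinted search needs one full walk for the first pixel and a single check for each of the rest, while the unhinted search walks the whole list every time.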
650a60c to 7ad8bf6
MPAS Version 8.3.0
This release of MPAS introduces new capabilities and improvements in the MPAS-Atmosphere model and its supporting software infrastructure. Notable changes are listed below.
Initialization:
* Addition of support for 30" BNU soil category dataset. The 30" BNU soil category dataset can be selected by setting the new namelist option config_soilcat_data to 'BNU' in the &data_sources namelist group. Use of this dataset requires a separate static dataset download. (PR #1322)
* Addition of support for 15" MODIS land use dataset. The 15" MODIS land use dataset may be selected by setting the existing namelist option config_landuse_data to 'MODIFIED_IGBP_MODIS_NOAH_15s' in the &data_sources namelist group. Use of this dataset requires a separate static dataset download. (PR #1322)
* Introduction of a new namelist option, config_lu_supersample_factor, to control the super-sampling of land use data, which may now be on either a 30" or a 15" grid, depending on the choice of dataset. The existing namelist option config_30s_supersample_factor now controls the super-sampling for 30" terrain, soil category, and MODIS FPAR monthly vegetation fraction data only. (PR #1322)
* A change in the horizontal interpolation from a four-point bilinear interpolation to a sixteen-point overlapping parabolic interpolation for both initial conditions and lateral boundary conditions. (PR #1303)
* Ability to use ICON soil moisture and soil temperature fields. (PR #1298)
* Addition of an option to skip processing of Noah-MP-only static fields in the init_atmosphere core. Setting the new config_noahmp_static namelist option to false in the &data_sources namelist group prevents the Noah-MP static fields from being processed when config_static_interp = true in the namelist.init_atmosphere file; this also permits existing static files that lack the Noah-MP fields 'soilcomp', 'soilcl1', 'soilcl2', 'soilcl3', and 'soilcl4' to be used by the init_atmosphere_model program. (PR #1239)
* Memory scaling improvements to the gravity wave drag (GWD) static field processing in the init_atmosphere core (when 'config_native_gwd_static = true') to reduce memory usage when multiple MPI ranks are used. In many cases, these changes eliminate the need to undersubscribe computing resources, which was previously required in order to work around lack of memory scaling in the GWD static field processing. (PR #1235)
Physics:
* Update of the RRTMG LW and SW schemes, most notably with the addition of the exponential and exponential_random cloud overlap assumptions. The cloud overlap assumption and decorrelation length are now available as namelist options (config_radt_cld_overlap and config_radt_cld_dcorrlen, respectively). (PR #1296 and PR #1297)
* The incorporation of NOAA's Unified Forecast System (UFS) Unified Gravity Wave Physics (UGWP) suite of physics parameterizations. This physics package is the "NOAA/GSL" orographic gravity wave drag (GWD) suite introduced in WRF Version 4.3 (activated by WRF namelist option 'gwd_opt=3'), but with the addition of a non-stationary GWD parameterization that represents gravity wave sources such as deep convection and frontal instability. The use of the UGWP suite requires additional static field downloads. (PR #1276)
Dynamics:
* Complete port of all routines in the dynamical core to GPUs using OpenACC directives, including routines used by limited-area simulations. Not included in this release, though, is the optimization of data movement between the CPU and GPU memory, and the profiling and optimization of the computational kernels.
* A change in the zero-gradient LBC for w to a constant value of w=0 in the specified zone. For limited-area configurations, the change from a zero-gradient boundary condition for the vertical velocity, w, to a setting of the vertical velocity to zero in the specified region alleviates spurious streamers and instabilities that appeared near the boundaries in regions of strong inflow. (PR #1304)
Infrastructure:
* Implementation of a new capability to automatically generate package logic code, which determines when a package is active. This package logic is generated by the registry at build time through the use of a new XML attribute, active_when, for <package> elements. (PR #1321)
Other:
* Addition of a new Python script for setting up MPAS-Atmosphere run directories. (PR #1326)
* Addition of 3-d 10 cm radar reflectivity (refl10cm) to the 'da_state' stream, useful for radar DA and radar obs comparison purposes. (PR #1323)
A copy of #1228 with branch renamed. This PR attempts to improve the memory footprint during the computation of Gravity Wave Drag Orography statistics in the pre-processing step.
The previous approach relied on each MPI rank reading all of the topography and land use tiles, and hence would run out of memory before we could fully subscribe to all cores in a node. In the new approach, we read in only one tile at a time, and only when we encounter a pixel whose data is not already available in a linked list. The get_box subroutine is modified to call get_tile_from_box_point to check if the current pixel in the box is present in the linked list, and if not, it inserts this tile (after reading both topo and land use data) at the head of the list.
This PR also changes box, box_landuse, dxm, nx and ny to be local variables, instead of module variables. This provides somewhat better readability, along with advantages such as thread safety.
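The on-demand tile reading described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the Fortran in the PR: `TileNode`, `TileCache`, the `get_box` signature, the 4x4 tile size, and the fake reader functions are all hypothetical, and details such as longitude wrap-around and real file I/O are ignored. Note also that `box` and `box_landuse` are returned as local values rather than kept as shared module state.

```python
TILE_SIZE = 4  # hypothetical tile width/height in pixels

class TileNode:
    """Linked-list node holding the topo and land use data of one tile."""
    def __init__(self, tile_x, tile_y, topo, landuse, next_node):
        self.tile_x, self.tile_y = tile_x, tile_y
        self.topo, self.landuse = topo, landuse
        self.next_node = next_node

class TileCache:
    """Reads a tile only when a pixel from it is first requested."""
    def __init__(self, read_topo, read_landuse):
        self.head = None
        self.read_topo = read_topo
        self.read_landuse = read_landuse
        self.reads = 0  # number of tiles read so far

    def get_tile(self, ix, iy):
        tx, ty = ix // TILE_SIZE, iy // TILE_SIZE
        node = self.head
        while node is not None:          # search the tiles already in memory
            if node.tile_x == tx and node.tile_y == ty:
                return node
            node = node.next_node
        # Miss: read both datasets for this tile and insert it at the head.
        self.reads += 1
        self.head = TileNode(tx, ty, self.read_topo(tx, ty),
                             self.read_landuse(tx, ty), self.head)
        return self.head

def get_box(cache, ix0, iy0, nx, ny):
    """Fill local nx-by-ny boxes of topo and land use, pixel by pixel."""
    box = [[0.0] * nx for _ in range(ny)]
    box_landuse = [[0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            ix, iy = ix0 + i, iy0 + j
            tile = cache.get_tile(ix, iy)
            lx, ly = ix % TILE_SIZE, iy % TILE_SIZE
            box[j][i] = tile.topo[ly][lx]
            box_landuse[j][i] = tile.landuse[ly][lx]
    return box, box_landuse

# Fake readers standing in for file I/O (assumed, for demonstration only).
def fake_topo(tx, ty):
    return [[float((tx * TILE_SIZE + lx) + 100 * (ty * TILE_SIZE + ly))
             for lx in range(TILE_SIZE)] for ly in range(TILE_SIZE)]

def fake_landuse(tx, ty):
    return [[tx + 10 * ty for _ in range(TILE_SIZE)] for _ in range(TILE_SIZE)]

cache = TileCache(fake_topo, fake_landuse)
box, box_landuse = get_box(cache, ix0=2, iy0=2, nx=4, ny=4)
reads_after_first = cache.reads      # the box straddles four tiles
get_box(cache, ix0=2, iy0=2, nx=4, ny=4)
reads_after_second = cache.reads     # same tiles, so nothing new is read
print(reads_after_first, reads_after_second)
```

With this structure, memory use per rank grows with the number of tiles that rank actually touches rather than with the size of the whole dataset, which is the effect the PR is after.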