@romerojosh
Collaborator

In cuDecomp, users currently have only two options for memory layout orders for their pencil data buffers:

  1. [X, Y, Z] corresponding to transpose_axis_contiguous[axis] = false
  2. a prescribed cyclically-permuted layout such that the memory is contiguous along the pencil axis when transpose_axis_contiguous[axis] = true (e.g., [Y, Z, X] for the Y-axis pencil).

This makes cuDecomp harder to adopt in codes that have established memory orderings matching neither of these options.

This PR enables users to more flexibly set their own desired pencil buffer memory layouts by pencil axis via a new transpose_mem_order entry in the cudecompGridDescConfig structure. This new entry overrides the setting from transpose_axis_contiguous.

From the updated documentation:
Advanced users who require more flexibility in the memory layout of the pencil buffers can override the layouts available via transpose_axis_contiguous by setting the transpose_mem_order array in the configuration structure. This array enables users to set arbitrary memory layout orders for the pencil buffers by axis. For example, a user can set this structure as follows to have pencil memory in [X, Y, Z] order for the X-pencil and [Z, Y, X] order for the Y- and Z-pencils:

In C++:


    config.transpose_mem_order[0][0] = 0;
    config.transpose_mem_order[0][1] = 1;
    config.transpose_mem_order[0][2] = 2;
    config.transpose_mem_order[1][0] = 2;
    config.transpose_mem_order[1][1] = 1;
    config.transpose_mem_order[1][2] = 0;
    config.transpose_mem_order[2][0] = 2;
    config.transpose_mem_order[2][1] = 1;
    config.transpose_mem_order[2][2] = 0;

In Fortran:

    config%transpose_mem_order(1, 1) = 1
    config%transpose_mem_order(2, 1) = 2
    config%transpose_mem_order(3, 1) = 3
    config%transpose_mem_order(1, 2) = 3
    config%transpose_mem_order(2, 2) = 2
    config%transpose_mem_order(3, 2) = 1
    config%transpose_mem_order(1, 3) = 3
    config%transpose_mem_order(2, 3) = 2
    config%transpose_mem_order(3, 3) = 1

@romerojosh romerojosh merged commit 6af3794 into main Feb 6, 2025
@romerojosh romerojosh deleted the memory_layouts branch February 7, 2025 21:14
@romerojosh romerojosh mentioned this pull request Mar 7, 2025