@babusid babusid commented Jul 27, 2023

Alltoallv benchmark

This pull request adds a new AllToAllV benchmark. It allows testing a wider range of communication patterns found in real-world workloads, and lets end users create custom benchmarks by adjusting test parameters. It can also simulate communication patterns that currently have dedicated tests, without having to compile each of those tests.

The benchmark requires a parameterization CSV file containing a square N×N matrix, where N is the number of ranks. Each entry must be a fraction between 0 and 1 (inclusive). Entry (I, J) determines the amount of data sent from rank I to rank J: the fraction multiplied by the data size specified for the test. For example, if the benchmark is run at 256M and an entry has a value of 0.5, the sending rank will send 128M to the receiving rank.
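To make the matrix semantics concrete, here is a minimal Python sketch of how per-peer send sizes could be derived from such a CSV. It is not the PR's implementation: the helper names (`load_fraction_matrix`, `send_sizes`), the example matrix, the reading of "256M" as 256 MiB in bytes, and truncation to whole bytes are all assumptions made for illustration.

```python
import csv
import io


def load_fraction_matrix(csv_text):
    """Parse an N x N matrix of fractions in [0, 1] from CSV text (hypothetical helper)."""
    rows = [
        [float(cell) for cell in row]
        for row in csv.reader(io.StringIO(csv_text))
        if row  # skip blank lines
    ]
    n = len(rows)
    for i, row in enumerate(rows):
        if len(row) != n:
            raise ValueError(f"row {i} has {len(row)} entries, expected {n}")
        for j, frac in enumerate(row):
            if not 0.0 <= frac <= 1.0:
                raise ValueError(f"entry ({i}, {j}) = {frac} is outside [0, 1]")
    return rows


def send_sizes(matrix, test_size_bytes):
    """Entry (i, j) becomes the number of bytes rank i sends to rank j.

    Assumption: sizes are truncated to whole bytes.
    """
    return [[int(frac * test_size_bytes) for frac in row] for row in matrix]


# Illustrative 3-rank parameter file (not taken from the PR).
EXAMPLE_CSV = """0,0.5,1
0.25,0,0.5
1,0.75,0
"""

matrix = load_fraction_matrix(EXAMPLE_CSV)
sizes = send_sizes(matrix, 256 * 1024 * 1024)  # assumption: "256M" means 256 MiB

# What rank j receives from each peer is column j of the send matrix.
recv_for_rank_1 = [row[1] for row in sizes]
```

In this sketch, row I lists what rank I sends to each peer and column J lists what rank J receives, so a single matrix describes both sides of every exchange.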

This PR also adds the ability to parameterize any test with a setup file via the -s flag. The AllToAllV benchmark implementation serves as an example of how to use it.

cc: @sjeaugey (I don't have the ability to add reviewers)

Sidharth Babu and others added 20 commits June 2, 2023 23:09
Changes:
- Added variable count of elements to send/recv based on sending/receiving peers
- Added new file to Makefile

Notes:
- Current method of uniquely identifying the peers that are sending (thread_local of thread number) may not work correctly. Not sure if that is the appropriate way to determine rank.
- alltoallv2.cu testfile: Parameterizes with alltoallv_param.csv
- run_a2av.sh script:
-- Runs the built test with an arbitrarily named CSV instead of the static name
-- Passes through other arguments to the testfile
Each Rank is guaranteed to send X/nranks data in some distribution.

babusid commented Aug 2, 2023

cc: @AddyLaddy
