Add non-power-of-arity tests for hierarchical all_reduce and all_gather#7198
Conversation
Exercises site counts that are not clean multiples of the arity,
including configurations where tree recursion produces size-1
subgroups. Covers arity=2 with site counts {3, 5, 6, 7, 9, 10, 11, 15}
and arity=4 with {5, 6, 7, 9, 10, 11, 13, 15}.
These paths were previously uncovered. The hierarchical tree
construction in create_communicator.cpp handles them through the
division_steps and remainder logic, but no existing test verified
the behaviour. Companion coverage to the adaptive flat-fallback
work in TheHPXProject#7193.
|
Can one of the admins verify this patch? |
Up to standards ✅🟢 Issues
|
hkaiser
left a comment
There was a problem hiding this comment.
While you're at it, could you please add similar tests to the other (already existing) hierarchical collectives?
On it. Will extend to broadcast, gather, reduce, scatter with the same pattern. broadcast and scatter already have a single non-power-of-arity regression case; I will fold those into the new test_non_power_of_arity() function so the six hierarchical tests share one shape. Pushing shortly. |
Extended to broadcast, gather, reduce, scatter with the same pattern. broadcast and scatter already had a single non power-of-arity regression case (test_local_use(10, 3)); I folded those into the new test_non_power_of_arity() function so the six hierarchical tests share one shape. Local runs pass at 1 and 2 localities for all four new targets. |
Per review feedback on TheHPXProject#7198, mirror the all_reduce and all_gather coverage to broadcast, gather, reduce, and scatter hierarchical tests. Same site counts and arity combinations for uniformity across the suite. For broadcast_hierarchical and scatter_hierarchical, the existing single-case regression (test_local_use(10, 3)) is subsumed by the new function's broader coverage and removed to keep the six hierarchical test files on a single pattern.
Summary
Adds
test_non_power_of_arity()to the hierarchical all_reduce and all_gather unit tests, exercising site counts that are not clean multiples of the arity.Cases covered
These include configurations where recursion produces degenerate size-1 leaves and where the top-level partition is uneven.
Motivation
The hierarchical tree construction in
create_communicator.cpphandles non-power-of-arity site counts through its division_steps and remainder logic, but no existing test exercised that path. This PR closes that gap. It is companion coverage to the adaptive flat-fallback work in #7193.Verification
Tests were run locally on macOS (clang) at 1 locality and at 2 localities via
hpxrun.py -l 2 -t 2. Both pass with exit 0. The non-power-of-arity block is confirmed by timing output lines such aslocal timing (3/2)andlocal timing (15/4).inspectoverlibs/full/collectives/tests/unit/reports 0 problems across 35 files scanned. No production code is touched.