Add configurable pool properties to PinnedMemoryResource#851

Merged
rapids-bot[bot] merged 21 commits into rapidsai:main from nirandaperera:enable_props_pinned_pool on Mar 5, 2026

Contributor

@nirandaperera nirandaperera commented Feb 9, 2026

This PR adds a PinnedPoolProperties struct to allow configuration of pinned memory pool behavior via the initial_pool_size and max_pool_size parameters.

Changes

  • New PinnedPoolProperties struct with two configurable fields:

    • initial_pool_size: Pre-allocates pinned memory for improved performance
    • max_pool_size: Limits maximum pool size (0 = unlimited)
  • Updated PinnedMemoryResource constructor to accept optional PinnedPoolProperties parameter with backward-compatible defaults
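
As a rough illustration of the shape described above, the sketch below mirrors the two fields and a constructor with a defaulted properties argument. `PinnedMemoryResourceSketch` and its accessors are hypothetical stand-ins, not the rapidsmpf class; the sketch only stores the values and makes no CUDA calls.

```cpp
#include <cassert>
#include <cstddef>

// Field names and the "0 = unlimited" convention follow the PR description.
struct PinnedPoolProperties {
    std::size_t initial_pool_size = 0;  // bytes to pre-allocate; 0 = start empty
    std::size_t max_pool_size = 0;      // upper bound in bytes; 0 = unlimited
};

// Hypothetical stand-in for the real resource: it only records its arguments.
class PinnedMemoryResourceSketch {
  public:
    // Defaulting `props` keeps call sites that pass only a NUMA node compiling.
    explicit PinnedMemoryResourceSketch(int numa_node, PinnedPoolProperties props = {})
        : numa_node_{numa_node}, props_{props} {
        // An initial size larger than a non-zero maximum would be contradictory.
        assert(props.max_pool_size == 0
               || props.initial_pool_size <= props.max_pool_size);
    }

    int numa_node() const { return numa_node_; }
    PinnedPoolProperties const& properties() const { return props_; }
    bool is_bounded() const { return props_.max_pool_size != 0; }

  private:
    int numa_node_;
    PinnedPoolProperties props_;
};
```

Because every field has a default member initializer, a caller that never mentions the properties gets an empty, unbounded pool, which is what "backward-compatible defaults" amounts to here.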

Initial allocation benchmark

  • Added a benchmark to test the impact of initial_pool_size. On my workstation (RTX A6000), the first allocation is roughly 7x to 8x faster when the pool is given an initial size.
---------------------------------------------------------------------------------------------------------------
Benchmark                                                     Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------------------
BM_PinnedFirstAlloc_InitialPoolSize/1/1/real_time          1301 us         1300 us          538 bytes_per_second=768.881M/s initial_pool_size=1048.58k
BM_PinnedFirstAlloc_InitialPoolSize/1/0/real_time          9518 us         9515 us           70 bytes_per_second=105.067M/s initial_pool_size=0
BM_PinnedFirstAlloc_InitialPoolSize/256/1/real_time       10532 us        10517 us           67 bytes_per_second=23.7374G/s initial_pool_size=268.435M
BM_PinnedFirstAlloc_InitialPoolSize/256/0/real_time       85740 us        85706 us            8 bytes_per_second=2.91578G/s initial_pool_size=0
BM_PinnedFirstAlloc_InitialPoolSize/1024/1/real_time      45428 us        45366 us           16 bytes_per_second=22.0129G/s initial_pool_size=1073.74M
BM_PinnedFirstAlloc_InitialPoolSize/1024/0/real_time     302111 us       301913 us            2 bytes_per_second=3.31004G/s initial_pool_size=0
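
To make the comparison in the table explicit, the short sketch below divides each "no initial pool" time by the matching "initial pool" time. The struct and function names are illustrative only, and the first benchmark argument is read as a size in MiB, which is an assumption based on the reported `initial_pool_size` counter. The resulting ratios are about 7.3x, 8.1x, and 6.7x.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One pair of rows from the table: wall time in microseconds with the pool
// pre-sized versus starting empty.
struct BenchPair {
    std::size_t size_mib;       // first benchmark argument (assumed to be MiB)
    double with_initial_us;     // rows ending in /1/real_time
    double without_initial_us;  // rows ending in /0/real_time
};

constexpr std::array<BenchPair, 3> kPairs{{
    {1, 1301.0, 9518.0},
    {256, 10532.0, 85740.0},
    {1024, 45428.0, 302111.0},
}};

// Speedup = time without an initial pool / time with one.
inline double speedup(BenchPair const& p) {
    assert(p.with_initial_us > 0.0);
    return p.without_initial_us / p.with_initial_us;
}
```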

@nirandaperera nirandaperera requested a review from a team as a code owner February 9, 2026 15:48
@nirandaperera nirandaperera added the improvement (Improves an existing functionality) and non-breaking (Introduces a non-breaking change) labels Feb 9, 2026
*/
struct PinnedPoolProperties {
std::size_t initial_pool_size = 0; ///< initial size of the pool. Initial size is
///< important for pinned memory performance.
Member

Contributor Author

I think priming pools (i.e. making some allocations up front and then deallocating them) has little effect on device pools. But for pinned memory pools, the initial allocation and warm-up are important. I extended the comment to include this.

Contributor

What does "important for pinned memory performance" mean? How is it important?

Comment on lines 37 to 39
// Before <https://github.com/NVIDIA/cccl/pull/6718>, the default
// `release_threshold` was 0, which defeats the purpose of having a pool. We
// now set it so the pool never releases unused pinned memory.
Member

This comment seems outdated now, no?

// `release_threshold` was 0, which defeats the purpose of having a pool. We
// now set it so the pool never releases unused pinned memory.
.release_threshold = std::numeric_limits<size_t>::max(),
.release_threshold = props.max_pool_size > 0 ? props.max_pool_size
Member

max_pool_size seems to imply that the pool cannot go beyond that limit. However, AFAIU, release_threshold is something different, meaning it will start releasing unused memory back to the driver once that limit is reached. Can you clarify what max_pool_size really means, and if necessary rename it to release_threshold or something more accurate?

Contributor Author

Thinking about this again, I feel like my change is wrong. I reverted this. Thank you for catching this.
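
The distinction raised in this thread can be sketched with a toy bookkeeping model. This is not CUDA code and `ToyPool` is a hypothetical name; it only assumes the documented meaning of a release threshold: the amount of reserved memory a pool holds on to before it tries to return unused memory at a synchronization point. In the model, the threshold never refuses growth, which is why it cannot stand in for a maximum pool size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Toy model: tracks bytes reserved from the OS and bytes currently handed out.
class ToyPool {
  public:
    explicit ToyPool(std::size_t release_threshold)
        : release_threshold_{release_threshold} {}

    // Growth is never refused here: the threshold is not a cap.
    void allocate(std::size_t bytes) {
        used_ += bytes;
        reserved_ = std::max(reserved_, used_);
    }

    // Freed bytes stay reserved (cached) in the pool for reuse.
    void deallocate(std::size_t bytes) {
        assert(bytes <= used_);
        used_ -= bytes;
    }

    // At a synchronization point, unused memory above the threshold is returned.
    void synchronize() {
        std::size_t const keep = std::max(used_, release_threshold_);
        reserved_ = std::min(reserved_, keep);
    }

    std::size_t reserved() const { return reserved_; }
    std::size_t used() const { return used_; }

  private:
    std::size_t release_threshold_;
    std::size_t reserved_ = 0;
    std::size_t used_ = 0;
};
```

With a threshold of 100 bytes, allocating 400 bytes succeeds and the pool reserves 400; only after the bytes are freed and `synchronize()` runs does the reservation drop back to 100.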

Comment on lines +53 to +59
// It was observed that priming async pools have little effect for performance.
// See <https://github.com/rapidsai/rmm/issues/1931>.
.initial_pool_size = 0,
.initial_pool_size = props.initial_pool_size,
// Before <https://github.com/NVIDIA/cccl/pull/6718>, the default
// `release_threshold` was 0, which defeats the purpose of having a pool. We
// now set it so the pool never releases unused pinned memory.
.release_threshold = std::numeric_limits<size_t>::max(),
.release_threshold = props.max_pool_size > 0 ? props.max_pool_size
Member

All comments above are applicable here.

Comment thread cpp/tests/test_host_buffer.cpp Outdated
Comment on lines +214 to +229
// Create a PinnedMemoryResource with max pool size of 1MiB
auto pinned_mr = std::make_shared<rapidsmpf::PinnedMemoryResource>(
    rapidsmpf::get_current_numa_node(),
    rapidsmpf::PinnedPoolProperties{.initial_pool_size = 0, .max_pool_size = 1_MiB}
);
auto stream = cudf::get_default_stream();

void* ptr = pinned_mr->allocate(stream, 512_KiB);
EXPECT_NE(nullptr, ptr);
pinned_mr->deallocate(stream, ptr, 512_KiB);

// NOTE: currently cuda driver rounds up max size to 32MB. So we need to allocate 32MB
// + 1 byte.
EXPECT_THROW(
    {
        void* ptr2 = pinned_mr->allocate(stream, 32_MiB + 1);
Contributor

Can we inspect the max size we get so that this test is robust?

Comment on lines +29 to +30
#if CCCL_MAJOR_VERSION > 3 || (CCCL_MAJOR_VERSION == 3 && CCCL_MINOR_VERSION >= 2)
cuda::memory_pool_properties get_memory_pool_properties() {
cuda::memory_pool_properties get_memory_pool_properties(
Contributor

I note that we are now on CCCL 3.2.1, so we should be able to remove all of this macro-conditional stuff.

It's also no longer gated behind experimental, so I think we can avoid needing to turn on experimental mode in cccl (and can just get the version from RMM).

And, the headers now no longer require nvcc, so we can remove the pimpl idiom.

Before doing this refactoring, can we migrate to the CCCL >= 3.2 version of the memory pool resource? It's plausible that we can then delete huge swathes of this code anyway.

Contributor Author

Yes! Let me send a separate PR for it.

Contributor Author

@wence- #856 I opened a new PR for this

@wence- wence- mentioned this pull request Feb 12, 2026
@nirandaperera nirandaperera requested a review from wence- February 12, 2026 21:24
Comment on lines +208 to +229
/// Discover the actual pool size the driver creates when a small max is requested.
/// Creates a pool with \p requested_max_pool_size (e.g. 1 MiB), then uses recursive
/// doubling of allocation size until allocation fails; returns the last successful size.
std::size_t discover_pinned_pool_actual_size(
    rmm::cuda_stream_view stream, std::size_t requested_max_pool_size = 1_MiB
) {
    rapidsmpf::PinnedMemoryResource pinned_mr{
        rapidsmpf::get_current_numa_node(),
        rapidsmpf::PinnedPoolProperties{.max_pool_size = requested_max_pool_size}
    };
    std::size_t try_size = requested_max_pool_size;
    while (true) {
        try {
            void* ptr = pinned_mr.allocate(stream, try_size);
            pinned_mr.deallocate(stream, ptr, try_size);
            try_size *= 2;
        } catch (cuda::cuda_error const&) {
            break;
        }
    }
    return std::max(try_size / 2, requested_max_pool_size);
}
Contributor

This should probably do bisection search, in case the actual size is not a power of two.

Contributor Author

Done
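
The final test code is not shown in this thread, so the following is only a sketch of what a doubling-then-bisection search could look like. `find_max_allocatable` and the `can_allocate` probe are hypothetical names; in the test helper the probe would presumably wrap an allocate/deallocate pair in a try/catch. The sketch assumes success is monotonic: every size up to the real limit succeeds and every larger size fails.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Returns the largest size in [start, cap] for which `can_allocate` succeeds,
// or 0 if even `start` fails.
std::size_t find_max_allocatable(
    std::function<bool(std::size_t)> const& can_allocate,
    std::size_t start,
    std::size_t cap
) {
    assert(start > 0 && start <= cap);
    if (!can_allocate(start)) {
        return 0;
    }
    std::size_t lo = start;  // largest size known to succeed
    std::size_t hi = start;  // becomes the smallest size known to fail

    // Phase 1: double until a probe fails, clamping at `cap` to avoid overflow.
    while (true) {
        if (lo == cap) {
            return cap;  // everything up to the cap succeeded
        }
        hi = (lo > cap / 2) ? cap : lo * 2;
        if (!can_allocate(hi)) {
            break;
        }
        lo = hi;
    }

    // Phase 2: bisect. Invariant: `lo` succeeds and `hi` fails.
    while (hi - lo > 1) {
        std::size_t const mid = lo + (hi - lo) / 2;
        if (can_allocate(mid)) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return lo;
}
```

Unlike pure doubling, this returns the exact limit even when it is not a power of two, at the cost of roughly log2(limit) extra probes.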

Comment thread cpp/benchmarks/bench_memory_resources.cpp Outdated
Signed-off-by: niranda perera <niranda.perera@gmail.com>
Member

@madsbk madsbk left a comment

Update PinnedMemoryResource::from_options with a pinned_memory_initial_pool_size option (also update README.md).

std::size_t initial_pool_size = 0;

/// @brief Maximum size of the pool. 0 means no limit.
std::size_t max_pool_size = 0;
Member

Use std::optional instead of 0

Contributor Author

done
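
A minimal sketch of what the suggested change might look like, assuming C++17. The struct name `PinnedPoolPropertiesSketch` and the helper `fits_within_limit` are hypothetical; the point is only that "no limit" becomes the absence of a value instead of a magic `0`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// "No limit" is expressed as an empty optional rather than the sentinel 0.
struct PinnedPoolPropertiesSketch {
    std::size_t initial_pool_size = 0;
    std::optional<std::size_t> max_pool_size = std::nullopt;
};

// Illustrative check only: a real pool would enforce its limit elsewhere.
inline bool fits_within_limit(
    PinnedPoolPropertiesSketch const& props, std::size_t bytes
) {
    // An initial size above an explicit maximum would be contradictory.
    assert(!props.max_pool_size || props.initial_pool_size <= *props.max_pool_size);
    return !props.max_pool_size.has_value() || bytes <= *props.max_pool_size;
}
```

One consequence of this design is that an explicit maximum of `0` is now distinguishable from "unset", so it can mean what it says.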

Comment thread cpp/include/rapidsmpf/memory/pinned_memory_resource.hpp
Signed-off-by: niranda perera <niranda.perera@gmail.com>
Member

madsbk commented Feb 19, 2026

@nirandaperera, we should introduce a config option pinned_memory_initial_pool_size in PinnedMemoryResource::from_options.

@nirandaperera nirandaperera requested a review from madsbk February 20, 2026 17:46
Contributor Author

@nirandaperera, we should introduce a config option pinned_memory_initial_pool_size in PinnedMemoryResource::from_options.

@madsbk done

Comment on lines +86 to +91
.max_pool_size = options.get<std::optional<size_t>>(
"pinned_max_pool_size", [](auto const& s) {
return s.empty() ? std::nullopt
: std::optional<size_t>(parse_string<size_t>(s));
}
)
Member

Use parse_optional() to handle the optional case before passing the value to parse_string<size_t>, like we do here:

if (auto val = parse_optional(s); val.has_value()) {
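
The signature of rapidsmpf's `parse_optional` is not shown in this thread, so the sketch below does not use it. Instead it illustrates the general pattern being requested with a hypothetical helper, `parse_optional_size`: decide first whether the option is unset, and only then parse the number. It treats only the empty string as unset and accepts plain decimal byte counts.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper, not the rapidsmpf API: an empty option string maps to
// "unset"; anything else must be a complete decimal number.
inline std::optional<std::size_t> parse_optional_size(std::string const& s) {
    if (s.empty()) {
        return std::nullopt;
    }
    std::size_t consumed = 0;
    unsigned long long const value = std::stoull(s, &consumed);
    // std::stoull throws rather than returning with nothing consumed.
    assert(consumed > 0);
    if (consumed != s.size()) {
        throw std::invalid_argument("trailing characters in size option: " + s);
    }
    return static_cast<std::size_t>(value);
}
```

Separating the "is it set?" step from the numeric parse keeps the lambda passed to `options.get` from mixing the two concerns in a single ternary.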

[](auto const& s) { return parse_string<size_t>(s.empty() ? "0" : s); }
),
.max_pool_size = options.get<std::optional<size_t>>(
"pinned_max_pool_size", [](auto const& s) {
Member

Document the pinned_max_pool_size option:

if (pinned_memory) {
PinnedPoolProperties pool_properties{
.initial_pool_size = options.get<size_t>(
"pinned_initial_pool_size",
Member

Document the pinned_initial_pool_size option:

Comment thread cpp/tests/test_host_buffer.cpp Outdated
wence- previously requested changes Feb 24, 2026
Comment thread cpp/include/rapidsmpf/memory/pinned_memory_resource.hpp
Comment thread cpp/include/rapidsmpf/memory/pinned_memory_resource.hpp
Co-authored-by: Mads R. B. Kristensen <madsbk@gmail.com>
@nirandaperera nirandaperera requested review from madsbk and wence- March 2, 2026 20:04
Contributor Author

@madsbk @wence- Let's get this in as well.

Member

@madsbk madsbk left a comment

LGTM

Comment thread docs/source/configuration.md Outdated
Comment on lines +149 to +162
- **`pinned_initial_pool_size`**
  - **Environment Variable**: `RAPIDSMPF_PINNED_INITIAL_POOL_SIZE`
  - **Default**: `0`
  - **Description**: Initial size (in bytes) of the pinned host memory pool when
    `pinned_memory` is enabled. A value of `0` means the pool starts empty and grows
    on demand. Accepts byte counts (e.g. `"1GiB"`, `"512MiB"`).

- **`pinned_max_pool_size`**
  - **Environment Variable**: `RAPIDSMPF_PINNED_MAX_POOL_SIZE`
  - **Default**: `"disabled"`
  - **Description**: Maximum size (in bytes) of the pinned host memory pool when
    `pinned_memory` is enabled. When unset or empty, the pool is allowed to grow
    without an upper bound. Accepts byte counts (e.g. `"4GiB"`, `"2048MiB"`).
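
To show how strings such as `"512MiB"` or `"4GiB"` map to byte counts, here is a small parser sketch. `parse_byte_count` is a hypothetical function, not the parser rapidsmpf uses; it handles only non-negative integers with an optional `B`, `KiB`, `MiB`, or `GiB` suffix, does no overflow checking, and assumes a 64-bit `std::size_t`.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical parser: "<digits>[B|KiB|MiB|GiB]" -> number of bytes.
inline std::size_t parse_byte_count(std::string const& text) {
    std::size_t pos = 0;
    std::size_t value = 0;
    // Accumulate the leading decimal digits.
    while (pos < text.size() && std::isdigit(static_cast<unsigned char>(text[pos]))) {
        value = value * 10 + static_cast<std::size_t>(text[pos] - '0');
        ++pos;
    }
    if (pos == 0) {
        throw std::invalid_argument("expected a leading integer: " + text);
    }
    // Map the remaining suffix to a power-of-two shift.
    std::string const suffix = text.substr(pos);
    std::size_t shift = 0;
    if (suffix.empty() || suffix == "B") {
        shift = 0;
    } else if (suffix == "KiB") {
        shift = 10;
    } else if (suffix == "MiB") {
        shift = 20;
    } else if (suffix == "GiB") {
        shift = 30;
    } else {
        throw std::invalid_argument("unknown suffix: " + suffix);
    }
    assert(shift < sizeof(std::size_t) * 8);
    return value << shift;
}
```

Under these assumptions `"2048MiB"` and `"2GiB"` parse to the same value.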

Member

Move this up to the pinned_memory section.

Contributor Author

/merge

@nirandaperera nirandaperera dismissed wence-’s stale review March 5, 2026 23:31

@wence- I will do the requested changes in a follow-up PR

@rapids-bot rapids-bot bot merged commit 42a0304 into rapidsai:main Mar 5, 2026
66 checks passed
Labels

improvement (Improves an existing functionality), non-breaking (Introduces a non-breaking change)

4 participants