Dynamically sizing off-heap memory #5438

@supermem613

Description

When using Gluten with Velox and Spark, we currently specify the off-heap memory size and adjust the on-heap memory accordingly. In practice, memory set aside for on-heap cannot be used for off-heap and vice versa. This can leave the machine's memory underused, because a workload typically does most of its processing either on-heap or off-heap, but rarely uses both heavily at the same time.

This is particularly painful when execution falls back to "vanilla" Spark.

For example, on a 64GB machine where we want to use 56GB of memory for Spark, we would set on-heap memory (via the spark.executor.memory setting) to, say, 14GB and off-heap memory (via the spark.memory.offHeap.size setting) to 42GB. In this case, if execution falls back to Spark, we are constrained by the 14GB of on-heap memory. If we don't fall back, we use up to 42GB off-heap, which can leave much of the remaining 14GB idle even though it could be put to use.

We propose to leverage the existing off-heap allocation tracking in Gluten, paired with the JDK APIs that expose on-heap utilization (Runtime.getRuntime().totalMemory() and Runtime.getRuntime().freeMemory()), to provide unified control over memory utilization. However, it is important to note that this approach does not actively control Java allocations. In practice, it can allow some oversubscription of memory until a native allocation comes along and is rejected.
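To make the idea concrete, here is a minimal sketch of such a combined check. It is an illustration only, not Gluten's actual memory manager: the class name UnifiedMemoryBudget and its methods are hypothetical, and the only real APIs used are Runtime.totalMemory() and Runtime.freeMemory(). Off-heap reservations are counted explicitly, on-heap usage is sampled from the JVM, and a native reservation is rejected when the sum would exceed a single budget.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical sketch: one budget shared by on-heap and off-heap memory.
// Not Gluten's implementation; names are made up for illustration.
final class UnifiedMemoryBudget {
    private final long totalBudgetBytes;
    private final LongSupplier onHeapUsedBytes;
    private final AtomicLong offHeapReservedBytes = new AtomicLong();

    UnifiedMemoryBudget(long totalBudgetBytes, LongSupplier onHeapUsedBytes) {
        this.totalBudgetBytes = totalBudgetBytes;
        this.onHeapUsedBytes = onHeapUsedBytes;
    }

    // Samples on-heap usage from the running JVM. The value includes
    // garbage that has not been collected yet, so it is an upper estimate.
    static UnifiedMemoryBudget forCurrentJvm(long totalBudgetBytes) {
        Runtime rt = Runtime.getRuntime();
        return new UnifiedMemoryBudget(
            totalBudgetBytes, () -> rt.totalMemory() - rt.freeMemory());
    }

    // Only native (off-heap) reservations are gated. Java allocations are
    // not controlled, so the heap can grow past the budget in between calls.
    boolean tryReserveOffHeap(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be non-negative");
        }
        while (true) {
            long reserved = offHeapReservedBytes.get();
            long onHeap = onHeapUsedBytes.getAsLong();
            if (onHeap + reserved + bytes > totalBudgetBytes) {
                return false; // the native allocation is failed
            }
            if (offHeapReservedBytes.compareAndSet(reserved, reserved + bytes)) {
                return true;
            }
        }
    }

    void releaseOffHeap(long bytes) {
        offHeapReservedBytes.addAndGet(-bytes);
    }

    long offHeapReserved() {
        return offHeapReservedBytes.get();
    }
}
```

Because the check runs only when native memory is requested, heap growth between two reservations goes unnoticed until the next native allocation is attempted. That is the oversubscription window described above.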

From a configuration perspective, there will be a new Gluten Boolean configuration to turn on this feature, which in turn obviates any off-heap configuration. This means that the settings for enabling and sizing off-heap memory will no longer be used. Instead, we will continue to configure the executor memory (the on-heap sizing) to use as much memory as possible, as is done today with "vanilla" Spark.

Metadata

Labels: enhancement
