Skip to content

[VL] Move pre-configuration code of dynamic off-heap sizing to its own place#9336

Merged
zhztheplayer merged 3 commits intoapache:mainfrom
zhztheplayer:wip-dynamic-sizing
Apr 17, 2025
Merged

[VL] Move pre-configuration code of dynamic off-heap sizing to its own place#9336
zhztheplayer merged 3 commits intoapache:mainfrom
zhztheplayer:wip-dynamic-sizing

Conversation

@zhztheplayer
Copy link
Copy Markdown
Member

@zhztheplayer zhztheplayer commented Apr 15, 2025

Dynamic off-heap sizing was introduced in #5439. The PR does a few of improvements against the implementation.

  1. Move pre-configuration code from GlutenConfig.scala to DynamicOffHeapSizingMemoryTarget.
  2. Avoid chaining DynamicOffHeapSizingMemoryTarget with Spark consumer since it basically never triggers spills because of its design. The target will manage OOMs by itself.
  3. Do not set off-heap size automatically when dynamic off-heap sizing is enabled. All the memory allocation is from a counter that indicates the remaining unused size from JVM on-heap memory cap.
  4. Other minor improvements.

@github-actions github-actions bot added CORE works for Gluten Core VELOX labels Apr 15, 2025
@apache apache deleted a comment from github-actions bot Apr 15, 2025
@apache apache deleted a comment from github-actions bot Apr 15, 2025
@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

@apache apache deleted a comment from github-actions bot Apr 17, 2025
@zhztheplayer zhztheplayer changed the title [VL] Move pre-configuration code of dynamic off-heap sizing to its own place, and minor refactors against the feature [VL] Move pre-configuration code of dynamic off-heap sizing to its own place Apr 17, 2025
@zhztheplayer zhztheplayer marked this pull request as ready for review April 17, 2025 09:23
@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

Copy link
Copy Markdown
Member

@zhouyuan zhouyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the logic on dynamic off-heap sizing is cleaner.


val DYNAMIC_OFFHEAP_SIZING_ENABLED =
buildConf("spark.gluten.memory.dynamic.offHeap.sizing.enabled")
buildStaticConf("spark.gluten.memory.dynamic.offHeap.sizing.enabled")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a behavior change by changing this to static config?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, it's just an improvement for configuration definitions.

// Pessimistic off-heap sizes, with the assumption that all non-borrowable storage memory
// determined by spark.memory.storageFraction was used.
val fraction = 1.0d - conf.getDouble("spark.memory.storageFraction", 0.5d)
val conservativeOffHeapPerTask = (offHeapSize * fraction).toLong / taskSlots
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like changed the scope: conservativeOffHeapPerTask is effective on DYNAMIC_OFFHEAP_SIZING_ENABLED=false on previous code

Copy link
Copy Markdown
Member Author

@zhztheplayer zhztheplayer Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have to deal with off-heap size in the future anymore since we no longer rely on off-heap memory manager when the feature is on.

We are just recovering the old code here before off-heap sizing was landed, so it's totally safe: https://github.com/apache/incubator-gluten/blame/022c208564dbb6dc59b7d64fbc4ad002c166e012/gluten-core/src/main/scala/org/apache/gluten/GlutenPlugin.scala

@zhouyuan
Copy link
Copy Markdown
Member

Cc: @zhli1142015 to be aware of this

@zhztheplayer zhztheplayer merged commit 083ecfb into apache:main Apr 17, 2025
48 checks passed
warrenzhu25 pushed a commit to warrenzhu25/gluten that referenced this pull request Jan 10, 2026
…n place (apache#9336)

(cherry picked from commit 083ecfb)
Change-Id: I1f03d59860a2a966cf4197de41ba6ce3dfd45e60
Reviewed-on: https://bigdataoss-internal-review.googlesource.com/c/third_party/apache/incubator-gluten/+/115779
Reviewed-by: Revanth Venkat Mikkilineni <revanthvenkat@google.com>
Tested-by: Srinivas S T <srst@google.com>
Reviewed-by: Preetesh Verma <preeteshverma@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CORE works for Gluten Core

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants