[VL] Provide a configuration option to completely turn off off-heap memory tracking with Spark memory manager#9341
Thanks for opening a pull request! Could you open an issue for this pull request on GitHub Issues? https://github.com/apache/incubator-gluten/issues Then could you also rename commit message and pull request title in the following format?
Run Gluten Clickhouse CI on x86
zhouyuan
left a comment
Thanks. I have one directional question: does this new memory-untracking feature depend on the dynamic off-heap sizing feature?
buildConf("spark.gluten.memory.untracked")
  .internal()
  .doc(
    "When enabled, turn all native memory allocations in Gluten into untracked. Spark " +
Based on the description, should this feature take effect only in the DYNAMIC_OFFHEAP_SIZING_ENABLED case? Or do you intend to introduce it for the static off-heap case as well?
It isn't related to the off-heap sizing feature. The ideal use case is to allow users to set
spark.memory.offHeap.enabled=false
spark.gluten.memory.untracked=true
to bypass allocation tracking from Spark memory manager.
}

if (
  conf.getBoolean(COLUMNAR_MEMORY_UNTRACKED.key, COLUMNAR_MEMORY_UNTRACKED.defaultValue.get)
Looking at the logic, if DYNAMIC_OFFHEAP_SIZING_ENABLED=false and COLUMNAR_MEMORY_UNTRACKED=true, it will also skip the check of the off-heap settings. Is this intended?
Yes. As mentioned in the other comment, users are allowed to set
spark.memory.offHeap.enabled=false
spark.gluten.memory.untracked=true
at the same time.
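The behaviour discussed in this thread can be sketched in plain Scala as follows. This is a minimal illustration, not the code from this PR: the object and helper names (`OffHeapCheckSketch`, `flag`, `mustValidateOffHeap`, `validate`) are hypothetical, a missing key is assumed to mean false, and the rule that spark.memory.offHeap.enabled must otherwise be true is an assumption about what "the check of off-heap settings" covers.

```scala
// Illustrative sketch only; not the code from this PR.
object OffHeapCheckSketch {
  // Configuration keys quoted in this conversation.
  val Untracked = "spark.gluten.memory.untracked"
  val DynamicSizing = "spark.gluten.memory.dynamic.offHeap.sizing.enabled"
  val OffHeapEnabled = "spark.memory.offHeap.enabled"

  // Reads a boolean setting, treating a missing key as false (assumed default).
  private def flag(conf: Map[String, String], key: String): Boolean =
    conf.get(key).exists(_.trim.equalsIgnoreCase("true"))

  // Hypothetical helper: true when the off-heap settings still need validating,
  // i.e. neither the untracked option nor dynamic off-heap sizing is enabled.
  def mustValidateOffHeap(conf: Map[String, String]): Boolean =
    !flag(conf, Untracked) && !flag(conf, DynamicSizing)

  // Hypothetical helper: Left(error) for a rejected configuration, Right(()) otherwise.
  def validate(conf: Map[String, String]): Either[String, Unit] =
    if (mustValidateOffHeap(conf) && !flag(conf, OffHeapEnabled))
      Left(s"$OffHeapEnabled must be true unless $Untracked or $DynamicSizing is set")
    else Right(())
}
```

Under these assumptions, the combination from the reply above (off-heap disabled, untracked enabled) passes validation, while an empty configuration is rejected.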
zhouyuan
left a comment
+1 Works for me
My review notes, in case someone else is also looking:
- Originally, Gluten had two ways of managing memory: 1) static off-heap and 2) dynamic off-heap sizing. In both cases Spark tracks the memory allocations and reports OOM issues.
- This patch adds a new feature to "ignore" the memory allocations, so Spark will not raise an OOM error (but the OS may kill the application because of its large memory usage).
There is still some messy code to sort out for off-heap sizing. I'll have another PR for that. After this series of work I hope we can either continue with or immediately remove the off-heap sizing feature in the future, based on our decision, because this effort makes the code more independent.
…emory tracking with Spark memory manager (apache#9341) (cherry picked from commit b99fff8) Change-Id: If3b9982d8391a97826bf12f3c1d4f8f4d37496c0 Reviewed-on: https://bigdataoss-internal-review.googlesource.com/c/third_party/apache/incubator-gluten/+/115778 Reviewed-by: Revanth Venkat Mikkilineni <revanthvenkat@google.com> Reviewed-by: Preetesh Verma <preeteshverma@google.com> Tested-by: Srinivas S T <srst@google.com>
We noticed that some users unexpectedly rely on dynamic off-heap sizing to emulate a setup in which no memory allocations are tracked by Spark, for testing or PoC purposes. Because the dynamic off-heap sizing feature is itself unreliable (its free on-heap memory calculations are wrong), this patch provides a new option, spark.gluten.memory.untracked, which makes all native allocations completely untracked by Spark when set to true.
Note that the new option is likewise intended only for testing or PoC purposes. Previous usages of spark.gluten.memory.dynamic.offHeap.sizing.enabled can be switched to this new option, because we are fixing the existing issues in the dynamic off-heap sizing feature, and those fixes may cause more OOMs to be reported when that feature is on.