
[VL] Looks like CelebornShuffleReader doesn't spill data #9784

@boneanxs

Backend

VL (Velox)

Bug description

25/05/28 17:12:50 ERROR [Executor task launch worker for task 0.0 in stage 399.0 (TID 76139)] ManagedReservationListener: Error reserving memory from target
org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget$OutOfMemoryException: Not enough spark off-heap execution memory. Acquired: 8.0 MiB, granted: 6.0 MiB. Try tweaking config option spark.memory.offHeap.size to get larger space to run this application (if spark.gluten.memory.dynamic.offHeap.sizing.enabled is not enabled). 
Current config settings: 
	spark.gluten.memory.offHeap.size.in.bytes=5.6 GiB
	spark.gluten.memory.task.offHeap.size.in.bytes=1433.5 MiB
	spark.gluten.memory.conservative.task.offHeap.size.in.bytes=716.8 MiB
	spark.memory.offHeap.enabled=true
	spark.gluten.memory.dynamic.offHeap.sizing.enabled=false
Memory consumer stats: 
	Task.76139:                                                                    Current used bytes:   5.6 GiB, peak bytes:        N/A
	+- Gluten.Tree.5:                                                              Current used bytes:   5.5 GiB, peak bytes:    5.5 GiB
	|  \- Capacity[8.0 EiB].5:                                                     Current used bytes:   5.5 GiB, peak bytes:    5.5 GiB
	|     +- CelebornShuffleReader.5:                                              Current used bytes:   5.5 GiB, peak bytes:    5.5 GiB
	|     |  \- single:                                                            Current used bytes:   5.5 GiB, peak bytes:    5.5 GiB
	|     |     +- root:                                                           Current used bytes:   5.5 GiB, peak bytes:    5.5 GiB
	|     |     |  \- default_leaf:                                                Current used bytes:   5.5 GiB, peak bytes:    5.5 GiB
	|     |     \- gluten::MemoryAllocator:                                        Current used bytes:     0.0 B, peak bytes:      0.0 B
	|     +- ArrowContextInstance.5:                                               Current used bytes:   8.0 MiB, peak bytes:    8.0 MiB
	|     +- ColumnarToRow.0.OverAcquire.0:                                        Current used bytes:     0.0 B, peak bytes:    2.4 MiB
	|     +- ColumnarToRow.0:                                                      Current used bytes:     0.0 B, peak bytes:    8.0 MiB
	|     |  \- single:                                                            Current used bytes:     0.0 B, peak bytes:    8.0 MiB
	|     |     +- root:                                                           Current used bytes:     0.0 B, peak bytes:    7.0 MiB
	|     |     |  \- default_leaf:                                                Current used bytes:     0.0 B, peak bytes:    7.0 MiB
	|     |     \- gluten::MemoryAllocator:                                        Current used bytes:     0.0 B, peak bytes:      0.0 B
	|     \- CelebornShuffleReader.5.OverAcquire.0:                                Current used bytes:     0.0 B, peak bytes: 1288.8 MiB
	\- org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter@4d16a67d: Current used bytes: 136.0 MiB, peak bytes:        N/A
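
To make the dump above easier to triage, here is a minimal sketch that parses a few of the consumer stats lines and reports how much of the task's off-heap usage sits under the shuffle reader. It assumes the dump keeps the `name: Current used bytes: <value> <unit>` shape shown above. The names `STATS`, `UNITS`, `LINE`, and `parse_used` are hypothetical helpers, not part of Gluten or Celeborn, and the sample lines are abbreviated copies from the log.

```python
import re

# Abbreviated copies of lines from the "Memory consumer stats" dump above.
STATS = r"""
Task.76139:                  Current used bytes:   5.6 GiB, peak bytes:     N/A
+- Gluten.Tree.5:            Current used bytes:   5.5 GiB, peak bytes: 5.5 GiB
|     +- CelebornShuffleReader.5: Current used bytes: 5.5 GiB, peak bytes: 5.5 GiB
|     +- ArrowContextInstance.5:  Current used bytes: 8.0 MiB, peak bytes: 8.0 MiB
|     +- ColumnarToRow.0:         Current used bytes:   0.0 B, peak bytes: 8.0 MiB
\- org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter@4d16a67d: Current used bytes: 136.0 MiB, peak bytes: N/A
""".strip()

# Binary unit multipliers as printed in the log.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}

# Skip the tree-drawing prefix, then capture the consumer name and its current usage.
LINE = re.compile(
    r"^[\s|+\\-]*(?P<name>.+?):\s+Current used bytes:\s+(?P<num>[\d.]+) (?P<unit>\w+)"
)


def parse_used(text):
    """Map each consumer name to its current usage in bytes."""
    used = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            used[m["name"]] = float(m["num"]) * UNITS[m["unit"]]
    return used


used = parse_used(STATS)
share = used["CelebornShuffleReader.5"] / used["Task.76139"]
print(f"CelebornShuffleReader.5 holds {share:.0%} of the task's off-heap usage")
```

With the figures from this log, almost all of the task's usage is attributed to `CelebornShuffleReader.5`, which is why the 8.0 MiB request from another consumer could not be fully granted.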

Gluten version

Gluten-1.3

Spark version

Spark-3.2.x

Spark configurations

No response

System information

No response

Relevant logs

Metadata

Labels

bug, triage
