Is your feature request related to a problem or challenge?
When asking for more memory via the memory reservation, the error returned from the underlying memory pool focuses on that specific request. As a result, when debugging a resource exhausted error, we get an error message that looks something like:
Failed to allocate additional 795696944 bytes for RepartitionExec[1] with 3182787776 bytes already allocated - maximum available is 63296515
This error describes which incremental request failed to get more memory, not which consumer is using the most memory. As a result, additional dev time has to be spent (a) at best, tracking down the actual high memory consumer and (b) at worst, chasing the wrong memory consumer.
Describe the solution you'd like
One possible solution is to have the error message return the top K memory consumers.
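To make the idea concrete, here is a minimal sketch of a pool that tracks reserved bytes per consumer and appends the largest consumers to the failure message. Everything in it is hypothetical and for illustration only: `TrackingPool`, `try_grow`, `top_k`, and the per-consumer map are not the existing memory pool API, and the "already allocated" figure is assumed to be the requesting consumer's own reservation.

```rust
use std::collections::HashMap;

/// Hypothetical pool that records how many bytes each named consumer holds.
struct TrackingPool {
    limit: usize,
    used: usize,
    per_consumer: HashMap<String, usize>,
}

impl TrackingPool {
    fn new(limit: usize) -> Self {
        Self { limit, used: 0, per_consumer: HashMap::new() }
    }

    /// The `k` largest consumers by reserved bytes, descending; ties broken by name.
    fn top_k(&self, k: usize) -> Vec<(String, usize)> {
        let mut all: Vec<(String, usize)> = self
            .per_consumer
            .iter()
            .map(|(name, bytes)| (name.clone(), *bytes))
            .collect();
        all.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        all.truncate(k);
        all
    }

    /// Reserve `additional` bytes for `consumer`, or fail with a message
    /// that also lists the top `k` consumers currently holding memory.
    fn try_grow(&mut self, consumer: &str, additional: usize, k: usize) -> Result<(), String> {
        let available = self.limit - self.used;
        if additional > available {
            let top = self
                .top_k(k)
                .iter()
                .map(|(name, bytes)| format!("{name} ({bytes} bytes)"))
                .collect::<Vec<_>>()
                .join(", ");
            // Assumption: "already allocated" means this consumer's own reservation.
            let already = self.per_consumer.get(consumer).copied().unwrap_or(0);
            return Err(format!(
                "Failed to allocate additional {additional} bytes for {consumer} \
                 with {already} bytes already allocated - maximum available is {available}. \
                 Top {k} memory consumers: {top}"
            ));
        }
        self.used += additional;
        *self.per_consumer.entry(consumer.to_string()).or_insert(0) += additional;
        Ok(())
    }
}
```

With a sketch like this, a failed request from a small consumer would still name the large ones in the same message, so the reader does not have to guess which operator to investigate. Sorting all consumers on the error path keeps the successful path cheap, since the ranking is only computed when an allocation fails.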
Describe alternatives you've considered
No response
Additional context
No response