Resources exhausted errors are confusing; return the biggest memory consumers. #11523

@wiedld

Description

Is your feature request related to a problem or challenge?

When asking for more memory via the memory reservation, the error returned from the underlying memory pool focuses on that specific request. As a result, when debugging a resource exhausted error, we get an error message that looks something like:

Failed to allocate additional 795696944 bytes for RepartitionExec[1] with 3182787776 bytes already allocated - maximum available is 63296515

This error describes which incremental request failed to get more memory, not what is using the most memory. As a result, additional dev time has to be spent to (a) at best, track down the actual high memory consumer or (b) at worst, chase the wrong memory consumer.

Describe the solution you'd like

One possible solution is to have the error message return the top K memory consumers.
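To make the idea concrete, here is a minimal sketch of a pool that tracks bytes per consumer and lists the largest ones when a request fails. `TrackedPool`, `try_grow`, and `top_consumers` are hypothetical names used only for illustration. They are not DataFusion's existing `MemoryPool` API, and the operator names in the usage note are just labels.

```rust
use std::collections::HashMap;

/// Hypothetical pool that records how many bytes each named consumer holds.
/// Illustrative only; this is not DataFusion's `MemoryPool` trait.
pub struct TrackedPool {
    limit: usize,
    used: usize,
    by_consumer: HashMap<String, usize>,
}

impl TrackedPool {
    pub fn new(limit: usize) -> Self {
        Self { limit, used: 0, by_consumer: HashMap::new() }
    }

    /// Returns up to `k` consumers, largest reservation first.
    pub fn top_consumers(&self, k: usize) -> Vec<(String, usize)> {
        let mut all: Vec<(String, usize)> = self
            .by_consumer
            .iter()
            .map(|(name, bytes)| (name.clone(), *bytes))
            .collect();
        // Sort by bytes descending, then by name so ties are deterministic.
        all.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        all.truncate(k);
        all
    }

    /// Tries to reserve `additional` bytes for `consumer`.
    /// On failure the error names the top `k` consumers, not just the requester.
    pub fn try_grow(&mut self, consumer: &str, additional: usize, k: usize) -> Result<(), String> {
        // `used` never exceeds `limit`, so this subtraction cannot underflow.
        let available = self.limit - self.used;
        if additional > available {
            let top: Vec<String> = self
                .top_consumers(k)
                .into_iter()
                .map(|(name, bytes)| format!("{name} ({bytes} bytes)"))
                .collect();
            return Err(format!(
                "Failed to allocate additional {additional} bytes for {consumer} - \
                 maximum available is {available}. Top consumers: {}",
                top.join(", ")
            ));
        }
        self.used += additional;
        *self.by_consumer.entry(consumer.to_string()).or_insert(0) += additional;
        Ok(())
    }
}
```

With a 100-byte limit, reserving 60 bytes for `SortExec` and 30 for `RepartitionExec[1]`, then requesting 20 more for `RepartitionExec[1]`, fails with an error that lists `SortExec` first, which points debugging at the largest consumer instead of the requester.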

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Labels

enhancement (New feature or request)
