Enable automatic memory accounting using the memory pool when creating arrays #8938

@LiaCastaneda

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

When tracking memory usage with Arrow memory pools (introduced in #7303, tracked by EPIC #8137), users must manually call claim() on buffers after array creation. It is easy to forget to call claim() on newly created arrays, which leads to under-reported memory usage. It also requires direct access to ArrayData just to reach the buffers to claim, which is complicated in some cases.
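To illustrate the failure mode, here is a minimal, self-contained sketch. `Pool`, `Buf`, and `Array` are hypothetical stand-ins that only mirror the shape of the pattern (a pool that counts claimed bytes, buffers with a `claim` method); they are not the arrow-rs types, and the real claim semantics may differ.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for a tracking memory pool: it only counts claimed bytes.
#[derive(Debug, Default)]
struct Pool {
    used: AtomicUsize,
}

impl Pool {
    fn used(&self) -> usize {
        self.used.load(Ordering::Relaxed)
    }
}

/// Hypothetical stand-in for a buffer that can be claimed against a pool.
struct Buf {
    data: Vec<u8>,
}

impl Buf {
    /// Register this buffer's size with the pool.
    fn claim(&self, pool: &Pool) {
        pool.used.fetch_add(self.data.len(), Ordering::Relaxed);
    }
}

/// Hypothetical stand-in for an array that owns a values buffer and a validity buffer.
struct Array {
    buffers: Vec<Buf>,
}

impl Array {
    fn new(values_bytes: usize, validity_bytes: usize) -> Self {
        Array {
            buffers: vec![
                Buf { data: vec![0u8; values_bytes] },
                Buf { data: vec![0u8; validity_bytes] },
            ],
        }
    }

    /// Total bytes actually held by this array's buffers.
    fn bytes(&self) -> usize {
        self.buffers.iter().map(|b| b.data.len()).sum()
    }
}

/// Returns (bytes tracked by the pool, bytes actually allocated).
fn manual_accounting() -> (usize, usize) {
    let pool = Pool::default();

    // First array: the caller remembers to claim every buffer.
    let a = Array::new(32, 4);
    for buf in &a.buffers {
        buf.claim(&pool);
    }

    // Second array: the claim step is forgotten, so the pool never sees it.
    let b = Array::new(64, 8);

    (pool.used(), a.bytes() + b.bytes())
}
```

In this sketch the pool tracks only the first array's bytes, so the reported usage is lower than what is actually allocated.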

It would be much more ergonomic and reliable to automatically claim buffers at array creation time.

Describe the solution you'd like

Add optional MemoryPool parameters to array constructors and builders so that buffers are automatically claimed at creation time. For example, array constructors could offer PrimitiveArray::try_new_with_pool(), and builders could offer PrimitiveBuilder::new_with_pool(), which would claim buffers when finish() is called.
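A rough sketch of what this could look like, using hypothetical stand-in types rather than the real arrow-rs ones. Only the method names `try_new_with_pool`, `new_with_pool`, and `finish` come from the proposal above; `Pool`, `Int32Array`, `Int32Builder`, and all signatures are assumptions made for illustration.

```rust
use std::mem::size_of;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for a tracking memory pool: it only counts claimed bytes.
#[derive(Debug, Default)]
struct Pool {
    used: AtomicUsize,
}

impl Pool {
    fn claim(&self, bytes: usize) {
        self.used.fetch_add(bytes, Ordering::Relaxed);
    }

    fn used(&self) -> usize {
        self.used.load(Ordering::Relaxed)
    }
}

/// Hypothetical stand-in for a primitive array of i32 values.
struct Int32Array {
    values: Vec<i32>,
}

impl Int32Array {
    /// Assumed shape of a pool-aware constructor: claim at creation time.
    fn try_new_with_pool(values: Vec<i32>, pool: &Pool) -> Result<Self, String> {
        pool.claim(values.len() * size_of::<i32>());
        Ok(Int32Array { values })
    }

    fn len(&self) -> usize {
        self.values.len()
    }
}

/// Hypothetical stand-in for a builder that remembers an optional pool.
struct Int32Builder<'a> {
    values: Vec<i32>,
    pool: Option<&'a Pool>,
}

impl<'a> Int32Builder<'a> {
    fn new_with_pool(pool: &'a Pool) -> Self {
        Int32Builder { values: Vec::new(), pool: Some(pool) }
    }

    fn append_value(&mut self, v: i32) {
        self.values.push(v);
    }

    /// Claim the finished buffer against the pool, if one was supplied.
    fn finish(&mut self) -> Int32Array {
        let values = std::mem::take(&mut self.values);
        if let Some(pool) = self.pool {
            pool.claim(values.len() * size_of::<i32>());
        }
        Int32Array { values }
    }
}

/// Returns (bytes tracked by the pool, total number of elements created).
fn automatic_accounting() -> (usize, usize) {
    let pool = Pool::default();

    // Constructor path: four i32 values are claimed on creation.
    let a = Int32Array::try_new_with_pool(vec![1, 2, 3, 4], &pool).unwrap();

    // Builder path: two i32 values are claimed when finish() is called.
    let mut builder = Int32Builder::new_with_pool(&pool);
    builder.append_value(5);
    builder.append_value(6);
    let b = builder.finish();

    (pool.used(), a.len() + b.len())
}
```

With this shape the caller never touches the underlying buffers, so there is no separate claim step to forget.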

Additional context

This feature is needed to improve memory accounting and tracking in downstream projects like DataFusion. Related discussions:
