Use the whole frame when writing rows. #17094

Merged
gianm merged 3 commits into apache:master from gianm:frame-writer-use-more-memory
Sep 19, 2024

@gianm gianm commented Sep 17, 2024

This patch makes the following adjustments to enable writing larger single rows to frames:

  1. RowBasedFrameWriter: Max out the allocation size on the final doubling.
    For example, if the final allocation would "naturally" be 1 MiB but the
    max frame size is 900 KiB, use 900 KiB rather than failing the 1 MiB
    allocation.

  2. AppendableMemory: In reserveAdditional, release the last block if it
    is empty. This eliminates waste when a frame writer uses a
    successive-doubling approach to find the right allocation size.

  3. ArenaMemoryAllocator: Reclaim memory from the last allocation when
    the last allocation is closed.
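
To make item 1 concrete, here is a minimal sketch of a capped successive-doubling search. It is not the code in RowBasedFrameWriter: the class name `CappedDoublingSketch`, the method `findAllocationSize`, and its parameters are all hypothetical, and it assumes the row size is known up front, which the real writer may not know.

```java
import java.util.OptionalLong;

// Hypothetical illustration only; names and signatures are not from Druid.
final class CappedDoublingSketch
{
  /**
   * Doubles the allocation size, starting at initialSize, until it can hold rowSize.
   * When capFinalDoubling is true, a doubling that would overshoot maxFrameSize is
   * clamped to maxFrameSize instead of being treated as a failure.
   * Returns empty if no permitted allocation size can hold the row.
   */
  static OptionalLong findAllocationSize(
      long rowSize,
      long initialSize,
      long maxFrameSize,
      boolean capFinalDoubling
  )
  {
    long size = initialSize;
    while (size < rowSize) {
      long next = size * 2;
      if (next > maxFrameSize) {
        if (capFinalDoubling && size < maxFrameSize) {
          // Final doubling: use everything that is left rather than failing.
          next = maxFrameSize;
        } else {
          return OptionalLong.empty();
        }
      }
      size = next;
    }
    return size <= maxFrameSize ? OptionalLong.of(size) : OptionalLong.empty();
  }
}
```

With a 64 KiB starting size and a 900 KiB max frame size, a 600 KiB row fails in this sketch when the cap is off (the step after 512 KiB would be 1 MiB) and gets a 900 KiB allocation when the cap is on.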

Prior to these changes, a single row could be much smaller than the frame size and still fail to be added to the frame.
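
The waste described in items 2 and 3 can be pictured with a toy bump allocator. This is a sketch under stated assumptions, not the actual ArenaMemoryAllocator or AppendableMemory: `ArenaSketch`, `allocate`, `release`, and `fitsWithDoubling` are hypothetical names, and the final-doubling cap from item 1 is left out on purpose to isolate the effect of releasing empty blocks.

```java
// Hypothetical illustration only; names and signatures are not from Druid.
final class ArenaSketch
{
  private final long capacity;
  private long position = 0;

  ArenaSketch(long capacity)
  {
    this.capacity = capacity;
  }

  long available()
  {
    return capacity - position;
  }

  /** Returns the start offset of the new allocation, or -1 if it does not fit. */
  long allocate(long size)
  {
    if (size > available()) {
      return -1;
    }
    long start = position;
    position += size;
    return start;
  }

  /** Space is reclaimed only when the released allocation is the most recent one. */
  void release(long start, long size)
  {
    if (start + size == position) {
      position = start;
    }
  }

  /**
   * Tries block sizes initialBlock, 2x, 4x, ... until one can hold the row.
   * Blocks that are too small stay empty; they are released before the next
   * attempt only when releaseEmptyLastBlock is true.
   */
  static boolean fitsWithDoubling(
      long rowSize,
      long initialBlock,
      long arenaCapacity,
      boolean releaseEmptyLastBlock
  )
  {
    if (initialBlock <= 0) {
      throw new IllegalArgumentException("initialBlock must be positive");
    }
    ArenaSketch arena = new ArenaSketch(arenaCapacity);
    long blockSize = initialBlock;
    while (true) {
      long start = arena.allocate(blockSize);
      if (start < 0) {
        return false;
      }
      if (blockSize >= rowSize) {
        return true;
      }
      if (releaseEmptyLastBlock) {
        arena.release(start, blockSize);
      }
      blockSize *= 2;
    }
  }
}
```

In this sketch, a 300 KiB row in a 900 KiB arena fails without releasing empty blocks, because the 64, 128, and 256 KiB attempts already consume 448 KiB and the 512 KiB attempt no longer fits. With release enabled, each failed attempt is handed back and the 512 KiB block succeeds.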

@gianm gianm added this to the 31.0.0 milestone Sep 17, 2024
@gianm gianm marked this pull request as ready for review September 17, 2024 20:35
@github-actions github-actions Bot added the "Area - Batch Ingestion" and "Area - MSQ" (For multi stage queries - https://github.com/apache/druid/issues/12262) labels Sep 18, 2024

@LakshSingla LakshSingla left a comment

I thought I fixed the point mentioned in (1) elsewhere

gianm commented Sep 19, 2024

> I thought I fixed the point mentioned in (1) elsewhere

I think you mean this change in AppendableMemory: https://github.com/apache/druid/pull/15987/files#r1512928070

The issue (1) from my list is a similar thing in RowBasedFrameWriter.

@gianm gianm merged commit 3d45f98 into apache:master Sep 19, 2024
@gianm gianm deleted the frame-writer-use-more-memory branch September 19, 2024 07:42
kfaraz pushed a commit to kfaraz/druid that referenced this pull request Sep 30, 2024
* Use the whole frame when writing rows.

* Style.

* Fix test.
kfaraz added a commit that referenced this pull request Oct 2, 2024
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>