Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

[WIP] [no merge] Support limiting the buffer size in LargeArrayBuilder.#14006

Closed
jamesqo wants to merge 6 commits intodotnet:masterfrom
jamesqo:limited
Closed

[WIP] [no merge] Support limiting the buffer size in LargeArrayBuilder.#14006
jamesqo wants to merge 6 commits intodotnet:masterfrom
jamesqo:limited

Conversation

@jamesqo
Copy link
Copy Markdown
Contributor

@jamesqo jamesqo commented Nov 26, 2016

There are certain Linq methods, such as Where and Take, where we cannot pinpoint the exact number of elements the iterator will contain, but we can establish an upper bound on how many elements there are. For example, list.Where(p) can have at most list.Count elements, and e.Take(5) can have at most 5 elements.

Currently, when we call ToArray on any of these iterators, the resizing pattern will be to allocate 4, then 8, then 16, etc. size buffers. This can be wasteful because e.g. if the source enumerable in Where contains 100 elements, we allocate space for 128 elements even though those last 28 slots will never be used.

This PR adds support to LargeArrayBuilder<T> to limit how many elements we allocate, so there is no extra space wasted. It should help drive down allocations in Where, Where.Select, & Take when the maximum count is far from the next power of 2.

Waiting on: #12703, #13628

@jamesqo jamesqo changed the title [no merge] Support limiting the buffer size in LargeArrayBuilder. [WIP] [no merge] Support limiting the buffer size in LargeArrayBuilder. Nov 26, 2016
@karelz karelz added area-System.Collections * NO MERGE * The PR is not ready for merge yet (see discussion for detailed reasons) labels Nov 26, 2016
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the old Add(T) was marked with AggressiveInlining, should this new version be marked too?

Copy link
Copy Markdown
Contributor Author

@jamesqo jamesqo Nov 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I found in my experiments a while ago that the JIT inlines methods sequentially; it will recognize the body of Add(T) is small, then inline it. Then it will recognize the body of Add(T, int) is small, and inline it. Consequently, you only need to apply AggressiveInlining once.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is limit passed on every call to Add? Is it expected to change? If not, shouldn't it be passed in the constructor and then stored in a field? Or is the cost of one more field too high?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the size of the struct is already 40 bytes. I didn't want to penalize other cases where we're not using this parameter, e.g. Select or Concat. No, it's not expected to change,

Copy link
Copy Markdown
Member

@stephentoub stephentoub Nov 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the size of the struct is already 40 bytes

Why does the size of this struct matter? In all usage I can remember, it's only used on the stack, it's not passed around, etc., in which case the difference between 40 and 44 bytes doesn't matter.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub Future-proofing. I have been thinking of using it in Reverse, for example, in which case it would need to be stored in a field.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been thinking of using it in Reverse, for example

Where would it need to be stored in a field? If you mean using it instead of using Buffer and storing the array, you're likely going to run into other complications.

Future-proofing

While it's good to think about future usage, from my perspective this change makes existing code harder to understand, unnecessarily. We should only do so if/when there's known good reason to do so.

@stephentoub
Copy link
Copy Markdown
Member

This PR adds support to LargeArrayBuilder to limit how many elements we allocate

Let's only add such support at the same time such support is consumed, so that we can actually see what real impact it has, and determine whether it's really worth the added complexity.

@jamesqo
Copy link
Copy Markdown
Contributor Author

jamesqo commented Nov 26, 2016

@stephentoub Sure. Will close for now until the the other PRs are merged

@jamesqo jamesqo closed this Nov 26, 2016
@jamesqo jamesqo deleted the limited branch November 26, 2016 13:36
@karelz karelz modified the milestone: 1.2.0 Dec 3, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-System.Collections * NO MERGE * The PR is not ready for merge yet (see discussion for detailed reasons)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants