Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Adds FixedSizeBuilder type#154

Closed
jaredpar wants to merge 10 commits into
dotnet:masterfrom
jaredpar:fixed-size-builder
Closed

Adds FixedSizeBuilder type#154
jaredpar wants to merge 10 commits into
dotnet:masterfrom
jaredpar:fixed-size-builder

Conversation

@jaredpar
Copy link
Copy Markdown
Member

The existing ImmutableArray builder pattern through the Builder type
uses a minimum of two array allocations to create a single
ImmutableArray instance. The ImmutableArray.Builder type has an
array field which is manipulated in much the same way as the array
inside of List. The call to ToImmutable on the builder creates a
copy of this array at the current length and wraps the result in an
ImmutableArray

This pattern is great for a number of sceanrios. For example when the
length of the array is variable or the application has low memory
pressure.

There are many situations though where a no copy / single allocation
solution would be advantageous. In particular where there are a known
fixed number of elements to be converted to an array. There is no need
for a double copy solution here because the final size is known in
advance. All that is needed is for the array to be created, populated
and then frozen in an ImmutableArray instance.

A good example of this need is Roslyn within Visual Studio. Roslyn is
very careful to avoid unnecessary allocations, especially on hot paths.
Many of the hot paths include building instances of ImmutableArray.
To avoid the double allocation Roslyn pools ImmutableArray.Builder
instances. This successfully avoids the double allocation but increases
the overall Gen2 pressure in the system.

This PR proposes to fix this by adding a no copy builder pattern:
FixedSizeBuilder. This is a very simple wrapper around T[] which allows
for the creation and population of the array without the ability to
store a long term reference to the underlying array. Hence it can be
safely converted to an ImmutableArray without the need to copy the
array.

The builder can be reused for subsequent ImmutableArray builds via
the Reset method

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea what changed here. Didn't do anything special other than use the project inside of Visual Studio

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most likely the byte order mark s added (or removed). Editors don't show it, but it is nevertheless a change. I'm cool with you leaving it here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just leave it. VS will keep doing this.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it tilting at wind mills or not worth it to file a bug on VS?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an extension that will make VS stop doing this.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think VS just likes the byte order mark to be there. I don't think it will ever remove it once added. Although I'm not sure.

@AArnott AArnott added System.Collections.Immutable enhancement Product code improvement that does NOT require public API changes/additions labels Nov 25, 2014
@AArnott AArnott self-assigned this Nov 25, 2014
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be a bit more 'user facing' to document this parameter as "The size of the array the Builder creates."

@AArnott
Copy link
Copy Markdown

AArnott commented Nov 25, 2014

How about a test for calling Reset() multiple times in a row?
And please add a test for initializing to n length, setting some values, calling Reset(n) and then verifying the values have been cleared. That will set us up nicely for the optimization of not throwing away the array when Reset is called with the same length as before but still having the behavior of clearing the contents.

@AArnott
Copy link
Copy Markdown

AArnott commented Nov 25, 2014

OK. I'm done reviewing. Please ping when you've responded to all the feedback.

@AArnott
Copy link
Copy Markdown

AArnott commented Nov 25, 2014

Also, we should definitely get another MSFT signoff from the immutable team before merging this since it's very public API impacting and we don't have long to stabilize before releasing this library.

@jaredpar
Copy link
Copy Markdown
Member Author

@AArnott pushed a new commit. Took most of the feedback directly, added comments every where else

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StyleCop guidelines for properties state the summary should begin with "Gets a value indicating whether" for boolean properties. So perhaps "Gets a value indicating whether this instance has not yet created an ImmutableArray{T} since it was last Reset or created."

@AArnott
Copy link
Copy Markdown

AArnott commented Nov 25, 2014

Thanks, @jaredpar. Overall it looks really good. A few more comments that occurred to me on this iteration and replies to others.

@jaredpar
Copy link
Copy Markdown
Member Author

Thanks @AArnott. Pushed a new version which addresses items you raised.

The one issue that I'm still worried about is the name ToImmutable. On the other types, to my knowledge at least, the ToImmutable call is non-destructive. I wonder if a different name should be used here like ToImmutableAndClear

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually define structs for public GetEnumerator() methods to reduce allocations.
Unless you want to go to that trouble, can you remove this method in favor of an explicit interface implementation for it? That way, we can add a public GetEnumerator() method later when and if the struct optimization is worthwhile.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AArnott I based this implementation off of ImmutableArray<T>+Builder which has this exact pattern. Should they both be changed?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think enumeration is rare and doesn't need to be optimized. In any case, we have O(1) indexing.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AArnott in the most recent push I changed this enumeration to just re-use the Enumerator type from ImmutableArray.

Are we still interested in making ImmutableArray.Builder have struct based enumeration? If so I can file an Issue to track that

@AArnott
Copy link
Copy Markdown

AArnott commented Nov 25, 2014

Yes, the ToImmutableAndClear makes sense. Go for it.

@jaredpar
Copy link
Copy Markdown
Member Author

@AArnott push up a new version.

Seems like our comments weren't lost with the most recent push. In case that changes though the one open issue that I'm aware of is whether or not to have GetEnumerator be an explicit interface implementation.

Right now the pattern in FixedSizeBuilder matches that of Builder. I'm fine with making it an explicit interface implementation if desired but it seems like if I make that change for FixedSizeBuilder I should also make it for Builder.

@AArnott
Copy link
Copy Markdown

AArnott commented Nov 25, 2014

Hmmm... I hear we're trying to avoid breaking changes at this point, and changing GetEnumerator on the other Builder would qualify as a breaking change. If it's already a non-struct on the other Builder then it's probably not a big issue. As you say, FixedSizeBuilder isn't meant as a first-class collection citizen so it's probably not a big deal.

@stephentoub stephentoub added the tenet-performance Performance related issue label Dec 2, 2014
@davkean davkean added the api-needs-work API needs work before it is approved, it is NOT ready for implementation label Dec 3, 2014
@AArnott
Copy link
Copy Markdown

AArnott commented Dec 3, 2014

Closing this pull request since we've agreed to take another approach.

@AArnott AArnott closed this Dec 3, 2014
@jaredpar jaredpar mentioned this pull request Dec 3, 2014
@AArnott
Copy link
Copy Markdown

AArnott commented Dec 4, 2014

I guess I closed this prematurely. Sorry about that.

@AArnott AArnott reopened this Dec 4, 2014
@jaredpar
Copy link
Copy Markdown
Member Author

jaredpar commented Dec 4, 2014

No problem, my email could have been clearer.

Lesson of the day: don't type in a rush with a half open laptop in the middle of another meeting :)

@davkean
Copy link
Copy Markdown
Member

davkean commented Dec 6, 2014

Can someone give an update on what the status of this is?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be <see cref="ImmutableArray{T}.FixedLengthBuilder"/>?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way works for the compiler, but yes, your prescribed syntax is easier to read.

The existing ImmutableArray<T> builder pattern through the Builder type
uses a minimum of two array allocations to create a single
ImmutableArray<T> instance.  The ImmutableArray<T>.Builder type has an
array field which is manipulated in much the same way as the array
inside of List<T>.  The call to ToImmutable on the builder creates a
copy of this array at the current length and wraps the result in an
ImmutableArray<T>

This pattern is great for a number of sceanrios.  For example when the
length of the array is variable or the application has low memory
pressure.

There are many situations though where a no copy / single allocation
solution would be advantageous.  In particular where there are a known
fixed number of elements to be converted to an array.  There is no need
for a double copy solution here because the final size is known in
advance.  All that is needed is for the array to be created, populated
and then frozen in an ImmutableArray<T> instance.

A good example of this need is Roslyn within Visual Studio.  Roslyn is
very careful to avoid unnecessary allocations, especially on hot paths.
Many of the hot paths include building instances of ImmutableArray<T>.
To avoid the double allocation Roslyn pools ImmutableArray<T>.Builder
instances.  This successfully avoids the double allocation but increases
the overall Gen2 pressure in the system.

This PR proposes to fix this by adding a no copy builder pattern:
FixedSizeBuilder.  This is a very simple wrapper around T[] which allows
for the creation and population of the array without the ability to
store a long term reference to the underlying array.  Hence it can be
safely converted to an ImmutableArray<T> without the need to copy the
array.

The builder can be reused for subsequent ImmutableArray<T> builds via
the Reset method
@jaredpar jaredpar force-pushed the fixed-size-builder branch from eab3123 to 4a7346e Compare January 6, 2015 17:51
@terrajobst
Copy link
Copy Markdown

Based on the API review we decided to not pursue this option.

@terrajobst terrajobst closed this Jan 9, 2015
@karelz karelz removed the enhancement Product code improvement that does NOT require public API changes/additions label Nov 22, 2016
@karelz karelz modified the milestone: 1.0.0-rtm Dec 3, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

api-needs-work API needs work before it is approved, it is NOT ready for implementation tenet-performance Performance related issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.