Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Add ExtractToImmutable#180

Merged
jaredpar merged 8 commits into
dotnet:masterfrom
jaredpar:fixes-extract
Jan 9, 2015
Merged

Add ExtractToImmutable#180
jaredpar merged 8 commits into
dotnet:masterfrom
jaredpar:fixes-extract

Conversation

@jaredpar
Copy link
Copy Markdown
Member

@jaredpar jaredpar commented Dec 3, 2014

This is an API which allows for a zero-copy creation of an ImmutableArray<T> value.

This is an alternative approach to the PR #154

EDITS

  • API review results available here.

@AArnott
Copy link
Copy Markdown

AArnott commented Dec 3, 2014

Regarding your addition of the ImmutableArray.CreateBuilderWithCount method, we need to keep in mind that when you add API, it must pay for itself. Every new feature/API has a "weight" of some negative number that must be more than paid for in order to be accepted. This method doesn't qualify by my estimation using that metric. It adds complexity because every new user who wants to create a Builder now has to face this extra method and ask themselves why it's there and which version they should instantiate. The difference is rather subtle, and it seems it would make bugs more likely.
You've made the point that it has perf benefits and makes things more explicit. We shouldn't optimize for perf before we have data showing it will matter (rule no. 1 in perf optimizations: don't optimize first), and we can add this method later if we find it's important. As for being more explicit, when it means that the majority of users that won't care now have to face a decision that they'd rather not deal with, I think it costs more than it provides.

@jaredpar
Copy link
Copy Markdown
Member Author

jaredpar commented Dec 3, 2014

@AArnott I do think having an API which is dedicated to the no copy pattern is worthwhile. It's a decidedly different pattern than a dynamic builder and as a result has different creation semantics.

I disagree that it would introduce bugs, I think it is far more likely to prevent them. Given only the APIs of today I think the following pattern will be a potential pitfall for anyone who first tries a no copy approach.

var builder = ImmutableArray.CreateBuilder<string>(2);
builder[0] = "a"; // Throws because capacity != count 

That being said I'm fine with removing it for now and looking at adding it back at a later time. It's a purely additive change and hence easy to add later if we agree it needs to happen.

@AArnott
Copy link
Copy Markdown

AArnott commented Dec 3, 2014

Thanks for being fine to remove it. If you want to send another pull request after this one is taken that adds CreateBuilderWithCount, we can then make that decision independently of the rest of the Goodness in this one. :)

@AArnott
Copy link
Copy Markdown

AArnott commented Dec 3, 2014

👍 Looks good to me. Hopefully the API review process will be much quicker on this pull request than your last since the API impact is much smaller.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, these Count > 1 checks were previously added to avoid allocating the Comparer wrapper. Now that the wrapper isn't needed, I wonder if the checks are still beneficial.

@nguerrera
Copy link
Copy Markdown
Contributor

👍 Overall, looks good to me. I'd like to get a bit more API reviewer's eyes on this before merging.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just call it reverse to be consistent with List.

@joshfree joshfree added the api-needs-work API needs work before it is approved, it is NOT ready for implementation label Dec 5, 2014
This is an alternate implementation idea for having a zero copy
ImmutableArray<T> construction mechanism.  This reuses the existing
builder and adds an ExtractToImmutable method which wraps the internal
array with an ImmutableArray<T> and replaces it with an empty array.

As a part of this change the RefAsValueType optimization is removed from
the Builder.  It's a requirement to have such a method because
ImmutableArray<T> wraps a T[] not a RefAsValueType<T>[]
As per the results of the design review this changes the Builder API
around extraction of immutable arrays in the following ways:

1. Renamed method to MoveToImmutable
2. Throw when Count != Capacity in that method to avoid potential user
errors around extraction.
As discussed making two minor changes to make Builder line up closer to
List<T>

- Capacity now has a get and set.
- Make EnsureCapacity a private method.  It is superfluous in the public
API with the addition of a set accessor on Capacity.
@mmitche
Copy link
Copy Markdown
Member

mmitche commented Jan 7, 2015

Can one of the admins verify this patch?

@FiveTimesTheFun
Copy link
Copy Markdown
Contributor

Thanks @jaradpar :) In e034c93 you add the check to MoveToImmutable that throws an InvalidOperationException if Count is not equal to Capacity. I believe Immo is going to add more details about the convention we're advising for this method to the API review. Since this is a potentially risky app-compat burden and potentially easy to break, may I recommend some tests around the positive and negative case there? I think it would make sense to test:
Create with explicit capacity, add exactly that number, call MoveToImmutable
Create with explicit capacity, insert exactly that number, call MoveToImmutable
Create with explicit capacity, addrange exactly that number, call MoveToImmutable
Create with explicit capacity, add, remove, and add until the final number is exactly that number, call MoveToImmutable
Create with explicit capacity, add some but less than that number, expect InvalidOperationException

@terrajobst
Copy link
Copy Markdown

The API review results are now available here.

@jaredpar
Copy link
Copy Markdown
Member Author

jaredpar commented Jan 9, 2015

@FiveTimesTheFun most of those tests already exist. I think I'm missing Remove and possibly AddRange but the others should be there. Will add those.

@weshaggard
Copy link
Copy Markdown
Member

ok to test

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explicitly type temp here to T[]?

@jaredpar
Copy link
Copy Markdown
Member Author

jaredpar commented Jan 9, 2015

@FiveTimesTheFun and @weshaggard

Latest contains the requested changes.

@sharwell
Copy link
Copy Markdown

sharwell commented Jan 9, 2015

@jaredpar I'm interested in your feedback regarding the following usability improvement. How difficult would it be to implement the following after adding MoveToImmutable?

The ToImmutable() method could be updated to the following special behavior when Count==Capacity:

  1. Return an ImmutableArray<T> without additional allocations, as you've done with MoveToImmutable.
  2. Alter the semantics of _elements to be "copy on write", so the copy currently performed within ToImmutable is relocated to the next time the _elements array is modified.

@jaredpar
Copy link
Copy Markdown
Member Author

jaredpar commented Jan 9, 2015

@sharwell

We discussed a copy on write design which optimized ToImmutable in the manner you described but discarded it because we didn't like the trade offs. It would require:

  1. The builder type expand by a field which increases allocation size.
  2. Adds extra overhead to any mutating method call on the type.

The builder is meant to be a lean type which is efficient as possible for building up ImmutableArray<T> values. In particular when building up a single ImmutableArray<T> value which is the predominant use case. These tradeoffs went against that.

The good news though is that if you want a copy on write design, it can easily be built on top of the builder as is.

@jaredpar
Copy link
Copy Markdown
Member Author

jaredpar commented Jan 9, 2015

@sharwell

I strongly opposed having MoveToImmutable have the behavior of "move if count == capacity, copy otherwise". This API was added specifically to avoid copies because they are negatively impacting perf. Changing the API in this manner would cause programming errors to show up not as runtime errors but silent perf regressions. This type of change can easily go unnoticed and be missed during a code review.

If you really want these semantics though, and we agreed in design that it's possible customers would, it can easily be added on top of the change with an extension method.

static ImmutableArray<T> ToImmutableEfficient(this ImmutableArray<T>.Builder builder)
{
  if (builder.Count == builder.Capacity) 
  {
    return builder.MoveToImmutable();
  }
  else  
  {
    var array = builder.ToImmutable();
    builder.Clear();
    return array;
  }
}

@sharwell
Copy link
Copy Markdown

sharwell commented Jan 9, 2015

Edit: This was a reply to the message about optimizing ToImmutable. It seems Jared answered this question before I asked it. 😄

@jaredpar That makes sense. How do you feel about the following?

  • Instead of MoveToImmutable, provide a method ToImmutableAndClear
  • Instead of throwing an exception when Count < Capacity, use Array.Resize(ref _elements, Count) before creating the ImmutableArray<T> (which resizes the array only if Count < Capacity)

This would provide the performance of MoveToImmutable, the overall usability of ToImmutable, and have semantics which are clearly communicated through the name of the method.

@jaredpar
Copy link
Copy Markdown
Member Author

jaredpar commented Jan 9, 2015

@sharwell

This solution still allocates memory in the cases where Count != Capacity because Array.Resize always allocates memory. This goes against the intended design.

Note: In general I perfer the terminology Reset over Clear. The term Clear is established as "zero out elements" and, at least to my knowledge, never involves changing the size of the underlying storage. I feel Reset is closer to the semantics being provided here.

@sharwell
Copy link
Copy Markdown

sharwell commented Jan 9, 2015

@jaredpar I believe that ToImmutableAndClear would be the method I used most often in applications. I use memory allocation profiling to track down sources of allocations, and this method would show up in the profiler if actual use cases were creating copies of the underlying array.

@jaredpar
Copy link
Copy Markdown
Member Author

jaredpar commented Jan 9, 2015

@sharwell

In general I prefer core types like ImmutableArray<T>.Builder have single purpose methods. It makes the usage, including performance characteristics, easy to understand even to the casual reviewer and doesn't force customers to pay for features they aren't using. It also reduces the chance of code changes silently introducing errors into the code.

I do agree that some customers will have use cases that don't align with the specific goals here. But by providing the low level methods we give them the tools to build the solutions they want. It also shifts the cost of these solutions, even if minor, to the developers actually using them.

@sharwell
Copy link
Copy Markdown

sharwell commented Jan 9, 2015

@jaredpar You reply too fast. ⏩

For code like the .NET Compiler Platform and within the .NET Core, it's clear why you would want the code to throw early for this type of performance regression. However, for other applications where memory allocations are not the primary concern, the usability characteristics of the current implementation seem unfavorable. A more friendly approach would be providing ToImmutableAndReset through this API, and then having the .NET Compiler Platform call this operation through a helper method like the following:

public static ImmutableArray<T> MoveToImmutable(this ImmutableArray<T>.Builder builder)
{
  if (builder.Count < builder.Capacity)
    throw ...

  return builder.ToImmutableAndReset();
}

Summary: I'm not convinced that the driving concern for the current semantics when Count < Capacity accurately represent the primary concern for the majority of consumers.

@sharwell
Copy link
Copy Markdown

sharwell commented Jan 9, 2015

BTW, I believe the posts above represent my complete thoughts on the matter, so I won't be going back-and-forth on this forever. 😜 I do believe the discussion is beneficial for future readers looking at this. 👍

@jaredpar
Copy link
Copy Markdown
Member Author

jaredpar commented Jan 9, 2015

@sharwell One last little thought 😄

I agree that both of these methods do make sense and that one is naturally implemented in terms of the other. However I would do it in the opposite direction: implement ToImmutableAndReset in terms of MoveToImmutable.

The reason why is MoveToImmutable represents the lowest level method. It is a simple check and pointer move operation. It is likely to be used directly by code which is much more perf conscious and already heavily geared to keeping Capacity == Count. Hence I would want to keep it as lean as possible.

Implementing it in terms of ToImmutableAndReset will add at the very least extra branches to the code path. That's negligible in the vast majority of scenarios but could be relevant in specific cases.

On the other hand ToImmutableAndReset is already accepting the possibility of a memory copy operation. The extra branch that is added by calling into MoveToImmutable, and hence repeating the capacity check, is small compared to the possibility of allocation.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use the following for this block? EDIT: This is suboptimal

if (value > 0)
{
    Array.Resize(ref _elements, value);
}

@jaredpar
Copy link
Copy Markdown
Member Author

jaredpar commented Jan 9, 2015

Anything blocking merge?

@terrajobst
Copy link
Copy Markdown

LGTM.

@sharwell
Copy link
Copy Markdown

sharwell commented Jan 9, 2015

@jaredpar No blockers from me.

jaredpar added a commit that referenced this pull request Jan 9, 2015
@jaredpar jaredpar merged commit 7499821 into dotnet:master Jan 9, 2015
@jaredpar jaredpar deleted the fixes-extract branch January 9, 2015 21:54
@karelz karelz modified the milestone: 1.0.0-rtm Dec 3, 2016
MaximLipnin added a commit to MaximLipnin/corefx that referenced this pull request Nov 13, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

api-needs-work API needs work before it is approved, it is NOT ready for implementation

Projects

None yet

Development

Successfully merging this pull request may close these issues.