String.Join optimization for single item lists #1460
❗ Did this just drop the second value? Seems like you either need a do/while loop, or add the following before this loop.
result.Append(separator);
result.Append(en.Current);
Thanks @sharwell, that was an incomplete copy from my test harness. I'll add that later, as well as a gist which might make it easier to confirm perf.
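For readers following the review comment above, here is a minimal sketch of the enumerator pattern being discussed. It is not the code from this PR: `JoinSketch` is a hypothetical name, and returning an empty string for a single null item is an assumption of the sketch. It shows why the second item must be appended explicitly (here via a do/while loop) once `MoveNext` has already been called twice.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only, not the PR's implementation.
static string JoinSketch(string separator, IEnumerable<string> values)
{
    if (values == null) throw new ArgumentNullException(nameof(values));

    using (IEnumerator<string> en = values.GetEnumerator())
    {
        // No items: nothing to join.
        if (!en.MoveNext()) return string.Empty;

        string first = en.Current;

        // Exactly one item: hand back the existing instance, no new allocation.
        // (Mapping a null item to "" is an assumption of this sketch.)
        if (!en.MoveNext()) return first ?? string.Empty;

        var result = new StringBuilder();
        result.Append(first);

        // The enumerator is already positioned on the second item, so a plain
        // while (en.MoveNext()) loop would skip it. do/while appends it first.
        do
        {
            result.Append(separator);
            result.Append(en.Current);
        } while (en.MoveNext());

        return result.ToString();
    }
}

Console.WriteLine(JoinSketch(", ", new[] { "a", "b", "c" })); // prints "a, b, c"
```

With a single-element list the sketch returns the very same string object that was passed in, which is the allocation saving the PR is after.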
|
Since you are only depending on the read-only properties of This means we will see these improvements for This means there would be no slow-down for some custom mutable list-like implementations that only inherit from |
|
@Eyas |
|
Ah yes, bad example. But my suggestion stands. Why not use the minimal interface that provides only the functions needed to get the perf boost? The alternative is to force an implementer to always implement IList to be treated as a first-class citizen, even if the mutating methods are throw-only |
|
Interesting suggestion @Eyas I used IList mainly for consistency with other areas such as Linq and I'm not sure how popular IReadOnlyList has become yet. Question for the maintainers, any preferences on this going forward? |
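To make the suggestion concrete, here is a rough sketch of a fast path that type-tests against `IReadOnlyList<string>` and uses only `Count` and the indexer. This is an illustration under assumptions, not code from the PR: `JoinReadOnlySketch` is a hypothetical name, and the fallback simply defers to the existing `String.Join` overload.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only, not the PR's implementation.
static string JoinReadOnlySketch(string separator, IEnumerable<string> values)
{
    // Only the read-only members (Count and the indexer) are needed here.
    if (values is IReadOnlyList<string> list)
    {
        int count = list.Count; // cache Count in a local
        if (count == 0) return string.Empty;
        if (count == 1) return list[0] ?? string.Empty; // reuse the existing instance

        var sb = new StringBuilder(list[0]);
        for (int i = 1; i < count; i++)
        {
            sb.Append(separator);
            sb.Append(list[i]);
        }
        return sb.ToString();
    }

    // Anything else (for example Queue<string>) takes the general path.
    return string.Join(separator, values);
}

Console.WriteLine(JoinReadOnlySketch("-", new List<string> { "x", "y" })); // prints "x-y"
```

Both `List<string>` and `string[]` implement `IReadOnlyList<string>`, so they would take the fast path in this sketch. It relies on the indexer returning items in enumeration order.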
|
Aren't there implications about interface calls vs. direct call here, WRT
|
|
There's a performance difference in the test of |
|
@ayende forcing List into either path the timings are:
If
So that might be worth doing which would leave one |
|
No, I meant List vs. IList vs. IReadOnlyList
|
|
I'm curious as to what results you get if you have only the improvement for the single-element enumerable case, and remove the |
|
With the
There is a specific Join overload for string[] but if the value arrived as an IList<T>
|
|
Committed the count local variable cache. The IList<string> try cast still seems like the best general improvement so we go with that or play it safe and just stick with the IEnumerable path. Gist for perf testing: |
|
I'm thinking that the code should just stick with IEnumerable path for now (which is still a win). There is no guarantee that IList indexers have to return values in the same order as their enumerator. While it is probably fine for Linq to work on that assumption, |
|
Replaced the commit to remove the IList try cast. |
I assume this was removed because StringBuilder.Append() correctly handles the null case. Is that correct?
Did this particular change impact the results in a significant way?
That's correct. Mentioned some of the reasoning in the PR message.
There is not a lot in it, for a null separator on List<string>(100) it is 3227ms with it and 3186ms without it. It's just not protecting from anything as Append does the check again anyway.
Makes sense. As long as we document somewhere that this was an intentional change to remove the null check 😄
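The null handling discussed in this thread can be checked with a few lines. This is a small standalone sketch, not code from the PR; it only relies on the documented behaviour that `StringBuilder.Append(string)` appends nothing when given null.

```csharp
using System;
using System.Text;

var sb = new StringBuilder();
sb.Append("a");
sb.Append((string)null); // null string: nothing is appended and nothing is thrown
sb.Append("b");

Console.WriteLine(sb.ToString()); // prints "ab"
```

Since `Append` already performs this check, replacing a null with `String.Empty` beforehand only repeats work.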
|
Addressed the feedback with variable name, nesting, and null handling comment |
|
👍 Thank you @bbowyersmyth for the change! If you don't mind, could you squash the commit that addressed the PR feedback? I prefer having separate commits while the PR is open, as that makes it easier to see what has changed after each feedback iteration, but I don't think they add value once we are done with the feedback :). |
|
@AlexGhiondea for sure. Done. |
|
Thanks @bbowyersmyth ! Waiting for a second 👍 in order to merge it in! |
|
LGTM. Do we have tests in corefx for these special cases? |
|
@stephentoub Yes they were added via dotnet/corefx#2839 |
String.Join optimization for single item lists Commit migrated from dotnet/coreclr@d176041
Returns existing string instance for String.Join when there is only one item in the list. Optimizations for when IEnumerable<string> is an IList<string>.

I've tried to match the coding style in this file but it is a bit variable. CoreFX System.Runtime tests have been run on these changes.

It is common to have lists that may have multiple items but typically have only one. Running a String.Join on them will needlessly create a new string.
One example is http header parsing.
https://github.com/aspnet/HttpAbstractions/blob/dev/src/Microsoft.AspNet.Http.Extensions/ParsingHelpers.cs#L537
Some headers support multi values but will usually only have one. In the AspNet implementation all headers are in an IEnumerable to simplify the interfaces.

This PR is mainly about reducing allocations but there are some immediate performance benefits to go along with it. Perf tests done with 1M iterations.

String[] Join
value[stringToJoinIndex] in a local variable to avoid double array lookup.

IEnumerable<String> Join
String.Empty. StringBuilder.Append handles nulls and is the first check in that function. So passing null is actually more desirable.

IEnumerable<String> Join when values is an IList<String>
IList interface for Count and index property access. Same optimisation as many of the Linq methods. Early exit for when Count is 0 and 1, avoids new string allocation.

IEnumerable<String> Join when values is not an IList<String> (used Queue<String> for testing)
en.Current. As above, Append handles nulls and this removes the double property access and branch for what should be a rarer case.

Edit: IList<string> path is no longer included in this PR