Slightly increase throughput of string.Concat(object[])#6547
Conversation
|
Oh and here is the code I was using to benchmark... I both collected its console output, and ran it through PerfView. Here is an example run (though it's somewhat shaky): On some runs all of the experimental values are lower (if only slightly) than the baseline ones. |
How many arguments do you need to pass to Concat for this particular change to generate a measurable improvement? Same question for 32 bit architectures where
Probably that's a good change but the chances that someone uses this particular |
|
I do not think that the slight improvement is worth the extra code. This should be taken care of by optimizing this in the JIT (#6537). |
|
@mikedn I am aware
Nullable returns I agree it's pretty unlikely that all of the ToStrings will be null/empty as well, but in the contrived case where it actually does happen the cost is so low (one |
|
@jkotas What are the chances that could happen soon (optimizing in the JIT)? Also it may be useful to add the |
It depends on if somebody volunteers to implement it.
I am pretty sure that you can easily find hundreds of places in coreclr or corefx where you can manually apply this optimization. That's why it is better done in the JIT .... it does not make sense to add harder to understand and maintain code everywhere for slight performance gains just because of JIT does not do an optimization today. |
I'm only proposing to do this for codepaths that are called frequently; of course I wouldn't want to litter lots of places where perf doesn't really matter with
As @mikedn mentioned in the other thread it's going to be very hard / next to impossible to do this for non-sealed types where we can't see the allocation of the array. For example, we almost certainly won't be able to do this for At any rate, I guess I'll remove it from this PR for now. If any scenario such as the one I just mentioned comes up later, then I may re-add it. I'll limit this PR to the minor changes @mikedn commented on + the formatting changes. edit: OK, I've reduced the changes in String to just using a |
I don't think symmetry with other code is a good argument. It's different code, written and reviewed at a different time by different people. For example, the loop in 4559 is much simpler and has a high chance to be evaluated using only registers on x86. But the loop you have here is more complex and there's a higher chance for a variable to be spilled to memory. Then you're not only adding a single
I don't understand this. The cost is the same no matter how big the array is.
I didn't say it's very hard/next to impossible. It's problematic because the simplest solution isn't currently an option as it requires verification to be enabled. There are other solutions that could optimize this particular code, they're slightly more complicated but not hard/impossible.
It would be useful to have this in |
|
In addition to likely being slightly slower on 32-bit platforms, the change to use The part of the change that removes the redundant array access is a clear improvement because of it makes the code both smaller and faster. |
There was a problem hiding this comment.
This may not be worth the comment. It is a well-known fact that you get best performance in for-loops over arrays when the limit is length of the array that is being iterated on, and not some other derived value.
I was referring specifically to the case of
Agreed, e.g. @jkotas Oh wow, good catch, that was a pretty subtle compat issue (although it's likely never going to happen). I've updated the pull request to the old behavior of checking each iteration. |
|
LGTM. Thanks! |
|
@dotnet-bot test Windows_NT x64 Release Priority 1 Build and Test please |
Previously, the method was using a regular string array to store the intermediary ToStrings of each of the objects. This had the effect that each write to the array was causing a covariant type-check to ensure the string can actually be written, even though it always can be since
stringis sealed (see #6537 for more on this). I've changed the intermediary array to be aRefAsValueType<string>instead (copied from System.Collections.Immutable which employs the same trick), which circumvents these checks.Other changes/notes
totalLengthLongand checked if it was greater thanint.MaxValueat the end, analagous to what was done in Avoid defensive copy in String.Concat(string[]) #4559, instead of checking for overflow at each iteration of the loop.string.Emptyif all of the ToStrings are null/empty.Internal/directory in mscorlib due to Add EnvironmentAugments to coreclr #6205, so I put theRefAsValueTypeutility type in there instead of theSystem/namespace. Since it's such a huge assembly, I think it's best to separate the internal types from the publicly exposed types.Performance impact
Although this change was kind of hard to benchmark (presumably to the GC allocations and all the other stuff going on in the method), in general there seems to be a speedup of about 5-6%.
FillStringCheckedseems to be a larger bottleneck in this method, so it might be worth looking into applying AggressiveInlining there later.(note: I initially posted that the speedup was ~50% in #6537, but that was because I was using an earlier, slightly different implementation that avoided calling
FillStringCheckedfor nulls, and testing that with lots of nulls. That wasn't accurate.)cc @jkotas @bbowyersmyth @mikedn