Less StringValue struct copies + bounds check elimination#2347
Less StringValue struct copies + bounds check elimination#2347benaadams wants to merge 6 commits into
Conversation
| protected override bool AddValueFast(string key, StringValues value) | ||
| {{{(loop.ClassName == "HttpResponseHeaders" ? @" | ||
| ValidateHeaderCharacters(value);" : "")} | ||
| var isNotNull = value.Count > 0; |
There was a problem hiding this comment.
Is this the fastest way to check for emptiness?
There was a problem hiding this comment.
I think so 😢
int Count => _value != null ? 1 : (_values?.Length ?? 0);There is a pass by value static method; but looks like that does more?
public static bool IsNullOrEmpty(StringValues value)
{
if (value._values == null)
{
return string.IsNullOrEmpty(value._value);
}
switch (value._values.Length)
{
case 0: return true;
case 1: return string.IsNullOrEmpty(value._values[0]);
default: return false;
}
}There was a problem hiding this comment.
We should optimize this. Looks very branchy 😄
There was a problem hiding this comment.
public bool IsNull()
{
var values = _values;
return (_value != null) ?
false :
((values == null || values.Length == 0) ?
true :
false);
}? 😟
There was a problem hiding this comment.
|
/cc @pakrym @mikeharder @Tratcher to do another pass |
| ThrowHeadersReadOnlyException(); | ||
| } | ||
| SetValueFast(key, value); | ||
| if (value.Count == 0) |
There was a problem hiding this comment.
No because it doesn't output null headers; however it means the bit flags can be trusted. Currently if you set the header to a null value it will set the bit flag saying it has a value; rather than changing it to say it doesn't have a value
There was a problem hiding this comment.
null string/Empty.StringValues/empty array
There was a problem hiding this comment.
Corner case would be if you set it to an array of null strings; then it would say it had a value with bit flag. However that's same as now (also it needs to know to clear it if its that).
However if you add nulls; StringValues coalesces them to nothing so you'd explicitly need to use the new StringValues(string[]) constructor
There was a problem hiding this comment.
With dotnet/extensions#323 this changes to
value.IsNull
There was a problem hiding this comment.
We should wait for this before merging.
| } | ||
|
|
||
| [MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
| private void EnsureHostHeaderExists() |
There was a problem hiding this comment.
Skip this method, I'm working on some changes here that will conflict.
There was a problem hiding this comment.
It's not that simple, my changes will negate these.
There was a problem hiding this comment.
Is your change going to be done in the next few days?
There was a problem hiding this comment.
It's not that simple, my changes will negate these.
Just overwrite it? Isn't a conflict (except to git) if you entirely replace it; and it give you a perf base line to measure against 😉
Want me to drop this change?
|
NuGet unhappy |
|
Yeaaa shit is broken right now |
c2d1e68 to
eb173c3
Compare
|
Rebased |
eb173c3 to
b46607f
Compare
|
Need to validate some of my tail call Jit assertions in second commit |
|
@halter73 assigning this to you. |
|
|
ResponseHeadersWritingBenchmark Pre Post Http1ConnectionBenchmark Pre Post |
|
That's a nice boost |
| { | ||
| Assert.True(HttpUtilities.IsValidHostHeader(host)); | ||
| HttpUtilities.ValidateHostHeader(host); | ||
| Assert.True(true); |
There was a problem hiding this comment.
Assert.True(true)... 😄 xunit used to have Assert.DoesNotThrow 😄
There was a problem hiding this comment.
Wasn't sure whether to just put nothing in as if it threw it would catch it as an error; but that felt wrong 😄
There was a problem hiding this comment.
I'd remove it and just add a comment saying this shouldn't throw.
|
Did a bounds checking dance Http1ConnectionBenchmark Pre Post |
|
This change seems to have a very positive impact on the plaintext benchmarks. |
| .MalformedRequestInvalidHeaders); | ||
| BadHttpRequestException.Throw(RequestRejectionReason.MalformedRequestInvalidHeaders); | ||
| } | ||
| throw; |
There was a problem hiding this comment.
This seems suspicious. Why is this ok? We always want to throw even in the else case?
There was a problem hiding this comment.
Aha! Yes, will add back
9fdf691 to
80553a9
Compare
| protected override bool AddValueFast(string key, in StringValues value) | ||
| {{{(loop.ClassName == "HttpResponseHeaders" ? @" | ||
| ValidateHeaderCharacters(value);" : "")} | ||
| var isNotNull = value.Count > 0; |
There was a problem hiding this comment.
With dotnet/extensions#323 this changes to
!value.IsNull
| { | ||
| if (!host.Equals(RawTarget)) | ||
| // Tail call | ||
| ValidateNonOrginHostHeader(hostText); |
There was a problem hiding this comment.
Cool. Do we know for sure that the tail call is getting eliminated by the JIT? Or the C# compiler itself? If not, the comment could be a little misleading.
| {header.SetBit()}; | ||
| _headers._{header.Identifier} = value;{(header.EnhancedSetter == false ? "" : $@" | ||
| _headers._raw{header.Identifier} = null;")} | ||
| }} |
There was a problem hiding this comment.
This seems like a (small) breaking change to me. Prior to this, the following would throw:
context.Response.Headers.Add("Warning", new StringValues(value: null));
context.Response.Headers.Add("Warning", new StringValues(value: null));Now it doesn't. I don't mind making this small of a breaking change, but it's not my call. @davidfowl What do you think?
If we are willing to change behavior though, we might as well just skip calling AddValueFast and completely no-op when value.IsNull instead of wasting time checking for collisions at all.
There was a problem hiding this comment.
Good point; changed.
@davidfowl on technical break?
A null value is a bit of a weird one as it won't output any header for a null value, so all it does it block another value being added but doesn't change output.
Also concatenating null StringValues will coalesce null values to be a single null with a count of 0.
e.g.
context.Response.Headers.Add("Warning", new StringValues(value: null));
context.Response.Headers["Warning"].Count == 0;There was a problem hiding this comment.
Raised issue for it https://github.com/aspnet/KestrelHttpServer/issues/2402
| // Subtract start of range '0' | ||
| // Cast to uint to change negative numbers to large numbers | ||
| // Check if less than 10 representing chars '0' - '9' | ||
| return (uint)(ch - '0') < 10u; |
There was a problem hiding this comment.
Do we know that a subtraction then a comparison is faster than two comparisons? I ask because the earlier version certainly does read better which I guess is why you left it there as a comment. I have more or less the same question about the IsHex change.
I know you've posted some nice results from the ResponseHeadersWritingBenchmark and the Http1ConnectionBenchmark, but I'm more interested in the performance of these methods in isolation.
There was a problem hiding this comment.
Its an extra branch prediction/mis-prediction per character; rather than the extra comparison that's the issue; cpu can only run with so many branches in parallel before you hit a pipeline bubble: Control_hazards_(branch_hazards)
There was a problem hiding this comment.
How do you know changing '0' <= ch && ch <= '9' to (uint)(ch - '0') < 10u improves the branch prediction?
There was a problem hiding this comment.
No branch vs 1 branch?
There was a problem hiding this comment.
Makes sense... I'd feel much more warm and fuzzy about it if there were some BenchmarkDotNet results to back up this theory.
There was a problem hiding this comment.
Have returned to my command center of my big screen monitors, so should be able to rummage something up.
| for (var i = offset + 1; i < hostText.Length; i++) | ||
| // This do+if check rather than for loop is to elimitate the bounds check, since | ||
| // the Jit doesn't currently pick up on it when starting at a variable offset | ||
| do |
There was a problem hiding this comment.
I don't think transforming simple for loops to complicated do/while loops is worthwhile when we can just wait for the JIT to handle this better.
There was a problem hiding this comment.
Might be waiting a long time https://github.com/dotnet/coreclr/issues/15723#issuecomment-357152332
it's probably going to take a while, the rangecheck phase is a bit of a mess and it is difficult and risky to extend it to handle new patterns.
The issue here is its a range check per character; and when looking at the valid characters array that's another range check per character
| break; | ||
| var ch = (int)hostText[i]; | ||
| // Bounds check and elimiate second bounds check | ||
| if ((uint)ch >= (uint)hostCharValidity.Length || !hostCharValidity[ch]) |
There was a problem hiding this comment.
How is this faster than IsValidHostCharf? Does the second bounds check not get eliminated when ch is a char?
There was a problem hiding this comment.
No, was generating the asm while changing these and refining them.
Perhaps char isn't treated as an unsigned value (e.g. short as ushort isn't cls complaint) so ch < HostCharValidity.Length could pass with a negative char?
There was a problem hiding this comment.
Also using a class level array HostCharValidity rather than a local array hostCharValidity can stop bounds checks being eliminated; and they recently made elimination more conservative dotnet/coreclr#15756
| { | ||
| if (offset == hostText.Length) | ||
| // Skip bounds check for accessing the [offset] element | ||
| if ((uint)offset >= (uint)hostText.Length) |
There was a problem hiding this comment.
So the bounds check get repeated unless you cat offset to a uint? If so, that seems crazy. It seemed odd when the offset started as a char and it even seems more odd now that the offset is the int.
There was a problem hiding this comment.
You have to check its not less than zero and also not greater than or equal to length; so two tests. Which is one test with the uint. Also I think if you do it in two tests, it doesn't work - outside of for loops with a constant start.
| if (string.IsNullOrEmpty(hostText)) | ||
| // This is a string.IsNullOrEmpty test, but arranged to elmininate the | ||
| // bounds check from accessing the firstChar of the string | ||
| if (hostText is null || 0u >= (uint)hostText.Length) |
There was a problem hiding this comment.
This is a breaking change if you look at ValidateHostHeader in isolation since now string full of nothing but whitespace could be rejected. I think this is safe though since we trim whitespace from all request header values.
There was a problem hiding this comment.
Yea, is stripped by header parsing
There was a problem hiding this comment.
@benaadams Does this faster than if (hostText == null || hostText.Length == 0)? If so, why? If not, why write it like this?
There was a problem hiding this comment.
Still not sure about is vs ==, but I remember something about the bounds check elimination only happening with uint comparisons. It seems to me the JIT should be able to eliminate the bounds check given the more conventional if condition. I'd just wait for the JIT to fix this rather than leave these unconventional conditions around. I mean if we did leave them around, how long we should continue cargo culting the pattern?
There was a problem hiding this comment.
Is kinda funny looking... thinking about why
There was a problem hiding this comment.
Could check the asm, it might; range check elimination misses obvious stuff though
There was a problem hiding this comment.
I went on sharplab.io to try some variations on this, and I have to admit that the conventional version is worse than I thought:
public static bool StartsWithBracket(string hostText)
{
if (hostText == null || 0u >= (uint)hostText.Length)
{
return false;
}
var firstChar = hostText[0];
if (firstChar == '[')
{
return true;
}
return false;
}
; Desktop CLR v4.7.2563.00 (clr.dll) on amd64.
Program.StartsWithBracket(System.String)
L0000: test rcx, rcx
L0003: jz L000c
L0005: mov eax, [rcx+0x8]
L0008: test eax, eax
L000a: jnz L000f
L000c: xor eax, eax
L000e: ret
L000f: cmp word [rcx+0xc], 0x5b
L0014: jnz L001c
L0016: mov eax, 0x1
L001b: ret
L001c: xor eax, eax
L001e: ret
public static bool StartsWithBracket(string hostText)
{
if (hostText == null || hostText.Length == 0)
{
return false;
}
var firstChar = hostText[0];
if (firstChar == '[')
{
return true;
}
return false;
}
; Desktop CLR v4.7.2563.00 (clr.dll) on amd64.
Program.StartsWithBracket(System.String)
L0000: sub rsp, 0x28
L0004: test rcx, rcx
L0007: jz L0010
L0009: mov edx, [rcx+0x8]
L000c: test edx, edx
L000e: jnz L0017
L0010: xor eax, eax
L0012: add rsp, 0x28
L0016: ret
L0017: cmp edx, 0x0
L001a: jbe L0034
L001c: cmp word [rcx+0xc], 0x5b
L0021: jnz L002d
L0023: mov eax, 0x1
L0028: add rsp, 0x28
L002c: ret
L002d: xor eax, eax
L002f: add rsp, 0x28
L0033: ret
L0034: call 0x7ffaff4223c0
L0039: int3
public static bool StartsWithBracket(string hostText)
{
if (string.IsNullOrEmpty(hostText))
{
return false;
}
var firstChar = hostText[0];
if (firstChar == '[')
{
return true;
}
return false;
}
; Desktop CLR v4.7.2563.00 (clr.dll) on amd64.
Program.StartsWithBracket(System.String)
L0000: sub rsp, 0x28
L0004: test rcx, rcx
L0007: jz L0017
L0009: cmp dword [rcx+0x8], 0x0
L000d: setz al
L0010: movzx eax, al
L0013: test eax, eax
L0015: jz L001e
L0017: xor eax, eax
L0019: add rsp, 0x28
L001d: ret
L001e: cmp dword [rcx+0x8], 0x0
L0022: jbe L003c
L0024: cmp word [rcx+0xc], 0x5b
L0029: jnz L0035
L002b: mov eax, 0x1
L0030: add rsp, 0x28
L0034: ret
L0035: xor eax, eax
L0037: add rsp, 0x28
L003b: ret
L003c: call 0x7ffaff4223c0
L0041: int3
Now I'm wondering if we should just make this optimization in string.IsNullOrEmpty if it's going to take a while for the JIT to optimize this.
There was a problem hiding this comment.
Now I'm wondering if we should just make this optimization in
string.IsNullOrEmptyif it's going to take a while for the JIT to optimize this.
coreclr is pretty shut down to any enhancements atm until next week; but at that point any additions will be to 2.2, so a 6 month wait?
There was a problem hiding this comment.
Also you need both sides to be uint as a uint to int comparison will first upcast them to long; then do a long >= long comparision :-/
| ThrowHeadersReadOnlyException(); | ||
| } | ||
| SetValueFast(key, value); | ||
| if (value.Count == 0) |
There was a problem hiding this comment.
We should wait for this before merging.
| } | ||
|
|
||
| // The lead '[' was already checked | ||
| [MethodImpl(MethodImplOptions.AggressiveInlining)] |
There was a problem hiding this comment.
Are you removing this attribute from all the methods you expect to be TCO'd?
There was a problem hiding this comment.
Ish, making sure they are tail called (easy way, by not returning anything, no funky params and last action in function)
But mostly its not inlining where there is no overall fast-path. So origin form is fully inlined, as a the char checks; but what's the right choice for the others? Is ipv6 via raw ip address a common scenerio? Is ports a common scenerio? (might be due to https on 443; but isn't for http on 80)
|
@davidfowl and I are interested in the relative performance of this change without the more esoteric parts of it that were added to avoid extra bounds checks by the JIT. |
| // Subtract start of range 'a' | ||
| // Cast to uint to change negative numbers to large numbers | ||
| // Check if less than 6 representing chars 'a' - 'f' | ||
| || (uint)((ch | 32) - 'a') < 6u; |
There was a problem hiding this comment.
This is reducing 4 branches to 1 branch
Http1ConnectionBenchmark Baseline Changes (without bounds checking removal) Changes (with bounds checking removal) |
|
@dotnet-bot test OSX 10.12 Release Build |
d54ad0d to
cc7cbcc
Compare
|
Added If I update deps |
|
@benaadams rebase |
cc7cbcc to
bff8296
Compare
Good point :) |
|
Is there a way to magic perf test this PR? |
bff8296 to
b932a4f
Compare
|
Backed out the |
BeforeAfter |
| { | ||
| if (!_absoluteRequestTarget.IsDefaultPort | ||
| || host != _absoluteRequestTarget.Authority + ":" + _absoluteRequestTarget.Port.ToString(CultureInfo.InvariantCulture)) | ||
| || hostText != _absoluteRequestTarget.Authority + ":" + _absoluteRequestTarget.Port.ToString(CultureInfo.InvariantCulture)) |
There was a problem hiding this comment.
@Tratcher This has nothing to do with this PR, but if the problem here is "System.Uri doesn't not tell us if the port was in the original string or not", why not just compare hostText to RawTarget instead of doing this complicated condition?
There was a problem hiding this comment.
HostText isn't expected to be the full value of RawTarget, only part of it.
| for (; i < hostText.Length; i++) | ||
| { | ||
| if (!IsValidHostChar(hostText[i])) | ||
| // Enregister array |
There was a problem hiding this comment.
Is enregister a technical term for changing a class reference to a local reference or something?
There was a problem hiding this comment.
Make sure it gets stored in a register
There was a problem hiding this comment.
Compiler term, put variable in register. I hadn't heard it before I strayed too close to the Jit
| private static bool IsValidHostChar(char ch) | ||
| { | ||
| return ch < HostCharValidity.Length && HostCharValidity[ch]; | ||
| if (i < hostText.Length) |
There was a problem hiding this comment.
Isn't this condition getting repeated in ValidateHostPort? Why check twice?
There was a problem hiding this comment.
Could drop the check from ValidateHostPort; but then its just going to introduce a bounds check and check it anyway, so might as well check it there for safety?
There was a problem hiding this comment.
ah, is while loop there anyway so needs the check for the loop. Not sure how to get rid of one of them?
|
@halter73 any specific changes you would like? I've got a bit lost in the github display of comments |
|
@benaadams If you can update this PR or submit a new one without the bounds check related changes, I would like to merge it for 2.1. |
Asked @jkotas if the |
Something more like this #2488 |
|
Closing this PR since we merged #2488 which is basically this without the bounds check elimination tricks. |


Less StringValue struct copies for header existence checks
nulladds set bitflags in headers (as won't be output anyway)nullsets in headersLazy calc transfer coding
Fast-path EnsureHostHeaderExists
Skip lots of bounds checks through various methods (by helping the Jit recognize preconditions)
Update to #2014