Faster Span#16222
Conversation
| // Check will be Jitted out for ValueTypes | ||
| if (SpanHelpers.IsReference<T>() && array.GetType() != typeof(T[])) | ||
| ThrowHelper.ThrowArrayTypeMismatchException_ArrayTypeMustBeExactMatch(typeof(T)); | ||
| if ((uint)start > (uint)array.Length || (uint)length > (uint)(array.Length - start)) |
There was a problem hiding this comment.
Does changing ((uint)start > (uint)array.Length || (uint)length > (uint)(array.Length - start)) to (start | length) < 0 || array.Length - start < length) really help? What is the code before/after?
There was a problem hiding this comment.
No, does a little extra 😢
Before
G_M5589_IG02:
4D85C0 test r8, r8
0F8400000000 je G_M5589_IG05
458B5008 mov r10d, dword ptr [r8+8]
453BD1 cmp r10d, r9d
0F8200000000 jb G_M5589_IG05
452BD1 sub r10d, r9d
443BD0 cmp r10d, eax
0F8200000000 jb G_M5589_IG05After
G_M5589_IG02:
4D85C0 test r8, r8
0F8400000000 je G_M5589_IG05
458BD1 mov r10d, r9d
440BD0 or r10d, eax
4585D2 test r10d, r10d
0F8C00000000 jl G_M5589_IG05
458B5008 mov r10d, dword ptr [r8+8]
452BD1 sub r10d, r9d
443BD0 cmp r10d, eax
0F8C00000000 jl G_M5589_IG05 Will revert
There was a problem hiding this comment.
Though I did notice something strange about the rest of the coreclr method https://github.com/dotnet/coreclr/issues/9633
There was a problem hiding this comment.
With this change reverted, is the crux of the change here changing default(T)== null to
if (SpanHelpers.IsReference<T>() && array.GetType() != typeof(T[])) ?
There was a problem hiding this comment.
No, its merging two exception throws into one and reducing the arguments passed
There was a problem hiding this comment.
Is cutting the function calls; and exit points, that are brought in with the inline
|
|
||
| public static bool IsReferenceCore(Type type) | ||
| { | ||
| if (type.GetTypeInfo().IsPrimitive) // This is hopefully the common case. All types that return true for this are value types w/out embedded references. |
There was a problem hiding this comment.
I doubt that this is saving anything. It can just return type.GetTypeInfo().IsValueType.
| // | ||
| public static readonly bool IsReferenceOrContainsReferences = IsReferenceOrContainsReferencesCore(typeof(T)); | ||
|
|
||
| public static readonly bool IsReference = IsReferenceCore(typeof(T)); |
There was a problem hiding this comment.
Nit - why not call it IsValueType to make same as the reflection API?
public static readonly bool IsValueType = typeof(T).GetTypeInfo().IsValueType;
|
@ahsonkhan Once this goes in, we will want to port the applicable changes to the fast span in CoreCLR as well. |
| if (length == 0) | ||
| return; | ||
|
|
||
| // Branch will be Jit elminated |
|
@dotnet-bot test this please (Feeds live again) |
| @@ -105,72 +105,67 @@ public static bool SequenceEqual(ref byte first, ref byte second, int length) | |||
| int index = 0; | |||
There was a problem hiding this comment.
As the param is by ref ref byte first is it safe to use AsPointer?
e.g. Unsafe.AsPointer(ref first);
There was a problem hiding this comment.
You would have to pin it first. Unsafe.AsPointer is safe to use only if you know that the ref is pinned.
There was a problem hiding this comment.
I must admit I expected a compile error from fixed (byte* pSearchSpace = &searchSpace) but it works 😄
aaa5f6b to
186d3e6
Compare
|
Vectorized |
Pre
<details>
```asm
; Total bytes of code 38, prolog size 4 for method SpanExtensions:SequenceEqual(struct,struct):bool
; ============================================================
Successfully inlined Unsafe:AreSame(byref,byref):bool (5 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Add(byref,int):byref (12 IL bytes) (depth 1) [aggressive inline attribute]
**************** Inline Tree
Inlines into 06000018 SpanHelpers:SequenceEqual(byref,byref,int):bool
[1 IL=0002 TR=000003 06000019] [aggressive inline attribute] Unsafe:AreSame(byref,byref):bool
[2 IL=0020 TR=000026 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[3 IL=0028 TR=000034 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[4 IL=0044 TR=000056 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[5 IL=0052 TR=000064 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[6 IL=0068 TR=000086 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[7 IL=0076 TR=000094 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[8 IL=0092 TR=000116 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[9 IL=0100 TR=000124 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[10 IL=0116 TR=000146 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[11 IL=0124 TR=000154 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[12 IL=0140 TR=000176 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[13 IL=0148 TR=000184 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[14 IL=0164 TR=000206 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[15 IL=0172 TR=000214 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[16 IL=0188 TR=000236 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[17 IL=0196 TR=000244 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[18 IL=0226 TR=000312 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[19 IL=0234 TR=000320 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[20 IL=0250 TR=000342 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[21 IL=0258 TR=000350 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[22 IL=0274 TR=000372 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[23 IL=0282 TR=000380 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[24 IL=0298 TR=000402 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[25 IL=0306 TR=000410 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[26 IL=0333 TR=000462 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
[27 IL=0341 TR=000470 06000012] [aggressive inline attribute] Unsafe:Add(byref,int):byref
Budget: initialTime=1158, finalTime=1414, initialBudget=11580, currentBudget=11840
Budget: increased by 260 because of force inlines
Budget: initialSize=8476, finalSize=8476
; Assembly listing for method SpanHelpers:SequenceEqual(byref,byref,int):bool
; Emitting BLENDED_CODE for X64 CPU with SSE2
; optimized code
; rsp based frame
; fully interruptible
; Final local variable assignments
;
; V00 arg0 [V00,T01] ( 16, 55 ) byref -> rcx
; V01 arg1 [V01,T02] ( 16, 55 ) byref -> rdx
; V02 arg2 [V02,T03] ( 14, 39.5) int -> r8
; V03 loc0 [V03,T00] ( 53, 208.5) int -> r9
; V04 tmp0 [V04,T05] ( 2, 16 ) int -> rax
; V05 tmp1 [V05,T06] ( 2, 16 ) int -> rax
; V06 tmp2 [V06,T07] ( 2, 16 ) int -> rax
; V07 tmp3 [V07,T08] ( 2, 16 ) int -> rax
; V08 tmp4 [V08,T09] ( 2, 16 ) int -> rax
; V09 tmp5 [V09,T10] ( 2, 16 ) int -> rax
; V10 tmp6 [V10,T11] ( 2, 16 ) int -> rax
; V11 tmp7 [V11,T12] ( 2, 16 ) int -> rax
; V12 tmp8 [V12,T13] ( 2, 16 ) int -> rax
; V13 tmp9 [V13,T14] ( 2, 16 ) int -> rax
; V14 tmp10 [V14,T15] ( 2, 16 ) int -> rax
; V15 tmp11 [V15,T16] ( 2, 16 ) int -> rax
; V16 tmp12 [V16,T17] ( 2, 16 ) int -> rax
; V17 tmp13 [V17,T04] ( 16, 17 ) int -> rax
;# V18 OutArgs [V18 ] ( 1, 1 ) lclBlk ( 0) [rsp+0x00]
;
; Lcl frame size = 0
G_M26133_IG01:
G_M26133_IG02:
483BCA cmp rcx, rdx
750A jne SHORT G_M26133_IG03
B801000000 mov eax, 1
E9D9010000 jmp G_M26133_IG23
G_M26133_IG03:
4533C9 xor r9d, r9d
4183F808 cmp r8d, 8
0F8C0E010000 jl G_M26133_IG13
G_M26133_IG04:
4963C1 movsxd rax, r9d
0FB60401 movzx rax, byte ptr [rcx+rax]
4D63D1 movsxd r10, r9d
460FB61412 movzx r10, byte ptr [rdx+r10]
443BD0 cmp r10d, eax
7409 je SHORT G_M26133_IG05
33C0 xor eax, eax
E9B1010000 jmp G_M26133_IG23
G_M26133_IG05:
41FFC1 inc r9d
4963C1 movsxd rax, r9d
0FB60401 movzx rax, byte ptr [rcx+rax]
4D63D1 movsxd r10, r9d
460FB61412 movzx r10, byte ptr [rdx+r10]
443BD0 cmp r10d, eax
7409 je SHORT G_M26133_IG06
33C0 xor eax, eax
E991010000 jmp G_M26133_IG23
G_M26133_IG06:
41FFC1 inc r9d
4963C1 movsxd rax, r9d
0FB60401 movzx rax, byte ptr [rcx+rax]
4D63D1 movsxd r10, r9d
460FB61412 movzx r10, byte ptr [rdx+r10]
443BD0 cmp r10d, eax
7409 je SHORT G_M26133_IG07
33C0 xor eax, eax
E971010000 jmp G_M26133_IG23
G_M26133_IG07:
41FFC1 inc r9d
4963C1 movsxd rax, r9d
0FB60401 movzx rax, byte ptr [rcx+rax]
4D63D1 movsxd r10, r9d
460FB61412 movzx r10, byte ptr [rdx+r10]
443BD0 cmp r10d, eax
7409 je SHORT G_M26133_IG08
33C0 xor eax, eax
E951010000 jmp G_M26133_IG23
G_M26133_IG08:
41FFC1 inc r9d
4963C1 movsxd rax, r9d
0FB60401 movzx rax, byte ptr [rcx+rax]
4D63D1 movsxd r10, r9d
460FB61412 movzx r10, byte ptr [rdx+r10]
443BD0 cmp r10d, eax
7409 je SHORT G_M26133_IG09
33C0 xor eax, eax
E931010000 jmp G_M26133_IG23
G_M26133_IG09:
41FFC1 inc r9d
4963C1 movsxd rax, r9d
0FB60401 movzx rax, byte ptr [rcx+rax]
4D63D1 movsxd r10, r9d
460FB61412 movzx r10, byte ptr [rdx+r10]
443BD0 cmp r10d, eax
7409 je SHORT G_M26133_IG10
33C0 xor eax, eax
E911010000 jmp G_M26133_IG23
G_M26133_IG10:
41FFC1 inc r9d
4963C1 movsxd rax, r9d
0FB60401 movzx rax, byte ptr [rcx+rax]
4D63D1 movsxd r10, r9d
460FB61412 movzx r10, byte ptr [rdx+r10]
443BD0 cmp r10d, eax
7409 je SHORT G_M26133_IG11
33C0 xor eax, eax
E9F1000000 jmp G_M26133_IG23
G_M26133_IG11:
41FFC1 inc r9d
4963C1 movsxd rax, r9d
0FB60401 movzx rax, byte ptr [rcx+rax]
4D63D1 movsxd r10, r9d
460FB61412 movzx r10, byte ptr [rdx+r10]
443BD0 cmp r10d, eax
7409 je SHORT G_M26133_IG12
33C0 xor eax, eax
E9D1000000 jmp G_M26133_IG23
G_M26133_IG12:
41FFC1 inc r9d
4183C0F8 add r8d, -8
4183F808 cmp r8d, 8
0F8D02FFFFFF jge G_M26133_IG04
G_M26133_IG13:
4183F804 cmp r8d, 4
0F8C85000000 jl G_M26133_IG19
G_M26133_IG14:
4963C1 movsxd rax, r9d
0FB60401 movzx rax, byte ptr [rcx+rax]
4D63D1 movsxd r10, r9d
460FB61412 movzx r10, byte ptr [rdx+r10]
443BD0 cmp r10d, eax
7409 je SHORT G_M26133_IG15
33C0 xor eax, eax
E999000000 jmp G_M26133_IG23
G_M26133_IG15:
41FFC1 inc r9d
4963C1 movsxd rax, r9d
0FB60401 movzx rax, byte ptr [rcx+rax]
4D63D1 movsxd r10, r9d
460FB61412 movzx r10, byte ptr [rdx+r10]
443BD0 cmp r10d, eax
7406 je SHORT G_M26133_IG16
33C0 xor eax, eax
EB7C jmp SHORT G_M26133_IG23
G_M26133_IG16:
41FFC1 inc r9d
4963C1 movsxd rax, r9d
0FB60401 movzx rax, byte ptr [rcx+rax]
4D63D1 movsxd r10, r9d
460FB61412 movzx r10, byte ptr [rdx+r10]
443BD0 cmp r10d, eax
7406 je SHORT G_M26133_IG17
33C0 xor eax, eax
EB5F jmp SHORT G_M26133_IG23
G_M26133_IG17:
41FFC1 inc r9d
4963C1 movsxd rax, r9d
0FB60401 movzx rax, byte ptr [rcx+rax]
4D63D1 movsxd r10, r9d
460FB61412 movzx r10, byte ptr [rdx+r10]
443BD0 cmp r10d, eax
7406 je SHORT G_M26133_IG18
33C0 xor eax, eax
EB42 jmp SHORT G_M26133_IG23
G_M26133_IG18:
41FFC1 inc r9d
4183C0FC add r8d, -4
4183F804 cmp r8d, 4
7D87 jge SHORT G_M26133_IG14
G_M26133_IG19:
4585C0 test r8d, r8d
7E25 jle SHORT G_M26133_IG22
G_M26133_IG20:
4963C1 movsxd rax, r9d
0FB60401 movzx rax, byte ptr [rcx+rax]
4D63D1 movsxd r10, r9d
460FB61412 movzx r10, byte ptr [rdx+r10]
443BD0 cmp r10d, eax
7406 je SHORT G_M26133_IG21
33C0 xor eax, eax
EB12 jmp SHORT G_M26133_IG23
G_M26133_IG21:
41FFC1 inc r9d
41FFC8 dec r8d
4585C0 test r8d, r8d
7FDD jg SHORT G_M26133_IG20
G_M26133_IG22:
B801000000 mov eax, 1
G_M26133_IG23:
0FB6C0 movzx rax, al
G_M26133_IG24:
C3 ret
; Total bytes of code 462, prolog size 0 for method SpanHelpers:SequenceEqual(byref,byref,int):bool
```
</details>
Post
<details>
```asm
Successfully inlined Unsafe:Read(long):struct (7 IL bytes) (depth 1) [aggressive inline attribute]
Successfully inlined Unsafe:Read(long):struct (7 IL bytes) (depth 1) [aggressive inline attribute]
**************** Inline Tree
Inlines into 06000018 SpanHelpers:SequenceEqual(byref,byref,int):bool
[1 IL=0043 TR=000071 06000001] [aggressive inline attribute] Unsafe:Read(long):struct
[2 IL=0055 TR=000083 06000001] [aggressive inline attribute] Unsafe:Read(long):struct
Budget: initialTime=714, finalTime=714, initialBudget=7140, currentBudget=7140
Budget: initialSize=5101, finalSize=5101
; Assembly listing for method SpanHelpers:SequenceEqual(byref,byref,int):bool
; Emitting BLENDED_CODE for X64 CPU with SSE2
; optimized code
; rsp based frame
; fully interruptible
; Final local variable assignments
;
; V00 arg0 [V00,T10] ( 4, 3 ) byref -> rcx
; V01 arg1 [V01,T11] ( 4, 3 ) byref -> rdx
; V02 arg2 [V02,T09] ( 11, 7 ) int -> r8
; V03 loc0 [V03,T12] ( 7, 4.5) bool -> rax
; V04 loc1 [V04 ] ( 2, 1 ) byref -> [rsp+0x08] must-init pinned
; V05 loc2 [V05 ] ( 2, 1 ) byref -> [rsp+0x00] must-init pinned
; V06 loc3 [V06,T03] ( 10, 22.5) long -> rcx
; V07 loc4 [V07,T04] ( 10, 22.5) long -> rdx
; V08 loc5 [V08,T00] ( 31, 64.5) int -> r9
; V09 loc6 [V09,T05] ( 3, 12 ) simd16 -> mm0 ld-addr-op
; V10 loc7 [V10,T06] ( 3, 12 ) simd16 -> mm1
;* V11 tmp0 [V11 ] ( 0, 0 ) long -> zero-ref
;* V12 tmp1 [V12 ] ( 0, 0 ) long -> zero-ref
; V13 tmp2 [V13,T13] ( 2, 2 ) long -> rcx
; V14 tmp3 [V14,T14] ( 2, 2 ) long -> rdx
;# V15 OutArgs [V15 ] ( 1, 1 ) lclBlk ( 0) [rsp+0x00]
; V16 cse0 [V16,T07] ( 8, 11 ) int -> r10
; V17 cse1 [V17,T08] ( 6, 10 ) int -> r10
; V18 cse2 [V18,T02] ( 6, 24 ) long -> r11
; V19 cse3 [V19,T01] ( 8, 32 ) long -> r11
;
; Lcl frame size = 16
G_M26108_IG01:
56 push rsi
4883EC10 sub rsp, 16
33C0 xor rax, rax
4889442408 mov qword ptr [rsp+08H], rax
48890424 mov qword ptr [rsp], rax
G_M26108_IG02:
B801000000 mov eax, 1
4585C0 test r8d, r8d
0F84E2000000 je G_M26108_IG14
48894C2408 mov bword ptr [rsp+08H], rcx
48891424 mov bword ptr [rsp], rdx
483BCA cmp rcx, rdx
0F84C5000000 je G_M26108_IG13
4533C9 xor r9d, r9d
458D50F0 lea r10d, [r8-16]
4585D2 test r10d, r10d
7C33 jl SHORT G_M26108_IG05
G_M26108_IG03:
4D63D9 movsxd r11, r9d
420F100419 movups xmm0, xmmword ptr [rcx+r11]
420F100C1A movups xmm1, xmmword ptr [rdx+r11]
4183C110 add r9d, 16
0F28D0 movaps xmm2, xmm0
660F76D1 pcmpeqd xmm2, xmm1
66440FD7DA pmovmskb r11d, xmm2
4181FBFFFF0000 cmp r11d, 0xFFFF
7408 je SHORT G_M26108_IG04
33C0 xor eax, eax
E98C000000 jmp G_M26108_IG13
G_M26108_IG04:
453BD1 cmp r10d, r9d
7DCE jge SHORT G_M26108_IG03
G_M26108_IG05:
458D50F8 lea r10d, [r8-8]
453BD1 cmp r10d, r9d
7C1B jl SHORT G_M26108_IG08
G_M26108_IG06:
4D63D9 movsxd r11, r9d
4A8B3419 mov rsi, qword ptr [rcx+r11]
4A3B341A cmp rsi, qword ptr [rdx+r11]
7405 je SHORT G_M26108_IG07
33C0 xor eax, eax
EB6C jmp SHORT G_M26108_IG13
G_M26108_IG07:
4183C108 add r9d, 8
453BD1 cmp r10d, r9d
7DE6 jge SHORT G_M26108_IG06
G_M26108_IG08:
458D50FC lea r10d, [r8-4]
453BD1 cmp r10d, r9d
7C1A jl SHORT G_M26108_IG10
4D63D1 movsxd r10, r9d
468B1411 mov r10d, dword ptr [rcx+r10]
4D63D9 movsxd r11, r9d
463B141A cmp r10d, dword ptr [rdx+r11]
7406 je SHORT G_M26108_IG09
33C0 xor eax, eax
EB45 jmp SHORT G_M26108_IG13
G_M26108_IG09:
4183C104 add r9d, 4
G_M26108_IG10:
458D50FE lea r10d, [r8-2]
453BD1 cmp r10d, r9d
7C1C jl SHORT G_M26108_IG12
4D63D1 movsxd r10, r9d
4E0FBF1411 movsx r10, word ptr [rcx+r10]
4D63D9 movsxd r11, r9d
66463B141A cmp r10w, word ptr [rdx+r11]
7406 je SHORT G_M26108_IG11
33C0 xor eax, eax
EB20 jmp SHORT G_M26108_IG13
G_M26108_IG11:
4183C102 add r9d, 2
G_M26108_IG12:
453BC1 cmp r8d, r9d
7E15 jle SHORT G_M26108_IG13
4D63C1 movsxd r8, r9d
420FB60C01 movzx rcx, byte ptr [rcx+r8]
4D63C1 movsxd r8, r9d
423A0C02 cmp cl, byte ptr [rdx+r8]
7404 je SHORT G_M26108_IG13
33C0 xor eax, eax
G_M26108_IG13:
33D2 xor rdx, rdx
48891424 mov bword ptr [rsp], rdx
4889542408 mov bword ptr [rsp+08H], rdx
G_M26108_IG14:
4883C410 add rsp, 16
5E pop rsi
C3 ret
; Total bytes of code 254, prolog size 16 for method SpanHelpers:SequenceEqual(byref,byref,int):bool
```
</details>
|
Can't seem to get the project references right for |
|
I think you will want to avoid any netstandard references in netcoreapp build. Related to #16244 (comment) |
| private static bool IsReferenceOrContainsReferencesCore(Type type) | ||
| { | ||
| if (type.GetTypeInfo().IsPrimitive) // This is hopefully the common case. All types that return true for this are value types w/out embedded references. | ||
| if (type.IsPrimitive) // This is hopefully the common case. All types that return true for this are value types w/out embedded references. |
There was a problem hiding this comment.
This will not build in the netstandard1.0 build configuration
| [MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
| public static int IndexOf(this Span<byte> span, byte value) | ||
| { | ||
| if (!BitConverter.IsLittleEndian) |
There was a problem hiding this comment.
Why do we care about endianness here?
There was a problem hiding this comment.
The locate byte doesn't work on big-endian as it highlights the wrong byte (due to order reversal) aspnet/KestrelHttpServer#1138 (comment); don't know if its worth the effort to come up with something that does work on big-endian though /cc @halter73
There was a problem hiding this comment.
Ah, makes sense. Do you know how much this branch adds (if any) to "early" searches? i.e. search that finds the byte at index 0.
There was a problem hiding this comment.
It may suffer from a static base initalization check if its first use; and not do branch elimination. Subsequent uses (in different function) will branch eliminate.
@jkotas would there me any mileage in adding a use of BitConverter.IsLittleEndian to something like CommonlyUsedGenericInstantiations as they are runtime fixed and mainly used for branch elimination; so the check is pre-paid?
There was a problem hiding this comment.
Already issue for it https://github.com/dotnet/coreclr/issues/9701 serendipity 😺
There was a problem hiding this comment.
I have added comment to dotnet/coreclr#9701 on what would be needed to fix it.
There was a problem hiding this comment.
👍 For keeping a big-endian path working. It would be nice if we could test on big-endian architectures.
In Kestrel, we declared big-endian bankruptcy a while ago, so Kestrel now throws on startup if it detects that it's running on a big-endian machine. I don't think that's really an option for corefx APIs.
| var a = pFirst; | ||
| var b = pSecond; | ||
|
|
||
| if (a == b) |
There was a problem hiding this comment.
nit: we call values related to the first parameter: first, a, and v0. It would be better if the names were all related, e.b. first, pFirst, cFirst ("cursor to first"), and vFirst ("value from first").
| if (Unsafe.Add(ref first, index) != Unsafe.Add(ref second, index)) | ||
| return false; | ||
| index++; | ||
| fixed (byte* pHaystack = &searchSpace) |
There was a problem hiding this comment.
Could we call it pSearchSpace?
| [MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
| private static Vector<byte> GetVector(byte vectorByte) | ||
| { | ||
| #if !NETCOREAPP1_2 |
There was a problem hiding this comment.
How will that work with System.Memory package? i.e. do we now need three different dlls in the package: one for below 1.2, one for 1.2, and one for 2.0?
cc: @ahsonkhan
There was a problem hiding this comment.
I'm assuming NETCOREAPP1_2 would be defined for NETCOREAPP2_0? Or could set it to NETCOREAPP2_0. The difference between the two is a small improvement; so it doesn't matter hugley if it uses the other path; however new Vector<byte>(...) where the jit doesn't have the fix is quite a major regression.
There was a problem hiding this comment.
I'm assuming NETCOREAPP1_2 would be defined for NETCOREAPP2_0
It should be netcoreapp (without version). Define naming convention in https://github.com/dotnet/corefx/blob/master/Documentation/coding-guidelines/project-guidelines.md
There was a problem hiding this comment.
its only valid with the coreclr jit > 1.2 though?
There was a problem hiding this comment.
Right, netcoreapp means latest .NET Core
|
|
||
| if (a == b) | ||
| { | ||
| goto exitFixed; |
There was a problem hiding this comment.
Why can't we just return isMatch here instead of having a goto?
There was a problem hiding this comment.
Bloats the asm by inserting all the undo fixed asm and stack pops in place, for a rare case (both terms are the identical memory location); goto keeps it nice and tight. Using goto out of fixed equally does a lot of code bloat; which is why two labels. I see this as being a very hot method so trying to minimise instructions.
Will post before and after asm.
There was a problem hiding this comment.
Jit converts every thing to gotos anyway; just helping it along 😉
|
@benaadams @davidfowl @KrzysztofCwalina @halter73 Do we want IndexOfAny(byte b1,..) as part of this PR? We would need to duplicate this code in Kestrel/Pipelines if we won't have it in core. |
|
@pakrym I'm doing some experiments in corefxlab; might be better to churn there for a bit; then bring forward changes to corefx? |
|
@benaadams what are your plans with this PR here? Did you move your work to corefxlab? |
|
Think the best bits have returned to corefx by a circular route :) |
|
Thanks @benaadams! |
Slim the span hot methods
.ctor+Sliceas they are very hot and always inlinedVectorized
IndexOffor byte andSequenceEqualfor byte and char/cc @jkotas