Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)
Math.Min and Math.Max are currently quite slow because of the IEEE 754 corner cases they have to handle. Here is the codegen for Math.Max(double, double) (it's not inlineable):
; Max(double,double):double
G_M17496_IG01:
sub rsp, 24
vzeroupper
G_M17496_IG02:
vucomisd xmm0, xmm1
ja SHORT G_M17496_IG03
vmovsd qword ptr [rsp+10H], xmm0
mov rax, qword ptr [rsp+10H]
mov rdx, 0xD1FFAB1E
and rax, rdx
mov rdx, 0xD1FFAB1E
cmp rax, rdx
jle SHORT G_M17496_IG04
G_M17496_IG03:
add rsp, 24
ret
G_M17496_IG04:
vucomisd xmm0, xmm1
jp SHORT G_M17496_IG08
jne SHORT G_M17496_IG08
vmovsd qword ptr [rsp+08H], xmm0
cmp qword ptr [rsp+08H], 0
jl SHORT G_M17496_IG06
G_M17496_IG05:
add rsp, 24
ret
G_M17496_IG06:
vmovaps xmm0, xmm1
G_M17496_IG07:
add rsp, 24
ret
G_M17496_IG08:
vmovaps xmm0, xmm1
G_M17496_IG09:
add rsp, 24
ret
; Total bytes of code: 102
The code looks HUGE, especially compared with C/C++'s fmax (I know fmax doesn't really care about corner cases, e.g. fmax(-0.0, 0.0) => -0.0). But when one of the arguments is a constant, this codegen can be just:
vmaxsd xmm0, xmm0, C
and still be IEEE754-2019 compliant (unlike dotnet/coreclr#22965) if C is a "normal" value like 100.0 (we can re-use FloatingPointUtils::isNormal from dotnet/coreclr#24584).
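To make the argument concrete, here is a rough C sketch (not the JIT change itself) that models both sides. `reference_max` follows the branch structure of the listing above, and `maxsd_model` models the documented result of the x86 max instruction, which returns the second source operand when either input is NaN or when both are zero. All function names here are hypothetical and exist only for illustration. One caveat the sketch makes visible: for NaN to propagate, the variable rather than the constant would need to be the second source operand.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Reference semantics, following the branch structure of the listing above:
   a NaN in either argument is returned, and +0.0 compares greater than -0.0. */
static double reference_max(double a, double b) {
    if (a > b || isnan(a)) return a;
    if (a == b) return signbit(a) ? b : a;
    return b; /* a < b, or b is NaN */
}

/* Model of the (v)maxsd result: src2 is returned unless src1 is strictly
   greater, so the NaN and both-zero cases always pick src2. */
static double maxsd_model(double src1, double src2) {
    return (src1 > src2) ? src1 : src2;
}

/* Hypothetical eligibility check: finite, nonzero, not subnormal, not NaN. */
static bool constant_is_eligible(double c) {
    return isnormal(c);
}

/* Single-instruction form: constant as src1, variable as src2 (assumption:
   this ordering is what keeps a NaN in x alive). */
static double max_with_constant(double x, double c) {
    assert(constant_is_eligible(c)); /* caller must have checked the constant */
    return maxsd_model(c, x);
}

/* Equality that treats NaN as equal to NaN and tells +0.0 from -0.0. */
static bool same_value(double x, double y) {
    if (isnan(x) || isnan(y)) return isnan(x) && isnan(y);
    return x == y && (signbit(x) != 0) == (signbit(y) != 0);
}

/* Compares the single-instruction model with the reference on sample inputs. */
static bool single_instruction_matches(double c) {
    const double samples[] = { NAN, -INFINITY, -1.0, -0.0, 0.0,
                               1.0, 100.0, 250.0, INFINITY };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        double x = samples[i];
        if (!same_value(maxsd_model(c, x), reference_max(x, c))) return false;
    }
    return true;
}
```

With C = 100.0 the two agree on every sample, including NaN and both zeros. With C = 0.0 they diverge on x = -0.0, which is why the constant has to be restricted to "normal" values.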
I found plenty of usages of the Math.Max(X, C) pattern in open-source projects, e.g. AvaloniaUI and Xenko.
category:cq
theme:floating-point
skill-level:expert
cost:large