[wasm] Add Vector.Normalize measurement #81237
Conversation
|
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label. |
|
The emitted code:
> wa-info -d -f Vector.*Normalize.*Run dotnet.wasm
(func Wasm_Browser_Bench_Sample_Sample_VectorTask_Normalize_RunStep(param $0 i32, $1 i32))
local $2 v128
local.get $0
i32.eqz
if
call mini_llvmonly_throw_nullref_exception
unreachable
local.get $0
local.get $0
v128.load offset:32 align:2 [SIMD]
local.tee $2
local.get $2
local.get $2
f32x4.mul [SIMD]
local.tee $2
local.get $2
v128.const 0x0b0a09080f0e0d0c0302010007060504 [SIMD]
i8x16.swizzle [SIMD]
f32x4.add [SIMD]
local.tee $2
local.get $2
v128.const 0x07060504030201000f0e0d0c0b0a0908 [SIMD]
i8x16.swizzle [SIMD]
f32x4.add [SIMD]
f32x4.extract.lane 0 [SIMD]
f32.sqrt
f32x4.splat [SIMD]
f32x4.div [SIMD]
v128.store offset:16 [SIMD] |
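To make the listing easier to follow, here is a minimal Python sketch that replays the same steps on a four-lane vector. It assumes wa-info prints each `v128.const` with the most significant byte first, so that the two swizzle masks act as the lane permutations [1, 0, 3, 2] and [2, 3, 0, 1]. The helper names (`f32`, `swizzle_lanes`, `normalize_like_wasm`) are made up for illustration and are not part of the runtime or wa-info.

```python
import math
import struct


def f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def swizzle_lanes(v, order):
    """Lane-level stand-in for i8x16.swizzle when the mask moves whole lanes."""
    return [v[i] for i in order]


def normalize_like_wasm(v):
    # f32x4.mul: square each lane
    sq = [f32(x * x) for x in v]
    # first swizzle + f32x4.add: swap neighbouring lanes, then add
    # (assumed lane order [1, 0, 3, 2], see the note above)
    s1 = [f32(a + b) for a, b in zip(sq, swizzle_lanes(sq, [1, 0, 3, 2]))]
    # second swizzle + f32x4.add: swap the two halves, then add
    # (assumed lane order [2, 3, 0, 1]); every lane now holds the full sum
    s2 = [f32(a + b) for a, b in zip(s1, swizzle_lanes(s1, [2, 3, 0, 1]))]
    # f32x4.extract_lane 0, f32.sqrt, f32x4.splat
    length = f32(math.sqrt(s2[0]))
    # f32x4.div: note that Python raises ZeroDivisionError for a zero-length
    # vector, whereas the wasm instruction would produce NaN lanes
    return [f32(x / length) for x in v]


result = normalize_like_wasm([3.0, 4.0, 0.0, 0.0])
print(result)
```

For the input (3, 4, 0, 0) the printed lanes are close to (0.6, 0.8, 0, 0), since the length is 5.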
|
What's the difference between the
For the codegen, this doesn't look "quite" like I'd expect from a few perspectives. Namely, this code isn't "deterministic" like it should be and may differ in behavior based on platform/architecture:
I'd expect something that is roughly (using unique locals for each step, SSA style):
The code is "mostly" doing this, but there are a couple of edge cases where it looks like it's not quite right. |
The browser-bench is our "litmus paper" benchmark. It has less coverage than dotnet/performance, but it measures more flavors, such as interp, AOT, SIMD, wasm EH, Chrome, and Firefox. It also lives in the runtime repo, so it works most of the time and has saved us many times :-) We have a visualization of the data here: https://radekdoulik.github.io/WasmPerformanceMeasurements. Currently it runs twice a day. |
And that is exactly what it does. It does it the way you described here; the SSA-introduced variables are just already gone.
|
This looks like a clang optimization; our LLVM IR looks like this:
@radical @vargaz @lewing do you know what floating-point behavior we use by default for wasm? Should we switch to a more precise one, or at least document it somewhere? https://clang.llvm.org/docs/UsersManual.html#controlling-floating-point-behavior |
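As a generic illustration of why the floating-point mode and the order of the lane additions can matter (this is not a claim about what clang or LLVM actually does for this method), the Python sketch below compares two ways of summing four squared lanes in float32 arithmetic: the pairwise shape that two swizzle/add steps produce, and a plain left-to-right sum. The helper names and the input vector are hypothetical and chosen only to expose the rounding difference.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_pairwise(v):
    # (v0 + v1) + (v2 + v3): the shape of two swizzle/add steps
    return f32(f32(v[0] + v[1]) + f32(v[2] + v[3]))


def sum_sequential(v):
    # ((v0 + v1) + v2) + v3: a plain left-to-right sum
    return f32(f32(f32(v[0] + v[1]) + v[2]) + v[3])


# Squared lanes of the (hypothetical) input vector (4096, 1, 1, 1)
squares = [f32(x * x) for x in (4096.0, 1.0, 1.0, 1.0)]
pairwise = sum_pairwise(squares)
sequential = sum_sequential(squares)
print(pairwise, sequential, pairwise == sequential)
```

With float32 rounding, the pairwise order yields 16777218.0 and the sequential order yields 16777216.0 for this input, so any transformation that reassociates the additions can change the normalized result in the last bits.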
|
We don't change the fp behavior so whatever the llvm default is, we use that. |
Probably more here https://emscripten.org/docs/porting/simd.html |
|
As @radekdoulik mentioned, browser-bench is our attempt to measure a few key scenarios without the complications of the dotnet project system breaking dotnet/performance results for extended periods. We'd love it if you included dotnet/performance results for wasm in your performance-related PRs, but as it is we have to be reactive to regressions, and browser-bench helps us do that using personal hardware (we aren't currently running dotnet/performance with SIMD enabled on wasm for cost reasons).
Wasm as specified is generally much more deterministic than the host platform, so anything that compiles down to wasm intrinsics should be compared to the spec. On top of that, none of the operations here are particularly wasm-specific, and we depend on LLVM as a platform in these cases.
The goal in this PR is to make sure we are measuring a few simple paths after substantial regressions that were not identified in other tooling (cf. #81201 (comment)). |