Cranelift: add option to use new single-pass register allocator. #9611
cfallin merged 2 commits into bytecodealliance:main
Should this selection be automatic with
In that case we should probably get an opt level between none and speed that uses the better regalloc but keeps egraph optimizations disabled, given that the better regalloc improves runtime performance significantly more than the egraph optimizations do.
alexcrichton
left a comment
🎉 nice!
I'm ambivalent myself on the defaults for O0 and could go either way.
For now at least, I think I'd prefer to keep it opt-in; let's let it bake in Wasmtime's continuous fuzzing for a little longer. We can always switch the default later. @alexcrichton updated to add cargo-vet, could you rubber-stamp the new commit? Also fixed silly issues in the fuzz build (which I never test beforehand because OCaml; I should fix my setup!).
alexcrichton
left a comment
Looks good!
For fuzzing, you can also build the fuzzers with `--no-default-features` to turn off the OCaml integration.
In bytecodealliance/regalloc2#181, @d-sonuga added a fast single-pass algorithm option to regalloc2, in addition to its existing backtracking allocator. This produces code much more quickly, at the expense of code quality. Sometimes this tradeoff is desirable (e.g. when performing a debug build in a fast-iteration development situation, or in an initial JIT tier).
This PR adds a Cranelift option to select the RA2 algorithm, plumbs it through to a Wasmtime option, and adds the option to Wasmtime fuzzing as well.
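As a rough illustration of that kind of plumbing, the sketch below models an algorithm-selection option as an enum parsed from a string setting and stored in an embedder-level config. Every name in it (`RegallocChoice`, `CompilerSettings`, `set_regalloc`, and the `"backtracking"` / `"single_pass"` strings) is a hypothetical stand-in chosen for this example; none of them is claimed to be the identifier this PR adds to Cranelift or Wasmtime.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for a register-allocator selection option.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Default)]
pub enum RegallocChoice {
    /// Backtracking allocator: slower to compile, better generated code.
    #[default]
    Backtracking,
    /// Single-pass allocator: faster to compile, lower code quality.
    SinglePass,
}

impl FromStr for RegallocChoice {
    type Err = String;

    /// Parse the (assumed) textual form of the setting.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "backtracking" => Ok(RegallocChoice::Backtracking),
            "single_pass" => Ok(RegallocChoice::SinglePass),
            other => Err(format!("unknown regalloc choice: {other}")),
        }
    }
}

/// Hypothetical embedder-level settings that forward the choice
/// down to the compiler. The default keeps the existing allocator,
/// so the single-pass algorithm stays opt-in.
#[derive(Debug, Default)]
pub struct CompilerSettings {
    pub regalloc: RegallocChoice,
}

impl CompilerSettings {
    /// Builder-style setter, so calls can be chained.
    pub fn set_regalloc(&mut self, choice: RegallocChoice) -> &mut Self {
        self.regalloc = choice;
        self
    }
}
```

In this sketch an embedder would write something like `settings.set_regalloc("single_pass".parse()?)`, and a fuzzer could pick either variant at random to exercise both allocators.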
An initial compile-time measurement in Wasmtime: `spidermonkey.wasm` builds in 1.383s with backtracking (existing algorithm), and 1.065s with single-pass. The resulting binary runs a simple Fibonacci benchmark in 2.060s with backtracking vs. 3.455s with single-pass. Hence, the single-pass algorithm yields a 23% compile-time reduction, at the cost of a 67% runtime increase.
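The percentages follow directly from the four timings quoted above. A minimal sketch of the arithmetic, where `percent_change` and the constant names are helpers made up for this example:

```rust
/// Relative change from `baseline` to `new`, in percent.
/// Negative values are reductions; positive values are increases.
pub fn percent_change(baseline: f64, new: f64) -> f64 {
    (new - baseline) / baseline * 100.0
}

// Timings in seconds, copied from the measurement above.
pub const COMPILE_BACKTRACKING: f64 = 1.383;
pub const COMPILE_SINGLE_PASS: f64 = 1.065;
pub const RUN_BACKTRACKING: f64 = 2.060;
pub const RUN_SINGLE_PASS: f64 = 3.455;

/// Compile-time change when switching to single-pass: about -23.0%.
pub fn compile_time_change() -> f64 {
    percent_change(COMPILE_BACKTRACKING, COMPILE_SINGLE_PASS)
}

/// Runtime change when switching to single-pass: about +67.7%.
pub fn runtime_change() -> f64 {
    percent_change(RUN_BACKTRACKING, RUN_SINGLE_PASS)
}
```

The runtime figure works out to roughly 67.7%, which the description states as 67%.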
Fixes #9596.