TL;DR: what would it take to deprecate the “old” backend in favor of the “new” backend for code generation?
Context
The compilation pipeline in Cranelift currently does instruction selection (through legalizations) before optimizing the intermediate representation (IR), applying register allocation on it, and then generating the machine code. From the point of view of Cranelift, these last steps can be seen as a “backend” that generates machine code for different target architectures.
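The pipeline described above can be sketched roughly as follows. This is an illustrative sketch only: the stage names, types, and “encoding” are placeholders I made up, not Cranelift’s actual API.

```rust
// Rough sketch of the pipeline described above. All names and types are
// illustrative placeholders, not Cranelift's actual API.

#[derive(Debug, Clone)]
struct ClifFunction {
    insts: Vec<&'static str>, // stand-in for CLIF instructions
}

#[derive(Debug)]
struct MachineCode {
    bytes: Vec<u8>,
}

// Instruction selection via legalizations: rewrite CLIF into encodable CLIF.
fn legalize(f: ClifFunction) -> ClifFunction {
    f
}

// Optimize the (already legalized) IR.
fn optimize(f: ClifFunction) -> ClifFunction {
    f
}

// The "backend" part: register allocation plus machine-code emission.
fn regalloc_and_emit(f: &ClifFunction) -> MachineCode {
    // Dummy one-byte encoding per instruction.
    MachineCode { bytes: vec![0x90; f.insts.len()] }
}

fn compile(f: ClifFunction) -> MachineCode {
    let f = legalize(f); // instruction selection happens first...
    let f = optimize(f); // ...then IR optimization...
    regalloc_and_emit(&f) // ...then regalloc and emission.
}

fn main() {
    let f = ClifFunction { insts: vec!["iadd", "return"] };
    println!("emitted {} bytes", compile(f).bytes.len());
}
```

The key point the sketch captures is the ordering: selection happens before optimization, and everything from register allocation onward is what the rest of this issue calls the “backend”.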
The previous backend was a bit complicated to work with: it used the DSL from the codegen/meta crate, with concepts that are hard to approach and explain (like Recipes); it generated Rust code that could get out of sync with the non-meta crate or contain compile errors; etc. (see also #1141). A decision was made to work on a new backend (sometimes referred to as the “machinst” backend). This was presented in #1174 and has since landed as an alternative backend (viz., in addition to the existing one).
As of today, the old backend supports generating machine code for (some subset of) RISC-V and for 64-bit and 32-bit x86. The new backend supports generating machine code for aarch64, and has a work-in-progress backend for x86_64. The duplication of x86_64 across the old and new backends implies that both must move ahead in parallel. This makes it harder for the new backend to catch up with the old one as new features are added, and it can generate frustration as different teams with different priorities work on different backends.
The Mozilla SpiderMonkey team has enough confidence in the new backend: we consider it pleasant to work with (developer ergonomics), fast enough for our use case (both in compile time and generated-code throughput), and it has the potential for more compile-time and code-quality optimizations in the long run. So we think it is a good time to start this discussion.
The proposal
We propose that, at some point in the future, we move away entirely from the “old” backend (that is, remove it, along with all the associated code in the meta language) and use the “new” backend for all target architectures, so that the only way to implement a new target is through the new backend. Notably, since x86 is the main target of the old backend, this means removing the old x86 backend.
Of course, this can’t be done until all the primary stakeholders are satisfied with this idea and have no strong objections to moving forward. This RFC is a first step toward identifying what the acceptance criteria would be to make the transition possible, and what a realistic plan would look like.
What this is not about: removing the entire meta language. This may or may not be done in the future (if we want to do it, then moving over to the new backend is a first step).
Acceptance criteria
Note that these criteria are not definite and could evolve over time, based on our discussions here.
- Features: the new backend should support all the features actually used by the stakeholders, including all the WebAssembly (wasm) features that have been implemented so far in the old backend.
- Target-independent features:
- support wasm MVP features + lightweight extensions (mutable globals, bulk memory ops)
- support wasm reftypes
- support wasm multi-value
- debugging support for generated code.
- implement enough of x86_64 to support these features.
- Performance: since the new backend comes with its own instruction selection and a new register allocator, its performance characteristics are likely to differ from those of the old backend.
- compile time: the new backend should compile code as fast as or faster than the old backend, for a set of wasm benchmarks (to be determined).
- generated code quality: the new backend should generate code that runs at least as fast as the code generated by the old backend, for a set of wasm benchmarks (to be determined).
- Security and quality:
- CLIF testing: pass all the existing CLIF tests
- Major stakeholders/embedders pass tests
- fuzzing should run for some time and fuzz bugs should be fixed
- Stakeholders supported:
- Wasmtime testing: pass all the existing wasmtime tests using Cranelift as the compiler
- SpiderMonkey testing: pass all the existing SpiderMonkey tests using Cranelift as the compiler
- Other Bytecode Alliance stakeholders give their “go” (see below).
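To make the two performance criteria concrete, a comparison harness could time both backends over the same benchmark inputs. This is a hypothetical sketch: `compile_old` and `compile_new` below are made-up placeholders standing in for the two backends’ real entry points, not actual Cranelift functions.

```rust
use std::time::{Duration, Instant};

// Placeholders for the two backends' compile entry points; the real
// harness would invoke Cranelift with the appropriate backend selected.
fn compile_old(wasm: &[u8]) -> usize {
    wasm.iter().map(|b| *b as usize).sum()
}
fn compile_new(wasm: &[u8]) -> usize {
    wasm.iter().rev().map(|b| *b as usize).sum()
}

// Time `iters` runs of a compile function, preventing the optimizer
// from eliding the work with `black_box`.
fn time_it<F: Fn() -> usize>(f: F, iters: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        std::hint::black_box(f());
    }
    start.elapsed()
}

fn main() {
    let module = vec![0u8; 4096]; // stand-in for a real wasm benchmark
    let old = time_it(|| compile_old(&module), 100);
    let new = time_it(|| compile_new(&module), 100);
    println!("old backend: {:?}, new backend: {:?}", old, new);
}
```

A real harness would of course also compare the throughput of the *generated* code (the second criterion), not just compile time, and would run over the agreed-upon benchmark set.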
Feel free to comment about other things that are important to you, and please explain why (if it is not obvious)! Good criteria tend to be objectively quantifiable, measurable and/or bimodal (done or not done).
Potential additions to this list
These are additions to the above list, and need to be discussed as a group:
- enough support to not break cg_clif, a Rust backend initiative that uses Cranelift for code generation. It is hard to estimate the amount of work required to keep cg_clif working; we hope that most of it will be covered by our own work, and the rest could be a community-supported effort.
- porting the 32-bit x86 platform. While most of the code could be shared between x86_64 and 32-bit x86, it may not be a primary target right now, and we may or may not want to block the transition on it.
Proposed planning
Step 1: agree on the proposal
This is the current step, being carried out as part of this issue. See below.
Step 2: get to a point where we can try the new backend in real-world settings
Once we can compile large wasm programs that mostly use wasm MVP features, we’ll be able to do a performance analysis comparing the two axes presented above. This will tell us how fast we can move forward with this plan, or whether we should revisit some implementation decisions and chase more performance first.
Step 3: finish implementation of remaining features
This means implementing all the Features mentioned in the above list of criteria, as well as passing tests from all the test suites. At this point, we could put up an official deprecation notice for the old backend and encourage people to use the new backend in general.
Step 4: do a final approval and switch
Based on an evaluation of performance, as well as feedback from the different stakeholders, we can eventually decide to enable the new backend by default. Removal of the code supporting the old backend may or may not happen at the same time; deferring its removal for a short period allows switching the default back to the old backend in case of unexpected consequences.
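The deferred-removal idea above amounts to keeping both code paths behind a setting whose default gets flipped, with the old path kept as an escape hatch. A minimal sketch of that shape follows; the `Backend`/`Settings` names and string outputs are illustrative, not Cranelift’s actual configuration API.

```rust
// Hypothetical sketch of step 4: both backends stay selectable behind a
// setting; only the *default* flips, so the old backend remains an
// escape hatch for a short grace period. Names are illustrative.

#[derive(Clone, Copy, PartialEq, Debug)]
enum Backend {
    Old,
    New,
}

struct Settings {
    backend: Backend,
}

impl Default for Settings {
    fn default() -> Self {
        // Step 4 flips this default from `Old` to `New`; removing the
        // `Old` variant entirely is deferred until after the grace period.
        Settings { backend: Backend::New }
    }
}

fn compile(settings: &Settings, clif: &str) -> String {
    match settings.backend {
        Backend::Old => format!("old-backend({clif})"),
        Backend::New => format!("new-backend({clif})"),
    }
}

fn main() {
    // Default path uses the new backend...
    assert_eq!(compile(&Settings::default(), "f"), "new-backend(f)");
    // ...but embedders can still opt back into the old one if needed.
    let fallback = Settings { backend: Backend::Old };
    assert_eq!(compile(&fallback, "f"), "old-backend(f)");
    println!("ok");
}
```

Once the grace period ends without regressions, the `Old` arm (and everything behind it) can be deleted outright.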
Future work
There is future work that is going to be enabled by switching to the new backend. At this point, these are mostly ideas, and it is not the point of this issue to discuss the design / feasibility / interest aspects of these ideas.
- Code removal in the meta language as well as in the codegen crate may lower the overall build time of Cranelift, see also https://github.com/bytecodealliance/cranelift/issues/1318 which shows that large functions in the encodings/recipes system take some time to compile (and they generate large functions too).
- After removal of the old backend, since instruction selection really happens at the MachInst IR (Vcode) level, all the CLIF instructions that existed solely to be available in the backend can be removed. This includes target-specific CLIF instructions (e.g. x86_udivmodx), as well as instructions that offer alternative operand modes (e.g. iadd_imm is an alternate operand mode for iadd, expressing an “int add with immediate” as two different CLIF instructions, which makes pattern matching more complex).
- Translating from wasm to target-independent Vcode directly (and then adapting the lowering machinery to use this) is something we would like to investigate. On an even longer horizon, we could get back to having a single IR container (parameterized by instruction/opcode space) and carry optimizations over to it, while avoiding some pitfalls of the current CLIF design (such as the performance impact of in-place editing).
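The pattern-matching cost of alternate operand modes mentioned above can be illustrated with a toy example: a lowering rule that wants to recognize “value + constant” must match both the dedicated immediate form and the plain add whose operand is a constant. The enum below is a simplified stand-in of my own, not Cranelift’s real IR types.

```rust
// Toy illustration of why alternate operand modes complicate pattern
// matching. `Inst` is a made-up stand-in for CLIF instructions; operand
// `usize`s are indices of other instructions in the same slice.

#[derive(Debug, Clone, Copy, PartialEq)]
enum Inst {
    Iconst(i64),
    Iadd(usize, usize),  // plain add of two values
    IaddImm(usize, i64), // same operation, alternate operand mode
}

/// Try to recognize "value + constant" at `idx`, returning the value's
/// index and the constant. Note that *two* patterns are needed for what
/// is semantically one operation.
fn match_add_const(insts: &[Inst], idx: usize) -> Option<(usize, i64)> {
    match insts[idx] {
        // Mode 1: the dedicated immediate form.
        Inst::IaddImm(x, k) => Some((x, k)),
        // Mode 2: a plain add whose right operand is a constant --
        // a second pattern for the very same operation.
        Inst::Iadd(x, y) => match insts[y] {
            Inst::Iconst(k) => Some((x, k)),
            _ => None,
        },
        _ => None,
    }
}

fn main() {
    let insts = [
        Inst::Iconst(1),     // 0: some value
        Inst::Iconst(7),     // 1: the constant 7
        Inst::Iadd(0, 1),    // 2: v0 + 7, spelled with iadd + iconst
        Inst::IaddImm(0, 7), // 3: v0 + 7, spelled with iadd_imm
    ];
    // Both spellings denote the same thing and both must be matched.
    assert_eq!(match_add_const(&insts, 2), Some((0, 7)));
    assert_eq!(match_add_const(&insts, 3), Some((0, 7)));
    println!("both operand modes matched");
}
```

With only one canonical form (as in the Vcode-level lowering), the second match arm disappears, and every such rule gets simpler.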
Thoughts?
If you have any comments, questions, alternative proposals, or objections, please feel free to write them down here. Note that we’re looking for consensus, which is reached not when everybody agrees on every detail, but when nobody has strong objections anymore. So please discuss objections carefully and assume good intent from everyone involved in the process :-) Thanks!