Add early-stage optimization crate#556
Conversation
lib/codegen/src/timing.rs
Outdated
| gvn: "Global value numbering", | ||
| licm: "Loop invariant code motion", | ||
| unreachable_code: "Remove unreachable blocks", | ||
| // constant_folding: "Fold constant expressions", |
There was a problem hiding this comment.
Uncommenting this to add constant_folding pass timing will not compile. The array generated by the macro is currently 32 items in length, and a static slice of 33 items does not implement Default. When const generics land, this will be fixed, but not sure what to do about it before then.
There was a problem hiding this comment.
Huh. Not sure what to do yet either.
There was a problem hiding this comment.
Figured it out. Just had to coerce them to an dynamically sized slice.
c8c3dce to
f60cc91
Compare
lib/codegen/src/constant_folding.rs
Outdated
| while let Some(_ebb) = pos.next_ebb() { | ||
| while let Some(inst) = pos.next_inst() { | ||
| use ir::instructions::Opcode::*; | ||
| match pos.func.dfg[inst].opcode() { |
There was a problem hiding this comment.
An organization that I think would be advantageous would be to match on the InstructionFormatInstructionData, which is pos.func.dfg[inst], rather than pulling out the opcode right away. Then there will be one match arm for all ir::InstructionData::Binary opcodes, and then we can have one fold_numerical_binary call and pass it the opcode. This seems appealing because it seems like it could let us keep the code for manipulating the IR separate from the folding arithmetic logic. If the ultimate folding function for binary operators could be a function which takes an opcode and two i64's and returns an i64 (or maybe an Option because trapping operators), that'd (a) make it really easy to write unit tests for the folding arithmetic, and (b) make it possible to reuse the folding arithmetic for... other fun things in the future, like IR interpreters, or auto-folding InstBuilders :-).
There was a problem hiding this comment.
Oh boy, didn't even think about an auto-folding InstBuilder. That would be super cool, and would probably remove most of the overhead of a constant folding pass.
lib/codegen/src/context.rs
Outdated
|
|
||
| if isa.flags().enable_constant_folding() { | ||
| self.fold_constants(isa)?; | ||
| } |
There was a problem hiding this comment.
A concern that I have is that this may increase the size of the compiler for users that aren't using this. I don't know how big the constant folding pass will get, but if we're going to be getting into optimizing unoptimized code, we may be adding other things besides.
Since this is running at the beginning of compile, what if we moved it out into a separate top-level optimize function on Context? Users that know they have unoptimized code could call optimize before compile, and for users that don't call optimize, it could get DCE'd.
There was a problem hiding this comment.
Okay, that sounds good to me. That would entail removing the constant_folding setting as well, right?
lib/codegen/src/timing.rs
Outdated
| gvn: "Global value numbering", | ||
| licm: "Loop invariant code motion", | ||
| unreachable_code: "Remove unreachable blocks", | ||
| // constant_folding: "Fold constant expressions", |
There was a problem hiding this comment.
Huh. Not sure what to do yet either.
filetests/folding/branch.clif
Outdated
| ; nextln: ebb2: | ||
| ; nextln: v2 = iconst.i32 24 | ||
| ; nextln: return v2 | ||
| ; nextln: } No newline at end of file |
There was a problem hiding this comment.
Oh, whoops, didn't realize that was necessary.
| ; nextln: v1 = iconst.i32 1 | ||
| ; nextln: v2 = iconst.i32 41 | ||
| ; nextln: return v2 | ||
| ; nextln: } No newline at end of file |
6679266 to
f0781fe
Compare
f0781fe to
13e52f9
Compare
lib/codegen/src/constant_folding.rs
Outdated
| opcode: F32const, | ||
| imm, | ||
| } => { | ||
| let imm_as_f32 = f32::from_bits(imm.bits()); // see https://doc.rust-lang.org/std/primitive.f32.html#method.from_bits for caveats |
There was a problem hiding this comment.
Could you put the comment before this line. This line is a bit long.
There was a problem hiding this comment.
This got replaced by software floats.
| let imm0 = Wrapping(imm0.unwrap_i64()); | ||
| let imm1 = Wrapping(imm1.unwrap_i64()); | ||
| if imm1.0 == 0 { | ||
| panic!("Cannot divide by a zero.") |
There was a problem hiding this comment.
This can happen in valid code, so we should just return None here.
There was a problem hiding this comment.
Doesn't the verifier already deny division by a 0 immediate?
There was a problem hiding this comment.
That's an open question, however this patch also has code that resolves Values defined by iconst, which is entirely valid.
There was a problem hiding this comment.
Oh, I thought that had gotten merged. I'll make it return None.
| }, | ||
| // ir::Opcode::Fadd => Some(imm0.unwrap) | ||
| _ => None, | ||
| } |
There was a problem hiding this comment.
Yes, this looks like a good overall direction. To further separate the arithmetic from the IR walking, we can pull just this match statement out into a separate function in a separate file.
There was a problem hiding this comment.
Would it make sense to resolve cases where one of the arguments is const and the other isn't into an *_imm instruction?
There was a problem hiding this comment.
preopt does this. So it's not something we need here right away. However if this constant folding pass starts growing into a more general optimization framework, it may want to start doing that kind of thing too, at which point we should think about how to organize things.
lib/codegen/src/lib.rs
Outdated
| #[macro_use] | ||
| extern crate log; | ||
|
|
||
| extern crate rustc_apfloat; |
There was a problem hiding this comment.
rustc_apfloat uses the u128 type, which is not stable on rust 1.25.0, which cranelift attempts to support.
76d36d3 to
5316812
Compare
5316812 to
34f2b84
Compare
|
I'm going to remove the soft float crate from this pr until we can get it to build on rust 1.25.0. |
sunfishcode
left a comment
There was a problem hiding this comment.
This is starting to take form! Thanks for your patience with the rustc_apfloat detour. Here's an initial round of comments.
lib/filetests/src/lib.rs
Outdated
| "shrink" => test_shrink::subtest(parsed), | ||
| "simple-gvn" => test_simple_gvn::subtest(parsed), | ||
| "verifier" => test_verifier::subtest(parsed), | ||
| "optimize" => test_optimize::subtest(parsed), |
There was a problem hiding this comment.
Can we name these "preopt" and "test_preopt" so that it's clear that they refer to cranelift-preopt?
There was a problem hiding this comment.
There's already a preopt filetest. What should I do with that?
There was a problem hiding this comment.
Rename it to "simple_preopt", to match the new name of the thing it tests :-).
lib/codegen/src/timing.rs
Outdated
|
|
||
| /// Accumulated timing for all passes. | ||
| #[derive(Default)] | ||
| // #[derive(Default)] |
There was a problem hiding this comment.
If this is no longer needed, let's just remove it.
lib/preopt/src/constant_folding.rs
Outdated
| // _ => return, | ||
| // } | ||
| // } | ||
| // } |
There was a problem hiding this comment.
What's the status of this commented-out code?
There was a problem hiding this comment.
Sorry, that was old code that I forgot to completely remove.
lib/preopt/src/constant_folding.rs
Outdated
| (ConstImm::Ieee32(imm0), ConstImm::Ieee32(imm1)) => { | ||
| if imm1 == 0.0 { | ||
| return None; | ||
| } |
There was a problem hiding this comment.
Here and in the Ieee64 case below, floating-point division by zero is fully defined, so we don't need to special-case it.
lib/preopt/src/constant_folding.rs
Outdated
| let (imm0, imm1) = ( | ||
| resolve_value_to_imm(dfg, args[0])?, | ||
| resolve_value_to_imm(dfg, args[1])?, | ||
| ); |
There was a problem hiding this comment.
Also, I'd really like to separate knowledge of the dfg from the logic of arithmetic folding. Can you split the match below into a separate function? That should also make it easy to add unit tests to ensure that we handle things like division by zero (see comments below).
lib/preopt/src/constant_folding.rs
Outdated
| opcode: F32const, | ||
| imm, | ||
| } => { | ||
| // see https://doc.rust-lang.org/std/primitive.f32.html#method.from_bits for caveats |
There was a problem hiding this comment.
Please captalize and emperiod comments which are complete sentences.
There was a problem hiding this comment.
Oh, sure. Sorry 'bout that.
lib/preopt/src/constant_folding.rs
Outdated
| opcode: F64const, | ||
| imm, | ||
| } => { | ||
| // see https://doc.rust-lang.org/std/primitive.f32.html#method.from_bits for caveats |
|
BTW, #569 has now landed, so the Rust 1.25 restriction is now lifted! |
|
This looks good! Let's land this now, and we can continue to iterate from there. |
This pr adds a simple constant folding pass. Don't merge this yet, it's a work in progress and not near completion.
The tracking issue is #554.