Add early-stage optimization crate by lachlansneff · Pull Request #556 · bytecodealliance/cranelift

lachlansneff · 2018-10-10T03:13:45Z

This PR adds a simple constant folding pass. Don't merge this yet; it's a work in progress and not near completion.

The tracking issue is #554.

lachlansneff · 2018-10-10T03:19:14Z

lib/codegen/src/timing.rs

    gvn: "Global value numbering",
    licm: "Loop invariant code motion",
    unreachable_code: "Remove unreachable blocks",
+    // constant_folding: "Fold constant expressions",


Uncommenting this to add constant_folding pass timing does not compile. The array generated by the macro is currently 32 items long, and a fixed-size array of 33 items does not implement Default. Const generics should fix this once they land, but I'm not sure what to do about it before then.

Huh. Not sure what to do yet either.

Figured it out. Just had to coerce them to a dynamically sized slice.
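For readers hitting the same limit, here is a minimal sketch of the problem and one workaround. `PassTime`, `PassTimes`, and `NUM_PASSES` are hypothetical stand-ins, not the real definitions in timing.rs, and this is not necessarily what the PR ended up doing: it writes `Default` by hand and borrows the array as a slice for any generic code.

```rust
use std::time::Duration;

// Hypothetical pass count: one more than the 32-element limit of
// the standard library's `Default` impl for arrays.
const NUM_PASSES: usize = 33;

#[derive(Clone, Copy, Default)]
struct PassTime {
    total: Duration,
}

// `#[derive(Default)]` would fail here, because `[PassTime; 33]`
// does not implement `Default`.
struct PassTimes {
    pass: [PassTime; NUM_PASSES],
}

impl Default for PassTimes {
    fn default() -> Self {
        // Array-repeat syntax works for any length when the element is `Copy`.
        PassTimes {
            pass: [PassTime::default(); NUM_PASSES],
        }
    }
}

impl PassTimes {
    fn total(&self) -> Duration {
        // `&self.pass[..]` is a `&[PassTime]`, so slice methods apply
        // regardless of the array length.
        self.pass[..].iter().map(|p| p.total).sum()
    }
}
```

The manual impl keeps the struct usable with `PassTimes::default()` while avoiding any dependence on per-length trait impls for arrays.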

sunfishcode · 2018-10-10T16:05:51Z

lib/codegen/src/constant_folding.rs

+    while let Some(_ebb) = pos.next_ebb() {
+        while let Some(inst) = pos.next_inst() {
+            use ir::instructions::Opcode::*;
+            match pos.func.dfg[inst].opcode() {


An organization that I think would be advantageous is to match on the ~~InstructionFormat~~ InstructionData, which is pos.func.dfg[inst], rather than pulling out the opcode right away. Then there will be one match arm for all ir::InstructionData::Binary opcodes, and we can have one fold_numerical_binary call and pass it the opcode. This is appealing because it could let us keep the code for manipulating the IR separate from the folding arithmetic logic. If the ultimate folding function for binary operators could be a function which takes an opcode and two i64s and returns an i64 (or maybe an Option, because of trapping operators), that'd (a) make it really easy to write unit tests for the folding arithmetic, and (b) make it possible to reuse the folding arithmetic for... other fun things in the future, like IR interpreters, or auto-folding InstBuilders :-).

Oh boy, didn't even think about an auto-folding InstBuilder. That would be super cool, and would probably remove most of the overhead of a constant folding pass.
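A rough sketch of the pure folding function suggested above. The `BinOp` enum is a hypothetical stand-in for the relevant ir::Opcode variants, and the sketch ignores operand width (an i32 add would need truncation), so treat it as an illustration of the shape rather than the PR's code.

```rust
// Hypothetical subset of binary opcodes; the real pass would use ir::Opcode.
#[derive(Clone, Copy, Debug, PartialEq)]
enum BinOp {
    Iadd,
    Isub,
    Imul,
    Band,
    Bor,
    Bxor,
    Sdiv,
}

/// Folds `a <op> b` without touching any IR.
/// Returns `None` when this sketch declines to fold the operation.
fn fold_numerical_binary(op: BinOp, a: i64, b: i64) -> Option<i64> {
    match op {
        // Integer arithmetic wraps on overflow.
        BinOp::Iadd => Some(a.wrapping_add(b)),
        BinOp::Isub => Some(a.wrapping_sub(b)),
        BinOp::Imul => Some(a.wrapping_mul(b)),
        BinOp::Band => Some(a & b),
        BinOp::Bor => Some(a | b),
        BinOp::Bxor => Some(a ^ b),
        // Trapping operator: left unfolded in this sketch.
        BinOp::Sdiv => None,
    }
}
```

Because the function only sees an opcode and two integers, it can be unit tested directly, and the IR-walking code only has to extract the constants and rewrite the instruction when it gets `Some`.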

sunfishcode · 2018-10-10T16:10:32Z

lib/codegen/src/context.rs


+        if isa.flags().enable_constant_folding() {
+            self.fold_constants(isa)?;
+        }


A concern that I have is that this may increase the size of the compiler for users that aren't using this. I don't know how big the constant folding pass will get, but if we're going to be getting into optimizing unoptimized code, we may be adding other things besides.

Since this is running at the beginning of compile, what if we moved it out into a separate top-level optimize function on Context? Users that know they have unoptimized code could call optimize before compile, and for users that don't call optimize, it could get DCE'd.

Okay, that sounds good to me. That would entail removing the constant_folding setting as well, right?

Yes, that's right.
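The API shape being proposed could look roughly like the following. This `Context` is a hypothetical toy that only records which passes ran; the pass names are placeholders and nothing here is Cranelift's actual Context.

```rust
// Toy stand-in for a compilation context; it only logs pass names.
#[derive(Default)]
struct Context {
    passes_run: Vec<&'static str>,
}

impl Context {
    /// Opt-in early optimizations for callers that know their input
    /// is unoptimized. Never called from `compile`.
    pub fn optimize(&mut self) {
        self.passes_run.push("fold_constants");
    }

    /// The regular pipeline, with no settings flag for constant folding.
    pub fn compile(&mut self) {
        self.passes_run.push("legalize");
        self.passes_run.push("regalloc");
    }
}
```

A caller with unoptimized input runs `ctx.optimize()` and then `ctx.compile()`; a caller that never references `optimize` leaves it unreachable, which gives the linker a chance to drop it.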

sunfishcode · 2018-10-10T16:11:02Z

lib/codegen/src/timing.rs

    gvn: "Global value numbering",
    licm: "Loop invariant code motion",
    unreachable_code: "Remove unreachable blocks",
+    // constant_folding: "Fold constant expressions",


Huh. Not sure what to do yet either.

bjorn3 · 2018-10-10T06:21:15Z

filetests/folding/branch.clif

+; nextln: ebb2:
+; nextln:     v2 = iconst.i32 24
+; nextln:     return v2
+; nextln: }


Missing trailing newline

Oh, whoops, didn't realize that was necessary.

bjorn3 · 2018-10-10T06:21:29Z

filetests/folding/numerical.clif

+; nextln:     v1 = iconst.i32 1
+; nextln:     v2 = iconst.i32 41
+; nextln:     return v2
+; nextln: }


bjorn3 · 2018-10-11T06:34:29Z

lib/codegen/src/constant_folding.rs

+            opcode: F32const,
+            imm,
+        } => {
+            let imm_as_f32 = f32::from_bits(imm.bits()); // see https://doc.rust-lang.org/std/primitive.f32.html#method.from_bits for caveats


Could you put the comment before this line. This line is a bit long.

This got replaced by software floats.
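For context, the host-float approach that the reviewed line used can be sketched as below. `fold_fadd_f32` is a hypothetical helper, not code from the PR; it reinterprets the raw immediate bits as an f32, adds on the host, and converts back. Host arithmetic may differ from the target in details such as NaN payloads, which is one reason a software float implementation is attractive.

```rust
/// Adds two f32 values given as raw IEEE 754 bit patterns,
/// using the host's floating-point unit.
fn fold_fadd_f32(a_bits: u32, b_bits: u32) -> u32 {
    // See the `f32::from_bits` documentation for caveats around NaNs.
    let a = f32::from_bits(a_bits);
    let b = f32::from_bits(b_bits);
    (a + b).to_bits()
}
```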

sunfishcode · 2018-10-11T04:00:13Z

lib/codegen/src/constant_folding.rs

+            let imm0 = Wrapping(imm0.unwrap_i64());
+            let imm1 = Wrapping(imm1.unwrap_i64());
+            if imm1.0 == 0 {
+                panic!("Cannot divide by a zero.")


This can happen in valid code, so we should just return None here.

Doesn't the verifier already deny division by a 0 immediate?

That's an open question; however, this patch also has code that resolves Values defined by iconst, which is entirely valid.

Oh, I thought that had gotten merged. I'll make it return None.
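A minimal sketch of the agreed change, assuming a hypothetical `fold_sdiv` helper that receives the two already-unwrapped immediates. It mirrors the `Wrapping` arithmetic from the diff and only handles the zero divisor discussed here; whether the overflowing case `i64::MIN / -1` should also yield `None` is left open.

```rust
use std::num::Wrapping;

/// Folds a signed division, declining to fold when the divisor is zero
/// instead of panicking inside the compiler.
fn fold_sdiv(imm0: i64, imm1: i64) -> Option<i64> {
    if imm1 == 0 {
        // Valid input code can reach this, so leave the instruction alone.
        return None;
    }
    // `Wrapping` division wraps on overflow rather than panicking.
    Some((Wrapping(imm0) / Wrapping(imm1)).0)
}
```

Returning `None` leaves the original instruction in place, so it still traps at run time as the input program specified.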

sunfishcode · 2018-10-11T18:07:29Z

lib/codegen/src/constant_folding.rs

+        },
+        // ir::Opcode::Fadd => Some(imm0.unwrap)
+        _ => None,
+    }


Yes, this looks like a good overall direction. To further separate the arithmetic from the IR walking, we can pull just this match statement out into a separate function in a separate file.

Would it make sense to resolve cases where one of the arguments is const and the other isn't into an *_imm instruction?

preopt does this. So it's not something we need here right away. However if this constant folding pass starts growing into a more general optimization framework, it may want to start doing that kind of thing too, at which point we should think about how to organize things.
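To illustrate the rewrite being asked about, here is a sketch over a toy IR. `Operand`, `Inst`, and `simplify` are hypothetical names invented for this example and do not reflect how preopt is implemented; the point is only the case split between "both constant" and "one constant".

```rust
// Toy operand: either an SSA value number or a known constant.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Operand {
    Value(u32),
    Const(i64),
}

// Toy instruction set with a register-register add and its immediate form.
#[derive(Debug, PartialEq)]
enum Inst {
    Iadd(Operand, Operand),
    IaddImm(u32, i64),
    Iconst(i64),
}

fn simplify(inst: Inst) -> Inst {
    match inst {
        // Both arguments constant: fold completely.
        Inst::Iadd(Operand::Const(a), Operand::Const(b)) => Inst::Iconst(a.wrapping_add(b)),
        // Exactly one constant: switch to the immediate form.
        // Addition commutes, so the constant may be on either side.
        Inst::Iadd(Operand::Value(v), Operand::Const(c))
        | Inst::Iadd(Operand::Const(c), Operand::Value(v)) => Inst::IaddImm(v, c),
        // Nothing known: leave unchanged.
        other => other,
    }
}
```

For non-commutative operators the constant-on-the-left case would need separate handling, which this sketch does not attempt.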

lachlansneff · 2018-10-11T20:28:10Z

lib/codegen/src/lib.rs

 #[macro_use]
 extern crate log;

+extern crate rustc_apfloat;


rustc_apfloat uses the u128 type, which is not stable on Rust 1.25.0, a version that cranelift attempts to support.

lachlansneff · 2018-10-19T16:52:13Z

I'm going to remove the soft float crate from this PR until we can get it to build on Rust 1.25.0.

sunfishcode

This is starting to take form! Thanks for your patience with the rustc_apfloat detour. Here's an initial round of comments.

sunfishcode · 2018-10-20T21:29:30Z

lib/filetests/src/lib.rs

        "shrink" => test_shrink::subtest(parsed),
        "simple-gvn" => test_simple_gvn::subtest(parsed),
        "verifier" => test_verifier::subtest(parsed),
+        "optimize" => test_optimize::subtest(parsed),


Can we name these "preopt" and "test_preopt" so that it's clear that they refer to cranelift-preopt?

There's already a preopt filetest. What should I do with that?

Rename it to "simple_preopt", to match the new name of the thing it tests :-).

sunfishcode · 2018-10-20T21:30:19Z

lib/codegen/src/timing.rs


    /// Accumulated timing for all passes.
-    #[derive(Default)]
+    // #[derive(Default)]


If this is no longer needed, let's just remove it.

lib/preopt/Cargo.toml