Skip to content
This repository was archived by the owner on Jun 26, 2020. It is now read-only.

Add early-stage optimization crate#556

Merged
sunfishcode merged 16 commits intobytecodealliance:masterfrom
lachlansneff:constant-folding
Nov 7, 2018
Merged

Add early-stage optimization crate#556
sunfishcode merged 16 commits intobytecodealliance:masterfrom
lachlansneff:constant-folding

Conversation

@lachlansneff
Copy link
Contributor

This pr adds a simple constant folding pass. Don't merge this yet, it's a work in progress and not near completion.

The tracking issue is #554.

gvn: "Global value numbering",
licm: "Loop invariant code motion",
unreachable_code: "Remove unreachable blocks",
// constant_folding: "Fold constant expressions",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uncommenting this to add constant_folding pass timing will not compile. The array generated by the macro is currently 32 items in length, and a static slice of 33 items does not implement Default. When const generics land, this will be fixed, but not sure what to do about it before then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. Not sure what to do yet either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Figured it out. Just had to coerce them to an dynamically sized slice.

while let Some(_ebb) = pos.next_ebb() {
while let Some(inst) = pos.next_inst() {
use ir::instructions::Opcode::*;
match pos.func.dfg[inst].opcode() {
Copy link
Member

@sunfishcode sunfishcode Oct 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An organization that I think would be advantageous would be to match on the InstructionFormatInstructionData, which is pos.func.dfg[inst], rather than pulling out the opcode right away. Then there will be one match arm for all ir::InstructionData::Binary opcodes, and then we can have one fold_numerical_binary call and pass it the opcode. This seems appealing because it seems like it could let us keep the code for manipulating the IR separate from the folding arithmetic logic. If the ultimate folding function for binary operators could be a function which takes an opcode and two i64's and returns an i64 (or maybe an Option because trapping operators), that'd (a) make it really easy to write unit tests for the folding arithmetic, and (b) make it possible to reuse the folding arithmetic for... other fun things in the future, like IR interpreters, or auto-folding InstBuilders :-).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh boy, didn't even think about an auto-folding InstBuilder. That would be super cool, and would probably remove most of the overhead of a constant folding pass.


if isa.flags().enable_constant_folding() {
self.fold_constants(isa)?;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A concern that I have is that this may increase the size of the compiler for users that aren't using this. I don't know how big the constant folding pass will get, but if we're going to be getting into optimizing unoptimized code, we may be adding other things besides.

Since this is running at the beginning of compile, what if we moved it out into a separate top-level optimize function on Context? Users that know they have unoptimized code could call optimize before compile, and for users that don't call optimize, it could get DCE'd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that sounds good to me. That would entail removing the constant_folding setting as well, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right.

gvn: "Global value numbering",
licm: "Loop invariant code motion",
unreachable_code: "Remove unreachable blocks",
// constant_folding: "Fold constant expressions",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. Not sure what to do yet either.

; nextln: ebb2:
; nextln: v2 = iconst.i32 24
; nextln: return v2
; nextln: } No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing trailing newline

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, whoops, didn't realize that was necessary.

; nextln: v1 = iconst.i32 1
; nextln: v2 = iconst.i32 41
; nextln: return v2
; nextln: } No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

opcode: F32const,
imm,
} => {
let imm_as_f32 = f32::from_bits(imm.bits()); // see https://doc.rust-lang.org/std/primitive.f32.html#method.from_bits for caveats
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you put the comment before this line. This line is a bit long.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This got replaced by software floats.

let imm0 = Wrapping(imm0.unwrap_i64());
let imm1 = Wrapping(imm1.unwrap_i64());
if imm1.0 == 0 {
panic!("Cannot divide by a zero.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can happen in valid code, so we should just return None here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the verifier already deny division by a 0 immediate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an open question, however this patch also has code that resolves Values defined by iconst, which is entirely valid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I thought that had gotten merged. I'll make it return None.

},
// ir::Opcode::Fadd => Some(imm0.unwrap)
_ => None,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this looks like a good overall direction. To further separate the arithmetic from the IR walking, we can pull just this match statement out into a separate function in a separate file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to resolve cases where one of the arguments is const and the other isn't into an *_imm instruction?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

preopt does this. So it's not something we need here right away. However if this constant folding pass starts growing into a more general optimization framework, it may want to start doing that kind of thing too, at which point we should think about how to organize things.

#[macro_use]
extern crate log;

extern crate rustc_apfloat;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rustc_apfloat uses the u128 type, which is not stable on rust 1.25.0, which cranelift attempts to support.

@lachlansneff lachlansneff force-pushed the constant-folding branch 2 times, most recently from 76d36d3 to 5316812 Compare October 12, 2018 15:07
@lachlansneff lachlansneff changed the title Add simple constant folding. Add early-stage optimization crate Oct 12, 2018
@lachlansneff
Copy link
Contributor Author

I'm going to remove the soft float crate from this pr until we can get it to build on rust 1.25.0.

Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is starting to take form! Thanks for your patience with the rustc_apfloat detour. Here's an initial round of comments.

"shrink" => test_shrink::subtest(parsed),
"simple-gvn" => test_simple_gvn::subtest(parsed),
"verifier" => test_verifier::subtest(parsed),
"optimize" => test_optimize::subtest(parsed),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name these "preopt" and "test_preopt" so that it's clear that they refer to cranelift-preopt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's already a preopt filetest. What should I do with that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename it to "simple_preopt", to match the new name of the thing it tests :-).


/// Accumulated timing for all passes.
#[derive(Default)]
// #[derive(Default)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is no longer needed, let's just remove it.

// _ => return,
// }
// }
// }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the status of this commented-out code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that was old code that I forgot to completely remove.

(ConstImm::Ieee32(imm0), ConstImm::Ieee32(imm1)) => {
if imm1 == 0.0 {
return None;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and in the Ieee64 case below, floating-point division by zero is fully defined, so we don't need to special-case it.

let (imm0, imm1) = (
resolve_value_to_imm(dfg, args[0])?,
resolve_value_to_imm(dfg, args[1])?,
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'd really like to separate knowledge of the dfg from the logic of arithmetic folding. Can you split the match below into a separate function? That should also make it easy to add unit tests to ensure that we handle things like division by zero (see comments below).

opcode: F32const,
imm,
} => {
// see https://doc.rust-lang.org/std/primitive.f32.html#method.from_bits for caveats
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please captalize and emperiod comments which are complete sentences.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sure. Sorry 'bout that.

opcode: F64const,
imm,
} => {
// see https://doc.rust-lang.org/std/primitive.f32.html#method.from_bits for caveats
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@sunfishcode
Copy link
Member

BTW, #569 has now landed, so the Rust 1.25 restriction is now lifted!

@sunfishcode
Copy link
Member

This looks good! Let's land this now, and we can continue to iterate from there.

@sunfishcode sunfishcode merged commit f38847b into bytecodealliance:master Nov 7, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants