cranelift: Fix ireduce rules#8005
Merged
jameysharp merged 1 commit intobytecodealliance:mainfrom Feb 28, 2024
Merged
Conversation
We had two optimization rules which started off like this:
(rule (simplify (ireduce smallty val@(binary_op _ op x y)))
(if-let _ (reducible_modular_op val))
...)
This was intended to check that `x` and `y` came from an instruction
which not only was a binary op but also matched `reducible_modular_op`.
Unfortunately, both `binary_op` and `reducible_modular_op` were
multi-terms.
- So `binary_op` would search the eclass rooted at `val` to find each
instruction that uses a binary operator.
- Then `reducible_modular_op` would search the entire eclass again to
find an instruction which matched its criteria.
Nothing ensured that both searches would find the same instruction.
The reason these rules were written this way was because they had
additional guards (`will_simplify_with_ireduce`) which made them fairly
complex, and it seemed desirable to not have to copy those guards for
every operator where we wanted to apply this optimization.
However, we've decided that checking whether the rule is actually an
improvement is not desirable. In general, that should be the job of the
cost function. Blindly adding equivalent expressions gives us more
opportunities for other rules to fire, and we have global recursion and
growth limits to keep the process from going too wild.
As a result, we can just delete those guards. That allows us to write
the rules in a more straightforward way.
Fixes bytecodealliance#7999.
Co-authored-by: Trevor Elliott <telliott@fastly.com>
Co-authored-by: L Pereira <l.pereira@fastly.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
990ddd0 to
4672280
Compare
elliottt
approved these changes
Feb 28, 2024
elliottt
added a commit
to elliottt/wasmtime
that referenced
this pull request
Feb 28, 2024
We had two optimization rules which started off like this:
(rule (simplify (ireduce smallty val@(binary_op _ op x y)))
(if-let _ (reducible_modular_op val))
...)
This was intended to check that `x` and `y` came from an instruction
which not only was a binary op but also matched `reducible_modular_op`.
Unfortunately, both `binary_op` and `reducible_modular_op` were
multi-terms.
- So `binary_op` would search the eclass rooted at `val` to find each
instruction that uses a binary operator.
- Then `reducible_modular_op` would search the entire eclass again to
find an instruction which matched its criteria.
Nothing ensured that both searches would find the same instruction.
The reason these rules were written this way was because they had
additional guards (`will_simplify_with_ireduce`) which made them fairly
complex, and it seemed desirable to not have to copy those guards for
every operator where we wanted to apply this optimization.
However, we've decided that checking whether the rule is actually an
improvement is not desirable. In general, that should be the job of the
cost function. Blindly adding equivalent expressions gives us more
opportunities for other rules to fire, and we have global recursion and
growth limits to keep the process from going too wild.
As a result, we can just delete those guards. That allows us to write
the rules in a more straightforward way.
Fixes bytecodealliance#7999.
Co-authored-by: Trevor Elliott <telliott@fastly.com>
Co-authored-by: L Pereira <l.pereira@fastly.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
elliottt
added a commit
to elliottt/wasmtime
that referenced
this pull request
Feb 28, 2024
We had two optimization rules which started off like this:
(rule (simplify (ireduce smallty val@(binary_op _ op x y)))
(if-let _ (reducible_modular_op val))
...)
This was intended to check that `x` and `y` came from an instruction
which not only was a binary op but also matched `reducible_modular_op`.
Unfortunately, both `binary_op` and `reducible_modular_op` were
multi-terms.
- So `binary_op` would search the eclass rooted at `val` to find each
instruction that uses a binary operator.
- Then `reducible_modular_op` would search the entire eclass again to
find an instruction which matched its criteria.
Nothing ensured that both searches would find the same instruction.
The reason these rules were written this way was because they had
additional guards (`will_simplify_with_ireduce`) which made them fairly
complex, and it seemed desirable to not have to copy those guards for
every operator where we wanted to apply this optimization.
However, we've decided that checking whether the rule is actually an
improvement is not desirable. In general, that should be the job of the
cost function. Blindly adding equivalent expressions gives us more
opportunities for other rules to fire, and we have global recursion and
growth limits to keep the process from going too wild.
As a result, we can just delete those guards. That allows us to write
the rules in a more straightforward way.
Fixes bytecodealliance#7999.
Co-authored-by: Trevor Elliott <telliott@fastly.com>
Co-authored-by: L Pereira <l.pereira@fastly.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
alexcrichton
pushed a commit
that referenced
this pull request
Feb 28, 2024
* cranelift: Fix ireduce rules (#8005) We had two optimization rules which started off like this: (rule (simplify (ireduce smallty val@(binary_op _ op x y))) (if-let _ (reducible_modular_op val)) ...) This was intended to check that `x` and `y` came from an instruction which not only was a binary op but also matched `reducible_modular_op`. Unfortunately, both `binary_op` and `reducible_modular_op` were multi-terms. - So `binary_op` would search the eclass rooted at `val` to find each instruction that uses a binary operator. - Then `reducible_modular_op` would search the entire eclass again to find an instruction which matched its criteria. Nothing ensured that both searches would find the same instruction. The reason these rules were written this way was because they had additional guards (`will_simplify_with_ireduce`) which made them fairly complex, and it seemed desirable to not have to copy those guards for every operator where we wanted to apply this optimization. However, we've decided that checking whether the rule is actually an improvement is not desirable. In general, that should be the job of the cost function. Blindly adding equivalent expressions gives us more opportunities for other rules to fire, and we have global recursion and growth limits to keep the process from going too wild. As a result, we can just delete those guards. That allows us to write the rules in a more straightforward way. Fixes #7999. Co-authored-by: Trevor Elliott <telliott@fastly.com> Co-authored-by: L Pereira <l.pereira@fastly.com> Co-authored-by: Chris Fallin <chris@cfallin.org> * Update RELEASES.md --------- Co-authored-by: Jamey Sharp <jsharp@fastly.com> Co-authored-by: L Pereira <l.pereira@fastly.com> Co-authored-by: Chris Fallin <chris@cfallin.org>
alexcrichton
pushed a commit
that referenced
this pull request
Feb 28, 2024
* cranelift: Fix ireduce rules (#8005) We had two optimization rules which started off like this: (rule (simplify (ireduce smallty val@(binary_op _ op x y))) (if-let _ (reducible_modular_op val)) ...) This was intended to check that `x` and `y` came from an instruction which not only was a binary op but also matched `reducible_modular_op`. Unfortunately, both `binary_op` and `reducible_modular_op` were multi-terms. - So `binary_op` would search the eclass rooted at `val` to find each instruction that uses a binary operator. - Then `reducible_modular_op` would search the entire eclass again to find an instruction which matched its criteria. Nothing ensured that both searches would find the same instruction. The reason these rules were written this way was because they had additional guards (`will_simplify_with_ireduce`) which made them fairly complex, and it seemed desirable to not have to copy those guards for every operator where we wanted to apply this optimization. However, we've decided that checking whether the rule is actually an improvement is not desirable. In general, that should be the job of the cost function. Blindly adding equivalent expressions gives us more opportunities for other rules to fire, and we have global recursion and growth limits to keep the process from going too wild. As a result, we can just delete those guards. That allows us to write the rules in a more straightforward way. Fixes #7999. Co-authored-by: Trevor Elliott <telliott@fastly.com> Co-authored-by: L Pereira <l.pereira@fastly.com> Co-authored-by: Chris Fallin <chris@cfallin.org> * Update RELEASES.md * Update RELEASES.md --------- Co-authored-by: Jamey Sharp <jsharp@fastly.com> Co-authored-by: L Pereira <l.pereira@fastly.com> Co-authored-by: Chris Fallin <chris@cfallin.org>
cfallin
added a commit
to cfallin/wasmtime
that referenced
this pull request
Feb 28, 2024
…dance after bytecodealliance#7999. While debugging bytecodealliance#7999, we learned of a possible, extremely subtle, interaction between the design of our ISLE mid-end prelude and matching rules with a certain structure. Because a `Value` represents an entire eclass, separate left-hand sides (as in helper rules or if-let clauses) that match on the value may match against different enodes returned by the multi-match extractor's iterator. We then may infer some property of one enode, another property of another enode, and perform a rewrite that is only valid if both of those matches are on the same enode. The precise distinction is whether it is a property of the *value* -- e.g., nonzero, or even, or within a range -- or a property of the *operation* and matched subpieces. The former is fine, the latter runs into trouble. We found that bytecodealliance#7719 added a helper that determined whether a value "was a certain operator" -- actually, had any enode of a certain operator -- and separately, matched "the operator" and extracted its opcode and parameters (actually, *any* binary operator). The first half can match an opcode we support simplifying, and the second half can get the arguments and `op` and blindly use them in the rewrite. This PR adds new guidance to avoid complex helpers and be aware of multi-matching behavior, preferring to write patterns directly (as the fix in bytecodealliance#8005 does) instead. Longer-term, we also have other ideas, e.g. @jameysharp's suggestion to disallow at-patterns on multi-extractors in left hand sides to reduce the chance of hitting this footgun.
github-merge-queue bot
pushed a commit
that referenced
this pull request
Feb 28, 2024
…dance after #7999. (#8015) While debugging #7999, we learned of a possible, extremely subtle, interaction between the design of our ISLE mid-end prelude and matching rules with a certain structure. Because a `Value` represents an entire eclass, separate left-hand sides (as in helper rules or if-let clauses) that match on the value may match against different enodes returned by the multi-match extractor's iterator. We then may infer some property of one enode, another property of another enode, and perform a rewrite that is only valid if both of those matches are on the same enode. The precise distinction is whether it is a property of the *value* -- e.g., nonzero, or even, or within a range -- or a property of the *operation* and matched subpieces. The former is fine, the latter runs into trouble. We found that #7719 added a helper that determined whether a value "was a certain operator" -- actually, had any enode of a certain operator -- and separately, matched "the operator" and extracted its opcode and parameters (actually, *any* binary operator). The first half can match an opcode we support simplifying, and the second half can get the arguments and `op` and blindly use them in the rewrite. This PR adds new guidance to avoid complex helpers and be aware of multi-matching behavior, preferring to write patterns directly (as the fix in #8005 does) instead. Longer-term, we also have other ideas, e.g. @jameysharp's suggestion to disallow at-patterns on multi-extractors in left hand sides to reduce the chance of hitting this footgun.
Contributor
|
Oops! Sorry for the mess here.
Good to see the new guidance for this, since it had been uncertain back when I first started doing these rules in #7693 (comment) |
Contributor
Author
|
Yeah, we're all still figuring out how best to use this egraph optimizer. The best we can do is keep trying things out, and I appreciate your efforts to do that! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
We had two optimization rules which started off like this:
(rule (simplify (ireduce smallty val@(binary_op _ op x y)))
(if-let _ (reducible_modular_op val))
...)
This was intended to check that
xandycame from an instruction which not only was a binary op but also matchedreducible_modular_op.Unfortunately, both
binary_opandreducible_modular_opwere multi-terms.binary_opwould search the eclass rooted atvalto find each instruction that uses a binary operator.reducible_modular_opwould search the entire eclass again to find an instruction which matched its criteria.Nothing ensured that both searches would find the same instruction.
The reason these rules were written this way was because they had additional guards (
will_simplify_with_ireduce) which made them fairly complex, and it seemed desirable to not have to copy those guards for every operator where we wanted to apply this optimization.However, we've decided that checking whether the rule is actually an improvement is not desirable. In general, that should be the job of the cost function. Blindly adding equivalent expressions gives us more opportunities for other rules to fire, and we have global recursion and growth limits to keep the process from going too wild.
As a result, we can just delete those guards. That allows us to write the rules in a more straightforward way.
Fixes #7999.
cc: @elliottt @cfallin @lpereira