Skip to content

Cranelift: use more iconst_[su] in opts, notably enabling some i128 patterns#7693

Merged
fitzgen merged 2 commits intobytecodealliance:mainfrom
scottmcm:more-i128
Dec 18, 2023
Merged

Cranelift: use more iconst_[su] in opts, notably enabling some i128 patterns#7693
fitzgen merged 2 commits intobytecodealliance:mainfrom
scottmcm:more-i128

Conversation

@scottmcm
Copy link
Contributor

This uses iconst_u and iconst_s a bunch more in ISLE opts.

We can now enable some of the patterns that had to be excluded before to avoid generating iconst.i128 that doesn't exist, since these rules do the right thing. For example, x - x with this PR will now give uextend.i128 (iconst.i64 0) when x is i128, rather than staying unchanged as it did before.

There's a bunch of things that still don't have it, like const-prop, since the constant values are still only 64-bit, not 128-bit, so those patterns are still restricted to the smaller types.

…128 cases

`iconst_u` and `iconst_s` support `$I128`, so we can now enable these even though they had to be excluded before to avoid generating `iconst.i128` that doesn't exist.
@scottmcm scottmcm requested a review from a team as a code owner December 15, 2023 05:46
@scottmcm scottmcm requested review from elliottt and removed request for a team December 15, 2023 05:46
Comment on lines +79 to +80
(rule (simplify (sextend (fits_in_64 wide) (iconst_s narrow k)))
(subsume (iconst_s wide k)))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how elegant this is now, but it's still restricted to 64-bit and smaller because iconst_s $I128 k will emit a sign-extended constant that's probably the same thing it started with.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind adding a comment to this effect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

(rule (simplify (ireduce ty (uextend _ x @ (value_type ty)))) x)

;; `band`, `bor`, and `bxor` can't affect any bits that aren't set in the one of
;; the inputs, so they can be pushed down inside `uextend`s
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could pull these (and the reduce ones) into a separate PR if you want, since they're not really following the bigger theme of the PR. They just came from looking at cprop -- which in general can't do 128-bit constants usefully -- and trying to find what would be possible on the extended constants that iconst_u and iconst_s generate. See the bor_extended_constants_i128 test below as an example of where this happens.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language labels Dec 15, 2023
@github-actions
Copy link

Subscribe to Label Action

cc @cfallin, @fitzgen

Details This issue or pull request has been labeled: "cranelift", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks! Just a couple things below before merging.

Comment on lines +79 to +80
(rule (simplify (sextend (fits_in_64 wide) (iconst_s narrow k)))
(subsume (iconst_s wide k)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind adding a comment to this effect?

Comment on lines +62 to +79
;; The high bits of the input can't affect the lower bits of the input for most
;; modular arithmetic nor for bitops, so can be pulled out of `ireduce`.
(rule (simplify (ireduce ty (ineg _ x)))
(ineg ty (ireduce ty x)))
(rule (simplify (ireduce ty (bnot _ x)))
(bnot ty (ireduce ty x)))
(rule (simplify (ireduce ty (iadd _ x y)))
(iadd ty (ireduce ty x) (ireduce ty y)))
(rule (simplify (ireduce ty (isub _ x y)))
(isub ty (ireduce ty x) (ireduce ty y)))
(rule (simplify (ireduce ty (imul _ x y)))
(imul ty (ireduce ty x) (ireduce ty y)))
(rule (simplify (ireduce ty (bor _ x y)))
(bor ty (ireduce ty x) (ireduce ty y)))
(rule (simplify (ireduce ty (bxor _ x y)))
(bxor ty (ireduce ty x) (ireduce ty y)))
(rule (simplify (ireduce ty (band _ x y)))
(band ty (ireduce ty x) (ireduce ty y)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we will want to pull these out into a new, dedicated PR. It isn't obvious that these rewrites are (always? usually?) improvements on their own since we are going from two ops to three and most CPUs, AFAIK, aren't faster at doing eg 8-bit adds than 32-bit adds. I could see this possibly unlocking further rewrites, but are intentionally very careful about the kinds of rewrites that we introduce just to unlock other optimizations because we aren't working with a full e-graph here, but an acyclic one that is intended to run with less time budget. All that is to say, this might still be beneficial, but I'd like to get more eyes on it and discussion about it first, and we can land the rest of this stuff in the meantime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I can also experiment with doing it only for things where one of the inner ones is itself an extend, since I think that's the only cases where I used it in the tests anyway, and would keep the excess down since the extra reduce would collapse with the extend.

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@fitzgen fitzgen added this pull request to the merge queue Dec 18, 2023
Merged via the queue into bytecodealliance:main with commit 36b1091 Dec 18, 2023
@scottmcm scottmcm deleted the more-i128 branch December 18, 2023 21:19
dhil pushed a commit to dhil/wasmtime that referenced this pull request Dec 29, 2023
… patterns (bytecodealliance#7693)

* Cranelift: use more `iconst_[su]` in opts, notably re-enabling some i128 cases

`iconst_u` and `iconst_s` support `$I128`, so we can now enable these even though they had to be excluded before to avoid generating `iconst.i128` that doesn't exist.

* Update for PR feedback
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants