Skip to content

ISLE: port fmin, fmax, fmin_pseudo, fmax_pseudo on x64.#3856

Merged
cfallin merged 1 commit intobytecodealliance:mainfrom
cfallin:isle-min-max
Feb 28, 2022
Merged

ISLE: port fmin, fmax, fmin_pseudo, fmax_pseudo on x64.#3856
cfallin merged 1 commit intobytecodealliance:mainfrom
cfallin:isle-min-max

Conversation

@cfallin
Copy link
Member

@cfallin cfallin commented Feb 26, 2022

The min/max sequences were a bit difficult to wrap my head around, so please let me know if the descriptive comments are inaccurate at all -- I tried to add a bit more detail (e.g. not just "propagate discrepancies" but why the lowering works).

Stacks on top of #3848, #3849, #3855; only last commit is new.

@cfallin cfallin requested review from abrown and jlb6740 February 26, 2022 05:28
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen isle Related to the ISLE domain-specific language labels Feb 26, 2022
@github-actions
Copy link

Subscribe to Label Action

cc @cfallin, @fitzgen

Details This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Comment on lines +2083 to +2121
;; Compute min(x, y) and min(y, x) with native
;; instructions. These will differ in one of the edge cases
;; above that we have to handle properly. (Conversely, if they
;; don't differ, then the native instruction's answer is the
;; right one per CLIF semantics.)
(let ((min1 Xmm (minps x y))
(min2 Xmm (minps y x))
;; Compute the OR of the two. Note that NaNs have an
;; exponent field of all-ones (0xFF for F32), so if either
;; result is a NaN, this OR will be. And if either is a
;; zero (which has an exponent of 0 and mantissa of 0),
;; this captures a sign-bit of 1 (negative) if either
;; input is negative.
;;
;; In the case where we don't have a +/-0 mismatch or
;; NaNs, then `min1` and `min2` are equal and `min_or` is
;; the correct minimum.
(min_or Xmm (orps min1 min2))
;; "compare unordered" produces a true mask (all ones) in
;; a given lane if the min is a NaN. We use this to
;; generate a mask to ensure quiet NaNs.
(is_nan_mask Xmm (cmpps min_or min2 (FcmpImm.Unordered)))
;; OR in the NaN mask.
(min_or_2 Xmm (orps min_or is_nan_mask))
;; Shift the NaN mask down so that it covers just the
;; fraction below the NaN signalling bit; we'll use this
;; to mask off non-canonical NaN payloads.
;;
;; All-ones for NaN, shifted down to leave 10 top bits (1
;; sign, 8 exponent, 1 QNaN bit that must remain set)
;; cleared.
(nan_fraction_mask Xmm (psrld is_nan_mask (RegMemImm.Imm 10)))
;; Do a NAND, so that we retain every bit not set in
;; `nan_fraction_mask`. This mask will be all zeroes (so
;; we retain every bit) in non-NaN cases, and will have
;; ones (so we clear those bits) in NaN-payload bits
;; otherwise.
(final Xmm (andnps nan_fraction_mask min_or_2)))
final))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great comments :)

@cfallin cfallin merged commit cd173cf into bytecodealliance:main Feb 28, 2022
@cfallin cfallin deleted the isle-min-max branch February 28, 2022 22:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants