nvptx: Incorrect use of LLVM intrinsics for f16x2_min/max(_nan) #2056

@RalfJung

The nvptx intrinsics f16x2_min/f16x2_max/f16x2_min_nan/f16x2_max_nan are currently mapped to the LLVM intrinsics minnum/maxnum/minimum/maximum, respectively. In some cases the mapping goes through simd_fmin/simd_fmax, which are documented to correspond to minnum nsz/maxnum nsz, although we currently don't actually emit the nsz attribute. See here for an overview of the LLVM float min/max operations.

This is incorrect:

  • According to the docs, the behavior for signed zeros is defined by (a < b) ? a : b, i.e., when both operands compare equal, the 2nd operand is returned. None of the LLVM intrinsics behave this way: they either treat -0.0 as smaller than +0.0 (the default) or return either value non-deterministically (when the nsz attribute is present). [This means it is actually a bug that LLVM uses the min.f16x2 nvptx operation for lowering minnum...]
  • According to the docs, assuming that isNaN checks for both QNaN and SNaN, if exactly one input is any NaN, the other input is returned for f16x2_min/f16x2_max. In contrast, the documentation for minnum/maxnum says that when an input is SNaN, the return value may be either a NaN or the other input. The LLVM variants with the matching NaN semantics are minimumnum/maximumnum.
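
To make the signed-zero difference concrete, here is a minimal scalar sketch of the two behaviors described above. It uses f32 rather than f16x2 purely for illustration, and both function names (`ptx_style_min`, `signed_zero_ordered_min`) are hypothetical models, not actual stdarch or LLVM APIs. Returning NaN when both inputs are NaN is an assumption; the points above only cover the case where exactly one input is NaN.

```rust
/// Hypothetical model of the documented nvptx `min` as described above:
/// if exactly one input is NaN, return the other input; otherwise
/// evaluate `(a < b) ? a : b`, so equal operands yield the 2nd operand.
fn ptx_style_min(a: f32, b: f32) -> f32 {
    if a.is_nan() && b.is_nan() {
        // Assumption: both inputs NaN gives NaN.
        f32::NAN
    } else if a.is_nan() {
        b
    } else if b.is_nan() {
        a
    } else if a < b {
        a
    } else {
        b
    }
}

/// Hypothetical model of a min that treats -0.0 as smaller than +0.0,
/// which is the default LLVM behavior (without nsz) described above.
fn signed_zero_ordered_min(a: f32, b: f32) -> f32 {
    if a.is_nan() {
        return b;
    }
    if b.is_nan() {
        return a;
    }
    if a == b {
        // Equal operands differ only when they are zeros of opposite sign;
        // prefer the negative one.
        if a.is_sign_negative() { a } else { b }
    } else if a < b {
        a
    } else {
        b
    }
}
```

With these models, `ptx_style_min(-0.0, 0.0)` returns +0.0 (the 2nd operand), while `signed_zero_ordered_min(-0.0, 0.0)` returns -0.0, so the two operations disagree on that input.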

Cc @kjetilkjeka @folkertdev
