@arsenm
@arsenm arsenm commented Dec 17, 2025

Instead of range-checking whether the converted value will land in the denormal range (with an off-by-one adjustment for the rounding direction), just check whether the converted value is equal to 0.

This does have a small behavior change. Neither version canonicalizes the results on the pred path, so this changes which denormal results are flushed to 0 or left uncanonicalized.

This saves 5 instructions, mostly from materializing the 64-bit constants.
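
As a rough host-side illustration of the two checks (not the device-libs source), the sketch below puts the old range test on the double input next to the new zero test on the converted float. The helper names are hypothetical, and `daz_is_zero` is an assumed emulation of how a DAZ-mode `== 0.0f` compare is expected to treat denormal operands; on the device the plain compare would do this itself.

```c
#include <float.h>
#include <stdbool.h>

/* Old check, taken from the diff: a range test on the double input. */
static bool old_flush_check(double a) {
    return a >= -0x1.fffffcp-127 && a < 0x1.0p-126;
}

/* Hypothetical host emulation of a DAZ-mode `x == 0.0f`:
 * zero and every float denormal are treated as equal to zero. */
static bool daz_is_zero(float x) {
    return x > -FLT_MIN && x < FLT_MIN;
}

/* New check: test the converted float instead of the double input.
 * The host cast rounds to nearest, which is an assumption of this sketch. */
static bool new_flush_check(double a) {
    float a_f = (float)a;
    return daz_is_zero(a_f);
}
```

The new form needs no 64-bit constants, which is where the description says most of the saved instructions come from.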

@arsenm arsenm requested a review from b-sumner as a code owner December 17, 2025 20:57
@arsenm arsenm added the device-libs Related to Device Libraries label Dec 17, 2025
@arsenm
Author

arsenm commented Dec 17, 2025

I think PSDB is still not running any tests with -z, but the relevant conversion tests pass for me.

@z1-cciauto
Collaborator

if (DAZ_OPT()) {
float z = BUILTIN_COPYSIGN_F32(0.0f, r);
r = a >= -0x1.fffffcp-127 && a < 0x1.0p-126 ? z : r;
r = a_f == 0.0f ? a_f : r;
Collaborator

I don't think this will always return the same results as the original, but I think it's OK. Would you please replace "a_f" with "fa"?

@arsenm
Author

arsenm commented Dec 17, 2025

https://alive2.llvm.org/ce/z/BYCRrh

It needs a local build to avoid timing out, but:

ERROR: Value mismatch

Example:
double noundef %x = #x380fffffe0000000 (0.000000000000?)

Source:
float %conv.i = #x00800000 (0.000000000000?)
i32 %astype.i.i = #x00800000 (8388608)
i32 %sub.i.i = poison
i1 %cmp.i.i = #x0 (0)
i32 %cond.i.i = #x00800000 (8388608)
i1 %#0 = #x1 (1)
i32 %land.ext.neg.i.i = #xffffffff (4294967295, -1)
i32 %sub2.i.i = #x007fffff (8388607)
i32 %sub3.i.i = poison
i1 %cmp4.i.i = #x0 (0)
i32 %cond8.i.i = #x007fffff (8388607)
float %astype9.i.i = #x007fffff (0.000000000000?)
double %conv1.i = #x3810000000000000 (0.000000000000?)
i1 %cmp.i = #x1 (1)
float %cond.i = #x007fffff (0.000000000000?)
float %#1 = #x00000000 (+0.0)
i1 %cmp3.i = #x1 (1)
i1 %cmp5.i = #x1 (1)
i1 %or.cond.i = #x1 (1)
float %spec.select.i = #x00000000 (+0.0)

Target:
float %conv.i = #x00800000 (0.000000000000?)
i32 %astype.i.i = #x00800000 (8388608)
i32 %sub.i.i = poison
i1 %cmp.i.i = #x0 (0)
i32 %cond.i.i = #x00800000 (8388608)
i1 %#0 = #x1 (1)
i32 %land.ext.neg.i.i = #xffffffff (4294967295, -1)
i32 %sub2.i.i = #x007fffff (8388607)
i32 %sub3.i.i = poison
i1 %cmp4.i.i = #x0 (0)
i32 %cond8.i.i = #x007fffff (8388607)
float %astype9.i.i = #x007fffff (0.000000000000?)
double %conv1.i = #x3810000000000000 (0.000000000000?)
i1 %cmp.i = #x1 (1)
float %cond.i = #x007fffff (0.000000000000?)
i1 %cmp3.i = #x0 (0)
float %r.0.i = #x007fffff (0.000000000000?)
Source value: #x00000000 (+0.0)
Target value: #x007fffff (0.000000000000?)

This changes the result for 0x1.fffffep-127 from 0 to 0x1.fffffcp-127, but the flush isn't mandatory. I'm not sure whether alive2 has a way to force it to produce additional counterexamples.
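
The arithmetic behind this counterexample can be replayed on a host, as a sketch only. It assumes IEEE round-to-nearest-even for the cast and no flush-to-zero on the host; `float_bits`, `bits_float`, `kInput`, and `check_counterexample` are hypothetical names, and the predecessor step is a simplified stand-in for the `%sub2.i.i` step in the trace.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern and back. */
static uint32_t float_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float bits_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* #x380fffffe0000000: exactly halfway between the floats
 * 0x007fffff (largest denormal) and 0x00800000 (smallest normal). */
static const double kInput = 0x1.fffffep-127;

static void check_counterexample(void) {
    /* Ties-to-even rounds up to the smallest normal float. */
    float conv = (float)kInput;
    assert(float_bits(conv) == 0x00800000u);

    /* conv is above the input, so the round-down path steps to the
     * predecessor, which is the denormal 0x1.fffffcp-127. */
    float pred = bits_float(float_bits(conv) - 1u);
    assert(float_bits(pred) == 0x007fffffu);

    /* Old check: the double input is inside the range, so the result was flushed. */
    assert(kInput >= -0x1.fffffcp-127 && kInput < 0x1.0p-126);

    /* New check: conv is a normal number, so it is not equal to zero
     * and the denormal pred result is kept. */
    assert(conv != 0.0f);
}
```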

@arsenm
Author

arsenm commented Dec 18, 2025

Unsurprisingly, alive2 is happy given the assumption that the value isn't in the float denormal range:

  %x.abs = call double @llvm.fabs.f64(double %x)
  %is.not.float.denormal = fcmp oge double %x.abs, 0x3810000000000000
  call void @llvm.assume(i1 %is.not.float.denormal)

I'm getting the impression it's not treating the flushed output of the fptrunc as a valid result under denormal-fp-math-f32.

@arsenm
Author

arsenm commented Dec 18, 2025

Manually forcing canonicalization of the fptrunc result: https://alive2.llvm.org/ce/z/8a9oVY

ERROR: Value mismatch

Example:
double noundef %x = #xb80fffffdc000000 (-0.000000000000?)

Source:
float %conv.i.raw = #x807fffff (-0.000000000000?)
float %conv.i = #x80000000 (-0.0)
i32 %astype.i.i = #x80000000 (2147483648, -2147483648)
i32 %sub.i.i = #x00000000 (0)
i1 %cmp.i.i = #x1 (1)
i32 %cond.i.i = #x00000000 (0)
i1 %#0 = #x1 (1)
i32 %land.ext.neg.i.i = #xffffffff (4294967295, -1)
i32 %sub2.i.i = #xffffffff (4294967295, -1)
i32 %sub3.i.i = #x80000001 (2147483649, -2147483647)
i1 %cmp4.i.i = #x1 (1)
i32 %cond8.i.i = #x80000001 (2147483649, -2147483647)
float %astype9.i.i = #x80000001 (-0.000000000000?)
double %conv1.i = #x8000000000000000 (-0.0)
i1 %cmp.i = #x1 (1)
float %cond.i = #x80000001 (-0.000000000000?)
float %#1 = #x80000000 (-0.0)
i1 %cmp3.i = #x0 (0)
i1 %cmp5.i = #x1 (1)
i1 %or.cond.i = #x0 (0)
float %spec.select.i = #x80000001 (-0.000000000000?)

Target:
float %conv.i.raw = #x807fffff (-0.000000000000?)
float %conv.i = #x80000000 (-0.0)
i32 %astype.i.i = #x80000000 (2147483648, -2147483648)
i32 %sub.i.i = #x00000000 (0)
i1 %cmp.i.i = #x1 (1)
i32 %cond.i.i = #x00000000 (0)
i1 %#0 = #x1 (1)
i32 %land.ext.neg.i.i = #xffffffff (4294967295, -1)
i32 %sub2.i.i = #xffffffff (4294967295, -1)
i32 %sub3.i.i = #x80000001 (2147483649, -2147483647)
i1 %cmp4.i.i = #x1 (1)
i32 %cond8.i.i = #x80000001 (2147483649, -2147483647)
float %astype9.i.i = #x80000001 (-0.000000000000?)
double %conv1.i = #x8000000000000000 (-0.0)
i1 %cmp.i = #x1 (1)
float %cond.i = #x80000001 (-0.000000000000?)
i1 %cmp3.i = #x1 (1)
float %r.0.i = #x80000000 (-0.0)
Source value: #x80000001 (-0.000000000000?)
Target value: #x80000000 (-0.0)

So the old code didn't flush for all values either, which makes sense as there's no canonicalizing operation coming out of pred.
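
A small host-side check of why the old range test missed this input, offered as a sketch under the same assumptions as before (round-to-nearest cast, no flush-to-zero on the host). The helper names are hypothetical; the range expression is the one from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern. */
static uint32_t float_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }

/* Old check from the diff: range test on the double input. */
static bool old_range_check(double a) {
    return a >= -0x1.fffffcp-127 && a < 0x1.0p-126;
}

static void check_second_counterexample(void) {
    /* #xb80fffffdc000000 from the alive2 output. */
    const double x = -0x1.fffffdcp-127;

    /* The input rounds to the largest-magnitude negative float denormal. */
    assert(float_bits((float)x) == 0x807fffffu);

    /* It is slightly below the lower bound -0x1.fffffcp-127, so the old
     * check did not fire and the denormal from the pred path was returned. */
    assert(!old_range_check(x));
}
```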

@arsenm
Author

arsenm commented Dec 18, 2025

Forcing canonicalize on the pred path makes it pass: https://alive2.llvm.org/ce/z/_VNHTn, so the change is only in which denormal results are flushed.

I still need the canonicalize of the fptrunc output, so I think that's an alive2 bug

@arsenm arsenm force-pushed the device-libs/use-f32-compare-denorm-check branch from 86b77cb to 05dc045 Compare December 18, 2025 11:02
@z1-cciauto
Collaborator
