device-libs: Use f32 denormal check in rtn f64->f32 conversions #876

arsenm · 2025-12-17T20:57:46Z

Instead of range-checking the input to decide whether the converted value will fall in the denormal range (with an off-by-one bound to account for the rounding direction), just check whether the converted value is equal to 0.

This does have a small behavior change. Neither version canonicalizes the results on the pred path, so this changes which denormal results are flushed to 0 or left uncanonicalized.

This saves 5 instructions, mostly from materializing the 64-bit constants.

github-actions · 2025-12-17T20:58:19Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

arsenm · 2025-12-17T20:58:32Z

I think PSDB is still not running any tests with -z, but the relevant conversion tests pass for me.

z1-cciauto · 2025-12-17T20:59:15Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/3326

b-sumner · 2025-12-17T21:32:55Z

amd/device-libs/ocml/src/convert.cl

    if (DAZ_OPT()) {
-        float z = BUILTIN_COPYSIGN_F32(0.0f, r);
-        r = a >= -0x1.fffffcp-127 && a < 0x1.0p-126 ? z : r;
+        r = a_f == 0.0f ? a_f : r;


I don't think this will always return the same results as the original, but I think it's OK. Would you please replace "a_f" with "fa"?

arsenm · 2025-12-17T23:28:32Z

https://alive2.llvm.org/ce/z/BYCRrh

A local build is needed to keep it from timing out, but:

ERROR: Value mismatch

Example:
double noundef %x = #x380fffffe0000000 (0.000000000000?)

Source:
float %conv.i = #x00800000 (0.000000000000?)
i32 %astype.i.i = #x00800000 (8388608)
i32 %sub.i.i = poison
i1 %cmp.i.i = #x0 (0)
i32 %cond.i.i = #x00800000 (8388608)
i1 %#0 = #x1 (1)
i32 %land.ext.neg.i.i = #xffffffff (4294967295, -1)
i32 %sub2.i.i = #x007fffff (8388607)
i32 %sub3.i.i = poison
i1 %cmp4.i.i = #x0 (0)
i32 %cond8.i.i = #x007fffff (8388607)
float %astype9.i.i = #x007fffff (0.000000000000?)
double %conv1.i = #x3810000000000000 (0.000000000000?)
i1 %cmp.i = #x1 (1)
float %cond.i = #x007fffff (0.000000000000?)
float %#1 = #x00000000 (+0.0)
i1 %cmp3.i = #x1 (1)
i1 %cmp5.i = #x1 (1)
i1 %or.cond.i = #x1 (1)
float %spec.select.i = #x00000000 (+0.0)

Target:
float %conv.i = #x00800000 (0.000000000000?)
i32 %astype.i.i = #x00800000 (8388608)
i32 %sub.i.i = poison
i1 %cmp.i.i = #x0 (0)
i32 %cond.i.i = #x00800000 (8388608)
i1 %#0 = #x1 (1)
i32 %land.ext.neg.i.i = #xffffffff (4294967295, -1)
i32 %sub2.i.i = #x007fffff (8388607)
i32 %sub3.i.i = poison
i1 %cmp4.i.i = #x0 (0)
i32 %cond8.i.i = #x007fffff (8388607)
float %astype9.i.i = #x007fffff (0.000000000000?)
double %conv1.i = #x3810000000000000 (0.000000000000?)
i1 %cmp.i = #x1 (1)
float %cond.i = #x007fffff (0.000000000000?)
i1 %cmp3.i = #x0 (0)
float %r.0.i = #x007fffff (0.000000000000?)
Source value: #x00000000 (+0.0)
Target value: #x007fffff (0.000000000000?)

This changes the result for 0x1.fffffep-127 from 0 to 0x1.fffffcp-127, but the flush isn't mandatory. I'm not sure whether alive2 has a way to force it to produce additional counterexamples.
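For anyone decoding the counterexample by hand: alive2 prints these tiny values as `0.000000000000?`, so a small Python sketch can turn the bit patterns from the trace into hex-float form. This is not part of the PR, and the helper names are made up for illustration. It suggests the input is the exact midpoint between the largest f32 denormal and the smallest f32 normal, which would explain why round-to-nearest-even lands on 0x1.0p-126 and the pred step lands back on a denormal.

```python
import struct

def f64_from_bits(bits: int) -> float:
    """Reinterpret a 64-bit pattern as an IEEE-754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def f32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE-754 float (widened to a Python float)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

x = f64_from_bits(0x380FFFFFE0000000)   # %x in the counterexample
nearest = f32_from_bits(0x00800000)     # %conv.i, the fptrunc result
below = f32_from_bits(0x007FFFFF)       # %cond.i, the value after the pred step

for name, value in (("x", x), ("nearest", nearest), ("below", below)):
    print(f"{name:8s}{value.hex()}")

# x sits exactly halfway between the two neighbouring f32 values.
print("tie:", x - below == nearest - x)
```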

arsenm · 2025-12-18T10:16:40Z

Unsurprisingly, alive2 is happy once it can assume the value isn't in the float denormal range:

  %x.abs = call double @llvm.fabs.f64(double %x)
  %is.not.float.denormal = fcmp oge double %x.abs, 0x3810000000000000
  call void @llvm.assume(i1 %is.not.float.denormal)

I'm getting the impression it's not treating the flushed output of the fptrunc as a valid result under the denormal-fp-math-f32 attribute.
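A quick way to see why that assumption removes the mismatch is the Python sketch below. It is an illustration under assumptions, not the library code: `old_check_fires` copies the bounds from the removed line of the diff, `new_check_fires` models the flushed fptrunc result by hand, and both names are hypothetical. For inputs whose magnitude is at least 0x1.0p-126, neither condition should ever be true, so both versions return the same `r`.

```python
import math
import random
import struct

F32_MIN_NORMAL = float.fromhex("0x1.0p-126")

def to_f32(x: float) -> float:
    """Round a double to f32 (nearest-even), returned as a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def old_check_fires(a: float) -> bool:
    # Range check on the f64 input, bounds copied from the removed diff line.
    return float.fromhex("-0x1.fffffcp-127") <= a < F32_MIN_NORMAL

def new_check_fires(a: float) -> bool:
    # A zero or denormal fptrunc result would compare equal to 0 once flushed.
    return abs(to_f32(a)) < F32_MIN_NORMAL

rng = random.Random(0)
samples = [F32_MIN_NORMAL, -F32_MIN_NORMAL]
for _ in range(10_000):
    # Magnitudes in [2^-126, 2], i.e. outside the f32 denormal range.
    mag = math.ldexp(rng.uniform(1.0, 2.0), rng.randint(-126, 0))
    samples.append(rng.choice((-1.0, 1.0)) * mag)

print(any(old_check_fires(a) or new_check_fires(a) for a in samples))
```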

arsenm · 2025-12-18T10:50:19Z

Manually forcing canonicalization of the fptrunc result: https://alive2.llvm.org/ce/z/8a9oVY

ERROR: Value mismatch

Example:
double noundef %x = #xb80fffffdc000000 (-0.000000000000?)

Source:
float %conv.i.raw = #x807fffff (-0.000000000000?)
float %conv.i = #x80000000 (-0.0)
i32 %astype.i.i = #x80000000 (2147483648, -2147483648)
i32 %sub.i.i = #x00000000 (0)
i1 %cmp.i.i = #x1 (1)
i32 %cond.i.i = #x00000000 (0)
i1 %#0 = #x1 (1)
i32 %land.ext.neg.i.i = #xffffffff (4294967295, -1)
i32 %sub2.i.i = #xffffffff (4294967295, -1)
i32 %sub3.i.i = #x80000001 (2147483649, -2147483647)
i1 %cmp4.i.i = #x1 (1)
i32 %cond8.i.i = #x80000001 (2147483649, -2147483647)
float %astype9.i.i = #x80000001 (-0.000000000000?)
double %conv1.i = #x8000000000000000 (-0.0)
i1 %cmp.i = #x1 (1)
float %cond.i = #x80000001 (-0.000000000000?)
float %#1 = #x80000000 (-0.0)
i1 %cmp3.i = #x0 (0)
i1 %cmp5.i = #x1 (1)
i1 %or.cond.i = #x0 (0)
float %spec.select.i = #x80000001 (-0.000000000000?)

Target:
float %conv.i.raw = #x807fffff (-0.000000000000?)
float %conv.i = #x80000000 (-0.0)
i32 %astype.i.i = #x80000000 (2147483648, -2147483648)
i32 %sub.i.i = #x00000000 (0)
i1 %cmp.i.i = #x1 (1)
i32 %cond.i.i = #x00000000 (0)
i1 %#0 = #x1 (1)
i32 %land.ext.neg.i.i = #xffffffff (4294967295, -1)
i32 %sub2.i.i = #xffffffff (4294967295, -1)
i32 %sub3.i.i = #x80000001 (2147483649, -2147483647)
i1 %cmp4.i.i = #x1 (1)
i32 %cond8.i.i = #x80000001 (2147483649, -2147483647)
float %astype9.i.i = #x80000001 (-0.000000000000?)
double %conv1.i = #x8000000000000000 (-0.0)
i1 %cmp.i = #x1 (1)
float %cond.i = #x80000001 (-0.000000000000?)
i1 %cmp3.i = #x1 (1)
float %r.0.i = #x80000000 (-0.0)
Source value: #x80000001 (-0.000000000000?)
Target value: #x80000000 (-0.0)

So the old code didn't flush for all values either, which makes sense as there's no canonicalizing operation coming out of pred.

arsenm · 2025-12-18T10:59:54Z

Forcing canonicalize on the pred path makes it pass: https://alive2.llvm.org/ce/z/_VNHTn, so the change is just which denormal results are flushed or not

I still need the canonicalize of the fptrunc output, so I think that's an alive2 bug
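To make the comparison concrete, here is a rough Python emulation of the two checks, run on both counterexamples above. It is a sketch under assumptions, not the code in convert.cl: the control flow is reconstructed from the alive2 traces and the diff fragment, `flush` stands in for f32 denormal flushing, and every function name is hypothetical. Infinities and NaNs are ignored.

```python
import math
import struct

F32_MIN_NORMAL = float.fromhex("0x1.0p-126")

def f64_from_bits(bits: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def f32_bits(f: float) -> int:
    """Round a Python float to f32 (nearest-even) and return its bit pattern."""
    return struct.unpack("<I", struct.pack("<f", f))[0]

def f32_from_bits(bits: int) -> float:
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def flush(f: float) -> float:
    """Assumed model of f32 flushing: a denormal becomes a zero of the same sign."""
    return math.copysign(0.0, f) if 0.0 < abs(f) < F32_MIN_NORMAL else f

def pred_f32(f: float) -> float:
    """Next f32 toward -infinity, done on the bit pattern (finite inputs only)."""
    if f == 0.0:
        return f32_from_bits(0x80000001)  # smallest-magnitude negative denormal
    bits = f32_bits(f)
    return f32_from_bits(bits - 1 if f > 0.0 else bits + 1)

def cvt_rtn(a: float, new_check: bool, canonicalize_pred: bool = False) -> int:
    """Return the f32 bit pattern of the emulated round-toward-negative result."""
    fa = flush(f32_from_bits(f32_bits(a)))  # fptrunc with a flushed result
    r = fa
    if fa > a:  # nearest rounding went up, so step down one f32
        r = pred_f32(fa)
        if canonicalize_pred:
            r = flush(r)
    if new_check:
        r = fa if fa == 0.0 else r
    else:
        z = math.copysign(0.0, r)
        lo = float.fromhex("-0x1.fffffcp-127")
        r = z if lo <= a < F32_MIN_NORMAL else r
    return f32_bits(r)

for bits in (0x380FFFFFE0000000, 0xB80FFFFFDC000000):
    a = f64_from_bits(bits)
    old, new = cvt_rtn(a, False), cvt_rtn(a, True)
    old_c, new_c = cvt_rtn(a, False, True), cvt_rtn(a, True, True)
    print(f"{bits:#018x}: old={old:#010x} new={new:#010x} "
          f"canonicalized pred: old={old_c:#010x} new={new_c:#010x}")
```

Under these assumptions the two versions disagree on both inputs, in opposite directions, and agree once `canonicalize_pred=True`, which matches the alive2 results reported above.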

z1-cciauto · 2025-12-18T11:04:32Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/3332

arsenm requested a review from b-sumner as a code owner December 17, 2025 20:57

arsenm added the device-libs Related to Device Libraries label Dec 17, 2025

b-sumner reviewed Dec 17, 2025

arsenm force-pushed the device-libs/use-f32-compare-denorm-check branch from 86b77cb to 05dc045 Compare December 18, 2025 11:02
