Force overflow/underflow in generic std.math implementations#3386
andralex merged 1 commit into dlang:master
|
For reviewers' notes, it's these particular tests: 70e5e94 |
|
Maybe add a short comment about why CTFE is different? |
|
You mean that we don't care about ieeeFlags in CTFE because floating-point operations are done using a synthetic type? |
|
Yeah. |
|
std/math.d
Outdated
doesn't this multiply make things slow?
Slow how? This operation is deliberate because I didn't want the optimizer to fold the operation so that no hardware flag is raised.
std/math.d
Outdated
CTFE code returns +0.0, but runtime -0.0. This is a bad practice for math.
Please check other signs too.
Rather than copysign the parameters, passing the result literal should do for now.
|
So, apart from @9il's comments, what's halting progress? Does anyone have any alternative suggestion that would deliberately raise an FP exception without risk of being folded away by an optimizing compiler? |
1590222 to d0e2bf2
|
Re-based and addressed @9il's nit. |
|
A unittest looks important to me here because of optimization control. Off topic:
Google Translate makes me feel strange about "nit". |
The unittests already exist that test FP control flags, although they only test the exception flags after calling
Well, when you have wavy-long hair, these things can happen. :-) |
|
LGTM |
|
Can we now make it a template? |
|
ping |
Yes, it can be merged. |
|
Thanks. @andralex - dare I ask you for approval? |
    if (__ctfe)
        return real.infinity;
    else
        return real.max * copysign(real.max, real.infinity);
@ibuclaw Why all the calls to copysign()?
They don't appear to do anything, since the sign bit for both operands is always the same.
I answered this last year.
This operation is deliberate because I didn't want the optimizer to fold the operation so that no hardware flag is raised.
Essentially, I want runtime to execute this so that the hardware flag is raised. If CTFE does the job, it returns the answer with no hardware exception. Which defies the whole point of this PR.
If you have any alternatives, I'm all ears.
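For readers following along, here is a minimal sketch of the pattern under discussion, not the code that was merged. The helper name `forceOverflow` is hypothetical, and the sketch assumes a target on which `std.math` exposes `ieeeFlags` and `resetIeeeFlags`. It shows the CTFE branch returning the literal while the runtime branch performs a real multiplication; whether the hardware flag is actually observed still depends on the optimizer not folding that multiplication.

```d
import std.math : copysign, ieeeFlags, resetIeeeFlags;
import std.stdio : writeln;

// Hypothetical helper mirroring the shape of the reviewed snippet.
real forceOverflow()
{
    if (__ctfe)
        return real.infinity;   // CTFE has no hardware flags to raise
    else
        // Computed at run time so the FPU can signal overflow
        return real.max * copysign(real.max, real.infinity);
}

void main()
{
    // Forced compile-time evaluation takes the __ctfe branch.
    enum real atCompileTime = forceOverflow();
    static assert(atCompileTime == real.infinity);

    resetIeeeFlags();
    immutable real atRunTime = forceOverflow();
    assert(atRunTime == real.infinity);

    // Reported rather than asserted: an optimizer that folds the
    // multiplication would leave the flag clear.
    writeln("overflow flag set at run time: ", ieeeFlags.overflow);
}
```

Both paths return the same value; only the runtime path has a chance of leaving a trace in the floating-point status flags.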
This does the trick on all three compilers, and with better codegen than abusing pragma(inline, true)
real ieeeDoOverflow(bool neg) @safe nothrow @nogc
{
    // only x87 instructions set ieeeFlags
    RealRep!real big = void;
    big.mantBits = cast(big.Mant)1 << big.fracExDig;
    end!(-1)(big.allBits) = cast(big.Exp)((neg << big.signShift) | big.expRawMaxNorm);
    big += big;
    return big;
}
unittest
{
    resetIeeeFlags();
    assert(isIdentical(ieeeDoOverflow(false), real.infinity));
    assert(ieeeFlags.overflow);
    resetIeeeFlags();
    assert(isIdentical(ieeeDoOverflow(true), -real.infinity));
    assert(ieeeFlags.overflow);
}
pragma(inline, true)
real ieeeDoUnderflow(bool neg) @safe nothrow @nogc
{
    // only x87 instructions set ieeeFlags
    RealRep!real small = void;
    small.mantBits = cast(small.Mant)1 << small.fracExDig;
    end!(-1)(small.allBits) = cast(small.Exp)(neg << small.signShift);
    small *= real.min_normal;
    return small;
}
unittest
{
    resetIeeeFlags();
    real zero = 0;
    assert(isIdentical(ieeeDoUnderflow(false), zero));
    assert(ieeeFlags.underflow);
    resetIeeeFlags();
    zero = -zero;
    assert(isIdentical(ieeeDoUnderflow(true), zero));
    assert(ieeeFlags.underflow);
}
EDIT: Fixed. This would require my |
|
I don't think that would work either. It's not CTFE that is the problem, it's the optimizer. It will (or should) see through those operations in release mode. |
I fixed it. (The version you saw last worked on DMD, but I guess I forgot to retest it for GDC and LDC release mode before posting.) |
|
I'll pull, @ibuclaw feel free to improve by using @tsbockman's idea |
|
Auto-merge toggled on |
I can do that later, if |
|
Thanks. @tsbockman ok. I'll have a re-review when I get round to it. How much does it differ from the previous two attempts to move pointer casting into unions? |
I wrote an answer in the |
If an overflow or underflow occurred in one of exp, expm1, or exp2, we should return infinity or 0.0 computationally at runtime, forcing the overflow/underflow ieeeFlags to be set. At the same time, this still respects the fast path in CTFE.
This should fix the exp/exp2 and ieeeFlags unittests for non-x86 targets (though only tested on gdc x86_64).
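As an illustration of the behavior described above, the following sketch probes `exp` at run time and records the flags seen afterwards. `Outcome` and `probeExp` are hypothetical names introduced only for this example, and the sketch assumes a target where `std.math` provides `ieeeFlags`. The returned values are asserted; the flags are only printed, since their state depends on the target and on how aggressively the compiler optimizes.

```d
import std.math : exp, ieeeFlags, resetIeeeFlags;
import std.stdio : writeln;

// Hypothetical container for the result and the flags observed after the call.
struct Outcome
{
    real value;
    bool overflow;
    bool underflow;
}

// Hypothetical helper: clear the flags, evaluate exp(x), then read the flags.
Outcome probeExp(real x)
{
    resetIeeeFlags();
    immutable real y = exp(x);
    auto flags = ieeeFlags;
    return Outcome(y, flags.overflow, flags.underflow);
}

void main()
{
    auto high = probeExp(1.0e5L);    // far beyond the largest finite exponent
    auto low = probeExp(-1.0e5L);    // far below the smallest subnormal

    assert(high.value == real.infinity);
    assert(low.value == 0.0L);

    writeln("overflow flag after exp(1e5):   ", high.overflow);
    writeln("underflow flag after exp(-1e5): ", low.underflow);
}
```

With the change described in this PR, the expectation is that both flags print `true` at run time, while a compile-time evaluation of the same calls simply yields infinity and 0.0.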