Skip to content

Force overflow/underflow in generic std.math implementations#3386

Merged
andralex merged 1 commit intodlang:masterfrom
ibuclaw:overunderflow
Jun 18, 2016
Merged

Force overflow/underflow in generic std.math implementations#3386
andralex merged 1 commit intodlang:masterfrom
ibuclaw:overunderflow

Conversation

@ibuclaw
Copy link
Member

@ibuclaw ibuclaw commented Jun 7, 2015

If an overflow or underflow occurred in one of exp, expm1, or exp2 we should return infinity or 0.0 computationally at runtime, forcing overflow/underflow ieeeFlags to be set. At the same time, this still respects the fast path in CTFE.

This should fix the exp/exp2 and ieeeFlags unittests for non-x86 targets (though only tested on gdc x86_64).

@ibuclaw
Copy link
Member Author

ibuclaw commented Jun 7, 2015

For reviewers notes, it's these particular tests: 70e5e94

@dnadlinger
Copy link
Contributor

Maybe add a short comment about why CTFE is different?

@ibuclaw
Copy link
Member Author

ibuclaw commented Jun 7, 2015

You mean, that we don't care about ieeeFlags in CTFE because floats operations are done using a synthetic type?

@dnadlinger
Copy link
Contributor

Yeah.

@ibuclaw
Copy link
Member Author

ibuclaw commented Jun 7, 2015

Special cases. Raises an overflow or underflow flag accordingly,
except in the case for CTFE, where there are no hardware controls.

std/math.d Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't this multiply make things slow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slow how? This operation is deliberate because I didn't 't want the optimizer to fold the operation so that nohardware flag is raised.

std/math.d Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CTFE code returns +0.0, but runtime -0.0. This is a bad practice for math.
Please check other signs too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than copysign the parameters, passing the result literal should do for now.

@ibuclaw
Copy link
Member Author

ibuclaw commented Nov 18, 2015

So, apart from @9il comments what's halting progress?

Does anyone have any alternative suggestion that would deliberately raise a fp exceptionwithout risk of being folded away by an optimizing compiler?

@ibuclaw
Copy link
Member Author

ibuclaw commented Nov 29, 2015

Re-based and addressed @9il's nit.

@9il
Copy link
Member

9il commented Nov 29, 2015

Unittest looks important to me because optimization control.

Off topic:

addressed @9il's nit

Google translate makes me feel strange because nit.

@ibuclaw
Copy link
Member Author

ibuclaw commented Nov 29, 2015

Unittest looks important to me because optimization control.

The unittests already exist that test FP control flags, although they only test the exception flags after calling exp2()

Google translate makes me feel strange because nit.

Well, when you have wavy-long hair, these things can happen. :-)

@9il
Copy link
Member

9il commented Nov 29, 2015

LGTM

@9il
Copy link
Member

9il commented Dec 3, 2015

Can we now make it template?

@quickfur
Copy link
Member

ping
What's happening with this PR? Are we ready to merge, or is this waiting for something else to happen first?

@9il
Copy link
Member

9il commented Feb 20, 2016

ping
What's happening with this PR? Are we ready to merge, or is this waiting for something else to happen first?

Yes, it can be merged.

@ibuclaw
Copy link
Member Author

ibuclaw commented Feb 20, 2016

Thanks. @andralex - dare I ask you for approval?

if (__ctfe)
return real.infinity;
else
return real.max * copysign(real.max, real.infinity);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ibuclaw Why all the calls to copysign()?

They don't appear to do anything, since the sign bit for both operands is always the same.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I answered this last year.

This operation is deliberate because I didn't want the optimizer to fold the operation so that no hardware flag is raised.

Essentially, I want runtime to execute this so that the hardware flag is raised. If CTFE does the job, it returns the answer with no hardware exception. Which defies the whole point of this PR.

If you have any alternatives, I'm all ears.

@tsbockman
Copy link
Contributor

tsbockman commented May 21, 2016

@ibuclaw

If you have any alternatives, I'm all ears.

This does the trick on all three compilers, and with better codegen than abusing copysign():

pragma(inline, true)
real ieeeDoOverflow(bool neg) @safe nothrow @nogc
{
    // only x87 instructions set ieeeFlags
    RealRep!real big = void;
    big.mantBits = cast(big.Mant)1 << big.fracExDig;
    end!(-1)(big.allBits) = cast(big.Exp)((neg << big.signShift) | big.expRawMaxNorm);

    big += big;
    return big;
}

unittest
{
    resetIeeeFlags();
    assert(isIdentical(ieeeDoOverflow(false), real.infinity));
    assert(ieeeFlags.overflow);

    resetIeeeFlags();
    assert(isIdentical(ieeeDoOverflow(true), -real.infinity));
    assert(ieeeFlags.overflow);
}

pragma(inline, true)
real ieeeDoUnderflow(bool neg) @safe nothrow @nogc
{
    // only x87 instructions set ieeeFlags
    RealRep!real small = void;
    small.mantBits = cast(small.Mant)1 << small.fracExDig;
    end!(-1)(small.allBits) = cast(small.Exp)(neg << small.signShift);

    small *= real.min_normal;
    return small;
}

unittest
{
    resetIeeeFlags();
    real zero = 0;
    assert(isIdentical(ieeeDoUnderflow(false), zero));
    assert(ieeeFlags.underflow);

    resetIeeeFlags();
    zero = -zero;
    assert(isIdentical(ieeeDoUnderflow(true),  zero));
    assert(ieeeFlags.underflow);
}

EDIT: Fixed. This would require my RealRep PR, though.

@ibuclaw
Copy link
Member Author

ibuclaw commented May 21, 2016

I don't think that would work either. It's not CTFE that is the problem, it's the optimizer. It will (or should) see through those operations in release mode.

@tsbockman
Copy link
Contributor

@ibuclaw

I don't think that would work either.

I fixed it. (The version you saw last worked on DMD, but I guess I forgot to retest it for GDC and LDC release mode before posting.)

@andralex
Copy link
Member

I'll pull, @ibuclaw feel free to improve by using @tsbockman's idea

@andralex
Copy link
Member

Auto-merge toggled on

@andralex andralex merged commit c0aa14d into dlang:master Jun 18, 2016
@tsbockman
Copy link
Contributor

feel free to improve by using @tsbockman's idea

I can do that later, if RealRep is accepted.

@ibuclaw
Copy link
Member Author

ibuclaw commented Jun 19, 2016

Thanks. @tsbockman ok. I'll have a re-review when I get round to it. How much does it differ from the previous two attempts to move pointer casting into unions?

@tsbockman
Copy link
Contributor

@ibuclaw

How much does it differ from the previous two attempts to move pointer casting into unions?

I wrote an answer in the RealRep comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants