
Revert "std.math: proper support for 64-bit real in exp()" #3483

Closed
ibuclaw wants to merge 1 commit into master from revert-3285-exp

Conversation

@ibuclaw (Member) commented Jul 10, 2015

Reverts #3285

Because no one has responded to any of my complaints or concerns for over a month. @kinke @WalterBright @MartinNowak

@ibuclaw (Member Author):

To recap what I've already raised, for the third time:

This line is not using the RealFormat enum; it should be F.realFormat == RealFormat.ieeeExtended, and likewise the static if below (F.realFormat == RealFormat.ieeeDouble).

Contributor:

Don't you think performing these 2 changes in a proper commit would be just as much work as this revert+description?

@ibuclaw (Member Author):

I raised a PR (#3388) a month ago; no one responded.

@MartinNowak (Member)

Revert or fix (#3514), your decision.

@schveiguy (Member)

Is this PR still necessary? I don't understand the code enough to tell.

@kinke (Contributor) commented Nov 16, 2015

The RealFormat enum is now used due to #3514. I've run into these tests again recently while trying to enable LLVM math intrinsics for LDC. I'll re-run the tests when I'm home and see if the D version of std.math.exp(-0x1.64p+9L) for 64-bit reals on Win64 still yields the same result diverging from GDC (@ibuclaw: how exactly did you produce your result?).

@ibuclaw (Member Author) commented Nov 16, 2015

(@ibuclaw: how exactly did you produce your result?).

Compile with -mlong-double-64, any platform such as Linux x86 will work.

@kinke (Contributor) commented Nov 16, 2015

I still get 0x0.06f84920bb2d3p-1022 for std.math.exp(). core.stdc.math.exp() (VS 2015 runtime) on the other hand yields 0x0.06f84920bb2d4p-1022, i.e., Iain's result. Maybe it's just one of the many non-hex literals which gets parsed slightly differently by LDC (I think we use MS strtod() for that).
The LLVM intrinsic (most likely just using the MS C implementation) also yields ...4.

@ibuclaw (Member Author) commented Nov 16, 2015

@kinke - How does std.math.exp return doubles vs. core.stdc.math.exp? Are you forcing a different ABI for extern(D) in LLVM?

@ibuclaw (Member Author) commented Nov 16, 2015

BTW: The limits/values I have for real == double are:

enum real C1 = 6.93145751953125E-1;
enum real C2 = 1.42860682030941723212E-6;

enum real OF = 7.09782712893383996843E2;   // log(2 ^^ 1024)
enum real UF = -7.08396418532264106224E2;  // log(2 ^^ -1022)

The first two shouldn't make a difference, as the 80-bit literals should round correctly to them.
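For illustration, the two constants above look like a classic Cody-Waite split of ln 2: C1 is exactly representable in only 15 mantissa bits (so multiples of it can be subtracted from x without rounding error during argument reduction), and C2 carries the remainder. A minimal sketch checking that, assuming nothing beyond the constants quoted above:

```python
import math

# The two constants above: apparently a Cody-Waite split of ln 2.
C1 = 6.93145751953125E-1
C2 = 1.42860682030941723212E-6

# C1 is exactly 22713 / 2^15, i.e. representable in just 15 mantissa bits,
# so n*C1 is exact during argument reduction.
assert C1 * 32768 == 22713.0

# Together the pair approximates ln 2 to (beyond) double precision.
assert abs((C1 + C2) - math.log(2.0)) < 1e-15
print("C1 + C2 =", C1 + C2, " ln 2 =", math.log(2.0))
```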

@kinke (Contributor) commented Nov 16, 2015

The 2 latter literals make a difference for me - I get 0 when using them. ;)
UF is seriously different (-7.08396418532264106224E2 vs. -7.451332191019412076235E2). Seems like your limit doesn't allow subnormals at all.

I don't know how you handle the details in GDC with -mlong-double-64. For Win64, given Microsoft's lack of x87 support in their runtime (64-bit long double), we (LDC) opted to go for native 64-bit reals. So even the literals are parsed as 64-bit doubles by a Win64 LDC build (compiled by MSVC), and all real computations (incl. CTFE!) are obviously performed in 64 bits. So core.stdc.math.exp() and core.stdc.math.expl() are equivalent and yield a double. The only difference for real with extern(D) is the mangling.

@ibuclaw (Member Author) commented Nov 16, 2015

The 2 latter literals make a difference for me - I get 0 when using them. ;)
UF is seriously different (-7.08396418532264106224E2 vs. -7.451332191019412076235E2). Seems like your limit doesn't allow subnormals at all.

Yes, you are correct! That serves me right for assuming the documentation is correct. :-D

Actual limits for real == double:

enum real OF =  7.09782712893383996732E2;    // log(double.max)
enum real UF = -7.451332191019412076235E2;   // log(2**-1075)

Sorry about that.
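A minimal sketch of where these corrected limits come from, assuming standard IEEE binary64: OF is log(double.max), and UF is log(2^-1075), below which even the smallest subnormal (2^-1074) can no longer be reached after rounding:

```python
import math
import sys

# Overflow limit: exp(x) overflows a double once x exceeds log(double.max).
OF = math.log(sys.float_info.max)

# Underflow limit: log(2^-1075). Below it, even the smallest subnormal
# double (2^-1074) rounds away, so exp(x) == 0.
UF = -1075 * math.log(2.0)

assert abs(OF - 7.09782712893383996732E2) < 1e-9
assert abs(UF - -7.451332191019412076235E2) < 1e-9
assert math.exp(UF - 1.0) == 0.0   # hard underflow to zero

try:                               # past OF, exp() overflows
    math.exp(OF + 1.0)
    overflowed = False
except OverflowError:
    overflowed = True
assert overflowed
```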

@ibuclaw (Member Author) commented Nov 16, 2015

we (LDC) opted to go for native 64-bit reals. So even the literals are parsed as 64-bit doubles by a Win64 LDC build (compiled by MSVC), and all real computations (incl. CTFE!) are obviously performed in 64 bits.

Yes, I seem to have forgotten that we are talking about doubles. Well, it's been rare, but I have seen cases where returning off the stack does one thing and returning in registers does another. Just wanted to make sure I wasn't seeing the same thing again. :-)

@ibuclaw (Member Author) commented Nov 16, 2015

I don't know how you handle the details in GDC with -mlong-double-64.

Should be identical to using double. Perhaps a better or alternate platform to test would be ARM?

@kinke (Contributor) commented Nov 16, 2015

Yep, ARM results would be nice (for both GDC and LDC). But I'm pretty sure ...4 is correct, given that MS returns it too. I'm just wondering where the divergence comes from (still present with your actual constants).

@kinke (Contributor) commented Nov 16, 2015

Btw, what about improving feqrel() for subnormals?
In this case, we have a diff of 1 subnormal unit (2^-1022, min exponent), but as the step is from 3=011b to 4=100b, the 3 least significant mantissa bits differ. If the step was from 7=0111b to 8=1000b, the last 4 bits would differ, you get the idea.

We always use assert(feqrel(a, b) >= T.mant_dig - maxDifferingBits). Imho, a

bool feqrel(T)(T a, T b, int maxDifferingBits);

would be more convenient. And with that vital maxDifferingBits information, we could then treat subnormals differently. E.g., we could compute the absolute mantissa difference d and then simply return (d >> maxDifferingBits) == 0.
What do you think?

Edit: Oh feqrel() is currently not that bad for subnormals, i.e., it seems to do the above, but for some reason subtracts 1 from the intuitive result (double.mant_dig == 53):

feqrel(0x0.06f84920bb2d3p-1022, 0x0.06f84920bb2d4p-1022); //  1=    1b => returns 51
feqrel(0x0.06fffffffffffp-1022, 0x0.0700000000007p-1022); //  8= 1000b => returns 48
feqrel(0x0.06fffffffffffp-1022, 0x0.070000000000ep-1022); // 15= 1111b => returns 48
feqrel(0x0.06fffffffffffp-1022, 0x0.070000000000fp-1022); // 16=10000b => returns 47

This also means that the test would currently pass on Win64 LDC anyway when using 0x0.06f84920bb2d4p-1022L.
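The proposal and the observed behaviour above can be sketched in a few lines of Python. The helper names (feqrel_within, observed_feqrel_subnormal) are made up for illustration, and the trailing "- 1" merely models the puzzling extra bit feqrel() subtracts for subnormals, rather than anything specified:

```python
import struct

def dbits(x: float) -> int:
    """Raw IEEE-754 bit pattern of a double."""
    return struct.unpack('<Q', struct.pack('<d', x))[0]

def feqrel_within(a: float, b: float, max_differing_bits: int) -> bool:
    """Sketch of the proposed predicate: for two same-sign subnormals the
    bit patterns are consecutive integers, so the absolute difference d of
    the patterns is the mantissa difference; accept iff d fits in
    max_differing_bits bits."""
    d = abs(dbits(a) - dbits(b))
    return (d >> max_differing_bits) == 0

def observed_feqrel_subnormal(a: float, b: float, mant_dig: int = 53) -> int:
    """Model of what feqrel() apparently returns for subnormal doubles:
    mant_dig minus the width of the mantissa difference, minus one extra
    bit (the '- 1' noted above)."""
    d = abs(dbits(a) - dbits(b))
    return mant_dig - d.bit_length() - 1

# Reproduce the four observations from the comment above:
f = float.fromhex
assert observed_feqrel_subnormal(f('0x0.06f84920bb2d3p-1022'), f('0x0.06f84920bb2d4p-1022')) == 51
assert observed_feqrel_subnormal(f('0x0.06fffffffffffp-1022'), f('0x0.0700000000007p-1022')) == 48
assert observed_feqrel_subnormal(f('0x0.06fffffffffffp-1022'), f('0x0.070000000000ep-1022')) == 48
assert observed_feqrel_subnormal(f('0x0.06fffffffffffp-1022'), f('0x0.070000000000fp-1022')) == 47
```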

@ibuclaw (Member Author) commented Nov 17, 2015

I'm just wondering where the divergence comes from (still present with your actual limits).

Yeah, just making sure that it's not a value that's out of bounds from what it is meant to handle.

I'll see if I have an ARM virtual machine lingering around...

@ibuclaw (Member Author) commented Nov 17, 2015

I needed to apply #3386

ARM passes the tests. Hmm...

@kinke (Contributor) commented Nov 17, 2015

ARM passes the tests.

Why not? If it returns ...4 instead of the expected ...3, feqrel() returns 51 = real.mant_dig - 2, and the assert isn't triggered.
Your pull #3386 will come in handy; I've simply disabled the IEEE flags checks for LDC and non-x87 reals. ;)

@ibuclaw (Member Author) commented Nov 18, 2015

Why not? If it returns ...4 instead of the expected ...3, feqrel() returns 51 = real.mant_dig - 2, and the assert isn't triggered.

The 'hmmm' was because on x86 using doubles it asserts. Using x86 with reals asserts for me too; I'll put together a full report table on the matter.

@ibuclaw (Member Author) commented Nov 18, 2015

After some more testing, it seems that -mlong-double-64 is fundamentally broken in my environment at the moment. Strange, because it wasn't before. :-)

Oh well, here are the feqrel results for ARM and x86/64. Just a point of note: rewriting real exp(real) into double exp(real) and running the unittests on x86 reveals the same precision as ARM.

Input                          ARM   ARM (libm)   x86/64   x86/64 (libm)
1.0                             52       53          64         64
0.5                             53       53          64         64
3.0                             52       52          63         64
0x1.1p+13 / 0x1.6p+9            53       53          52         52
0x1.7p+13 / 0x1.7p+9            53       53          64         64
0x1p+80                         53       53          64         64
real.infinity                   53       53          64         64
-0x1.18p+13 / -0x1.6p+9         53       53          51         51
-0x1.625p+13                   N/A      N/A          53         53
-0x1.62dafp+13 / -0x1.64p+9     51       51          54         54
-0x1.643p+13 / -0x1.743p+9      53       53          64         64
-0x1.645p+13 / -0x1.8p+9        53       53          64         64
-0x1p30                         53       53          64         64

@kinke (Contributor) commented Nov 18, 2015

So the only minor divergences of your ARM results from my Win64 results are for 1.0 and 3.0 inputs (involving the E constant) as well as that -0x1.64p+9. Your exp(1.0L) != E is particularly interesting.

re-writing real exp(real) into double exp(real) and running the unittests on x86 reveals the same precision as ARM.

How can that be? So you only changed the return type from real to double, casting the 80-bit result down to 64-bit? I'll create a table for LDC (x86 Win64 with 64-bit reals, and x86 Linux with 80-bit reals) too once I'm home. Also note that libm's expl() yields quite different results; see ldc-developers@07a11cf#commitcomment-14404644 (where I tried to enable the llvm.exp intrinsic, which seems to just forward to the libm C implementation).

Edit: My x86 libm results correspond almost perfectly to your x86 results. The only difference is for input 3.0, for which you get 63 and I got 64. So I guess you're not using the Phobos implementation (x87 inline assembly).

@ibuclaw (Member Author) commented Nov 18, 2015

casting the 80-bit result down to 64-bit?

No, by using the 64-bit branch explicitly.

@ibuclaw (Member Author) commented Nov 18, 2015

So I guess you're not using the Phobos implementation (x87 inline assembly).

Correct. GDC doesn't support DMD-style inline assembly. I'm testing the generic version, which shows that we can only guarantee double precision in the near overflow/underflow tests.

@kinke (Contributor) commented Nov 18, 2015

I'm pretty sure it's the other way around: the expected values for 80-bit reals seem crappy. If using DMD-style inline assembly, std.math.exp(real) is implemented as exp2(LOG2E*x), and exp2() uses x87 assembly... I'm currently gathering LDC results.
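The suspicion is easy to check: the product LOG2E*x is rounded to machine precision before exp2() ever sees it, so exp2(LOG2E*x) targets a slightly wrong argument no matter how exact exp2() itself is. A minimal sketch using the stdlib decimal module as a high-precision reference; LOG2E here is just 1/ln 2 in double precision standing in for the real constant, and the input is the contentious one from the tables:

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 40                 # plenty of guard digits

LOG2E = 1.0 / math.log(2.0)            # a double approximation of log2(e)
x = float.fromhex('-0x1.64p+9')        # the contentious input, -712.0

y = LOG2E * x                          # second rounding: information lost here
exact    = Decimal(x).exp()            # high-precision exp(x)
via_exp2 = Decimal(2) ** Decimal(y)    # the value exp2(LOG2E * x) aims for

# The rounded argument alone already shifts the target by ~1e-13 relative,
# so an exp() built on exp2(LOG2E * x) starts from a slightly wrong place.
rel_err = abs((via_exp2 - exact) / exact)
assert Decimal(0) < rel_err < Decimal('1e-10')
print("relative error of the exp2-based route:", rel_err)
```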

@dnadlinger (Contributor)

It's probably obvious to you anyway, but I'd not trust the x87 results at all; various functions are known to contain bugs. Arbitrary-precision libraries would probably be the better reference, and IIRC libm was quite good as well.

@kinke (Contributor) commented Nov 18, 2015

LDC results for std.math.exp(real x), feqrel() results in [brackets]:

1. x86, 64-bit real:

x             expected                   Generic D version               LLVM intrinsic (MSVC++)
0x1.000p+0    0x1.5bf0a8b145769p+1       0x1.5bf0a8b14576ap+1 [52]       0x1.5bf0a8b145769p+1 [53]
0x1.000p-1    0x1.a61298e1e069cp+0       0x1.a61298e1e069cp+0 [53]       0x1.a61298e1e069cp+0 [53]
0x1.800p+1    0x1.415e5bf6fb105p+4       0x1.415e5bf6fb106p+4 [52]       0x1.415e5bf6fb106p+4 [52]
0x1.600p+9    0x1.93bf4ec282efbp+1015    0x1.93bf4ec282efbp+1015 [53]    0x1.93bf4ec282efbp+1015 [53]
0x1.700p+9    inf                        inf [53]                        inf [53]
0x1.000p+80   inf                        inf [53]                        inf [53]
inf           inf                        inf [53]                        inf [53]
-0x1.600p+9   0x1.44a3824e5285fp-1016    0x1.44a3824e5285fp-1016 [53]    0x1.44a3824e5285fp-1016 [53]
-0x1.640p+9   0x0.06f84920bb2d3p-1022    0x0.06f84920bb2d3p-1022 [53]    0x0.06f84920bb2d4p-1022 [51]
-0x1.743p+9   0x0.0000000000001p-1022    0x0.0000000000001p-1022 [53]    0x0.0000000000001p-1022 [53]
-0x1.800p+9   0x0p+0                     0x0p+0 [53]                     0x0p+0 [53]
-0x1.000p+30  0x0p+0                     0x0p+0 [53]                     0x0p+0 [53]

2. x86, 80-bit real (the version using DMD-style inline assembly always returns the expected value):

x               expected                     Generic D version                 LLVM intrinsic (libm)
0x8p-3          0xa.df85458a2bb4a9bp-2       0xa.df85458a2bb4a9bp-2 [64]       0xa.df85458a2bb4a9bp-2 [64]
0x8p-4          0xd.3094c70f034de4cp-3       0xd.3094c70f034de4cp-3 [64]       0xd.3094c70f034de4cp-3 [64]
0xcp-2          0xa.0af2dfb7d882f97p+1       0xa.0af2dfb7d882f96p+1 [63]       0xa.0af2dfb7d882f97p+1 [64]
0x8.8p+10       0x9.4d77ff7e4763228p+12554   0x9.4d77ff7e47624b3p+12554 [52]   0x9.4d77ff7e47624b3p+12554 [52]
0xb.8p+10       inf                          inf [64]                          inf [64]
0x8p+77         inf                          inf [64]                          inf [64]
inf             inf                          inf [64]                          inf [64]
-0x8.cp+10      0xa.f25faa5a4036dc8p-12930   0xa.f25faa5a40381a7p-12930 [51]   0xa.f25faa5a40381a8p-12930 [51]
-0xb.128p+10    0xd.35eb451ce88f9aep-16361   0xd.35eb451ce88ff62p-16361 [53]   0xd.35eb451ce88ff62p-16361 [53]
-0xb.16d78p+10  0x6.5b14f4c09dc0874p-16385   0x6.5b14f4c09dc076ap-16385 [54]   0x6.5b14f4c09dc076ap-16385 [54]
-0xb.218p+10    0x0.000000000000002p-16385   0x0.000000000000002p-16385 [64]   0x0.000000000000002p-16385 [64]
-0xb.228p+10    0x0p+0                       0x0p+0 [64]                       0x0p+0 [64]
-0x8p+27        0x0p+0                       0x0p+0 [64]                       0x0p+0 [64]

@ibuclaw (Member Author) commented Nov 18, 2015

Updated my table with results for libm tests (slash GCC builtins).

@ibuclaw (Member Author) commented Nov 18, 2015

0x0.06f84920bb2d4p-1022 [51]

OK, so my ARM tests, libm, and llvm intrinsics all seem to agree on this. So shall we just alter the expected value to 0x0.06f84920bb2d4p-1022L - meaning that Win64 will be the odd one out?

@kinke (Contributor) commented Nov 18, 2015

Yep, perfectly fine with that.

But more importantly, what about the 80-bit reference results? There are two options:

  • As they are imprecise, allow for up to 13 diverging mantissa bits.
  • Use proper ones (libm/Wolfram), as all non-assembly versions pretty much agree on them as well. That means that exp() would need to be fixed, most likely by just eliminating the special case for x87 and using the generic version instead.

The latter will probably result in more failing assertions. And obviously require your pull for the IEEE flags.

@ibuclaw (Member Author) commented Nov 18, 2015

That means that exp() would need to be fixed, most likely by just eliminating the special case for x87 and using the generic version instead.

Well, what I read from it is that it's not clear how accurate the x87 test/expected pairs are. If libm disagrees with a divergence of up to 13 mantissa bits, maybe they are just plain wrong?

@kinke (Contributor) commented Nov 18, 2015

maybe they are just plain wrong?

Of course they are. I bet they've just been obtained by running the current implementation, which multiplies the value and feeds exp2() with it, which can only be an approximation.
Ironically, Walter wouldn't let me use a limit of 13 bits in my original pull (as preparation for LLVM intrinsics) because he likes to keep the tests tight. If only they were testing for correct results ;) - but I didn't go for the underlying issue back then.

@ibuclaw (Member Author) commented Nov 19, 2015

Yep, perfectly fine with that.

Actually, what's the result of feqrel with 0x0.06f84920bb2d4p-1022 on Win64? If it's less than 51, then it might not be worth it.

@kinke (Contributor) commented Nov 19, 2015

I've replied to your new pull. Given the diff of one subnormal unit, feqrel() returns real.mant_dig - 2 = 51 (feqrel() subtracts an additional bit for subnormals, no idea why).

@ibuclaw (Member Author) commented Nov 19, 2015

I have raised a new bug: https://issues.dlang.org/show_bug.cgi?id=15365

So I guess this is good for closure.

@ibuclaw ibuclaw closed this Nov 19, 2015
@ibuclaw ibuclaw deleted the revert-3285-exp branch November 19, 2015 20:48