Begin replacing floatTraits pointer math with RealRep union by tsbockman · Pull Request #4336 · dlang/phobos

tsbockman · 2016-05-18T01:00:51Z

Pros:

Prepare the way for CTFE floating-point math by switching from explicit pointer reinterpretation to a union.
union code is easier to read and understand.
union code is also easier to write, thus freeing mental energy for performance optimizations.
Consolidate tons of near-duplicate code-paths into (usually) two: one for ibmExtended, and one for everything else.
Adding support for new IEEE-style floating-point types (such as half or quad) gets easier as more of std.math is converted to use RealRep. (ucent would help here, though.)
By the time all of std.math has been converted, I expect that RealRep will, on net, have reduced the total line count substantially.
It may be possible to replace some of the inline assembler with RealRep code; the codegen seems to be very clean, for the most part.

Cons:

Many large diffs (starting with this one) will be required to complete the conversion.
std.math will become substantially more dependent upon inlining for good performance.
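To make the readability argument concrete, here is a minimal sketch contrasting the two styles for a `float`-only NaN test. The names `FloatBits`, `isNaNPointer`, and `isNaNUnion` are hypothetical and chosen for illustration; the actual `RealRep` in this PR is generic over the floating-point type and is laid out differently.

```d
// Hypothetical illustration only; this is not the RealRep from the PR.
union FloatBits
{
    float value;
    uint bits;   // the same 32 bits, viewed as an integer
}

// Old style: reinterpret the storage through an explicit pointer cast.
bool isNaNPointer(float x)
{
    const uint bits = *cast(uint*)&x;
    // NaN: exponent field all ones and a non-zero fraction.
    return (bits & 0x7FFF_FFFF) > 0x7F80_0000;
}

// New style: write the float member, read the integer member.
bool isNaNUnion(float x)
{
    FloatBits r;
    r.value = x;
    return (r.bits & 0x7FFF_FFFF) > 0x7F80_0000;
}
```

Both functions perform the same integer comparison; only the way the bits are obtained differs.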

wilzbach · 2016-05-20T08:40:41Z

std/math.d

+   TODO: Make this publically available through std.bitmanip once the API
+   is stable and tested.
+*/
+union RealRep(N)


Should be at least package as long as it's not exposed publicly?

That whole section was already marked package, starting on line 214.

tsbockman · 2016-05-20T09:18:50Z

I have now tuned this PR to produce the best assembly code that I can.

I optimized to minimize memory access, instruction count, branches, and embedded constant size, in about that order. (Note that the runtime code affected by this PR is all bitwise integer ops, so there are no slow, precision-destroying FPU ops to worry about.)

Here is a summary of the net change. -X ins means the union version is X instructions shorter. -Y mem means that it contains Y fewer memory accesses:

| Function:bits | DMD64 | DMD32 | LDC64 | LDC32 | GDC64 |
|---|---|---|---|---|---|
| isNaN:80 | -2 ins | -8 ins, -6 mem | 0 | 0 | 0 |
| isNaN:64 | -5 ins, -2 mem | -10 ins, -5 mem | 0 | 0 | +1 ins |
| isNaN:32 | 0 | 0 | 0 | 0 | +2 ins |
| isFinite:80 | -1 ins | -1 ins | 0 | 0 | 0 |
| isFinite:64 | 0 | -1 ins | 0 | 0 | -2 mem |
| isFinite:32 | -1 ins | -1 ins | 0 | 0 | -2 mem |
| isInfinity:80 | -4 ins | -14 ins, -6 mem | 0 | 0 | 0 |
| isInfinity:64 | -1 ins | 0 | 0 | 0 | -1 ins |
| isInfinity:32 | 0 | 0 | 0 | 0 | 0 |
| isNormal:80 | -1 ins | -1 ins | +1 ins | +1 ins | -1 ins |
| isNormal:64 | -1 ins | -1 ins | +1 ins | +1 ins | -1 ins, -2 mem |
| isNormal:32 | 0 | -1 ins | +1 ins | +1 ins | -1 ins, -2 mem |
| isSubnormal:80 | 0 | 0 | 0 | -1 ins | 0 |
| isSubnormal:64 | -1 ins, -1 mem | +2 ins | -4 ins | -4 ins | -1 ins, -3 mem |
| isSubnormal:32 | +1 ins | +1 ins | 0 | 0 | +3 ins |
| signbit:80 | -1 ins | 0 | 0 | 0 | 0 |
| signbit:64 | +1 ins | 0 | 0 | 0 | -2 mem |
| signbit:32 | -1 ins | 0 | 0 | 0 | -1 ins, -2 mem |
| isIdentical:80 | 0 | 0 | 0 | 0 | +4 ins, +4 mem |
| isIdentical:64 | -15 ins, -12 mem | -15 ins, -8 mem | -13 ins, -10 mem | -15 ins, -6 mem | -13 ins, -10 mem |
| isIdentical:32 | -14 ins, -10 mem | -18 ins, -10 mem | -13 ins, -10 mem | -18 ins, -8 mem | -13 ins, -10 mem |
| copysign:80 | +2 ins | +3 ins | | | 0 |
| copysign:64 | +3 ins, +1 mem | +7 ins, +5 mem | | | +5 ins, -4 mem |
| copysign:32 | +3 ins, +1 mem | +2 ins | | | +5 ins, -4 mem |

Overall, I expect the new union version to be at least as fast as the old one. I do wish I could figure out why DMD doesn't like my copysign() implementation, though, particularly for double on 32-bit.

EDIT: LDC uses the llvm_copysign() intrinsic for copysign(); that's why my version compared poorly. I have removed it from the comparison, since there is no need for the inferior RealRep version on that platform.
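For readers unfamiliar with how a bitwise copysign works, the following is a minimal sketch for `double` only. It is not the implementation from this PR: `DoubleBits` and `copysignBits` are hypothetical names, and the code assumes the usual IEEE 754 binary64 layout with the sign in bit 63.

```d
// Hypothetical illustration only; not the PR's copysign().
union DoubleBits
{
    double value;
    ulong bits;   // the same 64 bits, viewed as an integer
}

// Returns `to` with its sign bit replaced by the sign bit of `from`.
double copysignBits(double to, double from)
{
    enum ulong signMask = 1UL << 63;
    DoubleBits t, f;
    t.value = to;
    f.value = from;
    // Keep magnitude bits of `to`, take only the sign bit of `from`.
    t.bits = (t.bits & ~signMask) | (f.bits & signMask);
    return t.value;
}
```

The whole operation is two masks and an OR on integers, which is why the instruction counts in the table are small and why a compiler intrinsic can easily do better.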

tsbockman · 2016-05-30T15:09:21Z

Ping @9il and @ibuclaw . I know this is a lot to review, but can you guys at least take a first look and let me know if this is worth pursuing?

I want to start working on follow-up pull requests to convert the rest of the pointer-based code in std.math, but don't want to waste my time if my whole approach is unacceptable for some reason.

tsbockman · 2016-06-16T18:35:17Z

Converting std.math to use unions is a large project, and I don't want it hanging over my head indefinitely. Since no one will tell me whether this is wanted or not, I'm going to close it now.

I might reopen if someone asks me to, soon.

WalterBright · 2016-06-17T00:02:46Z

There is good work in here. Please do not close.

UplinkCoder · 2017-05-17T03:38:00Z

std/math.d

+            alias Mant;
+        +/
+
+        static if (N.mant_dig == 11 && N.max_exp == 0x6)


please use decimals for N.max_exp also

I used hexadecimal for N.max_exp because for all other formats, N.max_exp == (1 << (expDig - 1)) and hexadecimal makes it easy to verify at a glance that the numbers are actually correct.

In fact... this one should be (1 << (expDig - 1)) == 0x10, also; I don't know where I got 0x6 from. I'll fix that now.

fair enough
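The relationship `N.max_exp == (1 << (expDig - 1))` discussed above can be checked at compile time. The sketch below assumes the standard IEEE 754 exponent widths (5 bits for half, 8 for float, 11 for double); `expectedMaxExp` is a hypothetical helper, not part of the PR.

```d
// Exponent widths below are the standard IEEE 754 values.
static assert(float.max_exp  == (1 << (8 - 1)));    // 0x80
static assert(double.max_exp == (1 << (11 - 1)));   // 0x400
static assert((1 << (5 - 1)) == 0x10);              // half: 5 exponent bits

// Only meaningful where real is the x87 80-bit format (15 exponent bits).
static if (real.mant_dig == 64)
    static assert(real.max_exp == 0x4000);

// Hypothetical helper expressing the same rule as a function.
int expectedMaxExp(int expDig)
{
    return 1 << (expDig - 1);
}
```

Written in hexadecimal, each expected value has a single set bit, which is what makes the constants easy to verify at a glance.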

UplinkCoder · 2017-05-17T03:38:18Z

std/math.d

+                 expDig = 11;
+            alias Mant = ulong;
+        }
+        else static if (N.mant_dig == 53 && N.max_exp == 0x4000)


ditto

tsbockman · 2017-05-17T14:58:11Z

On the forum, Simen Kjærås questioned the need for a separate ibmExtended code path.

While I don't think it can be consolidated with the others, I am wondering if we should just drop it completely. What do you guys think, @ibuclaw and @kinke ? Is ibmExtended something D needs to support, or should we mandate that all floating-point formats be IEEE-like?

ibuclaw · 2017-05-17T17:19:19Z

How difficult a change would it be? If trivial then I'd say let's just deal with it when someone has the hardware (or emulator) to test.

tsbockman · 2017-05-17T17:38:12Z

> How difficult a change would it be?

I don't really know how difficult adding ibmExtended support will be; it seems fairly easy so far, but there are a lot of functions in std.math that I haven't worked on yet. (I started with the easy stuff to keep this first PR small.)

Even if I succeed in writing code that is correct, I definitely can't optimize it much without access to a test environment.

(Removing the existing incomplete ibmExtended support is easy - I'll just add a static assert when that format is detected and delete all the special code paths for it outside of RealRep.)

kinke · 2017-05-17T17:48:12Z

From https://gcc.gnu.org/wiki/Ieee128PowerPC:

> [...] the PowerPC compiler uses a software type known as IBM extended double type. The IBM extended double type is a pair of double values that give the user more bits of precision, but no additional range for the mantissa. All of the support for IBM extended double is done via software emulation (there are deprecated instructions to load and store a pair of floating point values, but current hardware no longer supports these instructions).

As D real is defined as the largest FP type in hardware (with LDC violating this for Windows/MSVC already) and I don't recall anyone requesting support for it, I'm all for getting rid of ibmExtended completely. I'd suggest adding a new commit removing it, so that it's still in the git history in case someone needs it in the future (interop with C libs like glibc compiled with -mlong-double-128 or so).

Nice PR btw (no time for a proper review though).

kinke · 2017-05-17T18:30:08Z

Btw such a FP union might come in handy already in druntime instead of Phobos; core.internal.convert could ~~definitely~~ use it.

tsbockman · 2017-05-17T18:53:56Z

> druntime

OK. I'll keep that in mind for the future.

Currently my plan is to keep RealRep private to the std.math package until the entire thing has been converted to use it. That way, the API design won't get frozen before it's been adequately tested.

We can look into publicising it and moving it to druntime or std.bitmanip later.

WalterBright · 2017-05-17T19:00:09Z

Note that CTFE supports the special casts:

*cast(uint*)&float
*cast(uint*)&double

in order to support bit manipulation of floating point values. Changing these to unions will break CTFE support.
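As a minimal sketch of the kind of cast described here, the function below reads the bits of a `float` through a pointer cast and is then forced through CTFE by an `enum` initializer. `floatBits` and `oneBits` are hypothetical names, and the example assumes a compiler front end that accepts a same-size float-to-integer repaint during CTFE.

```d
// Hypothetical illustration of the pointer-cast idiom for float.
uint floatBits(float f)
{
    return *cast(uint*)&f;
}

// An enum initializer must be evaluated at compile time, so this
// exercises the cast under CTFE rather than at run time.
enum uint oneBits = floatBits(1.0f);
static assert(oneBits == 0x3F80_0000);
```

The same function body written with a union would need separate CTFE support for reading a different union member, which is the breakage being warned about.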

tsbockman · 2017-05-17T19:04:59Z

@WalterBright I can manually lower the union code to an equivalent struct that uses those casts internally, if necessary.

Are those really the only supported casts, though? What about ulong*, ushort*, etc.? What about real?

Biotronic · 2017-05-18T04:44:03Z

Those are the only supported casts. Hence, real is just a mess in CTFE. I actually started writing a patch for DMD to support the missing casts a year ago or so - maybe I should pick it up again and add a PR. Either that, or this PR is basically stranded until NewCTFE comes around.

ibuclaw · 2017-05-18T06:45:42Z

When I moved type painting out of ctfe, I wrote it with the intention of it being a simple addition to paint basic and vector types to static array and vice versa [1].

All you need to do is add the case for Tsarray and loop, passing the pointer-adjusted buffer to encodeXXX or decodeXXX, then allow the cast in CTFE as valid wherever Target.paintAsType is called from.

I had also intended to use this for union support too, but I never got round to it.

[1] https://github.com/dlang/dmd/blob/master/src/ddmd/target.d#L374

tsbockman · 2017-05-19T03:56:46Z

@ibuclaw @Biotronic @WalterBright I have opened a DMD PR to extend the repainting capabilities as needed.

ibuclaw · 2017-10-29T15:05:33Z

std/math.d

+            {
+                RealRep!double lo;
+                RealRep!double hi;
+            }


After some testing, I can safely say that this is wrong. The most significant part is always first, regardless of endianness, on ibmExtended.
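A minimal sketch of the ordering described in this comment, assuming the double-double representation quoted earlier from the GCC wiki. `DoublePair` and `signbitPair` are hypothetical names and do not reflect the member layout of the PR's `RealRep`.

```d
// Hypothetical illustration only; not the PR's RealRep layout.
struct DoublePair
{
    double hi;   // most significant part, always stored first
    double lo;   // least significant part
}

// The field order is fixed by declaration, independent of byte order.
static assert(DoublePair.hi.offsetof == 0);
static assert(DoublePair.lo.offsetof == double.sizeof);

// The sign of the pair is taken from the most significant part.
bool signbitPair(DoublePair p)
{
    import std.math : signbit;
    return signbit(p.hi) != 0;
}
```

With this ordering, no endianness-dependent swapping of the two halves is needed to find the sign or the dominant exponent.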

tsbockman force-pushed the float_union branch 6 times, most recently from 8a35e27 to b40630e Compare May 18, 2016 09:28

tsbockman mentioned this pull request May 18, 2016

Fix issue 16026: std.math.frexp!float() wrong for very small subnormals #4337

Merged

tsbockman force-pushed the float_union branch 3 times, most recently from 44bdfbf to 529b7c9 Compare May 19, 2016 02:17

tsbockman mentioned this pull request May 19, 2016

src/dsymbol/conversion/first.d(1029): Assertion failure dlang-community/dsymbol#9

Closed

tsbockman force-pushed the float_union branch 4 times, most recently from f851ecf to f471022 Compare May 19, 2016 08:37

Hackerpilot mentioned this pull request May 19, 2016

update dscanner to alpha 6 #4340

Merged

tsbockman force-pushed the float_union branch from f471022 to a6de064 Compare May 20, 2016 08:02

wilzbach reviewed May 20, 2016

tsbockman force-pushed the float_union branch from a6de064 to f7157ef Compare May 20, 2016 10:38

DmitryOlshansky added the math label May 20, 2016

tsbockman mentioned this pull request May 21, 2016

Force overflow/underflow in generic std.math implementations #3386

Merged

tsbockman force-pushed the float_union branch from f7157ef to b214ac2 Compare May 22, 2016 21:11

tsbockman force-pushed the float_union branch 2 times, most recently from 202d8ba to 8160b90 Compare May 31, 2016 13:01

tsbockman closed this Jun 16, 2016

WalterBright reopened this Jun 17, 2016

UplinkCoder reviewed May 17, 2017


ditto

tsbockman added 2 commits May 16, 2017 21:05

Add RealRep union for low-level decomposition of floating-point values.

849e5cf

Use RealRep for isNaN, isFinite, is{Subn|N}ormal, isInfinity, isIdentical, signbit, & copysign.

5d7a172

tsbockman force-pushed the float_union branch from 07aac12 to 5d7a172 Compare May 17, 2017 04:14

tsbockman mentioned this pull request May 22, 2017

Generalize CTFE float/int repainting to support many more types. dlang/dmd#6811

Closed

dlang-bot added the Review:Needs Work label Aug 13, 2017

ibuclaw mentioned this pull request Oct 29, 2017

std.math: Use DOUBLEPAIR_MSB/LSB for accessing high/low parts of ibmExtended format #5823

Merged

ibuclaw reviewed Oct 29, 2017


ibuclaw mentioned this pull request Oct 29, 2017

std.math: Add RealRep union for accessing floating point bits via a union. #5825

Closed

dlang-bot added Merge:stalled Merge:Needs Rebase and removed Merge:stalled Merge:Needs Rebase labels Dec 29, 2017

tsbockman closed this Oct 8, 2020

9 participants