Skip to content

Replace pointer casts in std.math with union repainting#3499

Closed
ivan-timokhin wants to merge 24 commits intodlang:masterfrom
ivan-timokhin:math-unions
Closed

Replace pointer casts in std.math with union repainting#3499
ivan-timokhin wants to merge 24 commits intodlang:masterfrom
ivan-timokhin:math-unions

Conversation

@ivan-timokhin
Copy link
Contributor

From the comments received on #3022, I was left with an impression that unions are generally preferred to pointer casts, and should be used instead wherever possible. So I went ahead and replaced most of the pointer casts with unions.

This is mostly straightforward one-to-one replacement, with notable exception of isIdentical, because I couldn't resist the temptation.

I didn't touch NaN and getNaNPayload, because

  1. I couldn't figure out what real formats are actually supported by them.
  2. I've no idea how to translate these weird unaligned reads/stores [1][2] into unions in a sane way.

Destroy!

@ibuclaw
Copy link
Member

ibuclaw commented Jul 18, 2015

Let's start with the bike shedding first, as that should be the easy bit.... :-)

  • Repainter: I'd be leaning on something like FloatLayout to describe this, but maybe it is better suited for...
  • Blob: Because when I think of blob, object storage systems come to mind.

I didn't touch NaN and getNaNPayload, because

  1. I couldn't figure out what real formats are actually supported by them.

64bit, 128bit, and x87 reals by the looks of it. (Considering that DMD code would be using the catch-all else block in there).

There should be an explicit static if condition and else static assert.

  1. I've no idea how to translate these weird unaligned reads/stores [1][2] into unions in a sane way.

You'll just be dereferencing the .bytes array I guess, but the cast from ubyte* to ulong* is interesting.

Maybe from *cast(ulong*)(6+cast(ubyte*)(&x)) to *cast(ulong*)(&x.bytes[6]). Not that I approve of such magic. Which leads back to an earlier point of leveraging some FloatLayout structure into the union.

I'm thinking along the lines of:

static if (T.mant_dig == 24)
{
  version(BigEndian)
    alias Layout = Tuple!("negative", ubyte,
                          "exponent", ulong,
                          "qnan", ubyte,
                          "mantissa", ubyte[22]);
  else
    alias Layout = Tuple!("mantissa", ubyte[22],
                          "qnan", ubyte,
                          "exponent", ulong,
                          "negative", ubyte);
}
/* else static if... */

union Repainter
{
  Unqual!T number;
  Layout layout;
  /* other fields... */
}

Something that you have already sort of done with significand using x87 reals.

@ivan-timokhin
Copy link
Contributor Author

On Sat, Jul 18, 2015 at 05:07:57AM -0700, Iain Buclaw wrote:

Let's start with the bike shedding first, as that should be the easy bit.... :-)

  • Repainter: I'd be leaning on something like FloatLayout to describe this, but maybe it is better suited for...
  • Blob: Because when I think of blob, object storage systems come to mind.

Well, I don't like the names too much myself :). FloatLayout is probably
even better suited for what you describe below, so...

I'm thinking along the lines of:

static if (T.mant_dig == 24)
{
  version(BigEndian)
    alias Layout = Tuple!("negative", ubyte, "exponent", ulong, "qnan", ubyte, "mantissa", ubyte[22]);
  else
    alias Layout = Tuple!("mantissa", ubyte[22], "qnan", ubyte, "exponent", ulong, "negative", ubyte);
}

Something that you have already sort of done with significand using x87 reals.

I can probably try and come up with something using bitfields, but I'd like to
limit the scope of this pull request to simple translation, and add further
improvements later.

@9il
Copy link
Member

9il commented Jul 18, 2015

Great work! Thanks!

I can probably try and come up with something using bitfields, but I'd like to
limit the scope of this pull request to simple translation, and add further
improvements later.

+1

@ivan-timokhin
Copy link
Contributor Author

Actually, it looks like we have some interesting stuff in Phobos: FloatRep, DoubleRep. And they've been there for a long time.

@9il
Copy link
Member

9il commented Jul 18, 2015

FloatRep, DoubleRep are not fast enough.

@ivan-timokhin
Copy link
Contributor Author

But they do look similar to what Iain suggested, so maybe a better course of action would be to improve them?

@9il
Copy link
Member

9il commented Jul 18, 2015

Any improvement will break code. The bitfields concept is slow for math.

@ibuclaw
Copy link
Member

ibuclaw commented Jul 18, 2015

Yes, std.bitmanip generates some pretty funky accessors, and the existing representations don't consider endian-ness.

@ivan-timokhin
Copy link
Contributor Author

Hm. Then I don't quite understand what the proposed Layout is supposed to be.

@ivan-timokhin
Copy link
Contributor Author

Anyhow, what I really wanted to know was whether I should try and unify the code with (Float|Double)Rep. The answer's negative, so I suggest we return to the discussion of the PR at hand.

@PetarKirov
Copy link
Member

@ivan-timokhin perhaps after this PR is done we can make (Float|Double)Rep forward to it?

@9il
Copy link
Member

9il commented Jul 18, 2015

@ivan-timokhin perhaps after this PR is done we can make (Float|Double)Rep forward to it?

FloatTraits is package stuff

@ivan-timokhin
Copy link
Contributor Author

@ivan-timokhin perhaps after this PR is done we can make (Float|Double)Rep forward to it?

Uhh, no. Any bitfieldish stuff like the one done in those two is most definitely out of scope of this PR. I brought them up because if we do want something similar in floatTraits in the long-term, it might be better to build on existing solution instead of rolling out a completely new one. Turns out that's a bad idea.

@quickfur
Copy link
Member

Will the result be CTFE-able?

@9il
Copy link
Member

9il commented Jul 20, 2015

@quickfur Currently unions are not CTFE-able. But they can be in the future.

@ibuclaw
Copy link
Member

ibuclaw commented Jul 20, 2015

@quickfur Currently unions are not CTFE-able. But they can be in the future.

This is on my long todo list.

@ivan-timokhin
Copy link
Contributor Author

@ibuclaw I've tried to come up with better names.

@ibuclaw
Copy link
Member

ibuclaw commented Jul 30, 2015

@ivan-timokhin - OK. I've been out of action this week, I'll have a review when I can.

In any case, this is for post-2.068 release.

@DmitryOlshansky
Copy link
Member

@ibuclaw any chance to review/destroy this ? ;)

@ivan-timokhin
Copy link
Contributor Author

I've rewritten functions already using unions for uniformity, and the rewrite is usually very straightforward (though I've taken some liberties in cmp), but I can drop these changes to keep this PR focused on getting rid of pointers. Should I?

@DmitryOlshansky DmitryOlshansky added this to the 2.069 milestone Sep 8, 2015
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the same as mant_dig == 64. All bit manipulation is identical to ieeeExtended for this type.

@ibuclaw
Copy link
Member

ibuclaw commented Sep 13, 2015

Apart from the nits already mentioned (I may find more). Has there been any benchmarking on this? Are we loosing or gaining performance by switching to higher level accessors? I haven't checked, but I assume that std.math is fully @nogc as well?

@ivan-timokhin
Copy link
Contributor Author

I didn't look into performance much (will do), but I'd expect the generated code to be more or less the same; e.g., in both https://goo.gl/umkAtu and https://goo.gl/jFzuR2 assembly output is the same for pointers and unions.

And yes, it seems that most functions are @nogc. abs is not marked as such, but has @nogc unittest; approxEqual does not have one (an oversight?); resetIeeeFlags has no attributes at all, but, again, that looks like an oversight.

@ivan-timokhin
Copy link
Contributor Author

😞 Unfortunately, I must admit defeat. The resulting assembly does not look as similar as I hoped it would, and the new one is frequently longer. While for most functions it seems that the new instructions are mostly a slightly permuted version of old ones, my knowledge of assembly is not deep enough to tell if that's significant or not.
An alternative is to run actual benchmarks on all module functions, but setting them up (as well as learning x86_64 assembly) would take more time and effort than I can dedicate at the moment.
So, in conclusion, I'm unable to provide any guarantees regarding performance of the new code compared to the old one.

@ibuclaw
Copy link
Member

ibuclaw commented Sep 20, 2015

No problem. I can dig out a couple from my mailing list archives

@ibuclaw
Copy link
Member

ibuclaw commented Sep 20, 2015

For the most part, I think this is fine. Just having a mental pass at it, I can only see one notable place of concern. Last nits will follow and I will give the green light

std/math.d Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be turned into a memcmp for ieeeQuadruple and ibmExtended, and goodness knows what for ieeeExtended. For efficiency, I'd recommend keeping the previous behaviour and inline the bit comparison.

@ibuclaw
Copy link
Member

ibuclaw commented Sep 20, 2015

Because you are dereferencing static arrays instead of pointers, I'd expect there to be more bounds checks, this will distort some things if you plan on doing comparison of assembly.

@dnadlinger
Copy link
Contributor

Why would indexing a static array by a (compile-time) constant offset cause a bounds check to be emitted?

@ibuclaw
Copy link
Member

ibuclaw commented Sep 20, 2015

Why would indexing a static array by a (compile-time) constant offset cause a bounds check to be emitted?

All peeks are constant offsets, not all pokes are though.

@MartinNowak
Copy link
Member

What's the state of this?

@MartinNowak MartinNowak removed this from the 2.069.0 milestone Oct 14, 2015
@ivan-timokhin
Copy link
Contributor Author

ivan-timokhin commented Oct 14, 2015 via email

@ibuclaw
Copy link
Member

ibuclaw commented Oct 19, 2015

Two optional nits, but otherwise I'm ok with this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants