Generalize CTFE float/int repainting to support many more types. #6811
tsbockman wants to merge 1 commit into dlang:master
|
Failing tests: I fixed this first one by returning This test ( /*
TEST_OUTPUT:
---
fail_compilation/ctfe14207.d(13): Error: cannot convert &immutable(ulong) to ubyte[8]* at compile time
fail_compilation/ctfe14207.d(18): called from here: nativeToBigEndian()
fail_compilation/ctfe14207.d(22): called from here: digest()
---
*/
ubyte[8] nativeToBigEndian()
{
    immutable ulong res = 1;
    return *cast(ubyte[8]*) &res;
}
auto digest()
{
    ubyte[8] bits = nativeToBigEndian();
    return bits;
}
enum h = digest();
I know that repainting needs to handle endianness properly, but it's not difficult to do so provided that the endianness of both the host and the target is known. What predicate should I fill in here?
// If host != target endian-ness, swap the byte order for bp[0 .. eleSize] here.
if ( ??? )
{
|
version(BigEndian)
    enum hostBigEndian = true;
else
    enum hostBigEndian = false;
// If host != target endian-ness, swap the byte order for bp[0 .. eleSize] here.
if (hostBigEndian != target.bigEndian)
I think the target endianness information is not exposed to the frontend currently. You'll either have to extend |
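To make the host/target comparison above concrete, here is a minimal sketch of a conditional byte-order fix-up. Since the thread notes that target endianness is not exposed to the frontend, the `targetBigEndian` parameter and the `toTargetByteOrder` name are assumptions made purely for illustration; this is ordinary library-level D, not code from the PR.

```d
import std.algorithm.mutation : reverse;

version (BigEndian)
    enum hostBigEndian = true;
else
    enum hostBigEndian = false;

/// Reverses every `elemSize`-byte element of `bytes` in place when the host
/// and target byte orders differ. `targetBigEndian` is a hypothetical
/// parameter standing in for information the frontend would have to supply.
void toTargetByteOrder(ubyte[] bytes, size_t elemSize, bool targetBigEndian)
{
    // Same byte order, or single-byte elements: nothing to do.
    if (hostBigEndian == targetBigEndian || elemSize < 2)
        return;

    assert(bytes.length % elemSize == 0, "buffer must hold whole elements");
    for (size_t i = 0; i < bytes.length; i += elemSize)
        reverse(bytes[i .. i + elemSize]);
}
```

Swapping per element rather than across the whole buffer keeps the order of array elements intact, which matters for static arrays such as ushort[5].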
src/ddmd/ctfeexpr.d
Outdated
extern (C++) bool isFloatIntPaint(Type from, Type to)
{
    return from.size() == to.size() && (from.isintegral() && to.isfloating() || from.isfloating() && to.isintegral());
    const toUsed = (to.ty == Tfloat80) ? 80/8 : to.size();
Or maybe you are trying to do Target.realsize - Target.realpad here...
Is there a Treal ENUMTY value I can use instead of Tfloat80 then?
Because as long as it's checking against Tfloat80, it ought to say 80/8, not Target.realsize - Target.realpad: there is no guarantee that real will always be Tfloat80.
I know it's an unfortunate name, but Tfloat80 is treated as the largest supported floating point type in gdc and ldc. Tfloat80 can even be a double!
Gross! But thanks for explaining; I'll fix my code...
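As a user-level analogue of the Target.realsize - Target.realpad expression suggested above, the number of value-carrying bytes in real can be derived from its properties instead of hard-coding 80/8. This is only a sketch: the names `realUsedBytes` and `realPadBytes` are hypothetical, and it assumes that x87 extended precision (64-bit mantissa) is the only format with padding bytes.

```d
// Bytes of `real` that carry value bits. x87 extended precision is detected
// by its 64-bit mantissa and occupies 10 bytes; any other format is assumed
// (for this sketch) to use all of real.sizeof.
enum size_t realUsedBytes = (real.mant_dig == 64) ? 10 : real.sizeof;

// Trailing bytes that exist only for alignment.
enum size_t realPadBytes = real.sizeof - realUsedBytes;
```

On a target where real is really a double, this yields 8 used bytes and no padding, which is the case a literal 80/8 would get wrong.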
src/ddmd/ctfeexpr.d
Outdated
return false;

Type fromEle = (from.ty == Tsarray) ? (cast(TypeArray) from).next : from;
Type toEle = (to.ty == Tsarray) ? (cast(TypeArray) to).next : to;
fromElem/toElem? Unless the latter clashes with a backend function, in which case I think another appropriate name would be tbfrom/tbto.
What does "tb" stand for?
tb is the result of running t.toBasetype()
src/ddmd/target.d
Outdated
// Write the integer value of 'e' into a unsigned byte buffer.
extern (C++) static void encodeInteger(Expression e, ubyte* buffer)
// Write the bit patterns representing the value of `e` into `buffer`.
extern (C++) void paintEncode(Expression e, ubyte* buffer)
This can be made private, remove the extern(C++) too.
src/ddmd/target.d
Outdated
assert (e.op == TOK.TOKarrayliteral);
arr = cast(ArrayLiteralExp) e;
aLen = cast(int) arr.elements.dim;
eleSize = cast(int) (cast(TypeArray) arr.type).next.size();
Same as before, using ele looks strange to me.
src/ddmd/target.d
Outdated
    break;
case Tfloat80:
    *(cast(real*) bp) = cast(real) value;
    bp[80/8 .. eleSize] = 0; // Clear the padding area for consistency.
Target.realsize - Target.realpad
src/ddmd/target.d
Outdated
    bp[80/8 .. eleSize] = 0; // Clear the padding area for consistency.
    break;
default:
    assert (0);

{
    const last = eleSize - 1;
    for (size_t x = 0; x <= last; ++x)
        bp[x] = bp[last - x];
I know this is inside a static if (false), but you probably want to use swap.
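The reason swap matters here: the quoted loop overwrites bp[x] with bp[last - x] without saving the old value, so by the time it reaches the second half the original first-half bytes are gone and the buffer ends up mirrored rather than reversed. A minimal sketch of an in-place reversal follows; `reverseBytes` is a hypothetical helper name, not something taken from the PR.

```d
import std.algorithm.mutation : swap;

/// Reverses the `size` bytes starting at `bp` in place.
void reverseBytes(ubyte* bp, size_t size)
{
    if (size < 2)
        return;

    // Walk inward from both ends, exchanging one pair per step, and stop
    // at the middle so that no pair is swapped back.
    for (size_t lo = 0, hi = size - 1; lo < hi; ++lo, --hi)
        swap(bp[lo], bp[hi]);
}
```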
break;
const last = eleSize - 1;
for (size_t x = 0; x <= last; ++x)
    bp[x] = bp[last - x];
Or maybe my thoughts about endianness don't apply here, as dmd doesn't internally represent data as the target sees it.
|
Other notes: test cases are missing; they should also check the behaviour of type punning a zero-sized array. |
I probably wouldn't worry about it, dmd doesn't pretend to support mixed endianness anyway. I'd put it in the later queue as other compilers should already be handling this. |
|
However the testsuite should be endianness aware. |
|
@tsbockman in general I dislike it when people extend the feature set of ctfe... Also the implementation needs to be cleaned. |
Why? Because that means you need to extend your feature set? 😉 |
Exactly. |
|
@tsbockman don't support Treal at all. |
Also, |
The 14207 indicates it came from fixing https://issues.dlang.org/show_bug.cgi?id=14207 which gives a use case. |
Given that the test case works at run time, I see no reason that it shouldn't be made to work at compile time, as well; I'll replace that |
Looks like ICE -> Error to me. This PR makes it so we go from Error -> OK. |
|
It looks like everything is passing now. I still need to add a bunch more tests before this is ready to merge, though. |
Force-pushed: f6689ad to 081643a
|
This really is ugly. If we're just doing this so that we can get Is there anything there that can't be done with a few intrinsics instead of pointer casting? eg I've always seen ctfe preventing pointer casting and union reinterpretation as being a feature, not a bug. |
Bit setting may or may not be more awkward; see for instance std.math.floor() or ceil(). In any case the implementation underneath the hood would still remain the same. Whether or not the user code is close or far away from it doesn't really matter. I have no objection to adding new properties to provide accessors; someone may request a DIP, though. |
|
TBH it looks like Most similar stuff can be done in ctfe without reinterpretation, because you can reliably extract bits from integers. Intrinsics to extract bits from floats would IIUC complete this to the point that hashing and other low-level manipulations can be done in a clean and portable way. This would also mean you could write low-level floating point code without ever having to care about padding. And unlike casting/unions, the intrinsics could be overloaded to provide the same interface for library float types. @tsbockman What do you think about this? |
Intrinsics also increase the feature set, but do so by inventing a different language dialect incompatible with the rest of the language. I.e. they make writing generic code harder because you would need to have different code for CTFE and runtime - the same mistake that C++ makes over and over again. Having a clean language that allows you to have code reuse is much more important than inventing new intrinsics for every single use case. Type repainting is a key part of every systems programmer's toolbox. Let's not add more hacks to the language. |
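To illustrate what "different code for CTFE and runtime" looks like in practice, here is a sketch of a function split on D's built-in `__ctfe` variable. The function name `signBitOf` is hypothetical, and the split is shown only to demonstrate the shape of such code; it makes no claim about which casts a given compiler version accepts at compile time.

```d
/// Returns true when the sign bit of `x` is set.
bool signBitOf(double x)
{
    if (__ctfe)
    {
        // Compile-time path: arithmetic only. 1.0 / -0.0 is -infinity,
        // which catches negative zero; a negative NaN is NOT detected here,
        // so the two paths can disagree.
        return x < 0 || (x == 0 && 1.0 / x < 0);
    }
    else
    {
        // Run-time path: repaint the double as a ulong and test bit 63.
        return ((*cast(ulong*) &x) >> 63) != 0;
    }
}
```

The two branches have to be kept in agreement by hand, which is the maintenance cost this comment objects to.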
No, because the intrinsics would work at runtime. There is nothing clean about the current use of unions, take a look at the code in |
|
@yebblies Most of the benefits which you claim intrinsics would have (portability, cleaner code, etc.) over That one place can be a type in phobos or druntime, or it can be some intrinsic functions implemented in the compiler. Either way, it makes no difference as far as portability, correctness, and code simplification go. As to the CTFE feature set - the arbitrary, undocumented restrictions placed upon CTFE code are a frequent source of frustration for me, especially since (last time I checked)
|
It makes a difference as far as complexity of the supported set of features in ctfe. Adding this feature does not make it easier to specify the exact feature set either. It sounds like you're saying an intrinsic based api would work for your purpose as long as it works at runtime and compile time. Or even that you're already designing such an api, and it just needs to be moved to druntime and added to the builtins list. Am I misunderstanding? |
The restrictions themselves are the main problem; the lack of documentation is just salt in the wound. Expanding the CTFE feature set in the direction of the run time feature set lessens the need for separate documentation.
An appropriate intrinsic API would look similar to my I wouldn't want to commit to a specific list of intrinsics right now, because while it's easy to write a list that is functionally complete, it's harder to figure out which set gives optimal performance. |
|
With further testing, I see that (with or without this PR) Is there a reasonable way to extend this to work with lvalues, too? If not, I might just close this... |
If you can't do constant propagation, and that includes pointer offset and devirtualization at compile time, then I would consider that a bug in ctfe. You should be able to support everything that a compiler optimizer is able to do, and then some. |
I would consider restricting it to rvalues reasonable for the time being. Lvalues would just require, in the worst case, double painting. It's easier just to let the user do the rewrite as: Instead of thrusting this on the compiler. At least that is my opinion until we perhaps switch to better ctfe technologies. |
I thought about that, but that's only a viable option if we can demonstrate that it generates the same run time code. In particular, I am concerned about this case:
real r = 1.5;
ushort[5] u = *(cast(ushort[5]*) &r);
u[4] = u[4] | 0x8000;
r = *(cast(real*) &u);
assert(r == -1.5);
Given that I'll investigate. |
It's not:
real makeNegCT(real x)
{
    ushort[5] u = *(cast(ushort[5]*) &x);
    u[4] = u[4] | 0x8000;
    x = *(cast(real*) &u);
    return x;
}
real makeNegRT(real x)
{
    (cast(ushort*) &x)[4] |= 0x8000;
    return x;
}
But (That's from DMD via asm.dlang.org; the results from GDC are similar. LDC produces sub-optimal code for both.) |
Force-pushed: 687d139 to 855a3de
|
@tsbockman - Can we get your eyes back on this? It has been quite a while. Is this a no go or are we at a point where this is now possible? |
@AndrewEdwards This was always possible, from a technical perspective. The only blocker was the core compiler development team making a decision as to whether/how they want this feature implemented. See the posts of UplinkCoder, yebblies, and ibuclaw, above. In the years since this PR stalled, I have learned much more about how to write the kind of low-level floating-point code that it is designed to support. I think I could now design a good intrinsic API for that purpose, if it were decided that yebblies' solution is best. However, I still believe that expanding the CTFE engine's support for reinterpret casts would be much better for the language in the long run. Regardless, I don't feel very motivated to actually work on any of this again: I largely quit trying to contribute code to the D project years ago because I have watched far too many pull requests (both my own and those of others) stall indefinitely waiting for leadership to make some important decision. It seems that when debates like this don't lead to a clear and objective answer, often leadership refuses or forgets to make any decision at all, so that it just ends up being a big waste of time for would-be contributors like myself. |
|
@tsbockman : If your goal is to get Regarding the contribution process, I share the frustration, however things have improved quite a bit over the years, and I think if you were to resume contributing the experience would be much more pleasant. It's never going to be a walk in the park (we obviously need to set a high quality barrier for changing the language), but some unnecessary barriers were definitely lowered. |
|
@tsbockman I do apologize for that. On a pragmatic note, smaller, more incremental changes are far more likely to get approved. It's not unusual to have a dozen or more separate PRs for a single change, for example, a recent series where I redid how code was generated for divide-by-constant. |
|
@tsbockman - I was specifically referring to this comment and the ensuing conversation with Iain.
Thanks for your response. I aim to assist on improving the experience with contributions in general. @WalterBright, @yebblies, @UplinkCoder. From what I understand, the general premise is a firm decision on whether to implement via intrinsics vice the current implementation and exclusion/inclusion of |
My actual goal was to make it possible to do major refactoring and optimizations in When this PR stalled without a clear path forward or alternative, I eventually just decided to fork
However, it does not have good CTFE support, because there is no way in D as it currently stands to write some low-level floating-point algorithms in a way that is correct and efficient at both compile-time and run-time.
Correct. The alternatives are:
Of course this PR is not ready to merge as-is, but there is no point in me working on that unless the compiler dev team is pretty sure they want to take option (1). |
|
Ping @WalterBright, @yebblies, @UplinkCoder. |
This enhances `paintFloatInt()` to support repainting between any combination of `byte`/`ubyte`/`short`/`ushort`/`int`/`uint`/`long`/`ulong`/`float`/`double`/`real`, as well as small static arrays derived from any of those types (such as `ushort[5]`).

Repainting is permitted as long as the `from` type has a size equal to or greater than the `to` type. 80-bit `real` is a special case: the padding area does not count when casting to it, but does when casting from it.

I believe this PR provides the capabilities needed to get all of `std.math` working at CTFE using the same code path as runtime (aside from inline assembler), albeit in a rather ugly fashion.

TODO: Mostly tests.
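
For readers unfamiliar with the term, the following sketch shows what "repainting" means at run time, with a template constraint that mirrors the size rule stated in the description. The helper name `repaint` is hypothetical and is not part of the PR; the special handling of padding in 80-bit real is deliberately left out.

```d
/// Reinterprets the leading bytes of `from` as a value of type `To`.
/// The constraint mirrors the rule above: the source must be at least as
/// large as the destination. `repaint` is a hypothetical helper name.
To repaint(To, From)(From from)
if (From.sizeof >= To.sizeof)
{
    // `from` is a by-value parameter, so taking its address is valid here.
    return *cast(To*) &from;
}
```

At run time, `repaint!ulong(1.0)` gives the IEEE 754 bit pattern 0x3FF0_0000_0000_0000, while `repaint!ulong(1.0f)` is rejected by the constraint because a float is smaller than a ulong. Whether a given pair of types is also accepted during CTFE depends on the compiler, which is the subject of this PR.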