Array Compare Template Function#1781
Conversation
9755eeb to
4c21ce8
Compare
|
@9il you may be interested - this lifts array comparisons from druntime into templateland. Also cc @WalterBright @MartinNowak @CyberShadow. |
|
To clarify, the plan is:
|
src/object.d
Outdated
| // U1 gets the unqualified version of T1 | ||
| static if (is(T1 W == immutable W)) alias U1 = W; | ||
| else static if (is(T1 W == const W)) alias U1 = W; | ||
| else alias U1 = T1; |
There was a problem hiding this comment.
core.internal.traits.Unqual ?
There was a problem hiding this comment.
@CyberShadow thx I missed that, thought we only have Unqual in Phobos. @somzzz is there a problem if you add a local import here for core.internal.traits?
There was a problem hiding this comment.
It should be fine, I missed it too. Will make the change, thanks!
There was a problem hiding this comment.
OT: Btw why don't we move std.meta and std.traits entirely into druntime? This copy/pasting is a bit weird :/
There was a problem hiding this comment.
I spoke to @somzzz and advised that we should take a larger step once the bit of duplication becomes painful. Since her work will intensify the use of templates in object.d, that's likely to happen.
src/object.d
Outdated
| } | ||
| } | ||
| } | ||
| return c < 0 ? -1 : c > 0 ? 1 : 0; |
There was a problem hiding this comment.
A return here causes "Statement is not reachable" warnings lower
There was a problem hiding this comment.
I forgot an else. Fixed.
src/object.d
Outdated
| static if (is(U1 : Object) && is(U2 : Object)) | ||
| { | ||
| auto c = cast(sizediff_t)(s1.length - s2.length); | ||
| if (c == 0) |
There was a problem hiding this comment.
This is a breaking change. Previous behavior was to compare lexicographically (up to the first difference or end of the shortest array).
There was a problem hiding this comment.
This is the same code as before for Object[] type.
https://github.com/dlang/druntime/blob/master/src/rt/typeinfo/ti_AC.d#L66
There was a problem hiding this comment.
Not sure what's going on here but a trivial test indicates that this actually does seem to be a breaking change. https://dpaste.dzfl.pl/21cf68839945 Did I make a mistake?
There was a problem hiding this comment.
By the comments at https://github.com/dlang/druntime/blob/master/src/rt/typeinfo/ti_AC.d#L19:
NOTE: It is sometimes used for arrays of
classes or (incorrectly) interfaces.
But naked `TypeInfo_Array` is mostly used.
See @@@BUG12303@@@.
So there's also https://issues.dlang.org/show_bug.cgi?id=12303 to look at.
This seems to suggest that the "wrong" function and TypeInfo_AC as a whole are not used for normal comparison, but perhaps for hash tables or not at all.
There was a problem hiding this comment.
I agree with @CyberShadow, "abc" < "z", you can't short circuit the length.
By the comments at ...
Note that when I click on that link, the coverage reported by github for the whole file is 0%. I don't think that function is ever called.
There was a problem hiding this comment.
BTW, I applaud this PR. And that note is exactly why we should be doing this. Nobody knows what is called when.
src/object.d
Outdated
| template Floating(T) | ||
| if (is(T == cfloat) || is(T == cdouble) || is(T == creal)) | ||
| { | ||
| pure nothrow @safe: |
There was a problem hiding this comment.
(Nit) unusual indentation style?
There was a problem hiding this comment.
@somzzz yah all indents should be multiples of 4. As an aside: I don't think we need any of those annotations, they're all deduced. Maybe for clarity.
src/object.d
Outdated
| if (is(T == cfloat) || is(T == cdouble) || is(T == creal)) | ||
| { | ||
| pure nothrow @safe: | ||
| int compare(T f1, T f2) |
src/object.d
Outdated
| for (size_t u = 0; u < s1.length; u++) | ||
| { | ||
| Object o1 = s1[u]; | ||
| Object o2 = s2[u]; |
There was a problem hiding this comment.
Use auto for o1 and o2 here so you avoid qualifier trouble.
There was a problem hiding this comment.
In fact, you need non-qualified objects. Object.opCmp requires it.
There was a problem hiding this comment.
Point stands - auto ensures unique point of maintenance.
src/object.d
Outdated
| c = o1.opCmp(o2); | ||
| if (c == 0) | ||
| continue; | ||
| break; |
There was a problem hiding this comment.
These last three lines are simply:
if (c != 0)
break;
src/object.d
Outdated
| static if (is(U1 : Object) && is(U2 : Object)) | ||
| { | ||
| auto c = cast(sizediff_t)(s1.length - s2.length); | ||
| if (c == 0) |
There was a problem hiding this comment.
I'm a bit confused here (I know it's the existing implementation, but just for my edification). This is not lexicographical compare because if an array is shorter than the other it always counts as smaller. For example, if we have an array [hugeWidget], it always counts as smaller than [littleWidget, littleWidget]. Why is this the case?
There was a problem hiding this comment.
This is done to make the comparison faster. No need to load the string into the memory cache if the lengths can be compared.
There was a problem hiding this comment.
@WalterBright I'm not sure you're right. Per my comment above there may be a shorter array that is greater than a longer array. Please recheck.
There was a problem hiding this comment.
That makes sense for equality but comparison? And why is it a breaking change?
There was a problem hiding this comment.
@WalterBright By Object do you mean the variable type or the instance type? It's still a change in behavior when the variable is Object but the instance isn't, so never mind my comment above, it's not the correct fix.
There was a problem hiding this comment.
Note, if I grep for TypeInfo_Array in dmd source, I get many hits. If I grep for TypeInfo_AC, I get nothing. When is this TypeInfo used?
There was a problem hiding this comment.
It looks like I was wrong. The following code:
int cmp(Object[] a, Object[] b)
{
return a < b;
}
generates:
_D18TypeInfo_AC6Object6__initZ comdat
dd offset FLAT:_D14TypeInfo_Array6__vtblZ
db 000h,000h,000h,000h ;....
dd offset FLAT:_D6Object7__ClassZ
_D3bug3cmpFAC6ObjectAC6ObjectZi comdat
assume CS:_D3bug3cmpFAC6ObjectAC6ObjectZi
L0: mov EAX,offset FLAT:_D18TypeInfo_AC6Object6__initZ
push EAX
push dword ptr 0Ch[ESP]
push dword ptr 0Ch[ESP]
push dword ptr 01Ch[ESP]
push dword ptr 01Ch[ESP]
call near ptr __adCmp2
shr EAX,01Fh
add ESP,014h
ret 010h
meaning it is calling TypeInfo_Array rather than TypeInfo_AC. The latter is never called any more that I can see. I don't know when this changed, it's been a long time since I've looked at it. This dead code should be removed from druntime. The same goes for TypeInfo_Ag.
The code to generate a typeinfo looks like this, in typinf.d:
extern (C++) TypeInfoDeclaration getTypeInfoDeclaration(Type t)
{
//printf("Type::getTypeInfoDeclaration() %s\n", t.toChars());
switch (t.ty)
{
case Tpointer:
return TypeInfoPointerDeclaration.create(t);
case Tarray:
return TypeInfoArrayDeclaration.create(t);
case Tsarray:
return TypeInfoStaticArrayDeclaration.create(t);
case Taarray:
return TypeInfoAssociativeArrayDeclaration.create(t);
case Tstruct:
return TypeInfoStructDeclaration.create(t);
case Tvector:
return TypeInfoVectorDeclaration.create(t);
case Tenum:
return TypeInfoEnumDeclaration.create(t);
case Tfunction:
return TypeInfoFunctionDeclaration.create(t);
case Tdelegate:
return TypeInfoDelegateDeclaration.create(t);
case Ttuple:
return TypeInfoTupleDeclaration.create(t);
case Tclass:
if ((cast(TypeClass)t).sym.isInterfaceDeclaration())
return TypeInfoInterfaceDeclaration.create(t);
else
return TypeInfoClassDeclaration.create(t);
default:
return TypeInfoDeclaration.create(t);
}
}
and from todt.d:
override void visit(TypeInfoArrayDeclaration d)
{
//printf("TypeInfoArrayDeclaration.toDt()\n");
verifyStructSize(Type.typeinfoarray, 3 * Target.ptrsize);
dtb.xoff(toVtblSymbol(Type.typeinfoarray), 0); // vtbl for TypeInfo_Array
dtb.size(0); // monitor
assert(d.tinfo.ty == Tarray);
TypeDArray tc = cast(TypeDArray)d.tinfo;
genTypeInfo(tc.next, null);
dtb.xoff(toSymbol(tc.next.vtinfo), 0); // TypeInfo for array of type
}
where there are no special cases for certain array types.
There was a problem hiding this comment.
Ha ha ha, great minds think alike! Just sent out #1783 before seeing this!
src/object.d
Outdated
| pure nothrow @safe: | ||
| int compare(T d1, T d2) | ||
| { | ||
| if (d1 != d1 || d2 != d2) // if either are NaN |
src/object.d
Outdated
| } | ||
| return 1; | ||
| } | ||
| return (d1 == d2) ? 0 : ((d1 < d2) ? -1 : 1); |
There was a problem hiding this comment.
I think this may be a tad more efficient: return d1 < d2 ? -1 : (d1 > d2);
It actually seems to be: https://godbolt.org/g/XhUSPm, https://godbolt.org/g/T1qwSr
| result = 1; | ||
| else | ||
| result = 0; | ||
| return result; |
There was a problem hiding this comment.
see comments below for making this faster
src/object.d
Outdated
| else if (s1.length > s2.length) | ||
| return 1; | ||
|
|
||
| return 0; |
There was a problem hiding this comment.
return s1.length < s2.length ? -1 : (s1.length > s2.length);
src/object.d
Outdated
| T[] b = [T.min, T.min]; | ||
|
|
||
| assert(a > b); | ||
| assert(b < a); |
There was a problem hiding this comment.
For this PR, please use calls to _adCmp directly. Then we know the function being tested is actually used. Then, after merging, we go back to using operators. I know, tedious!
assert(_opCmpT(a, b) > 0);
assert(_opCmpT(b, a) < 0);
src/object.d
Outdated
| T[] b = [T.min, T.min]; | ||
|
|
||
| assert(a > b); | ||
| assert(b < a); |
src/object.d
Outdated
| T[] b = [T.min_normal, T.min_normal]; | ||
|
|
||
| assert(a > b); | ||
| assert(b < a); |
|
I'm not up to speed with current Druntime code but it bothers me somewhat that this implements comparison for not just arrays, but also a bunch of primitive types, as inline functions. I think ideally we would have one single function or overload set |
|
@CyberShadow yah, this would be a good step in that direction. |
|
Yeah, in order to implement array comparisons we need comparisons for array elements, i.e., basically all types. Why not just unify the logic into a single template that specializes for each concrete type? |
|
@quickfur getting there! For now we are carefully replacing one lowering with another. |
src/object.d
Outdated
| void compareMinMax(T)() | ||
| { | ||
| T[] a = [T.max, T.max]; | ||
| T[] b = [T.min_normal, T.min_normal]; |
There was a problem hiding this comment.
If you define em like this they will be on the heap.
T[2] please.
src/object.d
Outdated
| { | ||
| void compareMinMax(T)() | ||
| { | ||
| T[] a = [T.max, T.max]; |
|
@WalterBright short-circuiting on the length only works for opEquals. not for opCmp. |
src/object.d
Outdated
| enum RTInfo = null; | ||
| } | ||
|
|
||
| int _adCmpT(T1, T2)(T1[] s1, T2[] s2) |
There was a problem hiding this comment.
It might make sense to turn this into a stub that then calls the implementation with U1 and U2 to lower the number of template instantiations, if possible.
src/object.d
Outdated
| pure nothrow @safe: | ||
| int compare(T[] s1, T[] s2) | ||
| { | ||
| size_t len = s1.length; |
Just a small request for this and similar changes: Make DMD fail with some nice error message when the template can't be found iff it actually needs to be used by some array comparison. This will please the |
|
Let's also see how #1783 fares. |
src/object.d
Outdated
| return () @trusted { return Array!F.compare(cast(F[])s1, cast(F[])s2); }(); | ||
| } | ||
|
|
||
| // integral, char types |
There was a problem hiding this comment.
The troubles with this generic code:
-
it is slow, especially for strings. Contrast this with
dstrcmp()which is currently used for strings which usesmemcmpwhich is heavily optimized -
it creates separate implementations for sbyte and char and immutable(char) and const(char), although those implementations will be identical. It also creates these implementations over and over and over, because everyone uses strings. This is not a runtime problem because the duplicate string implementations are merged, but it is a problem for compilation speed.
-
The use of s1[u] means an array bounds check is (uselessly) generated, further slowing things down.
I suggest that many cases can be combined, and then forwarded to a heavily optimized non-template function that implements it, such as dstrcmp().
There was a problem hiding this comment.
For example, float, ifloat, const(float), immutable(float), const(ifloat), immutable(ifloat) should all be forwarded to one non-template function. Ditto for void, byte, char, and the qualified variants of them. Etc.
There was a problem hiding this comment.
@WalterBright yah that's the next good step. Until now @somzzz has worked toward getting together something that works amid all subtleties discussed here and elsewhere.
There was a problem hiding this comment.
Next step, sure, but I am concerned that the next step doesn't get taken soon enough and we release a version that has substantially degraded performance. There's no way the manual loop will come close to memcmp().
There was a problem hiding this comment.
@WalterBright thx!
- Calling
dstrcmpdidn't work in this approach because it was invoked during compilation. So we wanted to have something simple that passes muster. At this point @somzzz will look into special casingdstrcmpwithif (__ctfe). - Fixing (1) will also fix (2) - the template becomes an inlined one-liner.
- using
s1.ptr[u]has a different issue - it's not deducible as @safe. I've explained @somzzz to use the idiom for generic trusted code.
For example, float, ifloat, const(float), immutable(float), const(ifloat), immutable(ifloat) should all be forwarded to one non-template function.
That is the case already.
Next step, sure, but I am concerned that the next step doesn't get taken soon enough and we release a version that has substantially degraded performance. There's no way the manual loop will come close to memcmp().
There actually is if the size is short. On a larger note, this is a long-haul project not a drive-by refactoring so no need to worry.
src/object.d
Outdated
| if (s2.length < len) | ||
| len = s2.length; | ||
|
|
||
| for (size_t u = 0; u < len; u++) |
There was a problem hiding this comment.
foreach (const u; 0 .. len)
There was a problem hiding this comment.
Some of the code is taken from other parts of the system. I'm all for refactoring code during the movaroo, but this is against your (and my) philosophy that moving is one step and refactoring is another.
I'm fine with this particular nit. but I am getting nervous that people see in this PR the ultimate opportunity to improve everything starting with simple refactoring all the way up to a complete reorganization of object.d. Thanks!
There was a problem hiding this comment.
Could you link to the previous location of this code then for review please?
There was a problem hiding this comment.
@somzzz is out for a couple hours but the code should be easy to find - too many instances actually!
$ git grep 'for (size_t u = 0; u < len; u++)'
src/object.d: for (size_t u = 0; u < len; u++)
src/object.d: for (size_t u = 0; u < len; u++)
src/object.d: for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Ag.d: for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Aint.d: for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Aint.d: for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Along.d: for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Along.d: for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Ashort.d: for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Ashort.d: for (size_t u = 0; u < len; u++)
src/rt/util/typeinfo.d: for (size_t u = 0; u < len; u++)
src/rt/util/typeinfo.d: for (size_t u = 0; u < len; u++)
There was a problem hiding this comment.
Thanks. Depending on how careful we want to be, it might be prudent to list exactly what code this replaces, so it's possible to check that there are no behavior or performance regressions.
There was a problem hiding this comment.
Think it over - that wouldn't work (unless you make the import public, which I understand you'd want to avoid).
@andralex I've already suggested public import twice.
What's even more concerning than the code side, is that it adds additional public symbols to the module that is by default imported. That can cause conflicts with existing symbols. What if I already have a function with the same name and a matching signature, will the compiler use that one instead? Or will it use a fully qualified name when invoking it?
What would be even better, and solve both of these problems, is to put the internal symbols in a separate module. When the compiler needs to access an internal symbol it would add a import in the lowering process for that module. It would use a renamed import to a compiler generated symbol. Example:
import __someCompilerGeneratedSymbol123 = rt.internal;
__someCompilerGeneratedSymbol123._adCmpT(a, b);It could even, if necessary, combine that with a selective import to only import what's exactly needed.
There was a problem hiding this comment.
All of the Druntime underscore symbols are internal, implementation-specific and not part of the language.
Actually _adCmpT is a legit symbol. I was thinking we need to replace it with a double-underscore symbol. @somzzz could you please replace it with __cmp here and in the upcoming compiler patch? Thanks!
There was a problem hiding this comment.
What would be even better, and solve both of these problems, is to put the internal symbols in a separate module.
Sorry, no.
There was a problem hiding this comment.
If we are going to turn more things into template instantiations, we probably can't keep putting them in object.d forever, so we will probably need to move things out at some point ... just not now.
There was a problem hiding this comment.
@CyberShadow yah, the ticket is local imports that are loaded on demand if user code triggers this or that functionality. Pay as you go at its best.
|
BTW I figured a completely jump-free 3-way compare: |
|
I prefer to not use urls that may not last for a while. The relevant bits of https://godbolt.org/g/RVJIqw are: produces: |
|
@andralex check the performance. seta is very expensive on some circumstances. |
This is fine only for primitive types. For non-primitive types, this causes two calls to I'm not sure how the old code worked, but it's possible that to avoid performance regressions, we must call |
src/object.d
Outdated
| || (is(U1 == ireal) && is(U2 == ireal)) | ||
| || (is(U1 == cfloat) && is(U2 == cfloat)) | ||
| || (is(U1 == cdouble) && is(U2 == cdouble)) | ||
| || (is(U1 == creal) && is(U2 == creal))) |
There was a problem hiding this comment.
Perhaps extract constraints to a separate template?
There was a problem hiding this comment.
Or even better __traits(isFloating).
|
src/object.d
Outdated
| { | ||
| if (d1 != d1) | ||
| { | ||
| if (d2 != d2) |
There was a problem hiding this comment.
Is there really no better way to do this than the worst case of 4 comparisons?
There was a problem hiding this comment.
Struck my eye too. This code, too, was carried from legacy parts of the system.
src/object.d
Outdated
| static if (is(U1 : Object) && is(U2 : Object)) | ||
| { | ||
| auto c = cast(sizediff_t)(s1.length - s2.length); | ||
| if (c == 0) |
There was a problem hiding this comment.
This breaking change hasn't been addressed yet.
There was a problem hiding this comment.
@somzzz when you fix this please add unittests that wouldn't work with the current code and will pass with the fixed code. Thanks!
There was a problem hiding this comment.
Here's some tests I wrote:
https://dump.thecybershadow.net/2801c21c14f64b3cc72bd5bbb7622888/test.d
Perhaps these should be added to the DMD test suite.
There was a problem hiding this comment.
@CyberShadow cool, thanks! Since druntime actually implements the logic, the tests belong with it. Can you please submit them as a PR? Thanks!
There was a problem hiding this comment.
OK (didn't know Druntime has a non-unittest testsuite): #1784
There was a problem hiding this comment.
It is strange that these tests passed with that faulty implementation as well. It should be fixed now. Thanks!
src/object.d
Outdated
| } | ||
| } | ||
|
|
||
| // integral |
There was a problem hiding this comment.
As before, this isn't just integral, is it? Also structs and (nested) arrays.
src/object.d
Outdated
| if (e1 < e2) | ||
| return -1; | ||
| else if (e1 > e2) | ||
| return 1; |
There was a problem hiding this comment.
Again, two comparisons leading to exponential complexity for non-basic types.
There was a problem hiding this comment.
@CyberShadow what would be the simplest way to address this?
There was a problem hiding this comment.
Well, do what the old code did. Which I would guess would be: essentially for all types for which we can get an integer comparison result, we should do that - call opCmp for structs/classes and / __cmp (i.e. the template introduced here) for arrays.
Compilation speed is a major concern. |
Turning all type info code into templates is more likely to make the compilation process slower. |
|
That's a fallacy... if an overall desired event has an undesired outcome, that doesn't mean we should stop caring about other instances of that outcome. |
PR dlang#1781 is moving array comparisons to a new template; the initial implementation included changes which may have introduced performance and behavior regressions. Add explicit tests for these cases to detect such regressions.
PR dlang#1781 is moving array comparisons to a new template; the initial implementation included changes which may have introduced performance and behavior regressions. Add explicit tests for these cases to detect such regressions.
PR dlang#1781 is moving array comparisons to a new template; the initial implementation included changes which may have introduced performance and behavior regressions. Add explicit tests for these cases to detect such regressions.
My only response to that would be std.datetime. |
| @@ -3241,6 +3241,268 @@ template RTInfo(T) | |||
| enum RTInfo = null; | |||
| } | |||
|
|
|||
There was a problem hiding this comment.
Please add a regular (not documentation) comment stating that this function is called by the lowering of ordering comparisons.
src/object.d
Outdated
| { | ||
| if (e1 < e2) | ||
| return -1; | ||
| else if (e1 > e2) |
| else static if (is(U1 == ireal)) alias F = real; | ||
| else alias F = U1; | ||
|
|
||
| static int compare(F f1, F f2) |
There was a problem hiding this comment.
Cool! This is getting ready, but I now realized I gave you the wrong advice when I told you to move these functions inside __cmp. This has to do with a subtlety in the way nested static functions are generated: even though you constrained compare properly to only accept certain types, the compiler will still generate one instance for each distinct outer template.
I'll approve this (if there are no other issues) and will follow with a fixup.
src/object.d
Outdated
| if (c != 0) | ||
| return c < 0 ? -1 : 1; | ||
| } | ||
| else static if (__traits(compiles, _adCmp(e1, e2))) |
There was a problem hiding this comment.
I used replace in my editor, it failed. Fixing.
|
@CyberShadow I plan to follow up with a bit of factorization, any lingering issues for this PR? |
| { | ||
| auto c = e1.opCmp(e2); | ||
| if (c != 0) | ||
| return c < 0 ? -1 : 1; |
There was a problem hiding this comment.
Any reason not to do return c?
Looks like my outstanding complaints are resolved, I don't see anything else so far. Would be good to see the results from Martin's project tester for the PR that enables using this code. |
I can't see where that comment was made any more, but if that's an option, what about |
|
@CyberShadow I misunderstood something @andralex said when I wrote that comment. It's irrelevant. |
|
So here's the first question about these new functions [1]. This what happens when internal functionality is not properly hidden. Unfortunately the functions that were added later did not have the comment warning about that the function is only used by the compiler. I would not be surprised if someone starts to call them manually soon. [1] http://forum.dlang.org/post/oddrluarorcmvepgbpcs@forum.dlang.org |
|
@jacob-carlborg there's no need for changes at user level. How do you mean functionality is not hidden? Symbols starting with a double underscore are reserved, and the compiler lowers comparisons automatically. |
Exactly. But it's nothing that indicates that except the double underscore, hence the question was asked in the newsgroup. There's not even a comment indicating it shouldn't be called directly by user code.
The compiler doesn't stop you from accessing the symbol. |
|
@jacob-carlborg I see, thanks. I think the double underscore is perfect - it's warning against unwitting use, yet it still makes the symbol accessible for people who know what they're doing and are willing to forgo portability. |
This is an initial implementation of the template function to replace all overloads of
TypeInfo.compare.The purpose is to have the compiler generate calls to this template, rather than a function (adCmp2 in this case https://github.com/dlang/druntime/blob/master/src/rt/adi.d#L445)
Array comparisons will be lowered to this template after a
dmdpatch. An example: