Skip to content
This repository was archived by the owner on Oct 12, 2022. It is now read-only.
/ druntime Public archive

Array Compare Template Function#1781

Merged
andralex merged 4 commits intodlang:masterfrom
somzzz:adCmpT
Mar 6, 2017
Merged

Array Compare Template Function#1781
andralex merged 4 commits intodlang:masterfrom
somzzz:adCmpT

Conversation

@somzzz
Copy link
Contributor

@somzzz somzzz commented Mar 2, 2017

This is an initial implementation of the template function to replace all overloads of TypeInfo.compare.

The purpose is to have the compiler generate calls to this template, rather than a function (adCmp2 in this case https://github.com/dlang/druntime/blob/master/src/rt/adi.d#L445)
Array comparisons will be lowered to this template after a dmd patch. An example:

bool fun(int[] a, int[] b) {
        return b[] < a[]; // generates call to the template
}

void main()
{
    import std.stdio;
    writeln(fun([1,2,3], [4, 1, 5]));
    writeln(fun([4,6,5], [1, 2, 3]));
}

@somzzz somzzz force-pushed the adCmpT branch 2 times, most recently from 9755eeb to 4c21ce8 Compare March 2, 2017 19:10
@andralex
Copy link
Member

andralex commented Mar 2, 2017

@9il you may be interested - this lifts array comparisons from druntime into templateland. Also cc @WalterBright @MartinNowak @CyberShadow.

@andralex
Copy link
Member

andralex commented Mar 2, 2017

To clarify, the plan is:

  • merge this template, which for the time being is not used by generated code
  • merge the dmd frontend change, which will insert calls to this template

src/object.d Outdated
// U1 gets the unqualified version of T1
static if (is(T1 W == immutable W)) alias U1 = W;
else static if (is(T1 W == const W)) alias U1 = W;
else alias U1 = T1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

core.internal.traits.Unqual ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CyberShadow thx I missed that, thought we only have Unqual in Phobos. @somzzz is there a problem if you add a local import here for core.internal.traits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be fine, I missed it too. Will make the change, thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OT: Btw why don't we move std.meta and std.traits entirely into druntime? This copy/pasting is a bit weird :/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spoke to @somzzz and advised that we should take a larger step once the bit of duplication becomes painful. Since her work will intensify the use of templates in object.d, that's likely to happen.

src/object.d Outdated
}
}
}
return c < 0 ? -1 : c > 0 ? 1 : 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A return here causes "Statement is not reachable" warnings lower

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot an else. Fixed.

src/object.d Outdated
static if (is(U1 : Object) && is(U2 : Object))
{
auto c = cast(sizediff_t)(s1.length - s2.length);
if (c == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking change. Previous behavior was to compare lexicographically (up to the first difference or end of the shortest array).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what's going on here but a trivial test indicates that this actually does seem to be a breaking change. https://dpaste.dzfl.pl/21cf68839945 Did I make a mistake?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WalterBright please illuminate us

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the comments at https://github.com/dlang/druntime/blob/master/src/rt/typeinfo/ti_AC.d#L19:

NOTE: It is sometimes used for arrays of
classes or (incorrectly) interfaces.
But naked `TypeInfo_Array` is mostly used.
See @@@BUG12303@@@.

So there's also https://issues.dlang.org/show_bug.cgi?id=12303 to look at.

This seems to suggest that the "wrong" function and TypeInfo_AC as a whole are not used for normal comparison, but perhaps for hash tables or not at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @CyberShadow, "abc" < "z", you can't short circuit the length.

By the comments at ...

Note that when I click on that link, the coverage reported by github for the whole file is 0%. I don't think that function is ever called.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I applaud this PR. And that note is exactly why we should be doing this. Nobody knows what is called when.

src/object.d Outdated
template Floating(T)
if (is(T == cfloat) || is(T == cdouble) || is(T == creal))
{
pure nothrow @safe:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Nit) unusual indentation style?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@somzzz yah all indents should be multiples of 4. As an aside: I don't think we need any of those annotations, they're all deduced. Maybe for clarity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

src/object.d Outdated
if (is(T == cfloat) || is(T == cdouble) || is(T == creal))
{
pure nothrow @safe:
int compare(T f1, T f2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rt.cmath2._Ccmp?

Copy link
Member

@andralex andralex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff!

src/object.d Outdated
for (size_t u = 0; u < s1.length; u++)
{
Object o1 = s1[u];
Object o2 = s2[u];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use auto for o1 and o2 here so you avoid qualifier trouble.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, you need non-qualified objects. Object.opCmp requires it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point stands - auto ensures unique point of maintenance.

src/object.d Outdated
c = o1.opCmp(o2);
if (c == 0)
continue;
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These last three lines are simply:

if (c != 0)
    break;

src/object.d Outdated
static if (is(U1 : Object) && is(U2 : Object))
{
auto c = cast(sizediff_t)(s1.length - s2.length);
if (c == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused here (I know it's the existing implementation, but just for my edification). This is not lexicographical compare because if an array is shorter than the other it always counts as smaller. For example, if we have an array [hugeWidget], it always counts as smaller than [littleWidget, littleWidget]. Why is this the case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, see #1781 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done to make the comparison faster. No need to load the string into the memory cache if the lengths can be compared.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WalterBright I'm not sure you're right. Per my comment above there may be a shorter array that is greater than a longer array. Please recheck.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense for equality but comparison? And why is it a breaking change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WalterBright By Object do you mean the variable type or the instance type? It's still a change in behavior when the variable is Object but the instance isn't, so never mind my comment above, it's not the correct fix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, if I grep for TypeInfo_Array in dmd source, I get many hits. If I grep for TypeInfo_AC, I get nothing. When is this TypeInfo used?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is this TypeInfo used?

Answer: never #1782

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like I was wrong. The following code:

int cmp(Object[] a, Object[] b)
{
    return a < b;
}

generates:

_D18TypeInfo_AC6Object6__initZ  comdat
        dd      offset FLAT:_D14TypeInfo_Array6__vtblZ
        db      000h,000h,000h,000h     ;....
        dd      offset FLAT:_D6Object7__ClassZ

_D3bug3cmpFAC6ObjectAC6ObjectZi comdat
        assume  CS:_D3bug3cmpFAC6ObjectAC6ObjectZi
L0:             mov     EAX,offset FLAT:_D18TypeInfo_AC6Object6__initZ
                push    EAX
                push    dword ptr 0Ch[ESP]
                push    dword ptr 0Ch[ESP]
                push    dword ptr 01Ch[ESP]
                push    dword ptr 01Ch[ESP]
                call    near ptr __adCmp2
                shr     EAX,01Fh
                add     ESP,014h
                ret     010h

meaning it is calling TypeInfo_Array rather than TypeInfo_AC. The latter is never called any more that I can see. I don't know when this changed, it's been a long time since I've looked at it. This dead code should be removed from druntime. The same goes for TypeInfo_Ag.

The code to generate a typeinfo looks like this, in typinf.d:

extern (C++) TypeInfoDeclaration getTypeInfoDeclaration(Type t)
{
    //printf("Type::getTypeInfoDeclaration() %s\n", t.toChars());
    switch (t.ty)
    {
    case Tpointer:
        return TypeInfoPointerDeclaration.create(t);
    case Tarray:
        return TypeInfoArrayDeclaration.create(t);
    case Tsarray:
        return TypeInfoStaticArrayDeclaration.create(t);
    case Taarray:
        return TypeInfoAssociativeArrayDeclaration.create(t);
    case Tstruct:
        return TypeInfoStructDeclaration.create(t);
    case Tvector:
        return TypeInfoVectorDeclaration.create(t);
    case Tenum:
        return TypeInfoEnumDeclaration.create(t);
    case Tfunction:
        return TypeInfoFunctionDeclaration.create(t);
    case Tdelegate:
        return TypeInfoDelegateDeclaration.create(t);
    case Ttuple:
        return TypeInfoTupleDeclaration.create(t);
    case Tclass:
        if ((cast(TypeClass)t).sym.isInterfaceDeclaration())
            return TypeInfoInterfaceDeclaration.create(t);
        else
            return TypeInfoClassDeclaration.create(t);

    default:
        return TypeInfoDeclaration.create(t);
    }
}

and from todt.d:

    override void visit(TypeInfoArrayDeclaration d)
    {
        //printf("TypeInfoArrayDeclaration.toDt()\n");
        verifyStructSize(Type.typeinfoarray, 3 * Target.ptrsize);

        dtb.xoff(toVtblSymbol(Type.typeinfoarray), 0);    // vtbl for TypeInfo_Array
        dtb.size(0);                                     // monitor

        assert(d.tinfo.ty == Tarray);

        TypeDArray tc = cast(TypeDArray)d.tinfo;

        genTypeInfo(tc.next, null);
        dtb.xoff(toSymbol(tc.next.vtinfo), 0); // TypeInfo for array of type
    }

where there are no special cases for certain array types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha ha ha, great minds think alike! Just sent out #1783 before seeing this!

src/object.d Outdated
pure nothrow @safe:
int compare(T d1, T d2)
{
if (d1 != d1 || d2 != d2) // if either are NaN
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/are/is/

src/object.d Outdated
}
return 1;
}
return (d1 == d2) ? 0 : ((d1 < d2) ? -1 : 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this may be a tad more efficient: return d1 < d2 ? -1 : (d1 > d2);

It actually seems to be: https://godbolt.org/g/XhUSPm, https://godbolt.org/g/T1qwSr

result = 1;
else
result = 0;
return result;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comments below for making this faster

src/object.d Outdated
else if (s1.length > s2.length)
return 1;

return 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return s1.length < s2.length ? -1 : (s1.length > s2.length);

src/object.d Outdated
T[] b = [T.min, T.min];

assert(a > b);
assert(b < a);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this PR, please use calls to _adCmp directly. Then we know the function being tested is actually used. Then, after merging, we go back to using operators. I know, tedious!

assert(_opCmpT(a, b) > 0);
assert(_opCmpT(b, a) < 0);

src/object.d Outdated
T[] b = [T.min, T.min];

assert(a > b);
assert(b < a);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

src/object.d Outdated
T[] b = [T.min_normal, T.min_normal];

assert(a > b);
assert(b < a);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@CyberShadow
Copy link
Member

I'm not up to speed with current Druntime code but it bothers me somewhat that this implements comparison for not just arrays, but also a bunch of primitive types, as inline functions. I think ideally we would have one single function or overload set int cmp(A, B)(A a, B b) which would then have appropriate implementations for various types (thus the inline functions would be moved up as individual overloads of the above). That would also be really useful for all user opCmp implementations. Same with equality and hash calculations.

@andralex
Copy link
Member

andralex commented Mar 2, 2017

@CyberShadow yah, this would be a good step in that direction.

@quickfur
Copy link
Member

quickfur commented Mar 2, 2017

Yeah, in order to implement array comparisons we need comparisons for array elements, i.e., basically all types. Why not just unify the logic into a single template that specializes for each concrete type?

@andralex
Copy link
Member

andralex commented Mar 2, 2017

@quickfur getting there! For now we are carefully replacing one lowering with another.

src/object.d Outdated
void compareMinMax(T)()
{
T[] a = [T.max, T.max];
T[] b = [T.min_normal, T.min_normal];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you define em like this they will be on the heap.
T[2] please.

src/object.d Outdated
{
void compareMinMax(T)()
{
T[] a = [T.max, T.max];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

T[2]

@UplinkCoder
Copy link
Member

@WalterBright short-circuiting on the length only works for opEquals. not for opCmp.

src/object.d Outdated
enum RTInfo = null;
}

int _adCmpT(T1, T2)(T1[] s1, T2[] s2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to turn this into a stub that then calls the implementation with U1 and U2 to lower the number of template instantiations, if possible.

src/object.d Outdated
pure nothrow @safe:
int compare(T[] s1, T[] s2)
{
size_t len = s1.length;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const instead of size_t

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

immutable?

@jpf91
Copy link
Contributor

jpf91 commented Mar 2, 2017

merge the dmd frontend change, which will insert calls to this template

Just a small request for this and similar changes: Make DMD fail with some nice error message when the template can't be found iff it actually needs to be used by some array comparison. This will please the betterC crowd as they'll get a nice error when a custom runtime doesn't implement some function instead of the linker error we get now. Additionally as long as they don't use array comparisons they won't have to implement this template.

@andralex
Copy link
Member

andralex commented Mar 2, 2017

Let's also see how #1783 fares.

@schveiguy
Copy link
Member

Let's also see how #1783 fares.

#1782 (GMTA) shows that it passes all tests (at least on two platforms before something was merged). I think TypeInfo_AC is not used anywhere.

src/object.d Outdated
return () @trusted { return Array!F.compare(cast(F[])s1, cast(F[])s2); }();
}

// integral, char types
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The troubles with this generic code:

  1. it is slow, especially for strings. Contrast this with dstrcmp() which is currently used for strings which uses memcmp which is heavily optimized

  2. it creates separate implementations for sbyte and char and immutable(char) and const(char), although those implementations will be identical. It also creates these implementations over and over and over, because everyone uses strings. This is not a runtime problem because the duplicate string implementations are merged, but it is a problem for compilation speed.

  3. The use of s1[u] means an array bounds check is (uselessly) generated, further slowing things down.

I suggest that many cases can be combined, and then forwarded to a heavily optimized non-template function that implements it, such as dstrcmp().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, float, ifloat, const(float), immutable(float), const(ifloat), immutable(ifloat) should all be forwarded to one non-template function. Ditto for void, byte, char, and the qualified variants of them. Etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WalterBright yah that's the next good step. Until now @somzzz has worked toward getting together something that works amid all subtleties discussed here and elsewhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Next step, sure, but I am concerned that the next step doesn't get taken soon enough and we release a version that has substantially degraded performance. There's no way the manual loop will come close to memcmp().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WalterBright thx!

  1. Calling dstrcmp didn't work in this approach because it was invoked during compilation. So we wanted to have something simple that passes muster. At this point @somzzz will look into special casing dstrcmp with if (__ctfe).
  2. Fixing (1) will also fix (2) - the template becomes an inlined one-liner.
  3. using s1.ptr[u] has a different issue - it's not deducible as @safe. I've explained @somzzz to use the idiom for generic trusted code.

For example, float, ifloat, const(float), immutable(float), const(ifloat), immutable(ifloat) should all be forwarded to one non-template function.

That is the case already.

Next step, sure, but I am concerned that the next step doesn't get taken soon enough and we release a version that has substantially degraded performance. There's no way the manual loop will come close to memcmp().

There actually is if the size is short. On a larger note, this is a long-haul project not a drive-by refactoring so no need to worry.

src/object.d Outdated
if (s2.length < len)
len = s2.length;

for (size_t u = 0; u < len; u++)
Copy link
Member

@WalterBright WalterBright Mar 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foreach (const u; 0 .. len)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the code is taken from other parts of the system. I'm all for refactoring code during the movaroo, but this is against your (and my) philosophy that moving is one step and refactoring is another.

I'm fine with this particular nit. but I am getting nervous that people see in this PR the ultimate opportunity to improve everything starting with simple refactoring all the way up to a complete reorganization of object.d. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you link to the previous location of this code then for review please?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@somzzz is out for a couple hours but the code should be easy to find - too many instances actually!

$ git grep 'for (size_t u = 0; u < len; u++)'
src/object.d:        for (size_t u = 0; u < len; u++)
src/object.d:        for (size_t u = 0; u < len; u++)
src/object.d:        for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Ag.d:        for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Aint.d:        for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Aint.d:        for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Along.d:        for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Along.d:        for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Ashort.d:        for (size_t u = 0; u < len; u++)
src/rt/typeinfo/ti_Ashort.d:        for (size_t u = 0; u < len; u++)
src/rt/util/typeinfo.d:        for (size_t u = 0; u < len; u++)
src/rt/util/typeinfo.d:        for (size_t u = 0; u < len; u++)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Depending on how careful we want to be, it might be prudent to list exactly what code this replaces, so it's possible to check that there are no behavior or performance regressions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think it over - that wouldn't work (unless you make the import public, which I understand you'd want to avoid).

@andralex I've already suggested public import twice.

What's even more concerning than the code side, is that it adds additional public symbols to the module that is by default imported. That can cause conflicts with existing symbols. What if I already have a function with the same name and a matching signature, will the compiler use that one instead? Or will it use a fully qualified name when invoking it?

What would be even better, and solve both of these problems, is to put the internal symbols in a separate module. When the compiler needs to access an internal symbol it would add a import in the lowering process for that module. It would use a renamed import to a compiler generated symbol. Example:

import __someCompilerGeneratedSymbol123 = rt.internal;
__someCompilerGeneratedSymbol123._adCmpT(a, b);

It could even, if necessary, combine that with a selective import to only import what's exactly needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the Druntime underscore symbols are internal, implementation-specific and not part of the language.

Actually _adCmpT is a legit symbol. I was thinking we need to replace it with a double-underscore symbol. @somzzz could you please replace it with __cmp here and in the upcoming compiler patch? Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be even better, and solve both of these problems, is to put the internal symbols in a separate module.

Sorry, no.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to turn more things into template instantiations, we probably can't keep putting them in object.d forever, so we will probably need to move things out at some point ... just not now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CyberShadow yah, the ticket is local imports that are loaded on demand if user code triggers this or that functionality. Pay as you go at its best.

@andralex
Copy link
Member

andralex commented Mar 3, 2017

BTW I figured a completely jump-free 3-way compare: (a > b) - (a < b). This generates no jumps with gcc: https://godbolt.org/g/RVJIqw.

@WalterBright
Copy link
Member

WalterBright commented Mar 3, 2017

I prefer to not use urls that may not last for a while. The relevant bits of https://godbolt.org/g/RVJIqw are:

int test3(double a, double b) {
  return (a > b) - (a < b);
}

produces:

        xor     eax, eax
        ucomisd xmm0, xmm1
        seta    al
        xor     edx, edx
        ucomisd xmm1, xmm0
        seta    dl
        sub     eax, edx
        ret

@UplinkCoder
Copy link
Member

@andralex check the performance. seta is very expensive on some circumstances.

@CyberShadow
Copy link
Member

(a > b) - (a < b)

This is fine only for primitive types. For non-primitive types, this causes two calls to opCmp or the Druntime comparison operators. If the types themselves then contain other non-primitive types, this will in turn cause a total of four calls (per instance) to them, thus it becomes 2^^(nesting depth of the data structure) comparisons.

I'm not sure how the old code worked, but it's possible that to avoid performance regressions, we must call opCmp or the old Druntime comparison functions directly instead of doing two comparisons.

src/object.d Outdated
|| (is(U1 == ireal) && is(U2 == ireal))
|| (is(U1 == cfloat) && is(U2 == cfloat))
|| (is(U1 == cdouble) && is(U2 == cdouble))
|| (is(U1 == creal) && is(U2 == creal)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps extract constraints to a separate template?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

things are fine here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even better __traits(isFloating).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jacob-carlborg very nice! cc @somzzz

@jacob-carlborg
Copy link
Contributor

  • Would it be useful as a transition, if possible, to forward the existing implementation to this function without modifying the compiler?

  • A generic note, I think that the object module is getting very crowed. I'm leaning towards that we should completely avoid declarations directly in the object module. Instead defining the declarations in other modules and forward them to object using public import and similar

src/object.d Outdated
{
if (d1 != d1)
{
if (d2 != d2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there really no better way to do this than the worst case of 4 comparisons?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Struck my eye too. This code, too, was carried from legacy parts of the system.

src/object.d Outdated
static if (is(U1 : Object) && is(U2 : Object))
{
auto c = cast(sizediff_t)(s1.length - s2.length);
if (c == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaking change hasn't been addressed yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@somzzz when you fix this please add unittests that wouldn't work with the current code and will pass with the fixed code. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's some tests I wrote:

https://dump.thecybershadow.net/2801c21c14f64b3cc72bd5bbb7622888/test.d

Perhaps these should be added to the DMD test suite.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CyberShadow cool, thanks! Since druntime actually implements the logic, the tests belong with it. Can you please submit them as a PR? Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK (didn't know Druntime has a non-unittest testsuite): #1784

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is strange that these tests passed with that faulty implementation as well. It should be fixed now. Thanks!

src/object.d Outdated
}
}

// integral
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As before, this isn't just integral, is it? Also structs and (nested) arrays.

src/object.d Outdated
if (e1 < e2)
return -1;
else if (e1 > e2)
return 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, two comparisons leading to exponential complexity for non-basic types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CyberShadow what would be the simplest way to address this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, do what the old code did. Which I would guess would be: essentially for all types for which we can get an integer comparison result, we should do that - call opCmp for structs/classes and / __cmp (i.e. the template introduced here) for arrays.

@andralex
Copy link
Member

andralex commented Mar 4, 2017

I don't think that there would be anything stopping object.d being turned into a package.

Compilation speed is a major concern.

@jacob-carlborg
Copy link
Contributor

jacob-carlborg commented Mar 4, 2017

Compilation speed is a major concern.

Turning all type info code into templates is more likely to make the compilation process slower.

@CyberShadow
Copy link
Member

That's a fallacy... if an overall desired event has an undesired outcome, that doesn't mean we should stop caring about other instances of that outcome.

CyberShadow added a commit to CyberShadow/druntime that referenced this pull request Mar 4, 2017
PR dlang#1781 is moving array comparisons to a new template; the initial
implementation included changes which may have introduced performance
and behavior regressions.

Add explicit tests for these cases to detect such regressions.
CyberShadow added a commit to CyberShadow/druntime that referenced this pull request Mar 4, 2017
PR dlang#1781 is moving array comparisons to a new template; the initial
implementation included changes which may have introduced performance
and behavior regressions.

Add explicit tests for these cases to detect such regressions.
CyberShadow added a commit to CyberShadow/druntime that referenced this pull request Mar 4, 2017
PR dlang#1781 is moving array comparisons to a new template; the initial
implementation included changes which may have introduced performance
and behavior regressions.

Add explicit tests for these cases to detect such regressions.
@ibuclaw
Copy link
Member

ibuclaw commented Mar 5, 2017

Compilation speed is a major concern.

My only response to that would be std.datetime.

@@ -3241,6 +3241,268 @@ template RTInfo(T)
enum RTInfo = null;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a regular (not documentation) comment stating that this function is called by the lowering of ordering comparisons.

src/object.d Outdated
{
if (e1 < e2)
return -1;
else if (e1 > e2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the else

else static if (is(U1 == ireal)) alias F = real;
else alias F = U1;

static int compare(F f1, F f2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! This is getting ready, but I now realized I gave you the wrong advice when I told you to move these functions inside __cmp. This has to do with a subtlety in the way nested static functions are generated: even though you constrained compare properly to only accept certain types, the compiler will still generate one instance for each distinct outer template.

I'll approve this (if there are no other issues) and will follow with a fixup.

src/object.d Outdated
if (c != 0)
return c < 0 ? -1 : 1;
}
else static if (__traits(compiles, _adCmp(e1, e2)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this be __cmp?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used replace in my editor, it failed. Fixing.

@andralex
Copy link
Member

andralex commented Mar 5, 2017

@CyberShadow I plan to follow up with a bit of factorization, any lingering issues for this PR?

Copy link
Member

@andralex andralex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

return -1;
else if (e1 > e2)

if (e1 > e2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noice

{
auto c = e1.opCmp(e2);
if (c != 0)
return c < 0 ? -1 : 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to do return c?

@CyberShadow
Copy link
Member

any lingering issues for this PR?

Looks like my outstanding complaints are resolved, I don't see anything else so far. Would be good to see the results from Martin's project tester for the PR that enables using this code.

@CyberShadow
Copy link
Member

Is this code better?

 if (e1 == e2)
   continue;
 return e1 - e2;

I can't see where that comment was made any more, but if that's an option, what about auto c = e1 - e2; if (c) return c;

@somzzz
Copy link
Contributor Author

somzzz commented Mar 6, 2017

@CyberShadow I misunderstood something @andralex said when I wrote that comment. It's irrelevant.

@andralex andralex merged commit b338a01 into dlang:master Mar 6, 2017
@jacob-carlborg
Copy link
Contributor

So here's the first question about these new functions [1]. This what happens when internal functionality is not properly hidden. Unfortunately the functions that were added later did not have the comment warning about that the function is only used by the compiler. I would not be surprised if someone starts to call them manually soon.

[1] http://forum.dlang.org/post/oddrluarorcmvepgbpcs@forum.dlang.org

@andralex
Copy link
Member

@jacob-carlborg there's no need for changes at user level. How do you mean functionality is not hidden? Symbols starting with a double underscore are reserved, and the compiler lowers comparisons automatically.

@jacob-carlborg
Copy link
Contributor

there's no need for changes at user level

Exactly. But it's nothing that indicates that except the double underscore, hence the question was asked in the newsgroup. There's not even a comment indicating it shouldn't be called directly by user code.

How do you mean functionality is not hidden.

The compiler doesn't stop you from accessing the symbol.

@andralex
Copy link
Member

@jacob-carlborg I see, thanks. I think the double underscore is perfect - it's warning against unwitting use, yet it still makes the symbol accessible for people who know what they're doing and are willing to forgo portability.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.