Skip to content
This repository was archived by the owner on Oct 12, 2022. It is now read-only.
/ druntime Public archive

One more pass through __cmp#1787

Merged
andralex merged 8 commits intodlang:masterfrom
andralex:__cmp
Mar 8, 2017
Merged

One more pass through __cmp#1787
andralex merged 8 commits intodlang:masterfrom
andralex:__cmp

Conversation

@andralex
Copy link
Member

@andralex andralex commented Mar 6, 2017

A few more improvements to __cmp to reduce number of templates generated and tighten the code.

@andralex
Copy link
Member Author

andralex commented Mar 6, 2017

@CyberShadow
Copy link
Member

Hmm, diff isn't very useful, a bit difficult to review. In such cases splitting changes across multiple commits is helpful.


int dstrcmp( scope const char[] s1, scope const char[] s2 ) @trusted
{
import core.stdc.string : memcmp;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (dstrcmp change) really ought to be in its own commit... currently there is no rationale. I see it's to add CTFE support?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Affirmative.

src/object.d Outdated
// Special-case imaginary types to use comparison for "concrete" types
static if (is(U1 == ifloat)) alias F = float;
else static if (is(U1 == idouble)) alias F = double;
else static if (is(U1 == ireal)) alias F = real;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, should __cmp perhaps have a constraint that restricts the two arguments to have the same unqualified type? The compiler seems to enforce that when you do the comparison, however if we do somehow end up with an instantiation with e.g. float[] and double[], such as an explicit user invocation, then slightly wrong things will happen (such as casting the doubles to floats).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually casting entire arrays is more than slightly wrong :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently I planted a static assert at the beginning of __cmp. From the looks of it the qualifiers are passed by the compiler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the looks of it the qualifiers are passed by the compiler.

I think that should either be enforced or accounted for, especially if we want to expose this function to users as well (and we do want that, as it would be useful for defining opCmp in user types).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the action item is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CyberShadow there'd be immutable as well. Doesn't the assert that U1 must be the same as U2 cover it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't understand how. Unqual strips immutable so we're still looking at unnecessary template instantiations for all the various constnesses.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CyberShadow I guess we're talking past each other. What do you want this PR to accomplish?

Copy link
Member

@CyberShadow CyberShadow Mar 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's say we have:

int[] a;
const(int)[] b;
immutable(int)[] c;

writeln(a < b);
writeln(b < c);
writeln(a < c);

You said that "the qualifiers are passed by the compiler". I understood that to mean that the compiler strips qualifiers before instantiating __cmp, so there will be only one __cmp instantiation? If so, this should be enforced on the __cmp side.

If, however, there are three template instantiations, two of those should short-circuit to calling the third to avoid redundant code gen. I also mentioned this in Lucia's PR and this is what I thought this PR did from your initial description :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah, that's my understanding as well - three copies will be generated. I can work on folding those together in this PR. Stay tuned.

src/object.d Outdated
return (() @trusted => dstrcmp(cast(char[])s1, cast(char[])s2))();
}
// integral, struct, nested arrays
// Everything else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous comment was more descriptive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It had become untrue due to the changes in this diff. There are only a few leading special cases for objects and imaginary types, so... "Everything that is not a class/interface or imaginary type..."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, but it would still be prudent to be explicit, which documents that we've tried to ensure that the code below should work for the given kinds of types. At this point it is a nitpick though.

foreach (const u; 0 .. len)
{
auto e1 = at(s1, u);
auto e2 = at(s2, u);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, this avoids a copy (and in fact wouldn't work with @disable this(this) structs, would it?)

src/object.d Outdated
return c < 0 ? -1 : 1;
return c;
}
else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it would make sense to add another static if with if (__traits(compiles, { int i = at(s1, u) - at(s2, u); })?

By the way, one thing we need to keep in mind is efficient comparison of structs that don't have an opCmp, and those that have an auto-generated one (such as when one of the fields has one). That's a thing, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The relationship between < and - does not apply to all types.

So does the compiler generate opCmp for structs always/when necessary?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, doesn't look like it. Curiously, for a struct without opCmp, S(1) < S(2) doesn't compile, but [S(1)] < [S(2)] does. If S has a field with an opCmp, it doesn't get called. So I guess that's not something we need to worry about right now.

Though, there's still the question of structs without opCmp. I think the compiler just does a memcmp right now, but someone would need to check. Ideally we'd want to avoid doing that memcmp twice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CyberShadow looks like worth raising a bug report over!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, what's the bug? Is it for the feature request to auto-generate opCmp?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking the fact that S(1) < S(2) does not compile yet [S(1) < S(2)] compiles is definitely a browraiser.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not [S(1) < S(2)] but [S(1)] < [S(2)]. Is that what you meant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. Yah, that's problematic too - you can't order lexicographically unless elements are ordered.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CyberShadow
Copy link
Member

A few more improvements to __cmp to reduce number of templates generated and tighten the code.

From that description I thought this would involve making __cmp a stub which normalizes constness and then calls an implementation (or itself). That would make sense, right? So as to avoid multiple instantiations for ubyte[] / ubyte[], ubyte[] / const(ubyte)[], const(ubyte)[] / immutable(ubyte)[]...

src/object.d Outdated
}
else static if (is(F == float) || is(F == double) || is(F == real))
{
return (f1 > f2) - (f2 > f1);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generates nice code with dmd and gdc, and also much faster than the previous version (and also a bit faster than the obvious alternatives).

Copy link
Member

@WalterBright WalterBright left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good improvement.

@andralex
Copy link
Member Author

andralex commented Mar 6, 2017

Had a bit more of a good time taking __cmp out for class/interface objects. Then the generic loop implementation works just fine, no more special casing needed.

src/object.d Outdated
{
return (f1 > f2) - (f2 > f1);
}
else static assert(false, "Internal error");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? Being explicit is generally better than hoping that the catch-all (else/default) does the right thing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll let folks call __cmp with idouble if they have the need. Just made sure it's not called from the array compare.

src/object.d Outdated
enum RTInfo = null;
}

// Compare floating point numbers for ordering (one instantiation per width).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest enforcing "one instantiation per width" using e.g. static assert(is(Unqual!F == F));

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameters already have const.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a tricky one: the const in the parameters actually does not change F - only the type of f1 / f2.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CyberShadow got a repro? Just tried this:

void fun(F)(const F x)
{
	pragma(msg, F.stringof);
}

void main()
{
	float a;
	fun(a);
	const float b = 1;
	fun(b);
}

Only one copy is generated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, OK - it doesn't apply to floats because the compiler strips head const on call. Only types with indirections.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e. F will still be non-const but that's not a problem because it will always be non-const :)

src/object.d Outdated
{
// structs
static if (__traits(compiles, at(s1, u).opCmp(at(s2, u))))
static if (__traits(compiles, __cmp(at(s1, u), at(s2, u))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the swap? Just curious.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to make sure __cmp gets first crack so we don't call a.opCmp(b) where a is a class type and its value is null.

@somzzz
Copy link
Contributor

somzzz commented Mar 7, 2017

I opened the PR for the dmd lowering to __cmp.
dlang/dmd#6597

There are tests failing in DMD because of comparisons of arrays of struct types without opCmp defined. I talked with @andralex and it seems that this is because of an old issue with using memcmp for comparing arrays of structs, which the current implementation doesn't preserve.

A simple example/test:

struct VariantN(size_t maxDataSize)
{
    ~this() {}
}
alias Variant4 = VariantN!(4);
void test()
{
    Variant4[] a4;
    assert(a4 < a4);
}
void main() {}

This could either be fixed here or I could open a new PR after this is merged. What do you think?

@JackStouffer
Copy link
Contributor

@somzzz One thing at a time.

@andralex
Copy link
Member Author

andralex commented Mar 7, 2017

@JackStouffer the latest commit should take care of things.

return lhs.opCmp(rhs);
}

int __cmp(T)(const T[] lhs, const T[] rhs) @trusted
Copy link
Member Author

@andralex andralex Mar 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CyberShadow ok, so this overload should generate only one copy per scalar type. There are still a couple of opportunities:

  • merge wchar with short implementation
  • merge dchar with uint implementation
  • merge ifloat with float implementation
  • merge idouble with double implementation
  • merge ireal with real implementation

I've let these be because they're relatively rare.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, so this overload should generate only one copy per scalar type

Are you sure? This doesn't even compile:

int __cmp(T)(const T[] lhs, const T[] rhs) @trusted {}

int[] a;
immutable(int)[] b;
__cmp(a, b);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CyberShadow take that back, I decided to implement the whole shebang anyway, it doesn't take much code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worksforme:

void main()
{
	int __cmp(T)(const T[] lhs, const T[] rhs) @trusted { return 0; }

	int[] a;
	immutable(int)[] b;
	__cmp(a, b);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice but constness issue remains, I think that would be good as a test BTW.

{
// Reuse another implementation
return __cmp(cast(U[]) lhs, cast(U[]) rhs);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@andralex
Copy link
Member Author

andralex commented Mar 8, 2017

Since this is approved I'll merge.

@andralex andralex merged commit 1b37491 into dlang:master Mar 8, 2017
@andralex andralex deleted the __cmp branch March 8, 2017 00:27
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants