Fix Issue 18280 - std.algorithm.comparison.cmp for non-strings should call opCmp only once per item pair #6056
Thanks for your pull request, @n8sh! We are looking forward to reviewing it, and you should be hearing from a maintainer soon. Some tips to help speed things up:
Bear in mind that large or tricky changes may require multiple rounds of review and revision. Please see CONTRIBUTING.md for more information.
std/algorithm/comparison.d
Outdated
static if (!(isSomeString!R1 && isSomeString!R2))
{
    static if (is(typeof(pred) : string))
        enum isLessThan = pred == "a < b";
FYI: @andralex isn't a huge fan of string comparisons, see #4265 (comment)
I did a bit of digging: the string specialization of cmp was added in 2011: a0ecf2a
Good news, dlang/dmd#7484 is on. With it we should be able to switch to lambdas soon. cc @RazvanN7
std/algorithm/comparison.d
Outdated
if (binaryFun!pred(b, a)) return 1;
static if (isLessThan && is(typeof(r1.front.opCmp(r2.front)) : int))
{
    immutable int c = r1.front.opCmp(r2.front);
Why not r1.front < r2.front?
In order to get a "threeWay" value (as it is called elsewhere in this function): negative if r1.front < r2.front, positive if r2.front < r1.front, zero if neither.
I think this is getting a bit on the odd side. How about we split cmp into two overloads:
int cmp(R1, R2)(R1 r1, R2 r2)
if (isInputRange!R1 && isInputRange!R2);
int cmp(alias pred, R1, R2)(R1 r1, R2 r2)
if (isInputRange!R1 && isInputRange!R2);
Group documentation together. Documentation specifies that without a predicate specified, opCmp is used.
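A minimal sketch of how the two proposed overloads could coexist. The name `cmpSketch` and both bodies are assumptions made for illustration; this is not the code that was merged. The point is that a call without template arguments can only match the first overload, while a call with an explicit predicate can only match the second.

```d
import std.functional : binaryFun;
import std.range.primitives : empty, front, popFront, isInputRange;

// No predicate: use opCmp when the elements define it (one call per pair),
// otherwise fall back to the built-in `<`.
int cmpSketch(R1, R2)(R1 r1, R2 r2)
if (isInputRange!R1 && isInputRange!R2)
{
    for (;; r1.popFront(), r2.popFront())
    {
        if (r1.empty) return r2.empty ? 0 : -1;
        if (r2.empty) return 1;
        auto a = r1.front;
        auto b = r2.front;
        static if (is(typeof(a.opCmp(b)) : int))
        {
            immutable int c = a.opCmp(b);
            if (c != 0) return c;
        }
        else
        {
            if (a < b) return -1;
            if (b < a) return 1;
        }
    }
}

// Explicit predicate: a two-way test, so up to two calls per pair.
int cmpSketch(alias pred, R1, R2)(R1 r1, R2 r2)
if (isInputRange!R1 && isInputRange!R2)
{
    for (;; r1.popFront(), r2.popFront())
    {
        if (r1.empty) return r2.empty ? 0 : -1;
        if (r2.empty) return 1;
        auto a = r1.front;
        auto b = r2.front;
        if (binaryFun!pred(a, b)) return -1;
        if (binaryFun!pred(b, a)) return 1;
    }
}

void main()
{
    assert(cmpSketch([1, 2, 3], [1, 2, 4]) < 0);        // first overload
    assert(cmpSketch!"a > b"([1, 2, 3], [1, 2, 4]) > 0); // second overload
    assert(cmpSketch([1, 2], [1, 2]) == 0);
}
```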
andralex left a comment
I committed a few changes. One thing I'm still worried about is this won't work for types that define opCmp to return a floating-point number.
Please remove the test for pred.
std/algorithm/comparison.d
Outdated
static if (is(typeof(pred) : string)
    && (__traits(compiles, { static assert("a < b" == pred); })))
{
    // In case someone was explicitly using the default predicate.
please remove, no need to accommodate this case
Sorry, I clobbered your changes when I force pushed to add a unittest.

OK, I submitted them again. I think since we got here we should fix the

For float

@n8sh Stop the comparison and forward the NaN result. So the core of the loop would be:
auto c = r1.front.opCmp(r2.front);
if (c != 0) return c;
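To illustrate the suggestion above, here is a rough sketch of a comparison loop that lets a floating-point opCmp drive the result, so a NaN ("unordered") value is forwarded to the caller. The names `Approx` and `cmpForward` are hypothetical, and the sketch assumes every element type defines opCmp.

```d
import std.math : isNaN;
import std.range.primitives : empty, front, popFront;

// Hypothetical element type whose opCmp returns a double.
struct Approx
{
    double value;

    double opCmp(const Approx rhs) const
    {
        return value - rhs.value; // NaN if either side is NaN
    }
}

// Sketch: the result type follows whatever opCmp returns.
auto cmpForward(R1, R2)(R1 r1, R2 r2)
{
    alias Result = typeof(r1.front.opCmp(r2.front));
    for (;; r1.popFront(), r2.popFront())
    {
        if (r1.empty) return r2.empty ? Result(0) : Result(-1);
        if (r2.empty) return Result(1);
        auto c = r1.front.opCmp(r2.front);
        // NaN != 0 is true, so an unordered pair stops the loop
        // and the NaN is returned as-is.
        if (c != 0) return c;
    }
}

void main()
{
    auto a = [Approx(1.0), Approx(double.nan), Approx(5.0)];
    auto b = [Approx(1.0), Approx(2.0), Approx(0.0)];
    assert(isNaN(cmpForward(a, b)));      // NaN forwarded from the second pair
    assert(cmpForward(b[0 .. 1], b) < 0); // a proper prefix orders first
    assert(cmpForward(b, b) == 0);
}
```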
std/algorithm/comparison.d
Outdated
@@ -614,40 +623,41 @@ if (isInputRange!R1 && isInputRange!R2)
{
    if (r1.empty) return -cast(int)!r2.empty;
    if (r2.empty) return !r1.empty;
For these two returns you need a bit of massaging. I'd say something like this:
static if (is(typeof(r1.front.opCmp(r2.front)) R))
    alias Result = R;
else
    alias Result = int;
if (r1.empty) return -Result(!r2.empty);
if (r2.empty) return Result(!r1.empty);
...
std/algorithm/comparison.d
Outdated
result = cmp!"a > b"("", "");
assert(result == 0);
result = cmp!"a > b"("abc", "abcd");
assert(result > 0);
This behavior is a bug. r1 is a prefix of r2, so the result should be negative even if we are comparing in reverse alphabetical order, but cmp is using pred to compare the lengths! I have no sense of whether anyone might have been relying on this for reverse order comparison. It is a violation of the documentation and it only works for strings. Should it be fixed?
That's terrible. I'd say open a bug with this repro then mark it as fixed by this PR. That way we'll have it in the changelog.
@n8sh you'll need to add a few more unittests because now you are generalizing cmp. Sorry this fell on you and thanks for the quality code!

Found another bad bug. This code (example: https://run.dlang.io/is/feq4KK):
void main()
{
    import std.algorithm.comparison : cmp;
    static bool lessThanCaseInsensitive(size_t a, size_t b)
    {
        import std.ascii : toUpper;
        return toUpper(cast(dchar) a) < toUpper(cast(dchar) b);
    }
    static assert(cmp!lessThanCaseInsensitive("apple2", "APPLE1") != 0,
        "These are clearly not the same!");
}
@n8sh it's a shame these haven't been exposed already. Thanks for your work! Please file that bug, too (or massage it with the other) and then mark them as fixed with this PR.
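As a hedged illustration of the behavior both bug reports expect, the sketch below applies the predicate to elements only. Range exhaustion is decided without consulting the predicate, and a pair that compares equal under the predicate does not end the comparison. `cmpByPred` and `lessIgnoringCase` are hypothetical names, and this is not the Phobos implementation.

```d
import std.ascii : toUpper;
import std.functional : binaryFun;
import std.range.primitives : empty, front, popFront;

// Sketch of a predicate-driven three-way comparison.
int cmpByPred(alias pred, R1, R2)(R1 r1, R2 r2)
{
    for (;; r1.popFront(), r2.popFront())
    {
        // Lengths are never passed to pred: a proper prefix is always "less".
        if (r1.empty) return r2.empty ? 0 : -1;
        if (r2.empty) return 1;
        if (binaryFun!pred(r1.front, r2.front)) return -1;
        if (binaryFun!pred(r2.front, r1.front)) return 1;
        // Neither is less than the other: keep scanning.
    }
}

// Hypothetical case-insensitive ordering on decoded characters.
bool lessIgnoringCase(dchar a, dchar b)
{
    return toUpper(a) < toUpper(b);
}

void main()
{
    // Prefix case: negative even under a reversed ordering.
    assert(cmpByPred!"a > b"("abc", "abcd") < 0);
    // Distinct characters that compare equal must not stop the scan early.
    assert(cmpByPred!lessIgnoringCase("apple2", "APPLE1") > 0);
    assert(cmpByPred!lessIgnoringCase("apple", "APPLE") == 0);
}
```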
@n8sh I made a push to your code that removes

Approved, please squash and then we'll merge. Thanks!!

Aaand made one more minor optimization -
force-pushed from 4d9af90 to 78c2ceb
… call opCmp only once per item pair
split cmp into two overloads per @andralex dlang#6056 (review)
Minor adjustments, again
cmp should return auto and let opCmp drive dlang#6056 (comment)
Fix Issue 18285 - std.algorithm.comparison.cmp for strings with custom predicate compares lengths wrong
Test std.algorithm.comparison.cmp when opCmp returns float
Promotions should not use cast
Optimize cmp's endgame
There are some redundant tests when the end of the ranges is reached. Eliminated that, and improved threeWayByPred.
Fix Issue 18286 - std.algorithm.comparison.cmp for string with custom predicate fails if distinct chars can compare equal
Fix Issue 18288 - std.algorithm.comparison.cmp for wide strings should be @safe
re-apply remove cast in promotions
force-pushed from cb5e5ab to 0710578
Design-wise I would be happier if

@n8sh perhaps the way forward is to make it an option but also keep the existing behavior by querying the predicate type via static if.

There's always another name such as
The bot picked up 4 issues here, but it only closed one. Bug?

After @aG0aep6G's message I closed the issues manually, but I don't know if that might result in them not getting added to the next release changelog.

A bit of explanation: The bugs are closed by the Bugzilla integration, not the bot. Now to the bug: it looks like the Bugzilla integration is a bit dumber and only picks up one bug regex per commit. There's not much I can do about it (except for writing our own Bugzilla integration).
Right now, when comparing non-string ranges, std.algorithm.comparison.cmp calls the comparison predicate twice for every pair of elements to determine their order. When the elements have overloaded opCmp and the predicate is "a < b", we only need to call opCmp once for each pair.
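The cost difference described above can be made visible with a small counting experiment. This is an illustrative sketch only: `Counted`, `viaPredicate`, and `viaOpCmp` are hypothetical names, and the two helpers stand in for the "before" and "after" strategies rather than reproducing the Phobos code.

```d
// Hypothetical element type that counts how often opCmp runs.
struct Counted
{
    int value;
    static int calls;

    int opCmp(const Counted rhs) const
    {
        ++calls;
        return (value > rhs.value) - (value < rhs.value);
    }
}

// Two-way predicate strategy: `a < b` lowers to `a.opCmp(b) < 0`,
// so deciding that two elements are equal costs two opCmp calls.
int viaPredicate(Counted a, Counted b)
{
    if (a < b) return -1;
    if (b < a) return 1;
    return 0;
}

// Three-way strategy: a single opCmp call already carries the full answer.
int viaOpCmp(Counted a, Counted b)
{
    return a.opCmp(b);
}

void main()
{
    auto a = Counted(2);
    auto b = Counted(2);

    Counted.calls = 0;
    assert(viaPredicate(a, b) == 0);
    assert(Counted.calls == 2);

    Counted.calls = 0;
    assert(viaOpCmp(a, b) == 0);
    assert(Counted.calls == 1);
}
```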