Skip to content

Issue 6620 - argument evaluation order inversed for extern(C)#4035

Closed
9rnsr wants to merge 5 commits intodlang:masterfrom
9rnsr:fix6620
Closed

Issue 6620 - argument evaluation order inversed for extern(C)#4035
9rnsr wants to merge 5 commits intodlang:masterfrom
9rnsr:fix6620

Conversation

@9rnsr
Copy link
Contributor

@9rnsr 9rnsr commented Sep 30, 2014

@9rnsr 9rnsr force-pushed the fix6620 branch 2 times, most recently from 3688cb9 to 5d75378 Compare October 1, 2014 05:10
@9rnsr
Copy link
Contributor Author

9rnsr commented Oct 1, 2014

Ready to merge.

@ibuclaw
Copy link
Member

ibuclaw commented Oct 1, 2014

Thanks, looks like we'll finally get ARM + x86 working alike. :)

@jpf91
Copy link
Contributor

jpf91 commented Oct 6, 2014

ping @WalterBright @andralex

src/arrayop.c Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is "the head"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means "the head of argument list". Please teach me a better comment if you have.

@WalterBright
Copy link
Member

This does much more than the evaluation order for function arguments. It is also changing the order of array operations.

Array operations are currently left-to-right, except for assignments, which are right-to-left. This is not an accident. It turns out that faster code usually results from this, because fewer registers and temporaries are required to evaluate assignment right-to-left.

@9rnsr
Copy link
Contributor Author

9rnsr commented Oct 13, 2014

@WalterBright Sorry, I'm really not sure what is the expected behavior.

Today, vector operation behavior depends on issue 6620, because:

int[] a, b, c;
a[] = b[] + c[];

is lowered to:

__arraySliceSliceAddSliceAssign_i(a[], c[], b[]);

And the array op is evaluated with the order b -> c -> a so the generated function __arraySliceSliceAddSliceAssign_i is extern(C) and its arguments are affected by the issue.

And after the glue-layer fix, the evaluation order will be changed to a -> c -> b. I thought it would be much weird than strict evaluation order a -> b -> c, therefore I fixed src/arrayop.c to implement "strict L-to-R evaluation order".

@ibuclaw
Copy link
Member

ibuclaw commented Oct 13, 2014

And the array op is evaluated with the order b -> c -> a so the generated function __arraySliceSliceAddSliceAssign_i is extern(C) and its arguments are affected by the issue.

In case the meaning is still lost on people. In its current state, for all other architectures, the order is a -> c -> b - which is obviously broken.

@yebblies
Copy link
Contributor

Sorry, I'm really not sure what is the expected behavior.

I think Walter is saying that the explicit order tests in arrayop.d are the correct behavior. The fact that the entire array op is lowered to an extern(C) function is an implementation detail. i.e. All those tests should be unchanged, only explicit calls to extern(C) functions should have any noticeable change.

@9rnsr
Copy link
Contributor Author

9rnsr commented Oct 13, 2014

I think Walter is saying that the explicit order tests in arrayop.d are the correct behavior.

Unfortunately, even if a test suite tests the behavior of compiler generated code, but I sometimes doubt that the behavior is not actually designed, especially when the test case is not connected to a bugzilla issue.

From some reasons I thought that the current runnable/arrayop.d result is not actually designed behavior, because:

  1. As far as I know, the "expected LTR evaluation order" does not have any exception for vector ops.
  2. Current vector-ops evaluation order depends on the issue 6620. In other words, it depends on the backend suite. So I couldn't think it is a part of D language specification.

So I'd question to @WalterBright here: Which is the expected behavior and we should go?

@yebblies
Copy link
Contributor

Unfortunately, even if a test suite tests the behavior of compiler generated code, but I sometimes doubt that the behavior is not actually designed, especially when the test case is not connected to a bugzilla issue.

And even if it was designed, that doesn't necessarily mean it is desirable.

From some reasons I thought that the current runnable/arrayop.d result is not actually designed behavior, because:

  1. As far as I know, the "expected LTR evaluation order" does not have any exception for vector ops.
  2. Current vector-ops evaluation order depends on the issue 6620. In other words, it depends on the backend suite. So I couldn't think it is a part of D language specification.

It actually is in the spec.

"The vector assignment operators are evaluated right to left, and the other binary operators are evaluated left to right."

@9rnsr
Copy link
Contributor Author

9rnsr commented Oct 13, 2014

If we should really keep current array-op evaluation order, we need to modify front-end and inliner code drastically to handle the special evaluation order, to distinguish a function call is really a lowering of array-op or not. I cannot feel it would be worth to implement it.

@yebblies
Copy link
Contributor

If we should really keep current array-op evaluation order, we need to modify front-end and inliner code drastically to handle the special evaluation order, to distinguish a function call is really a lowering of array-op or not. I cannot feel it would be worth to implement it.

It doesn't have to be that bad, array-ops can survive intact to the glue layer and be transformed into a call there.

@9rnsr
Copy link
Contributor Author

9rnsr commented Oct 13, 2014

It doesn't have to be that bad

I will never implement it.

@yebblies
Copy link
Contributor

I will never implement it.

Haha fair enough. While it's on my radar it's not exactly high on my priority list.

@9rnsr
Copy link
Contributor Author

9rnsr commented Oct 13, 2014

@WalterBright I have still question.

Array operations are currently left-to-right, except for assignments, which are right-to-left. This is not an accident. It turns out that faster code usually results from this, because fewer registers and temporaries are required to evaluate assignment right-to-left.

Is that generic opinion, or dmd backend specific? In this PR, I'm reorder extern(C) function argument IRs in reverse. Is that will make all extern(C) function calls slower?

@yebblies
Copy link
Contributor

I don't have any objections to changing the order of evaluation for array-ops, so we'll have to see if Walter can be convinced.

@9rnsr
Copy link
Contributor Author

9rnsr commented Oct 13, 2014

OK, I'll wait Walter's answer.

@yebblies
Copy link
Contributor

I will never implement it.

Actually, you could also do it by explicitly evaluating the arguments to temp variables in the frontend.

eg Lowering

a[] = b[] + c[];

to

auto tmpb = b[];
auto tmpc = c[];
auto tmpa = a[];
__arrayOpImp(tmpa, tmpc, tmpb);

@9rnsr
Copy link
Contributor Author

9rnsr commented Oct 13, 2014

Actually, you could also do it by explicitly evaluating the arguments to temp variables in the frontend.

Of course it can. But it would add "more temporaries" and might slow down vector-ops. So I need to ask another question: "Current RTL evaluation order is really necessary?"

@yebblies
Copy link
Contributor

Of course it can. But it would add "more temporaries" and might slow down vector-ops.

Any reasonable optimizer would have no problem. Even dmd might be able to handle it.

So I need to ask another question: "Current RTL evaluation order is really necessary?"

Yeah I dunno. I suspect nobody every bothered to benchmark it.

@WalterBright
Copy link
Member

Consider: *p = b * c + d * e;. With L-R evaluation, we load p into a register and then must preserve that register while the rvalue is evaluated. The rvalue will require 3 registers (generally speaking). So, L-R of this requires 4 registers, while R-L requires 3 registers.

Minimizing register usage has a large impact on performance. This is why assignments are evaluated R-L, while other expressions are L-R. Yes, there are a lot of special case optimizations which will reduce the number of registers required, but in the general case, it remains.

The issue with order of evaluation is not, from the user's perspective, really about whether it is L-R or R-L. It is about it being an essentially unpredictable order. Making it consistent in D is where we need to go. For example,

foo(++x, ++x);

should have repeatable results. What that result is doesn't really matter, it just needs to be repeatable across compiles and compilers.

@9rnsr
Copy link
Contributor Author

9rnsr commented Oct 14, 2014

@WalterBright To me your argument sound reasonable for normal arithmetic operations. But there's still questions.

  1. Array operation cannot be calculated incrementally different from normal arithmetic operations. In a[] = b[] + c[], we cannot store the result of b[] + c[] in a register.
  2. As far as I know, performance critical vector operation is now recommended to use core.simd types. Therefore, I thought that normal array operation would be just a syntactic sugar for the loop calculation today. Is the performance still important on array operations?

OT: Currently normal assignment is evaluated left to right. Maybe is it a bug?

import core.stdc.stdio;
ref int a() { static int x; printf("a\n"); return x; }
int b() { printf("b\n"); return 1; }
int c() { printf("c\n"); return 2; }
void main() { a() = b() + c(); }   // evaluation order is: a->b->c

@WalterBright
Copy link
Member

@9rnsr ,

  1. Right, but I was thinking of consistency
  2. Right, but again I was thinking of consistency

as in, saying that assignments are L-R, and not erratic based on the operand types.

Maybe is it a bug?

It's certainly not consistent. That leads to the question - how far do we want to go with this? There will be some level of performance penalty for making this all consistent, is it worth it? Does this really cause significant problems? I know I myself out of habit never write code that relies on order of evaluation.

@ibuclaw
Copy link
Member

ibuclaw commented Oct 14, 2014

Does this really cause significant problems?

Yes. But the underlying reason is not array ops per say, just the way they happen to be lowered.

foo(++x, ++x);

Should this evaluate LTR regardless of the extern attribute?

@9rnsr
Copy link
Contributor Author

9rnsr commented Oct 14, 2014

Does this really cause significant problems?

It would be vulnerable for the small code modification.

int[] a(); int[] b();
a()[] = b()[];   // a - >b

If you modify the code to add offset number with array operation, evaluation order will be suddenly changed.

a()[] = b()[] + 1;   // b -> a

As I explained, array operation does not have performance issue depending on the evaluation order. OTOH, from Walter's explanation, evaluating plain assignment a = b with LTR order (== current behavior) would have less performance than RTL order.

I think all assignments (AssignExp, BinAssinExp, including any array operations) should have same evaluation order. Therefore now I have only one question: which is the correct evaluation order on all assignments, LTR or RTL? If the answer is LTR, I'll revert the runnable/arrayop.d changes to keep existing LTR order on array ops.

@MartinNowak
Copy link
Member

Then isn't the best order for a() = b() + c() + d(); neither LTR nor RTL, but b -> c -> d -> a?
This is also matches variable declaration, where the initializer can't reference the variable, but otherwise is strictly LTR.

@ibuclaw
Copy link
Member

ibuclaw commented Apr 11, 2015

I'm just wondering when this x86-specific special case will be removed.

@dlang-bot
Copy link
Contributor

Fix Bugzilla Description
6620 argument evaluation order inversed for extern(C)

@9rnsr 9rnsr force-pushed the fix6620 branch 6 times, most recently from 2a67e56 to b4c3d9a Compare April 8, 2016 01:47
@9rnsr
Copy link
Contributor Author

9rnsr commented Apr 8, 2016

I fully updated this PR. Now, array operation keeps existing evaluation order with AST adjustment in src/arrayop.d, and it's mostly ready to go forward.

But unfortunately, Phobos unittest is failing in win32, because of a backend regression introduced from 2.069:
https://issues.dlang.org/show_bug.cgi?id=15898

@9rnsr 9rnsr force-pushed the fix6620 branch 2 times, most recently from 9c92996 to ac9a8a2 Compare April 17, 2016 16:18
@9rnsr
Copy link
Contributor Author

9rnsr commented Apr 17, 2016

Regression 15898 has fixed in stable branch, and the bugfix change has merged into master.
Now, this PR is ready to go.

@9rnsr
Copy link
Contributor Author

9rnsr commented Apr 17, 2016

@WalterBright ?

@ibuclaw
Copy link
Member

ibuclaw commented Dec 24, 2017

As far as I'm aware, array operations are still evaluated in a different order between x86 and every other cpu. Even if this isn't an ideal patch.

@ibuclaw
Copy link
Member

ibuclaw commented Dec 26, 2017

As per #4895 - no point pursuing. The old x86-centric (and they only semantically work on x86) library functions need to die or be made extern(D).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants