Skip to content

Add Target::reverseCArgs, use it when building arrayops#4895

Closed
ibuclaw wants to merge 3 commits intodlang:masterfrom
ibuclaw:arrayop_order
Closed

Add Target::reverseCArgs, use it when building arrayops#4895
ibuclaw wants to merge 3 commits intodlang:masterfrom
ibuclaw:arrayop_order

Conversation

@ibuclaw
Copy link
Member

@ibuclaw ibuclaw commented Aug 16, 2015

Because it seems that #4035 has reached an impasse, and I still have broken arrayops on almost every non-x86 builds. This takes a different approach so that ARM and x86 behave in the same way by taking note of which way the target will evaluate parameters passed to extern(C) functions.

For x86, this does nothing, as Target::reverseCArgs is true, and arrayop.c assumes that throughout. For every other architecture, this reverses the parameter/argument order so that there are two ABIs.

// NVPTX/x32/x86/x86_64
float[] _arraySliceSliceAddSliceAssign_f(float[] p2, const(float[]) p1, const(float[]) p0) pure nothrow @nogc @trusted;

// ARM/AArch64/AVR/HPPA/IA-64/MIPS/PPC/S390/SPARC
float[] _arraySliceSliceAddSliceAssign_f(const(float[]) p0, const(float[]) p1, float[] p2) pure nothrow @nogc @trusted;

As there are no vector library implementations for array-ops on non-x86 in druntime, this should be considered safe.

@ibuclaw ibuclaw force-pushed the arrayop_order branch 2 times, most recently from 3dece20 to 5793543 Compare August 16, 2015 11:18
src/root/array.d Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tos???

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name in C++ code. I guess only @WalterBright would know the meaning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! I've just worked it out, it's referring to 'top of stack'.

@ibuclaw ibuclaw added the Compiler:GDC Gnu D Compiler label Aug 22, 2015
@ibuclaw
Copy link
Member Author

ibuclaw commented Aug 26, 2015

Rebased.

@yebblies
Copy link
Contributor

yebblies commented Sep 1, 2015

I think array ops are lowered too early, and that is (partly) the cause of the problems you're having.

Why don't we force evaluation order of array op arguments using temp variables, instead of relying on evaluation order of function arguments?

@ibuclaw
Copy link
Member Author

ibuclaw commented Sep 2, 2015

Walter argued that using the natural order of evaluation on (RTL) produced faster code because you are consuming less registers for arrayops on x86.

Using the same argument for LTR targets, saving temporaries ends up benefitting no one.

@yebblies
Copy link
Contributor

yebblies commented Sep 2, 2015

Saving temporaries will be cleaned up by the optimizer, we already do this all the time.

@ibuclaw
Copy link
Member Author

ibuclaw commented Sep 2, 2015

You're still keeping values around in the callee that could have been pushed to the caller. ;-)

What would be better is for these to not be functions at all add just be inlined inplace, and for dmd to do away with the vector library ops like I do.

@yebblies
Copy link
Contributor

yebblies commented Sep 2, 2015

How so? Can't gcc get rid of unnecessary stack temps?

@ibuclaw
Copy link
Member Author

ibuclaw commented Sep 2, 2015

For any slice that has a side effect that needs stabilising, yes.

@yebblies
Copy link
Contributor

yebblies commented Sep 2, 2015

Yes it can get rid of the unnecessary temps? Then I'm not seeing the problem?

@ibuclaw
Copy link
Member Author

ibuclaw commented Sep 2, 2015

This is derailing because it shouldn't matter what order a target pushes arguments to a function, but this annoying part of D's codegen forces it to matter and punishes non-x86 platforms because of it.

@yebblies
Copy link
Contributor

yebblies commented Sep 2, 2015

Here's my proposed fix:

  • Generate a ArrayOpExp in the frontend instead of lowering to a function call
  • The glue layer handles that however you want

Will that solve you problems or not?

@ibuclaw
Copy link
Member Author

ibuclaw commented Sep 2, 2015

Consider that all variables are side effects in the following:

extern(D) a(b, c, d, e);

This operation is more costly than what it needs to be on x86 with gdc generated code, because we ensure that 'b' is evaluated first, but then you have to put it somewhere until all arguments have been evaluated and we can finally push them in reverse to the function.

This is not a problem for dmd because you just reverse the parameters for all extern(D) functions, which makes no fun for those debugging it. ;-)

extern(D) a(e, d, c, b);

Now, Walter was against having LTR enforced for extern(C) on x86 for precisely the reason I gave above. And I don't want to force RTL on non-x86 following the same sentiments.

@ibuclaw
Copy link
Member Author

ibuclaw commented Sep 2, 2015

For any slice that has a side effect that needs stabilising, yes

I meant no here oops.

@yebblies
Copy link
Contributor

yebblies commented Sep 2, 2015

My impression is that parameters to D's array ops are almost always trivial slices of arrays, and don't have side effects. Do you have reason to believe this is not true?

@ibuclaw
Copy link
Member Author

ibuclaw commented Sep 2, 2015

/End rant on dumb implementation detail.

I'll switch over from phone to computer as I'm not keeping up with conversation well. ;-)

@ibuclaw
Copy link
Member Author

ibuclaw commented Sep 2, 2015

Do you have reason to believe this is not true?

Yes, and the answer is in the D2 testsuite!

Look for something like: A()[] = B()[] + C()[];

@yebblies
Copy link
Contributor

yebblies commented Sep 2, 2015

Those tests are in there specifically to test evaluation order is correct. I don't care if their speed is sub-optimal, and I doubt anyone else does.

@ibuclaw
Copy link
Member Author

ibuclaw commented Sep 2, 2015

Someone cares #4035 (comment)

@ibuclaw
Copy link
Member Author

ibuclaw commented Sep 2, 2015

Will that solve you problems or not?

Sounds like another way to allow compiler ABI's to divulge even further apart. Not something that I have a problem with though.

@ibuclaw
Copy link
Member Author

ibuclaw commented Dec 26, 2017

I guess the answer is give up, and wait until extern(D) template versions of array ops are implemented.

Because I'm just talking to a brick wall here.

@ibuclaw ibuclaw closed this Dec 26, 2017
@PetarKirov
Copy link
Member

I guess the answer is give up, and wait until extern(D) template versions of array ops are implemented.

@ibuclaw they were already implemented in #7032. Anything missing?

@ibuclaw
Copy link
Member Author

ibuclaw commented Dec 27, 2017

Did that patch make it to 2.076? Otherwise I won't be able to verify.

@wilzbach
Copy link
Contributor

Nope, it was released as part of 2.077.0: 5e0e323

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Compiler:GDC Gnu D Compiler

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants