Unify struct arg handling #18358
Eliminate unnecessary struct copies, especially on Linux, and reduce code duplication. Across all targets, use GT_FIELD_LIST to pass promoted structs on stack, and avoid requiring a copy and/or marking `lvDoNotEnregister` for those cases.
Unify the specification of multi-reg args:
- numRegs now indicates the actual number of reg args (not the size in pointer-size units)
- regNums contains all the arg register numbers
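A minimal sketch of what the unified multi-reg description could look like. The type `ArgEntrySketch`, the constant `kMaxArgRegCount`, the `Reg` enum values and the helper `setRegNum` are hypothetical stand-ins for illustration, not the JIT's actual declarations; only the names `numRegs`, `regNums` and `REG_STK` come from this PR.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; the real JIT constants and register enum differ.
constexpr unsigned kMaxArgRegCount = 4;
enum Reg : std::uint8_t { REG_A0 = 0, REG_A1 = 1, REG_F0 = 16, REG_STK = 0xFF };

struct ArgEntrySketch
{
    unsigned numRegs = 0;                     // actual number of registers used,
                                              // not the size in pointer-size units
    std::array<Reg, kMaxArgRegCount> regNums; // every register used; REG_STK if unused

    ArgEntrySketch() { regNums.fill(REG_STK); }

    // Record the register for slot i and keep numRegs in sync.
    void setRegNum(unsigned i, Reg r)
    {
        assert(i < kMaxArgRegCount);
        regNums[i] = r;
        if (i + 1 > numRegs)
        {
            numRegs = i + 1;
        }
    }

    bool isPassedInRegisters() const { return numRegs > 0; }
};
```

Under these assumptions, a 16-byte struct split across one floating-point and one integer register would call `setRegNum(0, REG_F0)` and `setRegNum(1, REG_A0)`, leaving `numRegs == 2`, while a stack-passed argument keeps `numRegs == 0` and every `regNums` slot at `REG_STK`.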
|
This change has zero diffs for x86, x64/windows, arm altjit and arm64 altjit. A private perf run showed the following improvements on the Devirtualization benchmark (i.e. a factor of 2 for 3 of the 4 sub-benchmarks): For the remaining benchmarks, there were many numbers in the roughly .9 to 1.1 range that were reported as significant, but neither crossgen asm-diffs nor superpmi run against both ZapDisable and regular benchmarks on x64/ux showed any significant negative diffs. For the regular run, the overall was:
|
|
@dotnet/jit-contrib PTAL
| baseVarNum = compiler->lvaFirstStackIncomingArgNum; | ||
|
|
||
| if (compiler->lvaFirstStackIncomingArgNum != BAD_VAR_NUM) | ||
| // Iterate over all the local variables in the Lcl var table. |
Is it just me, or is this comment wrong? The for loop below seems to iterate only over the args.
You're correct - the comment is wrong. In my defense, it isn't new (though the diffs seem to think so), it was just moved from below.
I'll correct it.
sdmaclea left a comment
LGTM (Looked only at the general idea and impact to arm64)
| { | ||
| // If this is passed as a floating type, use that. | ||
| // Otherwise, we'll use the general case - we don't want to use the "EightByteType" | ||
| // directly, because it won't preserve small types. |
This comment seems to imply that the classification returned by Compiler::EightByteType is incorrect, or at least being deprecated. If this is the case, I think it is worth adding a comment to Compiler::EightByteType explaining its limitations.
I think I should just reword the comment. It's not really a limitation of the "EightByteType", it's that it returns TYP_INT for any integer type <= 4 bytes. I could change that method, but other callers depend on that behavior.
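The point about small types can be sketched as follows. This is an illustration under stated assumptions, not the JIT's code: `eightByteTypeSketch`, `chooseArgType`, and the reduced `VarType` and `Classification` enums are hypothetical, and the only behavior taken from the discussion is that any integer eightbyte of 4 bytes or less is reported as TYP_INT.

```cpp
#include <cassert>

// Reduced, hypothetical versions of the JIT's type and classification enums.
enum VarType { TYP_UBYTE, TYP_SHORT, TYP_INT, TYP_LONG, TYP_FLOAT, TYP_DOUBLE };
enum Classification { ClassInteger, ClassSSE };

// Assumed behavior: integer eightbytes of <= 4 bytes all collapse to TYP_INT,
// so a 1-byte or 2-byte field loses its small type here.
inline VarType eightByteTypeSketch(Classification c, unsigned size)
{
    if (c == ClassSSE)
    {
        return (size <= 4) ? TYP_FLOAT : TYP_DOUBLE;
    }
    return (size <= 4) ? TYP_INT : TYP_LONG;
}

inline bool isFloating(VarType t)
{
    return (t == TYP_FLOAT) || (t == TYP_DOUBLE);
}

// Use the eightbyte type only when it is floating; otherwise fall back to
// the type computed by the general (size-based) path, which keeps small types.
inline VarType chooseArgType(Classification c, unsigned size, VarType generalCaseType)
{
    VarType t = eightByteTypeSketch(c, size);
    return isFloating(t) ? t : generalCaseType;
}
```

With this shape, a struct wrapping a single byte keeps TYP_UBYTE from the general path instead of being widened to TYP_INT, while a struct wrapping a double still takes the floating-point type.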
| // Set 'useType' to the type of the first eightbyte item | ||
| // We can't pass this as a primitive type. | ||
| } | ||
| else if (structDesc.passedInRegisters && varTypeIsFloating(GetEightByteType(structDesc, 0))) |
It seems easier to use structDesc.eightByteClassifications[0] == SystemVClassificationTypeSSE directly. Even if it is ugly, it saves a few calls, one of which is redundant.
OK, I'm not totally comfortable with that, because it assumes that the structDesc is initialized (I realize it is, but it's not obvious from this context).
|
|
||
| // The case of (structDesc.eightByteCount == 1) should have already been handled | ||
| if (structDesc.eightByteCount > 1) | ||
| if ((structDesc.eightByteCount > 1) || !structDesc.passedInRegisters) |
What case does adding `|| !structDesc.passedInRegisters` handle? It seems like the comment is correct: by this point any arg that is <= 8 bytes should have been handled, and any arg that is > 8 bytes but <= 16 bytes will be correctly addressed by structDesc.eightByteCount > 1 regardless of structDesc.passedInRegisters.
No, I don't think so. I believe that if passedInRegisters is false, the eightByteCount will always be 0 - in which case we still need to set this as being passed by value. It's only the pass by reference case that should not fall into this then clause.
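The difference between the old and new conditions can be sketched as below. `StructDescSketch`, `oldCheck` and `newCheck` are hypothetical names, and the sketch relies on the assumption stated in the reply that eightByteCount is 0 whenever passedInRegisters is false.

```cpp
#include <cassert>

// Hypothetical, reduced form of the SysV struct descriptor.
struct StructDescSketch
{
    bool     passedInRegisters;
    unsigned eightByteCount; // assumed to be 0 when passedInRegisters is false
};

// Old condition: only multi-eightbyte register structs take the by-value path.
inline bool oldCheck(const StructDescSketch& d)
{
    return d.eightByteCount > 1;
}

// New condition: stack-passed structs (eightByteCount == 0) take it as well.
inline bool newCheck(const StructDescSketch& d)
{
    return (d.eightByteCount > 1) || !d.passedInRegisters;
}
```

For a stack-passed struct described as `{false, 0}`, `oldCheck` is false and `newCheck` is true; for a two-register struct `{true, 2}` both are true; and for a single-eightbyte register struct `{true, 1}`, which is assumed to have been handled earlier, both are false.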
| private: | ||
| regNumberSmall regNums[MAX_ARG_REG_COUNT]; // The registers to use when passing this argument, set to REG_STK for | ||
| // arguments passed on | ||
| // the stack |
Nit: condense to one line:
// arguments passed on the stack
| // In such case the fgArgTabEntry keeps track of whether the original node (before morphing) | ||
| // was a struct and the struct classification. | ||
| isStructArg = fgEntryPtr->isStruct; | ||
| isStructArg = argEntry->isStruct; |
Nit: It may be preferable to declare isStructArg twice, to avoid any potential problems with bool isStructArg being uninitialized if someone changes this in the future.
bool isStructArg = argEntry->isStruct;
and in !remorphing
bool isStructArg = varTypeIsStruct(argx);
That won't work, because we need the value below this if-then-else.
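One way to address both concerns is a single initialized declaration ahead of the branches. This is only a sketch of the scoping issue: `ArgEntrySketch` and `needsStructHandling` are hypothetical, and the `nodeIsStruct` parameter stands in for the varTypeIsStruct(argx) call.

```cpp
// Hypothetical reduced arg-table entry.
struct ArgEntrySketch
{
    bool isStruct;
};

inline bool needsStructHandling(bool remorphing, const ArgEntrySketch* argEntry, bool nodeIsStruct)
{
    // Declared and initialized once, so a future edit to either branch
    // cannot leave the value indeterminate.
    bool isStructArg = false;

    if (remorphing)
    {
        isStructArg = argEntry->isStruct; // classification recorded before morphing
    }
    else
    {
        isStructArg = nodeIsStruct; // stands in for varTypeIsStruct(argx)
    }

    // The value is consumed after the if/else, which is why a declaration
    // inside each branch would be out of scope here.
    return isStructArg;
}
```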
| !isPow2(originalSize)) // size is 3,5,6 or 7 bytes | ||
| { | ||
| if (argObj->gtObj.gtOp1->IsVarAddr()) // Is the source a LclVar? | ||
| // if (argObj->gtObj.gtOp1->IsVarAddr()) // Is the source a LclVar? |
Why is this commented out? Is it necessary? If not, can it just be deleted? As is, it is a little hard to follow.
Oops - I had convinced myself that this didn't need to be limited to that case, so I commented it out to test my hypothesis. Then I failed to remove it.
| { | ||
| // For ARM64 we pass structs that are 3,5,6,7 bytes in size | ||
| // we can read 4 or 8 bytes from the LclVar to pass this arg | ||
| // For ARM64 we pass structs that are 3,5,6,7 bytes in size in registers. |
Nit: update comment, this now is for arm64 or unix amd64.
Deleted the comment, as it is redundant.
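The size test in this hunk (3, 5, 6 or 7 bytes) and the later normalization to pointer-sized slots can be sketched as follows. The names `kPointerSize`, `roundUp`, `isOddSizedSmallStruct` and `slotsFor` are hypothetical, and an 8-byte pointer size is assumed.

```cpp
#include <cassert>

constexpr unsigned kPointerSize = 8; // assumption: 64-bit target

constexpr bool isPow2(unsigned x)
{
    return (x != 0) && ((x & (x - 1)) == 0);
}

// Round size up to the next multiple of mult.
constexpr unsigned roundUp(unsigned size, unsigned mult)
{
    return ((size + mult - 1) / mult) * mult;
}

// True for structs of 3, 5, 6 or 7 bytes: smaller than a pointer and not a
// power of two, so no single primitive load matches the struct size exactly.
constexpr bool isOddSizedSmallStruct(unsigned size)
{
    return (size < kPointerSize) && !isPow2(size);
}

// Normalize a byte size to the number of pointer-sized items.
constexpr unsigned slotsFor(unsigned size)
{
    return roundUp(size, kPointerSize) / kPointerSize;
}
```

Under these assumptions a 7-byte struct occupies one slot and a 12-byte struct occupies two.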
| size = roundupSize / TARGET_POINTER_SIZE; // Normalize size to number of pointer sized items | ||
| } | ||
| } | ||
| #endif |
Nit: Add comment to endif
#endif // !UNIX_AMD64_ABI
| #ifdef FEATURE_HFA | ||
| if (!passUsingFloatRegs) | ||
| { | ||
| // Note on Arm32 a HFA is passed in int regs for varargs |
This is currently correct for Arm32 and Arm64
Also note that this is Windows specific
| // We are iterating over the arguments only. | ||
| assert(varDsc->lvIsParam); | ||
| // We are iterating over the arguments only. | ||
| assert(varDsc->lvIsParam); |
With /GSChecks we can make a "shadow" copy of an incoming struct argument.
Basically we are copying it from the incoming Arg Space into a new struct local that is placed so that
it will catch any buffer overruns into its storage. Something to think about with this change.
Thanks, Brian. Just to make sure I understand - this would be a "new" local though, right? So that the original varDsc would still have the original offset. If that's the case, then I don't think that would impact this. Otherwise, this would have a problem because it assumes that they are laid out in order.
Yes, it is a new local. I'm not sure about all the details here, but the IL references to the arg are rewritten to refer to the new local.
There was a recent bug where struct promotion interacted badly with it, and we ended up with references to both the original copy and the individual promoted fields.
A fix and test case for the issue is here: #17329
| } | ||
| } | ||
| #endif // !_TARGET_X86_ | ||
|
|
|
The arm failures in the previous round were due to an overly aggressive assert in |
|
@dotnet-bot test Ubuntu arm Cross Checked Innerloop Build and Test |
|
@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test |
|
@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test |
|
@dotnet-bot test Windows_NT arm64 Cross Checked normal Build and Test |
|
@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test |
* Unify struct arg handling
Eliminate unnecessary struct copies, especially on Linux, and reduce code duplication. Across all targets, use GT_FIELD_LIST to pass promoted structs on stack, and avoid requiring a copy and/or marking `lvDoNotEnregister` for those cases.
Unify the specification of multi-reg args:
- numRegs now indicates the actual number of reg args (not the size in pointer-size units)
- regNums contains all the arg register numbers
Commit migrated from dotnet/coreclr@d28957d