This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Create RefPositions without TreeNodeInfo#16517

Merged
CarolEidt merged 3 commits into dotnet:master from CarolEidt:ElimNodeInfo
May 23, 2018

@CarolEidt CarolEidt commented Feb 23, 2018

This is the next phase of building RefPositions incrementally.
The big picture is that, instead of creating TreeNodeInfo with the register requirements for each node, the Build methods in LinearScan build the RefPositions directly, putting the defs in a DefList for when the consuming node builds the corresponding uses.
There are zero diffs for crossgen of frameworks & tests across all the x64 & x86 + altjits (arm64, arm and x64/ux), aside from a small number of improvements due to some RMW handling changes.
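
The def/use hand-off described above can be illustrated with a minimal sketch. This is not the JIT's code: `Node`, `RefPosition`, and `Builder` are hypothetical, simplified stand-ins, and the real `BuildDef`/`BuildUse` take register candidates and other arguments that are omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical, simplified stand-ins for the JIT types (assumptions, not the real definitions).
struct Node
{
    int id;
};

enum class RefType
{
    Def,
    Use
};

struct RefPosition
{
    RefType     type;
    const Node* node;
};

class Builder
{
public:
    // Record a def for 'node' and keep it pending until its consumer is built.
    RefPosition BuildDef(const Node* node)
    {
        RefPosition def{RefType::Def, node};
        defList.push_back(def);
        return def;
    }

    // Build the use for 'operand', consuming the matching pending def.
    RefPosition BuildUse(const Node* operand)
    {
        for (auto it = defList.begin(); it != defList.end(); ++it)
        {
            if (it->node == operand)
            {
                defList.erase(it);
                return RefPosition{RefType::Use, operand};
            }
        }
        assert(!"BuildUse didn't find a pending def for the operand");
        return RefPosition{RefType::Use, nullptr};
    }

    std::size_t PendingDefs() const
    {
        return defList.size();
    }

private:
    std::list<RefPosition> defList; // defs still waiting for their uses
};
```

In this sketch a producing node calls `BuildDef`, and the consuming node later calls `BuildUse` on each operand, which removes the pending def from the list; no per-node info structure is needed in between.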

@CarolEidt
Author

This results in the following throughput improvements (crossgen of SPC.dll) as measured by pin instruction count:

  • MinOpts x86: 4.3% improvement
  • Opt x86: 2.5% improvement
  • MinOpts x64: 4.3% improvement
  • Opt x64: 2.7% improvement

@CarolEidt
Author

@dotnet/jit-contrib PTAL (for preliminary feedback)

Comment thread src/jit/lsraxarch.cpp Outdated
}
#endif
break;
RefPosition* def = BuildDef(tree);

Did you intend to use the def variable for anything?

Author

No; at one point I had thought I would need to keep the defs around, but realized it was not necessary.

Comment thread src/jit/lsraxarch.cpp Outdated
assert(info->dstCount == 1);
srcCount = 0;
assert(dstCount == 1);
dstCandidates = RBM_NONE;

Placing this in an "else" ifdef would be better IMO

Comment thread src/jit/lsraxarch.cpp Outdated
// to special case them.
// These tree nodes will have their op1 marked as isDelayFree=true.
// That is, op1's reg remains in use until the subsequent instruction.
GenTree* addr = tree->gtOp.gtOp1;

I would suggest using functions like gtGetOp1 and gtGetOp2 consistently

Author

Will do; I was programmed to avoid it back before we added gtGetOp2IfPresent(), and I guess I'm a creature of habit.

Comment thread src/jit/lsraxarch.cpp
dstCount = 1;
if (!data->isContained())
{
RefPosition* dataUse = dataUse = BuildUse(data);

There's something odd with this line, dataUse appears twice. And it's not used anywhere anyway.

Author

I was initially setting delay free on the data rather than on addr, which was a mistake; no clue about the double assignment, though.

Comment thread src/jit/lsraxarch.cpp Outdated
for (; sourceInfo != nullptr; sourceInfo = sourceInfo->Next())
srcCandidates = allRegs(TYP_INT) & ~RBM_RCX;
dstCandidates = allRegs(TYP_INT) & ~RBM_RCX;
if (tree->IsReverseOp())

Reverse op? Wasn't this removed from LIR?

Author

Well, I feel a little silly - I had either forgotten or had missed that this happened. There's code in LSRA that attempts to handle it, but I see that the Rationalizer is clearing it on all nodes, and CheckLIR is validating it. I added an issue #16528

Comment thread src/jit/lsraxarch.cpp Outdated

assert(!call->isContained());
info->srcCount = 0;
srcCount = 0;

Redundant

Comment thread src/jit/lsraxarch.cpp Outdated
}
if (!tree->isContained())
{
info->srcCount = srcCount;
RefPosition* def = BuildDef(tree, dstCandidates);

Unused variable?

Comment thread src/jit/lsraxarch.cpp Outdated
bool isUnsignedMultiply = ((tree->gtFlags & GTF_UNSIGNED) != 0);
bool requiresOverflowCheck = tree->gtOverflowEx();
// Only non-floating point mul has special requirements
if (!varTypeIsFloating(tree->TypeGet()))

Make it consistent with BuildDivMod that does if (is float) { build simple; return; }? Or move the float check inside the switch case and rename BuildMul and BuildDivMod to BuildIntegerMul and BuildIntegerDivMod?

Note that FP add/sub/mul/div use the same instruction format but they're handled more or less differently throughout lower/ra/codegen. I have a PR to clean this up but I was waiting for your lsra refactoring to finish it.

Comment thread src/jit/lsrabuild.cpp Outdated
reinterpret_cast<LocationInfoListNode*>(compiler->compGetMem(preallocateSize, CMK_LSRA));
size_t preallocateSize = sizeof(RefInfoListNode) * preallocate;
RefInfoListNode* preallocatedNodes =
reinterpret_cast<RefInfoListNode*>(compiler->compGetMem(preallocateSize, CMK_LSRA));

Existing code but note that casting from void* to X* is normally done via static_cast. reinterpret_cast is usually reserved for "weird" stuff such as casting int* to float*.

Author

Made the change
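
To illustrate the cast suggestion in this thread, here is a minimal sketch. `ListNode` and `AllocateNodes` are hypothetical names, and `std::malloc` merely stands in for the JIT's own allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical node type standing in for RefInfoListNode (an assumption for illustration).
struct ListNode
{
    int       value;
    ListNode* next;
};

// Allocate raw storage for 'count' nodes; std::malloc stands in for the JIT allocator.
inline ListNode* AllocateNodes(std::size_t count)
{
    void* raw = std::malloc(sizeof(ListNode) * count);
    // void* to T* is a standard conversion, so static_cast is sufficient here;
    // reinterpret_cast is better kept for conversions between unrelated pointer types.
    return static_cast<ListNode*>(raw);
}
```

The caller owns the returned storage and releases it with `std::free` in this sketch.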

@CarolEidt
Author

@mikedn - thanks so much for the quick and thorough review; it's really appreciated! Clearly I need to fix some bugs ;-) and will incorporate your feedback in the next round.

@mikedn

mikedn commented Feb 23, 2018

> quick and thorough review

Far from thorough, I only took a quick look in the morning and then went to work. Maybe I'll take another look this evening :)

Comment thread src/jit/lsrabuild.cpp Outdated
}

//------------------------------------------------------------------------
// getKillSetForMul: Determine the liveness kill set for a mod or div node.

Wrong function name in comment. Applies to subsequent functions as well.

Comment thread src/jit/lsrabuild.cpp
case GT_MUL_LONG:
#endif
killMask = RBM_RAX | RBM_RDX;
killMask = getKillSetForMul(tree->AsOp());

Not an issue with this particular line per se but this whole function seemed a bit strange to me when I looked at it in the past. I would have expected BuildNode/TreeNodeInfoInit to somehow deal with this instead of having this whole switch here, it looks like a special case.

Looking now through this function's uses I see that one is likely redundant: buildUpperVectorSaveRefPositions is only called from BuildDefsWithKills and this one has a killMask that's supposed to be the same as the one returned by getKillSetForNode due to an assert.

And then there's the said assert in BuildDefsWithKills. Does this function actually need the killMask parameter?

Apparently the only "real" use of getKillSetForNode is in some stress related code in buildRefPositionsForNode. Maybe there should be another way to communicate the kill mask from build node (e.g. via a class member similar to pendingDelayFree)?

This may then allow moving functions like getKillSetForMul to where they really belong - lsraxarch.cpp.

Author

I don't think it's necessary to maintain the kill mask as a class member. I'm making the general getKillSetForNode DEBUG-only; it will only be used in the assert and in the stress modes where we constrain nodes.
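
A rough sketch of the arrangement described in this reply, under several assumptions: the register mask values are placeholders, `Oper` is a hypothetical enum, `NDEBUG` stands in for the JIT's own DEBUG configuration, and the real functions take tree nodes rather than an enum.

```cpp
#include <cassert>

// Placeholder mask type and register bits (assumptions; not the JIT's actual values).
using regMaskTP = unsigned;
constexpr regMaskTP RBM_NONE = 0x0;
constexpr regMaskTP RBM_RAX  = 0x1;
constexpr regMaskTP RBM_RDX  = 0x4;

// Hypothetical operator kinds for illustration.
enum class Oper
{
    Add,
    Mul,
    Div
};

// Node-specific kill set, computed by the code that builds the node.
inline regMaskTP getKillSetForMul()
{
    return RBM_RAX | RBM_RDX;
}

#ifndef NDEBUG
// Debug-only general query, used here only to cross-check the mask passed in by the builder.
inline regMaskTP getKillSetForNode(Oper oper)
{
    switch (oper)
    {
        case Oper::Mul:
        case Oper::Div:
            return RBM_RAX | RBM_RDX;
        default:
            return RBM_NONE;
    }
}
#endif // !NDEBUG

// The builder passes the kill mask it already computed; the assert validates it in debug builds.
inline regMaskTP BuildDefsWithKills(Oper oper, regMaskTP killMask)
{
    assert(killMask == getKillSetForNode(oper));
    return killMask;
}
```

Because `assert` expands to nothing when `NDEBUG` is defined, the general query is never referenced in release builds of this sketch.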

Comment thread src/jit/lsrabuild.cpp Outdated
#ifdef _TARGET_XARCH_
if (tgtPrefUse != nullptr)
{
defRefPosition->getInterval()->assignRelatedIntervalIfUnassigned(tgtPrefUse->getInterval());

Isn't defRefPosition->getInterval() same as interval?

Author

Yes it is.

Comment thread src/jit/lsrabuild.cpp Outdated
RefPosition* useRefPos = newRefPosition(interval, currentLoc, RefTypeUse, operand, candidates, multiRegIdx);
if (regOptional)
{
useRefPos->setAllocateIfProfitable(true);

It might be better performance wise to always call setAllocateIfProfitable, always setting that bit might very well be cheaper than a branch.

Author

Sounds good.
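
The two alternatives discussed in this thread can be sketched as follows. The `RefPosition` here is a hypothetical one-bit stand-in, and whether the unconditional store is actually faster is the reviewer's conjecture, not something this sketch measures.

```cpp
#include <cassert>

// Hypothetical, simplified RefPosition with a single flag bit (not the JIT's definition).
struct RefPosition
{
    unsigned allocateIfProfitable : 1;

    RefPosition() : allocateIfProfitable(0)
    {
    }

    void setAllocateIfProfitable(bool b)
    {
        allocateIfProfitable = b ? 1u : 0u;
    }
};

// Branching form: only touch the bit when regOptional is true.
inline void markWithBranch(RefPosition& ref, bool regOptional)
{
    if (regOptional)
    {
        ref.setAllocateIfProfitable(true);
    }
}

// Unconditional form: always store the flag value, with no branch on regOptional.
inline void markUnconditionally(RefPosition& ref, bool regOptional)
{
    ref.setAllocateIfProfitable(regOptional);
}
```

The two forms agree only when the bit starts out cleared, which holds for a freshly created RefPosition in this sketch.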

Comment thread src/jit/lsrabuild.cpp
// The number of use RefPositions created
//
void LinearScan::BuildSimple(GenTree* tree)
int LinearScan::BuildSimple(GenTree* tree)

Is this function really useful? For example, I look at xarch build code and I see that it handles GT_CNS_INT, GT_CNS_DBL, GT_CNS_LNG (huh? looks like this one shouldn't reach LSRA due to decomposition) so attempting to handle GTK_CONST in BuildSimple is probably useless.

In general, BuildNode's switch already handles a ton of opers, so it might be better (from a performance point of view and even for clarity) to add whatever opers are missing to those BuildNode switches and make the default case unreached.

Author

Sounds reasonable, but probably not something I want to undertake with this PR.

Comment thread src/jit/lsra.h Outdated
{
#ifdef _TARGET_XARCH_
RefPosition* tgtPrefUse = nullptr;
if (node->OperIsBinary() && isRMWRegOper(node))

Similar to BuildSimple, it's not clear if this kind of special casing in what's supposed to be a general purpose function is useful.

It may be better to simply call BuildRMWUses directly from BuildNode's switch relevant cases (e.g. case GT_XOR) rather than calling BuildBinaryUses from many places and having to decide again what to do.

Comment thread src/jit/lsraxarch.cpp Outdated
int dstCount = 0;
regMaskTP dstCandidates = RBM_NONE;
regMaskTP killMask = RBM_NONE;
pendingDelayFree = false;

These member assignments should be visually separated from the local variable above, I'd move them above locals and add a blank line. Or, perhaps they should not even be in BuildNode but in its caller.

Also, the need for these data members is a bit unfortunate. It's kind of difficult to track down where they are assigned values and where they are used, the delay free ones in particular.

Author

I agree that they are somewhat unfortunate; I tried to come up with an approach that didn't require these data members on LinearScan, but could only come up with models that involved excessive passing around of state.
I'd prefer to keep them in BuildNode, however, as that's the main entry point for all the building functionality. I'll separate them and document them more clearly.

Comment thread src/jit/lsraxarch.cpp
assert(!tree->IsUnusedValue() || (dstCount != 0));
assert(dstCount == tree->GetRegisterDstCount());
INDEBUG(dumpNodeInfo(tree, dstCandidates, srcCount, dstCount));
return srcCount;

It looks like the return value of this function is used only in an assert. I hope that assert is worth all the added complication :)

Author

Yes - it's something I think should probably be removed, but I found it useful for debugging and comparison as I was making the transition from TreeNodeInfo

Comment thread src/jit/lsraxarch.cpp Outdated
isLocalDefUse = true;
tree->SetUnusedValue();
}
RefPosition* def = BuildDef(tree);

Vertical alignment of all assignments is the worst formatting rule the JIT code uses...

Comment thread src/jit/lsraxarch.cpp Outdated
#endif // _TARGET_X86_
BuildDef(tree, dstCandidates);

Or just pass RBM_BYTE_REGS directly? It's RBM_ALLINT on x64 so it should do the right thing.

Comment thread src/jit/lsraxarch.cpp Outdated
int srcCount = 0;
int internalCount = 0;
const int MaxInternalCount = 4;
RefPosition* internalDefs[MaxInternalCount];

Not used it seems. Same for internalCount above. Strange that they have the same name as 2 LinearScan data members.

Author

Yes; I had originally added these to each method needing internal temps, but it was quite a bit messier than handling it pseudo-globally. I'll delete these.

Comment thread src/jit/lsraxarch.cpp Outdated
bool hasMultiRegRetVal = false;
ReturnTypeDesc* retTypeDesc = nullptr;
RefPosition* internalDefs[MAX_ARG_REG_COUNT];

This is populated in the code below but then it doesn't appear to be used.

Comment thread src/jit/lsraxarch.cpp Outdated
{
return;
}
int srcCount = BuildRMWUses(tree->AsOp());
@mikedn mikedn Feb 23, 2018

Hmm, calling BuildRMWUses without calling isRMWRegOper that has special handling for GT_MUL?

@mikedn

mikedn commented Feb 26, 2018

Hmm, I left a few comments related to RMW but I'm not sure I'm seeing the big picture, there may be a few issues that need to be considered:

  • BuildHWIntrinsic seems broken when it comes to RMW - many intrinsics are probably RMW when VEX is not available but that code doesn't handle this case
  • FP scalar ops (add, sub, mul, div) are currently treated as RMW but when VEX is available they need not be, this is something that we should improve in the future.

Now, neither issue is directly related to this PR but you may want to consider the impact they have on your refactoring. It may be that isRMWRegOper needs to go away and BuildX functions should call BuildRMWUses directly when they need it.

Comment thread src/jit/lsrabuild.cpp
// Example1: GT_EQ(int, op1 of type ubyte, op2 of type ubyte) - in this case codegen uses
// ubyte as the result of comparison and if the result needs to be materialized into a reg
// simply zero extend it to TYP_INT size. Here is an example of generated code:
// cmp dl, byte ptr[addr mode]
@mikedn mikedn Mar 5, 2018

The setcc instruction is missing from the example

@mikedn mikedn Mar 5, 2018

Also, I'm not sure why "in this case codegen uses ubyte as the result of comparison and if the result needs to be materialized into a reg simply zero extend it to TYP_INT size" appears here, this comment is more suitable where dstCandidates is set.

Comment thread src/jit/lsrabuild.cpp
// Example2: GT_EQ(int, op1 of type ubyte, op2 is GT_CNS_INT) - in this case codegen uses
// ubyte as the result of the comparison and if the result needs to be materialized into a reg
// simply zero extend it to TYP_INT size.
else if (varTypeIsByte(op1) && op2->IsCnsIntOrI())

Why was this case added? Codegen should not attempt to generate a byte instruction unless both operands are byte. If the constant isn't a byte then an int compare instruction should be generated even if the first operand is a byte, it's supposed to have been extended to int.

Author

These cases were copied from LinearScan::ExcludeNonByteableRegisters(), previously in lsraxarch.cpp. It may be that these can be improved/eliminated, but I am aiming for zero-diffs for now.

Comment thread src/jit/lsraxarch.cpp
RefPosition* sourceHiUse = BuildUse(sourceHi, srcCandidates);

if (tree->OperGet() == GT_LSH_HI)
if (!tree->isContained())

Hmm, when is a shift node contained?

Author

A shift node can be contained under a STOREIND when it is part of a memory RMW op. In most cases this is just handled in BuildIndir, but shift ops have special register requirements.

Comment thread src/jit/lsraxarch.cpp Outdated
unsigned size = blkNode->gtBlkSize;
GenTree* source = blkNode->Data();
int srcCount = 0;
int internalCount = 0;

Shouldn't this have been removed?

Comment thread pgosupport.cmake
if(WIN32)
# set_property(TARGET ${TargetName} APPEND_STRING PROPERTY LINK_FLAGS_RELEASE " /LTCG /USEPROFILE:PGD=${ProfilePath}")
# set_property(TARGET ${TargetName} APPEND_STRING PROPERTY LINK_FLAGS_RELWITHDEBINFO " /LTCG /USEPROFILE:PGD=${ProfilePath}")
set_property(TARGET ${TargetName} APPEND_STRING PROPERTY LINK_FLAGS_RELEASE " /LTCG /USEPROFILE:PGD=${ProfilePath}")

Presumably you wanted to build without PGO and accidentally checked in the commented out lines? There's a build.cmd if you need that - -nopgooptimize.

@CarolEidt CarolEidt force-pushed the ElimNodeInfo branch 5 times, most recently from e1d63a4 to 238a4dc on March 19, 2018 21:22
@CarolEidt
Author

@dotnet/jit-contrib ping.
All of the x64_arm64_altjit legs failed on the same 11 tests with Assertion failed 'NYI_ARM64: Arm64 does not support tail calls via helpers.'
Re-trying the ubuntu jitstressregs8 leg - it failed with no explicable message besides "category flow".

@sdmaclea

> All of the x64_arm64_altjit legs failed on the same 11 tests with Assertion failed 'NYI_ARM64: Arm64 does not support tail calls via helpers.'

This was fixed recently

@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs0x1000
@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs0x80
@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs1
@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs2
@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs3
@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs4
@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs8

@CarolEidt
Author

The 'x64_arm64_altjit Checked jitstressregs3' failures also occur in master.
Filed #18052

@CarolEidt
Author

@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs3

@BruceForstall BruceForstall left a comment

couple minor things noticed so far

Comment thread src/jit/lsra.h Outdated
@@ -82,118 +82,117 @@ inline regMaskTP calleeSaveRegs(RegisterType rt)
//------------------------------------------------------------------------
// LocationInfo: Captures the necessary information for a node that is "in-flight"

Change type name in comment

Author

done.

Comment thread src/jit/lsrabuild.cpp Outdated
}
prevListNode = listNode;
}
assert(!"GetRefPosition didn't find the node");

assert has wrong function name

Author

Fixed

Comment thread src/jit/lsrabuild.cpp Outdated
}
prevListNode = listNode;
}
assert(!"GetRefPosition didn't find the node");

assert has wrong function name

Author

Fixed

Comment thread src/jit/lsrabuild.cpp
//
// Return Value: a register mask of the registers killed
//
regMaskTP LinearScan::getKillSetForProfilerHook()

Seems like this should be named getKillSetForProfilerHookTailcall

Author

Except that the node name is GT_PROF_HOOK, so this seems like the better name to me.

Comment thread src/jit/gentree.h Outdated
static bool OperIsMul(genTreeOps gtOper)
{
return (gtOper == GT_MUL) || (gtOper == GT_MULHI)
#if !defined(_TARGET_64BIT_) && !defined(LEGACY_BACKEND)

This will be a conflict with @BruceForstall change to remove LEGACY_BACKEND

Author

Yes, there are probably more conflicts as well; that's how it goes ;-)

Comment thread src/jit/hwintrinsicArm64.cpp Outdated

if (op2 != nullptr)
{
return 2;

You may want to add:

assert(!op1->OperIsList());

Member

Are there scenarios where op1->OperIsList() and op2 != nullptr?

Author

No, if op1 is a list, op2 must be null.

Comment thread src/jit/hwintrinsicArm64.cpp Outdated
return 2;
}

if (op1 != nullptr)
Member

nit: This would be easier to read (less nesting) if we did:

if (op1 == nullptr)
{
    return 0;
}

if (op1->OperIsList())
{
    // logic
}
else
{
    return 1;
}

Author

Need to account for op2, but will restructure.
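
One way the restructured, early-return version could look once op2 is accounted for is sketched below. `Operand` and `CountOperands` are hypothetical names, the list is modelled as a simple chain, and the assert that a null op1 implies a null op2 is an assumption added for the sketch, not something stated in this thread.

```cpp
#include <cassert>

// Hypothetical, simplified operand node: a list node chains its arguments via 'rest'.
struct Operand
{
    bool     isList;
    Operand* rest;

    bool OperIsList() const
    {
        return isList;
    }
};

// Count the operands of an intrinsic node using early returns instead of nested ifs.
inline int CountOperands(const Operand* op1, const Operand* op2)
{
    if (op1 == nullptr)
    {
        assert(op2 == nullptr); // assumption made for this sketch
        return 0;
    }

    if (op2 != nullptr)
    {
        // Per the discussion above: if op1 is a list, op2 must be null.
        assert(!op1->OperIsList());
        return 2;
    }

    if (!op1->OperIsList())
    {
        return 1;
    }

    // op1 is a list: walk it and count the entries.
    int numArgs = 0;
    for (const Operand* list = op1; list != nullptr; list = list->rest)
    {
        numArgs++;
    }
    assert(numArgs > 0);
    return numArgs;
}
```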

Comment thread src/jit/lsraxarch.cpp
@@ -2263,9 +2290,8 @@ void LinearScan::BuildSIMD(GenTreeSIMD* simdTree)
// Return Value:
// None.

void LinearScan::BuildHWIntrinsic(GenTreeHWIntrinsic* intrinsicTree)
int LinearScan::BuildHWIntrinsic(GenTreeHWIntrinsic* intrinsicTree)
Member

I cleaned up this method here: #18078

It might be worthwhile merging the PRs together (some of the changes were fixing "correctness" issues I found, and the other changes are required to support the FMA instruction set, where any of the three operands can be contained).

Member

The changes I did to support FMA are here: https://github.com/tannergooding/coreclr/commit/e402548087f2bf9554e83052911561c016c60346#diff-498fd4859cb147c9bf082f5ebf32ca8fR2507

(The PR isn't up as it requires the CoreFX change adding back the unimplemented APIs to the ref assembly).

Author

> It might be worthwhile merging the PRs together

I think it might be easier to serialize them.

Comment thread src/jit/hwintrinsicArm64.cpp Outdated
list = list->Rest();
}

assert(numArgs > 0);
Member

Shouldn't this be: assert(numArgs >= 3) or are there cases where we (inefficiently) use a list for less than 2 args?

@BruceForstall BruceForstall left a comment

A few more comments.

Comment thread src/jit/lsrabuild.cpp Outdated
{
info->srcCount = appendBinaryLocationInfoToList(tree->AsOp());
assert((kind & GTK_SMPOP) != 0);
int srcCount = BuildBinaryUses(tree->AsOp());

This definition hides the enclosing one; that seems wrong, or at least confusing.

Author

Fixed. (This default is rarely used, and @mikedn recommended we eliminate it, which is probably a good idea, but perhaps for another day.)
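
The hiding problem raised in this thread can be reproduced in a few lines. This is a sketch with hypothetical function names; `BuildBinaryUsesSketch` just returns a fixed count in place of the real helper.

```cpp
#include <cassert>

// Hypothetical stand-in for a helper that builds uses and returns how many it built.
inline int BuildBinaryUsesSketch()
{
    return 2;
}

// Shadowing version: the inner declaration hides the enclosing srcCount,
// so the value returned at the end is never updated.
inline int BuildWithShadowing(bool isSimpleOp)
{
    int srcCount = 0;
    if (isSimpleOp)
    {
        int srcCount = BuildBinaryUsesSketch(); // new variable, hides the outer one
        (void)srcCount;
    }
    return srcCount; // still 0
}

// Fixed version: assign to the enclosing variable instead of redeclaring it.
inline int BuildWithoutShadowing(bool isSimpleOp)
{
    int srcCount = 0;
    if (isSimpleOp)
    {
        srcCount = BuildBinaryUsesSketch();
    }
    return srcCount;
}
```

Compilers can flag the first form with a shadowing warning (for example `-Wshadow` in GCC and Clang).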

Comment thread src/jit/lsrabuild.cpp
}
#endif // DEBUG
int newDefListCount = defList.Count();
int produce = newDefListCount - oldDefListCount;

So the only use (and assert using) produce is for amd64? Should you define this below, then?

Author

This assert was pre-existing, though before there were other uses of produce. However, I'd like to strengthen this assert to apply more broadly, but it requires some cleanup of the various nodes that produce multiple results. I've added a note to issue #13183 that deals with multi-reg nodes to strengthen this assert as well.

Comment thread src/jit/lsra.cpp
{
internalCandidates = allRegs(TYP_INT);
}
printf(" +<TreeNodeInfo %d=%d %di %df", dstCount, srcCount, internalIntCount, internalFloatCount);

There are currently still many references in comments to the TreeNodeInfo struct -- and the type itself is still there. But the type is unused. Can you update the comments, and remove the references and type?

Comment thread src/jit/lsraarm.cpp
// destination and internal [temp] register counts).
//
void LinearScan::BuildNode(GenTree* tree)
int LinearScan::BuildNode(GenTree* tree)

The return type isn't described in the comments above

Comment thread src/jit/lsrabuild.cpp
}

//------------------------------------------------------------------------
// GetOperandInfo: Get the source registers for an operand that might be contained.
// BuildDef: Build one or more RefTypeDef RefPositions for the given node,

name in comment doesn't match function name

Comment thread src/jit/lsrabuild.cpp
}

//------------------------------------------------------------------------
// GetOperandInfo: Get the source registers for an operand that might be contained.
// BuildDef: Build one or more RefTypeDef RefPositions for the given node

name in comment doesn't match function name

@BruceForstall BruceForstall left a comment

LGTM

I've only had a few minor comments, but overall it looks good to me.

@CarolEidt
Author

@dotnet-bot test Windows_NT x64 Formatting

@CarolEidt CarolEidt merged commit b39a5b2 into dotnet:master May 23, 2018
@CarolEidt CarolEidt deleted the ElimNodeInfo branch May 23, 2018 14:39
CarolEidt added a commit to CarolEidt/coreclr that referenced this pull request Jul 3, 2018
This is no longer used after dotnet#16517
@erozenfeld
Member

@CarolEidt We need to update ryujit-overview.md that still refers to TreeNodeInfo and gtLsraInfo.

@CarolEidt
Author

@erozenfeld - thanks for pointing that out. I'm going to open an issue and assign it to myself so that it doesn't get lost.
