Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Stop using LIST nodes for FIELD_LIST#26800

Merged
sandreenko merged 6 commits into
dotnet:masterfrom
mikedn:no-list-field
Oct 3, 2019
Merged

Stop using LIST nodes for FIELD_LIST#26800
sandreenko merged 6 commits into
dotnet:masterfrom
mikedn:no-list-field

Conversation

@mikedn
Copy link
Copy Markdown

@mikedn mikedn commented Sep 20, 2019

Changes GT_FIELD_LIST to be a special node with a list of "uses" for fields, similar to GT_PHI.

Contributes to #19876

On x64 (which doesn't really use GT_FIELD_LIST) PIN shows a 0.1% improvement. That's probably due to a few rationalizer checks that now only test for GT_LIST. It would probably be more interesting to see results for x86 but I couldn't get PIN to work on x86. On x86 there's a tiny drop in memory usage (0.03% or something like that).

@mikedn mikedn force-pushed the no-list-field branch 2 times, most recently from ee461d8 to 199b6f9 Compare September 21, 2019 07:49
@mikedn
Copy link
Copy Markdown
Author

mikedn commented Sep 24, 2019

@sandreenko Does it sound reasonable to remove GenTreeCall::regArgList at the same time fgArgTabEntry::node is removed? AFAIR you once tried to remove regArgList but restored it because it was useful in catching node update failures?

@sandreenko
Copy link
Copy Markdown

sandreenko commented Sep 24, 2019

@sandreenko Does it sound reasonable to remove GenTreeCall::regArgList at the same time fgArgTabEntry::node is removed?

Yes, feel free to remove that.

AFAIR you once tried to remove regArgList but restored it because it was useful in catching node update failures?

Yes, also there were noway asserts and buggy code in assertion propagation that used it for correctness, that was cleaned. So now regArgList is debug only.

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Sep 24, 2019

Thanks! (and oops, I asked the question in the wrong PR, it should have been in #24294)

@erozenfeld
Copy link
Copy Markdown
Member

@dotnet/jit-contrib

Copy link
Copy Markdown

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just comments and questions at this point, but I really love this change - I've never been a fan of the FIELD_LIST representation; I think it was one of the worst list cases.

Comment thread src/jit/gentree.h Outdated
Comment thread src/jit/lower.cpp Outdated
Comment thread src/jit/lower.cpp Outdated
(void)new (comp, GT_FIELD_LIST) GenTreeFieldList(argHi, 4, TYP_INT, fieldList);
GenTreeFieldList* fieldList = new (comp, GT_FIELD_LIST) GenTreeFieldList();
fieldList->gtUses = new (comp, CMK_ASTNode) GenTreeFieldList::Use(argLo, 0, TYP_INT);
fieldList->gtUses->SetNext(new (comp, CMK_ASTNode) GenTreeFieldList::Use(argHi, 4, TYP_INT));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider adding a method to GenTreeFieldList to add fields?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, every time I did one of these "stop using GT_LIST" changes. The problem is that it's kind of risky to have something like "AddUse" because we only keep track of the head of the linked list and appending would require traversing the list to find the tail. I think there are 2 ways to handle this:

  • add something similar to gtNewCallArgs - gtNewFieldList or something like that - that hides away this manual list manipulation in those cases where the number of fields is statically known
  • add something like UseListBuilder - a small class that keeps track of both the head and tail and can be used in the rest of the cases.

This should cover almost all cases. AFAIR the only place where we still need to do some manual linked list manipulation in long decomposition (and that's the main reason I maintained the existing linked list approach, otherwise arrays would be a very tempting alternative).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Another option would be to have constructor overloads for the known-number-of-fields cases. But I'm OK with deferring consideration of such a change.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though in the case of field list (and PHIs) I don't think there's anything stopping us from adding the tail of the list to the node, there's enough space in GenTreeFieldList for that. It's only calls that are problematic, they're large enough even without adding more pointers to them.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I added AddField and InsertField. Not sure I like all this too much, in part due to the side effect inheritance. Sure, GenTreeArgList did this automatically (and so do other GenTree derived classes). And I now realize that this happens even in LIR, where side effects are not supposed to be inherited. So I also added AddFieldLIR and InsertFieldLIR to deal with this case.

Opinions?

@sandreenko too

Comment thread src/jit/morph.cpp Outdated
GenTreeFieldList::Use(argx->gtOp.gtOp2, OFFSETOF__CORINFO_TypedReference__type, TYP_I_IMPL));
fieldList->gtFlags |=
(fieldList->gtUses->GetNode()->gtFlags | fieldList->gtUses->GetNext()->GetNode()->gtFlags) &
GTF_ALL_EFFECT;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flag inheritance was previously done by the GenTreeArgList constructor, and that's generally how it's done. It seems like it might be better to have an AddField method on GenTreeFieldList to take care of this.

Comment thread src/jit/morph.cpp Outdated
noway_assert(newArg->OperIs(GT_FIELD_LIST));

// We need to propagate any GTF_ALL_EFFECT flags from the end of the list back to the beginning.
// This is verified in fgDebugCheckFlags().
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment could be updated; we only need to update the FIELD_LIST node, where previously we actually had to propagate the flags up.

Comment thread src/jit/codegenxarch.cpp
Comment thread src/jit/lowerxarch.cpp

// First, insert the field node into the sorted list.
GenTreeFieldList* prev = nullptr;
for (GenTreeFieldList* cursor = head;; cursor = cursor->Rest())
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorting code was broken, it never updated prev. The reason why it worked is that the field list was already sorted by increasing offset so the bug here effectively turned this code into list reversal and we ended up getting the expected order. I don't see any reason to create field list with a different initial order so it seems preferable to just reverse the list and avoid the quadratic complexity concerns.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow - glad you got this "sorted" ;-)

Copy link
Copy Markdown

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great now; thanks for the updates. Ping me when you're ready to remove the [WIP] and we can trigger the failing CI tests (for now, I believe there are still issues in the queue).

@mikedn mikedn changed the title [WIP] Stop using LIST nodes for FIELD_LIST Stop using LIST nodes for FIELD_LIST Oct 2, 2019
@mikedn
Copy link
Copy Markdown
Author

mikedn commented Oct 2, 2019

@CarolEidt Thanks, I've run diffs again (none) and done another pass checking for possible fallout from the change to TYP_STRUCT. I think this is done now.

Off to finish the #21711 monster 😁

@sandreenko
Copy link
Copy Markdown

It would probably be more interesting to see results for x86 but I couldn't get PIN to work on x86. On x86 there's a tiny drop in memory usage (0.03% or something like that).

PIN should work on x86, just run in from SysWOW64 cmd ("C:\Windows\SysWOW64\cmd.exe"), if it doesn't work then I will measure the PIN numbers for this change later today.

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Oct 2, 2019

@sandreenko AFAIR the problem I had with pin on x86 was that I couldn't get it to compile. If that worked for you I can try again one day. Thanks for the offer to run it!

Copy link
Copy Markdown

@sandreenko sandreenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general.

a few comments/questions.

PIN for x86 (release, no pgo, 5 runs average):
base: 22057891557,
diff: 22021891998,
improvement is 0.1%.

Comment thread src/jit/gentree.h
use = use->GetNext();
}

bool operator==(const UseIterator& other)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these operators should be const, shouldn't they?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though given that it took 2 retries to get CI clean it can probably wait until the next GT_LIST related PR.

Comment thread src/jit/gentree.h Outdated
//
static bool Equals(GenTreeFieldList* list1, GenTreeFieldList* list2)
{
// TODO-Cleanup: The list type is actually ignored. It should always be TYP_STRUCT
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about gtFlags? I understand that it will be checked in the loop, but it could be a fast exit if they are not equal.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently this is only called from Compare and that does its own flag check already. That check is kind of weird:

coreclr/src/jit/gentree.cpp

Lines 1248 to 1252 in f777668

/* Sensible flags must be equal */
if ((op1->gtFlags & (GTF_UNSIGNED)) != (op2->gtFlags & (GTF_UNSIGNED)))
{
return false;
}

Even if we want to be able to use Equals directly (IMO it would be a reasonable thing to do) it's not clear what flags should be checked.

Comment thread src/jit/lowerxarch.cpp

if (src->OperIs(GT_FIELD_LIST))
{
#ifdef _TARGET_X86_
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the condition and #ifdef were switched?
so now x64 returns when it sees GT_FIELD_LIST in this method.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side effect of changing the field list type to TYP_STRUCT from TYP_INT or whatever happens to be the first field type.

With the original ifdef code was reaching an if (src->TypeGet() != TYP_STRUCT) that's below. Inside that if the only thing that happens is an attempt to contain constant operands so in the case of FIELD_LIST nothing really happened.

With the change to TYP_STRUCT, that if is no longer entered and instead some code that's expecting an OBJ is reached.

So it was always the case that FIELD_LIST was supposed to be ignored on x64 but the way this worked was a bit "by accident".

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation!

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Oct 3, 2019

PIN for x86 (release, no pgo, 5 runs average):
base: 22057891557,
diff: 22021891998,
improvement is 0.1%

Thanks! It's interesting that PIN shows the same improvement as on x64. So it's very likely the improvements are indeed due to the removed field list checks rather than any field list handling code changes. I'm curious if we'll get anything more out of it when I remove the final GT_LIST usage - SIMD/HWIntrinsic nodes.

Copy link
Copy Markdown

@sandreenko sandreenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you for that change!

@sandreenko sandreenko merged commit 0a00ee7 into dotnet:master Oct 3, 2019
@mikedn mikedn deleted the no-list-field branch October 5, 2019 07:12
SrivastavaAnubhav pushed a commit to SrivastavaAnubhav/coreclr that referenced this pull request Oct 7, 2019
* Stop using LIST nodes for FIELD_LIST

* Change GT_FIELD_LIST type to TYP_STRUCT

* Smaller GenTreeField::Use

* Delete out of date comment

* Cleanup LowerArg duplicated code

* Add AddField/InsertField
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants