Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

JIT: optimize Enum.HasFlag#13748

Merged
AndyAyersMS merged 2 commits into
dotnet:masterfrom
AndyAyersMS:FixEnumFlagsCompare
Sep 11, 2017
Merged

JIT: optimize Enum.HasFlag#13748
AndyAyersMS merged 2 commits into
dotnet:masterfrom
AndyAyersMS:FixEnumFlagsCompare

Conversation

@AndyAyersMS
Copy link
Copy Markdown
Member

In the vm, add mscorlib binder support for Enum.HasFlag. Recognize this
method as a may-expand intrinsic.

In the jit, generalize the behavior of gtTryRemoveBoxUpstreamEffects to
add a "trial removal" mode and to optionally suppress narrowing of the
copy source.

Then implement the optimization by checking for the intrinsic and ensuring
that both operands are boxes with compatible types that can be removed.
The optimization changes the call to a simple and/compare tree on the
underlying enum values.

Invoke the optimzation during importation (which will catch most cases) and
again during morph (to get the post-inline cases).

Currently not thinking of bumping the JIT GUID since this is a may expand
intrinsic and no existing intrinsic ID has changed.

Added test case. Suprisingly there were almost no uses of HasFlag in the
current CoreCLR test suite.

Closes #5626.

@AndyAyersMS
Copy link
Copy Markdown
Member Author

@jkotas PTAL (vm parts at least)
@JosephTremoulet PTAL (jit parts)
cc @dotnet/jit-contrib

No jit-diffs impact,. There was one use of HasFlag in the test suite in baseservices\threading\commitstackonlyasneeded\DefaultStackCommit and the optimization fires on it when jitting:

;; Before

       48B980B284EAFD7F0000 mov      rcx, 0x7FFDEA84B280
       E81A07865F           call     CORINFO_HELP_NEWSFAST
       4C8BF8               mov      r15, rax
       8B8C2480000000       mov      ecx, dword ptr [rsp+80H]
       41894F08             mov      dword ptr [r15+8], ecx
       48B980B284EAFD7F0000 mov      rcx, 0x7FFDEA84B280
       E8FD06865F           call     CORINFO_HELP_NEWSFAST
       488BD0               mov      rdx, rax
       C7420800100000       mov      dword ptr [rdx+8], 0x1000
       498BCF               mov      rcx, r15
       E85B2DA75E           call     System.Enum:HasFlag(ref):bool:this
       85C0                 test     eax, eax
       0F857F000000         jne      G_M54767_IG12

;; After

       8B8C2480000000       mov      ecx, dword ptr [rsp+80H]
       F7C100100000         test     ecx, 0x1000
       754C                 jne      SHORT G_M54767_IG12

Will do some corefx testing too since there is likely better coverage there.

@dotnet-bot test Windows_NT x64 corefx_baseline

Copy link
Copy Markdown

@JosephTremoulet JosephTremoulet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see this getting implemented, thanks!

Comment thread src/jit/compiler.h
GenTreePtr gtFoldExprCompare(GenTreePtr tree);
bool gtTryRemoveBoxUpstreamEffects(GenTreePtr tree);

enum BoxRemovalOptions
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. Will add some.

Comment thread src/jit/gentree.cpp
GenTree* asgStmt = op->gtBox.gtAsgStmtWhenInlinedBoxValue;
assert(asgStmt->gtOper == GT_STMT);
GenTreePtr copyStmt = op->gtBox.gtCopyStmtWhenInlinedBoxValue;
GenTree* copyStmt = op->gtBox.gtCopyStmtWhenInlinedBoxValue;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why GenTreePtr -> GenTree*?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decided not to introduce new GenTreePtr and then having a mixture looked odd. So changed them all.

Comment thread src/jit/gentree.cpp Outdated

// Pop the stack for real.
// impPopStack();
// impPopStack();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you do this in the importer now when calling from the importer, so can remove this commented code.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, yeah a vestige of the initial importer-only version.

Comment thread src/jit/gentree.cpp Outdated
return nullptr;
}

GenTree* flagVal = gtTryRemoveBoxUpstreamEffects(flagOp, BR_DONT_REMOVE);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be inclined to go ahead and use BR_REMOVE_BUT_NOT_NARROW here (and skip the call below), despite the asymmetry.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Originally there was one more downstream check so both removals needed to be trials.

Comment thread tests/src/JIT/opt/Enum/hasflag.cs Outdated
// See the LICENSE file in the project root for more information.

// Some simple tests for the Enum.HasFlag optimization.
// All the calls to HasFlag should be optimized to simple compares.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And are they all optimized with this code, or is there more to do for some of them?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All are now optimized (initially the ones with calls weren't). I'll fix the comment to be clearer.

Comment thread src/jit/gentree.cpp Outdated
// The thisVal and flagVal trees come from earlier statements. We need
// to evaluate them both to temps at those points to safely transmit
// the values here.
const unsigned flagTmp = lvaGrabTemp(true DEBUGARG("Enum:HasFlag flag temp"));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's not worth the trouble of handling the common case where flagVal is a constant and doesn't require a new temp.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it may be worth it... will give it a try.

Copy link
Copy Markdown

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment thread src/jit/gentree.cpp Outdated
{
printf("\nAttempting to remove side effects of BOX (valuetype)\n");
printf("\n%s remove side effects of BOX (valuetype)\n",
options == BR_DONT_REMOVE ? "Checking if it is possible" : "Attempting to");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: either move the "to " from "Attempting to" to before "remove" , or add after "possible".

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed now. Thanks.

@AndyAyersMS
Copy link
Copy Markdown
Member Author

Pushed an update (but missed Carol's note). Now in cases like

    public static int Main()
    {
        if (E.RED.HasFlag(E.BLUE))
        {
            Console.WriteLine("This bit should not be imported!\n");
        }
        return 100;
    }

The test is folded in the importer.

@AndyAyersMS
Copy link
Copy Markdown
Member Author

Making sure this still passes: @dotnet-bot test Windows_NT x64 corefx_baseline

@AndyAyersMS
Copy link
Copy Markdown
Member Author

Oops, broke the test case; now fixed. Also fixed explanatory message per Carol.

Making sure this still passes: @dotnet-bot test Windows_NT x64 corefx_baseline

Comment thread src/jit/gentree.cpp Outdated
// Unless they are invariant values, we need to evaluate them both
// to temps at those points to safely transmit the values here.
//
// Also we need ot use the flag twice, so we need two trees for it.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "ot" -> "to"

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found one other typo; will fix them both.

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Sep 1, 2017

The type equivalence implementation needs to be more careful about inexact types in shared generic code. The CoreCLR type system is not capable of capturing the exact type information for shared generic code (ie it does not have equivalent of RuntimeDeterminedType from CoreRT type system).

  • Compile the following with /o-, exception will be thrown as expected.
  • Compile the following with /o+, no exception will be thrown because of the optimizations will consider the types to be identical even though they are actually not.
using System;
using System.Runtime.CompilerServices;

class MyG<T,U> {

public enum A {
   X = 1
}

[MethodImpl(MethodImplOptions.NoInlining)]
public static void foo() {
    var a = MyG<object,U>.A.X;
    a.HasFlag(MyG<T,string>.A.X);
}

}

class My {

static void Main() {
    MyG<My,My>.foo();
}

}

It may be nice to plumb this such that these optimization can kick in shared generic code when the EE side has exact types in shared generic code.

@AndyAyersMS
Copy link
Copy Markdown
Member Author

@jkotas thanks for pointing this out. Think I know the right way to fix it...

Comment thread src/vm/jitinterface.cpp Outdated
}
}
else if (pMT->HasSameTypeDefAs(MscorlibBinder::GetClass(CLASS__READONLY_SPAN)))
else if (pMT->HasSameTypeDefAs(MscorlibBinder::GetClass(CLASS__ENUM)))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The preferred way to define non-generic intrinsincs is to make them FCall and use FCIntrinsic to get them ID assigned in vm\ecalllist.h. The advantage of it is that the intrinsic ID is cached and the scheme scales well with the number of intrinsics.

For Enum.HasFlag, it would require moving the bit of error handling from C# to C++ so that the main entrypoint is FCall.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically move the checking in HasFlag down into InternalHasFlag, then remove the former and rename the latter as HasFlag?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked into this some -- it is not obvious how to create the proper exception with formatted text for the class equivalence check. Any suggestions?

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Sep 1, 2017

Otherwise LGTM.

@AndyAyersMS
Copy link
Copy Markdown
Member Author

Looks like it needs some fixes for non-default underlying types too.

@AndyAyersMS
Copy link
Copy Markdown
Member Author

@jkotas what is the explanation for the R4/R8 support in Enum.cs, for instance in GetValue ?

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Sep 2, 2017

dotnet/corefx#13981 (comment)

It may be nice to add comment about this at the top of Enum.cs. It is frequently asked question.

@AndyAyersMS
Copy link
Copy Markdown
Member Author

Current thinking is to revise how Enum.HasFlag is implemented as a separate prerequisite PR, then come back to this (the jit side stuff is largely done).

@jkotas any advice on how to create the properly formatted type mismatch exception on the VM side? It seems to me like it will take a fair amount of extra plumbing...

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Sep 5, 2017

create the properly formatted type mismatch exception on the VM side?

I think that the easiest way to do it is to call back to managed code to throw the exception. MetadataImport.ThrowError is good example.

Maybe we should use this as an opportunity to create a protocol that allows arbitrary (non-FCall) CoreLib methods to be recognized as JIT intrinsic. Such scheme will be needed for SIMD intrinsics that are being worked on as well.

@AndyAyersMS
Copy link
Copy Markdown
Member Author

Here's a fork with my take on a fully native Enum.HasFlag that calls back for the type mismatch exception case.

Crafting a more flexible mechanism for determining which methods can be jit intrinsics sounds like a good idea, though I am not clear on what all would be required.

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Sep 5, 2017

Crafting a more flexible mechanism for determining which methods can be jit intrinsics sounds like a good idea, though I am not clear on what all would be required.

  • Mark methods that are potential JIT intrinsics with [Intrinsic] attribute
  • Cache the intrinsic flag on MethodDesc
  • Fast check on JIT/EE interface to tell whether method is a potential JIT intrinsic (e.g. it can be a flag returned by getMethodAttribs)
  • If it is potential intrinsic, the JIT can ask for type name and method name and switch based on that. It is similar to what is done for Vector intrinsics today. Notice that this allows adding/removing JIT intrinsics without changing JIT/EE interface.

@AndyAyersMS
Copy link
Copy Markdown
Member Author

@jkotas here's a first cut at the new style intrinsic support.

Let me know if you have a better name for these new kind of intrinsics and/or there is a better place to check for the attribute.

If this looks plausible I'll put it up for PR.

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Sep 5, 2017

Yes, it looks plausible - I have left a few comments on the commit.

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Sep 5, 2017

Let me know if you have a better name for these new kind of intrinsics

I would expect that we will migrate all JIT intrinsics over to this scheme over time, so I would not worry about the name too much.

Check for calls to `Enum.HasFlag` using the new named intrinsic support
introduced in dotnet#13815. Implement a simple recognizer for these named
intrinsics (currently just recognizing `Enum.HasFlag`).

When the call is recognized, optimize if both operands are boxes with
compatible types and both boxes can be removed. The optimization changes the
call to a simple and/compare tree on the underlying enum values.

To accomplish this, generalize the behavior of `gtTryRemoveBoxUpstreamEffects`
to add a "trial removal" mode and to optionally suppress narrowing of the
copy source.

Invoke the optimization during importation (which will catch most cases) and
again during morph (to get the post-inline cases).

Added test cases. Suprisingly there were almost no uses of HasFlag in the
current CoreCLR test suite.

Closes #5626.
@AndyAyersMS
Copy link
Copy Markdown
Member Author

Reworked this now that the new intrinsic support is in place.

Most of the code is as before, but the recognition bit is different. I put in some toeholds for a more general classification framework for the new intrinsics; eventually we can flesh this out with more information as we need it.

@JosephTremoulet PTAL
cc @dotnet/jit-contrib

@AndyAyersMS
Copy link
Copy Markdown
Member Author

@dotnet-bot retest Ubuntu arm64 Cross Debug Build

Comment thread src/jit/gentree.cpp Outdated
//
// False if the upstream effects could not be removed.
// A tree representing the original value to box, if removal
// is successful (but see note). nullptr if removal fails.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, after reading this and the note I was still not sure what the return value is when BR_DONT_REMOVE gets passed... something like successful/possible instead of just successful here, or an explicit Still returns the source tree if removal is legal in the note for BR_DONT_REMOVE, would maybe have helped.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do the former.

Copy link
Copy Markdown

@JosephTremoulet JosephTremoulet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just some very minor observations.

Comment thread src/jit/gentree.cpp Outdated
}

// A boxed flagOp should have exact type and non-null instance
assert(isExactFlag && isNonNullFlag);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I always prefer assert(x); assert(y); over assert(x && y), so you know which failed when one fails...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread src/jit/gentree.cpp
bool isNonNullFlag = false;
CORINFO_CLASS_HANDLE flagHnd = gtGetClassHandle(flagOp, &isExactFlag, &isNonNullFlag);

if (flagHnd == nullptr)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have the CORINFO_FLG_SHAREDINST check for the flag operand. I presume it couldn't equal thisHnd if it were shared... could consider adding a comment, or deferring the shared inst check until after the handles are checked for equality, just for readability.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deferring the check.

@AndyAyersMS
Copy link
Copy Markdown
Member Author

OSX test failure is unrelated: faililng pal test

..........................Expected thread to wait for 4000 ms (and not be interrupted).
Thread waited for 3999 ms! (Acceptable delta: 300)


FAILED: threading/WaitForSingleObject/WFSOExSemaphoreTest/paltest_waitforsingleobject_wfsoexsemaphoretest. Exit code: 1

@AndyAyersMS
Copy link
Copy Markdown
Member Author

@dotnet-bot retest OSX10.12 x64 Checked Build and Test

@AndyAyersMS
Copy link
Copy Markdown
Member Author

Tizen time out is worrisome, but the same changes passed before. Am going to retry.

@dotnet-bot retest Tizen armel Cross Debug Build

@AndyAyersMS AndyAyersMS merged commit e764b4d into dotnet:master Sep 11, 2017
@AndyAyersMS AndyAyersMS deleted the FixEnumFlagsCompare branch September 11, 2017 18:12
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants