Skip to content

Conversation

@AaronRobinsonMSFT
Copy link
Member

@AaronRobinsonMSFT AaronRobinsonMSFT commented Nov 29, 2021

Fixes #62067

This issue was occurring due to Type punning using C# function pointers. Consider the following where using a function pointer can circumvent type safety. Using function pointers enables a vast array of scenarios where type loading can occur at invalid times. This PR handles cases involving Type Equivalence, direct ldftn and ldvirtftn, indirect ldftn and ldvirtftn via Delegate activation, and scenarios involving Generics.

IntPtr fptr = typeof(A.Class).GetMethod("CallMe").MethodHandle.GetFunctionPointer();
unsafe
{
    ((delegate*<B.Struct, int>)fptr)(new B.Struct() { Field = 0 });
}
namespace A
{
    struct Struct
    {
        public int Field;
    }
    class Class
    {
        public static int CallMe(Struct s) => s.Field;
    }
}
namespace B
{
    struct Struct
    {
        public int Field;
    }
}

/cc @jkotas @davidwrighton @AndyAyersMS @jkoritzinsky @elinor-fung

@jkotas
Copy link
Member

jkotas commented Nov 30, 2021

The change is fixing the GC stress crash. On a second thought, I am wondering whether we want to make this work; or whether we should declare this invalid code and fix the test instead (e.g. by passing the struct by reference). If we decide to make this work, I think there are more places that will need the same fix.

@davidwrighton Could you please take a look and share your opinion?

The problem is:

  • The test gets function pointer by calling RuntimeMethodHandle.GetFunctionPointer
  • The target method has value type argument
  • Unsafe code calls the method via function pointer with value type that has same layout, but different identify

This fails under GC stress with "valuetype type not loaded" crash when we are in the prestub of the function.

@davidwrighton
Copy link
Member

Type punning like this isn't something that we generally support throughout the runtime, but we do support it as part of type equivalence. So, at the very minimum, we need to have the valuetypes loaded eagerly if the function pointer utilizes any type equivalent types, so a fix here is needed to some extent.

However, the fix as written makes me uncomfortable in several ways.

  1. The fix will cause a potentially significant amount of work (walking the signature) each time the GetFunctionPointer api is used. We generally try to avoid repeated unnecessary work on non-initialization code paths.
  2. The fix does not cover the case of function pointers created via ldftn and ldvirtftn. Those may already be covered for the type equivalence case, but I doubt they are fully covered for this type punning case.

OTOH, type punning of this form is likely common in C++/CLI and the fact that we still have this issue is likely indicative of a lack of GC stress testing on complex C++/CLI scenarios.

I think we should either scale back the fix to only cover type equivalent structures, or go the extra mile to make sure that ldftn and ldvirtftn instructions when not paired with a delegate allocation force the more complete parameter type load. Scaling back could be done by just calling MethodDesc::HasTypeEquivalentStructParameters from within the new PrepareForUseAsAFunctionPointer function. I believe that would cover the type equivalence scenario well enough.

Also, this should be tested with a type system test which is explicitly testing this, not an artifact of the COM test suite. I'd like to know if we really need this construct of type punning to use the COM api, or if something else is going on here.

@AaronRobinsonMSFT AaronRobinsonMSFT added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Dec 1, 2021
@AaronRobinsonMSFT AaronRobinsonMSFT marked this pull request as draft December 1, 2021 19:00
@davidwrighton
Copy link
Member

@AaronRobinsonMSFT As a set of tests covering the type punning scenarios, that looks good. Please don't check in until we conclude we actually want to support punning.

@AaronRobinsonMSFT AaronRobinsonMSFT marked this pull request as ready for review December 2, 2021 21:17
@AaronRobinsonMSFT AaronRobinsonMSFT removed the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Dec 2, 2021
@AaronRobinsonMSFT
Copy link
Member Author

An outstanding case is when ldftn is used for Delegate instantiation. Since the Delegate case is type safe there is no need for the full signature walk and value type load that is being done in the unsafe C# function pointer case.

It would be helpful if someone from the @dotnet/jit-contrib could provide some guidance on how one could detect the Delegate case in the importer. Based on debugging it appears the broad strokes are check if the next instruction is a CEE_NEWOBJ and then interrogate the token if it inherits from Delegate. I can make this check in CEE_LDFTN but want to ensure I'm not missing some existing helper.

case CEE_LDFTN:
{
// Need to do a lookup here so that we perform an access check
// and do a NOWAY if protections are violated
_impResolveToken(CORINFO_TOKENKIND_Method);
JITDUMP(" %08X", resolvedToken.token);
eeGetCallInfo(&resolvedToken, (prefixFlags & PREFIX_CONSTRAINED) ? &constrainedResolvedToken : nullptr,
addVerifyFlag(combine(CORINFO_CALLINFO_SECURITYCHECKS, CORINFO_CALLINFO_LDFTN)),
&callInfo);
// This check really only applies to intrinsic Array.Address methods
if (callInfo.sig.callConv & CORINFO_CALLCONV_PARAMTYPE)
{
NO_WAY("Currently do not support LDFTN of Parameterized functions");
}
// Do this before DO_LDFTN since CEE_LDVIRTFN does it on its own.
impHandleAccessAllowed(callInfo.accessAllowed, &callInfo.callsiteCalloutHelper);

@AaronRobinsonMSFT AaronRobinsonMSFT added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Dec 4, 2021
@AaronRobinsonMSFT AaronRobinsonMSFT removed the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Dec 4, 2021
@AaronRobinsonMSFT
Copy link
Member Author

@davidwrighton I've pushed up all the tests for this scenario. The failing scenario was only occurring during a GCStress run so in order to confirm that needs to happen. I believe this PR is now in a good state. and can be merged unless there are further JIT or Type system concerns.

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@jkotas
Copy link
Member

jkotas commented Dec 4, 2021

Does the fix work for crossgen/R2R?

@AaronRobinsonMSFT
Copy link
Member Author

Does the fix work for crossgen/R2R?

No. That needs to be looked at. @davidwrighton is going to pick this up next week. I might look into this weekend if I have time.

@AaronRobinsonMSFT
Copy link
Member Author

@jkotas Is there an example in CrossGen2 that I could look at that would help me to understand how to fix this there? I assume I need some new helper to call prior to the PreStub being executed.

@jkoritzinsky
Copy link
Member

@AaronRobinsonMSFT looks like there's a sample using the old classMustBeLoadedBeforeCodeIsRun mechanism that you changed from here.

private void classMustBeLoadedBeforeCodeIsRun(TypeDesc type)

Maybe you could add the checks in CorInfoImpl.ReadyToRun.cs in the getCallInfo implementation? I think you'd want to add the WalkValueTypeParameters function and for each value type parameter, call classMustBeLoadedBeforeCodeIsRun for each type. That should add the fixups that you need.

@AaronRobinsonMSFT
Copy link
Member Author

@jkoritzinsky Thanks. That helped and I was able to get ldvirtftn and the Generics cases working. However, the ldftn case doesn't seem to be handled on that path. See

private void getCallInfo(ref CORINFO_RESOLVED_TOKEN pResolvedToken, CORINFO_RESOLVED_TOKEN* pConstrainedResolvedToken, CORINFO_METHOD_STRUCT_* callerHandle, CORINFO_CALLINFO_FLAGS flags, CORINFO_CALL_INFO* pResult)
{
if ((flags & CORINFO_CALLINFO_FLAGS.CORINFO_CALLINFO_CALLVIRT) == 0 && pConstrainedResolvedToken != null)
{
// Defer constrained call / ldftn instructions used for static virtual methods
// to runtime resolution.
throw new RequiresRuntimeJitException("SVM");
}

I am trying to figure out why. I don't see the JIT involved at all during run-time so I am confused. I could be not understanding how to debug this though.

@AaronRobinsonMSFT
Copy link
Member Author

@jkoritzinsky Oops. Wrong. I misread that if. Let me try this again.

@AaronRobinsonMSFT
Copy link
Member Author

AaronRobinsonMSFT commented Dec 4, 2021

Thanks @jkoritzinsky ! That worked and was simple to follow.

@jkotas and @davidwrighton The tests now pass under CrossGen/R2R.

@dotnet/crossgen-contrib Please take a look at the CrossGen changes and let me know if there is a better way.

Copy link
Member

@davidwrighton davidwrighton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please look into the scenario of using ldftn involving type punning with a virtual method.

@ghost ghost added needs-author-action An issue or pull request that requires more info or actions from the author. and removed needs-author-action An issue or pull request that requires more info or actions from the author. labels Dec 6, 2021
@AaronRobinsonMSFT
Copy link
Member Author

@davidwrighton Please take another look. I added a test for using ldftn on a virtual function.

@AaronRobinsonMSFT AaronRobinsonMSFT merged commit 33d989e into dotnet:main Dec 7, 2021
@AaronRobinsonMSFT AaronRobinsonMSFT deleted the issue62067 branch December 7, 2021 03:03
@ghost ghost locked as resolved and limited conversation to collaborators Jan 6, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Test failure Interop\\COM\\Activator\\ActivatorBuiltInComDisabled\\ActivatorBuiltInComDisabled.cmd

7 participants