Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

[x86/Linux] Fix GetCallerSp#9384

Merged
janvorli merged 3 commits into
dotnet:masterfrom
parjong:fix/x86_GetCallerSp
Feb 17, 2017
Merged

[x86/Linux] Fix GetCallerSp#9384
janvorli merged 3 commits into
dotnet:masterfrom
parjong:fix/x86_GetCallerSp

Conversation

@parjong
Copy link
Copy Markdown

@parjong parjong commented Feb 7, 2017

x86 unwinder takes care of callee-poped stack elements, and thus Caller's Esp is not same as Caller-SP (discussed in #9366).

This commit revises GetCallerSp to return correct Caller-SP for x86.

@janvorli
Copy link
Copy Markdown
Member

janvorli commented Feb 7, 2017

@parjong It looks to me that it would be better to keep the EstablisherFrame and GetCallerSP so that they match what is stored in the REGDISPLAY and instead compensate for the stack arguments when getting the GS cookie address. Would that work?
And have you considered using the Initial-SP instead of Caller-SP as @jkotas has suggested in #9366 for the establisher frame?

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 7, 2017

@janvorli For me, it currently have no issue to use of initial-SP instead of caller-SP as Establish Frame Pointer, but it will have an impact on #9300 (@wateret is working on). #9300 currently assumes that the value of Establish Frame Pointer is Caller-Sp, and uses it to restore Ebp. I currently have no idea whether it is a good idea to use initial-SP instead of caller-SP for #9300.

In its current form, GetCallerSp will return caller's Initial-SP instead of Caller-SP (according to CLR ABI). I could find several places that invoke GetCallerSp, and thus I'm not sure whether this semantic difference affects these code, or not.

\CC @jkotas @seanshpark

@janvorli
Copy link
Copy Markdown
Member

janvorli commented Feb 7, 2017

@parjong I went through all of the usages of GetCallerSp and there are just two that matter for x86 on Linux. In the EECodeManager::GetGSCookieAddr and in the ExceptionTracker::ProcessOSExceptionNotification. For the latter, the only thing that matters is that it matches the way the establisher frame is reported.

@janvorli
Copy link
Copy Markdown
Member

janvorli commented Feb 7, 2017

@parjong one more thing. The CLR ABI talks about the caller SP only for amd64 and arm, since on x86 before the changes to use WIN64EXCEPTIONS on Linux, there was no concept of caller SP. So we can actually define the semantics of the caller SP on x86 Linux. We can define it as the SP before the caller started to push parameters for the call.

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Feb 7, 2017

We can define it as the SP before the caller started to push parameters for the call.

I think it would be better to just define it same way as it is defined on amd64; to reduce the variety of options.

@janvorli
Copy link
Copy Markdown
Member

janvorli commented Feb 8, 2017

@jkotas ok, in that case the spirit of the current change is doing exactly that.

Comment thread src/vm/eetwain.cpp Outdated
EECodeInfo codeInfo;
codeInfo.Init((PCODE) pRD->ControlPC);

res -= codeInfo.GetCodeManager()->GetStackParameterSize(&codeInfo);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we stored the callerSP in the REGDISPLAY in the OOPStackUnwinderX86::VirtualUnwind when we set the EstablisherFrame? Then we would not have to initialize the EECodeInfo and call the GetStackParameterSize here again.

Copy link
Copy Markdown
Author

@parjong parjong Feb 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janvorli Then, CLR will get into stuck when resuming as discussed in #9366 (comment).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@parjong I am confused by your reply. My comment was not suggesting a change in the behavior, but rather "caching" the value we assign to EstablisherFrame in the OOPStackUnwinderX86::VirtualUnwind, since that's exactly the value we would compute here. So the point was that we don't have to recompute it here that way.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janvorli Oh, I'm sorry, I misunderstood your comment. It will work if your suggestion was to add DWORD CallerSp field in REGDISPLAY. I thought that your suggestion was to update Esp as Caller SP.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. Yes, I meant adding CallerSp field in REGDISPLAY

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janvorli Oops, I found a problem. rd in OOPStackUnwinderX86::VirtualUnwind is a local variable, and thus the update on CallerSp here has no effect.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you are right. I've forgotten about it.

Comment thread src/vm/i386/cgenx86.cpp Outdated
_ASSERTE(pFunc != NULL);

#ifdef WIN64EXCEPTIONS
UpdateRegDisplayHelper(pRD, 0);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand why the behavior here is different for WIN64EXCEPTIONS. This UpdateRegDisplay is essentially a simple unwind to the caller context. So I would expect it to behave as the UnwindStackFrame w.r.t. the semantics of the value it sets to the SP.

Copy link
Copy Markdown
Author

@parjong parjong Feb 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janvorli You're right. It would be better to use FEATURE_PAL (instead of WIN64EXCEPTIONS) here.

As discussed in #9272, libunwind does not consider calling convention at all, and stack frames from native method such as ThePreStub is unwound by libunwind.

As a result, stack walker failed to recognize the target frame as the esp comes from libuwnind and the esp updated by this method are different.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@parjong I am also not sure I understand your reply here. The TransitionFrame::UpdateRegDisplay effectively unwinds to a managed method. So I was assuming that we would use the UpdateRegDisplayHelper(pRD, pFunc->CbStackPop()) for WIN64EXCEPTIONS too so that pRD->pCurrentContext->Esp gets ESP before the stack arguments were pushed, like the UnwindStackFrame does.

Copy link
Copy Markdown
Author

@parjong parjong Feb 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janvorli Sorry for unclear comment.

Suppose that a managed method A calls a native method B, and B throws a managed exception that A will handle.

FindNonvolatileRegisterPointers attempts to find the frame of A using uOriginalSP which comes from Thread::VirtualUnwindToFirstManagedCallFrame.

Note that the last step of VirtualUnwindToFirstManagedCallFrame is PAL_VirtualUnwind (instead of UnwindStackFrame) which does not consider callee-poped stack elements, and thus uOriginalSP will be exactly same as B's Caller-SP.

(As I understand) stack walker will get A's frame via calling TransitionFrame::UpdateRegDisplay onto B's frame, and thus the SP of A will become B's Caller-SP increased by the result of CbStackPop() in the current form.

Due to this mismatch, stack walker fails to recognize A's frame, which results in an assertion failure inside FindNonvolatileRegisterPointers.

Please let me know if there is something wrong. I find all of these steps via execution log, but there may be something that I misunderstand.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I was looking at it from the amd64 prespective, where the RIP and RSP are extracted in a different way.
The TransitionFrame (resp. the frame types derived from it, since TransitionFrame is an abstract base) is created for transition from managed code. Stack for one of the cases (taken from the PinnedObject test) is below:

00007FFFFFFFA000 00007FFFF630460F libcoreclr.so!PreStubWorker + 1935 at /home/janvorli/git/coreclr/src/vm/prestub.cpp:1078
00007FFFFFFFA228                  [PrestubMethodFrame: 00007fffffffa228] System.Reflection.Metadata.dll!System.Reflection.Internal.FileStreamReadLightUp+NativeMethods.ReadFileModern(System.Runtime.InteropServices.SafeHandle, Byte*, Int32, Int32 ByRef, IntPtr)
00007FFFFFFFA270 00007FFFF61EC9BC libcoreclr.so!ThePreStub + 92 at /home/janvorli/git/coreclr/bin/obj/Linux.x64.Debug/src/vm/wks/theprestubamd64.S:2
00007FFFFFFFA360 00007FFF7D7AC7F2 System.Reflection.Metadata.dll!System.Reflection.Internal.FileStreamReadLightUp.TryReadFile(System.IO.Stream, Byte*, Int64, Int32) + 274 [/home/janvorli/git/corefx/src/System.Reflection.Metadata/src/System/Reflection/Internal/Utilities/FileStreamReadLightUp.cs @ 111]

ThePreStub asm helper creates a data structure called transition block on the stack. This block contains callee saved and argument registers. The transition block is created right below the return address to the managed caller of the ThePreStub. The PrestubMethodFrame (derived from the TransitionFrame) is created in the PreStubWorker and it contains a pointer to the transition block.
On x64, the RIP is set to the return address to the managed caller and the RSP is set to the end of the transition block plus sizeof(void*) (skipping the return address).
So, these values correspond to the frame of the managed caller of the ThePreStub.
That means that on x86 Linux, the ESP should be the one before the arguments were pushed to be consistent with the cases when we unwind to a managed frame from another managed frame directly. Otherwise it seems that the next unwind would fail if the managed function that called the ThePreStub had stack arguments.

Now I also wonder why we extract the pRD->pCurrentContext->Eip and pRD->pCurrentContext->Esp below in a different way than on amd64 (using pRD->PCTAddr instead).
What I mean, why don't we do it like this:

pRD->pCurrentContext->Eip = GetReturnAddress();
pRD->pCurrentContext->Esp = GetSP() + cbStackPop;

Copy link
Copy Markdown
Author

@parjong parjong Feb 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janvorli The first approach (revise OOPStackUnwinderX86::VirtualUnwind to update Esp as Caller-SP) will make CLR get into stuck when resuming (sometimes). Thus, we need to additionally record Esp for resume.

I'm not sure about the second approach. While investigating #9272, I found that libunwind does not consider calling convention for unwinding. As the issue happens when we unwind native method frame, we could not use the existing infrastructure for managed methods.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@parjong I have spent some time today thinking about all of the aspects of the problem. Here are my notes:

  1. We cannot store the esp at the call instruction in the context, unless we change the UnwindStackFrame to work with that semantics of esp directly. I don't see a way how OOPStackUnwinderX86::VirtualUnwind could augment it back to the SP before pushing the stack args. The problem is that the unwind gets the current context, but for getting the size of the stack arguments, we would need the previous context that we've unwound to the current context before. And that's not available at that point.
  2. There are actually two different ways the stack is unwound.
    a. One is in the stackwalk.cpp, where it doesn't directly walk native code frames, but it uses the explicit stack frames (classes derived from the Frame class) to cross the native area. That's where the XXXFrame::UpdateRegDisplay is used. Frames derived from the TransitionFrame contain MethodDesc of the function that gets called after the transition. There is a managed method A that calls a native code with the TransitionFrame and then a managed method B is called. But logically, you can see it as method A calling method B. And so the stack arguments size from the MethodDesc of method B stored in the transition frame gives you the delta between the caller SP stored in the regdisplay passed to the UpdateRegDisplay and the SP where the stack arguments were pushed. So the TransitionFrame::UpdateRegDisplay should always call the UpdateRegDisplayHelper with the pFunc->CbStackPop(), even for WIN64EXCEPTIONS in the current model when the context of managed frames contains esp before the stack arguments were pushed and not the caller SP. First for the sake of consistency and second because calling the EECodeManager::GetCallerSp on REGDISPLAY of the method A would otherwise return nonsense.
    b. The other one is in the UnwindManagedExceptionPassX and Thread::VirtualUnwindToFirstManagedCallFrame where we walk every single native frame. In this case, the PAL_VirtualUnwind is used and so when we walk out of native frames to a managed frame, the esp is obviously the caller SP and not the SP before the stack arguments were pushed. So at this place we would need to have a way to lookup the size of stack arguments from an address in the function. It seems we could use the information in the jithelpers.h to build some kind of lookup.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@parjong I just want to confirm something. I think you've said before that the UnwindStackFrame didn't work if you have tried to call it with the esp set to the caller SP instead of the SP before the stack arguments were pushed. Am I right? And if yes, do you remember what was the issue?

I am asking since reading through the unwinder code, it seems to me that it could be possible to modify it (using ifdefs only for Linux) so that it would be able to unwind in the same way as the libunwind. That would remove the mismatch between these two unwinders, both would unwind to the caller SP.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@parjong please ignore my last comment. Reading through the unwinder code more, I've found that it would be difficult to make such a change in a reasonable way.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janvorli When I set Esp as Caller-SP inside UnwindStackFrame, JIT.Regression.CLR-x86-JIT.V2.0-Beta2.b415164 get stuck (discussed in #9366). It is almost same as #9272, but the callee is managed one instead of native one.

Actually, #9366 was an attempt to implement libunwind-like unwinder without affecting x86/Windows.

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 8, 2017

@janvorli FYI,

The unittest result with this PR:

=======================
     Test Results
=======================
# CoreCLR Bin Dir  : 
# Tests Discovered : 7027
# Passed           : 5920
# Failed           : 828
# Skipped          : 279
=======================

The unittest result with this PR revised as you suggested (change only GetGSCookieAddress):

=======================
     Test Results
=======================
# CoreCLR Bin Dir  : 
# Tests Discovered : 7027
# Passed           : 5912
# Failed           : 836
# Skipped          : 279
=======================

I'm not sure whether there is actual difference as there are some random failures.

@janvorli
Copy link
Copy Markdown
Member

janvorli commented Feb 8, 2017

@parjong thank you for the rest results. Based on the comment from @jkotas, I have actually changed my mind to keep the CallerSP semantics consistent with AMD64. That's why I have started to add the comments to the specific parts of your change.

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 8, 2017

@janvorli I understand. I did this test before you changed your mind, and I thought that this result may be useful for someone else.

@parjong parjong changed the title [x86/Linux] Fix GetCallerSp (Work in progress) [x86/Linux] Fix GetCallerSp Feb 8, 2017
@wateret
Copy link
Copy Markdown
Member

wateret commented Feb 8, 2017

@janvorli @jkotas I'm working on a new version of #9300 with PSPSym support.
Basically I was going to write it same way with ARM but I'm still not sure which to use for establisher frame pointer, CallerSP or InitialSP. As far as I understand, CallerSP could be problematic in x86 so InitialSP would be a safer way.
Am I right? If so, would it be OK to follow ARM ABI but InitialSP as establisher frame pointer?

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Feb 8, 2017

Yes, I think Initial-SP would be the right choice for establisher frame on x86. What are parts of the ARM ABI that you are thinking about following for x86? It may be useful if you could send PR with just edits and clarifications in the relevant chapter of clr-abi.md so that we can get the details nailed.

@wateret
Copy link
Copy Markdown
Member

wateret commented Feb 9, 2017

@jkotas What I meant by ARM ABI, specifically, everything in The PSPSym and funclet parameters for ARM/ARM64 and EH funclet convention as discussed in #9300 (comment).

What I thought was, there are two different conventions: ARM/ARM64 and AMD64. I thought AMD64 is a special case so it would be better to follow ARM/ARM64 ABI. However we've decided to use Initial-SP as establisher frame which is like AMD64 ABI so I wanted to clarify the rest of ABI.

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Feb 9, 2017

Well, the ARM/ARM64 is same because of it was easy choice to copy ARM to ARM64. Both AMD64 and ARM/ARM64 have grown somewhat organically.

There is actually a third variant used by CoreRT on all platforms. It was intentionally designed, taking the best parts from both AMD64 and ARM/ARM64. It would be nice to snap to CoreRT convention for Linux x86 where possible.

  • First argument of all funclets is establisher frame, second argument is exception object (where applicable). Same as AMD64.
  • For filter funclets the VM sets the frame register to be the same as the parent function. For second pass funclets the VM restores all non-volatile registers. Similar to ARM/ARM64.
  • No PSP sym (it may be hard to not have PSP sym for CoreCLR - it is ok to skip it)

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 13, 2017

@janvorli I currently consider the following changes to store both SP before call and SP after call via adding ResumeEsp in CONTEXT:

  • Revise OOPStackUnwinderX86::VirtualUnwind to update both ResumeEsp and Esp ( OOPStackUnwinderX86::VirtualUnwind unwinds a managed frame (a frame created by a managed method), and thus it could know both ResumeEsp and Esp after unwind)
  • Revise TransitionFrame::UpdateRegDisplay to update both ResumeEsp and Esp (cbStackPop allows us to update ResumeEsp correctly)
  • Revise PAL_VirtualUnwind to update both ResumeEsp and Esp as Caller-SP (libunwind does not know ResumeEsp)
  • Revise RtlCaptureContext to update both ResumeEsp and Esp as current SP)
  • Revise RtlRestoreContext to use ResumeEsp instead of Esp

I think that the above changes allows us to get correct ResumeEsp and Esp for managed frames except one case: unwound from FCall as you pointed out. As discussed in #9272, we need to introduce additional adjustment after PAL_VirtualUnwind to deal with such cases.

Could you let me know your opinion on this approach?

@janvorli
Copy link
Copy Markdown
Member

@parjong the plan looks reasonable to me. So basically, Context.Esp would always be the caller context and Context.ResumeEsp would be equivalent to the caller context for native frames and it would be the Esp before pushing the arguments for managed frames, right?

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 15, 2017

@janvorli Yes, Esp should be same as libunwind unwinder's SP, and ResumeEsp should be same as CLR-internal unwinder's SP.

Copy link
Copy Markdown
Member

@janvorli janvorli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@janvorli
Copy link
Copy Markdown
Member

@dotnet-bot test Windows_NT x64 Debug Build and Test please

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 15, 2017

@dotnet-bot test Ubuntu x64 Checked Build and Test please

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 15, 2017

@dotnet-bot test OSX x64 Checked Build and Test please

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 17, 2017

@janvorli Could you please merge this PR?

@janvorli janvorli merged commit 00fc8f9 into dotnet:master Feb 17, 2017
@parjong parjong deleted the fix/x86_GetCallerSp branch February 19, 2017 23:27
jorive pushed a commit to guhuro/coreclr that referenced this pull request May 4, 2017
* [x86/Linux] Fix GetCallerSp

* Do NOT pop stack argument for TransitionFrame

* Add ResumeEsp to CONTEXT
@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* [x86/Linux] Fix GetCallerSp

* Do NOT pop stack argument for TransitionFrame

* Add ResumeEsp to CONTEXT


Commit migrated from dotnet/coreclr@00fc8f9
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants