[x86/Linux] Initial patch for EH funclet#9601
Conversation
|
This patch is originally from https://github.com/parjong/coreclr/commit/a992b93ec85cf70c54558ff63a3b11ee0809b4db. Before doing something like #9489, I thought we need sp-based stack unwinding and a basic funclet frame. At least on x86/linux this makes more unit test success with no regression. |
|
cc: @dotnet/jit-contrib |
| // This is the end of the OS-reported prolog for purposes of unwinding | ||
| compiler->unwindEndProlog(); | ||
|
|
||
| if (block->bbCatchTyp != BBCT_FINALLY && block->bbCatchTyp != BBCT_FAULT) |
There was a problem hiding this comment.
It would be easier to read if you used a positive condition (listing catch types for which the code is generated). I guess it would be just BBCT_FILTER?
There was a problem hiding this comment.
Wouldn't it be both BBCT_FILTER and BBCT_FILTER_HANDLER?
There was a problem hiding this comment.
The BBCT_FILTER_HANDLER represents the handler that's called when the filter decides to handle the exception. So I would think that the BBCT_FILTER_HANDLER should be handled like the BBCT_CATCH.
There was a problem hiding this comment.
The only difference between if and else block is the presence of exception object. So I think we need it for all types of funclets other than FINALLY and FAULT.
There was a problem hiding this comment.
Why would the BBCT_FILTER_HANDLER be different from the BBCT_FAULT? It looks like it is essentially the same thing
| { | ||
| getEmitter()->emitIns_R_AR(INS_mov, EA_PTRSIZE, REG_FPBASE, REG_SPBASE, 8); | ||
| getEmitter()->emitIns_R_AR(INS_lea, EA_PTRSIZE, REG_FPBASE, REG_FPBASE, -8); | ||
| } |
There was a problem hiding this comment.
What if it is BBCT_FINALLY? Is it correct that we don't generate anything here?
There was a problem hiding this comment.
This else if block was for BBCT_FINALLY as well but it somehow crashes. It cannot run any managed binaries and it seems like it crashes from mscorlib(AppDomain or something). I didn't figure out why yet.
There was a problem hiding this comment.
@janvorli @wateret It turns out that we need to revise genCallFinally accordingly to support BBCT_FINALLY. The current implementation generates the following call-finally sequence:
90 nop
90 nop
8BCC mov ecx, esp
E80B000000 call [Finally]
90 nop
The current implementation passes Initial-SP via ecx, but EE passes Caller-SP via stack. This difference causes segmentation fault.
| #ifdef WIN64EXCEPTIONS | ||
| if (pCodeInfo->IsFunclet()) | ||
| { | ||
| baseSP = curESP + 8; |
There was a problem hiding this comment.
Could you please add comment on what the constant "8" means?
There was a problem hiding this comment.
I'd also like to see a comment that describes what you expect the stack frame to look like, since you obviously have certain expectations about the stack layout.
In reply to: 101234990 [](ancestors = 101234990)
| } | ||
|
|
||
| // Add a padding (for 16-byte alignment) | ||
| // See #9439 for details |
There was a problem hiding this comment.
Please use full URL of the github issue (https://github.com/dotnet/coreclr/issues/9439)
There was a problem hiding this comment.
The code needs to stand on its own without reference to GitHub. So please include all and any comments required to document this requirement. A temporary reference is ok, but I would expect it to be removed when the issue is fixed, and not remain longer than, say, one month.
In reply to: 101235235 [](ancestors = 101235235)
|
@wateret It would be great to add comment to the CodeGen::genFuncletProlog describing the details of what arguments are passed to each filter type, what does the stack frame look like etc. Similar to what we have for this function for all other architectures. |
|
@dotnet-bot test this please (Jenkins was down) |
| if (block->bbCatchTyp != BBCT_FINALLY && block->bbCatchTyp != BBCT_FAULT) | ||
| { | ||
| getEmitter()->emitIns_R_AR(INS_mov, EA_PTRSIZE, REG_EXCEPTION_OBJECT, REG_SPBASE, 12); | ||
|
|
There was a problem hiding this comment.
This "mov" is the only difference between the funclet prolog types, right? This should be determined by if handlerGetsXcptnObj(block->bbCatchTyp).
There was a problem hiding this comment.
@BruceForstall Thanks you for comment! It works well.
@wateret I updated the draft here, and it works well 👍
|
@janvorli @BruceForstall |
61cb6da to
44107aa
Compare
|
|
||
| // TODO We will need changes here when we introduce PSPSym | ||
| getEmitter()->emitIns_R_R(INS_mov, EA_PTRSIZE, REG_FPBASE, REG_ARG_0); | ||
| getEmitter()->emitIns_R_AR(INS_lea, EA_PTRSIZE, REG_FPBASE, REG_FPBASE, -8); |
There was a problem hiding this comment.
Couldn't you eliminate the first line and make the second:
getEmitter()->emitIns_R_AR(INS_lea, EA_PTRSIZE, REG_FPBASE, REG_ARG_0, -8);
?
There was a problem hiding this comment.
Please add a comment here about the -8. I assume this is because you assume the main function prolog is:
pushd ebp
mov ebp,esp
Thus, CallerSP is exactly 8 bytes above, due to the pushed return address and pushed ebp.
Note that this is the reason we have the funclet frame layout diagram comments. Also, these constants are encapsulated in other architectures with the CodeGen::genCaptureFuncletPrologEpilogInfo() function and descriptive names.
| (!compiler->compLocallocUsed && (compiler->funCurrentFunc()->funKind == FUNC_ROOT))) | ||
| { | ||
| #ifdef _TARGET_X86_ | ||
| getEmitter()->emitIns_R_AR(INS_lea, EA_PTRSIZE, REG_ARG_0, REG_FPBASE, 8); // caller-SP for x86 |
There was a problem hiding this comment.
This needs to use -genCallerSPtoFPdelta(), not "8"
| // we need to change here accordingly. | ||
| if (pCodeInfo->IsFunclet()) | ||
| { | ||
| baseSP = curESP + 8; // padding for 16byte stack alignment allocated in genFuncletProlog() |
There was a problem hiding this comment.
What if we are trying to unwind in a funclet and we have pushed a lot of arguments in preparation for making a call, but haven't yet made the call? Then the curESP will not be what you expect.
There was a problem hiding this comment.
@BruceForstall Thank you for pointing it out! The current problem is that we do not have a frame pointer for this funclet frame (ebp should be restored as that of main function as I understand).
Could you let me know if there is any way to reserve any register (except ebp)? It would resolve issues from incorrect ESP if we could reserve any registers for a frame pointer for funclet frames.
There was a problem hiding this comment.
All the other funclet-based architectures also have fixed sized frames -- they pre-allocate the outgoing argument space for calls, and localloc/_alloca is not allowed in EH handlers / funclets by the CLI spec. So this is new territory.
One possibility is to set up ebp as a normal frame pointer in funclets, and then reserve and use some other register as the base pointer for accessing locals within funclet code generation. That would be a pretty pervasive change.
Another possibility would be to change to fixed frames. That doesn't seem great.
Or, as you suggest, you could reserve another (callee-saved) register for storing the funclet frame pointer. But that then becomes part of the ABI contract, as the VM would need to know which register that is.
There are so few registers already on x86 that reserving (another) one in funclets is going to hurt code quality.
I can't think of any obvious good solution right now.
There was a problem hiding this comment.
@BruceForstall How about ecx? Is it supposed to be available in funclet?
There was a problem hiding this comment.
What if we are trying to unwind in a funclet and we have pushed a lot of arguments in preparation for making a call,
The current Windows x86 can handle this situation fine, with just ESP and EBP. What is different about the Linux x86 funclets that it does not work? I do not think we should burn a register for this.
There was a problem hiding this comment.
If not, it seems that GC would get stuck for the following case (with the above unwind routine):
push STK_ARG_2 << GC >> push STK_ARG _1 call FUNC
The x86 gc and unwind info keeps track of how much was pushed onto stack and whether it needs protecting. E.g. look for GetPushedArgSize in eetwain.cpp - it returns how much is pushed on the stack at given offset in the method.
There was a problem hiding this comment.
The current Windows x86 can handle this situation fine, with just ESP and EBP. What is different about the Linux x86 funclets that it does not work?
GetPushedArgSize() is currently only used in ESP-based frames. With funclets enabled for Linux x86, we're generating EBP-based frames. For normal EBP-based frame unwind, we don't need to worry about pushed arguments because we just use the EBP to pop the frame. However, with funclets, we can't do that because the EBP is pointing to the parent function (the funclet sets this up in the prolog to enable stack local variable access via EBP). Maybe it's possible to still use GetPushedArgSize() in an EBP-based frame in the funclet case to unwind ESP, and then find the actual EBP that was pushed/saved in the funclet prolog (as opposed to the parent frame EBP that was established)?
There was a problem hiding this comment.
BTW, I don't think we should mark regions where we push call arguments within funclets as non-interruptible, which might be what @janvorli was suggesting.
There was a problem hiding this comment.
@BruceForstall no, I was not trying to suggest that. I was just explaining how things work.
There was a problem hiding this comment.
We are trying to solve various issues given the current state of how we decided to do certain things on the fly. Given the knowledge of issues we have accumulated recently, I believe we should step back and reconsider the possibilities we have from higher level point of view. Some of the issues are due to the way we have decided to use for stack aligning. Some might be solved if we changed calling convention of jitted code so that the callee is no longer responsible for reclaiming the stack arguments space.
I am afraid that otherwise we would get lost in solving local issues that we could side step using a different global approach.
|
|
||
| // TODO Restore callee-saved registers | ||
|
|
||
| inst_RV_IV(INS_add, REG_SPBASE, 8, EA_PTRSIZE); |
There was a problem hiding this comment.
Add a comment that this is to revert the alignment adjustment.
732921f to
57f8a53
Compare
|
I have updated the change. I'm aware that we still have some issue but IMHO we will need something like this anyway. So what about merging this, if there is no other issue except for the one that we discussed? Seems like this patch makes most EH test cases(except FILTERs) pass . |
| #elif defined(_TARGET_X86_) | ||
| return (UINT_PTR*)&(pContextRecord->Edi); | ||
| #else | ||
| PORTABILITY_ASSERT("GetFirstNonVolRegLocation"); |
There was a problem hiding this comment.
Could you please change this as GetRegRestoreBase?
| #elif defined(_TARGET_X86_) | ||
| return pContextRecord->Ebp; | ||
| #else | ||
| PORTABILITY_ASSERT("GetRestoreBase"); |
There was a problem hiding this comment.
Similarly, need to change this as GetFrameRestoreBase.
|
@jkotas @janvorli @BruceForstall Could you please take a look this again? As I already posted in #9265, this PR is a necessary step to resolve various issues in x86/Linux. I totally agree with @janvorli's opinion that we need to revisit the overall structure. However, the infrastructure that this PR attempts to introduce will be required even when we take another approach (such as global calling convention change). |
| * | ||
| * Funclets have the following incoming arguments: | ||
| * | ||
| * catch/filter-handler: eax = the exception object that was caught (see GT_CATCH_ARG), ecx = FP |
There was a problem hiding this comment.
Funclets on all other architectures get the argument via regular argument registers. Would it make more sense to pass the exception object in edx instead?
There was a problem hiding this comment.
Last time I did so and I had to also put mov eax, edx in funclet prolog because of
#define REG_EXCEPTION_OBJECT REG_EAX
in jit/target.h.
So I thought if we pass the Exception Object directly to EAX then we could save an instruction.
I also thought it would be better if we could set REG_EXCEPTION_OBJECT to REG_EDX.
|
|
||
| // Restore frame pointer | ||
| // TODO We will need changes here when we introduce PSPSym | ||
| inst_RV_RV(INS_mov, REG_FPBASE, REG_ARG_0); |
There was a problem hiding this comment.
The first thing the funclet code does is to move ECX to EBP. Can the assembly helper set EBP just like all other argument registers instead?
There was a problem hiding this comment.
I tried that last time. However it seems like if EBP is restored in the assembly helper, native unwinder cannot unwind the helper correctly.
There was a problem hiding this comment.
I believe that if you change the assembly helper to use ESP frame instead of EBP one, you would not have this problem. It seems all you'd need to do is to remove the PROLOG_BEG / PROLOG_END and EPILOG_BEG / EPILOG_END (or replace them with dummy ESP_PROLOG_BEG, ... to make it kind of symmetric with EBP based frames) and add PROLOG_PUSH ebp at the beginning and EPILOG_POP ebp at the end. Then changing the EBP in the middle of the function won't cause trouble, since the unwinder would use ESP for unwinding.
There was a problem hiding this comment.
@janvorli I made a draft (I did not update the PR yet) for your suggestion but it does not work with nested exception handling like:
main {
try {
try {
Console.WriteLine("try, first throw");
throw new Exception("a");
} catch (Exception) {
Console.WriteLine("first catch, second throw");
throw new Exception("b");
}
} catch (Exception) {
Console.WriteLine("second catch");
return 2;
}
return 0;
}And the error message is:
try, first throw
first catch, second throw
terminate called after throwing an instance of 'PAL_SEHException'
It crashes when PAL_ThrowExceptionFromContext() throws C++ exception. IMHO it went wrong when unwinding the helper(CallEHFunclet) frame which is the caller of inner catch funclet, since there is no funclet(or its helper) when it handled first one.
Here is my draft wateret@632f2a6 . It would be nice if you have a look. Thanks.
There was a problem hiding this comment.
I set breakpoint on push eax in ThrowExceptionFromContextInternal. At this point the restored EBP and ESP seems correct.
(gdb) i r esp ebp
esp 0xffffb924 0xffffb924
ebp 0xffffcc38 0xffffcc38
There was a problem hiding this comment.
@wateret thank you. I think the esp at the push eax is actually not correct and it should be larger by 4. But we cannot do that, since we would overwrite something on the stack of the CallEHFunclet (or any other native method whose context is passed to ThrowExceptionFromContextInternal.
Can you please change the ThrowExceptionFromContextInternal calling convention to __fastcall and pass the PAL_SEHException * in ecx? I believe that should fix the issue.
There was a problem hiding this comment.
I simply tested by manually setting ESP larger by 4 on GDB and it works! and I made a patch for that. wateret@b5c4f45 is that what you meant? It passes the test case I gave.
BTW I was looking at ThrowExceptionFromContextInternal, is there some meaning for pushing Context_Eip before calling into ThrowExceptionHelper? I see that some kind of return address should be pushed since calling ThrowExceptionHelper with jmp instruction. However it looks like doing nothing since it will never return.
There was a problem hiding this comment.
@wateret yes, the patch is exactly what I've meant. It is possible that it will fix more issues in the coreclr tests, since the pattern in the test is quite common.
As for pushing the Context_Eip on the stack - that enables the stack unwinder to unwind from ThrowExceptionHelper to CallEHFunclet. It creates a stack frame looking as if the ThrowExceptionHelper was called by the CallEHFunclet, which allows the exception to be propagated from ThrowExceptionHelper to CallEHFunclet and further.
There was a problem hiding this comment.
@janvorli Thank you for all of your response. I'll update the PR after polishing the patch and run unittests
| // Add a padding for 16-byte alignment | ||
| inst_RV_IV(INS_sub, REG_SPBASE, 8, EA_PTRSIZE); | ||
|
|
||
| // We've modified EBP, but not really. Say that we haven't... |
There was a problem hiding this comment.
@wateret I don't understand the comment, could you please clarify it for me?
There was a problem hiding this comment.
@janvorli The comment and the next line is copied from AMD64 version of the function. What I thought was: we need that if we modify EBP in funclet.
|
|
||
| // Restore frame pointer | ||
| // TODO We will need changes here when we introduce PSPSym | ||
| inst_RV_RV(INS_mov, REG_FPBASE, REG_ARG_0); |
There was a problem hiding this comment.
I believe that if you change the assembly helper to use ESP frame instead of EBP one, you would not have this problem. It seems all you'd need to do is to remove the PROLOG_BEG / PROLOG_END and EPILOG_BEG / EPILOG_END (or replace them with dummy ESP_PROLOG_BEG, ... to make it kind of symmetric with EBP based frames) and add PROLOG_PUSH ebp at the beginning and EPILOG_POP ebp at the end. Then changing the EBP in the middle of the function won't cause trouble, since the unwinder would use ESP for unwinding.
Generate a simple EH funclet frame and Support SP-based stack unwinding for funclets.
|
@janvorli @jkotas @BruceForstall |
|
LGTM |
|
|
||
| .macro ESP_PROLOG_BEG | ||
| push ebp | ||
| mov ebp, esp |
There was a problem hiding this comment.
A nit - these instructions don't belong here since they are not necessary for ESP based frames in general. You have it just for your convenience in the CallEHFunclet, it would be possible refer to the arguments through ESP without setting the EBP.
So I would prefer leaving ESP_PROLOG_BEG and ESP_EPILOG_END empty and adding PROLOG_PUSH ebp mov ebp, esp / EPILOG_POP ebp to the actual asm helpers.
| #endif | ||
| } | ||
|
|
||
| static inline UINT_PTR *GetRegRestoreBase(PCONTEXT pContextRecord) |
There was a problem hiding this comment.
A nit - I would rename this to GetFirstNonVolatileRegisterAddress. It seems it better describes the purpose and the parameter to the CallEHFunclet that we pass the result of this function to is called pFirstNonVolReg, so it would match that too.
There was a problem hiding this comment.
That sounds better to me too.
| (!compiler->compLocallocUsed && (compiler->funCurrentFunc()->funKind == FUNC_ROOT))) | ||
| { | ||
| inst_RV_RV(INS_mov, REG_ARG_0, REG_SPBASE, TYP_I_IMPL); | ||
| #ifdef UNIX_X86_ABI |
There was a problem hiding this comment.
This change should not be needed anymore.
There was a problem hiding this comment.
We may need the following change since no argument is required for finally (as I understand):
#ifndef UNIX_X86_ABI
inst_RV_RV(INS_mov, REG_ARG_0, REG_SPBASE, TYP_I_IMPL);
#endif // !UNIX_X86_ABI
There was a problem hiding this comment.
Agree. (The extra mov ecx, rsp would be harmless - but it is better to not have it.)
| compiler->unwindBegProlog(); | ||
|
|
||
| // TODO Save callee-saved registers | ||
| inst_RV(INS_push, REG_FPBASE, TYP_REF); |
There was a problem hiding this comment.
You should not need to push FP for funclets because the funclet does not modify it anymore
There was a problem hiding this comment.
@jkotas In the changes to UnwindEbpDoubleAlignFrame(), there is a call to SetEbpLocation() that appears to point to this saved EBP. Is that not required? (Or maybe I'm reading it wrong?)
There was a problem hiding this comment.
This should not be needed. When we come to unwind the funclet, we have to have valid EBP and EBP location from wherever we came from. We should not need to create a duplicate copy here.
There was a problem hiding this comment.
@BruceForstall It seems that UnwindEbpDoubleAlignFrame should be revised along with genFuncletProlog.
|
@dotnet-bot test OSX x64 Checked Build and Test please |
|
@dotnet-bot test Ubuntu x64 Checked Build and Test please |
| // TODO We may need EBP restore sequence here if we introduce PSPSym | ||
|
|
||
| // Add a padding for 16-byte alignment | ||
| inst_RV_IV(INS_sub, REG_SPBASE, 8, EA_PTRSIZE); |
There was a problem hiding this comment.
Does the padding need to be different now?
There was a problem hiding this comment.
Yes, It should be 12. Sorry about that.
| { | ||
| baseSP = curESP + 8; // padding for 16byte stack alignment allocated in genFuncletProlog() | ||
|
|
||
| pContext->SetEbpLocation(PTR_DWORD(baseSP)); |
There was a problem hiding this comment.
This should be deleted because of EBP is not stored anymore
There was a problem hiding this comment.
I see that we need to remove that. Would it be OK to not doing pContext->SetEbpLocation() for unwinder?
cc: @parjong
There was a problem hiding this comment.
This line and baseSP += sizeof(TADDR) at line 3822 should be removed as ebp is not saved any more.
| // TODO Currently we assume that ESP of funclet frames is always fixed but actually it could change. | ||
| if (pCodeInfo->IsFunclet()) | ||
| { | ||
| baseSP = curESP + 8; // padding for 16byte stack alignment allocated in genFuncletProlog() |
There was a problem hiding this comment.
The padding size should be 12 instead of 8.
|
@jkotas Please take a look. I ran unit tests and the results are fine(same as the previous version that restors EBP in funclet). |
|
Merged. @BruceForstall The commits can be squashed as part of merge in the GitHub UI. No need to ask folks to do it. |
|
@wateret I have just found that there is an alignment issue in the CallEHFunclet. You push four registers and there is return address on the stack, so the stack is not aligned to 16 bytes before calling the funclet. I have hit this issue when testing a simple null reference test app. |
|
@janvorli Thank you for finding that. I'll fix that next Monday. |
- Generate a simple EH funclet frame and support SP-based stack unwinding for funclets. - Introduce assembly helpers : CallEHFunclet and CallEHFilterFunclet
- Generate a simple EH funclet frame and support SP-based stack unwinding for funclets. - Introduce assembly helpers : CallEHFunclet and CallEHFilterFunclet Commit migrated from dotnet/coreclr@77f2ad4
Generate a simple EH funclet frame and
Support SP-based stack unwinding for funclets.