Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

[x86/Linux] Generate conservative epilog#9602

Closed
parjong wants to merge 1 commit into
dotnet:masterfrom
parjong:fix/x86_converative_epilog
Closed

[x86/Linux] Generate conservative epilog#9602
parjong wants to merge 1 commit into
dotnet:masterfrom
parjong:fix/x86_converative_epilog

Conversation

@parjong
Copy link
Copy Markdown

@parjong parjong commented Feb 15, 2017

ESP may be variable when UNIX_X86_ABI is used, and incorrect ESP (especially resuming from exception) causes various issues such as JIT.IL_Conformance.Old.Conformance_Base.conv_ovf_r8_i test failure.

This commit attempts to insert SP-restore instruction in function epilog to fix such issues.

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 15, 2017

\CC @seanshpark

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 15, 2017

This PR seems to resolve #9272, too.

@parjong parjong changed the title [x86/Linux] Generate conservatie epilog [x86/Linux] Generate conservative epilog Feb 15, 2017
@jkotas
Copy link
Copy Markdown
Member

jkotas commented Feb 15, 2017

If you run with wrong SP, the GC reporting will be wrong. Fixing up the SP in epilog may hide the problem for the situation you are seeing; but I expect that it will lead to much harder to debug crashes later.

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 15, 2017

@jkotas I am also trying to revise unwinder (#9384), but it seems that stack-align padding will cause incorrect SP issue even with correct unwinder.

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 15, 2017

@jkotas I guessed that GC will refer each object via ebp instead of esp, but your comment suggests that there is a case that GC uses esp to refer each object.

Could you let me know where related implementation is? I would like to analyze that issue, but GC is too large to analyze. It will be very helpful if you could provide any hint for GC issue.

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Feb 15, 2017

This is not really in the GC. It is in the handshake between the JIT and CodeManager.

The JIT generates the GCInfo reporting against SP in some cases, and then the CodeManager does reporting against SP accordingly. It is done in multiple places, for example here: https://github.com/dotnet/coreclr/blob/master/src/vm/eetwain.cpp#L4488 .

Anyway, running with bad ESP and then try to compensate for it is a bad idea. It will create endless problems.

@stephentoub
Copy link
Copy Markdown
Member

@dotnet-bot test this please (Jenkins was down)

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 15, 2017

@dotnet-bot test Windows_NT x64 Release Priority 1 Build and Test please

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 15, 2017

@jkotas Here is an simple example of this issue:

PROLOG:
push ebp
mov esp, ebp
MAIN:
sub esp, PAD
push arg_N
...
push arg_0
call func
add  esp, PAD
EPILOG:
pop ebp
ret

The current unwinder sometimes provides the state before call func, but will be revised to provide the state after call func (ESP is correct in some sense).

When we resume execution from EPILIOG after handling a managed exception throw from func, however, EE will resume the execution using the state just after call func not after add esp, PAD (even after unwinder is revised).

Do you mean that unwinder should take care of such padding?

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Feb 16, 2017

When we resume execution from EPILIOG after handling a managed exception throw from func

What will happen if there is more code in the given method after the try/catch that you are resuming from? Won't you be potentially a lot of code with bad SP?

Do you mean that unwinder should take care of such padding?

I think so. Resume after caught exception has to use correct SP.

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 16, 2017

Hmm.. the code related with compiler->compLocallocUsed indicates that we will encounter similar issues when there is localloc, but seems to work currently. Could you let me know if there is some workaround related with localloc?

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Feb 16, 2017

The GC info is careful about using BP-relative or SP-relative offset for each value. The two are not interchangeable. Pending arguments are reported SP-relative, for example. Reporting them BP-relative would not work in methods with localloc or funclets.

but seems to work currently

Have you done any stress testing on it? The subtle GC reporting issues only show up once you start running large programs; or under targeted stress...

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 16, 2017

@jkotas x86/Linux is still under bring up, and GCstress test is not performed, yet (but it is definitely required).

@parjong
Copy link
Copy Markdown
Author

parjong commented Feb 16, 2017

I'll take a look at how GC will work, and re-open PR when ready.

@parjong parjong closed this Feb 16, 2017
@parjong parjong deleted the fix/x86_converative_epilog branch March 8, 2017 11:38
@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants