Delete VirtualUnwindToFirstManagedCallFrame#123507
Delete VirtualUnwindToFirstManagedCallFrame#123507am11 wants to merge 10 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @mangod9 |
994c384 to
4c9de0e
Compare
|
Ported |
Why is that better than depending on RtlUnwind? We have hard dependency on Windows unwind format and I do not think you are trying to "fix" that, so depending on RtlUnwind should not be a problem. |
|
Also, unwinding by disassembling the instruction stream is dirty. There are newer versions of the Windows x64 unwind info format that avoid disassembling the instruction stream. I expect we will upgrade at some point. |
|
The goal was to match NativeAOT's approach of not relying on any OS unwinder for these paths. If we're okay with keeping |
|
I would also prefer relying on the RtlVirtualUnwind. I don't want to get into the problems we've hit couple of times on x86 in the past when a compiler change has broken unwinding for some native methods. |
src/coreclr/vm/eepolicy.cpp
Outdated
| TADDR framePtr = GetFP(pUnwindContext); | ||
| for (int i = 0; i < kMaxUnwindIterations && framePtr != 0 && !unwound; i++) | ||
| { | ||
| TADDR returnAddr = *(TADDR*)(framePtr + sizeof(void*)); |
There was a problem hiding this comment.
This will be a lot less reliable compared to what we have here today. It is common to have stale left-over addresses in the stack memory that are no longer part of the current call chain.
(I do not have good ideas about what to do about it.)
There was a problem hiding this comment.
It's a best-effort fallback for stackoverflow occurred in a frameless FCall. In case of broken chain, we see Stack overflow. without the stack trace.
Alternatively, we could have two fields in thread TADDR m_pFCallCallerIP, m_pFCallCallerSP; // 16-bytes, populated on FCall entry to make it fully deterministic. That may or may not overlap with the approach we were thinking for QCall #123482?
There was a problem hiding this comment.
I understand that this is best effort. I do not think we want to take the diagnostic experience regression compared to current state. We have been getting regular feedback about poor stackoverflow diagnostic experience before @janvorli implemented what we have currently.
In case of broken chain, we see Stack overflow. without the stack trace.
Or this will crash with seg fault by trying to walk broken chain, and you won't see any error.
Alternatively, we could have two fields in thread TADDR m_pFCallCallerIP, m_pFCallCallerSP; // 16-bytes, populated on FCall entry to make it fully deterministic.
This would cause perf regressions (performance is the only reason why FCalls exist), it would not be reliable (the stackwalker needs more than just IP/SP in general case), and it would not handle PInvokes with SuppressGCTransition that have the same problem as FCalls.
There was a problem hiding this comment.
I fixed based on this repro:
using System;
using System.Runtime.CompilerServices;
using System.Threading;
class Program
{
static void Main()
{
Console.WriteLine("FCALL Stack Overflow Test");
Console.WriteLine("=========================");
Console.WriteLine();
Console.WriteLine("This test will cause a stack overflow in an FCALL (GC.MaxGeneration)");
Console.WriteLine("The stack trace should show the managed call chain leading to the FCALL.");
Console.WriteLine();
Console.WriteLine("Make sure DOTNET_TestFCallStackOverflow=1 is set to enable the test!");
Console.WriteLine();
// Use a larger stack size to allow normal operations, but still small enough
// that the recursive FCALL will overflow quickly
Console.WriteLine("Starting test on a new thread with 256KB stack...");
Console.WriteLine();
Thread testThread = new Thread(RunTest, maxStackSize: 256 * 1024);
testThread.Start();
testThread.Join();
Console.WriteLine();
Console.WriteLine("Test completed (should not reach here if SO occurred).");
}
[MethodImpl(MethodImplOptions.NoInlining)]
static void RunTest()
{
Console.WriteLine("Test thread started");
Level1();
}
[MethodImpl(MethodImplOptions.NoInlining)]
static void Level1()
{
Console.WriteLine("Entering Level1");
Level2();
}
[MethodImpl(MethodImplOptions.NoInlining)]
static void Level2()
{
Console.WriteLine("Entering Level2");
Level3();
}
[MethodImpl(MethodImplOptions.NoInlining)]
static void Level3()
{
Console.WriteLine("Entering Level3 - about to call GC.MaxGeneration (FCALL)");
// GC.MaxGeneration is an FCALL - direct call to native with no Frame
// When DOTNET_TestFCallStackOverflow=1, the native FCALL will overflow the stack
// This tests whether HandleFatalStackOverflow produces proper stack traces for FCALLs
int maxGen = GC.MaxGeneration;
Console.WriteLine("GC.MaxGeneration returned: " + maxGen.ToString());
}
}with:
diff --git a/src/coreclr/vm/comutilnative.cpp b/src/coreclr/vm/comutilnative.cpp
index 27ea6fa75b4..69be4e717cb 100644
--- a/src/coreclr/vm/comutilnative.cpp
+++ b/src/coreclr/vm/comutilnative.cpp
@@ -872,10 +872,66 @@ extern "C" void QCALLTYPE GCInterface_WaitForPendingFinalizers()
**Arguments: None
**Exceptions: None
==============================================================================*/
+
+
+#if defined(_MSC_VER)
+ #define NOINLINE __declspec(noinline)
+#elif defined(__GNUC__) || defined(__clang__)
+ #define NOINLINE __attribute__((noinline))
+#else
+ #define NOINLINE
+#endif
+NOINLINE
+int FCallStackOverflowTestHelper(void) {
+ // ~542 MiB on the stack
+ volatile unsigned char buf[555512u * 1024u];
+
+ // Touch first/last and walk by page to ensure full probing/touch
+ size_t size = sizeof(buf);
+ buf[0] = 1;
+ buf[size - 1] = 1;
+
+ // Walk by 4096-byte pages to force page faults/page commits
+ volatile uint64_t sum = 0;
+ for (size_t i = 0; i < size; i += 4096) {
+ sum += buf[i];
+ }
+
+ return (int)(sum + buf[0] + buf[size - 1]);
+}
+
+
+void log(const char *msg) {
+ FILE *f = fopen("/tmp/foo.txt", "a");
+ if (!f) {
+ perror("fopen");
+ return;
+ }
+
+ fprintf(f, "%s\n", msg);
+ fclose(f);
+}
+
+
FCIMPL0(int, GCInterface::GetMaxGeneration)
{
FCALL_CONTRACT;
+ // TEST ONLY: Check for environment variable to trigger stack overflow test
+ // Set DOTNET_TestFCallStackOverflow=1 to enable
+ static int testMode = -1;
+ if (testMode == -1)
+ {
+ const char* env = getenv("DOTNET_TestFCallStackOverflow");
+ testMode = (env != nullptr && env[0] == '1') ? 1 : 0;
+ }
+ if (testMode == 1)
+ {
+ log("here we are\n");
+ return FCallStackOverflowTestHelper();
+ }
+
+ log("not really\n");
return(INT32)GCHeapUtilities::GetGCHeap()->GetMaxGeneration();
}
FCIMPLEND> $env:DOTNET_TestFCallStackOverflow=1; .\corerun.exe tester.dll
FCALL Stack Overflow Test
=========================
This test will cause a stack overflow in an FCALL (GC.MaxGeneration)
The stack trace should show the managed call chain leading to the FCALL.
Make sure DOTNET_TestFCallStackOverflow=1 is set to enable the test!
Starting test on a new thread with 256KB stack...
Test thread started
Entering Level1
Entering Level2
Entering Level3 - about to call GC.MaxGeneration (FCALL)
Stack overflow.
at Program.Level3()
at Program.Level2()
at Program.Level1()
at Program.RunTest()
at System.Threading.Thread.StartCallback()(matches main behavior)
before 64cc6cf I wasn't getting anything after Stack overflow. line.
src/coreclr/vm/eepolicy.cpp
Outdated
| else | ||
| { | ||
| Frame* pFrame = pThread->GetFrame(); | ||
| if (pFrame != FRAME_TOP && InlinedCallFrame::FrameHasActiveCall(pFrame)) |
There was a problem hiding this comment.
This condition will often cause the interesting part of the stacktrace to be omitted. For example:
using System;
using System.Runtime.InteropServices;
using System.Threading;
unsafe
{
delegate* unmanaged<void> pfn = &F;
pfn();
}
[UnmanagedCallersOnly]
static void F()
{
M().ToString();
}
static long M()
{
Thread.SpinWait(1);
long ret = M();
Thread.SpinWait(1);
return ret;
}Before this PR:
C:\runtime\artifacts\bin\coreclr\windows.x64.Release>corerun test.exe
Stack overflow.
at System.Threading.Thread.SpinWait(Int32)
Repeated 24115 times:
--------------------------------
at Program.<<Main>$>g__M|0_1()
--------------------------------
at Program.<<Main>$>g__F|0_0()
at Program.<Main>$(System.String[])
With this PR:
Stack overflow.
at Program.<Main>$(System.String[])
There was a problem hiding this comment.
Thanks, I wasn't able to repro it on win-x64, but on linux-x64 (wsl), it was reproduced. With 747051b, it matches main.
| const int kMaxUnwindIterations = 64; | ||
|
|
||
| #if defined(TARGET_ARM) | ||
| // On ARM32, the return address is in LR, not on the stack. |
There was a problem hiding this comment.
It seems arm64 would behave the same way as arm w.r.t. this.
| // GetFP() returns the appropriate register for each architecture: | ||
| // x64: RBP, ARM64: X29, ARM32: R7, LoongArch64/RISC-V64: FP | ||
| TADDR framePtr = GetFP(pUnwindContext); | ||
| for (int i = 0; i < kMaxUnwindIterations && framePtr != 0 && !unwound; i++) |
There was a problem hiding this comment.
It seems that there is still a case when this won't work as before - when the stack overflow occurs while storing the FP in a FCALL. Then the FP would contain the value from the caller, which doesn't have to be a frame pointer on Windows. And on Unix, it would contain the FP from the managed caller, so it seems this would skip one managed frame.
Maybe we can detect and handle it here? It seems that this would only happen when the stack overflow happens at the first instruction of a FCALL. dladdr could be used for checking if the address is the first instruction in a function.
There was a problem hiding this comment.
It seems that there is still a case when this won't work as before
I think there are many different cases where this won't work as before. You cannot really make assumption about the shape of the native code. Note that this path applies to P/Invokes with SuppressGCTransition as well that can be arbitrary used provided native code.
| } | ||
| } | ||
|
|
||
| if (unwound && ExecutionManager::IsManagedCode(GetIP(pUnwindContext))) |
There was a problem hiding this comment.
Do we need to call the ExecutionManager::IsManagedCode here? It seems that unwound is set only when we know the pUnwindContext points to managed frame.
| // This implies that we should have got a personality routine returned from the call to | ||
| // RtlVirtualUnwind above. | ||
| // | ||
| // However, if the ControlPC happened to be in the prolog or epilog of a managed method, |
There was a problem hiding this comment.
This part of the comment seems to be deleted by accident.
| // Move to caller's frame | ||
| framePtr = *(TADDR*)framePtr; | ||
| } | ||
| } |
Currently based on #123307 (shall rebase once that one is merged).