27 changes: 21 additions & 6 deletions docs/design/datacontracts/ExecutionManager.md
@@ -161,6 +161,10 @@ Data descriptors used:
| `RealCodeHeader` | `DebugInfo` | Pointer to the DebugInfo |
| `RealCodeHeader` | `GCInfo` | Pointer to the GCInfo encoding |
| `RealCodeHeader` | `EHInfo` | Pointer to the `EE_ILEXCEPTION` containing exception clauses |
| `InterpreterRealCodeHeader` | `MethodDesc` | Pointer to the corresponding `MethodDesc` for interpreter code |

@janvorli @kotlarmilos - Please take a look at these contracts (the markdown files) and see what you think. Every type/field/algorithm included here becomes part of the interpreter's contract with diagnostic tools. Changing the data structures is possible, but it's a breaking change that requires developers to update their tools, so we'd only expect to do it rarely. Make sure the things documented here are things you'd expect to be reasonably stable over time, or please suggest alternatives.

Thanks!

Want to bump @janvorli @kotlarmilos @jkotas, planning to merge this soon.

I am sorry, I've missed the comment from @noahfalk last week. I'll look at it today.

| `InterpreterRealCodeHeader` | `DebugInfo` | Pointer to the DebugInfo for interpreter code |
| `InterpreterRealCodeHeader` | `GCInfo` | Pointer to the GCInfo encoding for interpreter code |
| `InterpreterRealCodeHeader` | `JitEHInfo` | Pointer to the `EE_ILEXCEPTION` containing exception clauses for interpreter code |
Comment thread
noahfalk marked this conversation as resolved.
| `Module` | `ReadyToRunInfo` | Pointer to the `ReadyToRunInfo` for the module |
| `ReadyToRunInfo` | `ReadyToRunHeader` | Pointer to the ReadyToRunHeader |
| `ReadyToRunInfo` | `CompositeInfo` | Pointer to composite R2R info - or itself for non-composite |
@@ -259,9 +263,11 @@ The bulk of the work is done by the `GetCodeBlockHandle` API that maps a code po
}
```

There are two JIT managers: the "EE JitManager" for jitted code and "R2R JitManager" for ReadyToRun code.
There are three JIT managers: the "EE JitManager" for jitted code, the "Interpreter JitManager" for interpreted code, and the "R2R JitManager" for ReadyToRun code.

The EE JitManager `GetMethodInfo` implements the nibble map lookup, summarized below, followed by returning the `RealCodeHeader` data:
The EE JitManager and Interpreter JitManager both use the same nibble map lookup to find method code.
The only difference is which code header type is read: the EE JitManager reads a `RealCodeHeader` while the Interpreter JitManager reads an `InterpreterRealCodeHeader`.
Their shared `GetMethodInfo` is summarized below:

```csharp
bool GetMethodInfo(TargetPointer rangeSection, TargetCodePointer jittedCodeAddress, [NotNullWhen(true)] out CodeBlock? info)
@@ -280,8 +286,10 @@ bool GetMethodInfo(TargetPointer rangeSection, TargetCodePointer jittedCodeAddre
return false;

TargetPointer codeHeaderAddress = Target.ReadPointer(codeHeaderIndirect);
TargetPointer methodDesc = Target.ReadPointer(codeHeaderAddress + /* RealCodeHeader::MethodDesc offset */);
info = new CodeBlock(jittedCodeAddress, realCodeHeader.MethodDesc, relativeOffset);
// EE JitManager: read RealCodeHeader at codeHeaderAddress
// Interpreter JitManager: read InterpreterRealCodeHeader at codeHeaderAddress
TargetPointer methodDesc = // read MethodDesc field from the appropriate code header
info = new CodeBlock(jittedCodeAddress, methodDesc, relativeOffset);
return true;
}
```
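
As a rough illustration of the shared lookup, the sketch below reads `MethodDesc` through a layout descriptor so that one routine can serve both managers. `FakeTarget`, `HeaderLayout`, `CodeHeaderReader`, and every address and offset used here are hypothetical stand-ins for illustration; they are not cDAC types or real field offsets.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical layouts: in this sketch only the MethodDesc offset differs.
var jitLayout = new HeaderLayout("RealCodeHeader", MethodDescOffset: 0x18);
var interpLayout = new HeaderLayout("InterpreterRealCodeHeader", MethodDescOffset: 0x00);

var target = new FakeTarget();
target.Write(0x1000 + 0x18, 0xAAAA); // pretend RealCodeHeader lives at 0x1000
target.Write(0x2000 + 0x00, 0xBBBB); // pretend InterpreterRealCodeHeader lives at 0x2000

// Same routine, different layout, depending on which JIT manager owns the range.
Console.WriteLine($"{CodeHeaderReader.ReadMethodDesc(target, 0x1000, jitLayout):X}");
Console.WriteLine($"{CodeHeaderReader.ReadMethodDesc(target, 0x2000, interpLayout):X}");

// Describes where the MethodDesc pointer sits inside a code header type.
record HeaderLayout(string TypeName, ulong MethodDescOffset);

// Minimal stand-in for target memory: pointer-sized values keyed by address.
sealed class FakeTarget
{
    private readonly Dictionary<ulong, ulong> _memory = new();

    public void Write(ulong address, ulong value) => _memory[address] = value;

    public ulong ReadPointer(ulong address) =>
        _memory.TryGetValue(address, out ulong value)
            ? value
            : throw new InvalidOperationException($"Unmapped address 0x{address:X}");
}

static class CodeHeaderReader
{
    // Reads the MethodDesc field from whichever header type the layout describes.
    public static ulong ReadMethodDesc(FakeTarget target, ulong codeHeaderAddress, HeaderLayout layout)
        => target.ReadPointer(codeHeaderAddress + layout.MethodDescOffset);
}
```

Passing the layout in as data keeps the nibble map lookup itself identical for both managers, which matches the "only difference is which code header type is read" description above.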
@@ -384,13 +392,14 @@ public override void GetMethodRegionInfo(RangeSection rangeSection, TargetCodePo

```

`GetJITType` returns the JIT type by finding the JIT manager for the data range containing the relevant code block. We return `Jit` for the `EEJitManager`, `R2R` for the `R2RJitManager`, and `Unknown` for any other value.
`GetJITType` returns the JIT type by finding the JIT manager for the data range containing the relevant code block. We return `Jit` for the `EEJitManager`, `R2R` for the `R2RJitManager`, `Interpreter` for the `InterpreterJitManager`, and `Unknown` for any other value.
```csharp
public enum JitType : uint
{
Unknown = 0,
Jit = 1,
R2R = 2
R2R = 2,
Interpreter = 3
};
```
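
The mapping described above might be sketched as a single switch. This is an assumption-laden illustration: `JitManagerKind` and `JitTypeMapper` are hypothetical names introduced here, and the real implementation identifies the manager from the range section rather than from an enum.

```csharp
using System;

Console.WriteLine(JitTypeMapper.FromManager(JitManagerKind.EEJitManager));
Console.WriteLine(JitTypeMapper.FromManager(JitManagerKind.InterpreterJitManager));
Console.WriteLine(JitTypeMapper.FromManager(JitManagerKind.None));

// Mirrors the contract enum shown above.
enum JitType : uint
{
    Unknown = 0,
    Jit = 1,
    R2R = 2,
    Interpreter = 3
}

// Hypothetical tag for "which JIT manager owns this range".
enum JitManagerKind
{
    None,
    EEJitManager,
    R2RJitManager,
    InterpreterJitManager
}

static class JitTypeMapper
{
    // Any manager the reader does not recognize maps to Unknown.
    public static JitType FromManager(JitManagerKind kind) => kind switch
    {
        JitManagerKind.EEJitManager => JitType.Jit,
        JitManagerKind.R2RJitManager => JitType.R2R,
        JitManagerKind.InterpreterJitManager => JitType.Interpreter,
        _ => JitType.Unknown
    };
}
```

Because `Interpreter = 3` is appended rather than inserted, existing numeric values of `JitType` are unchanged for tools that already consume them.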
`NonVirtualEntry2MethodDesc` attempts to find a method desc from an entrypoint. If portable entrypoints are enabled, we attempt to read the entrypoint data structure to find the method table. We also attempt to find the method desc from a precode stub. Finally, we attempt to find the method desc using `GetMethodInfo` as described above.
@@ -466,6 +475,8 @@ The `GetMethodDesc`, `GetStartAddress`, and `GetRelativeOffset` APIs extract fie

* For R2R code (`ReadyToRunJitManager`), a list of sorted `RUNTIME_FUNCTION` are stored on the module's `ReadyToRunInfo`. This is accessed as described above for `GetMethodInfo`. Again, the relevant `RUNTIME_FUNCTION` is found by binary searching the list based on IP.

* For interpreted code (`InterpreterJitManager`), there is no native unwind info. `GetUnwindInfo` returns null.

Unwind info (`RUNTIME_FUNCTION`) use relative addressing. For managed code, these values are relative to the start of the code's containing range in the RangeSectionMap (described below). This could be the beginning of a `CodeHeap` for jitted code or the base address of the loaded image for ReadyToRun code.
`GetUnwindInfoBaseAddress` finds this base address for a given `CodeBlockHandle`.

@@ -476,6 +487,8 @@ Unwind info (`RUNTIME_FUNCTION`) use relative addressing. For managed code, thes
* For R2R code (`ReadyToRunJitManager`) the `DebugInfo` is stored as part of the R2R image. The relevant `ReadyToRunInfo` stores a pointer to an `ImageDataDirectory` representing the `DebugInfo` directory. Read the `VirtualAddress` of this data directory as a `NativeArray` containing the `DebugInfos`. To find the specific `DebugInfo`, index into the array using the `index` of the beginning of the R2R function as found like in `GetMethodInfo` above. This yields an `offset` value relative to the image base. Read the first variable length uint at `imageBase + offset`, `lookBack`. If `lookBack != 0`, return `imageBase + offset - lookBack`. Otherwise return `offset + size of reading lookback`.
For R2R images, `hasFlagByte` is always `false`.

* For interpreted code (`InterpreterJitManager`), a pointer to the `DebugInfo` is stored on the `InterpreterRealCodeHeader` which is accessed in the same way as the EE JitManager's `GetMethodInfo` (nibble map lookup followed by code header read). `hasFlagByte` is always `false`.

`IExecutionManager.GetGCInfo` gets a pointer to the relevant GCInfo for a `CodeBlockHandle`. The ExecutionManager delegates to the JitManager implementations as the GCInfo is stored differently on jitted and R2R code.

* For jitted code (`EEJitManager`) a pointer to the `GCInfo` is stored on the `RealCodeHeader` which is accessed in the same way as `GetMethodInfo` described above. This can simply be returned as is. The `GCInfoVersion` is defined by the runtime global `GCInfoVersion`.
@@ -484,6 +497,8 @@ For R2R images, `hasFlagByte` is always `false`.
* The `GCInfoVersion` of R2R code is mapped from the R2R MajorVersion and MinorVersion which is read from the ReadyToRunHeader which itself is read from the ReadyToRunInfo (can be found as in GetMethodInfo). The current GCInfoVersion mapping is:
* MajorVersion >= 11 and MajorVersion < 15 => 4

* For interpreted code (`InterpreterJitManager`), a pointer to the `GCInfo` is stored on the `InterpreterRealCodeHeader`, accessed via nibble map lookup as with the EE JitManager. The `GCInfoVersion` is defined by the runtime global `GCInfoVersion`. The GC info is decoded using interpreter-specific decoding (`DecodeInterpreterGCInfo`).


`IExecutionManager.GetFuncletStartAddress` finds the start of the code block's funclet. This will be different from the method's start address (`GetStartAddress`) if the current code block is inside a funclet. To find the funclet start address, we get the unwind info corresponding to the code block using `IExecutionManager.GetUnwindInfo`. We then parse the unwind info to find the begin address (relative to the unwind info base address) and return the unwind info base address + unwind info begin address.

54 changes: 54 additions & 0 deletions docs/design/datacontracts/PrecodeStubs.md
@@ -7,6 +7,11 @@ This contract provides support for examining [precode](../coreclr/botr/method-de
```csharp
// Gets a pointer to the MethodDesc for a given stub entrypoint
TargetPointer GetMethodDescFromStubAddress(TargetCodePointer entryPoint);

// If the code pointer is an interpreter precode, returns the actual interpreter
// code address (ByteCodeAddr). Otherwise returns the original address unchanged.
// Mirrors GetInterpreterCodeFromInterpreterPrecodeIfPresent in native code (precode.cpp).
TargetCodePointer GetInterpreterCodeFromInterpreterPrecodeIfPresent(TargetCodePointer entryPoint);
```

## Version 1, 2, and 3
@@ -40,6 +45,10 @@ Data descriptors used:
| StubPrecodeData | Type | precise sort of stub precode |
| FixupPrecodeData | MethodDesc | pointer to the MethodDesc associated with this fixup precode |
| ThisPtrRetBufPrecodeData | MethodDesc | pointer to the MethodDesc associated with the ThisPtrRetBufPrecode (Version 2 only) |
| InterpreterPrecodeData | ByteCodeAddr | pointer to the `InterpByteCodeStart` for the interpreter bytecode (Version 3 only) |
| InterpreterPrecodeData | Type | precode sort byte identifying this as an interpreter precode (Version 3 only) |
| InterpByteCodeStart | Method | pointer to the `InterpMethod` associated with the bytecode |
| InterpMethod | MethodDesc | pointer to the MethodDesc for the interpreted method |

arm32 note: the `CodePointerToInstrPointerMask` is used to convert IP values that may include an arm Thumb bit (for example extracted from disassembling a call instruction or from a snapshot of the registers) into an address. On other architectures applying the mask is a no-op.

@@ -259,6 +268,22 @@ After the initial precode type is determined, for stub precodes a refined precod
}
}

// Version 3 only: resolves MethodDesc for interpreter precodes by following
// the InterpreterPrecodeData → InterpByteCodeStart → InterpMethod → MethodDesc chain.
internal sealed class InterpreterPrecode : ValidPrecode
{
internal InterpreterPrecode(TargetPointer instrPointer) : base(instrPointer, KnownPrecodeType.Interpreter) { }

internal override TargetPointer GetMethodDesc(Target target, Data.PrecodeMachineDescriptor precodeMachineDescriptor)
{
TargetPointer dataAddr = InstrPointer + precodeMachineDescriptor.StubCodePageSize;
Data.InterpreterPrecodeData precodeData = target.ProcessedData.GetOrAdd<Data.InterpreterPrecodeData>(dataAddr);
Data.InterpByteCodeStart byteCodeStart = target.ProcessedData.GetOrAdd<Data.InterpByteCodeStart>(precodeData.ByteCodeAddr);
Data.InterpMethod interpMethod = target.ProcessedData.GetOrAdd<Data.InterpMethod>(byteCodeStart.Method);
return interpMethod.MethodDesc;
}
}

internal TargetPointer CodePointerReadableInstrPointer(TargetCodePointer codePointer)
{
// Mask off the thumb bit, if we're on arm32, to get the actual instruction pointer
@@ -282,6 +307,8 @@ After the initial precode type is determined, for stub precodes a refined precod
return new PInvokeImportPrecode(instrPointer);
case KnownPrecodeType.ThisPtrRetBuf:
return new ThisPtrRetBufPrecode(instrPointer);
case KnownPrecodeType.Interpreter:
return new InterpreterPrecode(instrPointer);
default:
break;
}
@@ -295,4 +322,31 @@ After the initial precode type is determined, for stub precodes a refined precod

return precode.GetMethodDesc(_target, MachineDescriptor);
}

// Returns the interpreter bytecode address if the entry point is an interpreter precode,
// otherwise returns the original entry point unchanged.
// This method never throws - on any failure, the original address is returned.
TargetCodePointer IPrecodeStubs.GetInterpreterCodeFromInterpreterPrecodeIfPresent(TargetCodePointer entryPoint)
{
try
{
TargetPointer instrPointer = CodePointerReadableInstrPointer(entryPoint);
if (!IsAlignedInstrPointer(instrPointer))
return entryPoint;

if (TryGetKnownPrecodeType(instrPointer) is not KnownPrecodeType.Interpreter)
return entryPoint;

TargetPointer dataAddr = instrPointer + MachineDescriptor.StubCodePageSize;
Data.InterpreterPrecodeData precodeData = // read InterpreterPrecodeData at dataAddr
if (precodeData.ByteCodeAddr == TargetPointer.Null)
return entryPoint;

return new TargetCodePointer(precodeData.ByteCodeAddr);
}
catch
{
return entryPoint;
}
}
```
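
The "return the original address on any failure" behavior can be seen in isolation with a small stand-alone model. This is only a sketch under stated assumptions: `PrecodeSketch`, the dictionary used as target memory, and the `0x1000` page size are hypothetical, and the alignment and precode-type checks from the code above are deliberately omitted.

```csharp
using System;
using System.Collections.Generic;

// Fake target memory: data-page address -> ByteCodeAddr value.
var memory = new Dictionary<ulong, ulong>
{
    [0x5000 + PrecodeSketch.StubCodePageSize] = 0x9000, // precode at 0x5000 with bytecode
    [0x6000 + PrecodeSketch.StubCodePageSize] = 0,      // precode at 0x6000, ByteCodeAddr is null
};

Console.WriteLine($"{PrecodeSketch.Resolve(memory, 0x5000):X}"); // resolved bytecode address
Console.WriteLine($"{PrecodeSketch.Resolve(memory, 0x6000):X}"); // null ByteCodeAddr: unchanged
Console.WriteLine($"{PrecodeSketch.Resolve(memory, 0x7000):X}"); // unreadable data: unchanged

static class PrecodeSketch
{
    // Hypothetical value; the real one comes from the precode machine descriptor.
    public const ulong StubCodePageSize = 0x1000;

    public static ulong Resolve(IReadOnlyDictionary<ulong, ulong> memory, ulong entryPoint)
    {
        try
        {
            // The precode's data lives one stub code page after its instructions.
            ulong dataAddr = entryPoint + StubCodePageSize;
            ulong byteCodeAddr = memory[dataAddr]; // throws when the address is unmapped
            return byteCodeAddr == 0 ? entryPoint : byteCodeAddr;
        }
        catch (KeyNotFoundException)
        {
            // Any read failure falls back to the caller's original address.
            return entryPoint;
        }
    }
}
```

Callers can therefore apply the function unconditionally to an entry point, since non-interpreter addresses pass through untouched.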
32 changes: 32 additions & 0 deletions docs/design/datacontracts/StackWalk.md
@@ -73,6 +73,11 @@ This contract depends on the following descriptors:
| `HijackArgs` (amd64) | `CalleeSavedRegisters` | CalleeSavedRegisters data structure |
| `HijackArgs` (amd64 Windows) | `Rsp` | Saved stack pointer |
| `HijackArgs` (arm/arm64/x86) | For each register `r` saved in HijackArgs, `r` | Register names associated with stored register values |
| `InterpreterFrame` | `TopInterpMethodContextFrame` | Pointer to the InterpreterFrame's top `InterpMethodContextFrame` |
| `InterpMethodContextFrame` | `StartIp` | Pointer to the `InterpByteCodeStart` for resolving the MethodDesc |
| `InterpMethodContextFrame` | `ParentPtr` | Pointer to the parent `InterpMethodContextFrame` in the call chain (null for outermost frame) |

  • You'll also need the NextPtr for locating the real topmost frame.
  • It seems you'd also need the `Ip`: `StartIp` is the IP of the start of the method, while `Ip` is the actual instruction pointer.

| `InterpMethodContextFrame` | `Ip` | The actual instruction pointer within the method (null if frame is inactive/reusable) |
| `InterpMethodContextFrame` | `NextPtr` | Pointer to the next `InterpMethodContextFrame` toward the top of the stack |
| `ArgumentRegisters` (arm) | For each register `r` saved in ArgumentRegisters, `r` | Register names associated with stored register values |
| `CalleeSavedRegisters` | For each callee saved register `r`, `r` | Register names associated with stored register values |
| `TailCallFrame` (x86 Windows) | `CalleeSavedRegisters` | CalleeSavedRegisters data structure |
@@ -119,6 +124,33 @@ In reality, the actual algorithm is a little more complex for two reasons. It re
If the address of the `frame` is less than the caller's stack pointer, **return the current context**, pop the top Frame from `frameStack`, and **go to step 3**.
3. Unwind `currContext` using the Windows style unwinder. **Return the current context**.

#### Interpreter Frame Expansion

When the stack walker encounters an `InterpreterFrame`, it expands it into multiple logical frames by walking the `InterpMethodContextFrame` chain. The runtime maintains a linked list of `InterpMethodContextFrame` nodes representing each interpreted method currently on the call stack within a single `InterpreterFrame`.

The `TopInterpMethodContextFrame` field is an approximate hint that may point to a stale frame during dump or native debugging. The actual top frame must be resolved using the `Ip` and `NextPtr`/`ParentPtr` fields, replicating `InterpreterFrame::GetTopInterpMethodContextFrame()`:

- If the hinted frame's `Ip` is non-null (active): seek upward via `NextPtr` while the next frame's `Ip` is also non-null.
- If the hinted frame's `Ip` is null (inactive/reusable): seek downward via `ParentPtr` until finding a frame with non-null `Ip`.

Only frames with non-null `Ip` (active frames) are yielded during the walk. Each node's `ParentPtr` points to its caller.
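
The two seek rules can be sketched over an in-memory model of the chain. This is a minimal illustration that follows the description above rather than the native source: `ContextFrame` and `TopFrameResolver` are hypothetical types, and an `Ip` of `0` stands in for a null `Ip`.

```csharp
using System;

// Chain A <- B <- C (all active), plus one inactive, reusable frame D above C.
var a = new ContextFrame("A", ip: 0x10);
var b = new ContextFrame("B", ip: 0x20) { Parent = a };
var c = new ContextFrame("C", ip: 0x30) { Parent = b };
var d = new ContextFrame("D", ip: 0) { Parent = c };
a.Next = b; b.Next = c; c.Next = d;

Console.WriteLine(TopFrameResolver.Resolve(a)?.Name); // stale hint below the real top
Console.WriteLine(TopFrameResolver.Resolve(d)?.Name); // stale hint above the real top

sealed class ContextFrame
{
    public ContextFrame(string name, ulong ip) { Name = name; Ip = ip; }

    public string Name { get; }
    public ulong Ip { get; }                     // 0 models a null Ip (inactive frame)
    public ContextFrame? Parent { get; set; }    // models ParentPtr (caller)
    public ContextFrame? Next { get; set; }      // models NextPtr (toward the top)
}

static class TopFrameResolver
{
    public static ContextFrame? Resolve(ContextFrame? hint)
    {
        if (hint is null)
            return null;

        ContextFrame? frame = hint;
        if (frame.Ip != 0)
        {
            // Active hint: move up while the next frame is also active.
            while (frame.Next is not null && frame.Next.Ip != 0)
                frame = frame.Next;
        }
        else
        {
            // Inactive hint: move down until an active frame is found (or none exists).
            while (frame is not null && frame.Ip == 0)
                frame = frame.Parent;
        }
        return frame;
    }
}
```

Both calls in the usage lines resolve to frame `C`, whichever side of the real top the hint landed on.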

For each active `InterpMethodContextFrame` in the chain, the stack walker yields a separate frame. The `MethodDesc` for each frame is resolved by following:
`InterpMethodContextFrame.StartIp` -> `InterpByteCodeStart.Method` -> `InterpMethod.MethodDesc`

```
InterpreterFrame
└-> TopInterpMethodContextFrame (hint, may be stale)
└-> ResolveTop() -> InterpMethodContextFrame (method C, Ip != null)
└-> ParentPtr -> InterpMethodContextFrame (method B, Ip != null)
└-> ParentPtr -> InterpMethodContextFrame (method A, Ip != null)
└-> ParentPtr -> null
```

This produces three frames in order: C, B, A (innermost to outermost).
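
A hedged sketch of that walk, starting from an already resolved top frame: it follows `ParentPtr`, skips inactive frames, and resolves each `MethodDesc` through the `StartIp` -> `InterpByteCodeStart.Method` -> `InterpMethod.MethodDesc` chain. The record types, `FrameWalker`, and the numeric values are hypothetical models, not the cDAC data classes.

```csharp
using System;
using System.Collections.Generic;

var a = NewFrame(0xA000, 0x10, null);
var b = NewFrame(0xB000, 0x20, a);
var c = NewFrame(0xC000, 0x30, b);

// Walk from the resolved top frame (method C) down to the outermost frame.
foreach (ulong methodDesc in FrameWalker.Walk(c))
    Console.WriteLine($"0x{methodDesc:X}");

// Helper that builds a frame whose StartIp chain leads to the given MethodDesc.
static ContextFrame NewFrame(ulong methodDesc, ulong ip, ContextFrame? parent) =>
    new(new InterpByteCodeStart(new InterpMethod(methodDesc)), ip, parent);

sealed record InterpMethod(ulong MethodDesc);
sealed record InterpByteCodeStart(InterpMethod Method);
sealed record ContextFrame(InterpByteCodeStart StartIp, ulong Ip, ContextFrame? Parent);

static class FrameWalker
{
    // Yields one MethodDesc per active frame, innermost first.
    public static IEnumerable<ulong> Walk(ContextFrame? top)
    {
        for (ContextFrame? frame = top; frame is not null; frame = frame.Parent)
        {
            if (frame.Ip == 0)
                continue; // an Ip of 0 models a null Ip: inactive frames are not yielded

            yield return frame.StartIp.Method.MethodDesc;
        }
    }
}
```

The usage lines print the three hypothetical `MethodDesc` values in the order C, B, A.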

When the stack walk starts with an explicit context in interpreted code (e.g., from a debugger breakpoint), the interpreted frames are already yielded from the initial context as frameless frames. When the walker subsequently encounters the corresponding `InterpreterFrame`, it skips expanding it to prevent the same frames from being walked twice.


#### Simple Example

4 changes: 4 additions & 0 deletions src/coreclr/vm/datadescriptor/datadescriptor.h
@@ -24,6 +24,10 @@

#include "configure.h"

#ifdef FEATURE_INTERPRETER
#include "interpexec.h"
#endif // FEATURE_INTERPRETER

#include "virtualcallstub.h"
#include "../debug/ee/debugger.h"
#include "patchpointinfo.h"
43 changes: 43 additions & 0 deletions src/coreclr/vm/datadescriptor/datadescriptor.inc
@@ -831,6 +831,42 @@ CDAC_TYPE_FIELD(RealCodeHeader, T_UINT32, NumUnwindInfos, offsetof(RealCodeHeade
CDAC_TYPE_FIELD(RealCodeHeader, TYPE(RuntimeFunction), UnwindInfos, offsetof(RealCodeHeader, unwindInfos))
CDAC_TYPE_END(RealCodeHeader)

#ifdef FEATURE_INTERPRETER
CDAC_TYPE_BEGIN(InterpreterRealCodeHeader)
CDAC_TYPE_INDETERMINATE(InterpreterRealCodeHeader)
CDAC_TYPE_FIELD(InterpreterRealCodeHeader, T_POINTER, MethodDesc, offsetof(InterpreterRealCodeHeader, phdrMDesc))
CDAC_TYPE_FIELD(InterpreterRealCodeHeader, T_POINTER, DebugInfo, offsetof(InterpreterRealCodeHeader, phdrDebugInfo))
CDAC_TYPE_FIELD(InterpreterRealCodeHeader, T_POINTER, GCInfo, offsetof(InterpreterRealCodeHeader, phdrJitGCInfo))
CDAC_TYPE_FIELD(InterpreterRealCodeHeader, T_POINTER, JitEHInfo, offsetof(InterpreterRealCodeHeader, phdrJitEHInfo))
CDAC_TYPE_END(InterpreterRealCodeHeader)

#ifndef FEATURE_PORTABLE_ENTRYPOINTS
CDAC_TYPE_BEGIN(InterpreterPrecodeData)
CDAC_TYPE_INDETERMINATE(InterpreterPrecodeData)
CDAC_TYPE_FIELD(InterpreterPrecodeData, T_POINTER, ByteCodeAddr, offsetof(::InterpreterPrecodeData, ByteCodeAddr))
CDAC_TYPE_FIELD(InterpreterPrecodeData, T_UINT8, Type, offsetof(::InterpreterPrecodeData, Type))
CDAC_TYPE_END(InterpreterPrecodeData)
#endif // !FEATURE_PORTABLE_ENTRYPOINTS

CDAC_TYPE_BEGIN(InterpByteCodeStart)
CDAC_TYPE_INDETERMINATE(InterpByteCodeStart)
CDAC_TYPE_FIELD(InterpByteCodeStart, T_POINTER, Method, offsetof(InterpByteCodeStart, Method))
CDAC_TYPE_END(InterpByteCodeStart)

CDAC_TYPE_BEGIN(InterpMethod)
CDAC_TYPE_INDETERMINATE(InterpMethod)
CDAC_TYPE_FIELD(InterpMethod, T_POINTER, MethodDesc, offsetof(InterpMethod, methodHnd))
CDAC_TYPE_END(InterpMethod)

CDAC_TYPE_BEGIN(InterpMethodContextFrame)
CDAC_TYPE_INDETERMINATE(InterpMethodContextFrame)
CDAC_TYPE_FIELD(InterpMethodContextFrame, T_POINTER, StartIp, offsetof(InterpMethodContextFrame, startIp))
CDAC_TYPE_FIELD(InterpMethodContextFrame, T_POINTER, ParentPtr, offsetof(InterpMethodContextFrame, pParent))
CDAC_TYPE_FIELD(InterpMethodContextFrame, T_POINTER, Ip, offsetof(InterpMethodContextFrame, ip))
CDAC_TYPE_FIELD(InterpMethodContextFrame, T_POINTER, NextPtr, offsetof(InterpMethodContextFrame, pNext))
CDAC_TYPE_END(InterpMethodContextFrame)
#endif // FEATURE_INTERPRETER

CDAC_TYPE_BEGIN(EEExceptionClause)
CDAC_TYPE_SIZE(sizeof(EE_ILEXCEPTION_CLAUSE))
CDAC_TYPE_FIELD(EEExceptionClause, T_UINT32, Flags, offsetof(EE_ILEXCEPTION_CLAUSE, Flags))
@@ -962,6 +998,13 @@ CDAC_TYPE_FIELD(FramedMethodFrame, T_POINTER, TransitionBlockPtr, cdac_data<Fram
CDAC_TYPE_FIELD(FramedMethodFrame, T_POINTER, MethodDescPtr, cdac_data<FramedMethodFrame>::MethodDescPtr)
CDAC_TYPE_END(FramedMethodFrame)

#ifdef FEATURE_INTERPRETER
CDAC_TYPE_BEGIN(InterpreterFrame)
CDAC_TYPE_INDETERMINATE(InterpreterFrame)
CDAC_TYPE_FIELD(InterpreterFrame, T_POINTER, TopInterpMethodContextFrame, cdac_data<InterpreterFrame>::TopInterpMethodContextFrame)
CDAC_TYPE_END(InterpreterFrame)
#endif // FEATURE_INTERPRETER

CDAC_TYPE_BEGIN(TransitionBlock)
CDAC_TYPE_SIZE(sizeof(TransitionBlock))
CDAC_TYPE_FIELD(TransitionBlock, T_POINTER, ReturnAddress, offsetof(TransitionBlock, m_ReturnAddress))
8 changes: 8 additions & 0 deletions src/coreclr/vm/frames.h
@@ -2315,6 +2315,14 @@ class InterpreterFrame : public FramedMethodFrame
TADDR m_SP;
#endif // TARGET_WASM
PTR_Object m_continuation;

friend struct cdac_data<InterpreterFrame>;
};

template<>
struct cdac_data<InterpreterFrame>
{
static constexpr size_t TopInterpMethodContextFrame = offsetof(InterpreterFrame, m_pTopInterpMethodContextFrame);
};

#endif // FEATURE_INTERPRETER
@@ -78,7 +78,8 @@ public enum JitType : uint
{
Unknown = 0,
Jit = 1,
R2R = 2
R2R = 2,
Interpreter = 3
}

public interface IExecutionManager : IContract