diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index d3c1221e713383..4355d406393b9b 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -14,6 +14,8 @@ When NOT running under CCA, skip the `code-review` skill if the user has stated Before making changes to a directory, search for `README.md` files in that directory and its parent directories up to the repository root. Read any you find — they contain conventions, patterns, and architectural context relevant to your work. +For specialized runtime subsystems, also check `docs/design/features/` for design documents that describe architectural constraints relevant to your work. + If the changes are intended to improve performance, or if they could negatively impact performance, use the `performance-benchmark` skill to validate the impact before completing. You MUST follow all code-formatting and naming conventions defined in [`.editorconfig`](/.editorconfig). @@ -121,7 +123,7 @@ Test projects are typically at: `tests/.Tests.csproj` or `tests/
  • .dll --- -## Adding new tests +## ⚠️ MANDATORY: Regression Test Verification + +**Before committing a regression test, you MUST confirm BOTH conditions:** +- the test fails against the unfixed code, and +- the test passes with the fix applied. + +The baseline build artifacts are the unfixed binaries for negative verification. Reuse that known-good pre-change build/output when proving the test fails without your fix; do not assume you need to reconstruct a separate "bad" build after editing code. + +⚠️ A first green run is **not** evidence that the test is valid. It may be passing because it never exercised the bug, because it ran against already-fixed binaries, or because it is only asserting existing behavior. When creating a regression test for a bug fix: -1. **Verify the test FAILS without the fix** — build and run against the unfixed code. -2. **Verify the test PASSES with the fix** — apply the fix, rebuild, and run again. -3. If the fix is not yet merged locally, manually apply the minimal changes from the PR/commit to verify. +1. **Verify the test FAILS without the fix** — run it against the baseline/unfixed build artifacts and confirm you are exercising the pre-fix binaries. +2. **Verify the test PASSES with the fix** — apply the fix, rebuild the affected components, and rerun the same test to confirm the behavior changed for the right reason. +3. **If the fix is not yet merged locally, manually apply the minimal changes from the PR/commit to verify** — do not skip negative verification just because the fix originated elsewhere. -Do not mark a regression test task as complete until both conditions are confirmed. +Do not mark a regression test task as complete until both conditions are explicitly confirmed. ## Troubleshooting @@ -221,6 +238,7 @@ Do not mark a regression test task as complete until both conditions are confirm |-------|----------| | "shared framework must be built" | Run baseline build: `./build.sh clr+libs -rc release` | | "testhost" missing / FileNotFoundException | Run baseline build first (Step 2 above) | +| crossgen2 + CoreLib changed | When a PR modifies both crossgen2 and CoreLib (e.g., adding new helper methods), do a full `./build.sh clr+libs` from the PR branch. Incremental crossgen2-only rebuilds will fail because the framework CoreLib won't have the new methods. | | Build timeout | Wait up to 40 min; only fail if no output for 5 min | | "Target does not exist" | Avoid specifying a target framework; the build will auto-select `$(NetCoreAppCurrent)` | | "0 test projects" after `build.sh -Test` | The test has `` > 0; add `-priority1` to the build command | diff --git a/.github/skills/code-review/SKILL.md b/.github/skills/code-review/SKILL.md index 7542b697368f0b..ff7d53061fdf5b 100644 --- a/.github/skills/code-review/SKILL.md +++ b/.github/skills/code-review/SKILL.md @@ -577,6 +577,8 @@ Flag discrepancies with the following severity: - **Consider NativeAOT parity for runtime changes.** When changing CoreCLR behavior, verify whether the same change is needed for NativeAOT. > "The code you have changed is not used on NativeAOT. Do we need the same change for NativeAOT as well?" — jkotas +- **R2R and NativeAOT P/Invoke IL emission must stay aligned.** When adding new IL stub logic to the R2R `PInvokeILEmitter` (e.g., platform-specific exception checks, new marshalling helpers), verify the same logic exists in the shared `Common/TypeSystem/IL/Stubs/PInvokeILEmitter.cs` used by NativeAOT. Divergence between the two emitters causes subtle behavioral differences between R2R and NativeAOT compilation modes. + - **Keep interpreter behavior consistent with the regular JIT.** Follow the same patterns, naming, error codes (`CORJIT_BADCODE`), and macros (`NO_WAY`). Use `FEATURE_INTERPRETER` guards. > "Should we call it NO_WAY like in a regular JIT? I think the more similar the interpreter JIT to the regular JIT, the better." — jkotas @@ -629,6 +631,10 @@ Flag discrepancies with the following severity: - **Follow naming conventions for regression test directories.** In `src/tests/Regressions/coreclr/`, use `GitHub_` for the directory and `test` for the test name. > "Please follow the pre-existing pattern for naming. The directory name should be GitHub_122933 and the test name should be test122933." — jkotas +- **R2R map validation must check `MethodWithGCInfo`, not `MethodFixupSignature`.** When a ReadyToRun test asserts a method was precompiled into the R2R image, the assertion must check for `MethodWithGCInfo` entries in the crossgen2 `--map` output. `MethodFixupSignature` and `DelayLoadHelperImport` entries are metadata references that exist regardless of whether the method body was actually compiled to native code. See `src/tests/readytorun/README.md` for details. + +- **R2R tests that change P/Invoke compilation behavior need both map validation and runtime verification.** A map test proves stubs were precompiled; a runtime test proves they work correctly. For example, adding R2R support for a new P/Invoke category should include both a `--map` check (proving precompilation) and verification that the precompiled stub's behavior matches the JIT-compiled version (e.g., exception propagation, marshalling correctness). + --- ## Documentation & Comments @@ -727,3 +733,11 @@ Flag discrepancies with the following severity: - **Prefer 4-byte `BOOL` for native interop marshalling.** Use `UnmanagedType.Bool`. Verify P/Invoke return types match native signatures exactly—mismatches may work on 64-bit but fail on 32-bit/WASM. > "bool marshalling has always been bug prone area. The 4-byte bool (UnmanagedType.Bool) tends to be the least bug-prone option." — jkotas + +- **R2R P/Invoke stubs that call CoreLib helpers require `--inputbubble`.** When crossgen2 generates IL stubs that reference CoreLib methods (e.g., `ThrowPendingExceptionObject` for ObjC exception checks, `SetLastError` support), the input assembly must be compiled with `--inputbubble` so crossgen2 can create cross-module fixups. Without it, `ModuleTokenResolver.GetModuleTokenForMethod` throws `NotImplementedException` for methods the input assembly doesn't already reference. See `docs/design/features/readytorun-pinvoke.md`. + +- **Verify `IsMarshallingRequired` prevents JIT from inlining past R2R P/Invoke stubs.** When a P/Invoke stub contains safety logic (exception checks, GC transitions), `PInvokeILStubMethodIL.IsMarshallingRequired` must return `true`. The JIT checks `pInvokeMarshalingRequired` to decide whether to use the precompiled stub or inline a raw native call (`GTF_CALL_UNMANAGED`). If `IsMarshallingRequired` is `false` for a blittable P/Invoke, the JIT will inline the raw call and silently skip the stub's safety logic. + +- **Separate marshalling requirements from platform-specific stub requirements.** `Marshaller.IsMarshallingRequired` should only return `true` when actual data marshalling is needed (non-blittable parameters, `SetLastError`, non-`PreserveSig`). Platform-specific stub requirements — like ObjC pending exception checks — are not marshalling concerns and should not be injected into `Marshaller.IsMarshallingRequired`. Instead, set `PInvokeILStubMethodIL.IsMarshallingRequired` directly. Mixing these concerns creates fragile transitive coupling: if the marshalling check is later cleaned up (because the P/Invoke is blittable), the platform-specific JIT inlining protection silently breaks. Check whether the safety invariant depends on a check being in the right abstraction layer or merely happens to work via a side effect in a different layer. + +- **P/Invoke platform detection strings must match `MarshalHelpers.cs` exactly.** `ShouldCheckForPendingException` in `MarshalHelpers.cs` matches specific module paths (e.g., `"/usr/lib/libobjc.dylib"`) and entrypoint prefixes (e.g., `"objc_msgSend"`). Using an alias like `"libobjc.dylib"` in a `DllImport` attribute will bypass the detection entirely, causing the platform-specific code path to never trigger. diff --git a/.github/skills/jit-regression-test/SKILL.md b/.github/skills/jit-regression-test/SKILL.md index e6cc8f82d58c50..adc6582bbdbf0b 100644 --- a/.github/skills/jit-regression-test/SKILL.md +++ b/.github/skills/jit-regression-test/SKILL.md @@ -61,7 +61,7 @@ public class Runtime_ - **License header**: Always include the standard .NET Foundation license header - **Class name**: Match the file name exactly (`Runtime_`) -- **Test method**: `[Fact]` attribute, named `TestEntryPoint()` +- **Test method**: `[Fact]` attribute, named `TestEntryPoint()` - **Minimize the reproduction**: Strip to the minimal case that triggers the bug - **Use `[MethodImpl(MethodImplOptions.NoInlining)]`** when preventing inlining is needed to reproduce @@ -122,6 +122,21 @@ If a custom .csproj file is needed, it should be located next to the test source ``` +## Step 5: ⚠️ Verify Test Correctness (Mandatory) + +A regression test is only valid if it **fails without the fix** and **passes with the fix**. A test that passes in both cases is worthless — it does not actually test the bug. + +1. **Verify the test FAILS without the fix:** + - If you have a baseline build from `main` (before the fix), use those artifacts to run the test. + - The test should fail (non-zero exit code, assertion failure, or incorrect output). + - If the test passes without the fix, the test is wrong — revisit your assertions. + +2. **Verify the test PASSES with the fix:** + - Build and run with the fix applied. + - The test should pass (exit code 100 for CoreCLR tests, or all assertions green for xUnit tests). + +Do not consider the test complete until both conditions are confirmed. A first green run is not evidence the test is valid — it may be a false positive caused by incorrect assertions or test inputs that don't exercise the bug. + ## Tips - **No .csproj needed for simple tests** — register the `.cs` file in `Regression_ro_2.csproj` instead. diff --git a/src/tests/readytorun/README.md b/src/tests/readytorun/README.md new file mode 100644 index 00000000000000..2dff43d26075c9 --- /dev/null +++ b/src/tests/readytorun/README.md @@ -0,0 +1,97 @@ +# ReadyToRun Tests + +These tests validate ReadyToRun (R2R) behavior in the CoreCLR test suite, especially crossgen2 compilation correctness, cross-module compilation behavior, platform-specific code generation, and determinism. The directory contains both harness-driven tests and tests that invoke crossgen2 manually when they need more control over inputs or emitted artifacts. + +## Test Patterns + +### Pattern 1: Built-in Harness (preferred for simple tests) + +Use the built-in harness when a test only needs the standard ReadyToRun pipeline for a single assembly. + +- Enable it with `true`. +- Add extra crossgen2 switches when needed with ``. +- Examples: + - `crossgen2/crossgen2smoke.csproj` for the basic harness-driven pattern + - `HardwareIntrinsics/` for harness-driven tests with extra instruction-set flags + - `GenericCycleDetection/` for harness-driven tests with extra cycle-detection flags +- When to use: single-assembly compilation, standard validation, or tests that only need a few extra crossgen2 switches. + +This pattern is the simplest to maintain because the normal test infrastructure handles the crossgen2 invocation and test execution. For example, `GenericCycleDetection/*.csproj` enables `AlwaysUseCrossGen2` and adds cycle-detection options through `CrossGen2TestExtraArguments`, while `HardwareIntrinsics/X86/*.csproj` appends instruction-set flags the same way. + +### Pattern 2: Manual crossgen2 Invocation + +Use manual crossgen2 invocation when the built-in harness is not expressive enough. + +- Disable automatic crossgen with `false`. +- Run custom compilation steps with `CLRTestBashPreCommands` (and usually matching `CLRTestBatchPreCommands` for Windows). +- When to use: need `--map`, `--inputbubble`, `--opt-cross-module`, multi-step compilation, or platform-specific scripting. +- Examples: + - `tests/mainv1.csproj` + - `determinism/crossgen2determinism.csproj` + - `ObjCPInvokeR2R/ObjCPInvokeR2R.csproj` + +This pattern is common when a test must stage input assemblies, compile multiple outputs in a specific order, compare multiple generated binaries, or run platform-specific shell logic before execution. + +## Map File Validation + +When using `--map` to validate R2R compilation: + +- **`MethodWithGCInfo`** entries represent actual compiled native code in the ReadyToRun image. This is the signal to use when asserting that a method was precompiled. +- **`MethodFixupSignature`** entries are metadata or signature references. They do not prove that the method body was compiled. +- **`DelayLoadHelperImport`** entries are call-site fixups. They are also metadata-related and are not proof of compilation. + +Always check for `MethodWithGCInfo` when asserting that a method was precompiled into the R2R image. `ObjCPInvokeR2R/ObjCPInvokeR2R.cs` and `tests/test.cs` both use this rule when validating map output. + +## Version Bubble and `--inputbubble` + +R2R compilation outside CoreLib has cross-module reference limitations. The main background is documented in [R2R P/Invoke Design](../../../docs/design/features/readytorun-pinvoke.md). + +- P/Invoke stubs that call CoreLib helpers, such as Objective-C pending-exception helpers or `SetLastError` support, require `--inputbubble` so crossgen2 can create fixups for CoreLib methods. +- Without `--inputbubble`, crossgen2 can only reference members that the input assembly already references through existing `MemberRef` tokens in IL metadata. + +In practice, this is why tests like `ObjCPInvokeR2R` use manual crossgen2 invocation with `--inputbubble`. + +## Platform Gating + +Use MSBuild conditions to keep ReadyToRun tests targeted to the environments they actually validate. + +- Use `true` for platform-specific tests. +- Example: `ObjCPInvokeR2R.csproj` uses `Condition="'$(TargetsOSX)' != 'true'"` because `objc_msgSend` behavior is only relevant on Apple platforms. +- When sanitizers are enabled, crossgen2 tests may need `true` or `DisableProjectBuild` to avoid unsupported infrastructure combinations. + +Other tests in this directory also gate on `$(RuntimeFlavor)`, target architecture, or 32-bit limitations. + +## P/Invoke Detection Tests + +When testing platform-specific P/Invoke behavior, validate the exact detection logic used by the type system. + +- Check the exact library path and entrypoint constants in `src/coreclr/tools/Common/TypeSystem/Interop/IL/MarshalHelpers.cs`. +- Example: Objective-C detection matches `"/usr/lib/libobjc.dylib"` exactly. +- Using only `"libobjc.dylib"` in a `DllImport` will not trigger the Objective-C-specific code path. + +For ObjC-related tests, also verify the expected entrypoint name such as `objc_msgSend`. + +## Building and Running + +Tests are built and run as part of the CoreCLR test suite: + +```bash +# Build all R2R tests +src/tests/build.sh checked -tree:src/tests/readytorun + +# Generate Core_Root layout (required for manual runs) +src/tests/build.sh -GenerateLayoutOnly x64 Release + +# Run a single test manually +export CORE_ROOT=$(pwd)/artifacts/tests/coreclr/..Release/Tests/Core_Root +cd artifacts/tests/coreclr/..Debug/readytorun// +$CORE_ROOT/corerun .dll +# Exit code 100 = pass +``` + +Manual-invocation tests often expect the generated `.map` files and any staged IL assemblies to live beside the test output, so run them from the built test directory, not from the source tree. + +## Related Documentation + +- [R2R P/Invoke Design](../../../docs/design/features/readytorun-pinvoke.md) - version bubble constraints and marshalling pregeneration +- [R2R Composite Format](../../../docs/design/features/readytorun-composite-format-design.md) - composite image design