Add linux_arm and windows_x86 to cDAC dump test platforms #125793
max-charlamb wants to merge 6 commits into dotnet:main
Tagging subscribers to this area: @dotnet/runtime-infrastructure
Pull request overview
Adds 32-bit platform coverage to the cDAC dump test Helix infrastructure by enabling windows_x86 and linux_arm as first-class platforms in the runtime diagnostics pipeline, ensuring dump generation/testing exercises more architecture combinations.
Changes:
- Expanded cdacDumpPlatforms defaults to include windows_x86 and linux_arm.
- Added a Helix platforms variable for helix_windows_x86.
- Updated cDAC Helix queue selection logic to route windows_x86 and linux_arm to appropriate queues.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| eng/pipelines/runtime-diagnostics.yml | Adds windows_x86 and linux_arm to the default cDAC dump test platform list. |
| eng/pipelines/helix-platforms.yml | Introduces a Helix queue variable for Windows x86 runs (via WoW64 on x64 machines). |
| eng/pipelines/cdac/prepare-cdac-helix-steps.yml | Extends the platform→Helix-queue mapping to support windows_x86 and linux_arm. |
| # Windows x86 (runs on x64 machines via WoW64) | ||
| - name: helix_windows_x86 | ||
| value: Windows.10.Amd64.Open | ||
Pull request overview
This PR expands cDAC dump-test Helix coverage to include two additional 32-bit target platforms (Windows x86 via WoW64 and Linux ARM32 via containerized Helix queues), and adds infrastructure to make dump artifacts easier to retrieve from Helix runs.
Changes:
- Add windows_x86 and linux_arm to the default cdacDumpPlatforms matrix in the runtime diagnostics pipeline.
- Extend cDAC Helix queue selection logic to map windows_x86 and linux_arm to appropriate Helix queues.
- Add dump tarball upload/download wiring for cDAC dump tests.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/native/managed/cdac/tests/DumpTests/cdac-dump-helix.proj | Adds tar creation/upload + attempts to download dumps.tar.gz from Helix results for dump tests/dumpgen. |
| eng/pipelines/runtime-diagnostics.yml | Adds windows_x86/linux_arm to cDAC dump platform defaults; quotes Helix queue param; publishes Helix results on failure for single-leg runs. |
| eng/pipelines/helix-platforms.yml | Introduces a helix_windows_x86 queue alias (WoW64 on x64 Windows 10 queue). |
| eng/pipelines/cdac/prepare-cdac-helix-steps.yml | Adds queue switch cases for windows_x86 and linux_arm. |
| <_TestCommand>%HELIX_CORRELATION_PAYLOAD%\dotnet.exe exec --runtimeconfig %HELIX_WORKITEM_PAYLOAD%\tests\Microsoft.Diagnostics.DataContractReader.DumpTests.runtimeconfig.json --depsfile %HELIX_WORKITEM_PAYLOAD%\tests\Microsoft.Diagnostics.DataContractReader.DumpTests.deps.json %HELIX_WORKITEM_PAYLOAD%\tests\xunit.console.dll %HELIX_WORKITEM_PAYLOAD%\tests\Microsoft.Diagnostics.DataContractReader.DumpTests.dll -xml testResults.xml -nologo</_TestCommand> | ||
| <_FullCommand>$(_DumpGenCommands) & $(_DumpInfoCommand) & $(_TestCommand)</_FullCommand> | ||
| <_TarCommand>tar -czf %HELIX_WORKITEM_UPLOAD_ROOT%\dumps.tar.gz -C %HELIX_WORKITEM_PAYLOAD%\dumps .</_TarCommand> | ||
| <_FullCommand>$(_DumpGenCommands) & $(_DumpInfoCommand) & $(_TestCommand) & $(_TarCommand)</_FullCommand> |
On Windows, appending $(_TarCommand) at the end of _FullCommand means the overall process exit code becomes tar's exit code (cmd uses the last command’s ERRORLEVEL). If the xunit run fails/crashes and the reporter doesn’t produce usable XML, this can mask the failure and incorrectly report the Helix work item as succeeded. Consider preserving the test exit code (capture ERRORLEVEL before running tar and exit /b with that code), and optionally only run tar when the test command fails.
Suggested change:
| - <_FullCommand>$(_DumpGenCommands) & $(_DumpInfoCommand) & $(_TestCommand) & $(_TarCommand)</_FullCommand> |
| + <_FullCommand>set "_testExitCode=0" & $(_DumpGenCommands) & $(_DumpInfoCommand) & $(_TestCommand) & set "_testExitCode=%ERRORLEVEL%" & $(_TarCommand) & exit /b %_testExitCode%</_FullCommand> |
| <_TestCommand>$HELIX_CORRELATION_PAYLOAD/dotnet exec --runtimeconfig $HELIX_WORKITEM_PAYLOAD/tests/Microsoft.Diagnostics.DataContractReader.DumpTests.runtimeconfig.json --depsfile $HELIX_WORKITEM_PAYLOAD/tests/Microsoft.Diagnostics.DataContractReader.DumpTests.deps.json $HELIX_WORKITEM_PAYLOAD/tests/xunit.console.dll $HELIX_WORKITEM_PAYLOAD/tests/Microsoft.Diagnostics.DataContractReader.DumpTests.dll -xml testResults.xml -nologo</_TestCommand> | ||
| <_FullCommand>$(_DumpGenCommands) && $(_DumpInfoCommand) && $(_TestCommand)</_FullCommand> | ||
| <_TarCommand>tar -czf $HELIX_WORKITEM_UPLOAD_ROOT/dumps.tar.gz -C $HELIX_WORKITEM_PAYLOAD/dumps .</_TarCommand> | ||
| <_FullCommand>$(_DumpGenCommands) && $(_DumpInfoCommand) && $(_TestCommand) ; $(_TarCommand)</_FullCommand> |
On Unix, _FullCommand ends with ; $(_TarCommand), so the work item’s exit status becomes the tar exit status (not the dumpgen/dumpinfo/xunit chain). This can hide failures (e.g., xunit crash/OOM, missing runtimeconfig, etc.) if tar succeeds. Capture/propagate the original exit code (e.g., save $? before tar and exit with it) so Helix reliably fails the work item when the test command fails.
Suggested change:
| - <_FullCommand>$(_DumpGenCommands) && $(_DumpInfoCommand) && $(_TestCommand) ; $(_TarCommand)</_FullCommand> |
| + <_FullCommand>$(_DumpGenCommands) && $(_DumpInfoCommand) && $(_TestCommand) ; exit_code=$? ; $(_TarCommand) ; exit $exit_code</_FullCommand> |
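To make the exit-status concern concrete, here is a minimal shell sketch of the two chaining styles discussed above. The functions run_tests and archive_dumps are hypothetical stand-ins for the xunit run and the tar step, and the status value 3 is arbitrary; this is an illustration under those assumptions, not the project's actual command line.

```shell
#!/bin/sh
# Hypothetical stand-ins (assumptions): the test run fails with status 3,
# the archive step succeeds.
run_tests() { return 3; }
archive_dumps() { return 0; }

# Naive chain: the status of the last command (the archive step) wins,
# so the test failure is lost.
naive_chain() {
    run_tests ; archive_dumps
}

# Preserving chain: save the test status, archive anyway, then report
# the saved status.
preserving_chain() {
    run_tests ; test_exit=$?
    archive_dumps
    return "$test_exit"
}

naive_status=0
naive_chain || naive_status=$?

preserved_status=0
preserving_chain || preserved_status=$?

echo "naive=$naive_status preserved=$preserved_status"
# prints: naive=0 preserved=3
```

The archive step still runs in both variants; only the second lets the caller see that the tests failed.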
| <!-- Test mode: generate dumps, run tests, then tar dumps for download on failure --> | ||
| <ItemGroup Condition="'$(DumpOnly)' != 'true'"> | ||
| <HelixWorkItem Include="CdacDumpTests"> | ||
| <PayloadDirectory>$(DumpTestsPayload)</PayloadDirectory> | ||
| <Command>$(_FullCommand)</Command> | ||
| <Timeout>$(WorkItemTimeout)</Timeout> | ||
| <DownloadFilesFromResults>dumps.tar.gz</DownloadFilesFromResults> | ||
| </HelixWorkItem> |
In test mode, the work item now always creates dumps.tar.gz and requests it via DownloadFilesFromResults. Given dump sizes (multi-GB full dumps), this will upload/download large artifacts even on successful runs, which can significantly increase Helix time, storage, and pipeline bandwidth. If the intent is “download on failure”, consider only creating/downloading the tar when tests fail (or gate it behind a property), and keep the default path lightweight for passing runs.
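One way to keep passing runs lightweight, sketched in shell under stated assumptions: archive_on_failure is a hypothetical helper (not part of the PR) that creates the archive only when the test status is non-zero and then returns that status unchanged. The directory layout and file names below are placeholders, and how Helix treats a requested DownloadFilesFromResults file that was never produced is not addressed in this thread.

```shell
#!/bin/sh
# Hypothetical helper (assumption): archive the dump directory only when
# the given test status is non-zero, and pass that status through.
archive_on_failure() {
    test_exit=$1
    dump_dir=$2
    archive=$3
    if [ "$test_exit" -ne 0 ]; then
        tar -czf "$archive" -C "$dump_dir" .
    fi
    return "$test_exit"
}

# Placeholder layout standing in for the payload and upload directories.
work=$(mktemp -d)
mkdir "$work/dumps" "$work/upload"
echo "placeholder dump" > "$work/dumps/example.dmp"

pass_status=0
archive_on_failure 0 "$work/dumps" "$work/upload/pass.tar.gz" || pass_status=$?

fail_status=0
archive_on_failure 1 "$work/dumps" "$work/upload/fail.tar.gz" || fail_status=$?

# Record which archives exist before cleaning up.
pass_archived=no
if [ -e "$work/upload/pass.tar.gz" ]; then pass_archived=yes; fi
fail_archived=no
if [ -e "$work/upload/fail.tar.gz" ]; then fail_archived=yes; fi

rm -rf "$work"
echo "pass: status=$pass_status archived=$pass_archived"
echo "fail: status=$fail_status archived=$fail_archived"
```

The archive is written outside the directory being archived, which avoids tar reading its own output.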
| $queue = switch ("$(osGroup)_$(archType)") { | ||
| "windows_x64" { "$(helix_windows_x64)" } | ||
| "windows_x86" { "$(helix_windows_x64)" } | ||
| "windows_arm64" { "$(helix_windows_arm64)" } | ||
| "linux_x64" { "$(helix_linux_x64_oldest)" } | ||
| "linux_arm64" { "$(helix_linux_arm64_oldest)" } | ||
| "linux_arm" { "$(helix_linux_arm32_oldest)" } |
The windows_x86 queue selection is currently wired to $(helix_windows_x64), which (via helix-platforms.yml) targets the Windows x64 latest queue. This doesn’t match the PR intent of running x86 on the dedicated helix_windows_x86 (Windows 10) queue and also leaves the newly-added helix_windows_x86 variable unused. Point the windows_x86 case at $(helix_windows_x86) (or the intended Windows 10 queue alias).
| <!-- Test mode: generate dumps, run tests, then tar dumps for download on failure --> | ||
| <ItemGroup Condition="'$(DumpOnly)' != 'true'"> | ||
| <HelixWorkItem Include="CdacDumpTests"> | ||
| <PayloadDirectory>$(DumpTestsPayload)</PayloadDirectory> | ||
| <Command>$(_FullCommand)</Command> | ||
| <Timeout>$(WorkItemTimeout)</Timeout> | ||
| <DownloadFilesFromResults>dumps.tar.gz</DownloadFilesFromResults> |
The updated comment says dumps are tarred “for download on failure”, but _TarCommand is executed unconditionally in the test-mode _FullCommand on both Windows and Unix. Either adjust the comment to reflect the unconditional tar, or gate tar creation to failure if that’s the intent.
bd65a1a to 39eeff7
| <_TestCommand>%HELIX_CORRELATION_PAYLOAD%\dotnet.exe exec --runtimeconfig %HELIX_WORKITEM_PAYLOAD%\tests\Microsoft.Diagnostics.DataContractReader.DumpTests.runtimeconfig.json --depsfile %HELIX_WORKITEM_PAYLOAD%\tests\Microsoft.Diagnostics.DataContractReader.DumpTests.deps.json %HELIX_WORKITEM_PAYLOAD%\tests\xunit.console.dll %HELIX_WORKITEM_PAYLOAD%\tests\Microsoft.Diagnostics.DataContractReader.DumpTests.dll -xml testResults.xml -nologo</_TestCommand> | ||
| <_FullCommand>$(_DumpGenCommands) & $(_DumpInfoCommand) & $(_TestCommand)</_FullCommand> | ||
| <_TarCommand>tar -czf %HELIX_WORKITEM_UPLOAD_ROOT%\dumps.tar.gz -C %HELIX_WORKITEM_PAYLOAD%\dumps .</_TarCommand> | ||
| <_FullCommand>$(_DumpGenCommands) & $(_DumpInfoCommand) & $(_TestCommand) & $(_TarCommand)</_FullCommand> |
In the Windows test-mode command chain, appending the tar step with & will overwrite the %ERRORLEVEL% from the xUnit run (and potentially from earlier steps) with tar's exit code. That can cause Helix to report the work item as successful even when tests fail. Consider capturing the test exit code before running tar and exiting with that original code after tar completes (while still tarring for diagnostics).
Suggested change:
| - <_FullCommand>$(_DumpGenCommands) & $(_DumpInfoCommand) & $(_TestCommand) & $(_TarCommand)</_FullCommand> |
| + <_FullCommand>cmd /v:ON /c "$(_DumpGenCommands) & $(_DumpInfoCommand) & $(_TestCommand) & set _testExitCode=!ERRORLEVEL! & $(_TarCommand) & exit /b !_testExitCode!"</_FullCommand> |
| <_TestCommand>$HELIX_CORRELATION_PAYLOAD/dotnet exec --runtimeconfig $HELIX_WORKITEM_PAYLOAD/tests/Microsoft.Diagnostics.DataContractReader.DumpTests.runtimeconfig.json --depsfile $HELIX_WORKITEM_PAYLOAD/tests/Microsoft.Diagnostics.DataContractReader.DumpTests.deps.json $HELIX_WORKITEM_PAYLOAD/tests/xunit.console.dll $HELIX_WORKITEM_PAYLOAD/tests/Microsoft.Diagnostics.DataContractReader.DumpTests.dll -xml testResults.xml -nologo</_TestCommand> | ||
| <_FullCommand>$(_DumpGenCommands) && $(_DumpInfoCommand) && $(_TestCommand)</_FullCommand> | ||
| <_TarCommand>tar -czf $HELIX_WORKITEM_UPLOAD_ROOT/dumps.tar.gz -C $HELIX_WORKITEM_PAYLOAD/dumps .</_TarCommand> | ||
| <_FullCommand>$(_DumpGenCommands) && $(_DumpInfoCommand) && $(_TestCommand) ; $(_TarCommand)</_FullCommand> |
In the Unix test-mode command chain, using ; $(_TarCommand) will make the overall shell command exit with tar's status, not the xUnit run's status. This can mask test failures if tar succeeds. Preserve the xUnit exit code (e.g., save $? before tarring and exit with it afterward) so Helix correctly fails the work item while still producing dumps.tar.gz.
Suggested change:
| - <_FullCommand>$(_DumpGenCommands) && $(_DumpInfoCommand) && $(_TestCommand) ; $(_TarCommand)</_FullCommand> |
| + <_FullCommand>$(_DumpGenCommands) && $(_DumpInfoCommand) && $(_TestCommand) ; test_exit=$? ; $(_TarCommand) ; exit $test_exit</_FullCommand> |
| $queue = switch ("$(osGroup)_$(archType)") { | ||
| "windows_x64" { "$(helix_windows_x64)" } | ||
| "windows_x86" { "$(helix_windows_x64)" } |
The new "windows_x86" case currently maps to $(helix_windows_x64) (which is the latest Windows x64 alias), even though this PR introduces a dedicated helix_windows_x86 variable (Windows 10 queue) and the PR description calls out Windows.10.Amd64.Open. Switch the case to use $(helix_windows_x86) so queue selection matches the intended x86 coverage/OS baseline.
Suggested change:
| - "windows_x86" { "$(helix_windows_x64)" } |
| + "windows_x86" { "$(helix_windows_x86)" } |
Add 32-bit platform coverage to the cDAC dump test Helix infrastructure:
- windows_x86: runs on Windows.10.Amd64.Open via WoW64
- linux_arm: runs on containerized ARM32 Helix queue (Debian on ARM64)

Also adds helix_windows_x86 variable to helix-platforms.yml for queue name centralization.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The containerized ARM32 queue name contains parentheses which bash interprets as subshell syntax. Quoting the /p:HelixTargetQueues argument prevents this.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
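The quoting issue can be reproduced with a small shell sketch. The queue name below is a hypothetical placeholder with the same shape (a parenthesized prefix), since the real containerized ARM32 queue name is not shown in this thread; printf stands in for the actual build invocation.

```shell
#!/bin/sh
# Hypothetical queue name (assumption); only the parentheses matter here.
queue='(Example.Arm32.Image)Example.Host.Queue@registry.example/image:tag'

# Unquoted: the inner shell sees a bare "(" in the argument and rejects
# the command line with a syntax error.
unquoted_status=0
sh -c "printf '%s\n' /p:HelixTargetQueues=$queue" >/dev/null 2>&1 || unquoted_status=$?

# Quoted: the whole argument reaches the command as a single word.
quoted_output=$(sh -c "printf '%s\n' \"/p:HelixTargetQueues=$queue\"")

echo "unquoted exit status: $unquoted_status"
echo "quoted argument: $quoted_output"
```

The unquoted form fails before the command ever runs, which matches the symptom the commit message describes.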
Tar and upload dumps from Helix back to the agent so they can be published as pipeline artifacts for local investigation. The tar runs unconditionally after tests, but the artifact is only published when the job fails.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

shouldContinueOnError causes Helix failures to report as SucceededWithIssues, not Failed. Use Agent.JobStatus check instead of failed() so dumps are published on any non-success.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The Windows.10.Amd64.Open queue may have different permissions for the DAC registry key needed for heap dumps. Use the same Windows.11.Amd64.Client.Open queue that windows_x64 uses.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Move the DisableAuxProviderSignatureCheck registry key logic into a shared props file imported by both DumpTests.targets and cdac-dump-helix.proj. Add /reg:32 to also set the key in the WoW64 registry view so x86 processes find it under WOW6432Node.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
39eeff7 to 0ba94b3
Adds 32-bit platform coverage to the cDAC dump test Helix infrastructure:
- windows_x86: runs on Windows.10.Amd64.Open via WoW64 (matching coreclr pattern)
- linux_arm: runs on a containerized ARM32 Helix queue (helix_linux_arm32_oldest, Debian on ARM64 hardware)

Changes
- eng/pipelines/helix-platforms.yml: Added helix_windows_x86 variable
- eng/pipelines/cdac/prepare-cdac-helix-steps.yml: Added queue switch cases for windows_x86 and linux_arm
- eng/pipelines/runtime-diagnostics.yml: Added linux_arm and windows_x86 to cdacDumpPlatforms defaults

Testing
CI will validate all 6 platforms: windows_x64, windows_x86, windows_arm64, linux_x64, linux_arm64, linux_arm.