Build coreclr targetGeneric tests separately for CI #34790

sdmaclea · 2020-04-09T22:05:16Z

Fixes #34508

This PR change splits the building of targetGeneric and targetSpecific tests in ci.yml

Add testBuildPhase parameter to build-test-job
- Modify names of artifacts and logs for targetGeneric build
Add testBuildPhased parameter to run-test job
- Add DependsOn as needed
- Add additional test download as needed
Modifies CI.yml to call build-test-job twice. Once withtestBuildPhase:targetGeneric and once with testBuildPhase:targetSpecific
Modifies CI.yml to call run-test-job with testBuildPhased:true

ghost · 2020-04-09T22:05:21Z

Tagging @ViktorHofer as an area owner

jashook · 2020-04-09T22:07:48Z

Will be able to take a look tomorrow morning.

sdmaclea · 2020-04-13T17:52:57Z

I am waiting on a review on this before investing more in this.

The current implementation requires manually enabling this at every call/instantiation of build-test-job and run-test-job
I have not touched all the call/instantiation points.
It appears I have not touched the most important one as I don't see the separation on the runs in this PR.

Is this the correct approach?
Is there a better approach?
How should I test?

jkoritzinsky · 2020-04-16T22:54:10Z

To test this in CI, you need to change the runtime.yml pipeline to use it. The ci.yml pipeline you changed is the CoreCLR outerloop job, so it doesn't run in PR. (It should definitely also have this change though, so don't undo it.)

eng/pipelines/runtime.yml

sdmaclea · 2020-04-20T22:58:09Z

The Windows tests runs are failing because they are trying to download CoreCLRManagedTestArtifacts_AnyOS_AnyCPU_checked_LibDebug.zip, but the artifact is actually saved from OSX as CoreCLRManagedTestArtifacts_AnyOS_AnyCPU_checked_LibDebug.tar.gz.

trylek · 2020-04-20T23:18:57Z

Hmm, that is annoying. I recall that @jashook mentioned some limitations regarding what compression tools are available on the various build machines; I guess we need to figure out whether it's easier to use zip on OSX or tar.gz on Windows.

sdmaclea · 2020-04-20T23:31:58Z

Looks like I have some control, if I chose to not use the default.

          tarCompression: $(tarCompression)
          archiveExtension: $(archiveExtension)

sdmaclea · 2020-04-21T15:46:58Z

Tests are now running.
There are 45 test failures on this run.

A lot of them are COM Interop tests. I would guess they should be marked TargetSpecific, because they are only supported on Windows. I'll take a look today. /cc @jkoritzinsky

There were also Libraries test run failures (some on one Mono leg, and some on one CoreCLR leg). It is not clear to me whether these are related failures or not. I will assume not until I clean up the CoreCLR test legs.

sdmaclea · 2020-04-22T15:29:54Z

The PR was clean last night. So this change is now ready for review.
At the moment there are merge conflicts, so I will rebase to fix the merge conflicts.

edit...
Looks like I introduced issues with the rebase. Working on fixing them...

+ Build coreclr targetGeneric tests separately for CI + Build coreclr targetGeneric tests separately in runtime.yml + Distinguish generic build based on Libraries config + Build OSX release libraries when CoreCLR changed + Always use tar.gz compression for generic tests

trylek

Thanks Steve, this looks like a solid step forward but I believe a few things are still missing:

At the bottom of your "PR conversation" I see that your runtime pipeline didn't even start. I believe this is due to incorrect handling of testPriority in the two templates (build-test-job, run-test-job) as I described in another review comment.
I suspect we're missing an execution of build-test in run-test-job to build the target-specific tests in phased mode.
I cannot find the counterpart changes to the src/coreclr/build-test.cmd / sh scripts implementing the new targetSpecific and targetGeneric switches. I'm not seeing any such logic in the scripts as they stand today.

As a side note, Vlad mentioned in your yesterday meeting that he was hitting some issues with the item-based test traits and mentioned that we might need to rethink this bit and instead use project properties to declare them. Once you manage to fix the issues per my bullets 1-3, be prepared that you may hit difficulties in the test split because of that.

Thanks

Tomas

eng/pipelines/coreclr/templates/build-test-job.yml

eng/pipelines/coreclr/ci.yml

eng/pipelines/common/templates/runtimes/build-test-job.yml

eng/pipelines/common/templates/runtimes/run-test-job.yml

sdmaclea · 2020-04-22T22:33:30Z

@trylek Thanks for the review.

As a side note, Vlad mentioned in your yesterday meeting that he was hitting some issues with the item-based test traits and mentioned that we might need to rethink this bit and instead use project properties to declare them. Once you manage to fix the issues per my bullets 1-3, be prepared that you may hit difficulties in the test split because of that

#34658 Added properties because that was how existing p0/p1 filtering was being done.

eng/pipelines/runtime.yml

eng/pipelines/coreclr/ci.yml

sdmaclea · 2020-04-23T00:16:28Z

I guess. I need to change code base on the Mono file relocation too...

sdmaclea · 2020-04-23T00:24:54Z

The YAML is parsing again. I would expect the test run to pass.

Do we need to trigger a build somewhere else to the the changes in ci.yml?

trylek · 2020-04-23T00:27:22Z

https://dev.azure.com/dnceng/public/_build?definitionId=655&_a=summary

runtime-coreclr outerloop

sdmaclea · 2020-04-23T14:33:09Z

Innerloop looks good to me minus spurious failures. I don't have permissions at the moment to re-run failed jobs due to dotnet/core-eng#9655
Outerloop passed.
https://dev.azure.com/dnceng/public/_build/results?buildId=612835

Edit -- retriggered failed innerloop jobs....

trylek

Ship it!

trylek · 2020-04-23T15:12:58Z

And if for whatever reason we haven't managed to scare you off further infra work yet, I think the following items are up for grabs:

Do we really need the pair "CLRTestNeedsTarget" / "OsSpecific" everywhere? Perhaps one of them (probably the former) might suffice.
Get rid of managedBuildOsGroup/Subgroup as it should be superfluous with this change.
Monitor perf implications of this change and decide whether we want to move OsSpecific test build to run-test-job and revert the pipeline changes you've made or whether we want to keep the separate set of build jobs for the OS specific tests in which case we should ultimately update all our existing pipelines.

Thanks for your great work, tenacity, and patience!

Tomas

sdmaclea · 2020-04-23T15:19:42Z

Do we really need the pair "CLRTestNeedsTarget" / "OsSpecific" everywhere? Perhaps one of them (probably the former) might suffice.

I added CLRTestNeedsTarget everywhere to allow this work to be implemented while @VSadov figured out generic filtering. I knew how to filter properties, but not filter on TraitTag items. I have been assuming @VSadov will remove these when he implements generic filtering.

Get rid of managedBuildOsGroup/Subgroup as it should be superfluous with this change.

I can take this on.

Monitor perf implications of this change and decide whether we want to move OsSpecific test build to run-test-job and revert the pipeline changes you've made or whether we want to keep the separate set of build jobs for the OS specific tests in which case we should ultimately update all our existing pipelines.

I don't really know how to do this monitoring. If someone monitors it, I'd be happy to refactor this.

I was also looking at #33128 & #33067

trylek · 2020-04-23T15:30:17Z

All your responses sound great to me, thank you!

safern · 2020-04-23T22:02:17Z

eng/pipelines/runtime.yml

+- template: /eng/pipelines/common/platform-matrix.yml
+  parameters:
+    jobTemplate: /eng/pipelines/libraries/build-job.yml
+    buildConfig: Release


@sdmaclea do we need to build libraries in Release here? We already build it on Debug: https://github.com/dotnet/runtime/blob/master/eng/pipelines/runtime.yml#L382

Do we need the release build?

Also, this will introduce a duplicate job of the libraries on rolling builds because there's already a libraries job for Release in OSX, because https://github.com/dotnet/runtime/blob/master/eng/pipelines/runtime.yml#L382 will be built on release. See, variables.debugOnPrReleaseOnRolling.

@trylek this might be the reason of the break

I agree this is definitely at least part of the problem, thanks for pointing that out. I'm investigating whether it's the only one.

The duplicate job was prevented by the conditional logic below

condition: >- and( ne(variables['debugOnPrReleaseOnRolling'], 'Release'), or( eq(dependencies.checkout.outputs['SetPathVars_coreclr.containsChange'], true), eq(variables['isFullMatrix'], true)))

…4790)" This reverts commit 9188a58.

…#35371) This reverts commit 9188a58.

* Build CoreClr tests separately * Build coreclr targetGeneric tests separately for CI * Build coreclr targetGeneric tests separately in runtime.yml * Distinguish generic build based on Libraries config * Build OSX release libraries when CoreCLR changed * Always use tar.gz compression for generic tests * Mark all CoreCLR Interop/COM projects OsSpecific

* Build coreclr targetGeneric tests separately for CI (#34790) * Build CoreClr tests separately * Build coreclr targetGeneric tests separately for CI * Build coreclr targetGeneric tests separately in runtime.yml * Distinguish generic build based on Libraries config * Build OSX release libraries when CoreCLR changed * Always use tar.gz compression for generic tests * Mark all CoreCLR Interop/COM projects OsSpecific * Assume debug/release libraries doesn't matter during test build * Add comments

sdmaclea requested review from jashook and trylek April 9, 2020 22:05

sdmaclea self-assigned this Apr 9, 2020

Dotnet-GitSync-Bot added the area-Infrastructure label Apr 9, 2020

sdmaclea added area-Infrastructure-coreclr and removed area-Infrastructure labels Apr 9, 2020

jashook requested a review from jkoritzinsky April 14, 2020 19:59

sdmaclea force-pushed the TestDecouple branch 6 times, most recently from d54ee5f to b008eb3 Compare April 17, 2020 19:06

sdmaclea changed the title ~~WIP Build coreclr targetGeneric tests separately for CI~~ Build coreclr targetGeneric tests separately for CI Apr 17, 2020

sdmaclea commented Apr 17, 2020

View reviewed changes

eng/pipelines/runtime.yml Outdated Show resolved Hide resolved

sdmaclea force-pushed the TestDecouple branch 3 times, most recently from d28b60a to 77e3cab Compare April 21, 2020 02:10

sdmaclea force-pushed the TestDecouple branch from 6cd3672 to a1d857d Compare April 22, 2020 16:59

sdmaclea added 2 commits April 22, 2020 15:11

Mark all CoreCLR Interop/COM projects OsSpecific

ba1d783

Use parameters.testPriority

314577c

sdmaclea force-pushed the TestDecouple branch from a1d857d to 314577c Compare April 22, 2020 19:20

trylek requested changes Apr 22, 2020

View reviewed changes

sdmaclea commented Apr 22, 2020

View reviewed changes

eng/pipelines/runtime.yml Show resolved Hide resolved

sdmaclea commented Apr 22, 2020

View reviewed changes

eng/pipelines/coreclr/ci.yml Show resolved Hide resolved

Revert flawed testPriority YAML

d31236c

sdmaclea requested a review from trylek April 23, 2020 00:15

Fix paths

71da436

trylek approved these changes Apr 23, 2020

View reviewed changes

sdmaclea merged commit 9188a58 into dotnet:master Apr 23, 2020

sdmaclea deleted the TestDecouple branch April 23, 2020 18:29

jaredpar mentioned this pull request Apr 23, 2020

OSX machines are de-provisioned during CI / PR runs leading to failures #34472

Closed

safern reviewed Apr 23, 2020

View reviewed changes

sdmaclea added a commit to sdmaclea/runtime that referenced this pull request Apr 23, 2020

Revert "Build coreclr targetGeneric tests separately for CI (dotnet#3…

57dc0f2

…4790)" This reverts commit 9188a58.

trylek pushed a commit that referenced this pull request Apr 23, 2020

Revert "Build coreclr targetGeneric tests separately for CI (#34790)" (…

cce6883

…#35371) This reverts commit 9188a58.

ghost locked as resolved and limited conversation to collaborators Dec 9, 2020

Build coreclr targetGeneric tests separately for CI #34790

Build coreclr targetGeneric tests separately for CI #34790

Uh oh!

Conversation

sdmaclea commented Apr 9, 2020

Uh oh!

ghost commented Apr 9, 2020

Uh oh!

jashook commented Apr 9, 2020

Uh oh!

sdmaclea commented Apr 13, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jkoritzinsky commented Apr 16, 2020

Uh oh!

Uh oh!

sdmaclea commented Apr 20, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

trylek commented Apr 20, 2020

Uh oh!

sdmaclea commented Apr 20, 2020

Uh oh!

sdmaclea commented Apr 21, 2020

Uh oh!

sdmaclea commented Apr 22, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

trylek left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sdmaclea commented Apr 22, 2020

Uh oh!

Uh oh!

Uh oh!

sdmaclea commented Apr 23, 2020

Uh oh!

sdmaclea commented Apr 23, 2020

Uh oh!

trylek commented Apr 23, 2020

Uh oh!

sdmaclea commented Apr 23, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

trylek left a comment

Choose a reason for hiding this comment

Uh oh!

trylek commented Apr 23, 2020

Uh oh!

sdmaclea commented Apr 23, 2020

Uh oh!

trylek commented Apr 23, 2020

Uh oh!

safern Apr 23, 2020

Choose a reason for hiding this comment

Uh oh!

trylek Apr 23, 2020

Choose a reason for hiding this comment

Uh oh!

sdmaclea Apr 23, 2020

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

sdmaclea commented Apr 13, 2020 •

edited

Loading

sdmaclea commented Apr 20, 2020 •

edited

Loading

sdmaclea commented Apr 22, 2020 •

edited

Loading

sdmaclea commented Apr 23, 2020 •

edited

Loading