Skip to content

Conversation

@sdmaclea
Copy link
Contributor

@sdmaclea sdmaclea commented Apr 9, 2020

Fixes #34508

This PR change splits the building of targetGeneric and targetSpecific tests in ci.yml

  • Add testBuildPhase parameter to build-test-job
    • Modify names of artifacts and logs for targetGeneric build
  • Add testBuildPhased parameter to run-test job
    • Add DependsOn as needed
    • Add additional test download as needed
  • Modifies CI.yml to call build-test-job twice. Once withtestBuildPhase:targetGeneric and once with testBuildPhase:targetSpecific
  • Modifies CI.yml to call run-test-job with testBuildPhased:true

@ghost
Copy link

ghost commented Apr 9, 2020

Tagging @ViktorHofer as an area owner

@jashook
Copy link
Contributor

jashook commented Apr 9, 2020

Will be able to take a look tomorrow morning.

@sdmaclea
Copy link
Contributor Author

sdmaclea commented Apr 13, 2020

I am waiting on a review on this before investing more in this.

  • The current implementation requires manually enabling this at every call/instantiation of build-test-job and run-test-job
  • I have not touched all the call/instantiation points.
  • It appears I have not touched the most important one as I don't see the separation on the runs in this PR.

Is this the correct approach?
Is there a better approach?
How should I test?

@jashook jashook requested a review from jkoritzinsky April 14, 2020 19:59
@jkoritzinsky
Copy link
Member

To test this in CI, you need to change the runtime.yml pipeline to use it. The ci.yml pipeline you changed is the CoreCLR outerloop job, so it doesn't run in PR. (It should definitely also have this change though, so don't undo it.)

@sdmaclea sdmaclea force-pushed the TestDecouple branch 6 times, most recently from d54ee5f to b008eb3 Compare April 17, 2020 19:06
@sdmaclea sdmaclea changed the title WIP Build coreclr targetGeneric tests separately for CI Build coreclr targetGeneric tests separately for CI Apr 17, 2020
@sdmaclea
Copy link
Contributor Author

sdmaclea commented Apr 20, 2020

The Windows tests runs are failing because they are trying to download CoreCLRManagedTestArtifacts_AnyOS_AnyCPU_checked_LibDebug.zip, but the artifact is actually saved from OSX as CoreCLRManagedTestArtifacts_AnyOS_AnyCPU_checked_LibDebug.tar.gz.

@trylek
Copy link
Member

trylek commented Apr 20, 2020

Hmm, that is annoying. I recall that @jashook mentioned some limitations regarding what compression tools are available on the various build machines; I guess we need to figure out whether it's easier to use zip on OSX or tar.gz on Windows.

@sdmaclea
Copy link
Contributor Author

Looks like I have some control, if I chose to not use the default.

          tarCompression: $(tarCompression)
          archiveExtension: $(archiveExtension)

@sdmaclea sdmaclea force-pushed the TestDecouple branch 3 times, most recently from d28b60a to 77e3cab Compare April 21, 2020 02:10
@sdmaclea
Copy link
Contributor Author

Tests are now running.
There are 45 test failures on this run.

A lot of them are COM Interop tests. I would guess they should be marked TargetSpecific, because they are only supported on Windows. I'll take a look today. /cc @jkoritzinsky

There were also Libraries test run failures (some on one Mono leg, and some on one CoreCLR leg). It is not clear to me whether these are related failures or not. I will assume not until I clean up the CoreCLR test legs.

@sdmaclea
Copy link
Contributor Author

sdmaclea commented Apr 22, 2020

The PR was clean last night. So this change is now ready for review.
At the moment there are merge conflicts, so I will rebase to fix the merge conflicts.

edit...
Looks like I introduced issues with the rebase. Working on fixing them...

+ Build coreclr targetGeneric tests separately for CI
+ Build coreclr targetGeneric tests separately in runtime.yml
+ Distinguish generic build based on Libraries config
+ Build OSX release libraries when CoreCLR changed
+ Always use tar.gz compression for generic tests
Copy link
Member

@trylek trylek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Steve, this looks like a solid step forward but I believe a few things are still missing:

  1. At the bottom of your "PR conversation" I see that your runtime pipeline didn't even start. I believe this is due to incorrect handling of testPriority in the two templates (build-test-job, run-test-job) as I described in another review comment.

  2. I suspect we're missing an execution of build-test in run-test-job to build the target-specific tests in phased mode.

  3. I cannot find the counterpart changes to the src/coreclr/build-test.cmd / sh scripts implementing the new targetSpecific and targetGeneric switches. I'm not seeing any such logic in the scripts as they stand today.

As a side note, Vlad mentioned in your yesterday meeting that he was hitting some issues with the item-based test traits and mentioned that we might need to rethink this bit and instead use project properties to declare them. Once you manage to fix the issues per my bullets 1-3, be prepared that you may hit difficulties in the test split because of that.

Thanks

Tomas

@sdmaclea
Copy link
Contributor Author

@trylek Thanks for the review.

As a side note, Vlad mentioned in your yesterday meeting that he was hitting some issues with the item-based test traits and mentioned that we might need to rethink this bit and instead use project properties to declare them. Once you manage to fix the issues per my bullets 1-3, be prepared that you may hit difficulties in the test split because of that

#34658 Added properties because that was how existing p0/p1 filtering was being done.

@sdmaclea sdmaclea requested a review from trylek April 23, 2020 00:15
@sdmaclea
Copy link
Contributor Author

I guess. I need to change code base on the Mono file relocation too...

@sdmaclea
Copy link
Contributor Author

The YAML is parsing again. I would expect the test run to pass.

Do we need to trigger a build somewhere else to the the changes in ci.yml?

@trylek
Copy link
Member

trylek commented Apr 23, 2020

@sdmaclea
Copy link
Contributor Author

sdmaclea commented Apr 23, 2020

Innerloop looks good to me minus spurious failures. I don't have permissions at the moment to re-run failed jobs due to dotnet/core-eng#9655
Outerloop passed.
https://dev.azure.com/dnceng/public/_build/results?buildId=612835

Edit -- retriggered failed innerloop jobs....

Copy link
Member

@trylek trylek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it!

@sdmaclea sdmaclea merged commit 9188a58 into dotnet:master Apr 23, 2020
@trylek
Copy link
Member

trylek commented Apr 23, 2020

And if for whatever reason we haven't managed to scare you off further infra work yet, I think the following items are up for grabs:

  1. Do we really need the pair "CLRTestNeedsTarget" / "OsSpecific" everywhere? Perhaps one of them (probably the former) might suffice.

  2. Get rid of managedBuildOsGroup/Subgroup as it should be superfluous with this change.

  3. Monitor perf implications of this change and decide whether we want to move OsSpecific test build to run-test-job and revert the pipeline changes you've made or whether we want to keep the separate set of build jobs for the OS specific tests in which case we should ultimately update all our existing pipelines.

Thanks for your great work, tenacity, and patience!

Tomas

@sdmaclea
Copy link
Contributor Author

Do we really need the pair "CLRTestNeedsTarget" / "OsSpecific" everywhere? Perhaps one of them (probably the former) might suffice.

I added CLRTestNeedsTarget everywhere to allow this work to be implemented while @VSadov figured out generic filtering. I knew how to filter properties, but not filter on TraitTag items. I have been assuming @VSadov will remove these when he implements generic filtering.

Get rid of managedBuildOsGroup/Subgroup as it should be superfluous with this change.

I can take this on.

Monitor perf implications of this change and decide whether we want to move OsSpecific test build to run-test-job and revert the pipeline changes you've made or whether we want to keep the separate set of build jobs for the OS specific tests in which case we should ultimately update all our existing pipelines.

I don't really know how to do this monitoring. If someone monitors it, I'd be happy to refactor this.

I was also looking at #33128 & #33067

@trylek
Copy link
Member

trylek commented Apr 23, 2020

All your responses sound great to me, thank you!

- template: /eng/pipelines/common/platform-matrix.yml
parameters:
jobTemplate: /eng/pipelines/libraries/build-job.yml
buildConfig: Release
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sdmaclea do we need to build libraries in Release here? We already build it on Debug: https://github.com/dotnet/runtime/blob/master/eng/pipelines/runtime.yml#L382

Do we need the release build?

Also, this will introduce a duplicate job of the libraries on rolling builds because there's already a libraries job for Release in OSX, because https://github.com/dotnet/runtime/blob/master/eng/pipelines/runtime.yml#L382 will be built on release. See, variables.debugOnPrReleaseOnRolling.

@trylek this might be the reason of the break

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is definitely at least part of the problem, thanks for pointing that out. I'm investigating whether it's the only one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The duplicate job was prevented by the conditional logic below

      condition: >-
        and(
          ne(variables['debugOnPrReleaseOnRolling'], 'Release'),
          or(
            eq(dependencies.checkout.outputs['SetPathVars_coreclr.containsChange'], true),
            eq(variables['isFullMatrix'], true)))

sdmaclea added a commit to sdmaclea/runtime that referenced this pull request Apr 23, 2020
trylek pushed a commit that referenced this pull request Apr 23, 2020
sdmaclea added a commit to sdmaclea/runtime that referenced this pull request Apr 23, 2020
* Build CoreClr tests separately

* Build coreclr targetGeneric tests separately for CI
* Build coreclr targetGeneric tests separately in runtime.yml
* Distinguish generic build based on Libraries config
* Build OSX release libraries when CoreCLR changed
* Always use tar.gz compression for generic tests

* Mark all CoreCLR Interop/COM projects OsSpecific
sdmaclea added a commit that referenced this pull request Apr 27, 2020
* Build coreclr targetGeneric tests separately for CI (#34790)

* Build CoreClr tests separately

* Build coreclr targetGeneric tests separately for CI
* Build coreclr targetGeneric tests separately in runtime.yml
* Distinguish generic build based on Libraries config
* Build OSX release libraries when CoreCLR changed
* Always use tar.gz compression for generic tests

* Mark all CoreCLR Interop/COM projects OsSpecific

* Assume debug/release libraries doesn't matter during test build

* Add comments
@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CI] Hoist the platform/arch independent test build into its own job

6 participants