Optimize Static Checks job for most regular PRs #35461

potiuk · 2023-11-05T18:29:16Z

By default static checks run in the CI with --all-files to make sure that there are no side-effects coming from only running the pre-commit checks on a subset of files. The complete --all-files static check suite on self-hosted runners take ~ 9 minutes and for public runners take ~ 15 minutes so there is room for optimisation now that Tests are running faster than that in many cases.

While we cannot run static checks on only the files changed in the PR (that would lead to many false-positives) we can disable whole pre-commit checks in case the incoming PRs do not contain certain files modified. For example we we can skip mypy-providers when no provider files changed, or we can skip mypy-core when no core files changed or we can skip helm linting in case chart files have not changed in the incoming PR.

This PR implements selective check rules that will skip some of the longest running pre-commit checks in case it looks like the checks are not needed.

Still - all tests will be run when "full tests needed" flag is set (i.e. when build scripts change or when we detect structural/package changes in the project or when "full tests needed" label is set for the PR - and all the static checks will also continue running in "canary" builds, so we will be able to catch and correct any rules that will lead to skipping some of the static checks when they should be run in fact.

Also the cache for pre-commits is renamed to be "common" - currently the "basic" cache is different than full cache but since cache is only really uploaded by the "canary" builds, it's quite ok to have a common "full"cache - it will get faster to retrieve it by regular PRs.

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

potiuk · 2023-11-05T18:36:10Z

Yet another optimization. That one should save 3-10 minutes of waiting for static-checks jobs on most PRs

potiuk · 2023-11-05T21:54:50Z

OK I think this one is going to speed up regular PRs quite a bit.

potiuk · 2023-11-06T16:53:00Z

Would also be nice to get that optimization in - it will get faster feedback for "smaller" contributors

By default static checks run in the CI with `--all-files` to make sure that there are no side-effects coming from only running the pre-commit checks on a subset of files. The complete `--all-files` static check suite on self-hosted runners take ~ 9 minutes and for public runners take ~ 15 minutes so there is room for optimisation now that Tests are running faster than that in many cases. While we cannot run static checks on only the files changed in the PR (that would lead to many false-positives) we can disable whole pre-commit checks in case the incoming PRs do not contain certain files modified. For example we we can skip mypy-providers when no provider files changed, or we can skip mypy-core when no core files changed or we can skip helm linting in case chart files have not changed in the incoming PR. This PR implements selective check rules that will skip some of the longest running pre-commit checks in case it looks like the checks are not needed. Still - all tests will be run when "full tests needed" flag is set (i.e. when build scripts change or when we detect structural/package changes in the project or when "full tests needed" label is set for the PR - and all the static checks will also continue running in "canary" builds, so we will be able to catch and correct any rules that will lead to skipping some of the static checks when they should be run in fact. Also the cache for pre-commits is renamed to be "common" - currently the "basic" cache is different than full cache but since cache is only really uploaded by the "canary" builds, it's quite ok to have a common "full"cache - it will get faster to retrieve it by regular PRs.

potiuk · 2023-11-06T20:34:36Z

Also added algorithm description in SELECTIVE_CHECKS.md and reviewed it to make sure the algorithm description is reflecting the current status.

potiuk · 2023-11-07T15:06:18Z

I'd really love to merge that one.... Just to give an example of potential improvement:

Look at this Job (which is for this PR: #35498) :

https://github.com/apache/airflow/actions/runs/6785746475/job/18446028582?pr=35498

Everything is completed already ~ 10 minutes ago - except static checks that run already for almost 15 minutes and will run for another few minutes. But with this change, the static checks would likely complete in under 5 minutes as well.

eladkal

Nice!!

vincbeck

LGTM

amoghrajesh

LGTM

By default static checks run in the CI with `--all-files` to make sure that there are no side-effects coming from only running the pre-commit checks on a subset of files. The complete `--all-files` static check suite on self-hosted runners take ~ 9 minutes and for public runners take ~ 15 minutes so there is room for optimisation now that Tests are running faster than that in many cases. While we cannot run static checks on only the files changed in the PR (that would lead to many false-positives) we can disable whole pre-commit checks in case the incoming PRs do not contain certain files modified. For example we we can skip mypy-providers when no provider files changed, or we can skip mypy-core when no core files changed or we can skip helm linting in case chart files have not changed in the incoming PR. This PR implements selective check rules that will skip some of the longest running pre-commit checks in case it looks like the checks are not needed. Still - all tests will be run when "full tests needed" flag is set (i.e. when build scripts change or when we detect structural/package changes in the project or when "full tests needed" label is set for the PR - and all the static checks will also continue running in "canary" builds, so we will be able to catch and correct any rules that will lead to skipping some of the static checks when they should be run in fact. Also the cache for pre-commits is renamed to be "common" - currently the "basic" cache is different than full cache but since cache is only really uploaded by the "canary" builds, it's quite ok to have a common "full"cache - it will get faster to retrieve it by regular PRs.

potiuk requested review from ashb and jedcunningham as code owners November 5, 2023 18:29

boring-cyborg bot added the area:dev-tools label Nov 5, 2023

potiuk requested review from amoghrajesh, eladkal, ephraimbuddy, pankajastro, pankajkoti and pierrejeambrun November 5, 2023 18:29

potiuk force-pushed the more-selective-static-checks-in-ci branch 2 times, most recently from 00bd3f0 to 9a732c5 Compare November 5, 2023 18:35

potiuk requested a review from hussein-awala November 5, 2023 18:36

potiuk force-pushed the more-selective-static-checks-in-ci branch 3 times, most recently from 14cdaa2 to b23a6f3 Compare November 5, 2023 21:53

potiuk requested a review from kaxil as a code owner November 5, 2023 21:53

potiuk force-pushed the more-selective-static-checks-in-ci branch from b23a6f3 to 67166e5 Compare November 6, 2023 20:33

potiuk requested a review from o-nikolas November 6, 2023 20:41

potiuk requested a review from vincbeck November 7, 2023 15:06

eladkal approved these changes Nov 7, 2023

View reviewed changes

eladkal merged commit 6d31865 into apache:main Nov 7, 2023

vincbeck approved these changes Nov 7, 2023

View reviewed changes

amoghrajesh reviewed Nov 7, 2023

View reviewed changes

potiuk deleted the more-selective-static-checks-in-ci branch November 17, 2023 16:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize Static Checks job for most regular PRs #35461

Optimize Static Checks job for most regular PRs #35461

Uh oh!

potiuk commented Nov 5, 2023 •

edited

Loading

Uh oh!

potiuk commented Nov 5, 2023

Uh oh!

potiuk commented Nov 5, 2023

Uh oh!

potiuk commented Nov 6, 2023

Uh oh!

potiuk commented Nov 6, 2023

Uh oh!

potiuk commented Nov 7, 2023 •

edited

Loading

Uh oh!

eladkal left a comment

Uh oh!

vincbeck left a comment

Uh oh!

amoghrajesh left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Optimize Static Checks job for most regular PRs #35461

Optimize Static Checks job for most regular PRs #35461

Uh oh!

Conversation

potiuk commented Nov 5, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

potiuk commented Nov 5, 2023

Uh oh!

potiuk commented Nov 5, 2023

Uh oh!

potiuk commented Nov 6, 2023

Uh oh!

potiuk commented Nov 6, 2023

Uh oh!

potiuk commented Nov 7, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

eladkal left a comment

Choose a reason for hiding this comment

Uh oh!

vincbeck left a comment

Choose a reason for hiding this comment

Uh oh!

amoghrajesh left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

potiuk commented Nov 5, 2023 •

edited

Loading

potiuk commented Nov 7, 2023 •

edited

Loading