-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Improve, speed up, simplify and stabilize airflow-ctl tests suite #51908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc: @bugraoz93 -> this is should vastly speed up and simplify the "airflow-ctl" test suite and also locally running it - harnessing the power of Running all airflow-ctl tests will now be possible with just: |
|
cc: @kaxil @ashb @amoghrajesh The And this only shows how important is to split airflow into smaller pieces- including splitting airflow core. Currently we simply cannot separate the tests of webserver, dag processor, triggerer etc. because all that is so heavily entangled, we still have airflow core depending on some provider tests etc. But if we do think seriously of our dependencies, and split them to smaller sets, this will be entirely possible to do very similar change as this one with This is one of the reasons why I am so keen on splitting airflow and sharding the dependencies and using separate distributions and workspace to bring it all together. This is something @ashb always complained about that we have far too complex environment - and yes, we do, because we have everything bundled together. When we split it - we will finally be able to achieve the "dream" that many of our tests will be runnable in such a super simple way as "airflow-ctl" will be after this PR. |
fb282ed to
dda4496
Compare
bugraoz93
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing improvement Jarek, thanks a lot! Isolating and decoupling always feels good, and airflow-ctl is the best candidate to take these steps.
We still need Breeze to be installed in local env containers in the CI. We can tweak the CI but we would hit the hard unique workflow limit again or end up bespoke if statements. We can call uv run now even without Breeze in the CI, which can also eliminate installing Breeze into the local env.
Actually we still can use breeze - but breeze does not really need the CI image to run (once I make the PR green) - which has the nice feature of providing a unified simple interface to run tests ( I think I would like to stick with the pattern that all WDYT @bugraoz93 and others ? |
dda4496 to
d68fd09
Compare
|
Also I think having all kinds of tests you can run triggered by breeze gives a nice feature of discoverability what kind of tests we have and how to run them. Having it all grouped together in "testing" (except k8s which is separate thing altogether) is really nice for newcomers as they can immediately (with auto-complete) discover what kind of tests there can be run and how to run them - without even having to read contributing documentation. |
|
See the last change - no need for extra workflows - we have just conditional steps in the existing workflow, where we control whether we use local venv or CI image to run tests (with |
1db8d90 to
1865c20
Compare
|
Yep - it should get green now, with running things locally - without CI image even with |
|
And airflow-ctl tests will start now without even waiting for CI image being built. https://github.com/apache/airflow/actions/runs/15755034518/job/44408491632?pr=51908 |
|
Error unrelated - PR to fix is #51931 |
The tests in airflow-ctl failed occasionally when run with xdist, also the definition of the airflow-ctl dependencies in pyproject.toml did not properly reflect dependencies needed to run the tests - they silently assumed that tests are always run in a complete workspace, where all packages were installed. That is pretty unnecessary and it was possible to improve the devel-common to handle minimum set of requirements for airflow-ctl so that it does not have to install unnecessary depedencies. Since `airflow-ctl` does not now require airflow nor sqlalchemy for installation and tests we can use the power of `uv` to automatically install and run the tests without the need of CI image - airflow-ctl is supposed to be small and standalone package so set of dependencies it has should make it possible to install it without having all the 700+ dependencies in the CI nor having system dependencies installed in Debian image. Another problem was that when the tests were run with MacOS keyring, the login test was hanging waiting for user entry. This PR fixes those problems: * devel-common distribution does not have to have airflow nor sqlalchemy as dependency - this way airflow-ctl distribution can run tests without all packages installed * instead of platformdirs user_config_dir, we use direct retrieval of configuration folder following airflow pattern (AIRLFOW_HOME or ~/airflow). This allows to easily patch airflow home to use temporary directory for each test - which solves the xdist failures when tests running in parallel were overriding each-others config folder * patching all paths that the test can ask for password makes it not hang when keyring is installed * breeze airflow-ctl tests perform `uv sync` in airflow-ctl directory in order to make sure that no other dependencies are used during CI tests. * test_command.py file is .gitignored and deleted after the test are run * breeze testing airflow-ctl now does not require CI images to be run, we are using `uv run` to autonaticaly install the venv in the right python version, which simplifies and speeds up running. * for airflow-ctl we use --use-local-hatch to build the packages locally After this change running tests for airflow-ctl can be done very easily with: * breeze testing airfllw-ctl --python N.N or * cd airflow-ctl; uv run --python N.N pytest They should be equivalent.
1865c20 to
41c2664
Compare
It makes sense! I think using Breeze for all CI related things makes it easy to maintain and always shows where to check without any ambiguity. I am on using the Breeze pattern. I tried to be pointing out and maybe considering moving uv for a couple of actions, but no strong stance on there |
I am all for using But when it comes to more "complex" pieces which require quite a bit more libraries, modules etc. to import - fully fledged |
When features come to |
Actually i found that I am using both - as I need. For example building docs is now quite easy with uv - especialy that the docs are now "isolated per project" - and I usually keep history of commands in the terminal so .... Will nicely run autobuild server with documentation built and auto-reload when I change stuff. And the doc build command is identical for all our distributions - you just need to be in the distribution's folder - so I don't have to remember it just I think it's good to have both ways. |
It is definitely to have them both. Good points, that's true. Yeah, maybe it should |
…ache#51908) The tests in airflow-ctl failed occasionally when run with xdist, also the definition of the airflow-ctl dependencies in pyproject.toml did not properly reflect dependencies needed to run the tests - they silently assumed that tests are always run in a complete workspace, where all packages were installed. That is pretty unnecessary and it was possible to improve the devel-common to handle minimum set of requirements for airflow-ctl so that it does not have to install unnecessary depedencies. Since `airflow-ctl` does not now require airflow nor sqlalchemy for installation and tests we can use the power of `uv` to automatically install and run the tests without the need of CI image - airflow-ctl is supposed to be small and standalone package so set of dependencies it has should make it possible to install it without having all the 700+ dependencies in the CI nor having system dependencies installed in Debian image. Another problem was that when the tests were run with MacOS keyring, the login test was hanging waiting for user entry. This PR fixes those problems: * devel-common distribution does not have to have airflow nor sqlalchemy as dependency - this way airflow-ctl distribution can run tests without all packages installed * instead of platformdirs user_config_dir, we use direct retrieval of configuration folder following airflow pattern (AIRLFOW_HOME or ~/airflow). This allows to easily patch airflow home to use temporary directory for each test - which solves the xdist failures when tests running in parallel were overriding each-others config folder * patching all paths that the test can ask for password makes it not hang when keyring is installed * breeze airflow-ctl tests perform `uv sync` in airflow-ctl directory in order to make sure that no other dependencies are used during CI tests. * test_command.py file is .gitignored and deleted after the test are run * breeze testing airflow-ctl now does not require CI images to be run, we are using `uv run` to autonaticaly install the venv in the right python version, which simplifies and speeds up running. * for airflow-ctl we use --use-local-hatch to build the packages locally After this change running tests for airflow-ctl can be done very easily with: * breeze testing airfllw-ctl --python N.N or * cd airflow-ctl; uv run --python N.N pytest They should be equivalent.
The tests in airflow-ctl failed occasionally when run with xdist, also the definition of the airflow-ctl dependencies in pyproject.toml did not properly reflect dependencies needed to run the tests - they silently assumed that tests are always run in a complete workspace, where all packages were installed. That is pretty unnecessary and it was possible to improve the devel-common to handle minimum set of requirements for airflow-ctl so that it does not have to install unnecessary depedencies.
Since
airflow-ctldoes not now require airflow nor sqlalchemy for installation and tests we can use the power ofuvto automatically install and run the tests without the need of CI image - airflow-ctl is supposed to be small and standalone package so set of dependencies it has should make it possible to install it without having all the 700+ dependencies in the CI nor having system dependencies installed in Debian image.Another problem was that when the tests were run with MacOS keyring, the login test was hanging waiting for user entry.
This PR fixes those problems:
devel-common distribution does not have to have airflow nor sqlalchemy as dependency - this way airflow-ctl distribution can run tests without all packages installed
instead of platformdirs user_config_dir, we use direct retrieval of configuration folder following airflow pattern (AIRLFOW_HOME or ~/airflow). This allows to easily patch airflow home to use temporary directory for each test - which solves the xdist failures when tests running in parallel were overriding each-others config folder
patching all paths that the test can ask for password makes it not hang when keyring is installed
breeze airflow-ctl tests perform
uv syncin airflow-ctl directory in order to make sure that no other dependencies are used during CI tests.test_command.py file is .gitignored and deleted after the test are run
breeze testing airflow-ctl now does not require CI images to be run, we are using
uv runto autonaticaly install the venv in the right python version, which simplifies and speeds up running.for airflow-ctl we use --use-local-hatch to build the packages locally
After this change running tests for airflow-ctl can be done very easily with:
or
They should be equivalent.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rstor{issue_number}.significant.rst, in airflow-core/newsfragments.