@potiuk potiuk commented Mar 28, 2024

When we build PROD images in CI, we need constraints to know which versions of the dependencies to use. Those constraints are built by earlier steps in the build and uploaded as artifacts so that they can be re-used by the PROD image build. However, depending on the type of PROD build (in main or in v2-*), different constraints are used: for main builds we use source constraints, because we install provider packages built from main sources; in v2-* builds we use PyPI constraints, because we install provider packages from PyPI. This made the dependency tree a bit complicated, because PROD builds would have to wait for constraints generation rather than use the quickly prepared source constraints, and in #38533 we introduced an extra step in the PROD build for v2-* branches to additionally generate constraints in the job that builds PROD images.

This, however, has some consequences, such as potential inconsistencies with the constraints used during other parts of the build, because PROD images can be built much later than the constraints generated for the rest of the build. It also introduced the need for the PROD builds to pull CI images to generate the constraints, which adds overhead and makes PROD builds in v2 branches a few minutes longer than they could be.

This PR approaches it differently. Since constraints generation is now much faster with uv (a few minutes), we can easily make PROD builds always use the same constraint artifacts generated in a single job. We pay a small delay penalty when starting the PROD builds but get it back at PROD build time (because constraints will be generated only once).

That also removes (again) the need for PROD builds to pull the CI images and removes the need to generate source constraints at CI image build time, which will decrease the build time a little.

This PR makes the necessary changes to make it happen:

  • removes source constraints generation in CI images
  • extracts the generate-constraints job to be a top-level job in CI.yml
  • adds a dependency so that all PROD builds wait for generate-constraints
  • removes CI image pulling and related steps from the PROD image build
  • switches PROD to download and use the full set of constraints generated by the generate-constraints job
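
For illustration only, the resulting job dependency could look roughly like the sketch below. It is not the actual content of CI.yml: the job ids, step names, artifact name, paths, and placeholder `run` commands are all assumptions made for this example. Only the `needs` key and the `actions/upload-artifact` / `actions/download-artifact` inputs shown are standard GitHub Actions features.

```yaml
# Hypothetical sketch, NOT the real Airflow workflow. Job ids, artifact name,
# paths, and the placeholder commands are illustrative assumptions.
jobs:
  generate-constraints:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Generate all constraint files once (placeholder command)
        run: echo "generate source and PyPI constraints here"
      - name: Upload constraints so later jobs can re-use them
        uses: actions/upload-artifact@v4
        with:
          name: constraints              # assumed artifact name
          path: ./files/constraints-*/   # assumed output location
          retention-days: 7

  build-prod-images:
    # PROD builds wait for the single constraints job instead of
    # pulling CI images and generating constraints themselves.
    needs: [generate-constraints]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Download the full set of constraints
        uses: actions/download-artifact@v4
        with:
          name: constraints                # must match the upload name
          path: ./docker-context-files     # assumed download location
      - name: Build PROD image using the downloaded constraints (placeholder)
        run: echo "build PROD image here"
```

With this shape, every PROD build consumes the same artifact, so the constraints cannot drift between the PROD images and the rest of the build.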


@potiuk potiuk added the "full tests needed" (We need to run full set of tests for this PR to merge) and "default versions only" (When assigned to PR - only default python version is used for CI tests) labels Mar 28, 2024
@potiuk potiuk requested review from ashb and kaxil as code owners March 28, 2024 10:26
@potiuk potiuk force-pushed the better-fix-for-constraints-generation-for-prod-build branch 2 times, most recently from c4de342 to 6c5c45d Compare March 28, 2024 11:47
@potiuk potiuk added this to the Airflow 2.9.0 milestone Mar 28, 2024
@potiuk potiuk force-pushed the better-fix-for-constraints-generation-for-prod-build branch 5 times, most recently from 58b3598 to 278ad0c Compare March 28, 2024 12:34
@potiuk potiuk force-pushed the better-fix-for-constraints-generation-for-prod-build branch from 278ad0c to 611afd3 Compare March 28, 2024 14:00
When we build PROD images in CI, we need constraints to know which
versions of the dependencies to use. Those constraints are built by
earlier steps in the build and uploaded as artifacts so that they
can be re-used by the PROD image build. However, depending on the type
of PROD build (in main or in v2-*), different constraints are used:
for main builds we use source constraints, because we install
provider packages built from main sources; in v2-* builds we use
PyPI constraints, because we install provider packages from
PyPI. This made the dependency tree a bit complicated, because PROD
builds would have to wait for constraints generation rather than
use the quickly prepared source constraints, and in apache#38533 we
introduced an extra step in the PROD build for v2-* branches to
additionally generate constraints in the job that builds PROD images.

This, however, has some consequences, such as potential inconsistencies
with the constraints used during other parts of the build, because PROD
images can be built much later than the constraints generated for the
rest of the build. It also introduced the need for the PROD builds to
pull CI images to generate the constraints, which adds overhead and
makes PROD builds in v2 branches a few minutes longer than they could
be.

This PR approaches it differently. Since constraints generation is now
much faster with `uv` (a few minutes), we can easily make PROD builds
always use the same constraint artifacts generated in a single job. We
pay a small delay penalty when starting the PROD builds but get it back
at PROD build time (because constraints will be generated only once).

That also removes (again) the need for PROD builds to pull the CI
images and removes the need to generate source constraints at CI image
build time, which will decrease the build time a little.

This PR makes the necessary changes to make it happen:

* removes source constraints generation in CI images
* extracts the generate-constraints job into a separate workflow and
  merges "docs" instead (we are hitting the limit of 20 workflows)
* adds a dependency so that all PROD builds wait for generate-constraints
* removes CI image pulling and related steps from the PROD image build
* switches PROD to download and use the full set of constraints
  generated by the `generate-constraints` job
@potiuk potiuk force-pushed the better-fix-for-constraints-generation-for-prod-build branch from 611afd3 to 428366a Compare March 28, 2024 17:45
@potiuk potiuk merged commit 32e04a4 into apache:main Mar 28, 2024
@potiuk potiuk deleted the better-fix-for-constraints-generation-for-prod-build branch March 28, 2024 18:56
ephraimbuddy pushed a commit that referenced this pull request Mar 31, 2024
(cherry picked from commit 32e04a4)
Labels

area:dev-tools; default versions only (When assigned to PR - only default python version is used for CI tests); full tests needed (We need to run full set of tests for this PR to merge)
