Member

@potiuk potiuk commented Mar 27, 2024

The PROD image, when built in a non-main branch, needs constraints generated for PyPI builds. Those constraints are generated in the "generate-constraints" job, but we should not add a dependency on that job, because it would slow down the regular main builds and PRs. Instead, we should generate the constraints again. This is now rather quick thanks to uv, so it should not introduce much delay in the non-main builds.



@potiuk potiuk force-pushed the add-dependency-for-prod-builds branch from 061eef1 to 850ac28 Compare March 27, 2024 11:57
@potiuk potiuk force-pushed the add-dependency-for-prod-builds branch 2 times, most recently from 6853434 to 87bcbfc Compare March 27, 2024 12:10
@potiuk potiuk force-pushed the add-dependency-for-prod-builds branch 2 times, most recently from 741d317 to e0a98c1 Compare March 27, 2024 13:31
@potiuk potiuk force-pushed the add-dependency-for-prod-builds branch from e0a98c1 to 1d3f92b Compare March 27, 2024 13:51
@potiuk potiuk merged commit 977f0bd into apache:main Mar 27, 2024
@potiuk potiuk deleted the add-dependency-for-prod-builds branch March 27, 2024 14:27
potiuk added a commit to potiuk/airflow that referenced this pull request Mar 27, 2024
As of apache#38533, we build constraints locally for non-main
production image builds. That change required CI images to be
pulled before they were used. For "additional" image checks, the
image tag used in this case contained the "extra" prefix and the
CI image could not be pulled.

This change fixes that. It also displays the right Python version
when pulling the CI image and forces it to be used explicitly in the
pull/constraint steps when provider packages are built.
potiuk added a commit that referenced this pull request Mar 27, 2024
ephraimbuddy pushed a commit that referenced this pull request Mar 27, 2024

(cherry picked from commit 18c2e94)
potiuk added a commit to potiuk/airflow that referenced this pull request Mar 28, 2024
When we build PROD images in CI, we need constraints to know which
versions of the dependencies to use. Those constraints are built by
earlier steps in the build and uploaded as artifacts so that they
can be reused by the PROD image build. However, depending on the type
of PROD build (in main or in v2-*), different constraints are used:
for main builds we use source constraints because we are installing
provider packages built from main sources; in v2-* builds we use
PyPI constraints, because we are installing provider packages from
PyPI. This made the dependency tree a bit complicated, because PROD
builds would have to wait for constraints generation rather than
use quickly prepared source constraints, so in apache#38533 we
introduced an extra step in the PROD build for v2-* branches to
additionally generate constraints in the job that builds PROD images.
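
The branch rule described above can be sketched in a few lines of
Python. This is only an illustration, not the actual CI code: the
function name and the "source" / "pypi" labels are hypothetical, and
treating every non-main branch as a PyPI build is an assumption taken
from the "non-main branch" wording of the original PR description.

```python
def constraints_mode_for_branch(target_branch: str) -> str:
    """Return which kind of constraints a PROD image build should use.

    Hypothetical helper: "source" and "pypi" are illustrative labels,
    not names used by the real build scripts.
    """
    if target_branch == "main":
        # main installs provider packages built from main sources
        return "source"
    # any other branch (for example v2-*) installs providers from PyPI
    return "pypi"


if __name__ == "__main__":
    # "v2-8-test" is just an example of a v2-* style branch name
    for branch in ("main", "v2-8-test"):
        print(branch, "->", constraints_mode_for_branch(branch))
```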

This, however, has some consequences, such as potential
inconsistencies with the constraints used in other parts of the
build, because PROD images can be built much later than the
constraints generated for the rest of the build. It also introduced
the need for the PROD builds to pull CI images to generate the
constraints, which adds overhead and makes PROD builds in v2-*
branches a few minutes longer than they could be.

This PR approaches it differently. Since constraints generation is
now much faster with `uv` (a few minutes), we can easily make PROD
builds always use the same constraint artifacts generated in a single
job. We pay a small delay penalty when starting the PROD builds but
get it back at PROD build time (because constraints are generated
only once).

This also removes (again) the need for PROD builds to pull the CI
images and removes the need to generate source constraints at CI
image build time, which decreases the build time a little.

This PR makes the necessary changes to make it happen:

* removes source constraints generation in CI images
* extracts the generate-constraints job into a separate workflow and
  merges "docs" instead (we are hitting the limit of 20 workflows)
* adds a dependency so that all PROD builds wait for generate-constraints
* removes CI image pulling and related steps from the PROD image build
* switches PROD to download and use the full set of constraints
  generated by the `generate-constraints` job
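
The resulting job ordering can be sketched as a small dependency
graph. This is a rough model under stated assumptions, not the real
workflow definition: only `generate-constraints` is a name taken from
the text, while `build-ci-images` and `build-prod-images` are
hypothetical job names, and the edge from CI images to constraints
generation is inferred from the note that generating constraints
requires a CI image.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key maps a job to the set of jobs it must wait for.
# Job names other than "generate-constraints" are hypothetical.
job_dependencies = {
    "generate-constraints": {"build-ci-images"},
    "build-prod-images": {"generate-constraints"},
}

# static_order() yields the jobs so that every job comes after
# all of the jobs it depends on.
run_order = list(TopologicalSorter(job_dependencies).static_order())

if __name__ == "__main__":
    print(" -> ".join(run_order))
```

In this model the PROD build never touches the CI image directly; it
only waits for the single job that produces the constraint artifacts.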
potiuk added a commit to potiuk/airflow that referenced this pull request Mar 28, 2024
potiuk added a commit that referenced this pull request Mar 28, 2024
ephraimbuddy pushed a commit that referenced this pull request Mar 31, 2024

(cherry picked from commit 32e04a4)
3 participants