Skip to content

Conversation

@potiuk
Copy link
Member

@potiuk potiuk commented Jul 26, 2022

When we are constructing devel-all, we should strip the limits
to account for some shared packages that are not yet released.

Example is a common-sql package that might be limited to >1.1.0
for example as part of the change, but it might not yet be released.

It does not impact our CI, because we are anyhow removing the packages
after installing (we only care about dependencies of their) and we
are using source version of providers - but having a limit coming
from another provider will make pip resolver fail if we do not
strip the limits.

We also need to add .a0 as suffix whenever in our providers we refer
to other providers with >= install clause because of a bug in
pip that does not take into account pre-release version when
performing >= comparision, thus disallowing to release a package
that depends on new version of another package together.

We need to add >= X.Y.Za0 in case we release pre-release version to
fix the problem. Also a bug was found that would prevent to generate
the dependencies in case provider.yaml file only changed (bad
specification of .pre-commit include)


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk potiuk requested review from ashb, jedcunningham and kaxil July 26, 2022 11:32
@potiuk
Copy link
Member Author

potiuk commented Jul 26, 2022

Extracted from #25294

@potiuk potiuk force-pushed the strip-limits-from-devel-all branch from dc57d86 to a35aa85 Compare July 26, 2022 12:45
@potiuk
Copy link
Member Author

potiuk commented Jul 26, 2022

BTW. Those are exactly the kind of finding I expected to test with common.sql extraction @eladkal @jedcunningham - nice test ground to see all the "interesting" cases when we have providers depending on specific versions of other providers.

@potiuk potiuk force-pushed the strip-limits-from-devel-all branch from a35aa85 to 3260091 Compare July 26, 2022 22:02
Comment on lines 320 to 324
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a bug but in the Python version range specification. A better “fix” would be to conditionally supply --pre when pip is invoked.

Copy link
Member Author

@potiuk potiuk Jul 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. I thought you mentioned before that this was a long standing bug that you never got around to fix, but I must have been mistaken. If --pre works this way, then it is indeed a much better fix, so let me change it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - i was 100% sure we discussed it before and that was the outcome :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to use pre (I will have a nice feature added to docker files out of that), but I am afraid this is indeed the bug and I remembered our earlier discussion well. This is what happened when I tried to install the providers:

pip install --find-links=file:///docker-context-files 
--root-user-action ignore --upgrade --upgrade-strategy eager 
--pre 'apache-airflow[amazon,async,celery,cncf.kubernetes,dask,docker,elasticsearch,ftp,google,google_auth,grpc,hashicorp,http,ldap,microsoft.azure,mysql,odbc,pandas,postgres,redis,sendgrid,sftp,slack,ssh,statsd,virtualenv]==2.4.0.dev0'
 /docker-context-files/apache_airflow_providers_amazon-4.1.0.dev0-py3-none-any.whl 
/docker-context-files/apache_airflow_providers_celery-3.0.0.dev0-py3-none-any.whl 
/docker-context-files/apache_airflow_providers_cncf_kubernetes-4.2.0.dev0-py3-none-any.whl 
/docker-context-files/apache_airflow_providers_common_sql-1.1.0.dev0-py3-none-any.whl 
/docker-context-files/apache_airflow_providers_docker-3.1.0.dev0-py3-none-any.whl 
/docker-context-files/apache_airflow_providers_elasticsearch-4.1.0.dev0-py3-none-any.whl 
/docker-context-files/apache_airflow_providers_ftp-3.1.0.dev0-py3-none-any.whl
/docker-context-files/apache_airflow_providers_google-8.2.0.dev0-py3-none-any.whl 
/docker-context-files/apache_airflow_providers_grpc-3.0.0.dev0-py3-none-any.whl
/docker-context-files/apache_airflow_providers_hashicorp-3.0.1.dev0-py3-none-any.whl
/docker-context-files/apache_airflow_providers_http-4.0.0.dev0-py3-none-any.whl
/docker-context-files/apache_airflow_providers_imap-3.0.0.dev0-py3-none-any.whl
/docker-context-files/apache_airflow_providers_microsoft_azure-4.1.0.dev0-py3-none-any.whl
/docker-context-files/apache_airflow_providers_mysql-3.1.0.dev0-py3-none-any.whl
/docker-context-files/apache_airflow_providers_odbc-3.1.0.dev0-py3-none-any.whl
/docker-context-files/apache_airflow_providers_postgres-5.1.0.dev0-py3-none-any.whl 
/docker-context-files/apache_airflow_providers_redis-3.0.0.dev0-py3-none-any.whl
/docker-context-files/apache_airflow_providers_sendgrid-3.0.0.dev0-py3-none-any.whl
/docker-context-files/apache_airflow_providers_sftp-4.0.0.dev0-py3-none-any.whl
/docker-context-files/apache_airflow_providers_slack-5.1.0.dev0-py3-none-any.whl
/docker-context-files/apache_airflow_providers_sqlite-3.1.0.dev0-py3-none-any.whl
/docker-context-files/apache_airflow_providers_ssh-3.1.0.dev0-py3-none-any.whl 'dill<0.3.3'

This fails with:

#56 34.54 ERROR: Cannot install apache-airflow-providers-amazon==4.1.0.dev0, apache-airflow-providers-common-sql 1.1.0.dev0 (from /docker-context-files/apache_airflow_providers_common_sql-1.1.0.dev0-py3-none-any.whl), apache-airflow-providers-elasticsearch==4.1.0.dev0, apache-airflow-providers-google==8.2.0.dev0 and apache-airflow[amazon,async,celery,cncf-kubernetes,dask,docker,elasticsearch,ftp,google,google-auth,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,pandas,postgres,redis,sendgrid,sftp,slack,ssh,statsd,virtualenv]==2.4.0.dev0 because these package versions have conflicting dependencies.
#56 34.54 
#56 34.54 The conflict is caused by:
#56 34.54     The user requested apache-airflow-providers-common-sql 1.1.0.dev0 (from /docker-context-files/apache_airflow_providers_common_sql-1.1.0.dev0-py3-none-any.whl)
#56 34.54     apache-airflow[amazon,async,celery,cncf-kubernetes,dask,docker,elasticsearch,ftp,google,google-auth,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,pandas,postgres,redis,sendgrid,sftp,slack,ssh,statsd,virtualenv] 2.4.0.dev0 depends on apache-airflow-providers-common-sql
#56 34.54     apache-airflow-providers-amazon 4.1.0.dev0 depends on apache-airflow-providers-common-sql
#56 34.54     apache-airflow-providers-elasticsearch 4.1.0.dev0 depends on apache-airflow-providers-common-sql
#56 34.54     apache-airflow-providers-google 8.2.0.dev0 depends on apache-airflow-providers-common-sql>=1.1.0
#56 34.54 
#56 34.54 To fix this you could try to:
#56 34.54 1. loosen the range of package versions you've specified
#56 34.54 2. remove package versions to allow pip attempt to solve the dependency conflict
#56 34.54 
#56 34.54 ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

The problem is that

  • apache-airflow-providers-google 8.2.0.dev0 depends on apache-airflow-providers-common-sql>=1.1.0
  • The user requested apache-airflow-providers-common-sql 1.1.0.dev0 (from /docker-context-files/apache_airflow_providers_common_sql-1.1.0.dev0-py3-none-any.whl

I was using --pre flag (as you can see above).

Still google's > 1.1.0 conflicts with 1.1.0dev0 - which is the problem

Copy link
Member Author

@potiuk potiuk Jul 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc: @uranusjr - something I missed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. That woudl explain it. Just checked it and indeed - simply replacing the suffix with the one we are just building with is good. Update coming in a minute.

So really the problem (might not be a "bug" ) with pip is that --pre does not apply to limit of the dependecies of the packages you are installing - just to the packages that you are installing. Would that be the right "assesment" ?

Do you think is it something that could be changed easiy and accepted? Maybe I could even attempt to find and fix it, but I am a little bit sceptical after the rejections (and the form of those rejections especially) I got before, so I am a little bit sceptical about submitting anything like that without knowing if it is something that maybe has been discussed in the past or maybe there is a long discussion somewhere I am not aware of.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think --pre does apply to transitive dependencies, but it’s not possible to tell without a reproducible with much less moving parts.

Copy link
Member Author

@potiuk potiuk Jul 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or is --pre only valid for pre-releases and we need --dev as well? (Just reading PEP 440 and it seems to me that --pre would only apply to a/b/rc ones but not dev ones ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though the other part of PEP mentions ("Pre-releases of any kind, including developmental releases") - so I am not sure any more whether --pre's pre-release also includes dev or not :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually after further reading of the PEP, I tested one more approach. It's not explicitly mentioned in the PEP, but apparently it works.

Seems that converting the "pre-release" depencies to have ".*" at the end even in >= would include all kinds of pre-releases, so at least we have a way to specify "all pre-releases including development ones" in our dependencies. So apache-airflow-providers-common-sql>=1.1.0.* works in this case and that's what I am going to settle with I think.

This is the part of meta-data fron .whl in the Google Provider that works:

Requires-Dist: PyOpenSSL
Requires-Dist: apache-airflow-providers-common-sql (>=1.1.0.*)
Requires-Dist: apache-airflow (>=2.2.0.*)
Requires-Dist: google-ads (>=15.1.1)

I think that's sustainable for our case (but we have to manually modify the requirements). Would be great to add this behaviour to --pre flag I guess, but we can definitely live with that even if it is not added.

@potiuk potiuk force-pushed the strip-limits-from-devel-all branch from 3260091 to 4e1eb58 Compare July 27, 2022 17:45
@potiuk potiuk changed the title Strip limits when constructing devel-all Add pre-release limits when constructing devel-all Jul 28, 2022
When we are constructing devel-all, we should strip the limits
to account for some shared packages that are not yet released.

Example is a common-sql package that might be limited to >1.1.0
for example as part of the change, but it might not yet be released.

It does not impact our CI, because we are anyhow removing the packages
after installing (we only care about dependencies of their) and we
are using source version of providers - but having a limit coming
from another provider will make `pip` resolver fail if we do not
strip the limits.

We also need to add .a0 as suffix whenever in our providers we refer
to other providers with >= install clause because of a bug in
`pip` that does not take into account pre-release version when
performing >= comparision, thus disallowing to release a package
that depends on new version of another package together.

We need to add >= X.Y.Za0 in case we release pre-release version to
fix the problem. Also a bug was found that would prevent to generate
the dependencies in case provider.yaml file only changed (bad
specification of .pre-commit include)
@potiuk potiuk force-pushed the strip-limits-from-devel-all branch from 4e1eb58 to 1727841 Compare July 28, 2022 11:38
@potiuk potiuk closed this Jul 28, 2022
@potiuk potiuk deleted the strip-limits-from-devel-all branch July 28, 2022 11:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants