Skip to content

Conversation

@potiuk
Copy link
Member

@potiuk potiuk commented Aug 23, 2023

This one completes review and check of provider.yaml verifications. There was one more check - for duplicates - that did not work as expected. It was hiding errors detected.

This PR fixes it and also adds a bit more diagnostics messagese on what is actually being checked to give a bit more clue if some check is not doing what it is supposed to be doing.

Follow up after #33662 and #33640


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk potiuk requested a review from ashb as a code owner August 23, 2023 19:23
@boring-cyborg boring-cyborg bot added area:dev-tools area:providers provider:google Google (including GCP) related issues provider:microsoft-azure Azure-related issues labels Aug 23, 2023
@potiuk potiuk requested review from eladkal and uranusjr August 23, 2023 19:23
@potiuk
Copy link
Member Author

potiuk commented Aug 23, 2023

Somehow we had a number of issues with provider check.

This is always the question : who verifies the verifier.

@potiuk potiuk force-pushed the further-provider-check-improvements branch from f6d6fd5 to 8376795 Compare August 24, 2023 07:19
@uranusjr
Copy link
Member

uranusjr commented Aug 24, 2023

Should the check-provider-yaml-valid hook be run when any file in providers is changed? Currently it only triggers when a provider.yaml is changed so pre-commit would skip it if we (say) rename a class and cause the wrong content to be committed.

@potiuk
Copy link
Member Author

potiuk commented Aug 24, 2023

Should the check-provider-yaml-valid hook be run when any file in providers is changed? Currently it only triggers when a provider.yaml is changed so pre-commit would skip it if we (say) rename a class and cause the wrong content to be committed.

It could, but it's not a very fast check - it requires breeze image and imports a lot. So I'd prefer to keep it the way it is.

The pre-commits in CI are ALWAYS run with --all-files flag (precisely to catch the cases where there are some side-effects from files that were not specified as those triggering local pre-commites), so the case you mention will be caught in the CI - static checks in PR will fail. I'd prefer that rather than slow down local commits.

The general approach for local pre-commits (i.e. those which are run on "commit" scope) should be as fast as possible.

@uranusjr
Copy link
Member

Maybe we can bring the ast version back to run locally instead? I feel it’s still somewhat valuable to have a check locally.

@potiuk
Copy link
Member Author

potiuk commented Aug 24, 2023

Maybe we can bring the ast version back to run locally instead? I feel it’s still somewhat valuable to have a check locally.

Instead, I am not sure, but additionally - if you want - sure. The checks we are doing here are far more complete - because they not only check the AST but also imports, Deprecation warnings etc.

This one completes review and check of provider.yaml verifications.
There was one more check - for duplicates - that did not work
as expected. It was hiding errors detected.

This PR fixes it and also adds a bit more diagnostics messagese
on what is actually being checked to give a bit more clue if some
check is not doing what it is supposed to be doing.

Follow up after apache#33662 and apache#33640
@potiuk potiuk force-pushed the further-provider-check-improvements branch from 8376795 to c1f3311 Compare August 24, 2023 09:18
@uranusjr
Copy link
Member

By instead I meant only run the ast check locally, and only the real imports on CI. Does that make sense?

@potiuk
Copy link
Member Author

potiuk commented Aug 24, 2023

By instead I meant only run the ast check locally, and only the real imports on CI. Does that make sense?

But it's ok to run this one locally when provider.yaml changes. It's not THAT slow (it's a second or so) .... And provider.yaml changes really rarely. Also the checks that are run there are cross-provider.yaml, because they also check integrations, documentation, logos, url and many other content of the provider.yaml which are non-AST related. So generally "not running" it when provider.yaml files change is a bad ida.

@potiuk
Copy link
Member Author

potiuk commented Aug 24, 2023

So in short: files: ^airflow/providers/.*/provider\.yaml$ was a very deliberate choice.

  • we want to run it when provider.yaml changes in your PR, because you are very likely to make a typo/mistake there - for example by misspelling integration or refering to a wrong class. And we want to give you a feedback on it asap - before you even push it, without waiting 30 minutes for CI even if it slows the pre-commit by a second or so. And we want to make a complete check when we do. The checks including checking if the classes/modules referred there are importable, whether they throw deprecation warnings - because maybe you forgot to remove a deprecated Hook that is still fine with AST and importing and throws the Deprecation Warning - this is the recent realisation I had with Slack Notifier that you can leave a deprecated class while you introduce a new one, which is wrong.

  • but we do not want to run it when just provider .py files changes. I believe (we can make some stats if needed) high % of changes in providers are just bugfix/update/corrections - without changing provider's metadata - for those we prefer an optimistic strategy, where yeah, you might break provider.yaml metadata, but we assume you didn't (but you will find out - and with this PR it will even tell you exactly how to fix the problem by providing straightforward instructions)

So I am not sure there is a need to change here :).

Adding an AST parser for quick check might be a good idea as well but that's pretty orthogonal to those checks,

@potiuk potiuk merged commit 96efcfb into apache:main Aug 24, 2023
@potiuk potiuk deleted the further-provider-check-improvements branch August 24, 2023 10:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:dev-tools area:providers provider:google Google (including GCP) related issues provider:microsoft-azure Azure-related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants