Skip to content

Conversation

@dstandish
Copy link
Contributor

When running airflow from sources we might not have all deps installed so provider imports fail.

Currently the traceback is logged at warning level which is too much; it obliterates your history.

Instead we make optional the simple reporting of an import failure in a provider, and additionally we move the traceback to DEBUG only (not configurable).

When running airflow from sources we might not have all deps installed so provider imports fail.

Currently the traceback is logged at warning level which is too much; it obliterates your history.

Instead we make optional the simple reporting of an import failure in a provider, and additionally we move the traceback to DEBUG only (not configurable).
@dstandish dstandish force-pushed the reduce-provider-import-failure-noise branch from 222fac8 to ded64a5 Compare March 26, 2022 00:05
@pingzh
Copy link
Contributor

pingzh commented Mar 26, 2022

there was a discussion in the email group. you can see https://lists.apache.org/thread/m6lsnkn0xdwqopzpv4m7qcvdz6t77w3m for more context.

cc @potiuk

@dstandish
Copy link
Contributor Author

I see yeah that PR made it possible for providers to have optional components.

I think this PR still should be merged.

What it does is
(1) reduce verbosity by removing traceback from the warning (so now it's just one line)
(2) add DEBUG logging that includes the traceback (so users can get more detail by adding debug logging)
(3) make it optional to suppress the one-line import error warning by configuration.

This is a painful issue for developers who use virtualenv and it seems like a reasonable solution to me.

@potiuk
Copy link
Member

potiuk commented Mar 26, 2022

I think the problem here is that we hav duplicate warnings generated for some of the providers (am I right about it @dstandish ?) an I think it comes from the fact that 4 of the providers are both installed by default and also available in the sources - and we should solve this in Breeze/Dev environment rather than adding a config setting that is "development only". I think I saw it few times too.... But I am not sure if this is the error you try to deal with @dstandish :).

Do I get it right? I am happy to take a close look at this and fix it "better" @dstandish - can you briefly desribe (copy&paste) the scenario/output logs you experience?

@dstandish
Copy link
Contributor Author

well the scenario is, when using virtualenv, if you don't have some deps for providers, then when you run webserver you'll get error logging over and over and over.

at a minimum we should suppress traceback except when log level is debug

@potiuk
Copy link
Member

potiuk commented Mar 27, 2022

But I guess it's only when you are in "editable" mode, right? We should really suppres the errors in this case not by configuration. It can be actually easily detected (and there is a separate path in Provider's Manager to handle providers from sources, so it should be done there.

@dstandish
Copy link
Contributor Author

But I guess it's only when you are in "editable" mode, right? We should really suppres the errors in this case not by configuration. It can be actually easily detected (and there is a separate path in Provider's Manager to handle providers from sources, so it should be done there.

if you have a hint i'll take it

@potiuk
Copy link
Member

potiuk commented Mar 28, 2022

def _discover_all_airflow_builtin_providers_from_local_sources(self) -> None:

@potiuk
Copy link
Member

potiuk commented Mar 28, 2022

I think we could ignore all import errors from providers found "via sources" rather than "via packages".

@dstandish
Copy link
Contributor Author

@potiuk i looked at that method, it doesn't actually import the provider there, but just stores in _provider_dict.

are you thinking we add more information for each provider to _provider_dict (e.g. where it comes from, and store in ProviderInfo or something) and then use that information to conditionally suppress when importing lazily, later?

@potiuk
Copy link
Member

potiuk commented Mar 28, 2022

@potiuk i looked at that method, it doesn't actually import the provider there, but just stores in _provider_dict.

are you thinking we add more information for each provider to _provider_dict (e.g. where it comes from, and store in ProviderInfo or something) and then use that information to conditionally suppress when importing lazily, later?

Yep.

@potiuk
Copy link
Member

potiuk commented Mar 28, 2022

Just to add more context- unfortunately there is no easy way to to determine if we are in "dev" (editable) environment or not. We could simply check if we are in "BREEZE" (but this is not a full-proof either (we can run ./breeze --use-airflow-version 2.1.4 (and we could add some env variables here too). But it does not cover the case when you run pip install -e .. - in this case when you want to modify providers in the env you need to get them available from sources rather than from provider packages.

Detecing whether the provider import comes from a "packaged" provider (i.e. should have all deps installed) or sources (where you get all providers no matter if you have all the deps installed) seems to be the best way to know whether you can ignore import errors or not.

@dstandish
Copy link
Contributor Author

resolved via other approach in #22579

@dstandish dstandish closed this Mar 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants