Skip to content

Conversation

@raphaelauv
Copy link
Contributor

@raphaelauv raphaelauv commented Oct 30, 2021

close #19251

the new parameter will skip all external the secret backend for variables

so it will not skip the env variables set by AIRFLOW_VAR_FOO

and the variables inside the airflow db

and (if setup) a LocalFilesystemBackend

@raphaelauv raphaelauv marked this pull request as ready for review November 2, 2021 21:07
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one NIT: with the name

key: str,
default_var: Any = __NO_DEFAULT_SENTINEL,
deserialize_json: bool = False,
skip_external_backends: bool = False,
Copy link
Member

@potiuk potiuk Nov 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we rename it to "skip_secret_backend"? I think this would be more consistent with the currently used naming

https://airflow.apache.org/docs/apache-airflow/stable/security/secrets/secrets-backend/index.html

For example we can have local FilesystemBackend (which is not external). Also it would indicate better the intentions.

What we want to do by adding the flag we want to explicitly read values that we know are not "secret" and we do not want to look for them in secret backend.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also plural suggests that we can have more than one "external/secret" backend configured - which we don't have (and will not likely have), so I suggest to use singular form.

Copy link
Contributor Author

@raphaelauv raphaelauv Nov 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the "problem" is that the LocalFilesystemBackend need to be setup as a secret_backend https://airflow.apache.org/docs/apache-airflow/stable/security/secrets/secrets-backend/local-filesystem-secrets-backend.html

but we don't want skip it cause it's not an "external" backend

so skip_secret_backend will not be very clear for people with a LocalFilesystemBackend setup

( that's why originally I come up with the name skip_external_backends )

what should we do ?

the documentation of the arg should precise the 3 not skipped backends ?

@raphaelauv raphaelauv changed the title Add a skip parameter to Variable to skip the secret backend(s) Add a skip parameter to Variable to skip the secret backend Nov 3, 2021
@ashb
Copy link
Member

ashb commented Nov 3, 2021

Not needed, possible via secrets backend configuration

:param variables_prefix: Specifies the prefix of the secret to read to get Variables.
If set to None (null), requests for variables will not be sent to GCP Secrets Manager
:type variables_prefix: str

@ashb ashb closed this Nov 3, 2021
@potiuk potiuk reopened this Nov 3, 2021
@potiuk
Copy link
Member

potiuk commented Nov 3, 2021

Not really. This one is for selective disabling of secret backends only for some cases which user chooses consciously to skip secrets querying (knowing that the data is not secret). This is an optimisation of cost mostly (many secret backends are paid by access)

@ashb
Copy link
Member

ashb commented Nov 3, 2021

@potiuk That strikes me as a very-hard to reason about feature -- depending on how the DAG accesses the variable means that you get different behaviour. If I do it one way the it will load the secret from the secrets store, but if I do it another way it will use the value in the env/DB.

This makes the runtime behaviour hard to reason about and makes the system as a whole hard to reason about.

This is a -1/veto from me, sorry.

If you still want this sort of feature @raphaelauv then I would suggest you write a custom secrets backend (subclassing the existing GCS backend) and make a decision based on the name/prefix of the variable. That way the source of a variable doesn't depend on how it is accessed in different DAGS.

@ashb ashb closed this Nov 3, 2021
@raphaelauv
Copy link
Contributor Author

The option is false by default. That mean only when user is using Variable.get() himself he can custom the behavior. So I don't see any hidden behavior.

Yes extending the custom_backend is an option , I still think that for the majority of the calls to the variables are for the airflow DB and not the custom secret backend.

@ashb
Copy link
Member

ashb commented Nov 3, 2021

My problem with this solution is making the decision something that each dag author has to do on a case-by-base basis, rather than a "global" setting that applies to a given variable (name) always.

@raphaelauv
Copy link
Contributor Author

raphaelauv commented Nov 3, 2021

Yes it's perflectly true that it's going to be boring to setup almost all the variables to

Variable.get("toto",skip_secret_backend=True)

a better option would be a prefix-pattern-rule for variables to lookup

so a new common option to backend_kwargs

"only_look_up_prefix_variables" : "secret_"

for Variable.get("secret_toto") the secret_backend would be call

and Variable.get("toto") the secret_backend would directly return None and the other backend would then be check

This would be great and explicit

@potiuk
Copy link
Member

potiuk commented Nov 3, 2021

Proposal: Let's continue the discussion in #19251

@nxhuy-github
Copy link

Hi @potiuk @ashb and @raphaelauv , do we have any news on this feature please? I feel like this PR was never merged yet.

@raphaelauv
Copy link
Contributor Author

this PR is close and will never be merge

@nxhuy-github
Copy link

nxhuy-github commented Apr 14, 2022

Ah, so we need to custom your secret backend the same way you did, if not, we need to accept the fact that these errors will always be there no matter what when we use Secret Manager, am I right, please? @raphaelauv

@raphaelauv
Copy link
Contributor Author

raphaelauv commented Apr 14, 2022

It depend the implementation of the secret_backend , the secret_backend part of the airflow providers log errors , you just have to code something better/different

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants