-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Check if sqlalchemy_scheme extra contains forbidden characters #31984
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pankajkoti
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but have a query inline.
| or self.DEFAULT_SQLALCHEMY_SCHEME | ||
| ) | ||
| extra_scheme = self.connection_extra_lower.get("sqlalchemy_scheme") | ||
| if not self._sqlalchemy_scheme and extra_scheme and (":" in extra_scheme or "/" in extra_scheme): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the user has set sqlalchemy_scheme via constructor but wants to override via extras, will we not allow it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that was a little doubt I had myself and it's a very good point you raised. I deliberately chose to fail it even if it is not used.
I think it's better to hard-fail in case you have "wrong" extra defined - even if it is not used. The thing is that you could have it defined before with ; for whatever reason. But now, if you continue having it there, it makes it useless:
- if you have it in constructor, it will not be used
- if you do not have it in constructor, it will raise an exception
So basically, there is no point whatsoever, to keep it in your extra in case you have it. It will not be used anyway.
Better to fail it straight away if you have such bad extra, so that you can fix it quickly.
I am a big fan of failing hard rather than raising warnings, in such case. While it might be seen as "tough", it's super easy to recover from such error (just remove the bad extra in one connection and you are done. Having just warning about it will have no effect in many cases, because people will not do anything about the warnings, and then it might hit you in the worst time possible, when someone will write a super-important DAG, and put it in production and it will start to fail. It's better to immediately fail such bad configuration at the moment of migration to the new provider version. This time you are prepared to fix things if needed and you chose the time when to do it, and you have also option to go back to previous provider. Once you already migrated and some of your users started to use the new provider features already and then suddenly you find that something does not work when someone does not add the scheme in their constructor, you do not have the easy "recovery" by downgrading the provider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes complete sense, thanks for the detailed explanation.
| extra_scheme = self.connection_extra_lower.get("sqlalchemy_scheme") | ||
| if not self._sqlalchemy_scheme and extra_scheme and (":" in extra_scheme or "/" in extra_scheme): | ||
| raise RuntimeError("sqlalchemy_scheme in connection extra should not contain : or / characters") | ||
| return self._sqlalchemy_scheme or extra_scheme or self.DEFAULT_SQLALCHEMY_SCHEME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the general precedence of considering connection parameters when there are multiple ways to set a parameter?
I am under the impression that extras > constructor/hook params > default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, yes, and it also surprised me a bit. But I did not want to add even more somewhat breaking changes at the same time. One breaking thing at a time is breaking enough (even thought this one is technically a bugfix, not a breaking change).
But I can be convinced otherwise if others things it would be better to make it a breaking change as well and change the sequence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the user has set sqlalchemy_scheme via constructor but wants to override via extras, will we not allow it?
why does he need to override a very flexible value provided to the constructor by an extra provided in very less flexible way (connection).
If he wants to use the connection scheme he can skip providing a value to the constructor, and if he want to override the connection scheme, he can provide the new value to the constructor.
We can easily create 10 instances of this hook, but we can easily create and manage 10 different connection. So for me it's perfect like that:
constructor/hook params > connection extras > default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, not sure, how users use it. I saw it in a few other places where extras take precedence over constructor/hook params and hence was curious to understand. Changing the hook/constructor params would need DAG/code changes and the need to update that, whereas changing the extras would not need code changes and more or less like changing an environment variable. Also across environments, they can provide different values for the same DAGs with different values in the extra and have a DAG level default hook param and not override with extras.
No strong opinion, just wanted to understand :)
| extra_scheme = self.connection_extra_lower.get("sqlalchemy_scheme") | ||
| if not self._sqlalchemy_scheme and extra_scheme and (":" in extra_scheme or "/" in extra_scheme): | ||
| raise RuntimeError("sqlalchemy_scheme in connection extra should not contain : or / characters") | ||
| return self._sqlalchemy_scheme or extra_scheme or self.DEFAULT_SQLALCHEMY_SCHEME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the user has set sqlalchemy_scheme via constructor but wants to override via extras, will we not allow it?
why does he need to override a very flexible value provided to the constructor by an extra provided in very less flexible way (connection).
If he wants to use the connection scheme he can skip providing a value to the constructor, and if he want to override the connection scheme, he can provide the new value to the constructor.
We can easily create 10 instances of this hook, but we can easily create and manage 10 different connection. So for me it's perfect like that:
constructor/hook params > connection extras > default
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rstor{issue_number}.significant.rst, in newsfragments.