Databricks SQL Sensor #30428
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
@alexott @bilalaslamseattle @janvandervegt @o-nikolas @josh-fell - thank you for taking the time to review my PR. I have made all the changes requested for PR #30204 here. I had to open a new one because of GitHub handles. Can you please review once more and confirm if the changes look good? Thank you!
@potiuk @josh-fell it looks like there are problems with Docker auth - is this error critical?
alexott
left a comment
small changes are necessary + comments from Bilal
Not sure if critical, but for sure intermittent (and likely a failure on GitHub's side). Rebasing should trigger a re-run of the job.
78dcf62 to e8217dd
@alexott @bilalaslamseattle I have made all the changes requested. Can you please review again?
bilalaslamseattle
left a comment
LGTM
d58fd51 to f297270
I think for the naming convention, it's better if we use _get_hook without @cached_property or if we rename the method to hook which is the name of the property. WDYT? (I prefer renaming it to hook)
+1 for
@cached_property
def hook(self) -> DatabricksSqlHook:
    ...
Changed it.
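For readers following the thread, here is a minimal sketch of the `hook` cached-property pattern discussed above. It is not the code from this PR: `FakeSqlHook` and `ExampleSqlSensor` are hypothetical stand-ins (so the snippet runs without Airflow installed), and only the shape of the `@cached_property def hook` method mirrors the suggestion.

```python
from functools import cached_property


class FakeSqlHook:
    """Hypothetical stand-in for DatabricksSqlHook, for illustration only."""

    instances_created = 0  # counts how many hooks have been constructed

    def __init__(self, conn_id: str) -> None:
        FakeSqlHook.instances_created += 1
        self.conn_id = conn_id


class ExampleSqlSensor:
    """Hypothetical sensor exposing the hook as a cached property named `hook`."""

    def __init__(self, conn_id: str = "databricks_default") -> None:
        self.conn_id = conn_id

    @cached_property
    def hook(self) -> FakeSqlHook:
        # Built on first access, then stored on the instance and reused.
        return FakeSqlHook(self.conn_id)


sensor = ExampleSqlSensor()
first = sensor.hook   # constructs the hook
second = sensor.hook  # returns the cached object, no second construction
```

Naming the property `hook` (rather than `_get_hook`) means callers read it as an attribute, which matches how `cached_property` behaves.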
Was this file change intended as part of this PR?
This isn't part of the change. It got added when I rebased the PR with main.
@josh-fell not sure why it keeps happening, every time I rebase my branch with main, it keeps adding new files to my PR. Any suggestions to remove them and keep only my files?
You are probably making a mistake when rebasing, but it is hard to say what it is. You can follow the "--rebase" pattern - it should not add files. You can rebase with the GitHub UI (easier) or the command line (more powerful, and more arcane for those who are not used to the git CLI): see the "How to Rebase" section https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#id14
I recommend learning and mastering the CLI version; once you get used to it and do it often, it becomes a habit. Also, things like liquidprompt or oh-my-zsh and friends, which show the status of your git repo in the command prompt, are immensely helpful for seeing whether you are doing it right. Usually a rebase should result in only your commits being pushed on top of the current master.
Thank you @potiuk, I rebased from the UI a few times and I am not sure what went wrong. I can clean things up and open a new PR if needed.
Yeah, this seems like a recurring issue over the last few days https://apache-airflow.slack.com/archives/CCPRP7943/p1680639118757579 - created an issue in GitHub Support for that.
Thank you for letting me know @potiuk. Is there a remediation path to resolve the extra files in the PR?
I will follow what @hussein-awala did to clean this up.
There are still leftovers of unrelated changes that need to be cleaned up.
Since the DatabricksSqlHook and DatabricksSqlOperator are both in "databricks_sql.py" files, it probably makes sense to keep the file name convention for this new sensor file as well. Even though it's borderline non-compliant with AIP-21, consistency is probably worth having given they all deal with the same service.
Renamed it to databricks_sql.py to keep it consistent with the hook and operator files.
- raise AirflowException("Both HTTP Path and SQL warehouse name are not specified.")
+ raise AirflowException("Databricks SQL warehouse/cluster configuration missing. Please specify either http_path or sql_warehouse_name.")
The current exception might be misleading to some users. It reads as though both http_path and sql_warehouse_name need to be specified. WDYT?
Agree, added the change.
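As an illustration of the check being reworded here, the sketch below raises the suggested message only when neither value is given. It assumes that is the intended condition; `ConfigError` and `check_endpoint_config` are hypothetical names standing in for AirflowException and the sensor's own validation, so the snippet stays dependency-free.

```python
from typing import Optional


class ConfigError(Exception):
    """Hypothetical stand-in for AirflowException."""


def check_endpoint_config(
    http_path: Optional[str] = None,
    sql_warehouse_name: Optional[str] = None,
) -> None:
    """Raise if neither http_path nor sql_warehouse_name is provided."""
    # One of the two is enough; only the "neither given" case is an error.
    if not http_path and not sql_warehouse_name:
        raise ConfigError(
            "Databricks SQL warehouse/cluster configuration missing. "
            "Please specify either http_path or sql_warehouse_name."
        )


# Either argument on its own passes the check (values are placeholders).
check_endpoint_config(http_path="/example/http/path")
check_endpoint_config(sql_warehouse_name="example-warehouse")
```

The word "either" in the message makes it clear that supplying one of the two parameters is sufficient, which was the concern with the original wording.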
@hussein-awala @josh-fell pushed changes based on PR comments, can you please review again? Thank you for your review!
Databricks SQL Sensor.