Add dbt Cloud provider #20998
@nathaniel-may @emmyoop @leahwicz Can you look at it?
@mik-laj These people are dbt-core contributors and do not actively work on the dbt Cloud API. What did you want someone from dbt Labs to look into specifically for this pull request?
@sungchun12 dbt is used quite often with Apache Airflow, so I believe it is worth asking for reviews to give you an opportunity to share your thoughts on this contribution. The earlier changes are introduced, the lower the cost of implementing them. We also sometimes do not know all of an API's features, as they may not be widely promoted even though they are important from your perspective. For example, all requests to Google APIs contain client info, which allows Google to track API usage by a specific solution (see airflow/airflow/providers/google/common/hooks/base_google.py, lines 340 to 352 in d353f02).
The Snowflake provider has a similar feature.
On the other hand, it can also be a signal for you that there is a new feature in a third-party project that you can promote, e.g. by updating the documentation: https://docs.getdbt.com/docs/running-a-dbt-project/running-dbt-in-production#using-airflow Does it make sense to you?
@mik-laj thanks a bunch for the follow up. I'm in conversations internally with the dbt Labs engineers to specify a User-Agent in the headers of the dbt Cloud API requests for tracking. I'll submit a pull request for the dbt Labs docs after this is merged! @josh-fell I'll follow up with you personally on the above.
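For readers unfamiliar with the idea, here is a minimal sketch of attaching an identifying User-Agent header to an outgoing API request so the service can attribute traffic to a specific integration. It uses only the Python standard library; the URL, the function name `build_tracked_request`, and the `example-airflow-provider/<version>` agent string are hypothetical and are not taken from this pull request.

```python
from urllib.request import Request


def build_tracked_request(url: str, client_version: str) -> Request:
    """Build (but do not send) a request that identifies the calling integration.

    The agent string format below is an assumption made for illustration only.
    """
    headers = {
        # Hypothetical identifier; a real integration would agree on this
        # value with the API owner so it can be recognized in server logs.
        "User-Agent": f"example-airflow-provider/{client_version}",
    }
    return Request(url, headers=headers)


# Hypothetical endpoint, used only to construct the request object.
request = build_tracked_request("https://cloud.example.com/api/v2/accounts/", "1.0.0")

# urllib normalizes header names with str.capitalize(), hence "User-agent".
print(request.get_header("User-agent"))
```

The request is only constructed here, never sent, so the sketch can be run without network access.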
Would you please rebase, @josh-fell?
@sungchun12 @mik-laj This is implemented now if you'd like to take a look. I also refactored the operator link slightly to not create ad hoc TaskInstances to align with #21285.
sungchun12 left a comment:
Thanks for adding the change Josh. I verified it's tracking on the dbt Cloud side in our logs!
Having a `cached_property` named `conn` alongside a `get_conn()` method, when the two don't return the same thing, could be confusing.
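The naming concern can be illustrated with a small standalone sketch. The classes below are hypothetical stand-ins, not the hook from this pull request, and the dictionaries only mimic "a session-like object" versus "a stored connection record"; the point is that `conn` and `get_conn()` look interchangeable to a caller but return different things, and that a more distinct property name removes the ambiguity.

```python
from functools import cached_property


class ConfusingHook:
    """Hypothetical hook where two similarly named members disagree."""

    def get_conn(self) -> dict:
        # Stand-in for a session-like object used to send requests.
        return {"kind": "session"}

    @cached_property
    def conn(self) -> dict:
        # Stand-in for a stored connection record (host, credentials, ...).
        return {"kind": "connection-record"}


class ClearerHook:
    """Same behavior, but the property name no longer mirrors get_conn()."""

    def get_conn(self) -> dict:
        return {"kind": "session"}

    @cached_property
    def connection(self) -> dict:
        return {"kind": "connection-record"}


hook = ConfusingHook()
# A reader could reasonably expect these to match; they do not.
print(hook.conn == hook.get_conn())
```

Renaming the property is only one possible resolution; renaming the method or documenting the difference would address the same concern.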
@mik-laj Are there any other nuances/features I should try to incorporate in the provider? The last suggestion was a great one.
potiuk left a comment:
I think it's a great DBT provider start. Just in time for Feb release.
The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
@josh-fell amazing work, thanks for driving this to the finish line!
This PR adds a new provider to interface with dbt Cloud via the dbt Cloud API. The provider includes:
- `DbtCloudHook`, which implements an abstraction for almost all of the available endpoints in the dbt Cloud API and inherits from the existing `HttpHook`.
- `DbtCloudRunJobOperator` and `DbtCloudGetJobRunArtifactOperator`, which trigger a dbt Cloud job and download a run artifact, respectively.
- `DbtCloudJobRunSensor`, to poll the status of a specific dbt Cloud job run.
- A `test_connection()` method in the `DbtCloudHook` for users to test connections prior to executing DAGs.
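The trigger-then-poll pattern that a run operator and a run sensor rely on can be sketched in plain Python. This is not the provider's code: `RunStatus`, `wait_for_run`, and the `check_interval` and `timeout` parameters are hypothetical names chosen for illustration, and the status values are placeholders rather than the dbt Cloud API's actual status codes.

```python
import time
from enum import Enum
from typing import Callable


class RunStatus(Enum):
    """Hypothetical run states; the real API defines its own values."""

    RUNNING = "running"
    SUCCESS = "success"
    ERROR = "error"


TERMINAL_STATUSES = {RunStatus.SUCCESS, RunStatus.ERROR}


def wait_for_run(
    get_status: Callable[[], RunStatus],
    check_interval: float = 0.0,
    timeout: float = 5.0,
) -> RunStatus:
    """Poll get_status until the run reaches a terminal state or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("run did not reach a terminal state in time")
        time.sleep(check_interval)


# Simulated status source standing in for repeated API calls.
statuses = iter([RunStatus.RUNNING, RunStatus.RUNNING, RunStatus.SUCCESS])
final_status = wait_for_run(lambda: next(statuses))
print(final_status.name)
```

Passing the status lookup in as a callable keeps the polling loop independent of any HTTP client, which makes it straightforward to exercise with a simulated sequence as above.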