New Optional dbt Cloud Job Operator Params #45634
Conversation
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
@joellabes another MR related to the dbt Cloud Operator from our side - could you have a look and let us know if you are fine with the approach?
Hi @ginone, great proposal and change. In our Airflow environment we use our own helper and retrieve job_ids by name. We found that the payload from the dbt Cloud REST API is quite big to retrieve for every dbt Cloud job trigger, so to reduce the payload size we additionally filter the REST API calls by project name.
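For illustration, a minimal sketch of that kind of server-side filtering; it assumes the dbt Cloud Administrative API v2 list-jobs endpoint accepts a project_id query parameter, and the helper name, base URL and token handling are placeholders rather than anything from this PR:

```python
from __future__ import annotations

import requests


def list_jobs_for_project(account_id: int, project_id: int, token: str) -> list[dict]:
    # Filtering by project on the server side keeps the response payload small,
    # instead of listing every job in the account and filtering client-side.
    response = requests.get(
        f"https://cloud.getdbt.com/api/v2/accounts/{account_id}/jobs/",
        headers={"Authorization": f"Token {token}"},
        params={"project_id": project_id},  # assumed query parameter for server-side filtering
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["data"]
```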
@pikachuev thanks for the feedback! If you look at the code, I decided to only use … Now that you mentioned it, I think it would make sense to use …
Apart from these very minor comments below, I have been thinking about one more approach.
Replace:
self.job_id = job_id
self.project_name = project_name
self.environment_name = environment_name
self.job_name = job_name
with*:
self.job = job
self.project = project
self.environment = environment
where all these parameters can be either string or int.
- if job is int - that's all we need
- if it's string - we need project and environment
- if either of these two is int - it's passed directly to get_job_by_name as a resource ID
- if either of these two is string - the resource ID is retrieved by calling get_<project | environment>_by_name
This way we allow for more flexible setup (e.g. job by name, environment and project by id). Ofc instead of checking the type - we could keep separate parameters explicitly (== have <resource>_id and <resource>_name params for all three resources), but then we would need to either:
- add logic for mutually exclusive params
- document that <resource>_id always has higher priority than <resource>_name
*for backward compatibility we would need to keep the job_id parameter, but we could mark it as deprecated
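A rough sketch of that dispatch logic follows. Only get_job_by_name is named in this thread; get_project_by_name and get_environment_by_name, and all signatures here, are assumptions for illustration rather than the hook's actual API:

```python
from __future__ import annotations


def resolve_job_id(hook, job: str | int, project: str | int, environment: str | int) -> int:
    if isinstance(job, int):
        # An int job is already the resource ID - nothing else is needed.
        return job
    # A string job name needs the project and environment to disambiguate it.
    project_id = project if isinstance(project, int) else hook.get_project_by_name(project)
    environment_id = (
        environment if isinstance(environment, int) else hook.get_environment_by_name(environment)
    )
    return hook.get_job_by_name(
        project_id=project_id, environment_id=environment_id, job_name=job
    )
```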
I'm sorry that it's necessary (and have linked this internally for us to work on a better strategy!) but I think it's a sensible workaround for now, and is a pattern we've seen customers use before 👍
@josh-fell @Lee-W thank you for the code review, I've addressed all the feedback. Could you please look again? BTW, what do you think regarding @jaklan's suggestion (#45634 (review))? My thoughts:

self.job: str | int,
self.project: str | int,
self.environment: str | int,

I think the params should be explicit, so I would go with:

self.job_id: int,
self.project_id: int,
self.environment_id: int,
self.job_name: str,
self.project_name: str,
self.environment_name: str,
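For illustration, one way the mutual exclusivity between job_id and the name-based parameters could be enforced; a sketch built around the parameter names above, not the operator's actual validation code:

```python
from __future__ import annotations


def validate_job_selector(
    job_id: int | None = None,
    project_name: str | None = None,
    environment_name: str | None = None,
    job_name: str | None = None,
) -> None:
    """Accept either job_id or the full name triple, but never a mix of both."""
    names = (project_name, environment_name, job_name)
    if job_id is not None and any(v is not None for v in names):
        raise ValueError("Provide either job_id or the name-based parameters, not both.")
    if job_id is None and not all(v is not None for v in names):
        raise ValueError(
            "When job_id is not set, project_name, environment_name and job_name are all required."
        )
```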
New PR that includes the changes described above: #46184. Feel free to select just one of my PRs and close the other one.
Hi, sorry for the late reply. I prefer this idea more. It would be great if you could rebase from the main branch and resolve the conflict, then I'll review it again. Thanks a lot!
@Lee-W rebasing is done, please have a look again. Thanks!
Lee-W left a comment:
Looks good to me. Thanks for your prompt reply. I'll keep it open for one to two days so others can take a look.
Fixed one incorrect "include" path in docs. Checks should be all green now. Thanks!
Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.
* Update DbtCloudRunJobOperator and corresponding tests
* update documentation
* Fixes after manual tests
* add unit test for duplicate names
* Enhance DbtCloudHook to support name filtering in list_environments and list_jobs methods
* Improve clarity by renaming variables and enhancing comments
* Refactor dbt Cloud hook and operator error handling & update method signatures to use keyword-only arguments
* Correct get_job_by_name docstring
* Fix include path for dbt Cloud documentation

Co-authored-by: Marcin Sitnik <marcin.sitnik@roche.com>
Summary
Allow triggering dbt Cloud jobs using project_name, environment_name, and job_name instead of job_id. It is not a breaking change, but an optional, alternative way of triggering a job - both options will work, as sketched below.

This is beneficial in dynamically configured environments (e.g. managed by Infrastructure as Code) when the job_id is not known upfront (or may change over time) and therefore hardcoding it is not convenient.
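A minimal sketch of the two equivalent ways to trigger a job, assuming the parameter names described in this summary; the DAG settings, connection id, job id and names below are placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.dbt.cloud.operators.dbt import DbtCloudRunJobOperator

with DAG(dag_id="dbt_cloud_example", start_date=datetime(2025, 1, 1), schedule=None):
    # Existing behaviour: trigger by the numeric job_id.
    run_by_id = DbtCloudRunJobOperator(
        task_id="run_dbt_job_by_id",
        dbt_cloud_conn_id="dbt_cloud_default",
        job_id=12345,  # placeholder ID
    )

    # Alternative added by this PR: trigger by names; the job_id is resolved at runtime.
    run_by_name = DbtCloudRunJobOperator(
        task_id="run_dbt_job_by_name",
        dbt_cloud_conn_id="dbt_cloud_default",
        project_name="analytics",  # placeholder names
        environment_name="production",
        job_name="daily_run",
    )
```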
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.