Add docs and example dag for AWS Glue #22295
The PR is likely OK to be merged with just a subset of tests for the default Python and Database versions, without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full test matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.
We can have fewer lines of code with the Python decorator.
We really should be using the TaskFlow API where possible. There was an effort a while back to transition the example DAGs to this approach. Related to #9415
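For readers unfamiliar with the two styles, here is a minimal sketch of the same callable wrapped first in a classic PythonOperator and then with the TaskFlow `@task` decorator. It assumes Airflow 2.x import paths; the DAG id, task ids, and the `greet` function are hypothetical and do not come from this PR.

```python
from datetime import datetime


def greet(name: str) -> str:
    """Plain Python callable shared by both task styles (hypothetical logic)."""
    return f"hello {name}"


def build_dag():
    """Build a DAG that wraps greet() in both styles. Assumes Airflow 2.x."""
    # Imported here so greet() can be exercised without Airflow installed.
    from airflow import DAG
    from airflow.decorators import task
    from airflow.operators.python import PythonOperator

    with DAG(
        dag_id="example_taskflow_sketch",  # hypothetical DAG id
        start_date=datetime(2022, 1, 1),
        catchup=False,
    ) as dag:
        # Classic style: the callable and its arguments are handed to an operator.
        PythonOperator(
            task_id="greet_classic",
            python_callable=greet,
            op_kwargs={"name": "glue"},
        )

        # TaskFlow style: the decorator turns the function itself into a task.
        @task
        def greet_taskflow(name: str) -> str:
            return greet(name)

        # Calling the decorated function inside the DAG context registers the task.
        greet_taskflow("glue")

    return dag
```

In a real DAG file you would assign `dag = build_dag()` at module level so the scheduler can discover it.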
Agreed.
Sure, folks, I will make the change :)
potiuk left a comment
As @josh-fell mentioned, using @task is much better than using PythonOperator :)
1ffdcd6 to ca0b916
josh-fell left a comment
Looks awesome! Just a small TaskFlow API update to use the .output property of operators.
Suggested change:
    - run_id="{{ ti.xcom_pull(task_ids='submit_glue_job') }}",
    + run_id=submit_glue_job.output,
Another perk of the TaskFlow API: using the .output property or XComArgs. This is a functional abstraction over the classic "{{ ti.xcom_pull(...) }}" approach to pull XComs from operators.
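To make the difference concrete, here is a rough sketch contrasting the Jinja string with `.output`. It assumes Airflow 2.x and uses BashOperator as a stand-in for the Glue operator and sensor, so everything except the `submit_glue_job` task id and the template string is hypothetical.

```python
from datetime import datetime


def classic_template(task_id: str) -> str:
    """Return the Jinja string the classic approach passes to a templated field."""
    return "{{ ti.xcom_pull(task_ids='" + task_id + "') }}"


def build_dag():
    """Build a DAG that consumes an upstream XCom both ways. Assumes Airflow 2.x."""
    # Imported here so classic_template() can be exercised without Airflow installed.
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="example_output_sketch",  # hypothetical DAG id
        start_date=datetime(2022, 1, 1),
        catchup=False,
    ) as dag:
        # Stand-in for the Glue job operator: BashOperator pushes the last
        # line of stdout as its XCom return value.
        submit_glue_job = BashOperator(
            task_id="submit_glue_job",
            bash_command="echo jr_example",
        )

        # Classic style: a template string, with the dependency declared by hand.
        wait_classic = BashOperator(
            task_id="wait_classic",
            bash_command="echo $RUN_ID",
            env={"RUN_ID": classic_template("submit_glue_job")},
        )
        submit_glue_job >> wait_classic

        # .output style: an XComArg in a templated field is resolved at run
        # time, and Airflow sets the upstream dependency from it.
        BashOperator(
            task_id="wait_output",
            bash_command="echo $RUN_ID",
            env={"RUN_ID": submit_glue_job.output},
        )

    return dag
```

The `.output` form also avoids repeating the upstream task id as a string, so renaming the task cannot silently break the reference.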
Sure, updated the PR with this suggestion :)
Note: I also removed the header from the example CSV file since it was causing issues in the Glue job, so when you see that in the diff, it was intentional 👍
ca0b916 to 17116bb
Part of a project to add, simplify, and standardize AWS sample DAGs and docs in preparation for adding System Testing.
related: #21828
related: #22010
related: #21920