-
Notifications
You must be signed in to change notification settings - Fork 16.4k
AIP-72: Improving Operator Links Interface to Prevent User Code Execution in Webserver #46613
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Should something like this pass: I do not really understand what: means. |
task_sdk/src/airflow/sdk/definitions/_internal/abstractoperator.py
Outdated
Show resolved
Hide resolved
task_sdk/src/airflow/sdk/definitions/_internal/abstractoperator.py
Outdated
Show resolved
Hide resolved
task_sdk/src/airflow/sdk/definitions/_internal/abstractoperator.py
Outdated
Show resolved
Hide resolved
…erator, some refactoring, signature of GOL to BOL
task_sdk/src/airflow/sdk/definitions/_internal/abstractoperator.py
Outdated
Show resolved
Hide resolved
ashb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM.
1 question about how we should handle global links (I really would like to avoid plugin manager in the SDK if we can, it will make dependencies a nightmare in the medium term if we start using it there)
|
Update:
Whats left:
|
|
A lot of core tests are failing, working on fixing them |
|
Last few tests to go, almost there! |
|
Yeah finally!! 🥇 @ashb could you take a look again at this one? |
|
Added some additional information about the design et al in the last self review. |
…tion in Webserver (apache#46613) Operator Links interface changed to not run user code in Airflow Webserver The Operator Extra links, which can be defined either via plugins or custom operators now do not execute any user code in the Airflow Webserver, but instead push the "full" links to XCom backend and the value is again fetched from the XCom backend when viewing task details in grid view. Example: ``` @attr.s(auto_attribs=True) class CustomBaseIndexOpLink(BaseOperatorLink): """Custom Operator Link for Google BigQuery Console.""" index: int = attr.ib() @Property def name(self) -> str: return f"BigQuery Console #{self.index + 1}" @Property def xcom_key(self) -> str: return f"bigquery_{self.index + 1}" def get_link(self, operator, *, ti_key): search_queries = XCom.get_one( task_id=ti_key.task_id, dag_id=ti_key.dag_id, run_id=ti_key.run_id, key="search_query" ) if not search_queries: return None if len(search_queries) < self.index: return None search_query = search_queries[self.index] return f"https://console.cloud.google.com/bigquery?j={search_query}" ```
closes: #46609
Why?
If you want to add further links to operators you can define them via a plugin or provider package. Extra links will be displayed in task details page in Grid view.
Currently, users define operator links in one of two ways:
This feature doesn't work with Airflow 3, so while porting this to task sdk so it can be used by users, we decided to simplify the extra links code-flow in a way to also reduce / decouple the user code that runs using
get_linkof a operator extra link in the airflow webserver to outside the context of the webserver.What?
To get this working, a simple interface change is made and the following things get easier:
For example:
finalizeing), it requests to push an XCOM which contains the "entire" link for the extra link with the xcom key and task id.Ex:
The link in this button comes by reading from xcom instead of running user code
How?
Changes of note:
BaseOperatorLink Model
Introducing a new
xcom_keydatafield that carries the xcom_key with which the entire link will be pushed.Introduced a new slim interface that we will use during deserialisation which can return the link from xcoms instead of user defined "get_link".
DAG serialisation/ deserialisation
We need something in the serdag to say what links a task has, which we already have but the format is overly complex.
Difference in formats (then vs now):
Example of how its stored in the serialised dag now:
Abstract Operator
get_extra_linksin task sdk now follows the new function signatureTesting
DAG:
The link elements:
XCOMS pushed:

Logs:

Last few lines:
Plugin:
DAG:
The link elements:
XComs pushed:

Logs:

Last few logs lines:
Whats pending?
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rstor{issue_number}.significant.rst, in newsfragments.