Skip to content

Conversation

@SamWheating
Copy link
Contributor

@SamWheating SamWheating commented Aug 22, 2023

closes: #32839

(Copying my comment from the original issue)

We have two places to define permissions (manually via the UI/CLI and in code via DAG.access_control) and there's no stance on which one takes precedence, which leads to all sorts of unexpected behaviour (mostly around not cleaning up permissions enough, or cleaning up permissions too aggressively)

I would propose making the access control for a given DAG authoritative as long as it is explicitly set. This way, if a DAG was updated, removing access_control from a DAG would not trigger a cleanup, but setting access_control={} would.

So we have three cases. A DAG defined like this:

with DAG(
    dag_id="test",
    schedule="0 0 * * *",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    access_control={'role1': {'can_read'}, 'role2': {'can_read', 'can_edit', 'can_delete'}}
) as dag:

Would overwrite any existing dag-level permissions for this DAG, ensuring that the roles listed here are the only ones with permissions specific to this DAG.

While a DAG with explicitly empty access_contol, like this:

with DAG(
    dag_id="test",
    schedule="0 0 * * *",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    access_control={}
) as dag:

Would explicitly clear out all DAG-level permissions for this role.

While a DAG without any access_control defined:

with DAG(
    dag_id="test",
    schedule="0 0 * * *",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
) as dag:

Would just defer to the existing DAG-level roles in the database.

This presents its own hazards around dangling and stray permissions, but I think that any approach here is going to have some sort of downside. The important part is that the approach to DAG-level access is clearly-defined and well documented (neither of which is true at the moment).

TODO (before merging):

I'll do these things as a part of this PR, but I want to get early feedback on the approach before I spend too much time here.

  • Add documentation around the precedence of different DAG-level access controls.
  • Update and add tests

@boring-cyborg boring-cyborg bot added the area:webserver Webserver related Issues label Aug 22, 2023
@SamWheating SamWheating changed the title Proposal: treat dag-defined access_control as authoritative only if it exists Proposal: treat dag-defined access_control as authoritative if defined Aug 22, 2023
@SamWheating SamWheating marked this pull request as ready for review August 22, 2023 22:50
@SamWheating SamWheating requested a review from jhtimmins as a code owner August 22, 2023 22:50

if access_control:

if access_control is not None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the most important change in this PR - if access control will proceed into the following block if and only if access_control is a non-empty dict, whereas if access control is not None will proceed given any dict, including an empty one.

@SamWheating
Copy link
Contributor Author

cc @potiuk - follow-up on our conversation yesterday. Have a look at the description here and let me know if you agree.

@potiuk
Copy link
Member

potiuk commented Aug 23, 2023

Yes. I agree with the reasoning.

Rebasing now to fix problem with image builds from earlier (autocomplete from the phone was strange last night :))

@SamWheating SamWheating requested a review from potiuk as a code owner August 23, 2023 23:39
@potiuk potiuk changed the title Proposal: treat dag-defined access_control as authoritative if defined Treat dag-defined access_control as authoritative if defined Aug 24, 2023
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Needs static code check /docs fixing. I like the docs and explanation and finally a bit more clarity on what the users should expect.

:start-after: [START howto_no_fine_grained_access_control]
:end-before: [END howto_no_fine_grained_access_control]

- Removing the access_control block from a DAG altogether (or setting it to `None`) can leave dangling permissions.
Copy link
Member

@potiuk potiuk Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. This is a side effect that we can accept.

One small comment. Maybe a good idea would be to also accompany it with a new CLI command airflow dags clean-permissions and a new API call (eventually) to do it. Those can be separate PRs, but having them explicitly spelled out as way to clean permissions on a DAG that previously had permissions is a way to workaround the problem with dangling permissions.

I see that as an edge case, with this change there is already a way to clean permissions by setting access_control to {} - but I can see a situation that somoene would like to recover from the historical changes.

Copy link
Contributor

@huymq1710 huymq1710 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Thank you very much

@potiuk
Copy link
Member

potiuk commented Aug 24, 2023

Some static checks :(, Recommend installing pre-commit. Also breeze statich-checks --only-my-changes is helpful.

@SamWheating
Copy link
Contributor Author

breeze static-checks --only-my-changes

Good to know, I was just running ruff + black checks but it looks like there's a few more failing 👀 I'll use this command in the future.

@potiuk
Copy link
Member

potiuk commented Aug 24, 2023

breeze static-checks --only-my-changes

Good to know, I was just running ruff + black checks but it looks like there's a few more failing 👀 I'll use this command in the future.

Highly recommend just pre-commit install. I generally almost never run any static checks manually for airflow. It happens all automagically with pre-commits, so I just don't think about them and fix them (or get them fixed automatically) when I commit stuff.

@potiuk
Copy link
Member

potiuk commented Aug 24, 2023

Still small thing in code blocks :(

@potiuk potiuk merged commit 370348a into apache:main Aug 24, 2023
@potiuk
Copy link
Member

potiuk commented Aug 24, 2023

🎉

@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Aug 28, 2023
ephraimbuddy pushed a commit that referenced this pull request Aug 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:webserver Webserver related Issues type:bug-fix Changelog: Bug Fixes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DAG-level permissions set in Web UI disappear from roles on DAG sync

4 participants