Replace architecture diagram of Airflow with diagrams-generated one #36035
BasPH
left a comment
A few minor comments. Also, for a basic diagram I'd show only one user in the diagram, with an arrow to both the DAG files and the webserver.
I thought about it, and I think it's good to mention DAG Authors and Ops Users separately. They are part of our security model, and I think it would be great to keep them separated here, because essentially they are different users. I think there are a lot of misconceptions that the "users" accessing the DAGs and the UI are the same, but in fact all the security mechanisms, and often the actual UI users, are separate ones.

I will have some more iterations on that. I think we should rewrite more of our graphs, and I am planning to use the tool to add diagrams for our security model (and later for multi-tenancy), so maybe it's indeed a good idea to keep one user here and link to more "complex" variants of the architecture separately? WDYT?
I introduced two diagrams now: one basic, and one with a standalone DAG file processor. While it is not yet fully multi-tenant, this already gives some good properties (like the scheduler not having access to DAG files at all), and having this picture described now is a good idea, as it reflects the current architecture.

I also added a dashed "executors" line showing the link between the scheduler and the workers. It looks better in right-to-left form, and it also nicely shows the progression of things that happen with the tasks, with the scheduler to the left of the workers and triggerers.

I converted the script to run entirely in pre-commit and added a hash check so that it will not run unnecessarily, even in CI with [...]

I left two types of users for now. I still think it is a good idea even for the "basic" diagram. I also hope we will regenerate more diagrams using the same approach (Celery, Kubernetes, logging, etc.); they will be so much [...]
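The hash check mentioned above could look roughly like the sketch below. This is not the script from the PR: the function names, the choice of SHA-256, and the idea of storing the hash in a sidecar file are all assumptions made for illustration. It only shows the general idea of skipping regeneration when the diagram source has not changed.

```python
import hashlib
from pathlib import Path


def source_hash(source: Path) -> str:
    """Return the SHA-256 hex digest of the diagram source file."""
    return hashlib.sha256(source.read_bytes()).hexdigest()


def needs_regeneration(source: Path, hash_file: Path) -> bool:
    """Return True when the source differs from the last recorded hash."""
    stored = hash_file.read_text().strip() if hash_file.exists() else ""
    return source_hash(source) != stored


def regenerate_if_changed(source: Path, hash_file: Path, generate) -> bool:
    """Call generate(source) only when the source changed.

    `generate` is a hypothetical callable that renders the diagram.
    Returns True if the diagram was regenerated, False if it was skipped.
    """
    if not needs_regeneration(source, hash_file):
        return False
    generate(source)
    # Record the new hash only after generation succeeded.
    hash_file.write_text(source_hash(source) + "\n")
    return True
```

A pre-commit hook or CI step would call `regenerate_if_changed` once per diagram source, so an unchanged source costs one file read and one hash instead of a full render.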
OK. I added a few more touches, and the "DAG file processor" case is now much nicer and more cleanly shows what I wanted to show: the separation between the part where DAG files are actually parsed and executed and the part where they are not. In this case the separation between the users is also much more apparent, showing that the UI user has no influence on arbitrary DAG code execution while the DAG author does.

For me this is really the first stepping stone/diagram towards what we will explain in the future for the multi-tenancy architecture (which will mostly show how you can build Airflow from its building blocks in order to achieve a multi-tenant architecture, if you really want to). So I think it's worth gradually introducing this architecture and linking to it from our Security Model, which describes the details of those different types of users and their capabilities. I've also added links between the architecture and the security model, as I think this is a great way to educate users on the security implications of the architecture. WDYT @BasPH?
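To make the separation concrete, here is a rough sketch that builds a Graphviz DOT description by hand and checks which components each user can reach. It is not the diagrams-library script from the PR: the node names, the grouping into two zones, and the edges are assumptions pieced together from the comments above, and plain DOT text is used instead of the `diagrams` package so the example has no dependencies.

```python
# Hypothetical zones and edges, for illustration only.
ZONES = {
    "DAG code is parsed or executed": ["dag_file_processor", "workers", "triggerer"],
    "No DAG code is executed": ["scheduler", "webserver"],
}
EDGES = [
    ("dag_author", "dag_files", ""),
    ("dag_files", "dag_file_processor", ""),
    ("dag_files", "workers", ""),
    ("dag_files", "triggerer", ""),
    ("ui_user", "webserver", ""),
    # Dashed "executor" link between the scheduler and the workers.
    ("scheduler", "workers", "style=dashed, label=executor"),
]


def build_dot(zones=ZONES, edges=EDGES) -> str:
    """Render the zones as clusters and the edges as a left-to-right DOT graph."""
    lines = ["digraph airflow_sketch {", "  rankdir=LR;"]
    for index, (label, nodes) in enumerate(zones.items()):
        lines.append(f"  subgraph cluster_{index} {{")
        lines.append(f'    label="{label}";')
        lines.extend(f"    {node};" for node in nodes)
        lines.append("  }")
    for source, target, attrs in edges:
        suffix = f" [{attrs}]" if attrs else ""
        lines.append(f"  {source} -> {target}{suffix};")
    lines.append("}")
    return "\n".join(lines)


def reachable(start, edges=EDGES):
    """Return every node reachable from `start` by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for source, target, _ in edges:
            if source == node and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen
```

In this sketch `reachable("ui_user")` contains only the webserver, while `reachable("dag_author")` includes every component in the DAG execution zone, which is the property the two-user diagram is meant to show. The output of `build_dot()` can be passed to the Graphviz `dot` command to render it.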
cc: @feluelle -> this is also a result of what we talked about many months ago; the inspiration came from https://github.com/feluelle/airflow-diagrams :) I hope we can convert all the diagrams we have in Airflow to use it (and need a bit more familiarity with manipulating attributes of the nodes and edges to help graphviz come up with somewhat better layouts), so I hope we can tap into your experience there :D
I came up with a much nicer layout. I think it's very close to what I had in mind.
cc: @mhenc @vincbeck -> I think this very closely reflects the "trusted" / "untrusted" split we have always used when it comes to AIP-44. The nice thing is that when the Internal API reaches prime time and is introduced, the 2nd diagram will become way simpler, because it will get just an "Internal API" shielding the left side of the graph from the database (outside of the "DAG Execution" zone).
@BasPH - are you OK with keeping two users? I do feel it's much [...] I also quite like this one, where the DAGs are "flowing" from left to right: from the author to someone who sees the result of execution :)
The architecture diagram of Airflow has long been outdated. This is an attempt to regenerate it using Python's diagrams library (already used by some tools in our ecosystem).
