Skip to content

Conversation

@tirkarthi
Copy link
Contributor

We extensively use log grouping in Airflow 2. While looking at implementation of grouping through a prism plugin to support colors I found about <details> and <summary> tag which has good browser support supporting almost all browsers in last 5 years.

The implementation is to keep the group name under <details> and to have the group's log content under <summary> section which can be clicked to expand like Airflow 2. The PR involves looking for "::group::" in the log line to indicate the log group start marker and collect the subsequent lines till line with "::endgroup::" is reached and wrap them up in the <details> tag as a section. The code logic and styling around this can be improved but I felt this is a good enough to open a PR for discussion. This also makes the implementation simpler compared to Airflow 2.

Docs : https://developer.mozilla.org/en-US/docs/Web/HTML/Element/details

Notes to reviewer and self

  1. It seems "::group::" is the documented approach though code also supports "##group##" . Do we need to support this?
  2. I just used split string by "::group::" and capture the part after marker as name instead of regex to extract group name for simplicity but it can be made as a regex.

image

@boring-cyborg boring-cyborg bot added the area:UI Related to UI/UX. For Frontend Developers. label Feb 17, 2025
@jscheffl
Copy link
Contributor

It seems "::group::" is the documented approach though code also supports "##group##" . Do we need to support this?

The syntax with :: is the Github style. The variant with ## is the way how ADO (MS AzureDevOps Pipelines) handle log grouping. As I was contributing this initially I wanted to make it compatible to both. If this is major complexity we can re-discuss. Else I'd propose to keep it.

I just used split string by "::group::" and capture the part after marker as name instead of regex to extract group name for simplicity but it can be made as a regex.

I think this is OK.

Copy link
Contributor

@jscheffl jscheffl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR Looks promising. Do you have a DAG which might be good for testing?

Before I manually test, does it support nested grouping? (Github does not and I was always proud Airflow does. Code looks like but have not checked.

When I made the implementation in Airflow 2 I made some tests that a browser is not killed by OOM or freezing if the logs are long. Have you tested with a 100MB log file with some groups before / after? Any performance difference that you could see?

(If you have positive answers then I don't need to test myself :-D)

@tirkarthi
Copy link
Contributor Author

Thanks @jscheffl for the review. Glad to have the points since you implemented it in Airflow 2.

The syntax with :: is the Github style. The variant with ## is the way how ADO (MS AzureDevOps Pipelines) handle log grouping. As I was contributing this initially I wanted to make it compatible to both. If this is major complexity we can re-discuss. Else I'd propose to keep it.

Supporting Azure would involve using regex and another branch along with handling of mix match between github and azure which will make this implementation complex.

Before I manually test, does it support nested grouping? (Github does not and I was always proud Airflow does. Code looks like but have not checked.

Nested groups are unfortunately not supported at the moment since I go through each line and then add it to an array to empty it once the group ends. Nested groups might take something like a stack of lines to push and pop along with embedding the summary and details tag correctly inside. I have added a todo about the same.

I used the below dag to generate logs of various sizes from 1 to 80 MB and I don't see unexpected delay other than few seconds of delay to render the larger log files in the UI along with downloading. Speaking of this there used to be a limit for larger logs in old UI with a download button to view it. Download button was also handy tool for offline views and analysis which is missing in the new UI currently.

I guess this is good enough for an initial implementation and can always be iterated through later since we use this feature very much internally to handle logs.

from datetime import datetime
import uuid

from airflow import DAG
from airflow.decorators import task


with DAG(
    dag_id="log_grouping",
    start_date=datetime(2025, 2, 1),
    catchup=False,
    schedule=None,
) as dag:

    @task
    def generate(**context):
        line_limit = int(context["params"]["line_limit"])
        group_limit = int(context["params"]["group_limit"])

        print("::group::group name")

        for index in range(group_limit):
            print(f"::group::inner group name {index}")
            for index in range(line_limit):
                print(" ".join([uuid.uuid4().hex] * 1))
            print("::endgroup::")

        print("::endgroup::")

    generate()

@bbovenzi
Copy link
Contributor

I think we need to make sure this is compatible with the log handler changes here: #46827

@tirkarthi
Copy link
Contributor Author

Thanks, whichever gets merged first I can try to fix the compatibility issues. I guess color highlighting based on keywords is also easier since I just need to add color attribute to Text based on keyword present in the line which can be done in a follow up PR before release.

@jscheffl
Copy link
Contributor

I'd for sure strongly nested grouping - but I would assume this can be added later as well. I think this is a very big shortcoming in Github and I was a big fan that Airflow had this in.

Airflow 2 also went over the logs line-by-line. Don't know if you checked the old implementation. Do you think this could be factored-in?

@tirkarthi
Copy link
Contributor Author

Airflow 2 also went over the logs line-by-line. Don't know if you checked the old implementation. Do you think this could be factored-in?

Thanks, I checked it. From what I understand in Airflow 2 manually created span tags appending to each other and used dangerouslySetInnerHTML to set the manually created html to div. This is simpler and doesn't need that. There should be a way to take nested groups through recursion but I cannot get the correct approach. I would be okay for nested implementation to be added in another PR to not block this PR.

Copy link
Contributor

@jscheffl jscheffl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accept it as intermediate state. Hoping that nested grouping is added later. Don't know when I'd be ablde to make it myself... so if somebody else would be able...

I assume approach would be simplest by implementing via a recusive call.

@bbovenzi
Copy link
Contributor

Let's rebase.

@tirkarthi
Copy link
Contributor Author

I rebased and with structlog rendering I guess I might need to start from scratch on the implementation :( renderStructuredLog handles logMessage which could be both string and StructuredLogMessage . Is there a way to see from response headers or content if the log response is a task-sdk log and a normal log to switch between implementations?

@bbovenzi
Copy link
Contributor

I rebased and with structlog rendering I guess I might need to start from scratch on the implementation :( renderStructuredLog handles logMessage which could be both string and StructuredLogMessage . Is there a way to see from response headers or content if the log response is a task-sdk log and a normal log to switch between implementations?

I was afraid that change was going to be hard to rebase against.

@Lee-W Do you know the answer to this question?

I think regardless we need to check the logMessage string or logMessage.event string if it includes ::group::

@Lee-W
Copy link
Member

Lee-W commented Feb 28, 2025

Is there a way to see from response headers or content if the log response is a task-sdk log and a normal log to switch between implementations?

In the latest change, we make StructuredLogMessage starts like this

           StructuredLogMessage(event="::group::Log message source details", sources=source_list),  # type: ignore[call-arg]
            StructuredLogMessage(event="::endgroup::"),

This logic is implemented in airflow/utils/log/file_task_handler.py so I'm not sure what do you mean by task-sdk and normal ones 🤔

@tirkarthi
Copy link
Contributor Author

@Lee-W Sorry, by normal log I meant the Airflow 2.x logs which were mostly just string.

@Lee-W
Copy link
Member

Lee-W commented Mar 1, 2025

@Lee-W Sorry, by normal log I meant the Airflow 2.x logs which were mostly just string.

I think we'll need to check the content like Brent suggested

@tirkarthi
Copy link
Contributor Author

Ok, I rebased and tried an approach here which would be to let renderStructuredLog form the elements. There is a package react-innertext through which the text could be obtained. Then I just continue processing like the earlier approach on strings with value from innerText to get beginning and end of the group markers. I couldn't add background color differentiation since the renderStructuredLog already returns HTML Element and then wrapping it again in Text shows React warning.

Current screenshot

image

@tirkarthi
Copy link
Contributor Author

Strangely pnpm test fails but pnpm test src/pages/TaskInstance/Logs/Logs.test.tsx passes locally. Is there something I am missing while running tests in parallel?

@tirkarthi
Copy link
Contributor Author

The default timeout for waitFor is 1000ms (1 second) which might be causing random failures. Increased it to 10s and it seems to be fixed.

https://testing-library.com/docs/dom-testing-library/api-async/#waitfor

@bbovenzi
Copy link
Contributor

bbovenzi commented Mar 3, 2025

I think the screenshot you have there looks good.

Nice catch on the timeout.

@bbovenzi
Copy link
Contributor

bbovenzi commented Mar 3, 2025

Although it looks like sometimes we just repeat the log group message

Screenshot 2025-03-03 at 9 43 59 AM

@tirkarthi
Copy link
Contributor Author

Task sdk logs render the event which has group name along with any other key=value pair as single string so while parsing this after the concatenation looks like the actual log group name. The concatenation and grouping are done in different functions. Not sure how to fix it.

Copy link
Contributor

@bbovenzi bbovenzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebase and otherwise lgtm

@pierrejeambrun pierrejeambrun merged commit e49d796 into apache:main Mar 5, 2025
34 checks passed
@pierrejeambrun pierrejeambrun added the AIP-38 Modern Web Application label Mar 5, 2025
shahar1 pushed a commit to shahar1/airflow that referenced this pull request Mar 5, 2025
* Add log grouping using summary and details tag.

* Add tests.

* Group logs after render elements for the log is processed.

* Fix tests and update log response payload.

* Increase timeout from 1s to 10s.
nailo2c pushed a commit to nailo2c/airflow that referenced this pull request Apr 4, 2025
* Add log grouping using summary and details tag.

* Add tests.

* Group logs after render elements for the log is processed.

* Fix tests and update log response payload.

* Increase timeout from 1s to 10s.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AIP-38 Modern Web Application area:UI Related to UI/UX. For Frontend Developers.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants