@jscheffl
Contributor

We use DockerOperator for executions outside the cloud, and compared to KubernetesPodOperator I see a lot of problems with how logs are emitted. This PR improves the situation:

  • When an image is pulled, the status of every layer is posted to the logs. This is mostly not of interest, so the PR wraps that output in a new log group.
  • As logs from the container are captured, log chunks are written to stdout for Airflow. But a chunk usually contains multiple lines, and when a lot of logs are emitted, the task_log_handler tries at retrieval time to sort messages by timestamp, fails if the Docker logs contain numeric content, and totally scrambles the result. With this PR, logs are split by line and the Python logger prefix is added consistently to each line, as in KubernetesPodOperator.
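The line-splitting idea in the second bullet can be illustrated with a minimal sketch. This is not the code from this PR: `iter_log_lines`, `log_container_output`, and the logger name are hypothetical, and the chunk stream is assumed to be an iterable of `bytes` such as a container log stream might yield. It only shows how multi-line chunks can be re-split so that each line becomes its own log record with a consistent prefix.

```python
import codecs
import logging
from typing import Iterable, Iterator, List

# Hypothetical logger name, for illustration only.
log = logging.getLogger("docker_operator_sketch")


def iter_log_lines(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield complete text lines from a stream of byte chunks.

    A chunk may hold several lines or end in the middle of one, so the
    unfinished tail is buffered until its newline arrives.
    """
    # The incremental decoder handles multi-byte characters split across chunks.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    buffer = ""
    for chunk in chunks:
        buffer += decoder.decode(chunk)
        # Everything before the last "\n" is complete; the rest stays buffered.
        *complete, buffer = buffer.split("\n")
        for line in complete:
            yield line.rstrip("\r")
    buffer += decoder.decode(b"", final=True)
    if buffer:
        # Emit a trailing line that had no terminating newline.
        yield buffer.rstrip("\r")


def log_container_output(chunks: Iterable[bytes]) -> List[str]:
    """Log each container line as its own record and return the lines."""
    lines = []
    for line in iter_log_lines(chunks):
        # One logger call per line, so every line gets the same logger prefix.
        log.info("%s", line)
        lines.append(line)
    return lines
```

Because each line is emitted through its own logger call, a line that happens to start with digits still carries the logger's own timestamp prefix rather than being mistaken for one.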

@potiuk potiuk merged commit c8b7dc5 into apache:main Jun 29, 2024
@potiuk
Member

potiuk commented Jun 29, 2024

Nice one :)

romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Jul 26, 2024
@jscheffl jscheffl deleted the feature/improve-logging-in-docker-operator branch October 5, 2025 07:38