
@collinmcnulty
Contributor

The local task job heartbeat currently crashes the task if the heartbeat itself fails. This defeats the purpose of the heartbeat, which is to let a task survive a temporary loss of its DB connection as long as the connection is re-established within the zombie timeout.

I've used an extremely broad exception here. I'm open to something narrower if there is a more specific error type that still covers the problems outside the control of Airflow code, but I think a very broad exception is reasonable.
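
To illustrate the pattern described above, here is a minimal sketch. It is not the actual Airflow change: `FlakyDatabase`, `safe_heartbeat`, and `run_task_with_heartbeats` are hypothetical names invented for this example, and the zombie timeout itself is not modeled. It only shows a heartbeat whose failure is logged as a warning instead of propagating into the task.

```python
import logging

log = logging.getLogger(__name__)


class FlakyDatabase:
    """Hypothetical stand-in for the metadata DB; fails the first `failures` calls."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.heartbeats_recorded = 0

    def record_heartbeat(self) -> None:
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("database temporarily unreachable")
        self.heartbeats_recorded += 1


def safe_heartbeat(db: FlakyDatabase) -> bool:
    """Attempt one heartbeat and never let a failure reach the task.

    Returns True if the heartbeat was recorded, False if it failed.
    """
    try:
        db.record_heartbeat()
    except Exception:
        # Deliberately broad (an assumption mirroring the PR description):
        # any heartbeat failure is treated as temporary and only logged.
        # The message uses lazy %-style arguments rather than an f-string.
        log.warning(
            "Heartbeat failed, will retry on the next interval (%d recorded so far)",
            db.heartbeats_recorded,
            exc_info=True,
        )
        return False
    return True


def run_task_with_heartbeats(db: FlakyDatabase, iterations: int) -> int:
    """Simulate a task that heartbeats `iterations` times; return the success count."""
    return sum(safe_heartbeat(db) for _ in range(iterations))
```

In this sketch a task whose first two heartbeats fail still runs to completion and records the later heartbeats once the connection is back. Whether the task is eventually treated as a zombie would be decided elsewhere, based on how long the heartbeats have been missing.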



@boring-cyborg boring-cyborg bot added the area:Scheduler including HA (high availability) scheduler label Aug 23, 2024
@collinmcnulty collinmcnulty requested a review from RNHTTR August 25, 2024 00:48
@collinmcnulty
Contributor Author

Sorry, Jed and Jarek, for the incorrect notification. I messed up the rebase and included extra changes. Now fixed.

@RNHTTR RNHTTR merged commit 6647610 into apache:main Aug 26, 2024
jedcunningham pushed a commit to astronomer/airflow that referenced this pull request Aug 27, 2024
* Never fail an ltj over a heartbeat

* Log a warning on failed heartbeat

* Avoid using f-string in log

* Remove unnecessary pass statement

(cherry picked from commit 6647610)
@jedcunningham
Member

Cherry-pick pr: #41810

potiuk pushed a commit that referenced this pull request Aug 28, 2024
(cherry picked from commit 6647610)

Co-authored-by: Collin McNulty <collin.mcnulty@gmail.com>

@tthompson691 tthompson691 left a comment

This is a long overdue improvement!

@utkarsharma2 utkarsharma2 added this to the Airflow 2.10.1 milestone Aug 30, 2024
@utkarsharma2 utkarsharma2 added the type:bug-fix Changelog: Bug Fixes label Aug 30, 2024
utkarsharma2 pushed a commit that referenced this pull request Sep 2, 2024
(cherry picked from commit 6647610)

Co-authored-by: Collin McNulty <collin.mcnulty@gmail.com>