Conversation

@Bowrna
Contributor

@Bowrna Bowrna commented Sep 18, 2023

related: #31810
closed: #31810

This covers point C of the issue.


@boring-cyborg boring-cyborg bot added the area:Scheduler including HA (high availability) scheduler label Sep 18, 2023
@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch 4 times, most recently from df94736 to 55faf46 Compare September 19, 2023 07:31
@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch from 5c097d4 to a33cf10 Compare September 21, 2023 17:09
@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch from a33cf10 to 1730227 Compare October 1, 2023 04:45
@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch from d051cb4 to 9d82803 Compare October 13, 2023 12:35
@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch 8 times, most recently from c21c011 to 54e6c96 Compare November 1, 2023 07:10
@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch from 54e6c96 to f77fa86 Compare November 1, 2023 09:11
@Bowrna Bowrna requested a review from bolkedebruin as a code owner November 1, 2023 09:11
@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch 5 times, most recently from 0328759 to 139f4aa Compare November 2, 2023 04:34
@eladkal eladkal requested a review from ephraimbuddy November 2, 2023 06:16
@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch 2 times, most recently from 984c74c to dfaaabe Compare December 3, 2023 15:27
@potiuk
Member

potiuk commented Mar 9, 2024

This looks good @Bowrna (except for the failing static check), but I have one small proposal. It took me a bit of time to see why grace_multiplier is not passed to the method in heartbeat, and I found out that even though it is specified in the is_alive method, it's never set to anything different than 2.1. And I think it's placed wrongly: it should not be a parameter of is_alive, but should instead be added as a Job field in Job's constructor (similarly to heartrate, for example).

self.grace_multiplier = 2.1

And then it should be taken from the job whenever needed (similarly to how we take job.job_type).

It would be great to change it here, because we are touching the same code anyhow, but if you think it's too much, I am fine to approve it even without it (and you could do it as a follow-up, for example).
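
A minimal sketch of the proposal above, for illustration only. The `Job` class here is a simplified, hypothetical stand-in and not Airflow's actual model: the constructor signature, the default `heartrate` of 5 seconds, and the `now` parameter are assumptions made so the example is self-contained, and the real liveness check may consider more than the heartbeat age (job state, for instance). The point is only that `grace_multiplier` lives on the job next to `heartrate` and is read from `self` rather than passed to `is_alive`.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


class Job:
    """Simplified, hypothetical stand-in for a job model (not Airflow's real class)."""

    def __init__(
        self,
        job_type: str,
        heartrate: float = 5.0,  # assumed default, in seconds
        grace_multiplier: float = 2.1,  # default value mentioned in the review
    ) -> None:
        self.job_type = job_type
        self.heartrate = heartrate
        # Stored as a field in the constructor, the same way heartrate is.
        self.grace_multiplier = grace_multiplier
        self.latest_heartbeat = datetime.now(timezone.utc)

    def is_alive(self, now: datetime | None = None) -> bool:
        """Return True if the last heartbeat is within the grace period.

        grace_multiplier is read from the job, not taken as a parameter.
        `now` exists only to make the sketch deterministic in tests.
        """
        now = now or datetime.now(timezone.utc)
        grace = timedelta(seconds=self.heartrate * self.grace_multiplier)
        return now - self.latest_heartbeat < grace
```

With this shape, a caller that needs a different tolerance would set it once when constructing the job (for example `Job("SchedulerJob", grace_multiplier=3.0)`), and every liveness check would pick it up from the job itself.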

@Bowrna
Contributor Author

Bowrna commented Mar 13, 2024

@potiuk I will add the changes that you have suggested.

@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch 2 times, most recently from 3176485 to f28a5b7 Compare March 20, 2024 06:27
@Bowrna
Contributor Author

Bowrna commented Mar 20, 2024

@potiuk I made the changes regarding grace_multiplier. If I remember right, grace_multiplier could be set as a param value on the config side, and I don't see that part of the code currently. Am I missing something here?

Member

@potiuk potiuk left a comment

LGTM. But I would love for another committer to take a look: it's quite a crucial part of our processing, and we have had some issues there that were overlooked in the past, for example #37992.

@potiuk potiuk requested a review from jedcunningham March 20, 2024 06:45
@potiuk
Member

potiuk commented Mar 20, 2024

> @potiuk I made the changes regarding grace_multiplier. If I remember right, grace_multiplier could be set as a param value on the config side, and I don't see that part of the code currently. Am I missing something here?

Yes. grace_multiplier is not set anywhere to a value other than the default 2.1 now.

@potiuk
Member

potiuk commented Mar 20, 2024

(or so I saw from reviewing the code myself). It's still good to keep it, though, in case we ever want to override it.

@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch from f28a5b7 to eef776f Compare March 20, 2024 09:19
@Bowrna Bowrna closed this Mar 20, 2024
@Bowrna Bowrna reopened this Mar 20, 2024
@potiuk
Member

potiuk commented Mar 20, 2024

Looks like real issues.

@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch 3 times, most recently from 2e4b788 to fd2e4e1 Compare March 20, 2024 14:54
@Bowrna
Contributor Author

Bowrna commented Mar 21, 2024

A test is failing in the health API call test cases. Let me debug the issue and fix it by EOD.

@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch from fd2e4e1 to ddc7d44 Compare March 21, 2024 08:43
@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch 2 times, most recently from d0b3f94 to 40c4b33 Compare April 4, 2024 06:26
@Bowrna Bowrna force-pushed the heartbeat-message-enhancement branch from 40c4b33 to dda9b36 Compare April 4, 2024 06:43
@Bowrna
Contributor Author

Bowrna commented Apr 4, 2024

All green now :)

@potiuk potiuk merged commit 6c866f4 into apache:main Apr 5, 2024
@potiuk
Member

potiuk commented Apr 5, 2024

🎉

@ephraimbuddy ephraimbuddy added the type:improvement Changelog: Improvements label Jun 3, 2024
Labels

area:Scheduler including HA (high availability) scheduler
type:improvement Changelog: Improvements

Development

Successfully merging this pull request may close these issues.

Improve log messages of heartbeat connection errors and recovery

5 participants