@mobuchowski
Contributor

@mobuchowski mobuchowski commented May 28, 2024

Currently, LocalTaskJobRunner kills worker processes that are still running after their state has been set to SUCCESS. This can happen in a few ways: most commonly, either the state is set externally, or relatively slow listeners or other processes (such as the mini scheduler) are still running.

This PR introduces a mechanism that gives the task some additional time to finish its work (20 seconds by default, configurable), after which the process is terminated.
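As a rough illustration of the idea (not the actual code in this PR), the overtime check can be sketched as a small tracker that a heartbeat callback would consult. The names `OvertimeTracker`, `should_terminate`, and `DEFAULT_SUCCESS_OVERTIME` are hypothetical and chosen only for this sketch; the 20-second default is the one mentioned above.

```python
from dataclasses import dataclass
from typing import Optional

SUCCESS = "success"
# Hypothetical constant; 20 seconds is the default stated in the description.
DEFAULT_SUCCESS_OVERTIME = 20.0


@dataclass
class OvertimeTracker:
    """Tracks how long a process keeps running after its task reached SUCCESS."""

    overtime_threshold: float = DEFAULT_SUCCESS_OVERTIME
    success_seen_at: Optional[float] = None

    def should_terminate(self, state: str, process_running: bool, now: float) -> bool:
        """Return True only once the grace period after SUCCESS has been exceeded."""
        if state != SUCCESS or not process_running:
            # Nothing to wait for: reset the clock and never ask for termination.
            self.success_seen_at = None
            return False
        if self.success_seen_at is None:
            # First heartbeat that observes SUCCESS while the process still runs.
            self.success_seen_at = now
        return (now - self.success_seen_at) > self.overtime_threshold
```

A heartbeat loop would call `should_terminate(state, process_running, time.monotonic())` on each tick and terminate the process only when it returns `True`, so that listeners or other auxiliary work get the configured grace period.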

@boring-cyborg boring-cyborg bot added area:providers area:Scheduler including HA (high availability) scheduler provider:openlineage AIP-53 labels May 28, 2024
@mobuchowski mobuchowski force-pushed the listener-task-timeout branch from 460fadf to 32e46dd Compare May 30, 2024 12:57
@mobuchowski mobuchowski marked this pull request as ready for review May 30, 2024 13:01
@mobuchowski mobuchowski requested a review from potiuk May 30, 2024 13:01
@mobuchowski mobuchowski force-pushed the listener-task-timeout branch 2 times, most recently from 430c4fa to e2acd51 Compare June 4, 2024 16:52
@mobuchowski mobuchowski changed the title from "local task job: add timeout, to not kill on_task_instance_success listener by heartbeat_callback" to "local task job: add overtime mechanism, to not kill task prematurely when auxiliary processes are running" Jun 4, 2024
@mobuchowski mobuchowski force-pushed the listener-task-timeout branch from e2acd51 to ec1b194 Compare June 5, 2024 21:25
@mobuchowski mobuchowski force-pushed the listener-task-timeout branch 2 times, most recently from f0128ce to 3d4661d Compare June 10, 2024 14:40
@potiuk
Member

potiuk commented Jun 11, 2024

Nice and simple. I think we should have another committer to take a look as well.

Contributor

@jscheffl jscheffl left a comment

Code-wise looks good.

…tener prematurely

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
@mobuchowski mobuchowski force-pushed the listener-task-timeout branch from 3d4661d to 7b0404b Compare June 12, 2024 14:53
@mobuchowski mobuchowski merged commit fa65a20 into main Jun 13, 2024
@eladkal eladkal deleted the listener-task-timeout branch June 29, 2024 17:32
@ephraimbuddy ephraimbuddy added this to the Airflow 2.10.0 milestone Jul 1, 2024
@ephraimbuddy ephraimbuddy added the type:improvement Changelog: Improvements label Jul 1, 2024
@ephraimbuddy ephraimbuddy added type:bug-fix Changelog: Bug Fixes and removed type:improvement Changelog: Improvements labels Jul 1, 2024
romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Jul 26, 2024
…tener prematurely (apache#39890)

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
kaxil added a commit to astronomer/airflow that referenced this pull request Dec 3, 2024
This PR ports the overtime feature on `LocalTaskJob` (added in apache#39890) to the Supervisor.
It allows the Task process to be terminated when it exceeds the configured success overtime threshold, which is useful when we add a Listener to the Task process.

closes apache#44356

Also added `TaskState` to update state and send end_date from task process to the supervisor.
kaxil added a commit that referenced this pull request Dec 3, 2024
This PR ports the overtime feature on `LocalTaskJob` (added in #39890) to the Supervisor.
It allows the Task process to be terminated when it exceeds the configured success overtime threshold, which is useful when we add a Listener to the Task process.

closes #44356

Also added `TaskState` to update state and send end_date from task process to the supervisor.
got686-yandex pushed a commit to got686-yandex/airflow that referenced this pull request Jan 30, 2025
This PR ports the overtime feature on `LocalTaskJob` (added in apache#39890) to the Supervisor.
It allows the Task process to be terminated when it exceeds the configured success overtime threshold, which is useful when we add a Listener to the Task process.

closes apache#44356

Also added `TaskState` to update state and send end_date from task process to the supervisor.
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request May 27, 2025
This PR ports the overtime feature on `LocalTaskJob` (added in apache/airflow#39890) to the Supervisor.
It allows the Task process to be terminated when it exceeds the configured success overtime threshold, which is useful when we add a Listener to the Task process.

closes apache/airflow#44356

Also added `TaskState` to update state and send end_date from task process to the supervisor.

GitOrigin-RevId: d059d4a84b526e5f975d17c6846f5c8e29adbc3a
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Sep 23, 2025
This PR ports the overtime feature on `LocalTaskJob` (added in apache/airflow#39890) to the Supervisor.
It allows the Task process to be terminated when it exceeds the configured success overtime threshold, which is useful when we add a Listener to the Task process.

closes apache/airflow#44356

Also added `TaskState` to update state and send end_date from task process to the supervisor.

GitOrigin-RevId: d059d4a84b526e5f975d17c6846f5c8e29adbc3a
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Oct 21, 2025
This PR ports the overtime feature on `LocalTaskJob` (added in apache/airflow#39890) to the Supervisor.
It allows the Task process to be terminated when it exceeds the configured success overtime threshold, which is useful when we add a Listener to the Task process.

closes apache/airflow#44356

Also added `TaskState` to update state and send end_date from task process to the supervisor.

GitOrigin-RevId: d059d4a84b526e5f975d17c6846f5c8e29adbc3a
Labels

area:providers area:Scheduler including HA (high availability) scheduler provider:openlineage AIP-53 type:bug-fix Changelog: Bug Fixes

6 participants