@ephraimbuddy
Contributor

This method has been replaced by handle_removed_files and deactivate_deleted_dags. Its presence now causes issues because it deactivates DAGs incorrectly. handle_removed_files is better suited to dag bundles, as it also terminates the file's processor.

Also removed an unused config variable from the scheduler and config.yml.

Closes: #47294

@boring-cyborg boring-cyborg bot added area:CLI area:DAG-processing area:Scheduler including HA (high availability) scheduler labels Mar 18, 2025
This method has been replaced by the `handle_removed_files` and `deactivate_deleted_dags`
methods. Its presence now causes issues because it deactivates DAGs incorrectly. `handle_removed_files`
is better suited to dag bundles, as it also terminates the file's processor.

Also removed an unused config variable from the scheduler and config.yml.
Closes: apache#47294
@ephraimbuddy ephraimbuddy force-pushed the remove-stale-dag-check-processor branch from e502479 to 653d927 Compare March 18, 2025 23:22
@ephraimbuddy ephraimbuddy marked this pull request as draft March 18, 2025 23:43
@ephraimbuddy
Contributor Author

Marked this as draft because I found that when the dag file still exists but no longer contains the dag, the dag in the DB doesn't get deactivated. Working on that.
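
To illustrate the case described above, here is a minimal sketch of the check that is missing. It is not Airflow's implementation: `dags_to_deactivate` and both mappings (file path to DAG IDs) are hypothetical names chosen for the example. The idea is that for every file that was parsed, any DAG ID recorded in the DB for that file but absent from the parse result is stale.

```python
def dags_to_deactivate(
    active_in_db: dict[str, set[str]],
    parsed: dict[str, set[str]],
) -> set[str]:
    """Return DAG IDs whose file still exists but no longer defines them.

    Both arguments are hypothetical mappings of file path -> DAG IDs:
    `active_in_db` is what the DB currently marks as active, `parsed` is
    what the latest parse of each existing file produced.
    """
    stale: set[str] = set()
    for file_path, db_dag_ids in active_in_db.items():
        # Files missing from `parsed` were removed entirely; that case is
        # assumed to be handled separately (e.g. by handle_removed_files).
        if file_path in parsed:
            stale |= db_dag_ids - parsed[file_path]
    return stale
```

For example, if the DB lists `d1` and `d2` for `a.py` but the latest parse of `a.py` only yields `d1`, then `d2` is returned for deactivation.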

ephraimbuddy added a commit to astronomer/airflow that referenced this pull request Mar 20, 2025
We currently use psutil's create_time for the process start time,
which gives a different time than the one we use to check process
duration (which comes from time.monotonic). This leads to the
processor timing out after a while due to the large (false) difference
in recorded time, especially when the laptop is hibernated.

Process time should not depend on the system clock, and I guess that's what happened here.

Closes: apache#47937, Closes: apache#47294
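
The clock mismatch described in this commit message can be sketched as follows. This is an illustration only, not the actual Airflow change: `ProcessTimer` and `mixed_clock_elapsed` are hypothetical names. psutil's `create_time` returns seconds since the epoch, while `time.monotonic` counts from an unspecified reference point, so subtracting one from the other does not yield a duration. Reading the start and the current time from the same clock avoids that.

```python
import time
from typing import Callable


class ProcessTimer:
    """Tracks elapsed time for a (hypothetical) file processor.

    The start and every later reading come from the same clock, so the
    difference between them is always a meaningful duration.
    """

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self.start = clock()

    def elapsed(self) -> float:
        return self._clock() - self.start

    def timed_out(self, timeout: float) -> bool:
        return self.elapsed() > timeout


def mixed_clock_elapsed(wall_clock_start: float, monotonic_now: float) -> float:
    """The problematic pattern: an epoch timestamp subtracted from a
    monotonic reading. The two clocks have unrelated origins, so the
    result is not a duration."""
    return monotonic_now - wall_clock_start
```

The clock is injectable so the behaviour can be checked with fake readings instead of real sleeps.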
ephraimbuddy added a commit to astronomer/airflow that referenced this pull request Mar 20, 2025
We currently use psutil's create_time for the process start time,
which gives the time the process started according to the system clock. We can use
time.time to track when the process started processing the files instead.

In breeze, once the laptop hibernates, you would have to restart the dag
processor, but this fixes it. Since this does not happen in other deployments,
we suspect that this issue is peculiar to breeze.

Closes: apache#47937, Closes: apache#47294
ephraimbuddy added a commit to astronomer/airflow that referenced this pull request Mar 21, 2025
@ephraimbuddy
Contributor Author

Closing in favour of #48004

ephraimbuddy added a commit to astronomer/airflow that referenced this pull request Mar 22, 2025
Development

Successfully merging this pull request may close these issues.

DAGs disappearing after some idle time

1 participant