@potiuk
Member

@potiuk potiuk commented Apr 27, 2023

As of #30278, the standalone file processor accidentally introduced an artificial delay between DAG processing loops: it added a heartbeat call but did not set the "only_if_necessary" flag to True.

If your DAG file processing was fast (faster than the scheduler's job_heartbeat_sec), this introduced an unnecessary pause before the next DAG file processor loop (until job_heartbeat_sec had elapsed). It also inflated the dag_processing_last_duration metrics, which would always show at least job_heartbeat_sec.

Setting the "only_if_necessary" flag to True fixes the problem.
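
To illustrate the effect described above, here is a minimal, self-contained sketch. It is a toy model, not Airflow's actual heartbeat code: the `ToyHeartbeat` class, the `run_loops` helper, the fake clock, and the 1 s / 5 s timings are all assumptions made for illustration. The only thing taken from this PR is the idea that a heartbeat call without "only_if_necessary" waits out the rest of the heartbeat interval, while one with the flag returns immediately when a heartbeat is not yet due.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyHeartbeat:
    """Hypothetical stand-in for a job heartbeat, driven by a fake clock."""

    heartbeat_sec: float
    now: float = 0.0  # fake clock, in seconds
    last_beat: Optional[float] = None
    slept: List[float] = field(default_factory=list)

    def heartbeat(self, only_if_necessary: bool = False) -> None:
        remaining = 0.0
        if self.last_beat is not None:
            remaining = self.heartbeat_sec - (self.now - self.last_beat)
        if remaining > 0 and only_if_necessary:
            return  # heartbeat not due yet: skip it without sleeping
        sleep_for = max(0.0, remaining)
        self.slept.append(sleep_for)
        self.now += sleep_for  # simulate sleeping until the heartbeat is due
        self.last_beat = self.now


def run_loops(only_if_necessary: bool, loops: int = 3,
              work_sec: float = 1.0, heartbeat_sec: float = 5.0) -> List[float]:
    """Return the simulated duration of each processing loop."""
    hb = ToyHeartbeat(heartbeat_sec=heartbeat_sec)
    durations = []
    for _ in range(loops):
        start = hb.now
        hb.now += work_sec  # simulate fast DAG file processing
        hb.heartbeat(only_if_necessary=only_if_necessary)
        durations.append(hb.now - start)
    return durations


print(run_loops(only_if_necessary=False))  # [1.0, 5.0, 5.0]
print(run_loops(only_if_necessary=True))   # [1.0, 1.0, 1.0]
```

In this toy model, once a first heartbeat has been recorded, every loop without the flag is stretched to the full heartbeat interval, which mirrors the inflated durations reported in the linked issues. With the flag, each loop takes only as long as the simulated work.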

Fixes: #30593
Fixes: #30884


@potiuk potiuk requested review from XD-DENG, ashb and kaxil as code owners April 27, 2023 10:00
@boring-cyborg boring-cyborg bot added the area:Scheduler including HA (high availability) scheduler label Apr 27, 2023
@potiuk potiuk force-pushed the fix-heartbeat-only-if-necessary-for-dag-file-processor branch from c281f5a to 759c8e3 Compare April 27, 2023 10:14
@potiuk potiuk mentioned this pull request Apr 27, 2023
2 tasks
@potiuk potiuk force-pushed the fix-heartbeat-only-if-necessary-for-dag-file-processor branch from 759c8e3 to c865052 Compare April 27, 2023 10:34
@potiuk potiuk merged commit 00ab45f into apache:main Apr 27, 2023
@potiuk potiuk deleted the fix-heartbeat-only-if-necessary-for-dag-file-processor branch April 27, 2023 11:27
@potiuk potiuk added this to the Airflow 2.6.0 milestone Apr 27, 2023
ephraimbuddy pushed a commit that referenced this pull request Apr 27, 2023
(cherry picked from commit 00ab45f)
@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Apr 28, 2023
Labels

area:Scheduler including HA (high availability) scheduler type:bug-fix Changelog: Bug Fixes

Development

Successfully merging this pull request may close these issues.

DagProcessor Performance Regression
After upgrading to 2.5.3, Dag Processing time increased dramatically.

3 participants