@BasPH
Contributor

@BasPH BasPH commented Mar 15, 2024

This PR adds a config to avoid immediately running a DAG run when first unpausing a DAG with catchup=False.

This is a common issue, e.g. during migrations, where people expect that unpausing a DAG with catchup=False will wait for the next interval to complete and only then run the first interval. However, Airflow currently executes the first DAG run immediately.

Since this would be a breaking change, I suggest fixing this behaviour in Airflow 3.0.0. To support it temporarily in Airflow 2.*, I suggest introducing a config AIRFLOW__SCHEDULER__CATCHUP_FALSE_NO_DAGRUN_AIRFLOW_2_FIX, which defaults to False but can be set to True to enable this fix.
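As a rough illustration of how the proposed variable would map onto Airflow's `AIRFLOW__{SECTION}__{KEY}` environment variable convention, here is a minimal sketch. It uses only the standard library and does not call Airflow itself; the helper names `split_airflow_env_var` and `read_flag` are hypothetical, and the boolean parsing is a simplification of what Airflow's own config loader does.

```python
import os

# Variable name as proposed in this PR (the option was never merged).
ENV_VAR = "AIRFLOW__SCHEDULER__CATCHUP_FALSE_NO_DAGRUN_AIRFLOW_2_FIX"


def split_airflow_env_var(name):
    """Split AIRFLOW__{SECTION}__{KEY} into (section, key), lowercased."""
    prefix, section, key = name.split("__", 2)
    if prefix != "AIRFLOW":
        raise ValueError(f"not an Airflow config variable: {name}")
    return section.lower(), key.lower()


def read_flag(environ=os.environ, default=False):
    """Read the proposed flag; unset means the default (False)."""
    raw = environ.get(ENV_VAR)
    if raw is None:
        return default
    # Simplified parsing: anything other than these values counts as False.
    return raw.strip().lower() in {"true", "t", "1"}


if __name__ == "__main__":
    print(split_airflow_env_var(ENV_VAR))
    print(read_flag({ENV_VAR: "True"}))
```

Under this convention the option would live in the `[scheduler]` section with the key `catchup_false_no_dagrun_airflow_2_fix`.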

Creating a draft for feedback. Remaining TODOs: add tests and docs, and support all timetables (currently only cron-based timetables are supported).



@BasPH BasPH changed the title from "Add config to avoid one DAG run when unpausing a DAG" to "Add config to avoid one DAG run when unpausing a DAG with catchup=False" on Mar 15, 2024
@eladkal
Contributor

eladkal commented Mar 16, 2024

I think we need a holistic approach for this use case as well as #35392.
I think we should make catchup a bit smarter than just a boolean flag.

@BasPH
Contributor Author

BasPH commented Mar 17, 2024

@eladkal could you please elaborate on "holistic approach"?

I see this PR as a temporary solution until Airflow 3. What I expect catchup to do when unpausing a DAG:

  • catchup=True executes all historical DAG runs that weren't executed while the DAG was paused
  • catchup=False waits for the first interval to complete when unpaused, e.g. waits till 00:00 with a "@daily" schedule

Currently, unpausing a DAG with a past start_date and catchup=False immediately executes one DAG run. This is very counterintuitive and causes a lot of trouble, especially during migrations, where sometimes hundreds or even thousands of DAGs have to be unpaused. I think having yet another parameter, as suggested in #35392, is unnecessary.
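To make the difference concrete, here is a minimal sketch for a "@daily" schedule using plain `datetime` arithmetic. It does not use Airflow's timetable API; the function names are hypothetical, and the choice of data interval under the expected behaviour is my reading of the description above, not something the PR confirms.

```python
from datetime import datetime, timedelta, timezone


def daily_interval_containing(moment):
    """Return the midnight-to-midnight interval that contains `moment`."""
    start = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    return start, start + timedelta(days=1)


def first_run_current(unpaused_at):
    """Current behaviour: the latest completed interval runs right away."""
    current_start, _ = daily_interval_containing(unpaused_at)
    data_interval = (current_start - timedelta(days=1), current_start)
    return data_interval, unpaused_at  # (data interval, earliest run time)


def first_run_expected(unpaused_at):
    """Expected behaviour: wait until the ongoing interval has completed."""
    data_interval = daily_interval_containing(unpaused_at)
    return data_interval, data_interval[1]  # runs at the next midnight


if __name__ == "__main__":
    unpaused_at = datetime(2024, 3, 15, 10, 30, tzinfo=timezone.utc)
    print("current: ", first_run_current(unpaused_at))
    print("expected:", first_run_expected(unpaused_at))
```

For a DAG unpaused at 10:30 on Mar 15, the first function schedules a run immediately for the Mar 14 interval, while the second waits until 00:00 on Mar 16.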

However, since changing this behaviour would be a breaking change, it can be changed in Airflow 3.0 at the earliest. As a bridge for future Airflow 2.* versions, this PR adds a config that enables the new behaviour and is deprecated from the moment it is introduced.

@eladkal
Contributor

eladkal commented Apr 16, 2024

This PR adds a setting, while #35392 changes the values accepted by the catchup parameter.
Each of them is a standalone patch.

My point is that this is the heart of Airflow.
We should take a holistic approach and consider this PR together with #35392.
Preferably, open a mailing list discussion and agree on the interface we want to have.

@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label (Stale PRs per the .github/workflows/stale.yml policy file) on Jul 27, 2024
@github-actions github-actions bot closed this Aug 2, 2024