@TJaniF TJaniF commented Mar 4, 2025

"Oh I need to unpause the DAG to run it, gotcha"
click
Many DAG runs spawn and start running
"Nonono I only wanted one, stop stop"

⬆️ This was me almost 3 years ago, using Airflow for the first time, obviously not understanding scheduling yet, and thinking of catchup mainly as a condiment.

Now I know how start_date works, but I still often set it just a while in the past because time math can be hard. Setting catchup=False has become a default and I don't think I've written more than one or two DAGs without it, ever.
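To make the "many DAG runs spawn" moment concrete, here is a rough sketch of the arithmetic. It is a simplified model and not Airflow's scheduler code: the helper names `missed_intervals` and `runs_on_unpause` are made up for illustration, and it assumes a fixed daily interval where a run becomes schedulable only once its interval has fully elapsed.

```python
from datetime import datetime, timedelta


def missed_intervals(start_date, now, interval):
    """Return the start of every fully elapsed interval between start_date and now."""
    runs = []
    cursor = start_date
    # An interval only counts once it has completely passed.
    while cursor + interval <= now:
        runs.append(cursor)
        cursor += interval
    return runs


def runs_on_unpause(start_date, now, interval, catchup):
    """Hypothetical model: catchup=True backfills every missed interval,
    catchup=False keeps only the most recent one."""
    intervals = missed_intervals(start_date, now, interval)
    return intervals if catchup else intervals[-1:]


start = datetime(2025, 1, 1)   # start_date set "just a while in the past"
now = datetime(2025, 3, 4)     # the moment the DAG is unpaused
daily = timedelta(days=1)

print(len(runs_on_unpause(start, now, daily, catchup=True)))   # 62
print(len(runs_on_unpause(start, now, daily, catchup=False)))  # 1
```

Under these assumptions, a start_date two months back turns one click into 62 runs when catchup is on, versus a single run when it is off.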

Talking to other Airflow users, my experience seems to be a common one, especially among people just starting out with Airflow: accidental catchup as a rite of passage.

Preparing this PR, I also think the vast majority of example DAGs have catchup=False. It's the de facto default, so let's make it the default? :)

Side note: The tests appear to use explicit setting of catchup when testing it, so no adjustment needed there as far as I can tell.

EDIT: got breeze to run and it isn't working yet... 👀
EDIT2: got it to work but I touched dags.py 😅

Discussion: https://lists.apache.org/thread/omh35gs8r5rmcxhs6onygzfhh9znkodm
Lazy Consensus: https://lists.apache.org/thread/mdpsyrodctv2ky2pvsl605y85h5mo8hv



@TJaniF TJaniF marked this pull request as ready for review March 4, 2025 16:45
@TJaniF TJaniF requested review from XD-DENG, ashb and potiuk as code owners March 4, 2025 16:45

@eladkal eladkal left a comment

If we do this we need a newsfragment.

I would also prefer to have a discussion/lazy consensus on the mailing list. Previously we had some related suggestions on the catchup issue in general:
#35392 (comment)
#38168

@cmarteepants cmarteepants requested a review from uranusjr March 4, 2025 20:58
TJaniF and others added 2 commits March 6, 2025 03:52
Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
uranusjr commented Mar 6, 2025

Looks like a lot of tests are not happy because they rely on the first run being created against start_date. Let’s just add catchup_by_default = true in airflow/config_templates/unit_tests.cfg so tests still set catchup by default.
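A minimal sketch of what that override amounts to, using only the standard library rather than Airflow's own config machinery. It assumes the option sits in the `[scheduler]` section, as it does in airflow.cfg; the helper name `effective_catchup` and the precedence logic are hypothetical and only illustrate the idea that an explicit `catchup` on a DAG wins over the configured default.

```python
from configparser import ConfigParser

# Assumed fragment for the unit test config; the [scheduler] section is an assumption.
UNIT_TEST_CFG = """
[scheduler]
catchup_by_default = true
"""


def effective_catchup(cfg_text, explicit=None):
    """Hypothetical helper: an explicit catchup value wins, else use the config default."""
    if explicit is not None:
        return explicit
    parser = ConfigParser()
    parser.read_string(cfg_text)
    # fallback=False mirrors the new default proposed in this PR.
    return parser.getboolean("scheduler", "catchup_by_default", fallback=False)


print(effective_catchup(UNIT_TEST_CFG))                  # True
print(effective_catchup(UNIT_TEST_CFG, explicit=False))  # False
print(effective_catchup(""))                             # False
```

With the override in place, tests that never pass `catchup` keep getting a first run created against start_date, while tests that set it explicitly are unaffected.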

@uranusjr uranusjr requested a review from sunank200 March 13, 2025 23:58

@uranusjr uranusjr left a comment

Not enough knowledge on the lint command and the config default, otherwise lgtm.

@uranusjr

Also need to fix static checks. Kubernetes test failures are unrelated I believe.

TJaniF commented Mar 14, 2025

> Not enough knowledge on the lint command and the config default, otherwise lgtm.

Thank you @uranusjr! The static check was an extra empty line, sorry, fixed that now :)
And yes, I don't think the K8s checks are failing because of something I did... 🤔

EDIT: it is 💚 ! :D

@kaxil kaxil merged commit 9450832 into apache:main Mar 14, 2025
89 checks passed

Labels

airflow3.0:breaking (Candidates for Airflow 3.0 that contain breaking changes), kind:documentation


5 participants