Set catchup_by_default config to False by default
#47354
eladkal
left a comment
If we do this we need a newsfragment.
I would also prefer to have a discussion/lazy consensus on the mailing list. Previously we had some related suggestions on the catchup issue in general:
#35392 (comment)
#38168
Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
Looks like a lot of tests are not happy because they rely on the first run being created against start_date. Let’s just add
Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
…he current default
uranusjr
left a comment
Not enough knowledge on the lint command and the config default, otherwise lgtm.
Also need to fix static checks. Kubernetes test failures are unrelated I believe.
Thank you @uranusjr! The static check was an extra empty line, sorry, fixed that now :) EDIT: it is 💚 ! :D
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
"Oh I need to unpause the DAG to run it, gotcha"
click
Many DAG runs spawn and start running
"Nonono I only wanted one, stop stop"
⬆️ This was me almost 3 years ago, using Airflow for the first time, obviously not understanding scheduling yet, and thinking of catchup mainly as a condiment.
Now I know how start_date works, but I still often set it just a while in the past because time math can be hard. Setting catchup=False has become a default and I don't think I've written more than one or two DAGs without it, ever.
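To make the difference concrete, here is a toy model of what happens when a DAG with a past start_date is unpaused. It is only a sketch, not Airflow's scheduler code: the function name `runs_on_unpause`, the daily interval, and the dates are made up for illustration, and it assumes the documented behaviour that without catchup only the latest completed interval gets a run.

```python
from datetime import datetime, timedelta


def runs_on_unpause(start_date, now, interval, catchup):
    """Return the interval start of each run a simplified scheduler would create.

    Hypothetical helper for illustration only; real Airflow scheduling also
    depends on timetables, end_date, max_active_runs, and more.
    """
    # Collect the start of every interval that has fully elapsed by `now`.
    starts = []
    cursor = start_date
    while cursor + interval <= now:
        starts.append(cursor)
        cursor += interval

    if catchup:
        # Backfill: one run per missed interval since start_date.
        return starts
    # No catchup: only the most recent completed interval (if any).
    return starts[-1:]


start = datetime(2025, 1, 1)       # start_date set "a while in the past"
now = datetime(2025, 1, 11, 12, 0)  # moment the DAG is unpaused
daily = timedelta(days=1)

print(len(runs_on_unpause(start, now, daily, catchup=True)))   # 10
print(len(runs_on_unpause(start, now, daily, catchup=False)))  # 1
```

With a start_date ten days back, the catchup case produces ten runs at once, which is the "many DAG runs spawn" surprise described above; the no-catchup case produces a single run.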
Talking to other Airflow users, my experience seems to be the common one, especially when first starting with Airflow: accidental catchup as a rite of passage.
While preparing this PR I also found that the vast majority of example DAGs seem to have catchup=False. It's the de facto default, so let's make it the default? :)
Side note: The tests appear to use explicit setting of catchup when testing it, so no adjustment needed there as far as I can tell.
EDIT: got breeze to run and it isn't working yet... 👀
EDIT2: got it to work but I touched dags.py 😅
Discussion: https://lists.apache.org/thread/omh35gs8r5rmcxhs6onygzfhh9znkodm
Lazy Consensus: https://lists.apache.org/thread/mdpsyrodctv2ky2pvsl605y85h5mo8hv
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.