@hussein-awala
Member

@hussein-awala hussein-awala commented Nov 3, 2023

This PR changes the type of catchup to a string and adds support for the ignore_first value, which disables catchup only for the first DagRun.

Motivations:

start_date is a mandatory parameter, and catchup=True is very useful for keeping Airflow from skipping DAG runs when its services are under pressure, when there is downtime, or when a DAG run takes longer than the schedule interval and max_active_runs is set to 1.

However, when DAGs are created dynamically from a file or an API response, setting catchup to True creates every DAG run since start_date, while setting start_date to the current moment is bad practice and can lead to missing DagRuns.
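To make the trade-off above concrete, here is a toy model of the three modes. It is only a sketch under stated assumptions: `runs_to_create` is a hypothetical helper (not an Airflow API), intervals have a fixed length, a run is created once its data interval has ended, and "ignore_first" is assumed to behave like disabled catchup until a first run exists and like enabled catchup afterwards.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def runs_to_create(
    catchup: str,
    start_date: datetime,
    now: datetime,
    interval: timedelta,
    last_run_start: datetime | None = None,
) -> list[datetime]:
    """Return the data-interval starts a toy scheduler would create runs for at `now`.

    Hypothetical helper for illustration only; it is not part of Airflow.
    """
    # First interval that is not yet covered by an existing run.
    current = start_date if last_run_start is None else last_run_start + interval

    # Collect every interval that has already ended.
    pending = []
    while current + interval <= now:
        pending.append(current)
        current += interval

    if catchup == "enabled":
        # Backfill everything that was missed.
        return pending
    if catchup == "disabled":
        # Only the most recent completed interval.
        return pending[-1:]
    if catchup == "ignore_first":
        # Assumed semantics: skip the backlog only while no run exists yet.
        return pending[-1:] if last_run_start is None else pending
    raise ValueError(f"Unsupported catchup value: {catchup!r}")
```

With a daily interval, a start_date several days in the past and no previous run, "enabled" returns the whole backlog while "disabled" and "ignore_first" return a single run. Once a first run exists, "ignore_first" returns every missed interval again, which is the protection against skipped runs described above.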

The new test will only succeed with #35391, which fixes the bug in the data interval date for the dag runs.

@boring-cyborg boring-cyborg bot added area:API area:serialization area:UI area:webserver labels Nov 3, 2023
@hussein-awala hussein-awala added the type:new-feature label Nov 3, 2023
@hussein-awala hussein-awala added this to the Airflow 2.8.0 milestone Nov 3, 2023
@BasPH
Contributor

BasPH commented Nov 3, 2023

+1 for this👍

However, instead of adding yet another setting, I think this should be the default behavior of Airflow. I have yet to find the first user that finds it logical to get one DAG run when unpausing a DAG with catchup=False.

That would mean a breaking change and thus a release in Airflow 3, so I was thinking this could be offered as a stop-gap solution for Airflow >=2.8,<3 users, but with a deprecation flag added to it?

@Taragolis
Contributor

I think we are running into the boolean trap here: the logic around catchup is not binary, and resolving that means adding another boolean. We may then find ourselves needing a fourth and fifth option for catchup, e.g. "catchup only on first".

So my proposal is, if possible, to get rid of the boolean logic for catchup and replace it with an Enum or string literal, with boolean fallback logic and a deprecation warning for True / False values.

A simple enum:

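from __future__ import annotations

from enum import Enum


class Catchup(str, Enum):
    ENABLED = "enabled"
    DISABLED = "disabled"
    IGNORE_FIRST = "ignore_first"

    @classmethod
    def from_string(cls, value: str | Enum):
        # Compare raw values so plain strings and enum members are both accepted.
        raw = value.value if isinstance(value, Enum) else value
        for member in cls:
            if member.value == raw:
                return member
        msg = (
            f"Unsupported Enum value for {cls!r}. "
            f"Expected one of {', '.join(repr(m.value) for m in cls)}, but got {value!r}."
        )
        raise ValueError(msg)

    def __eq__(self, other):
        # Boolean fallback for the legacy True / False values.
        if isinstance(other, bool):
            return bool(self) is other
        if isinstance(other, str):
            # Identity check: `self == other` here would recurse forever.
            return self is Catchup.from_string(other)
        return NotImplemented

    # Defining __eq__ disables the inherited __hash__, so restore it explicitly.
    __hash__ = str.__hash__

    def __bool__(self):
        # Deprecation warning here
        return self is not Catchup.DISABLED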

However, it still might break someone's pipeline (I'm not sure), so we may need to wait until Airflow 3 to use something other than a boolean here, and for now just add a new attribute. The same applies to the existing depends_on_past and ignore_first_depends_on_past in BaseOperator.

@hussein-awala hussein-awala changed the title Add a new dag param ignore_first_catchup to disable catchup for the first DagRun Add support for catchup="ignore_first" to disable catchup for the first DagRun Nov 6, 2023
@hussein-awala
Member Author

@BasPH @Taragolis I merged the new param into catchup. Bool values are still supported, with a deprecation warning, for both user-provided and default values, but I had to update the value in all the examples to avoid the warning in CI and to encourage users to adopt the new values.
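The boolean fallback described here could look roughly like the sketch below. It is an illustration, not the code from this PR: `normalize_catchup` and `VALID_CATCHUP` are hypothetical names, and the mapping of True to "enabled" and False to "disabled" is an assumption based on the discussion above.

```python
from __future__ import annotations

import warnings

# Hypothetical constant for illustration; the string values come from this thread.
VALID_CATCHUP = ("enabled", "disabled", "ignore_first")


def normalize_catchup(value: bool | str) -> str:
    """Map legacy boolean catchup values onto the string values, with a warning."""
    if isinstance(value, bool):
        # Legacy booleans still work, but nudge users towards the string values.
        warnings.warn(
            "Boolean catchup values are deprecated; use 'enabled' or 'disabled'.",
            DeprecationWarning,
            stacklevel=2,
        )
        return "enabled" if value else "disabled"
    if value in VALID_CATCHUP:
        return value
    raise ValueError(
        f"Unsupported catchup value {value!r}; expected one of {VALID_CATCHUP}."
    )
```

Normalizing once at the boundary means the rest of the code only ever sees one of the three strings, and a CI run with warnings treated as errors would flag any example still passing a boolean.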

   $ref: '#/components/schemas/Timezone'
 catchup:
-  type: boolean
+  type: string
Member Author


@pierrejeambrun, is this considered a breaking change?

Member


I feel we should still keep True and False as accepted values.

Member Author


DAGDetail is used as the response type for the endpoint /dags/{dag_id}/details. IMHO it's better to return one of the supported catchup values instead of a boolean, even if the user uses a boolean in their DAG.

Member

@pierrejeambrun pierrejeambrun Nov 9, 2023


Normally, changing the type of a field would be breaking (in either the response or the request).

Here, if True and False are still accepted values, it should be fine.

We would need to confirm what happens in the client if we pass a Boolean object to a string field: is it silently cast to a string, or does it crash? I am not sure. That could break things; apart from that I don't see an issue. (I don't know whether the API client tries type casting when types are wrong or fails right away.)
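The two outcomes raised above can be pictured with a small sketch. Both functions are hypothetical stand-ins for how a generated client might treat a string field; neither is taken from the actual Airflow API client, whose behavior would still need to be checked.

```python
import json


def strict_string_field(value):
    """Hypothetical client that validates types and fails right away."""
    if not isinstance(value, str):
        raise TypeError(f"catchup must be a string, got {type(value).__name__}")
    return value


def coercing_string_field(value):
    """Hypothetical client that silently casts whatever it is given to a string."""
    return str(value)


# A legacy payload still carries a JSON boolean, which Python decodes to bool.
legacy_payload = json.loads('{"catchup": true}')
legacy_value = legacy_payload["catchup"]

# Silent casting does not crash, but it yields "True" (capitalized), which is
# not one of "enabled", "disabled" or "ignore_first".
coerced = coercing_string_field(legacy_value)
```

So even a client that casts silently would send a value the server has to recognize explicitly, which is one reason to keep True and False among the accepted inputs.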

@eladkal
Contributor

eladkal commented Nov 25, 2023

Bas is not saying catchup false is not intuitive in general, he's talking about the specific behavior that a run is immediately created even when catchup is false.

It runs immediately because of start_date, not because of catchup.
If start_date would be set correctly there won't be an immediate run.
For better or worse, Airflow schedules runs at the end of the interval, so if the interval has ended a run is created.
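The end-of-interval rule can be sketched with plain datetime arithmetic. This is a simplified model, assuming fixed-length intervals that begin exactly at start_date; `first_run_time` and `runs_immediately` are hypothetical helpers, not Airflow functions, and real timetables (cron alignment, time zones) are more involved.

```python
from datetime import datetime, timedelta


def first_run_time(start_date: datetime, interval: timedelta) -> datetime:
    """The run for the first data interval is created when that interval ends."""
    return start_date + interval


def runs_immediately(
    start_date: datetime, interval: timedelta, unpaused_at: datetime
) -> bool:
    """True if at least one interval has already ended when the DAG is unpaused."""
    return first_run_time(start_date, interval) <= unpaused_at


daily = timedelta(days=1)
unpaused_at = datetime(2023, 1, 10, 9, 0)

# start_date well in the past: the first interval ended long ago.
old_start = runs_immediately(datetime(2023, 1, 1), daily, unpaused_at)

# start_date at the beginning of the current interval: nothing has ended yet.
current_start = runs_immediately(datetime(2023, 1, 10), daily, unpaused_at)
```

In this model the first case produces a run as soon as the DAG is unpaused, while the second waits until the current interval closes on January 11.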

I do agree that any place of confusion is something that we should handle and simplify, but this specific pain point feels out of scope for this PR.

@hussein-awala
Member Author

I updated the DAG details endpoint to make it backward compatible (510b10e) and updated the catchup doc (3942509).

@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jan 22, 2024
@github-actions github-actions bot closed this Jan 28, 2024
@hussein-awala hussein-awala reopened this Jan 28, 2024
@hussein-awala hussein-awala added pinned and removed stale labels Jan 28, 2024
@WillAyd

WillAyd commented Sep 19, 2024

If start_date would be set correctly there won't be an immediate run.

What is the proper start_date to set to prevent an immediate run when enabling a DAG? The organizations I have worked with struggle with this a lot. We don't want to hard-code a start_date, and it seems like using a dynamic start_date is discouraged in the Airflow documentation. We are confused as to how we can deploy a DAG to different environments at different points in time and just get them to run once per day starting at the next valid interval, without an immediate execution upon enablement

@kaxil kaxil removed this from the Airflow 2.11.0 milestone Sep 26, 2024
@dead10ck

This would be great to have

@weiminmei

If start_date would be set correctly there won't be an immediate run.

What is the proper start_date to set to prevent an immediate run when enabling a DAG? The organizations I have worked with struggle with this a lot. We don't want to hard-code a start_date, and it seems like using a dynamic start_date is discouraged in the Airflow documentation. We are confused as to how we can deploy a DAG to different environments at different points in time and just get them to run once per day starting at the next valid interval, without an immediate execution upon enablement

What I have had to do is update the start_date to a future date and then turn on the DAG. This is very inconvenient and, I would surmise, almost impossible for people with many DAGs.

@ashb
Member

ashb commented Nov 13, 2024

No more new configs for this, please. We already have config sprawl for DAG behaviours.

If this is being added to core, I feel the correct way of doing it is a custom/new timetable.

@dstandish
Contributor

Can we just decide on the correct behavior and fix it? We make too many things configurable in Airflow.

@ashb
Member

ashb commented Nov 13, 2024

Closing this, it's been inactive for a year too, plus consensus seems to be no new config option specifically

@ashb ashb closed this Nov 13, 2024
@dead10ck

Is there an issue to actually resolve what the desired behavior is? I get closing PRs if there is not movement being made on it because of a blocker on UX design, but the problem that this PR is attempting to address remains.

@dstandish
Contributor

dstandish commented Nov 19, 2024

Is there an issue to actually resolve what the desired behavior is? I get closing PRs if there is not movement being made on it because of a blocker on UX design, but the problem that this PR is attempting to address remains.

I'm not sure. You could look in issues or discussions and, if none is found, create one for it, and possibly try to engage to help us come to something approaching a consensus on what to do. Airflow 3 is the time to fix things like this (assuming a fix is needed), so we should get moving on it if it's to be resolved in 3.0.


Labels

area:API area:serialization area:UI area:webserver pinned type:new-feature
