AIP-83 run_id logic when no logical date #46398

prabhusneha · 2025-02-03T20:16:46Z

Related issue: #46199

Construct run_id, when logical date is null, using run_after + random string.

uranusjr · 2025-02-04T09:37:11Z

airflow/api/common/trigger_dag.py

+        run_type=DagRunType.MANUAL,
+        logical_date=coerced_logical_date,
+        data_interval=data_interval,
+        run_after=data_interval.end,


At some point (before 3.0) I want to reduce the arguments here to just take a DagRunInfo, but that can be done separately instead.

According to the doc, when the logical date is null, the data interval end should be null as well, so it would not make sense to use the data interval end as the run_after date. Thoughts, @uranusjr?

Oh yeah you’re right. This should do coerced_logical_date or timezone.utcnow() (as currently implemented, coerced_logical_date can never be None, but it will be when everything is finished).

Should we include this change as a part of this PR or have a separate PR?

I think we'll need to do it in this PR 🤔

timezone.utcnow() seems to be a reasonable default for run_after

I have changed it here: #46616
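
For illustration, the fallback discussed in this thread can be sketched as below. This is a minimal standalone sketch, not the code from this PR or from #46616: `resolve_run_after` is a hypothetical helper, and `datetime.now(timezone.utc)` from the standard library stands in for Airflow's `timezone.utcnow()`.

```python
from datetime import datetime, timezone
from typing import Optional


def resolve_run_after(logical_date: Optional[datetime]) -> datetime:
    """Hypothetical helper: use logical_date when given, otherwise the current UTC time."""
    # A datetime object is always truthy, so `or` only falls through when the value is None.
    return logical_date or datetime.now(timezone.utc)
```

With this shape, a run triggered without a logical date still gets a concrete run_after, and no data interval is needed to compute it.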

uranusjr · 2025-02-04T09:38:27Z

airflow/jobs/scheduler_job_runner.py

                        run_type=DagRunType.ASSET_TRIGGERED,
                        logical_date=logical_date,
                        data_interval=data_interval,
+                        run_after=max(logical_dates.values()),


There’s a task to make asset-triggered runs have None logical_date instead, so we’ll need to change this again soon. This is good enough for now.

uranusjr · 2025-02-04T09:38:57Z

airflow/api_connexion/schemas/dag_run_schema.py

                data["dag_run_id"] = DagRun.generate_run_id(
-                    DagRunType.MANUAL, timezone.parse(data["logical_date"])
+                    DagRunType.MANUAL,
+                    timezone.parse(data["logical_date"]),
                )


Do we want to pass run_after here?

looks like there is no change here, just a new line?

Yes and I think that’s wrong, we should change the logic here.

Is this required here? run_after was not added to this schema in the PR that introduced the new run_after field (#46195).
I assumed this schema was going to be deprecated, so the field wasn't added here.

I have changed it here: #46616

uranusjr · 2025-02-04T09:40:20Z

airflow/utils/types.py


-    def generate_run_id(self, logical_date: datetime) -> str:
+    def generate_run_id(self, logical_date: datetime | None, run_after: datetime | None) -> str:
+        if logical_date is None:
+            if run_after is None:
+                raise ValueError("run_after cannot be None")
+            return run_after + get_random_string()
        return f"{self}__{logical_date.isoformat()}"


Maybe this function should continue to accept a single datetime value, and we do the if-else check outside instead.

run_after could be None as well?

Maybe this function should continue to accept a single datetime value, and we do the if-else check outside instead.

Here we need to know whether logical_date is None before generating the random string that gets appended to run_after.
If the function takes just one argument, it cannot know when to append the random string. Are you suggesting moving this logic into the callers (Timetable.generate_run_id and DagRun.generate_run_id)?
I went with the current implementation to avoid duplicate code and multiple function calls.

I would move random string generation to DagRun.

I have changed it here: #46616
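
One way to read the split proposed in this thread is sketched below. It is an assumption-laden illustration rather than the code merged in #46616: `format_run_id` and this `generate_run_id` are hypothetical standalone functions, `run_type` is a plain string instead of `DagRunType`, `secrets.token_hex` stands in for `get_random_string()`, and the `_` separator before the suffix is made up.

```python
import secrets
from datetime import datetime, timezone
from typing import Optional


def format_run_id(run_type: str, when: datetime) -> str:
    """Type-level formatting: accepts exactly one datetime."""
    return f"{run_type}__{when.isoformat()}"


def generate_run_id(
    *, run_type: str, logical_date: Optional[datetime], run_after: datetime
) -> str:
    """Caller-level logic: pick the datetime and add a random suffix when needed."""
    if logical_date is not None:
        return format_run_id(run_type, logical_date)
    # Without a logical date, run_after alone may not be unique, so append a random suffix.
    suffix = secrets.token_hex(4)  # 8 hex characters; the length is arbitrary
    return f"{format_run_id(run_type, run_after)}_{suffix}"
```

The formatting function keeps a single datetime parameter, and only the caller needs to know whether a logical date exists.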

dstandish · 2025-02-04T18:24:51Z

airflow/api_fastapi/core_api/routes/public/dag_run.py

                run_type=DagRunType.MANUAL,
                logical_date=logical_date,
                data_interval=data_interval,
+                run_after=data_interval.end,


Again, if the logical date is null, then we generally won't have a data interval. Correct, @uranusjr?

That's right.

data_interval would be null as well.

utcnow for this as well probably?

I have changed it here: #46616

Lee-W · 2025-02-10T05:56:37Z

airflow/api/common/trigger_dag.py

+        run_type=DagRunType.MANUAL,
+        logical_date=coerced_logical_date,
+        data_interval=data_interval,
+        run_after=data_interval.end,


I think we'll need to do it in this PR 🤔

Lee-W · 2025-02-10T05:56:59Z

airflow/api/common/trigger_dag.py

+        run_type=DagRunType.MANUAL,
+        logical_date=coerced_logical_date,
+        data_interval=data_interval,
+        run_after=data_interval.end,


timezone.utcnow() seems to be a reasonable default for run_after

Lee-W · 2025-02-10T05:58:04Z

airflow/api_connexion/schemas/dag_run_schema.py

+                    DagRunType.MANUAL,
+                    timezone.parse(data["logical_date"]),


Suggested change

-                    DagRunType.MANUAL,
-                    timezone.parse(data["logical_date"]),
+                    run_type=DagRunType.MANUAL,
+                    logical_date=timezone.parse(data["logical_date"]),

Lee-W · 2025-02-10T05:58:35Z

airflow/api_fastapi/core_api/routes/public/dag_run.py

                run_type=DagRunType.MANUAL,
                logical_date=logical_date,
                data_interval=data_interval,
+                run_after=data_interval.end,


utcnow for this as well probably?

Lee-W · 2025-02-10T05:59:09Z

airflow/models/baseoperator.py

+                        DagRunType.MANUAL,
+                        info.logical_date,


Suggested change

-                        DagRunType.MANUAL,
-                        info.logical_date,
+                        run_type=DagRunType.MANUAL,
+                        logical_date=info.logical_date,

I have changed it here: #46616

Lee-W · 2025-02-10T05:59:52Z

airflow/models/dagrun.py

    @staticmethod
-    def generate_run_id(run_type: DagRunType, logical_date: datetime) -> str:
+    def generate_run_id(
+        run_type: DagRunType, logical_date: datetime | None, run_after: datetime | None = None


Suggested change

-        run_type: DagRunType, logical_date: datetime | None, run_after: datetime | None = None
+        *, run_type: DagRunType, logical_date: datetime | None, run_after: datetime | None = None

As the method becomes more complicated (it is hard to reason about the order of logical_date and run_after), we should probably make it keyword-only.

I have changed it here: #46616
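
As a small illustration of the keyword-only suggestion: a bare `*` in the parameter list makes Python reject positional arguments, so two datetime parameters cannot be swapped silently. `describe_run` below is a hypothetical function used only for this sketch, not part of Airflow.

```python
from datetime import datetime, timezone
from typing import Optional


def describe_run(*, logical_date: Optional[datetime], run_after: Optional[datetime] = None) -> str:
    """Hypothetical function; the bare * makes every parameter keyword-only."""
    return f"logical_date={logical_date}, run_after={run_after}"


now = datetime(2025, 2, 10, tzinfo=timezone.utc)

# Keyword call: the role of each datetime is explicit at the call site.
described = describe_run(logical_date=None, run_after=now)

# Positional call: rejected with TypeError, so the arguments cannot be mixed up.
try:
    describe_run(None, now)
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False
```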

Lee-W · 2025-02-10T06:00:25Z

airflow/utils/types.py

        return self.value

-    def generate_run_id(self, logical_date: datetime) -> str:
+    def generate_run_id(self, logical_date: datetime | None, run_after: datetime | None) -> str:


Suggested change

-    def generate_run_id(self, logical_date: datetime | None, run_after: datetime | None) -> str:
+    def generate_run_id(self, *, logical_date: datetime | None, run_after: datetime | None) -> str:

same here

I have changed it here: #46616

Lee-W · 2025-02-10T06:01:13Z

airflow/utils/types.py

+        if logical_date is None:
+            if run_after is None:
+                raise ValueError("run_after cannot be None")
+            return run_after + get_random_string()


Suggested change

-            return run_after + get_random_string()
+            return f"{run_after}{get_random_string()}"

I have changed it here: #46616
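
The reason for this suggestion can be checked with the standard library alone: adding a `str` to a `datetime` raises `TypeError`, while an f-string formats the datetime first. In the sketch below, `suffix` is a fixed stand-in for whatever `get_random_string()` returns.

```python
from datetime import datetime, timezone

run_after = datetime(2025, 2, 10, tzinfo=timezone.utc)
suffix = "abc123"  # stand-in for the output of get_random_string()

try:
    run_after + suffix  # datetime supports + only with timedelta
    concat_ok = True
except TypeError:
    concat_ok = False

# An f-string formats the datetime with str() before joining the pieces.
run_id = f"{run_after}{suffix}"
```

Note that `str()` on a datetime uses a space between date and time, whereas `isoformat()` uses `T` by default, so the two produce different run_id strings.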

sunank200 · 2025-02-10T11:36:23Z

I am making the changes in #46616, as the logic has changed now.

run_id logic when logical_date is null

8f22db7

prabhusneha requested review from XD-DENG, ashb, bbovenzi, ephraimbuddy, jscheffl, pierrejeambrun, ryanahamilton and uranusjr as code owners February 3, 2025 20:16

boring-cyborg bot added the area:API (Airflow's REST/HTTP API), area:Scheduler (including HA (high availability) scheduler), and area:webserver (Webserver related Issues) labels Feb 3, 2025

Merge branch 'main' into generate_run_id_logic

05d1ec9

prabhusneha mentioned this pull request Feb 3, 2025

AIP-83 question 6 run_id logic when no logical date #46199

Closed

1 task

Lee-W self-requested a review February 4, 2025 08:50

Lee-W added the legacy api (Whether legacy API changes should be allowed in PR) label Feb 4, 2025

uranusjr reviewed Feb 4, 2025

jedcunningham added the AIP-83 (Remove Execution Date Unique Constraint from DAG Run) label Feb 4, 2025

dstandish reviewed Feb 4, 2025

Lee-W mentioned this pull request Feb 5, 2025

Set logical_date and data_interval to None for asset-triggered dags and forbid them to be accessed in context/template #46460

Merged

vatsrahul1001 mentioned this pull request Feb 7, 2025

AIP-83 Logical date should be required field when triggering run via API #46390

Merged

Lee-W reviewed Feb 10, 2025

Lee-W assigned sunank200 Feb 10, 2025

This was referenced Feb 10, 2025

Change manual run data interval behavior based on logical_date #46512

Merged

AIP-83 amendment: Add logic for generating run_id when logical date is None. #46616

Merged

sunank200 closed this Feb 10, 2025
