Skip to content

AIP-83 amendment -- update backfill#46248

Merged
phanikumv merged 9 commits intoapache:mainfrom
astronomer:update_backfill_logic
Feb 7, 2025
Merged

AIP-83 amendment -- update backfill#46248
phanikumv merged 9 commits intoapache:mainfrom
astronomer:update_backfill_logic

Conversation

@prabhusneha
Copy link
Contributor

@prabhusneha prabhusneha commented Jan 29, 2025

Closes: #46202
Depends on: #46187

Once uniqueness on logical date is restored, the change in backfill logic handles the removal of locking of dagrun that was added to avoid the creation of multiple concurrent dag runs for a given logical date.

@prabhusneha prabhusneha force-pushed the update_backfill_logic branch from 88f3c48 to 42a19c9 Compare February 4, 2025 11:16
@jedcunningham jedcunningham added the AIP-83 Remove Execution Date Unique Constraint from DAG Run label Feb 4, 2025
@phanikumv phanikumv merged commit 891c67f into apache:main Feb 7, 2025
45 checks passed
@phanikumv phanikumv deleted the update_backfill_logic branch February 7, 2025 07:06
Copy link
Collaborator

@sunank200 sunank200 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add a test for this in future PRs.

niklasr22 pushed a commit to niklasr22/airflow that referenced this pull request Feb 8, 2025
log.info("Doing session rollback.")
session.rollback()

should_create_backfill_dag_run(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you know why we are calling should_create_backfill_dag_run here?

this seems might be a mistake?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here should_create_backfill_dag_run is called to create backfill dag run record if the insert fails.
Created a separate function for this to reuse in try and error handling

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if insert fails, wouldn’t a new insert still fail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The try block creates a backfill by inserting into dag_run table and if that fails, the error handling here is inserting a row in backfill_dag_run table with non_create_reason.
This was suggested here - #46248 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are any concerns I can address them in another PR along with adding a test

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the logic is (simplified)

  1. Check if there’s an existing run
  2. If so, return
  3. Otherwise, try to create a new run
  4. If that succeeds, good.
  5. Otherwise, race condition? Go to should_create_backfill_dag_run again; in this case, _get_latest_dag_run_row_query should return that conflicting run instead, and we only insert a new BackfillDagRun without a new DagRun.

It might be better if it’s clearer that the DagRun fetched in the second should_create_backfill_dag_run call should be a different one, but I’m not exactly sure how. Maybe call _get_latest_dag_run_row_query in _create_backfill_dag_run instead? Not sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah the problem here is in part naming.

a function called "should_create_x" reads like it will just give you information. so you would not expect it to be creating records, or to depend on it to create a record.

otherwise it's confusing. you could either get rid of this method, or limit it to only give info (e.g. maybe just return BackfillDagRunExceptionReason | None and do the insert conditionally at call site)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in #46959

@prabhusneha
Copy link
Contributor Author

I think we should add a test for this in future PRs.

Adding a test in a separate PR

@sunank200
Copy link
Collaborator

Add test for AIP-83 amendment - update backfill #46605

@prabhusneha I think Rahul has added test here: 46605. Please check.

ambika-garg pushed a commit to ambika-garg/airflow that referenced this pull request Feb 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AIP-83 Remove Execution Date Unique Constraint from DAG Run

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

AIP-83 amendment -- update backfill

8 participants