@dstandish
Contributor

@dstandish dstandish commented Jan 25, 2024

Add conditional logic for dataset-triggered dags.

This means we can schedule based on dataset1 OR dataset2.

This PR only implements the underlying classes, DatasetAny and DatasetAll. In a followup PR we will add more convenient syntax for this, specifically the | and & symbols, e.g. (dataset1 | dataset2) & dataset3.
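For readers skimming the thread, here is a minimal, self-contained sketch of the any/all idea. These are stand-in classes written for illustration only, not the code in this PR: the `evaluate` method, the `statuses` mapping, and the example URIs are assumptions made for the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Stand-in for a dataset, identified by its URI (illustrative only)."""

    uri: str

    def evaluate(self, statuses: dict[str, bool]) -> bool:
        # A single dataset is "ready" if it has been marked as updated.
        return statuses.get(self.uri, False)


class DatasetAny:
    """Satisfied when at least one child condition is satisfied (OR)."""

    def __init__(self, *objects):
        self.objects = objects

    def evaluate(self, statuses: dict[str, bool]) -> bool:
        return any(obj.evaluate(statuses) for obj in self.objects)


class DatasetAll:
    """Satisfied only when every child condition is satisfied (AND)."""

    def __init__(self, *objects):
        self.objects = objects

    def evaluate(self, statuses: dict[str, bool]) -> bool:
        return all(obj.evaluate(statuses) for obj in self.objects)


# Hypothetical URIs, chosen only for the example.
dataset1 = Dataset("s3://bucket/dataset1")
dataset2 = Dataset("s3://bucket/dataset2")
dataset3 = Dataset("s3://bucket/dataset3")

# Equivalent to the proposed shorthand (dataset1 | dataset2) & dataset3.
condition = DatasetAll(DatasetAny(dataset1, dataset2), dataset3)

# dataset1 and dataset3 updated: the OR branch and dataset3 are both satisfied.
ready = condition.evaluate({dataset1.uri: True, dataset3.uri: True})

# Only dataset1 and dataset2 updated: dataset3 is still missing.
not_ready = condition.evaluate({dataset1.uri: True, dataset2.uri: True})
```

Because each class only asks its children to `evaluate`, conditions nest to any depth, which is what the `|` and `&` syntax planned for the follow-up PR would build on.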

@dstandish dstandish force-pushed the add-conditional-logic-for-dataset-triggering branch from cd01fde to 859a0ac Compare January 29, 2024 23:20
@dstandish dstandish force-pushed the add-conditional-logic-for-dataset-triggering branch 2 times, most recently from 0e1e5d8 to ecccd6f Compare February 1, 2024 16:21
@dstandish
Contributor Author

dstandish commented Feb 2, 2024

@sunank200

  1. fix test on main PR
  2. look at the tests on the PR and determine which should be added or removed
  3. performance
    • why 10 minutes in main?
    • why slower with this PR?
    • what's going on?

update: the performance concern appears to be invalid; the difference was caused by randomness in task execution and unrelated scheduler restarts

@sunank200 sunank200 force-pushed the add-conditional-logic-for-dataset-triggering branch 5 times, most recently from 83087fc to 3cffcf4 Compare February 7, 2024 06:24
@kaxil kaxil added this to the Airflow 2.9.0 milestone Feb 7, 2024
@sunank200 sunank200 force-pushed the add-conditional-logic-for-dataset-triggering branch 3 times, most recently from 7514d44 to ab979da Compare February 8, 2024 07:16
@sunank200
Collaborator

@sunank200

  1. fix test on main PR

  2. look at the tests on the PR and determine which should be added or removed

  3. performance

    • why 10 minutes in main?
    • why slower with this PR?
    • what's going on?

Documentation changes are done here

@dstandish dstandish changed the title DRAFT Add conditional logic for dataset triggering Add conditional logic for dataset triggering Feb 13, 2024
@dstandish dstandish marked this pull request as ready for review February 13, 2024 19:32
@dstandish dstandish force-pushed the add-conditional-logic-for-dataset-triggering branch from 870452c to 9753973 Compare February 20, 2024 22:14
@dstandish dstandish force-pushed the add-conditional-logic-for-dataset-triggering branch from 0137de6 to 71f6eba Compare February 21, 2024 17:47
dstandish and others added 5 commits February 21, 2024 09:49
Co-authored-by: Wei Lee <weilee.rx@gmail.com>
Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
@dstandish
Contributor Author

Do we plan to update object/next_run_datasets so the UI can show all of this logic?

Same for /next_run_datasets_summary. We should make sure the ready+total counts are still accurate. It might need to change from a total to a min and max, so we can say something like "1 of 2-3 datasets updated".

Created issue on our board @bbovenzi

@dstandish dstandish merged commit f971232 into apache:main Feb 21, 2024
@dstandish dstandish deleted the add-conditional-logic-for-dataset-triggering branch February 21, 2024 19:24
@ephraimbuddy ephraimbuddy added the type:new-feature Changelog: New Features label Mar 6, 2024
8 participants