@ephraimbuddy (Contributor) commented Oct 10, 2024

This commit introduces versioning for DAGs.

Changes:

  • Introduced DagVersion model to handle versioning of DAGs.
  • Added a version_name field to DAG so that users can track the DAG version.
  • Modified DAG execution logic to reference dag_version_id instead of dag_hash, so that DAG runs are linked to specific versions.

The table relations:
[Image: diagram of the table relations (Screenshot 2024-10-25 at 09 21 19)]

The versioning is based on changes to the serialized dict. If a DAG's serialized dict changes, a new serialized DAG is registered based on the hash difference and, consequently, a new DAG version and DAG code are created. dag_version is linked to TI (TaskInstance) because of TaskInstance clearing: the link lets us retain the previous DAG version the task ran with.
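To illustrate the idea, here is a minimal, self-contained sketch of hash-based versioning. It is not the code in this PR: `serialized_hash`, `VersionRecord`, `VersionRegistry`, and `TaskRun` are hypothetical names, the hash function (SHA-256 over sorted-key JSON) is an assumption, and an in-memory dict stands in for the database tables.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import dataclass, field


def serialized_hash(serialized: dict) -> str:
    """Hash a serialized dict; sorted keys make the hash order-independent."""
    payload = json.dumps(serialized, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class VersionRecord:
    """Hypothetical stand-in for a row in a dag_version-like table."""
    dag_id: str
    version_number: int
    dag_hash: str


@dataclass
class TaskRun:
    """Hypothetical stand-in for a task instance that remembers its version."""
    task_id: str
    dag_version: VersionRecord


@dataclass
class VersionRegistry:
    """In-memory registry: one version history per dag_id."""
    versions: dict[str, list[VersionRecord]] = field(default_factory=dict)

    def register(self, dag_id: str, serialized: dict) -> VersionRecord:
        """Return the latest version, creating a new one only if the hash changed."""
        new_hash = serialized_hash(serialized)
        history = self.versions.setdefault(dag_id, [])
        if history and history[-1].dag_hash == new_hash:
            return history[-1]  # unchanged serialized dict: no new version
        record = VersionRecord(dag_id, len(history) + 1, new_hash)
        history.append(record)
        return record


registry = VersionRegistry()
v1 = registry.register("example_dag", {"tasks": ["extract"]})
run = TaskRun(task_id="extract", dag_version=v1)

# Changing the serialized dict registers a second version; the earlier
# task run still points at the version it actually ran with.
v2 = registry.register("example_dag", {"tasks": ["extract", "load"]})
```

In this sketch, `run.dag_version` still refers to version 1 after version 2 is registered, which mirrors the reason given above for linking task instances to a DAG version.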

Closes: #42333, #42334, #42336

@boring-cyborg bot added the area:API, area:CLI, area:db-migrations, area:dev-tools, area:Scheduler, area:serialization, area:UI, area:webserver, and kind:documentation labels Oct 10, 2024
@ephraimbuddy force-pushed the versioned-dag2 branch 5 times, most recently from ad4f57c to dd272b0 Oct 16, 2024 12:27
@ephraimbuddy added the legacy api label Oct 16, 2024
@ephraimbuddy force-pushed the versioned-dag2 branch 3 times, most recently from eb13cdd to 1b81f2f Oct 16, 2024 13:27
@ephraimbuddy marked this pull request as ready for review October 16, 2024 13:27
@uranusjr (Member) left a comment

Good enough

@ephraimbuddy merged commit 1116f28 into apache:main Nov 5, 2024
@ephraimbuddy deleted the versioned-dag2 branch November 5, 2024 14:19
potiuk added a commit that referenced this pull request Nov 6, 2024
* Revert "Delete the Serialized Dag and DagCode before DagVersion migration (#43700)"

This reverts commit 438f71d.

* Revert "AIP-65: Add DAG versioning support (#42913)"

This reverts commit 1116f28.
ellisms pushed a commit to ellisms/airflow that referenced this pull request Nov 13, 2024
* AIP-65: Add DAG versioning support

This commit introduces versioning for DAGs

Changes:
- Introduced DagVersion model to handle versioning of DAGs.
- Added version_name field to DAG for use in tracking the dagversion by users
- Added support for version retrieval in the get_dag_source API endpoint
- Modified DAG execution logic to reference dag_version_id instead of the
dag_hash to ensure DAG runs are linked to specific versions.

Fix tests

revert RESTAPI changes

* fixup! AIP-65: Add DAG versioning support

* fixup! fixup! AIP-65: Add DAG versioning support

* fix migration

* fix test

* more test fixes

* update query count

* fix static checks

* Fix query and add created_at to dag_version table

* improve code

* Change to using UUID for primary keys

* DagCode.bulk_write_code is no longer used

* fixup! Change to using UUID for primary keys

* fix tests

* fixup! fix tests

* use uuid for version_name

* fixup! use uuid for version_name

* use row lock when writing dag version

* use row lock when writing dag version

* fixup! use row lock when writing dag version

* deactivating dag should not remove serialized dags

* save version_name as string not uuid

* Make dag_version_id unique

* fixup! Make dag_version_id unique

* Fix tests

* Use uuid7

* fix test

* fixup! fix test

* use binary=False for uuid field to fix sqlite issue

* apply suggestions from code review

* Remove unnecessary version_name on dagmodel

* Fix sqlalchemy 2 warning

* Fix conflicts

* Apply suggestions from code review

Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>

* fixup! Apply suggestions from code review

* fixup! fixup! Apply suggestions from code review

* add test for dagversion model and make version_name, number and dag_id unique

* Remove commented test as serdag can no longer disappear

* Add SQLAlchemy-utils to requirements

* mark test_dag_version.py as db_test

* make version_name nullable

* Apply suggestions from code review

* fixup! Apply suggestions from code review

* remove file_updater

* Use dag_version for creating dagruns instead of dag_version_id

* fix conflicts

* use if TYPE_CHECKING

* Add docstrings to methods

* Move getting latest serdags to SerializedDagModel
ellisms pushed a commit to ellisms/airflow that referenced this pull request Nov 13, 2024
* Revert "Delete the Serialized Dag and DagCode before DagVersion migration (apache#43700)"

This reverts commit 438f71d.

* Revert "AIP-65: Add DAG versioning support (apache#42913)"

This reverts commit 1116f28.

Labels

  • AIP-65: DAG history in UI
  • area:API (Airflow's REST/HTTP API)
  • area:CLI
  • area:db-migrations (PRs with DB migration)
  • area:dev-tools
  • area:Scheduler (including HA (high availability) scheduler)
  • area:serialization
  • area:UI (Related to UI/UX. For Frontend Developers.)
  • area:webserver (Webserver related Issues)
  • kind:documentation
  • legacy api (Whether legacy API changes should be allowed in PR)

Development

Successfully merging this pull request may close these issues.

Calculate and track DAG version

7 participants