Skip to content

Conversation

@stevenschaerer
Copy link
Contributor

This closes: #38672

I opted to slightly extend some existing unit tests rather than write mostly identical new ones. If the @sync_to_async decorator and the awaits before self._get_count are removed the tests will fail because the trigger and dummy tasks are executed sequentially.

Should I also remove TaskStateTrigger in this PR? It is not used anymore outside of tests.

Copy link
Member

@hussein-awala hussein-awala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, LGTM

@hussein-awala hussein-awala requested a review from Taragolis April 2, 2024 22:09
@uranusjr
Copy link
Member

uranusjr commented Apr 2, 2024

I don’t think we can just remove TaskStateTrigger, but let’s deprecate it for future removal.

It would also be a good idea to implement an version of _get_count to use the async SQLAlchemy API if available, and fallback to the sync_to_async version otherwise. Most people likely won’t have the async component (pip install sqlalchemy[asyncio]) but we should start to slowly add support to it in triggerers.

@hussein-awala
Copy link
Member

I don’t think we can just remove TaskStateTrigger, but let’s deprecate it for future removal.

It would also be a good idea to implement an version of _get_count to use the async SQLAlchemy API if available, and fallback to the sync_to_async version otherwise. Most people likely won’t have the async component (pip install sqlalchemy[asyncio]) but we should start to slowly add support to it in triggerers.

I'm working on a draft AIP to add this support, I tested it a few months ago and the results were promising: #36504

@hussein-awala
Copy link
Member

Also some async features are only available in sqlalchemy 2.0, so maybe we need to focus on this migration first.

@stevenschaerer
Copy link
Contributor Author

I don’t think we can just remove TaskStateTrigger, but let’s deprecate it for future removal.

It would also be a good idea to implement an version of _get_count to use the async SQLAlchemy API if available, and fallback to the sync_to_async version otherwise. Most people likely won’t have the async component (pip install sqlalchemy[asyncio]) but we should start to slowly add support to it in triggerers.

I have deprecated TaskStateTrigger. Following the other comments I'll leave using async SQLAlchemy for a separate PR.

@Taragolis Taragolis merged commit e6eec0c into apache:main Apr 4, 2024
@Taragolis Taragolis added the type:bug-fix Changelog: Bug Fixes label Apr 4, 2024
@Taragolis Taragolis added this to the Airflow 2.9.1 milestone Apr 4, 2024
jedcunningham pushed a commit that referenced this pull request Apr 26, 2024
* Use async db calls in WorkflowTrigger

* address PR comments

* deprecate TaskStateTrigger with proper category

(cherry picked from commit e6eec0c)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:Triggerer type:bug-fix Changelog: Bug Fixes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Avoid to use sync (with blocking io) functions into the WorkflowTrigger

4 participants