Skip to content

Conversation

@JDarDagran
Copy link
Contributor

@JDarDagran JDarDagran commented Apr 24, 2024


This may possibly fix #39232. I tested the solution many times locally and deployed in Astro Cloud (so it gives me quite certainty it works).

I don't really know how to write tests for the change (load test is the thing but do we do this in regular tests?). Some guidance would be very much helpful.

@JDarDagran JDarDagran force-pushed the openlineage/use-process-pool-executor branch from e8edb6a to 582d9b1 Compare April 30, 2024 12:00
@JDarDagran JDarDagran force-pushed the openlineage/use-process-pool-executor branch 2 times, most recently from 7670638 to 05261ed Compare May 7, 2024 14:10
@JDarDagran
Copy link
Contributor Author

Needed a bit more time to confirm in various environment that ProcessPoolExecutor fixes the issue and doesn't break anything else.

Converting to ready for review since I've added configurable max_workers and simple test.

@JDarDagran JDarDagran marked this pull request as ready for review May 7, 2024 14:11
@JDarDagran JDarDagran requested review from Taragolis and ashb May 14, 2024 13:28
Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>

Make `max_workers` configurable.

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
@JDarDagran JDarDagran force-pushed the openlineage/use-process-pool-executor branch from 05261ed to b9bb056 Compare May 14, 2024 13:39
@mobuchowski
Copy link
Contributor

Internal tests did not find any problems when this approach is enabled.

Copy link
Contributor

@tatiana tatiana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this, @JDarDagran , @rahul342 confirmed things worked with our internal benchmarks - it will be great to see this change reaching Airflow + OL users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

openlineage, celery: scheduler hanging when emitting lots of OL events via HTTP

5 participants