[AIRFLOW-2761] Parallelize enqueue in celery executor #4234
Codecov Report
Coverage diff, master vs. #4234:
            master    #4234     +/-
Coverage    77.82%    77.84%    +0.01%
Files       201       201
Lines       16367     16455     +88
Hits        12738     12809     +71
Misses      3629      3646      +17
|
Awesome, always love the detail in your PR descriptions 💖 |
|
Hi @KevinYang21, I haven't had time to check the details yet. A few very minor points above first. |
|
@Fokko Thoughts on including this in 1.10.2? |
|
I would vote +1 to include so that we have a complete story around scaling in 1.10.2 |
|
Hey @KevinYang21, I discussed this with Max and heard that Airbnb runs 2000+ DAGs in prod daily. What does the setup look like? Do you run multiple Airflow clusters? |
|
BTW, pr nicely done! |
|
@feng-tao ty, and ty for the reviews you have done. We have ~1600 active DAGs after some recent pruning, and we don't run multiple clusters. I shared some of our setup in a thread on the dev mailing list. I'm happy to share more details if you'd like. In fact, I think it would be a good idea for us to have a short meetup (maybe lunch on either side) to discuss this more efficiently. Or, if the schedule is tight, we can do a WebEx too. |
|
Thanks @KevinYang21. Yeah, let me go back and check with the team. It would definitely help to share and learn more from Airbnb's experience. |
|
@KevinYang21 hi, how do you get >30k running tasks in Airbnb's internal cluster?
Jira
Description
This change depends on PR: [AIRFLOW-2760] Decouple DAG parsing loop from scheduler loop #3873
Summary of major changes:
Tests
tests/executors/test_celery_executor.py:TestCeleryExecutor.test_error_sending_task
tests/jobs.py:SchedulerJobTest.test_change_state_for_tasks_failed_to_execute
Also updated an existing failing unit test.
The change has been running in Airbnb's internal cluster for 3+ months.
Before (32k tasks were scheduled to run at 13:30 but did not reach >30k running tasks until 13:41):
After (32k tasks were scheduled to run at 15:10 and all of them were running by 15:14):
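For readers unfamiliar with the idea, here is a minimal sketch of what "parallelize enqueue" can look like in general. It is not the code from this PR: the names `send_one`, `enqueue_parallel`, and `publish` are hypothetical, and a thread pool stands in for whatever worker pool the executor actually uses. It fans the sends out over a pool and captures a failed send per task instead of raising, which is the kind of behavior the tests listed above (test_error_sending_task, test_change_state_for_tasks_failed_to_execute) appear to cover.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical types for illustration only; these are not Airflow's API.
TaskKey = str
SendResult = Tuple[TaskKey, Optional[str], Optional[Exception]]


def send_one(key: TaskKey, command: str,
             publish: Callable[[str], str]) -> SendResult:
    """Publish one task and return (key, result, error) instead of raising."""
    try:
        return key, publish(command), None
    except Exception as exc:
        # Capture the failure so one bad send does not abort the whole batch.
        return key, None, exc


def enqueue_parallel(tasks: Iterable[Tuple[TaskKey, str]],
                     publish: Callable[[str], str],
                     parallelism: int = 4) -> List[SendResult]:
    """Send all tasks through a worker pool; results keep the input order."""
    task_list = list(tasks)
    if not task_list:
        return []
    # Never start more workers than there are tasks to send.
    workers = max(1, min(parallelism, len(task_list)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda t: send_one(t[0], t[1], publish),
                             task_list))
```

A caller would then walk the results and mark any task whose error slot is set as failed to execute, rather than leaving it stuck in a queued state.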

Commits
Documentation
Code Quality
git diff upstream/master -u -- "*.py" | flake8 --diff