@HsiuChuanHsu
Contributor

@HsiuChuanHsu HsiuChuanHsu commented Oct 13, 2025

Summary

This PR adds support for multiple Celery worker groups. Each group can now be independently configured to target specific queues and use distinct resource settings.

The goal is to achieve fine-grained control over task distribution by assigning queues to specific worker nodes.

Changes

New Configuration: workers.celeryQueueGroups
Added support for defining multiple worker deployments through the workers.celeryQueueGroups configuration. Each group can specify:

  • name: (required) Unique name for the worker group
  • queues: (required) Comma-separated list of Celery queues
  • replicas: (required) Number of worker replicas
  • nodeSelector: (optional)
  • affinity: (optional)
  • tolerations: (optional)
  • topologySpreadConstraints: (optional)
  • resources: (optional)
  • labels: (optional)
  • podAnnotations: (optional)
  • priorityClassName: (optional)
  • env: (optional)
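
As a rough illustration of how the optional keys might combine, the sketch below defines one group pinned to dedicated nodes. Only the key names come from the list above. The group name, queue names, node label, taint, and resource figures are hypothetical, and the shape of each optional key is assumed to mirror the existing workers.* settings.

```yaml
workers:
  celeryQueueGroups:
    - name: heavy-workers              # hypothetical group name
      queues: "reports,batch_processing"
      replicas: 2
      nodeSelector:
        workload: batch                # hypothetical node label
      tolerations:
        - key: dedicated               # hypothetical taint on the batch nodes
          operator: Equal
          value: batch
          effect: NoSchedule
      resources:                       # assumed to follow the usual Kubernetes resources shape
        requests:
          cpu: "2"
          memory: 4Gi
        limits:
          memory: 8Gi
```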

worker-deployment.yaml Logic

  • Modified the worker deployment template to loop through all configured worker groups
  • When no celeryQueueGroups are defined (the default), the template keeps the original setup and uses the existing workers.* configuration
  • Each worker group creates a separate Deployment/StatefulSet with:
    • Unique name suffix based on group name (e.g., airflow-worker-high-priority)
    • worker-group label for identification
    • Queue-specific command arguments: exec airflow celery worker --queues <group-queues>
    • Per-group resource configuration that overrides the default settings
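
To make the per-group rendering concrete, here is an abbreviated sketch of what one rendered Deployment might look like for a group named high-priority. It is not actual chart output: the release name airflow, the queue names, the replica count, the component label, and the bash -c wrapper are assumptions, and required fields such as the image and volumes are omitted.

```yaml
# Illustrative fragment only; not generated by the chart.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: airflow-worker-high-priority   # name suffix derived from the group name
  labels:
    component: worker                  # assumed existing chart label
    worker-group: high-priority        # label described in this PR
spec:
  replicas: 2                          # hypothetical per-group replicas value
  selector:
    matchLabels:
      component: worker
      worker-group: high-priority
  template:
    metadata:
      labels:
        component: worker
        worker-group: high-priority
    spec:
      containers:
        - name: worker
          args:
            - bash
            - -c
            # hypothetical queues taken from the group's "queues" value
            - exec airflow celery worker --queues high,urgent
```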

worker-service.yaml Logic

  • Extended service template to create separate services for each worker group
  • Each service is scoped to its corresponding worker group using the worker-group label selector
  • Service naming follows the pattern: {{ airflow.fullname }}-worker-<group-name>
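
A matching Service for the same hypothetical high-priority group could look roughly like the fragment below. The headless clusterIP, the port name, and port 8793 are assumptions based on how worker log serving is commonly exposed. Only the naming pattern and the worker-group selector come from the description above.

```yaml
# Illustrative fragment only; not generated by the chart.
apiVersion: v1
kind: Service
metadata:
  name: airflow-worker-high-priority   # <fullname>-worker-<group-name>, release name assumed
  labels:
    component: worker                  # assumed existing chart label
    worker-group: high-priority
spec:
  clusterIP: None                      # assumption: headless service
  selector:
    component: worker
    worker-group: high-priority        # scopes the service to this group's pods
  ports:
    - name: worker-logs                # assumed port name
      protocol: TCP
      port: 8793                       # assumed worker log port
      targetPort: 8793
```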

closes: #56591



@HsiuChuanHsu
Contributor Author

Example Configuration

workers:
  celeryQueueGroups:
    - name: default-workers
      queues: "default,email,notifications"
      replicas: 1
    - name: heavy-workers
      queues: "reports,batch_processing"
      replicas: 2

@ronaldorcampos
Contributor

We have this exact use case where we want to define different queues for different celery workers, potentially with different labels and tolerations. Thanks for the PR!

@Miretpl
Contributor

Miretpl commented Oct 15, 2025

Hi @HsiuChuanHsu, thanks for this PR! I personally planned to work on this feature sometime in the future. I have one question: some time ago, I merged the split of the workers section into workers.kubernetes and workers.celery (for now, only for the service account conf), because with the Hybrid Executor setup, having the workers section used by these two different executors becomes a little problematic. What do you think about placing your work under the workers.celery section, as it is only applicable to the Celery workers?

@HsiuChuanHsu
Contributor Author

Hi @Miretpl,
Thanks for pointing out the newly added workers.celery & workers.kubernetes sections within the configuration.
I hadn't noticed those sections. You're right; the celeryQueueGroups setting definitely belongs under workers.celery. I'll make that change.

@HsiuChuanHsu HsiuChuanHsu force-pushed the feature/airflow-celery-queue-node-assignment branch from 20ae869 to 085a66f Compare October 17, 2025 15:35
@ashb ashb force-pushed the feature/airflow-celery-queue-node-assignment branch from 085a66f to 1e38b5d Compare October 21, 2025 09:33
Member

@ashb ashb left a comment


Please add some tests in the helm-tests/tests/helm_tests/ folder for this change.

@ashb ashb changed the title Add support for multiple Celery worker groups with queue-specific configurations Add support for multiple Celery worker groups with queue-specific configurations in Helm chart Oct 21, 2025
@HsiuChuanHsu HsiuChuanHsu force-pushed the feature/airflow-celery-queue-node-assignment branch from 1e38b5d to 4ac1805 Compare October 21, 2025 15:16
@HsiuChuanHsu
Contributor Author

HsiuChuanHsu commented Oct 21, 2025

Thanks for the review! I've summarized the improvements below.

  • Naming Fix

    • values.yaml

      • name: celeryQueueGroups -> queueGroups
      • example config: name: default-workers -> name: base-workers
    • worker-deployment.yaml & worker-service.yaml

      • $isDefaultWorker -> $isBaseWorker
      • $defaultWorker -> $baseWorker
  • Tests added

    • helm-tests/tests/helm_tests/airflow_core/test_worker.py

@HsiuChuanHsu HsiuChuanHsu force-pushed the feature/airflow-celery-queue-node-assignment branch from 4ac1805 to 34a881f Compare October 29, 2025 14:25
@eladkal eladkal requested a review from jscheffl November 2, 2025 07:06
jscheffl previously approved these changes Nov 2, 2025
Contributor

@jscheffl jscheffl left a comment


From my view this PR is a usable extension and review comments have been mostly addressed.

If KEDA and HPA are not usable in parallel with the worker groups, we should mention it in the docs as a clear (current) limitation.

One potentially missing thing: I saw that all worker groups take the same celery command line definition, so if I want to define a different --concurrency per worker type I need to do this via env, correct? Maybe also worth noting in the docs (e.g. schema in line 2664).

Approving. I would be happy about the small docs adjustments and, if there are no other objections, would merge after that.

@eladkal eladkal added this to the Airflow Helm Chart 1.19.0 milestone Nov 3, 2025
@HsiuChuanHsu HsiuChuanHsu force-pushed the feature/airflow-celery-queue-node-assignment branch from 34a881f to 0cc0b75 Compare November 3, 2025 15:41
@HsiuChuanHsu
Contributor Author

Thanks for your review! Summary of changes:

If KEDA and HPA are not usable in parallel with the worker groups, we should mention it in the docs as a clear (current) limitation.

Agree that this limitation must be clearly documented.

  • Added a "not supported" note preceding workers.keda, workers.hpa & workers.queueGroups settings in values.yaml
  • Added a "not supported" note to the description of queueGroups in values.schema.json

Next Steps: I will continue working on enabling support for HPA & KEDA.

One potentially missing thing: I saw that all worker groups take the same celery command line definition, so if I want to define a different --concurrency per worker type I need to do this via env, correct? Maybe also worth noting in the docs (e.g. schema in line 2664).

Yes, your understanding is correct.
I honestly overlooked the need for the --concurrency setting, but I don't currently have a better implementation idea for handling this. Let's note the current approach for now.

  • Added a "not supported" note preceding workers.queueGroups settings in values.yaml
  • Added a clarifying note regarding the configuration of AIRFLOW__CELERY__WORKER_CONCURRENCY within values.schema.json.
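
A minimal sketch of the env-based workaround discussed above, assuming each group's env takes the usual list of name/value pairs. The group names follow the renamed example config. The concurrency values are hypothetical, and the exact parent path of queueGroups (for example, directly under workers or under workers.celery) is not settled in this thread.

```yaml
workers:
  queueGroups:                         # parent path assumed; see note above
    - name: base-workers
      queues: "default,email,notifications"
      replicas: 1
      env:
        - name: AIRFLOW__CELERY__WORKER_CONCURRENCY
          value: "16"                  # hypothetical: many light tasks per worker
    - name: heavy-workers
      queues: "reports,batch_processing"
      replicas: 2
      env:
        - name: AIRFLOW__CELERY__WORKER_CONCURRENCY
          value: "4"                   # hypothetical: fewer heavy tasks per worker
```

Because every group shares the same celery command line, the environment variable is the only per-group knob for concurrency in this approach.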

Added workers.celeryQueueGroups in values.yaml to define worker groups.
Refactored worker-deployment.yaml to support multiple worker groups using a range loop.
Fixed context reference issues (. to $).
@HsiuChuanHsu HsiuChuanHsu force-pushed the feature/airflow-celery-queue-node-assignment branch from 0cc0b75 to 8b76366 Compare November 26, 2025 15:16
@jscheffl
Contributor

jscheffl commented Dec 7, 2025

Note: Functionality overlaps with #58547, and even though this PR came first, it is lacking KEDA and HPA support and is more complex. Would it be OK to favor the other PR, or is there a missing piece that the other PR would still need?

@jscheffl jscheffl self-requested a review December 7, 2025 22:23
@HsiuChuanHsu
Contributor Author

Yes, I'm completely fine with this. #58547 has a much better and cleaner solution for this feature.
My approach is significantly more complicated, so let's go ahead and merge #58547.
I'll close my PR now.


Labels

area:helm-chart Airflow Helm Chart

Development

Successfully merging this pull request may close these issues.

Add support for multiple Celery worker groups with queue-specific configurations in Helm chart

6 participants