Skip to content

Conversation

@Miretpl
Copy link
Contributor

@Miretpl Miretpl commented Jan 12, 2026

The goal of this PR is to fix the compatibility of:

  1. Workers Celery Sets added in Chart: Enhance Celery Worker Sets support for multi-queue configurations #58547
  2. workers section separation to kubernetes/celery, which is done for Can't configure Kubernetes and Celery workers in Helm Chart #28880

Current state

In the current state of the main branch, both features do not work together. It means that:

  1. Values set in workers.celery.sets for worker pools creation cannot overwrite values different from null set in workers.celery section (despite workers.celery.replicas)
  2. The default behaviour of workers.celery.replicas with workers.replicas is broken (unsetting the workers.celery.replicas value does not fall back to workers.replicas)
  3. The default behaviour of workers.celery.persistence.enabled with workers.persistence.enabled is broken (to disable it, the change of two fields is required) - because of that, all related features like HPA/KEDA do not work properly

Furthermore, the current implementation of Workers Sets does not allow for:

  1. PodDisruptionBudget per set creation
  2. ServiceAccount per set creation

Changes

Rewrite the whole logic of #58547. The idea of updating the definition of workers per worker set stayed, but the logic of how it is done was changed. To make it work, I needed to implement a custom merge function, whose working is described in the _helpers.yaml file next to its definition.

All fields are overwritten fully by settings in workers.celery.sets if specified.

Future PRs

  1. Add test case for overwrite workers.logGroomerSidecar values by Worker Sets:
  2. Add test case for selectors

Was generative AI tooling used to co-author this PR?

  • Yes (please specify the tool below)

  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

@Miretpl
Copy link
Contributor Author

Miretpl commented Jan 12, 2026

The previous behaviour of workers.replicas/workers.celery.replicas was added in #59730

@Miretpl
Copy link
Contributor Author

Miretpl commented Jan 12, 2026

Converting to draft - I think more things are broken, but I need to verify it more. Maybe I'm not seeing something, but I think that it would be the easiest to have fully ready workers.celery section first and sets after, because the logic to make them work together, in the middle of implementation the first one, can not be so straightforward

@Miretpl Miretpl marked this pull request as draft January 12, 2026 21:28
@jscheffl
Copy link
Contributor

Oh, sorry I missed to review all details, the PR was pending a longer time and all the rework you invested in moving parameters came-in in parallel. so we generated a semantic conflict in the parameter rework vs. this extension :-(

@Miretpl
Copy link
Contributor Author

Miretpl commented Jan 13, 2026

No worries. Personally, I really wanted the worker sets to be merged too. I will let you know when something meaningful will happen in this PR, as I think it might not be today or tomorrow, but we will see :D

@Miretpl Miretpl force-pushed the fix-changed-replicas-behaviour branch 3 times, most recently from fa4a2e7 to 1e3e75d Compare January 14, 2026 21:03
@jscheffl
Copy link
Contributor

Hi @Miretpl we are opting to get Helm Chart 1.19 release rolling next Monday. Do you think we can get this PR in until then (and maybe some other parameter renamings if there are more pending)?

@Miretpl
Copy link
Contributor Author

Miretpl commented Jan 15, 2026

Hi @jscheffl, regarding this PR - maybe. I have core working locally (there are some edge cases that I'm working on now - I'm unsure how long they will take to resolve tho). There is a chance that we will need to revert #52357 change. I'm not sure now how it should work with workers.celery.sets, but it will need to be rebuilt in some way too. I guess we can always revert it and bring it back in some other form with the release after 1.19.

Also, the current version on main doesn't work, and I mean not only workers separation, but workers.celery.sets too (which can be seen in the already added tests on this PR). All of the values (despite replicas) which have been moved to workers.celery cannot be overwritten by sets (if they have default value). IMHO, if I do not finish this by Monday, I'm not sure if it would be a good decision to release 1.19.0 in its current state.

Regarding further renaming - I planned to get back to them after this PR, but if 1.19 is close to the release, I will probably add tests for the added PodDistributionBudgets, cause most of the additions after 1.18.0 release do not really have any tests.

@Miretpl
Copy link
Contributor Author

Miretpl commented Jan 17, 2026

Currently, falling tests are due to 2 reasons:

  1. unrebased branch
  2. workers.celery/workers.kubernetes service account section separations - for this, I created Revert "Separate workers service accounts (#52357)" #60721

@Miretpl Miretpl changed the title Fix overwrite behaviour of workers.celery.replicas Fix Compatibility of Celery Workers Sets with Workers Separation Jan 17, 2026
@Miretpl Miretpl changed the title Fix Compatibility of Celery Workers Sets with Workers Separation Fix Compatibility of Celery Worker Sets with Workers Separation Jan 17, 2026
@Miretpl
Copy link
Contributor Author

Miretpl commented Jan 17, 2026

Hi @jscheffl, I think we will manage till Monday with it. I pushed the latest version (which is working locally), and I updated the description here. I didn't try deployment of the environment with this version yet, but looking at the generated yamls and test cases, it looks correct. Unfortunately, we will need to revert #52357 after all (PR for it #60721).

Do you think it would be possible to postpone the cut-off by a week? I would want to finish adding test cases and properly check everything before it, and I will probably not finish that before Monday :(

@Miretpl Miretpl force-pushed the fix-changed-replicas-behaviour branch from e5d8a8d to eb27346 Compare January 17, 2026 21:26
@Miretpl
Copy link
Contributor Author

Miretpl commented Jan 17, 2026

kerberosInitContainer related tests will fail. I didn't fixed them as I believe that they should be fixed in #60677 (current implementation of workers.celery.kerberosInitContainer has different behaviour compared to all other fields under workers.celery (that is why the tests here will be failing now).

+ service account related - the rest should pass now

@jscheffl
Copy link
Contributor

@jedcunningham As of the engagement of @Miretpl and as some logical conflicts exists from the different reworks... I'd also propose to push the Helm release for a few days to get this cleaned. Would it be OK for you?

@Miretpl Miretpl force-pushed the fix-changed-replicas-behaviour branch 2 times, most recently from 20b7e50 to f55ed08 Compare January 17, 2026 23:21
@Miretpl Miretpl force-pushed the fix-changed-replicas-behaviour branch from f55ed08 to 93763c3 Compare January 18, 2026 14:23
@Miretpl Miretpl force-pushed the fix-changed-replicas-behaviour branch from 60fcebb to e1dc1c3 Compare January 21, 2026 21:13
@Miretpl Miretpl marked this pull request as ready for review January 21, 2026 21:20
@Miretpl
Copy link
Contributor Author

Miretpl commented Jan 21, 2026

It is now ready to review. Most of the changes are just test additions (I added around 160 test cases). I moved the sections that were newly added in the previous PR to workers.celery to avoid new deprecations in the future.

I deployed a couple of times manually environment with different configurations and it worked correctly, but I would appreciate any additional tests done by someone else

@Miretpl Miretpl force-pushed the fix-changed-replicas-behaviour branch from d47fbde to 6c6bac9 Compare January 21, 2026 21:41
@Miretpl Miretpl force-pushed the fix-changed-replicas-behaviour branch from 6c6bac9 to 7c87b66 Compare January 21, 2026 21:44
Copy link
Contributor

@jscheffl jscheffl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow - impressive. Have atm no option to deploy something but coverage of unit tests looks exhaustive. Was attempting to find more than a nit but looks really good.

I'd propose a second pair of eyes == @jedcunningham prior merge

@jscheffl jscheffl added this to the Airflow Helm Chart 1.19.0 milestone Jan 21, 2026
Miretpl and others added 2 commits January 22, 2026 19:56
Co-authored-by: Jens Scheffler <95105677+jscheffl@users.noreply.github.com>
@jedcunningham jedcunningham requested a review from Copilot January 23, 2026 18:34
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes compatibility issues between Celery Worker Sets (introduced in #58547) and the separated workers section to kubernetes/celery. It rewrites the worker set configuration merge logic to properly handle value overrides and null value fallback behavior.

Changes:

  • Moved enableDefault and queue from workers to workers.celery section
  • Implemented custom merge function workersMergeValues to handle configuration inheritance and overrides between worker sets
  • Updated all worker-related templates to use the new merge logic

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
helm-tests/tests/helm_tests/security/test_security_context.py Added test cases for worker sets with security context configurations
helm-tests/tests/helm_tests/other/test_keda.py Fixed queue configuration path from workers.queue to workers.celery.queue
helm-tests/tests/helm_tests/airflow_core/test_worker_sets.py New comprehensive test file for worker sets functionality (2580 lines)
helm-tests/tests/helm_tests/airflow_core/test_worker.py Fixed test case for replicas inheritance and removed old worker sets tests
chart/values.yaml Moved enableDefault and queue configuration to workers.celery section
chart/values.schema.json Updated schema to reflect configuration structure changes
chart/templates/workers/*.yaml Updated all worker templates to use new merge logic and configuration paths
chart/templates/_helpers.yaml Added workersMergeValues and removeNilFields helper functions
chart/templates/NOTES.txt Updated deprecation warning to include queue in workers.args
chart/newsfragments/58547.significant.rst Updated documentation with corrections and clarifications

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@jedcunningham jedcunningham merged commit 008329e into apache:main Jan 23, 2026
125 of 126 checks passed
@jscheffl
Copy link
Contributor

Merged, cool! Thanks for the fix @Miretpl !

Then we would be ready to kick 1.19 release?

@Miretpl
Copy link
Contributor Author

Miretpl commented Jan 24, 2026

@jscheffl yes, I think we should be good now

suii2210 pushed a commit to suii2210/airflow that referenced this pull request Jan 26, 2026
@Miretpl Miretpl deleted the fix-changed-replicas-behaviour branch January 26, 2026 19:34
shreyas-dev pushed a commit to shreyas-dev/airflow that referenced this pull request Jan 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:helm-chart Airflow Helm Chart

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants