Skip to content

Conversation

@potiuk
Copy link
Member

@potiuk potiuk commented Feb 9, 2023

When you upgrade/change job in K8S that has been finished and not manually removed, this leads to "Field is immutable" error.

This is a known kubernetes issue:

kubernetes/kubernetes#89657

And there are some workarounds (manually removing the job for example), but the only good solution is possible only in K8S 1.23+ with ttlSecondsAfterFinished set for the job, so that K8S can auto
clean it.

This PR adds it conditionally for K8S >= 1.23

Fixes: #27561


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added the area:helm-chart Airflow Helm Chart label Feb 9, 2023
@potiuk potiuk requested a review from dimberman February 9, 2023 10:23
@potiuk potiuk force-pushed the fix-cleanup-of-jobs-for-terraform-redeployments branch from 1cecb63 to 294d266 Compare February 9, 2023 10:25
@potiuk potiuk changed the title Add time to leave for finished jobs in Kubernetes Add time to live for finished jobs in Kubernetes Feb 9, 2023
When you upgrade/change job in K8S that has been finished and not
manually removed, this leads to "Field is immutable" error.

This is a known kubernetes issue:

kubernetes/kubernetes#89657

And there are some workarounds (manually removing the job for
example), but the only good solution is possible only in K8S 1.23+
with ttlSecondsAfterFinished set for the job, so that K8S can auto
 clean it.

This PR adds it conditionally for K8S >= 1.23

Fixes: apache#21943
@potiuk
Copy link
Member Author

potiuk commented Feb 12, 2023

Hey @dstandish @jedcunningham -> WDYT?

Copy link
Member

@hussein-awala hussein-awala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an interesting change, but there's still a chance of getting the same error if for some reason we run the jobs twice before the TTL ends, on top of that we can add {{ .Release.Revision }} as suffix for job name -> metadata.name: {{ .Release.Name }}-<job name>-{{ .Release.Revision }}, and keep the cleanup task for garbage collection.

@potiuk
Copy link
Member Author

potiuk commented Feb 13, 2023

It's an interesting change, but there's still a chance of getting the same error if for some reason we run the jobs twice before the TTL ends, on top of that we can add {{ .Release.Revision }} as suffix for job name -> metadata.name: {{ .Release.Name }}-<job name>-{{ .Release.Revision }}, and keep the cleanup task for garbage collection.

Thought about it, but I am not sure if there is another reason why we wanted to keep the job names "static" - @jedcunningham @dstandish ?

@potiuk
Copy link
Member Author

potiuk commented Feb 13, 2023

Closing in favour of #29314

@potiuk potiuk closed this Feb 13, 2023
@potiuk potiuk deleted the fix-cleanup-of-jobs-for-terraform-redeployments branch March 22, 2023 18:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:helm-chart Airflow Helm Chart

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Helm chart tries to patch immutable Job resources on helm upgrade

2 participants