
Deleting a Helm-based operator CR doesn't guarantee all of its associated resources are deleted #3407

@mikeshng

Description
Bug Report

What did you do?

Sometimes (very rarely, but it does happen) after I delete a Helm-based CR, some of its resources are left behind.
The CR itself is gone, when it should probably remain because there are still remaining resources.

An example:

helm repo add stable https://kubernetes-charts.storage.googleapis.com/
operator-sdk new nginx-ingress-operator --type=helm --helm-chart=stable/nginx-ingress
cd nginx-ingress-operator
kubectl apply -f deploy/crds/helm.operator-sdk_nginxingresses_crd.yaml
kubectl apply -f deploy/crds/helm.operator-sdk_v1alpha1_nginxingress_cr.yaml
operator-sdk run --local

In a separate terminal

kubectl edit deployment example-nginxingress-nginx-ingress-controller 
# add a finalizer like
# finalizers:
# - abc.com/def
kubectl delete nginxingress/example-nginxingress

Now nginxingress/example-nginxingress is deleted, but when I run

$ kubectl get deployments
NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
example-nginxingress-nginx-ingress-controller   1/1     1            1           3m30s

there is still a resource left behind.

What did you expect to see?
I expected the CR nginxingress/example-nginxingress to remain, because not all of its resources were deleted.

What did you see instead? Under which circumstances?
After a delete, I very rarely see some resources left behind. Manually adding the finalizer in the reproduction steps above is just to illustrate the point.
I suspect the Kubernetes API server was "busy" with something and failed to properly process (or issue) the delete.

Environment

  • operator-sdk version:

operator-sdk version: "v0.19.0", commit: "8e28aca60994c5cb1aec0251b85f0116cc4c9427", kubernetes version: "v1.18.2", go version: "go1.13.10 linux/amd64"

  • go version:

go version go1.14.2 linux/amd64

  • Kubernetes version information:

kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:56:40Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.4", GitCommit:"224be7bdce5a9dd0c2fd0d46b83865648e2fe0ba", GitTreeState:"clean", BuildDate:"2020-01-14T01:25:43Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

  • Kubernetes cluster kind:

kind create cluster --image kindest/node:v1.16.4

  • Are you writing your operator in ansible, helm, or go?

helm

Possible Solution
The CR finalizer is stripped here: https://github.com/operator-framework/operator-sdk/blob/v0.19.0/pkg/helm/controller/reconcile.go#L138
Would it be possible to double-check that all the release's resources are in fact deleted before removing the finalizer?
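Such a check could be sketched as a small helper. This is hypothetical code, not the SDK's actual API: the real reconciler works with unstructured objects from the release manifest, and the `exists` callback here stands in for a Get against the API server that distinguishes NotFound from "still present":

```go
package main

import "fmt"

// resourceRef identifies one resource rendered by the Helm release
// (hypothetical type for illustration).
type resourceRef struct {
	Kind, Namespace, Name string
}

// safeToRemoveFinalizer reports whether the CR finalizer can be stripped:
// only when every release resource is confirmed gone. exists is a lookup
// against the API server; any resource it still finds blocks removal.
func safeToRemoveFinalizer(resources []resourceRef, exists func(resourceRef) bool) (bool, []resourceRef) {
	var remaining []resourceRef
	for _, r := range resources {
		if exists(r) {
			remaining = append(remaining, r)
		}
	}
	return len(remaining) == 0, remaining
}

func main() {
	resources := []resourceRef{
		{"Deployment", "default", "example-nginxingress-nginx-ingress-controller"},
		{"Service", "default", "example-nginxingress-nginx-ingress-default-backend"},
	}
	// Simulate a Deployment blocked by a foreign finalizer: it still exists.
	exists := func(r resourceRef) bool { return r.Kind == "Deployment" }

	ok, remaining := safeToRemoveFinalizer(resources, exists)
	fmt.Println(ok)        // false: a resource is still present
	fmt.Println(remaining) // the blocked Deployment
}
```

If the check fails, the reconciler could requeue instead of removing the finalizer, so the CR survives until the leftover resources are actually gone.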

Additional context
NA

Metadata

Labels

  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • kind/bug: Categorizes issue or PR as related to a bug.
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • language/helm: Issue is related to a Helm operator project.
  • lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
