Cert agent controller avoids locating the agent pod on unschedulable nodes when possible #2143
Fixes #2137.
With this PR, the Concierge's controller which creates the "cert agent" Deployment now pays attention to which nodes are marked as unschedulable.
Background: The goal of this controller is to co-locate a Pinniped "cert agent pod" on the same node as a Kubernetes controller-manager pod. Previously, it would always choose the newest (by creation date) running controller-manager pod, regardless of which node that pod was running on. This approach causes a problem when a cluster administrator uses `kubectl drain` or `kubectl cordon` to cordon a node that they wish to avoid using, e.g. because it needs maintenance. When that happened to be the node on which the cert agent pod was running, the cert agent pod would not move away from the cordoned node, even if the admin deleted the pod to try to force it to be rescheduled. (Note that `kubectl drain` also has the effect of `kubectl cordon`, in that they both mark the node as unschedulable.)

With this PR, when there are multiple running controller-manager pods to choose from, the controller will prefer to co-locate the cert agent pod with one that is running on a node which allows scheduling pods (`spec.unschedulable` equal to `false`).

When an admin uses `kubectl cordon` to cordon the node on which the cert agent pod is running, the Concierge will notice this within about 5 minutes and will move the pod to another node, assuming that a suitable one is available. The delay is because this controller does not dynamically watch all activity on the nodes. It could, but that would have some performance impact only to handle this rare corner case, so I feel like the delay is acceptable. Also note that just cordoning a node does not mean that its pods should necessarily be rescheduled, because cordoning only marks the node as unschedulable for future pods, which is another reason that the delay is acceptable. To speed things up, the admin can manually delete the cert agent pod, or can use `kubectl drain` to delete all the pods from the node (including the cert agent pod). This will cause the controller to run sooner and move the pod to another node if possible.

Release note: