This repository was archived by the owner on Oct 24, 2023. It is now read-only.

fix: working systemd monitor jobs #3788

Merged
jackfrancis merged 21 commits into Azure:master from jackfrancis:systemd-monitor on Sep 14, 2020

Conversation

@jackfrancis (Member):

Reason for Change:

This PR updates the implementation of the various systemd monitor jobs so that the following critical services are monitored for failure, and restarted (a generic sketch of the monitor pattern follows the list below):

  • docker
  • containerd
  • etcd
  • kubelet
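
For readers new to the pattern, here is a minimal sketch of the general shape such a monitor takes (illustrative names and polling interval only; the PR's actual scripts and units differ per service):

#!/bin/bash
# Generic systemd-driven service monitor: probe a service, restart on failure.
# A real deployment runs this from a systemd unit (e.g. Restart=always),
# not as a bare loop.
SERVICE="${1:?usage: monitor.sh <service>}"  # e.g. docker, containerd, etcd, kubelet

while true; do
  if ! systemctl is-active --quiet "${SERVICE}"; then
    echo "${SERVICE} is not active; restarting" >&2
    systemctl restart "${SERVICE}"
  fi
  sleep 10
done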



codecov bot commented Sep 4, 2020

Codecov Report

Merging #3788 into master will increase coverage by 0.00%.
The diff coverage is 93.75%.


@@           Coverage Diff           @@
##           master    #3788   +/-   ##
=======================================
  Coverage   73.19%   73.20%           
=======================================
  Files         148      148           
  Lines       25394    25403    +9     
=======================================
+ Hits        18587    18596    +9     
  Misses       5671     5671           
  Partials     1136     1136           
Impacted Files Coverage Δ
pkg/engine/templates_generated.go 53.42% <77.77%> (ø)
pkg/api/defaults-kubelet.go 96.82% <100.00%> (+0.01%) ⬆️
pkg/engine/armvariables.go 86.47% <100.00%> (-0.03%) ⬇️
pkg/engine/template_generator.go 82.40% <100.00%> (+0.20%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@jackfrancis force-pushed the systemd-monitor branch 2 times, most recently from 089b5ab to 21098b8, on September 4, 2020 23:29
{
  "name": "kubernetes-dashboard",
-  "enabled": true
+  "enabled": false
@jackfrancis (Member, Author):

these are temporary changes while we work on reducing customData size

NODE_INDEX=$(hostname | tail -c 2)
NODE_NAME=$(hostname)
- PRIVATE_IP=$(hostname -I | cut -d' ' -f1)
+ PRIVATE_IP=$(hostname -i | cut -d' ' -f1)
@jackfrancis (Member, Author):

It's unclear to me why we're using -I (get me all interfaces) vs. -i (get me my primary interface)...

Collaborator:

Note that the IP address here (-i) is a resolved IP address based on the host name. Is that what you want?

hostname -I returns all IP addresses without DNS resolution, which means you can get things like loopback, etc.
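
For illustration, the difference is easy to see on a typical host (the output values below are hypothetical):

# All locally configured addresses, no DNS lookup involved:
hostname -I    # e.g. "10.240.0.4 172.17.0.1" (primary NIC plus docker0, etc.)

# The address the host's own name resolves to (resolver/DNS dependent):
hostname -i    # e.g. "10.240.0.4", or a surprise like "127.0.1.1" from /etc/hosts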

Collaborator:

Note that I would look into another way to get the right IP address for the master (like, it must be known somewhere already)

@@ -1,7 +1,7 @@
[Unit]
@jackfrancis (Member, Author):

I would advocate we eliminate this timer spec. It adds complexity to the overall implementation, and the way we're implementing the docker health check (docker ps) should not be "racy" given that the health check script runs only after the docker systemd service has started (see After=docker.service above)

@Michael-Sinz @mboersma thoughts?

@jackfrancis (Member, Author):

I think what we're implicitly saying by delaying things for 30 mins is that we are tolerant of docker ps repeatedly failing during the first 30 mins of boot time, which I don't think is defensible.

Member:

I think providing a warmup period for Docker to get its act together was basically what Azure/acs-engine#4050 was about. I agree simpler is way better when it comes to systemd units especially, so I'm OK with removing it to see if it's unneeded now.

@jackfrancis (Member, Author):

After further investigation, it is a bit tricky to tell a systemd service "wait until this service starts, and also until it is fully activated"; so I've moved the delay into the health script itself. Arguably that is less complicated than maintaining systemd overhead to do the same thing.
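
Something like the following captures the idea; this is a minimal sketch, not the PR's exact script (the 30-minute window and the restart action are assumptions carried over from the earlier timer discussion):

#!/bin/bash
# Skip health enforcement during an initial warmup window, rather than
# delaying the entire systemd unit with a timer.
WARMUP_SECONDS=1800  # assumed warmup, mirroring the old 30-minute delay

# Seconds since boot: first field of /proc/uptime, truncated to an integer.
uptime_seconds=$(cut -d' ' -f1 /proc/uptime | cut -d'.' -f1)

if (( uptime_seconds < WARMUP_SECONDS )); then
  echo "within warmup window; skipping docker health check"
  exit 0
fi

# After warmup, a failing docker ps means the daemon is unhealthy.
docker ps >/dev/null 2>&1 || systemctl restart docker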

{{- end}}
systemctlEnableAndStart kubelet || exit {{GetCSEErrorCode "ERR_KUBELET_START_FAIL"}}
wait_for_file 1200 1 /etc/systemd/system/kubelet-monitor.service || exit {{GetCSEErrorCode "ERR_FILE_WATCH_TIMEOUT"}}
systemctlEnableAndStart kubelet-monitor || exit {{GetCSEErrorCode "ERR_KUBELET_START_FAIL"}}
@jackfrancis (Member, Author):

Also (continuing the systemd relationship conversation): as a rule, we block the startup of each monitor job on the successful startup of the service it monitors, as exemplified by this kubelet-monitor start operation.
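
The same gating rule applied to another service looks roughly like this (a sketch only; the unit and helper names follow the kubelet example above, and the error-code names here are placeholders for the {{GetCSEErrorCode ...}} template calls):

systemctlEnableAndStart docker || exit $ERR_DOCKER_START_FAIL
wait_for_file 1200 1 /etc/systemd/system/docker-monitor.service || exit $ERR_FILE_WATCH_TIMEOUT
systemctlEnableAndStart docker-monitor || exit $ERR_DOCKER_START_FAIL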

DOCKER_VERSION=1.13.1-1
NVIDIA_CONTAINER_RUNTIME_VER=2.0.0
NVIDIA_DOCKER_SUFFIX=docker18.09.2-1
PRIVATE_IP=$(ip -4 addr show eth0 | grep -Po '(?<=inet )[\d.]+')
@jackfrancis (Member, Author):

This is a new, common "what is my private IP address?" implementation to be shared across the various places where that runtime determination is needed.

eth0 always has this information at first boot; in an Azure CNI configuration, an "azure0" bridge interface is set up with the primary NIC IP address (and the eth0 interface no longer contains it).

In the edge case where neither of those exists, we fall back to the IP address that the hostname resolves to. We avoid doing that as the primary path because we don't want to rely upon DNS lookups.

cc @Michael-Sinz

#!/bin/bash
NODE_INDEX=$(hostname | tail -c 2)
NODE_NAME=$(hostname)
PRIVATE_IP=$(hostname -I | cut -d' ' -f1)
@jackfrancis (Member, Author):

This var assignment has been moved into cse_helpers.sh for more general-purpose usage, and the etcd vars are moved down to local vars inside the funcs that use them.
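
Schematically, the local-variable move looks like this (hypothetical function and variable names; the real helpers live in cse_helpers.sh, and 2379/2380 are just the standard etcd client/peer ports used for illustration):

# Before: computed as script-level globals every time the service script ran.
# After: scoped to the one function that needs them.
configure_etcd() {
  local private_ip
  private_ip=$(ip -4 addr show eth0 | grep -Po '(?<=inet )[\d.]+')
  local etcd_peer_url="https://${private_ip}:2380"    # standard etcd peer port
  local etcd_client_url="https://${private_ip}:2379"  # standard etcd client port
  echo "etcd peer: ${etcd_peer_url}, client: ${etcd_client_url}"
}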

sysctl_reload 10 5 120 || exit {{GetCSEErrorCode "ERR_SYSCTL_RELOAD"}}
wait_for_file 1200 1 /etc/default/kubelet || exit {{GetCSEErrorCode "ERR_FILE_WATCH_TIMEOUT"}}
wait_for_file 1200 1 /var/lib/kubelet/kubeconfig || exit {{GetCSEErrorCode "ERR_FILE_WATCH_TIMEOUT"}}
if [[ -n ${MASTER_NODE} ]]; then
@jackfrancis (Member, Author):

This one-time VMSS master-specific foo has been moved into the bootstrap script, and out of the shell script that runs every time that the kubelet systemd service starts.

DOCKER_VERSION=1.13.1-1
NVIDIA_CONTAINER_RUNTIME_VER=2.0.0
NVIDIA_DOCKER_SUFFIX=docker18.09.2-1
PRIVATE_IP=$( (ip -br -4 addr show eth0 || ip -br -4 addr show azure0) | grep -Po '\d+\.\d+\.\d+\.\d+')
@jackfrancis (Member, Author):

This code does the following:

  1. Get me the IP addresses on eth0
  2. Or, if there aren't any, get me the IP addresses on azure0
  3. If there isn't exactly 1 IP address from trying those, then just get the IP address that the hostname DNS entry returns

#3 is undesirable as it relies upon functional DNS, so we only do it in a fallback scenario

We consider the possibility that eth0 won't have the desired IP address in order to make this solution resilient during the lifecycle of the VM: in Azure CNI configuration scenarios the IP address will be attached to the "azure0" interface when the first (non-hostNetwork) pod is scheduled onto the node that the VM represents (that's how Azure CNI routes container traffic out of the VM).
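
A minimal sketch of that three-step logic as a standalone helper (the function name is an assumption; the PR wires this logic into cse_helpers.sh):

#!/bin/bash
get_private_ip() {
  local ips
  # 1) try eth0; 2) if that fails, try azure0 (Azure CNI moves the primary IP there)
  ips=$( (ip -br -4 addr show eth0 || ip -br -4 addr show azure0) 2>/dev/null \
    | grep -Po '\d+\.\d+\.\d+\.\d+')
  # 3) unless exactly one address came back, fall back to DNS resolution
  if [[ -z "${ips}" ]] || [[ "$(grep -c '^' <<< "${ips}")" -ne 1 ]]; then
    hostname -i | cut -d' ' -f1
  else
    echo "${ips}"
  fi
}

PRIVATE_IP=$(get_private_ip)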

Collaborator:

Which one takes priority, azure0 or eth0? If azure0 is there then eth0 should not be, so would it be better to check azure0 first and then eth0?

(Unclear from your comment)

@jackfrancis (Member, Author):

When we define the VM in the ARM template, we declare a primary IP address on a single NIC, so we expect it to be reflected in eth0. Because Azure CNI creates a bridge "azure0" interface and "takes over" the IP address from eth0, that's where we have to look (if we want to look locally and not rely upon DNS) for the primary IP address of the host.

An additional consideration is that Ubuntu 18.04-LTS's cloud-init implementation has a bug: it doesn't respect the ARM template-declared primary IP address when the spec also includes secondary IP addresses, and so eth0 will have more than one IP address (those secondary IP addresses are for Azure CNI to use in the container networking layer, not the host layer, so eth0 should only ever have 1 IP address). To deal with that edge case (e.g., 18.04-LTS on first boot, before the cloud-init network config override has been applied) we evaluate the number of IP addresses returned (the | grep -c '^' part), and if it's not exactly 1, we just fall back to relying upon DNS.

Restart=always
RestartSec=10
RemainAfterExit=yes
Environment=CONTAINER_RUNTIME={{GetContainerRuntime}}
@jackfrancis (Member, Author):

This service is now overloaded to support both moby and containerd.
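
A sketch of what a runtime-aware health check can look like (the CONTAINER_RUNTIME values and the containerd probe command are assumptions, not lifted from the PR):

#!/bin/bash
# The unit's Environment=CONTAINER_RUNTIME=... selects the liveness probe.
case "${CONTAINER_RUNTIME}" in
  containerd)
    # assumed probe: ask the containerd daemon for its version over its socket
    ctr --address /run/containerd/containerd.sock version >/dev/null 2>&1 \
      || systemctl restart containerd
    ;;
  docker|*)
    docker ps >/dev/null 2>&1 || systemctl restart docker
    ;;
esac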

echo "" >> /etc/environment
fi
{{- if IsMasterVirtualMachineScaleSets}}
source {{GetCSEHelpersScriptFilepath}}
@jackfrancis (Member, Author):

This source statement gets us the $PRIVATE_IP var we need

env azure.Environment
azureClient *armhelpers.AzureClient
firstMasterRegexStr = fmt.Sprintf("^%s-", common.LegacyControlPlaneVMPrefix)
firstMasterRegexStr = fmt.Sprintf("^%s-.*-0", common.LegacyControlPlaneVMPrefix)
@jackfrancis (Member, Author):

If you look hard enough you find interesting things
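
The change narrows the match from any control-plane VM to only the first one (instance index 0). Assuming a prefix like k8s-master, the effect is roughly:

# "k8s-master" stands in for common.LegacyControlPlaneVMPrefix; VM names are hypothetical.
for vm in k8s-master-30819786-0 k8s-master-30819786-1; do
  [[ "${vm}" =~ ^k8s-master- ]]     && echo "old pattern matches ${vm}"
  [[ "${vm}" =~ ^k8s-master-.*-0 ]] && echo "new pattern matches ${vm}"
done
# The old pattern matches both VMs; the new pattern matches only the ...-0 instance.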

@mboersma (Member) left a comment:

/lgtm

@acs-bot acs-bot added the lgtm label Sep 14, 2020
@jackfrancis jackfrancis merged commit c670571 into Azure:master Sep 14, 2020
@jackfrancis jackfrancis deleted the systemd-monitor branch September 14, 2020 22:08

acs-bot commented Sep 14, 2020

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jackfrancis, mboersma

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details: Needs approval from an approver in each of these files:
  • OWNERS [jackfrancis,mboersma]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

penggu pushed a commit to penggu/aks-engine that referenced this pull request Oct 28, 2020

4 participants