test: Make tests aware of the pod cgroup option #1824
Conversation
Force-pushed from 92a5698 to b9d958f
* config: add option to use only the pod cgroup.

- Get the pod cgroup from the parent of the SANDBOX container.
- Do not create a host cgroup for each container.
- Place all Kata processes in that cgroup:
  * Proxy
  * Shim
  * Runtime (?)
  * Hypervisor

This may reduce the performance of small pods, as the hypervisor is totally constrained (no longer only partially, to vCPUs). It gives the container administrator more accurate data on resource usage.

Memory limitations: because the VMM is constrained in the pod sandbox it could be OOM-killed, so the "pod admin" should consider adding extra memory for the overhead. From the Kata perspective we should investigate what the overhead of Kata is, at least for memory, as this is a critical resource that could get all the pod workloads killed if the VM gets OOM-killed. Ideally the guest memory is large enough that the container workloads running on it get killed first and there is still room for Kata on the host to work correctly.

It may be helpful to provide the pod overhead as part of the CLI or another API:

$ kata-runtime overhead
Memory: 200 MB (Proxy, VM and Shim)

This will probably be hardcoded (as config or code), so the runtime can check the pod cgroup and fail nicely, asking for more memory in the parent cgroup for correct functionality. Additionally, we could have a metrics test verifying that it is at least possible to run a `true` or `bash` command with that defined memory.

- If the test fails due to a change that increases the memory, it should be updated. Ideally it should also be updated when the overhead is reduced.

info: docker cgroup before

Directory /sys/fs/cgroup/cpu/docker/:
└─kata_f12efaa8d43d55904f5fad8344a640536c74276bacff8bb8b0020fd056a4d5cc
  ├─29198 /opt/kata/bin/qemu-system-x86_64 -name sandbox-f12efaa8d43d55904f5f...
  └─29221 /opt/kata/libexec/kata-containers/kata-shim -agent unix:///run/vc/s...

- This mode uses all cgroup types in the pod sandbox. This config enables the option to use the cgroup parent of the container with the Sandbox annotation. To make debugging and CI easy, just name it kata-sandbox:

pod-cgroup
kata-sandbox
  2487 /opt/kata/bin/qemu-system-x86_64 -name sandbox-090142d008e431ac5d4a...
 22494 /opt/kata/libexec/kata-containers/kata-proxy -listen-socket unix://...
  2654 /opt/kata/libexec/kata-containers/kata-shim -agent unix:///run/vc/s...

Depends-on: github.com/kata-containers/tests#1824
Fixes: kata-containers#1879
Signed-off-by: Carlos Venegas <vmjcarlos@gmail.com>
Force-pushed from b9d958f to 94b106a
* config: add option to use only the pod cgroup.

- Get the pod cgroup from the parent of the SANDBOX container.
- Do not create a host cgroup for each container.
- Place all Kata processes in that cgroup:
  * Proxy
  * Shim
  * Runtime
  * Hypervisor

Moving the hypervisor and the other processes after they are launched does not provide the expected behavior for resource accounting: if a process is moved after it is created, the cgroup only starts accounting for resources used after the move. For better host accounting, move the runtime into the sandbox cgroup before anything is created; that way, all the processes created by the runtime are accounted for as expected.

Limitation: for very small memory cgroups the runtime may not work as expected, as the processes are going to get OOM-killed. TODO: a minimal pod cgroup memory should be defined and validated.

This may reduce the performance of small pods, as the hypervisor is totally constrained (no longer only partially, to vCPUs). It gives the container administrator more accurate data on resource usage, by checking the pod cgroup directly:

cat /sys/fs/cgroup/<type>/pod-cgroup/data

Intra-container usage is still available via kata-runtime events.

Memory limitations: because the VMM is constrained in the pod sandbox it could be OOM-killed, so the "pod admin" should consider adding extra memory for the overhead.

TODO:
- [ ] Provide a minimal expected Kata memory usage.

From the Kata perspective we should investigate what the overhead of Kata is, at least for memory, as this is a critical resource that could get all the pod workloads killed if the VM gets OOM-killed. Ideally the guest memory is large enough that the container workloads running on it get killed first and there is still room for Kata on the host to work correctly. It may be helpful to provide the pod overhead as part of the CLI or another API:

```
$ kata-runtime overhead
Memory: 200 MB (Proxy, VM and Shim)
```

This will probably be hardcoded (as config or code), so the runtime can check the pod cgroup and fail nicely, asking for more memory in the parent cgroup for correct functionality. Additionally, we could have a metrics test verifying that it is at least possible to run a `true` or `bash` command with that defined memory.

OR

```
Dynamic: cgroup usage - containers inside guest usage
```

- If the test fails due to a change that increases the memory, it should be updated. Ideally it should also be updated when the overhead is reduced.

info: docker cgroup before

Directory /sys/fs/cgroup/cpu/docker/:
└─kata_f12efaa8d43d55904f5fad8344a640536c74276bacff8bb8b0020fd056a4d5cc
  ├─29198 /opt/kata/bin/qemu-system-x86_64 -name sandbox-f12efaa8d43d55904f5f...
  └─29221 /opt/kata/libexec/kata-containers/kata-shim -agent unix:///run/vc/s...

- This mode uses all cgroup types in the pod sandbox. This config enables the option to use the cgroup parent of the container with the Sandbox annotation. To make debugging and CI easy, just name it kata-sandbox:

pod-cgroup
kata-sandbox-<cid-sandboxcontainer>
  2487 /opt/kata/bin/qemu-system-x86_64 -name sandbox-090142d008e431ac5d4a...
 22494 /opt/kata/libexec/kata-containers/kata-proxy -listen-socket unix://...
  2654 /opt/kata/libexec/kata-containers/kata-shim -agent unix:///run/vc/s...

Depends-on: github.com/kata-containers/tests#1824
Fixes: kata-containers#1879
Signed-off-by: Carlos Venegas <vmjcarlos@gmail.com>
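The "check the pod cgroup and fail nicely" idea above can be sketched in shell. This is an illustrative sketch only: the 200 MB figure comes from the example output above, and the function and variable names are hypothetical, not the runtime's actual API.

```shell
#!/bin/bash
# Hypothetical sketch: compare a pod cgroup memory limit against an assumed
# fixed Kata overhead, and fail with a useful message when there is no room.
# KATA_OVERHEAD_BYTES and check_pod_memory are illustrative names.

KATA_OVERHEAD_BYTES=$((200 * 1024 * 1024))

check_pod_memory() {
	# $1: path to the pod cgroup's memory.limit_in_bytes file
	local limit_file="$1"
	local limit
	limit=$(cat "${limit_file}")
	if [ "${limit}" -le "${KATA_OVERHEAD_BYTES}" ]; then
		echo "pod cgroup memory limit ${limit} leaves no room for the Kata overhead (${KATA_OVERHEAD_BYTES} bytes)" >&2
		return 1
	fi
	return 0
}
```

A metrics test could then call something like `check_pod_memory "/sys/fs/cgroup/memory/<pod-cgroup>/memory.limit_in_bytes"` before launching a `true` workload.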
Force-pushed from 94b106a to 4e97883
Ping @jcvenegas.
* config: add option to use only pod cgroup (SandboxCgroupOnly)
1) Given a PodSandbox container creation, let
```
podCgroup=Parent(cgroupPath)
```
and
```
KataSandboxCgroup=${podCgroup}/kata-sandbox-<PodSandbox-id>
```
2) Then create the KataSandboxCgroup cgroup
3) Join the KataSandboxCgroup
Any process created by the runtime will be created in the KataSandboxCgroup:
* Proxy
* Shim
* Runtime
* Hypervisor
Examples:
Limitations
* For very small memory cgroups the runtime may not work as
expected, as the processes are going to get OOM-killed.
* This may reduce the performance of small pods, where
the pod cgroup is sized to the same resources as the containers (no
overhead is considered).
Depends-on: github.com/kata-containers/tests#1824
Fixes: kata-containers#1879
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
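The three numbered steps above can be sketched in shell. The function and variable names here are illustrative only; the actual implementation lives in the Go runtime, and the `mkdir`/`cgroup.procs` lines are shown as comments because they require root.

```shell
#!/bin/bash
# Sketch of steps 1-3 above (illustrative names, not the runtime's code).

# Step 1: the pod cgroup is the parent of the sandbox container's cgroup
# path, and the Kata sandbox cgroup hangs off it with a predictable name.
derive_kata_sandbox_cgroup() {
	local container_cgroup_path="$1"
	local sandbox_id="$2"
	local pod_cgroup
	pod_cgroup=$(dirname "${container_cgroup_path}")
	echo "${pod_cgroup}/kata-sandbox-${sandbox_id}"
}

# Steps 2 and 3 would then create the cgroup and join it, e.g. for the cpu
# controller (requires root, shown only for illustration):
#   mkdir -p "/sys/fs/cgroup/cpu${kata_cgroup}"
#   echo $$ > "/sys/fs/cgroup/cpu${kata_cgroup}/cgroup.procs"
```

Because the runtime joins the cgroup before launching anything, the proxy, shim and hypervisor all inherit it.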
Force-pushed from 89ed58f to 458cd9e
/test
Force-pushed from 074d649 to c6d3f09
/test
sudo systemctl start ${cri_runtime}
for i in {1..5}; do
	sleep 5
	if [ -f "${cri_runtime_socket}" ]; then
I see some other whitespace cleanup, as well as this retry. I'm okay with the whitespace changes being included in the commit, but what's the retry for?
Because the cri-containerd job runs k8s twice, the containerd service is restarted, and it seems kubeadm tries to connect before it is ready. I added some retries to check that the socket exists.
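The retry in the diff above can be generalized into a small helper. This is a sketch; the function name and defaults are illustrative, and it keeps the `-f` test used in the original diff.

```shell
#!/bin/bash
# Poll for a file to appear instead of assuming a restarted service is
# immediately ready. wait_for_file is an illustrative name.

wait_for_file() {
	local path="$1"
	local retries="${2:-5}"
	local delay="${3:-5}"
	local i
	for ((i = 0; i < retries; i++)); do
		# Same -f check as the original diff
		[ -f "${path}" ] && return 0
		sleep "${delay}"
	done
	return 1
}
```

Usage in the script above could then read `wait_for_file "${cri_runtime_socket}" || die "socket not ready"`, making the intent of the loop explicit.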
echo "Replacing ${option_false} for ${option_true}"
# If it is false
sudo sed -i "s,^${option_false},${option_true},g" "${KATA_ETC_CONFIG_PATH}"
I would recommend using crudini here; that way we don't need to verify whether the option is commented out or set to true/false.
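Absent crudini, the sed approach can cover both the commented-out and the true/false cases in one expression. A sketch, assuming GNU sed and illustrative option/file names:

```shell
#!/bin/bash
# Toggle a "key=value" option in a TOML-ish config file, matching either an
# existing value or a commented-out line. Requires GNU sed (\? and in-place -i).

toggle_option() {
	local file="$1"
	local option="$2"
	local value="$3"
	# Matches "option=...", "#option=..." or "# option=..." at line start
	sed -i "s|^#\?[[:space:]]*${option}[[:space:]]*=.*|${option}=${value}|" "${file}"
}
```

With this, the caller never needs to know the option's current state, which is the point of the crudini suggestion as well.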
if [ -f "${KATA_ETC_CONFIG_PATH}" ]; then
	bk_file="${KATA_ETC_CONFIG_PATH}-${bk_suffix}"
	echo "backup ${KATA_ETC_CONFIG_PATH} in ${bk_file}"
add option to enable only the pod cgroup (SandboxCgroupOnly) Depends-on: github.com/kata-containers/tests#1824 Fixes: kata-containers#1879 Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
When a new sandbox is created, join its cgroup path; this will create all the proxy, shim, etc. processes in the sandbox cgroup. Depends-on: github.com/kata-containers/tests#1824 Fixes: kata-containers#1879 Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Force-pushed from ac225d8 to 7b4d416
/test
kata-containers/runtime#1880 is merged; now this can be merged too.
Force-pushed from 7b4d416 to 9c965ec
/test
Force-pushed from 9c965ec to 5ed0047
/test
Force-pushed from 5ed0047 to 9d03488
/test
@jodh-intel @egernst @devimc ready to merge, PTAL
info "Validate option is enabled"
current_value=$(kata-runtime kata-env --json | jq ".Runtime.SandboxCgroupOnly")
if [ "$current_value" != "${option_value}" ]; then
Could be simplified to:

[ "$current_value" != "${option_value}" ] && die "The option was not updated"

@@ -0,0 +1,35 @@
#!/bin/bash
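The suggested one-liner relies on a `die` helper; a self-contained sketch follows. The `die` definition mirrors the helper commonly available in the CI scripts, and `check_option_updated` is an illustrative name wrapping the comparison.

```shell
#!/bin/bash
# Minimal sketch of the suggested simplification. die() is defined here so
# the example is self-contained; check_option_updated is hypothetical.

die() {
	echo >&2 "ERROR: $*"
	exit 1
}

check_option_updated() {
	local current_value="$1"
	local option_value="$2"
	[ "${current_value}" != "${option_value}" ] && die "The option was not updated"
	return 0
}
```

The `&& die` form replaces the `if`/`fi` block from the diff with a single line while keeping the same failure message.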
This script is very similar to the enable one. To avoid the code duplication, I'd be tempted to merge them together into .ci/toggle_sandbox_cgroup_only.sh (or similar) and require the caller to specify an enable or disable argument. That will guarantee the two scripts won't diverge in future.
Done, looks better now, thanks.
Force-pushed from 9d03488 to 547940b
/test
- Add a script to enable it if needed.
- Add a script to roll the config back after enabling it.
- Add a CI job case for the pod cgroup.
- Run the Kubernetes tests again with the pod cgroup enabled for K8S_CONTAINERD_JOB.
- In the Docker tests, check whether the option is enabled so cgroups are not checked on the host.

Fixes: kata-containers#1810
Signed-off-by: Jose Carlos Venegas Munoz <jcvenega@jcvenega-nuc.zpn.intel.com>
Force-pushed from 547940b to 29618b3
/test

/test
host.
Depends-on: github.com/kata-containers/runtime#1880
Fixes: #1810
Signed-off-by: Jose Carlos Venegas Munoz <jcvenega@jcvenega-nuc.zpn.intel.com>