This repository was archived by the owner on May 12, 2021. It is now read-only.

clh: Add support to unplug block devices#2833

Merged
jcvenegas merged 2 commits into kata-containers:master from likebreath:fix_2832
Aug 14, 2020

Conversation

@likebreath
Contributor

This patch enables kata+clh to unplug block devices, which is required
to pass cri-o integration tests.

Fixes: #2832

Signed-off-by: Bo Chen <chen.bo@intel.com>

@auto-comment

auto-comment Bot commented Jul 16, 2020

Thank you for raising your pull request. Please note that the main development of Kata Containers has moved to the 2.0-dev branch of https://github.com/kata-containers/kata-containers repository. The kata-containers/runtime repository is kept for 1.x release maintenance. Please check twice if your change should go to the 2.0-dev branch directly.

If the change is also required in the Kata Containers 1.x releases, please ping @kata-containers/runtime to assign a dedicated developer to be responsible for porting the change to the 2.0-dev branch. Thanks!

likebreath added a commit to likebreath/kata-tests that referenced this pull request Jul 16, 2020
This patch extends the existing scripts to cover a new Jenkins job on
testing k8s with crio for cloud-hypervisor.

Fixes: kata-containers#2546

Depends-on: github.com/kata-containers/runtime#2833

Signed-off-by: Bo Chen <chen.bo@intel.com>
@likebreath
Contributor Author

/test-clh

@codecov

codecov Bot commented Jul 16, 2020

Codecov Report

Merging #2833 into master will increase coverage by 0.00%.
The diff coverage is 51.85%.

@@           Coverage Diff           @@
##           master    #2833   +/-   ##
=======================================
  Coverage   51.40%   51.41%           
=======================================
  Files         118      118           
  Lines       17411    17434   +23     
=======================================
+ Hits         8950     8963   +13     
- Misses       7379     7388    +9     
- Partials     1082     1083    +1     

@likebreath
Contributor Author

/test-ubuntu

Comment thread virtcontainers/clh.go Outdated
Comment thread virtcontainers/clh.go Outdated
Comment thread virtcontainers/clh.go Outdated
Comment thread virtcontainers/clh.go Outdated
@likebreath
Contributor Author

/test-clh

@likebreath
Contributor Author

/test-ubuntu

@likebreath
Contributor Author

/test-clh

@likebreath
Contributor Author

/test-ubuntu

likebreath added a commit to likebreath/kata-tests that referenced this pull request Jul 28, 2020
This patch extends the existing scripts to cover a new Jenkins job on
testing k8s with crio for cloud-hypervisor.

Fixes: kata-containers#2546

Depends-on: github.com/kata-containers/runtime#2833

Signed-off-by: Bo Chen <chen.bo@intel.com>
@likebreath
Contributor Author

@jcvenegas @GabyCT The metrics CI is failing with a "Failed to load JSON" error on every test. Could you share any insight into the cause?

19:34:43 Report Summary:
19:34:43 +-----+----------------------+---------------------+---------------+------+-----+-----+-----+-----+-----+-----+
19:34:43 | P/F |         NAME         |         FLR         |     MEAN      | CEIL | GAP | MIN | MAX | RNG | COV | ITS |
19:34:43 +-----+----------------------+---------------------+---------------+------+-----+-----+-----+-----+-----+-----+
19:34:43 | *F* | boot-times           | Failed to load JSON | exit status 2 |      |     |     |     |     |     |     |
19:34:43 | *F* | memory-footprint     | Failed to load JSON | exit status 2 |      |     |     |     |     |     |     |
19:34:43 | *F* | memory-footprint-ksm | Failed to load JSON | exit status 2 |      |     |     |     |     |     |     |
19:34:43 | *F* | blogbench            | Failed to load JSON | exit status 2 |      |     |     |     |     |     |     |
19:34:43 | *F* | blogbench            | Failed to load JSON | exit status 2 |      |     |     |     |     |     |     |
19:34:43 +-----+----------------------+---------------------+---------------+------+-----+-----+-----+-----+-----+-----+

@GabyCT
Contributor

GabyCT commented Jul 28, 2020

@likebreath if you look at the log, each test reports an unexpected process:

21:34:40 setting KSM to aggressive mode
21:34:40 
21:34:40 ===== starting test [memory footprint ksm] =====
21:34:40 command: docker: yes
21:34:40 27346
21:34:40 ERROR: Found unexpected /usr/bin/cloud-hypervisor present
21:34:40 disabling KSM
21:34:41 
21:34:41 ===== starting test [memory footprint] =====
21:34:41 command: docker: yes
21:34:41 27346
21:34:41 ERROR: Found unexpected /usr/bin/cloud-hypervisor present
21:34:41 
21:34:41 ===== starting test [memory footprint inside container] =====
21:34:41 command: docker: yes
21:34:41 27346
21:34:41 ERROR: Found unexpected /usr/bin/cloud-hypervisor present
21:34:41 Executing test: boot times image=ubuntu runtime=kata-runtime units=seconds
21:34:41 command: bc: yes
21:34:41 command: awk: yes
21:34:41 
21:34:41 ===== starting test [boot times] =====
21:34:41 command: docker: yes
21:34:42 27346
21:34:42 ERROR: Found unexpected /usr/bin/cloud-hypervisor present
21:34:42 
21:34:42 ===== starting test [blogbench] =====
21:34:42 command: docker: yes
21:34:42 27346
21:34:42 ERROR: Found unexpected /usr/bin/cloud-hypervisor present
21:34:43 27346
21:34:43 ERROR: Found unexpected /usr/bin/cloud-hypervisor present

@GabyCT
Contributor

GabyCT commented Jul 28, 2020

That corrupted the tests, which is why they failed: the cloud-hypervisor process was already present before the metrics started running.
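For readers unfamiliar with this kind of pre-test sanity check, here is a minimal Go sketch of the idea. It is not the metrics scripts' actual implementation: the function name `unexpected`, the sample command lines, and the list of watched binaries are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// unexpected returns every watched binary that is the executable of at least
// one of the given process command lines. Both inputs are supplied by the
// caller; how the real scripts collect them is not shown in this thread.
func unexpected(cmdlines, watched []string) []string {
	var found []string
	for _, bin := range watched {
		for _, cmd := range cmdlines {
			fields := strings.Fields(cmd)
			if len(fields) > 0 && fields[0] == bin {
				found = append(found, bin)
				break // report each binary once
			}
		}
	}
	return found
}

func main() {
	// Hypothetical snapshot of processes left over from an earlier job.
	running := []string{
		"/usr/bin/cloud-hypervisor --api-socket /run/example.sock",
		"/usr/bin/dockerd",
	}
	watched := []string{"/usr/bin/cloud-hypervisor", "/usr/bin/qemu-system-x86_64"}

	for _, bin := range unexpected(running, watched) {
		fmt.Printf("ERROR: Found unexpected %s present\n", bin)
	}
}
```

A check like this runs before each test, so a single leftover hypervisor process is enough to fail every metric in the run, which matches the repeated error in the log above.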

@likebreath
Contributor Author

21:34:41 ERROR: Found unexpected /usr/bin/cloud-hypervisor present

@GabyCT Thank you for the inputs. Any thoughts on why this error is happening and how it can be fixed?

@GabyCT
Contributor

GabyCT commented Jul 28, 2020

@likebreath No. Check the logs; they may tell you what is happening.

@likebreath
Contributor Author

/test-clh-docker


@jodh-intel jodh-intel left a comment

Thanks @likebreath.

lgtm

@likebreath
Contributor Author

/test-clh-metrics

@likebreath
Contributor Author

I re-ran the metrics CI, and it no longer fails at the original setup stage. It now fails on the memory-footprint metrics, as listed below:

08:41:39 Report Summary:
08:41:39 +-----+----------------------+-------+--------+--------+-------+--------+--------+-------+------+-----+
08:41:39 | P/F |         NAME         |  FLR  |  MEAN  |  CEIL  |  GAP  |  MIN   |  MAX   |  RNG  | COV  | ITS |
08:41:39 +-----+----------------------+-------+--------+--------+-------+--------+--------+-------+------+-----+
08:41:39 | P   | boot-times           | 83.3% | 87.6%  | 116.7% | 33.3% | 79.2%  | 93.8%  | 18.5% | 4.7% |  20 |
08:41:39 | *F* | memory-footprint     | 95.0% | 95.0%  | 105.0% | 10.0% | 95.0%  | 95.0%  | 0.0%  | 0.0% |   1 |
08:41:39 | *F* | memory-footprint-ksm | 95.0% | 95.0%  | 105.0% | 10.0% | 95.0%  | 95.0%  | 0.0%  | 0.0% |   1 |
08:41:39 | P   | blogbench            | 80.0% | 117.2% | 120.0% | 40.0% | 117.2% | 117.2% | 0.0%  | 0.0% |   1 |
08:41:39 | P   | blogbench            | 80.0% | 103.6% | 120.0% | 40.0% | 103.6% | 103.6% | 0.0%  | 0.0% |   1 |
08:41:39 +-----+----------------------+-------+--------+--------+-------+--------+--------+-------+------+-----+
08:41:39 Fails: 2, Passes 3

With a pointer from @jcvenegas, I found the root cause of this failure in the log: 08:41:39 time="2020-07-29T15:41:39Z" level=warning msg="Failed Minval ( 269727.515000(minimal mean expected) > 269672.400000(minimal mean on dataset)) for [memory-footprint]".

A similar metrics CI failure is also observed on PR #2863 (@amshinde). From its log: 17:52:12 time="2020-07-29T00:52:12Z" level=warning msg="Failed Minval ( 269727.515000(minimal mean expected) > 269595.700000(minimal mean on dataset)) for [memory-footprint]".

@GabyCT Can you please share any insights about the above observations? Also, do you have pointers to documentation that explains the metrics CI in general? I am interested to know how the reference data are generated and what kind of data we are tracking as part of the metrics (e.g. why we monitor the minimum memory footprint). Thank you.
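The numbers in the two warnings can be reproduced with a small worked example. The sketch below assumes the check compares the dataset mean against a floor and ceiling derived from a reference value and a percentage tolerance. The reference value 283923.7 is not stated anywhere in this thread; it is inferred from the reported floor (269727.515 / 0.95) and the 95.0% / 105.0% columns of the report. The function names are hypothetical.

```go
package main

import "fmt"

// bounds returns the accepted [floor, ceiling] range around a reference value,
// given the allowed deviation below and above it in percent.
func bounds(mid, minPercent, maxPercent float64) (float64, float64) {
	return mid * (1 - minPercent/100), mid * (1 + maxPercent/100)
}

// passes reports whether a measured mean falls inside the accepted range.
func passes(mean, mid, minPercent, maxPercent float64) bool {
	lo, hi := bounds(mid, minPercent, maxPercent)
	return mean >= lo && mean <= hi
}

func main() {
	// Assumption: reference value inferred from the log, not taken from the
	// actual CI configuration.
	const mid = 283923.7

	lo, hi := bounds(mid, 5, 5)
	fmt.Printf("floor=%.3f ceiling=%.3f\n", lo, hi)

	// Means reported for this PR and for PR #2863.
	for _, mean := range []float64{269672.4, 269595.7} {
		fmt.Printf("mean=%.1f pass=%v (%.2f%% of reference)\n",
			mean, passes(mean, mid, 5, 5), 100*mean/mid)
	}
}
```

Under these assumptions both means sit at roughly 94.95% to 94.98% of the reference, just under the 95% floor. The report rounds that to 95.0%, which is why the MEAN and FLR columns look identical even though the check fails.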

To support unplugging block devices, we need to set the 'Id' explicitly when
hotplugging devices with the cloud-hypervisor HTTP API.

Fixes: kata-containers#2832

Signed-off-by: Bo Chen <chen.bo@intel.com>
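
To illustrate why the explicit 'Id' matters, here is a minimal Go sketch of the idea, not the code in virtcontainers/clh.go. The struct types, the `clh_` prefix, and the helper names are hypothetical; the sketch only builds JSON request bodies and does not call the cloud-hypervisor API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// diskConfig is a hypothetical, trimmed-down stand-in for the body of a disk
// hotplug request. Only the fields needed for the illustration are kept.
type diskConfig struct {
	Path string `json:"path"`
	ID   string `json:"id"`
}

// removeDevice is a hypothetical stand-in for the body of an unplug request.
type removeDevice struct {
	ID string `json:"id"`
}

// driveID derives a deterministic device ID from the runtime's own drive name
// (assumed naming scheme, for illustration only).
func driveID(driveName string) string {
	return "clh_" + driveName
}

// hotplugBody builds the request body for adding a disk with an explicit ID.
func hotplugBody(path, driveName string) (string, error) {
	b, err := json.Marshal(diskConfig{Path: path, ID: driveID(driveName)})
	return string(b), err
}

// unplugBody builds the request body for removing that same disk later.
func unplugBody(driveName string) (string, error) {
	b, err := json.Marshal(removeDevice{ID: driveID(driveName)})
	return string(b), err
}

func main() {
	add, err := hotplugBody("/dev/mapper/example", "drive-0")
	if err != nil {
		panic(err)
	}
	rm, err := unplugBody("drive-0")
	if err != nil {
		panic(err)
	}
	fmt.Println(add)
	fmt.Println(rm)
}
```

Because the ID is derived from a name the runtime already tracks, the unplug path can recompute it without storing anything returned by the hotplug call.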
@likebreath
Contributor Author

/test-clh

@likebreath
Contributor Author

/test-clh

This patch enables kata+clh to unplug block devices, which is required
to pass cri-o integration tests.

Fixes: kata-containers#2832

Signed-off-by: Bo Chen <chen.bo@intel.com>
@likebreath
Contributor Author

/test-clh-metrics

@likebreath
Contributor Author

/test-ubuntu

@likebreath
Contributor Author

@amshinde @jcvenegas This PR is ready to be landed. Please take a look and ACK. Thanks.

Development

Successfully merging this pull request may close these issues.

clh: Add support to unplug block devices

6 participants