Skip to content
This repository was archived by the owner on May 12, 2021. It is now read-only.

Consistent device address matching between getDeviceName() and listenToUdevEvents()#849

Merged
dgibson merged 2 commits into
kata-containers:masterfrom
dgibson:bug848
Sep 25, 2020
Merged

Consistent device address matching between getDeviceName() and listenToUdevEvents()#849
dgibson merged 2 commits into
kata-containers:masterfrom
dgibson:bug848

Conversation

@dgibson
Copy link
Copy Markdown
Contributor

@dgibson dgibson commented Sep 22, 2020

This is a fix for issue #848. While we're there, simplify the code a bit and rename a poorly named data structure.

@dgibson dgibson self-assigned this Sep 22, 2020
@dgibson dgibson added the needs-forward-port Changes need to be applied to a newer branch / repository label Sep 22, 2020
@dgibson
Copy link
Copy Markdown
Contributor Author

dgibson commented Sep 22, 2020

Not sure if this should get a backport or not. It is, technically, a bug, but the chances of hitting it in practice are pretty small AFAICT.

@codecov
Copy link
Copy Markdown

codecov Bot commented Sep 22, 2020

Codecov Report

Merging #849 into master will increase coverage by 0.08%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #849      +/-   ##
==========================================
+ Coverage   57.85%   57.94%   +0.08%     
==========================================
  Files          17       17              
  Lines        2373     2373              
==========================================
+ Hits         1373     1375       +2     
+ Misses        839      838       -1     
+ Partials      161      160       -1     

@dgibson dgibson changed the title Consistent device addres matching between getDeviceName() and listenToUdevEvents() Consistent device address matching between getDeviceName() and listenToUdevEvents() Sep 22, 2020
@dgibson
Copy link
Copy Markdown
Contributor Author

dgibson commented Sep 22, 2020

I don't really understand how my change can be reducing test coverage, since I've removed code and not changed the tests.

@jodh-intel jodh-intel added the needs-backport Changes need to be applied to an older branch / repository label Sep 22, 2020
Copy link
Copy Markdown

@jodh-intel jodh-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dgibson.

lgtm

(And yes, we need someone to dig into codecov results as i've seen some oddities like this).

@devimc
Copy link
Copy Markdown

devimc commented Sep 22, 2020

/test

Copy link
Copy Markdown

@devimc devimc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - thanks @dgibson

what's the next step? I think we need device handlers for VFIO devices, this way the agent will wait for them, right?

@dgibson
Copy link
Copy Markdown
Contributor Author

dgibson commented Sep 22, 2020

what's the next step? I think we need device handlers for VFIO devices, this way the agent will wait for them, right?

Yes. I'm working on it.

@devimc
Copy link
Copy Markdown

devimc commented Sep 22, 2020

/test

pciDeviceMap is essentially just a map from sysfs paths to corresponding
device nodes.  The only thing PCI related is that the sysfs paths we're
usually looking up in it usually involve PCI addresses.  There's no reason
that has to be the case, however, and I have future uses which will not
be PCI.

Therefore, rename it (and change some surrounding comments) to reflect
its more general nature.  While we're there, fix a typo in a related
comment.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This function watches for uvents that match addresses stored in
s.deviceWatchers() and if there's a matching entry signals the associated
channel.

"Matching" in this instance means soem sort of comparison between the
address we're watching for and the sysfs path in the uevent.  Currently we
match any of a bunch of different cases for different types of device.
Being very specific with how we match *sounds* like a good idea, but it's
complicated, tightly coupled to other parts of the code and, ultimately,
pointless.

The user of the data we're collecting here is getDeviceName().  Before
registering a watcher it checks s.sysToDevMap in case the uevent it's
interested in has already happened.

To have consistent behaviour regardless of when the uevent arrives,
the matching that getDeviceName() uses against s.sysToDevMap must be
the same as we use in listenToUdevEvents() and right now, it's not.
So, change the matching in listenToUdevEvents() to be the same as the
much simpler version in getDeviceName().

fixes #848

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
@devimc
Copy link
Copy Markdown

devimc commented Sep 23, 2020

@dgibson I see that #850 includes these changes, should we close this and review/merge #850 ?

@dgibson
Copy link
Copy Markdown
Contributor Author

dgibson commented Sep 24, 2020

I'm fine either way. These patches are essentially a prerequisite for #850 , so I don't mind if they go in as two PRs or one.

@dgibson
Copy link
Copy Markdown
Contributor Author

dgibson commented Sep 24, 2020

/test-ubuntu

@dgibson
Copy link
Copy Markdown
Contributor Author

dgibson commented Sep 24, 2020

/test

@dgibson
Copy link
Copy Markdown
Contributor Author

dgibson commented Sep 24, 2020

Hmmm the CI says
.ci/teardown.sh: line 260: make: command not found

I don't think that's my fault...

@jodh-intel
Copy link
Copy Markdown

@dgibson - hahaa! That's a pretty brutal tear down script eh? 😄

@GabyCT - That make error seems to be being caused by https://github.com/kata-containers/ci/blob/master/jenkins/jobs/kata-containers-agent-rhel-8-q35-PR/config.xml#L135..L142. But if so, that implies PATH isn't set at that point. If so, we need to add /bin:/usr/bin.

But I think the main problem is the previous error:

Error: 
 Problem: package podman-1.9.3-2.module+el8.2.1+6867+366c07d6.x86_64 requires runc >= 1.0.0-57, but none of the providers can be installed
  - package containerd.io-1.2.6-3.3.el7.x86_64 conflicts with containerd provided by containerd.io-1.3.7-3.1.el8.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 conflicts with runc provided by containerd.io-1.2.6-3.3.el7.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 conflicts with containerd provided by containerd.io-1.2.6-3.3.el7.x86_64
  - cannot install both containerd.io-1.3.7-3.1.el8.x86_64 and containerd.io-1.2.6-3.3.el7.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 conflicts with runc provided by runc-1.0.0-60.rc8.module+el8.1.0+4081+b29780af.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 obsoletes runc provided by runc-1.0.0-60.rc8.module+el8.1.0+4081+b29780af.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 conflicts with runc provided by runc-1.0.0-64.rc9.module+el8.1.1+5259+bcdd613a.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 obsoletes runc provided by runc-1.0.0-64.rc9.module+el8.1.1+5259+bcdd613a.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 conflicts with runc provided by runc-1.0.0-65.rc10.module+el8.2.0+6368+cf16aa14.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 obsoletes runc provided by runc-1.0.0-65.rc10.module+el8.2.0+6368+cf16aa14.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 conflicts with runc provided by runc-1.0.0-61.rc8.module+el8.1.0+4873+4a24e241.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 obsoletes runc provided by runc-1.0.0-61.rc8.module+el8.1.0+4873+4a24e241.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 conflicts with runc provided by runc-1.0.0-65.rc10.module+el8.2.0+5762+aaee29fb.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 obsoletes runc provided by runc-1.0.0-65.rc10.module+el8.2.0+5762+aaee29fb.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 conflicts with runc provided by runc-1.0.0-66.rc10.module+el8.2.1+6465+1a51e8b6.x86_64
  - package containerd.io-1.3.7-3.1.el8.x86_64 obsoletes runc provided by runc-1.0.0-66.rc10.module+el8.2.1+6465+1a51e8b6.x86_64
  - cannot install the best update candidate for package podman-1.0.5-1.gitf604175.module+el8.0.0+4017+bbba319f.x86_64
  - cannot install the best update candidate for package containerd.io-1.2.6-3.3.el7.x86_64
  - package runc-1.0.0-64.rc10.module+el8.2.0+6369+1f4293b4.x86_64 is filtered out by modular filtering
  - package runc-1.0.0-64.rc10.module+el8.2.0+5728+ac3aae00.x86_64 is filtered out by modular filtering
(try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)
Failed at 26: sudo -E yum -y update

@devimc
Copy link
Copy Markdown

devimc commented Sep 24, 2020

/test-centos
/test-rhel

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

needs-backport Changes need to be applied to an older branch / repository needs-forward-port Changes need to be applied to a newer branch / repository

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants