virtcontainers: Do not lazy-attach devices#2670
The PCI bus rescan code was added a long time ago in Clear Containers due to the lack of ACPI support in QEMU 2.9 + q35 [1]. Now this code is messing up PCIe hotplug in Kata Containers. A workaround for this issue is the "lazy attach" mechanism [2], which hotplugs LBS (Large BAR space) devices after re-scanning the PCI bus. Unfortunately, some non-LBS devices are affected too, for instance SR-IOV devices. It would not make sense to lazy-attach non-LBS devices as well, because Kata would end up lazy-attaching all devices. Therefore, the PCI bus rescan code and the "lazy attach" mechanism should be removed.
Depends-on: github.com/kata-containers/runtime#2670
fixes kata-containers#781
fixes kata-containers/runtime#2664
[1] clearcontainers/agent#139
[2] kata-containers/runtime#2461
Signed-off-by: Julio Montes <julio.montes@intel.com>
|
/test |
The "lazy attach" mechanism [1] was added to hotplug LBS (Large BAR space) devices after re-scanning the PCI bus, fixing LBS hotplug in Kata Containers. Since the PCI rescan is removed in kata-containers/agent#782, lazy attach is no longer needed.
Depends-on: github.com/kata-containers/agent#782
fixes kata-containers#2664
[1] kata-containers#2461
Signed-off-by: Julio Montes <julio.montes@intel.com>
Force-pushed from 39d5113 to 0b1c99d
|
/test |
Codecov Report
@@ Coverage Diff @@
## master #2670 +/- ##
==========================================
+ Coverage 45.58% 50.55% +4.96%
==========================================
Files 118 118
Lines 17131 17074 -57
==========================================
+ Hits 7810 8631 +821
+ Misses 8456 7388 -1068
- Partials 865 1055 +190 |
fidencio
left a comment
This whole re-scan / lazy-attach scenario is rather complicated, isn't it?
Here, similarly to what I've done for kata-containers/agent#782, as the code and the concept do look good, I'll "Approve" the PR, when @amorenoz finishes his tests, using again the AaaS ("Ack-as-a-Service") concept. :-)
|
Btw, before we have this merged, would it be possible to also get an opinion from @Jimmy-Xu on whether it regresses their use case? |
|
all green, just waiting for review |
|
I took this PR and the associated agent PR; here is a summary of my verification results.
Device for passthrough
Kata container execution command
Kata configuration-1
This config works fine, and dmesg inside the container/Kata VM shows the PCI device.
Kata configuration-2
This configuration fails with the following error
This should have worked, but I'm not sure at this moment why it's failing in my setup. Will continue debugging. |
|
|
func getVFIODetails(deviceFileName, iommuDevicesPath string) (deviceBDF, deviceSysfsDev string, vfioDeviceType config.VFIODeviceType, err error) {
	vfioDeviceType = GetVFIODeviceType(deviceFileName)
	tokens := strings.Split(deviceFileName, ":")
It might be worth putting this logic into a new function that can be unit-tested with various numbers of colons and dashes in deviceFileName.
func getVFIODeviceType(deviceFileName string) string
|
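A rough sketch of what such a helper could look like, so that it can be exercised with different numbers of colons and dashes. The names `vfioNameKind` and `classifyVFIOName` are hypothetical stand-ins (not the runtime's actual `config.VFIODeviceType` or `GetVFIODeviceType`), and the rule used here is an assumption: three colon-separated tokens are treated as a PCI address such as 0000:04:00.0, five dash-separated tokens as a mediated-device UUID, and anything else as an error.

```go
package main

import (
	"fmt"
	"strings"
)

// vfioNameKind is a hypothetical stand-in for config.VFIODeviceType.
type vfioNameKind string

const (
	kindNormal   vfioNameKind = "normal"   // assumed: PCI address, e.g. 0000:04:00.0
	kindMediated vfioNameKind = "mediated" // assumed: mdev UUID with five dash-separated groups
	kindError    vfioNameKind = "error"    // anything that matches neither shape
)

// classifyVFIOName classifies a device file name purely by its shape.
// The colon check runs first, so a name with exactly three colon-separated
// tokens is reported as normal even if it also contains dashes.
func classifyVFIOName(deviceFileName string) vfioNameKind {
	if len(strings.Split(deviceFileName, ":")) == 3 {
		return kindNormal
	}
	if len(strings.Split(deviceFileName, "-")) == 5 {
		return kindMediated
	}
	return kindError
}

func main() {
	names := []string{
		"0000:04:00.0",
		"83b8f4f2-509f-382f-3c1e-e6bfe0fa1001",
		"04:00.0",
		"a-b-c-d",
		"",
	}
	for _, n := range names {
		fmt.Printf("%q -> %s\n", n, classifyVFIOName(n))
	}
}
```

Keeping the classification free of sysfs access means a table-driven unit test can cover the edge cases (too few or too many colons, too few dashes, empty string) without any VFIO hardware.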
/test-vfio |
|
/test-vfio |
|
@devimc not sure why but this PR still has a pending DCO and WIP check. Do we still want to merge it? |
|
closing. #2981 includes this |
The "lazy attach" mechanism [1] was added to hotplug LBS (Large BAR space)
devices after re-scanning the PCI bus, fixing LBS hotplug in Kata Containers.
Since the PCI rescan is removed in kata-containers/agent#782, lazy attach is no
longer needed.
Depends-on: github.com/kata-containers/agent#782
fixes #2664
[1] #2461
Signed-off-by: Julio Montes <julio.montes@intel.com>