Skip to content
This repository was archived by the owner on May 12, 2021. It is now read-only.

virtcontainers: Do not lazy-attach devices#2670

Closed
devimc wants to merge 1 commit into
kata-containers:masterfrom
devimc:topic/virtcontainers/doNotLazyAttach
Closed

virtcontainers: Do not lazy-attach devices#2670
devimc wants to merge 1 commit into
kata-containers:masterfrom
devimc:topic/virtcontainers/doNotLazyAttach

Conversation

@devimc
Copy link
Copy Markdown

@devimc devimc commented May 7, 2020

The "lazy attach" mechanism [1] was added to hotplugs LBS (Large BAR space)
devices after re-scanning the PCI bus, fixing LBS hotplug in kata containers.
Since PCI rescan is removed in kata-containers/agent#782, lazy attach is not
longer needed.

Depends-on: github.com/kata-containers/agent#782
fixes #2664

[1] #2461

Signed-off-by: Julio Montes julio.montes@intel.com

@devimc devimc added the do-not-merge PR has problems or depends on another label May 7, 2020
devimc pushed a commit to devimc/kata-agent that referenced this pull request May 7, 2020
PCI bus rescan code was added long time ago in Clear Containers due to lack of
ACPI support in QEMU 2.9 + q35 [1]. Now this code is messing up PCIe hotplug
in Kata Containers. A workaround to this issue is the "lazy attach"
mechanism [2] that hotplugs LBS (Large BAR space) devices after re-scanning the
PCI bus, unfourtunately some non-LBS devices are being affected too, for
instance SR-IOV devices. It would not make sense to lazy-attach non-LBS
devices because kata will end up lazy-attaching all the devices, having said
that, the PCI bus rescan code and the "lazy attach" mechanism should be removed

Depends-on: github.com/kata-containers/runtime#2670
fixes kata-containers#781
fixes kata-containers/runtime#2664

[1] clearcontainers/agent#139
[2] kata-containers/runtime#2461

Signed-off-by: Julio Montes <julio.montes@intel.com>
@devimc
Copy link
Copy Markdown
Author

devimc commented May 7, 2020

/test

The "lazy attach" mechanism [1] was added to hotplugs LBS (Large BAR space)
devices after re-scanning the PCI bus, fixing LBS hotplug in kata containers.
Since PCI rescan is removed in kata-containers/agent#782, lazy attach is not
longer needed.

Depends-on: github.com/kata-containers/agent#782
fixes kata-containers#2664

[1] kata-containers#2461

Signed-off-by: Julio Montes <julio.montes@intel.com>
@devimc devimc force-pushed the topic/virtcontainers/doNotLazyAttach branch from 39d5113 to 0b1c99d Compare May 8, 2020 00:34
@devimc
Copy link
Copy Markdown
Author

devimc commented May 8, 2020

/test

@codecov
Copy link
Copy Markdown

codecov Bot commented May 8, 2020

Codecov Report

Merging #2670 into master will increase coverage by 4.96%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #2670      +/-   ##
==========================================
+ Coverage   45.58%   50.55%   +4.96%     
==========================================
  Files         118      118              
  Lines       17131    17074      -57     
==========================================
+ Hits         7810     8631     +821     
+ Misses       8456     7388    -1068     
- Partials      865     1055     +190     

Copy link
Copy Markdown
Member

@fidencio fidencio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole re-scan / lazy-attach scenario is rather complicated, isn't it?

Here, similarly to what I've done for kata-containers/agent#782, as the code and the concept do look good, I'll "Approve" the PR, when @amorenoz finishes his tests, using again the AaaS ("Ack-as-a-Service") concept. :-)

@fidencio
Copy link
Copy Markdown
Member

fidencio commented May 8, 2020

Btw, before we have this merged, would be possible to also have an opinion from @Jimmy-Xu that it doesn't regress on their use case?

@devimc devimc removed the do-not-merge PR has problems or depends on another label May 8, 2020
devimc pushed a commit to devimc/kata-agent that referenced this pull request May 8, 2020
PCI bus rescan code was added long time ago in Clear Containers due to lack of
ACPI support in QEMU 2.9 + q35 [1]. Now this code is messing up PCIe hotplug
in Kata Containers. A workaround to this issue is the "lazy attach"
mechanism [2] that hotplugs LBS (Large BAR space) devices after re-scanning the
PCI bus, unfourtunately some non-LBS devices are being affected too, for
instance SR-IOV devices. It would not make sense to lazy-attach non-LBS
devices because kata will end up lazy-attaching all the devices, having said
that, the PCI bus rescan code and the "lazy attach" mechanism should be removed

Depends-on: github.com/kata-containers/runtime#2670
fixes kata-containers#781
fixes kata-containers/runtime#2664

[1] clearcontainers/agent#139
[2] kata-containers/runtime#2461

Signed-off-by: Julio Montes <julio.montes@intel.com>
@devimc
Copy link
Copy Markdown
Author

devimc commented May 8, 2020

all green, just waiting for review

@bpradipt
Copy link
Copy Markdown
Contributor

I took this PR and the associated agent PR and here is a summary of my verification results
Test environment
Intel system with Fedora 31, kernel-5.6.8-200.fc31.x86_64, cgroup v2 disabled and intel_iommu is turned on

Device for passthrough

IOMMU Group 16:
        16:00.0 Ethernet controller [0200]: Emulex Corporation OneConnect 10Gb NIC (be3) [19a2:0710] (rev 02)

# lspci -nnk -d 19a2:0710
[snip]
16:00.0 Ethernet controller [0200]: Emulex Corporation OneConnect 10Gb NIC (be3) [19a2:0710] (rev 02)
        Subsystem: Emulex Corporation Device [10df:e731]
        Kernel driver in use: vfio-pci
        Kernel modules: be2net
[snip]

# ls -l /dev/vfio/
total 0
crw-------. 1 root root 237,   0 May 10 00:13 16
crw-rw-rw-. 1 root root  10, 196 May 10 00:13 vfio

Kata container execution command

podman run -it --runtime=/usr/local/bin/kata-runtime --device /dev/vfio/16 fedora bash

Kata configuration-1

machine_type = "pc"
#hotplug_vfio_on_root_bus = true
#pcie_root_port = 2

This config works fine and dmesg inside the container/Kata-VM shows the PCI device

[snip]

[    2.066592] pci 0000:01:01.0: [19a2:0710] type 00 class 0x020000
[    2.067263] pci 0000:01:01.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
[    2.067647] pci 0000:01:01.0: reg 0x18: [mem 0x00000000-0x0001ffff 64bit]
[    2.068060] pci 0000:01:01.0: reg 0x20: [mem 0x00000000-0x0001ffff 64bit]
[    2.069305] pci 0000:01:01.0: PME# supported from D3hot D3cold
[    2.069784] pci 0000:01:01.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown speed x0 link at 0000:00:02.0 (capable of 32.000 Gb/s with 5 GT/s x8 link)
[    2.070412] pci 0000:01:01.0: BAR 2: assigned [mem 0xfe400000-0xfe41ffff 64bit]
[    2.070836] pci 0000:01:01.0: BAR 4: assigned [mem 0xfe420000-0xfe43ffff 64bit]
[    2.071190] pci 0000:01:01.0: BAR 0: assigned [mem 0xfe440000-0xfe443fff 64bit]

[snip]

Kata configuration-2

machine_type = "q35"
hotplug_vfio_on_root_bus = true
pcie_root_port = 2

This configuration fails with the following error

Bus 'pcie.0' does not support hotplugging: OCI runtime error

This should have worked but I'm not sure at this moment why it's failing in my setup. Will continue debugging.


func getVFIODetails(deviceFileName, iommuDevicesPath string) (deviceBDF, deviceSysfsDev string, vfioDeviceType config.VFIODeviceType, err error) {
vfioDeviceType = GetVFIODeviceType(deviceFileName)
tokens := strings.Split(deviceFileName, ":")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth putting this logic into a new function that can be unit-tested with various numbers of colons and dashes in deviceFileName.

func getVFIODeviceType(deviceFileName string) string

@amorenoz
Copy link
Copy Markdown
Contributor

amorenoz commented May 11, 2020

Thanks @bpradipt for testing this.
I reported two issues squashed in one, that may have caused some confusion, sorry about that. I have now split them. Your error is now reported in #2678.
As for this PR (and the associated agent PR), I have also verified that fixes the problem in bridge hotplug.

Copy link
Copy Markdown
Member

@fidencio fidencio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving based on both @amorenoz and @bpradipt feedback!

@devimc
Copy link
Copy Markdown
Author

devimc commented May 13, 2020

/test-vfio

@Jimmy-Xu Jimmy-Xu self-requested a review June 2, 2020 02:46
devimc pushed a commit to devimc/kata-agent that referenced this pull request Sep 4, 2020
PCI bus rescan code was added long time ago in Clear Containers due to lack of
ACPI support in QEMU 2.9 + q35 [1]. Now this code is messing up PCIe hotplug
in Kata Containers. A workaround to this issue is the "lazy attach"
mechanism [2] that hotplugs LBS (Large BAR space) devices after re-scanning the
PCI bus, unfourtunately some non-LBS devices are being affected too, for
instance SR-IOV devices. It would not make sense to lazy-attach non-LBS
devices because kata will end up lazy-attaching all the devices, having said
that, the PCI bus rescan code and the "lazy attach" mechanism should be removed

Depends-on: github.com/kata-containers/runtime#2670
fixes kata-containers#781
fixes kata-containers/runtime#2664

[1] clearcontainers/agent#139
[2] kata-containers/runtime#2461

Signed-off-by: Julio Montes <julio.montes@intel.com>
@cmaf
Copy link
Copy Markdown
Contributor

cmaf commented Sep 30, 2020

/test-vfio

@cmaf
Copy link
Copy Markdown
Contributor

cmaf commented Sep 30, 2020

@devimc not sure why but this PR still has a pending DCO and WIP check. Do we still want to merge it?

@dgibson
Copy link
Copy Markdown
Contributor

dgibson commented Sep 30, 2020

@cmaf I think this is superseded by #2981

@devimc
Copy link
Copy Markdown
Author

devimc commented Sep 30, 2020

closing. #2981 includes this

@devimc devimc closed this Sep 30, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[qemu] q35 VFIO passthrough fails on both bridge and pcie-root-port

8 participants