Skip to content
This repository was archived by the owner on May 6, 2020. It is now read-only.

agent: re-scan pci bus to discover hidden devices#139

Merged
sboeuf merged 1 commit into
clearcontainers:masterfrom
devimc:q35Hotplug
Oct 27, 2017
Merged

agent: re-scan pci bus to discover hidden devices#139
sboeuf merged 1 commit into
clearcontainers:masterfrom
devimc:q35Hotplug

Conversation

@devimc
Copy link
Copy Markdown

@devimc devimc commented Oct 20, 2017

qemu <= 2.9 does not support ACPI pci hotplug in Q35 machines,
this means the linux kernel will not receive the order to
re-enumerate/re-scan the pci devices connected to the buses,
hence pci bus must be re-scanned manually looking for hidden devices

fixes #138

Signed-off-by: Julio Montes julio.montes@intel.com

Copy link
Copy Markdown

@sboeuf sboeuf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good (I have only one comment about the log), but I have a question regarding the commit message.
If I read the commit message, I have the feeling that Qemu <= 2.9 cannot support ACPI hotplug, meaning that we cannot hotplug anything. And at the same time, you're saying that we have to rescan the PCI bus from inside the VM in that case. Could you elaborate how you can still "hotplug" things (without ACPI), and that you need to rescan because they don't show up automatically. I am confused and it would be great if the commit message was more verbose to explain this change (I trust you that the fix is really needed but I'd like to understand exactly why).

Comment thread agent.go Outdated
// re-scan PCI bus
// looking for hidden devices
if err := ioutil.WriteFile(pciBusRescanFile, []byte("1"), pciBusMode); err != nil {
agentLog.WithError(err).Warnf("Could not open pci bus rescan file %s: %s", pciBusRescanFile, err)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Could not rescan the PCI bus" would be more generic and more appropriate. The underlying reason why this is happening will come from err.

@devimc
Copy link
Copy Markdown
Author

devimc commented Oct 20, 2017

@sboeuf done

Copy link
Copy Markdown

@sboeuf sboeuf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @devimc
This looks better now ;)

@sboeuf
Copy link
Copy Markdown

sboeuf commented Oct 20, 2017

LGTM

Comment thread agent.go Outdated
// re-scan PCI bus
// looking for hidden devices
if err := ioutil.WriteFile(pciBusRescanFile, []byte("1"), pciBusMode); err != nil {
agentLog.WithError(err).Warnf("Could not rescan pci bus: %s", err)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to log the error twice, so I'd make this just:

agentLog.WithError(err).Warn("Could not rescan PCI bus")

Copy link
Copy Markdown
Contributor

@grahamwhaley grahamwhaley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm
but let's fix that log message @jodh-intel noted.

qemu <= 2.9 does not support ACPI pci hotplug in Q35 machines,
this means the linux kernel will not receive the order to
re-enumerate/re-scan the pci devices connected to the buses,
hence pci bus must be re-scanned manually looking for hidden devices

fixes clearcontainers#138

Signed-off-by: Julio Montes <julio.montes@intel.com>
@devimc
Copy link
Copy Markdown
Author

devimc commented Oct 23, 2017

@grahamwhaley @jodh-intel done, thanks

@jodh-intel
Copy link
Copy Markdown

jodh-intel commented Oct 23, 2017

lgtm

Approved with PullApprove

Copy link
Copy Markdown
Contributor

@grahamwhaley grahamwhaley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Comment thread agent.go

// re-scan PCI bus
// looking for hidden devices
if err := ioutil.WriteFile(pciBusRescanFile, []byte("1"), pciBusMode); err != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devimc In the other PR you are reusing the pci slots. Not sure if a slot will ever be reused (block device unplug followed by a plug in the real world with virtcontainers). However if that does happen, what happens to the mount points?

Also how do we know when the rescan is complete?

Copy link
Copy Markdown
Author

@devimc devimc Oct 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mcastelino pci slots can be reused once it's free

what happens to the mount points?

There is still missing that part, in qemu <= 2.9 Q35 PCI devices must be removed manually, I guess we have to implement an extra command in the agent to clean up unplugged devices

Also how do we know when the rescan is complete?

it is a blocking IO operation, process will not continue until all PCI buses are re-scanned

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devimc will the pci slot reuse happen in the real world. Do we support removing a container from a POD?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if we can remove a container from a POD, but we can unplug devices from bridges.
we could reuse that slot to hot plug another device if we need it since we have a limitation in the number of devices per bridge.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devimc so when will we ever unplug a device? I cannot think of a case. Which means we are ok for now and need no special logic or unmount/remount.

@sboeuf
Copy link
Copy Markdown

sboeuf commented Oct 27, 2017

@mcastelino what is the status on this ? Are you okay with this PR, or you expect some changes ?

@mcastelino
Copy link
Copy Markdown
Contributor

mcastelino commented Oct 27, 2017

LGTM. We do not remount block devices now. So this is fine.

Approved with PullApprove Approved with PullApprove

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for Q35 machines

5 participants