agent: re-scan pci bus to discover hidden devices #139
sboeuf
left a comment
This looks good (I have only one comment about the log), but I have a question regarding the commit message.
If I read the commit message, I have the feeling that QEMU <= 2.9 cannot support ACPI hotplug, meaning that we cannot hotplug anything. At the same time, you're saying that we have to rescan the PCI bus from inside the VM in that case. Could you elaborate on how you can still "hotplug" things (without ACPI), and why you need to rescan because they don't show up automatically? I am confused, and it would be great if the commit message were more verbose in explaining this change (I trust you that the fix is really needed, but I'd like to understand exactly why).
// re-scan PCI bus
// looking for hidden devices
if err := ioutil.WriteFile(pciBusRescanFile, []byte("1"), pciBusMode); err != nil {
    agentLog.WithError(err).Warnf("Could not open pci bus rescan file %s: %s", pciBusRescanFile, err)
"Could not rescan the PCI bus" would be more generic and more appropriate. The underlying reason why this is happening will come from err.
@sboeuf done

LGTM
// re-scan PCI bus
// looking for hidden devices
if err := ioutil.WriteFile(pciBusRescanFile, []byte("1"), pciBusMode); err != nil {
    agentLog.WithError(err).Warnf("Could not rescan pci bus: %s", err)
This is going to log the error twice, so I'd make this just:
agentLog.WithError(err).Warn("Could not rescan PCI bus")
grahamwhaley
left a comment
lgtm
but let's fix that log message @jodh-intel noted.
qemu <= 2.9 does not support ACPI pci hotplug in Q35 machines,
this means the linux kernel will not receive the order to
re-enumerate/re-scan the pci devices connected to the buses,
hence pci bus must be re-scanned manually looking for hidden devices

fixes clearcontainers#138

Signed-off-by: Julio Montes <julio.montes@intel.com>
@grahamwhaley @jodh-intel done, thanks
// re-scan PCI bus
// looking for hidden devices
if err := ioutil.WriteFile(pciBusRescanFile, []byte("1"), pciBusMode); err != nil {
@devimc In the other PR you are reusing the pci slots. Not sure if a slot will ever be reused (block device unplug followed by a plug in the real world with virtcontainers). However if that does happen, what happens to the mount points?
Also how do we know when the rescan is complete?
@mcastelino PCI slots can be reused once they are free.
> what happens to the mount points?
That part is still missing. With QEMU <= 2.9 on Q35, PCI devices must be removed manually, so I guess we have to implement an extra command in the agent to clean up unplugged devices.
> Also how do we know when the rescan is complete?
It is a blocking I/O operation: the process will not continue until all PCI buses are re-scanned.
@devimc will PCI slot reuse happen in the real world? Do we support removing a container from a pod?
@mcastelino Not sure if we can remove a container from a pod, but we can unplug devices from bridges.
We could reuse that slot to hotplug another device if we need it, since the number of devices per bridge is limited.
@devimc so when will we ever unplug a device? I cannot think of a case, which means we are OK for now and need no special logic or unmount/remount.
@mcastelino what is the status of this? Are you okay with this PR, or do you expect some changes?