Skip to content
This repository was archived by the owner on May 12, 2021. It is now read-only.

shim v2: Close vhostfd after vm get vhostfd#1670

Merged
lifupan merged 1 commit intokata-containers:masterfrom
xs3c:fix-vfio-hang
May 21, 2019
Merged

shim v2: Close vhostfd after vm get vhostfd#1670
lifupan merged 1 commit intokata-containers:masterfrom
xs3c:fix-vfio-hang

Conversation

@xs3c
Copy link
Contributor

@xs3c xs3c commented May 13, 2019

If kata containers is using vfio and vhost net,the unbinding
of vfio would be hang. In the scenario, vhost net kernel thread
takes a reference to the qemu's mm, and the reference also includes
the mmap regions on the vfio device file. so vhost kernel thread
would be not released when qemu is killed as the vhost file
descriptor still is opened by shim v2 process, and the vfio device
is not released because there's still a reference to the mmap.

Fixes: #1669

Signed-off-by: Yang, Wei w90p710@gmail.com

@devimc
Copy link

devimc commented May 13, 2019

/test

@devimc
Copy link

devimc commented May 13, 2019

cc @amshinde

@xs3c xs3c force-pushed the fix-vfio-hang branch 2 times, most recently from 4098830 to 01f1646 Compare May 13, 2019 16:51
Copy link

@jodh-intel jodh-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @xs3c.

lgtm

@jodh-intel
Copy link

Ahh, but the CI caught a small problem:

Found 1 commit between commit 01f1646fc06186101a7d0765e1f4abbda6a4bcf7 and branch master
ERROR: Commit 01f1646fc06186101a7d0765e1f4abbda6a4bcf7: Failed to find subsystem in subject: "shim v2: Close vhostfd after vm get vhostfd"
ERROR: checkcommits failed. See the document below for help on formatting
commits for the project.

That's arguably a bug with our commit checker but the solution is simple: change the "shim v2:" prefix to "shim-v2:" I think ;)

Copy link
Member

@amshinde amshinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@egernst
Copy link
Member

egernst commented May 13, 2019

/test

@xs3c - I ammended your commit header to make travis happy (to me, that static failure was lame, so I wanted to make your life easier). Thx - LGTM.

@egernst egernst added the bug Incorrect behaviour label May 13, 2019
@lifupan
Copy link
Member

lifupan commented May 14, 2019

Hi @xs3c , it seems you didn't do gofmt to the files you changed, that's why the CI failed.

@xs3c xs3c force-pushed the fix-vfio-hang branch from 46dca22 to 3cb5215 Compare May 15, 2019 05:32
@xs3c
Copy link
Contributor Author

xs3c commented May 15, 2019

@lifupan Thx fupan, I did gofmt these files and then pushed it again. @egernst I used the commit log you modified ,so I also keep your SOF.

@xs3c
Copy link
Contributor Author

xs3c commented May 15, 2019

/test

1 similar comment
@lifupan
Copy link
Member

lifupan commented May 15, 2019

/test

@egernst
Copy link
Member

egernst commented May 15, 2019

still looks good, restarting what appears to be a flaky test failure on Ubuntu 18.04.

If kata containers is using vfio and vhost net,the unbinding
of vfio would be hang. In the scenario, vhost net kernel thread
takes a reference to the qemu's mm, and the reference also includes
the mmap regions on the vfio device file. so vhost kernel thread
would be not released when qemu is killed as the vhost file
descriptor still is opened by shim v2 process, and the vfio device
is not released because there's still a reference to the mmap.

Fixes: kata-containers#1669

Signed-off-by: Yang, Wei <w90p710@gmail.com>
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
@xs3c xs3c force-pushed the fix-vfio-hang branch from 3cb5215 to 071030b Compare May 16, 2019 05:31
@bergwolf
Copy link
Member

/test

@lifupan lifupan merged commit 100db8a into kata-containers:master May 21, 2019
This was referenced Jun 3, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

bug Incorrect behaviour

Projects

None yet

Development

Successfully merging this pull request may close these issues.

the unbinding of vfio device would be hang when using vhost and vfio

8 participants