Workaround kernel bugs related to namespaces #223
mavenugo merged 1 commit into moby:docker1.7.0_integ
This PR attempts to work around bugs present in kernel versions 3.18 through 4.0.1 relating to namespace creation and destruction. The fix avoids certain system calls so as not to hit the kernel bug path, and lazily garbage collects the namespace paths when they are removed.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>

LGTM. Thanks for the quick fix.

LGTM

@mrjana would you mind explaining what specific kernel bugs this is addressing? Leaving the removal of the ns path to the periodic GC thread, instead of doing so eagerly in

@rade It was introduced to avoid a kernel race bug in mount shared subtrees (/var/run is typically mounted as one by many distros), which wasn't fixed until 3.19 IIRC. We have to work around the issue because we can't control kernel versions. But the namespace itself gets removed immediately; only the namespace file gets garbage collected after 60 seconds. So there shouldn't be any holding up of network resources of any kind.

@rade The commit is torvalds/linux@8f502d5 so actually it wasn't fixed until 4.1. The issue in particular is the crash happening inside

Thanks @mrjana. But:
Containers started with As you can see, even though Kernel versions prior to 3.18 behaved differently: the unlink of the namespace mount point file would fail with EBUSY, and the other container could hold onto the netns forever. So kernel 3.18 improves the situation. But it would be better without the delay. Should I create an issue for this?

@mrjana ping

@dpw @rade I don't think we can remove the delayed garbage collection of namespace paths because of bugs in the kernel during

That is what we are going to do for our own containers. Unfortunately there are some quite popular containers out there which mount If this really isn't fixable, I suggest alerting developers to the problem with a note in the volume mount docs.

For the drivers that provide us with interfaces to push into the namespace using the network driver plugin protocol, we do move those interfaces out of the namespace when the container exits, and it is up to the driver to delete the interfaces (veths) when they get a For And in the 1.9 release, when the plugin framework becomes generally available

@mrjana Would it be possible to notify the GC goroutine in