
Conversation


@sanimej sanimej commented May 28, 2017

Fixes #1768

Noticed this issue while debugging a few test-case failures in the e2e test suite. This is likely the root cause of docker #33076 and of the service-access issue during container bring-up reported in docker #30321.

libnetwork IPAM recycles an IP address when a task goes down on one node and is brought up on another. For remote tasks, the overlay network namespace has two fdb entries for the task's MAC: a static entry programmed by the driver and a dynamic entry learned by the bridge from the data path when a packet is received from the remote container. The dynamic entry ages out after 300 seconds. If a task on a remote node goes down and gets rescheduled on another node, the stale dynamic fdb entry still remains, and unless the container generates some data traffic it won't be updated. This can make access to the container unpredictable: sometimes it works quickly, if there is traffic from the container and the MAC entry gets updated; if the container is completely silent, it can lead to up to 300 seconds of traffic loss.

The change in this PR is to disable MAC learning on the bridge. This is done only for the vxlan interface; for the local veth interfaces we still rely on MAC learning for container-to-container communication.

There is one caveat with this approach: when there are many local containers, traffic to a remote task would get replicated to all of those local endpoints, which can potentially be a performance issue. An alternative approach is to delete the dynamic MAC entry in the bridge, but on some older kernels deleting the dynamic entry doesn't work because of the default-vlan issue. It might still be possible to handle that case by setting the bridge's default vlan; if that is feasible, this fix can be changed to use that approach.

Signed-off-by: Santhosh Manohar santhosh@docker.com

@mavenugo
Contributor

code LGTM.

@sanimej regarding the caveat that you mentioned,

When there are many local containers, traffic to a remote task would get replicated to all those local endpoints.

isn't that true only when the remote static entry is not programmed? And for any such container movement, the control plane must converge and the static entry will be programmed across the nodes, so such flooding is temporary behavior until the entry appears. Is that correct?

@sanimej
Author

sanimej commented May 28, 2017

@mavenugo A local container's traffic hits the overlay bridge first, so without a learned entry in the bridge it would get replicated to all the ports, including the vxlan port. So I would expect the replication to always happen.

@mavenugo
Contributor

@sanimej in that case, is it safe to add a static fdb entry on the bridge as well and remove it when the container moves (just like we do on the vxlan device)?

@abhi
Contributor

abhi commented May 30, 2017

@mavenugo I think adding a static fdb entry on the bridge as well would prevent flooding in the silent/consumer-only container scenario too.

@sanimej
Author

sanimej commented Jun 1, 2017

@mavenugo We don't have to add a static fdb entry. We just have to delete the dynamic entry when the task goes down. But this doesn't work on some kernel versions because the default vlan is 1. PR #1792 has a fix for this: it sets the bridge default vlan to 0.


```go
func (i *nwIface) DisableLearning() bool {
	i.Lock()
	i.Unlock()
```


Was this meant to be deferred? The code doesn't look right as-is.

Author


Yes, it should have been defer. Thanks.

As mentioned in the description the flooding to the local ports is better avoided. I have pushed a PR with a different approach to address this issue, #1792. I will close this PR.

@sanimej
Author

sanimej commented Jun 2, 2017

#1792 fixes the problem without the drawback mentioned in this PR's description. Closing this.

@sanimej sanimej closed this Jun 2, 2017


Linked issue: service discovery between service task and unmanaged container takes long at times
