Skip to content

Conversation

@fcrisciani
Copy link

@fcrisciani fcrisciani commented Apr 27, 2017

This change serialize the init/close of the libnetwork agent into one single code path.
On swarm init there are several things that are appening in parallel, before the
initialization of the agent was happening thorugh differents code paths depending
from timing.
This change cleans up the entire init logic letting the init happening in the goroutine
clusterAgentInit.

Moby side: moby/moby#32906

Signed-off-by: Flavio Crisciani flavio.crisciani@docker.com

return err
}

ch, cancel := nDB.Watch(libnetworkEPTable, "", "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was actually a bug here.
At line 319 the ep table cancel function was expected:

epTblCancel: cancel,

agent.go Outdated
bindAddr string
advertiseAddr string
dataPathAddr string
epTblCancelFunList []func()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not only for endpoint tables, it is also for node event tables.
Please rename the variable to coreCancelFuncs. We already have a driverCancelFuncs, so that it is clear this new one contains the cancel functions for the tables followed by libnetwork core

@fcrisciani fcrisciani force-pushed the init_race_cond branch 3 times, most recently from 88b3799 to ed2c555 Compare May 3, 2017 18:33
@aboch
Copy link
Contributor

aboch commented May 3, 2017

LGTM

@fcrisciani
Copy link
Author

@aboch after a new change on the moby side I realized that there was no reason to change the ListenClusterEvent function so I just restored it. The provider interface again returns a read only channel as the original

@aboch
Copy link
Contributor

aboch commented May 4, 2017

Right, because we no longer need to push an event from inside libnetwork (I remember at some point the key ready msg was being sent from inside SetKey)

Still LGTM

This change cleans up the SetClusterProvider method.
Swarm calls the SetClusterProvider to pass to libnetwork the pointer
of the provider from which libnetwork can fetch all the information to
initialize the internal agent.

The method can be and is called multiple times passing the same value,
with the previous logic that was erroneusly spawning multiple go routines that
were making possiblea race between an agentInit and an agentClose.

The new logic aims to disallow it by checking for the provider passed and
ensuring that if the provider is already present there is nothing to do because
there is already an active go routine that is ready to process cluster events.
Moreover a patch on moby side takes care of clearing up the Cluster Events
dispacthing using only 1 channel to handle all the events types.
This will also guarantee in order event handling because now all the events are
piped into one single channel.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Copy link
Contributor

@mavenugo mavenugo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mavenugo mavenugo merged commit 6786135 into moby:master May 10, 2017
@fcrisciani fcrisciani mentioned this pull request Jun 9, 2017
@fcrisciani fcrisciani deleted the init_race_cond branch June 12, 2017 05:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants