
[1.12] raft: Fix campaign on restart#1588

Merged
aluzzardi merged 1 commit into moby:bump_v1.12.2 from aaronlehmann:fix-campaign-on-restart on Sep 30, 2016

Conversation

@aaronlehmann (Collaborator)

In #1221, the code was changed not to call Campaign on restart unless
n.Node.Status().Progress[n.Config.ID] was set. The problem is that this
is never set unless the node is already the leader, so this effectively
disabled the quick leader election on restart for a single-node cluster.

Instead of checking Progress, figure out whether it's safe to call
Campaign by setting a flag when we apply the config change that adds the
local node to the raft state machine. This flag is set in registerNode,
which is technically also called when loading a snapshot, but in this
case the node is restored into the state machine through ConfState, so
setting the flag in both cases is correct.

Also, only enable the calls to Campaign when the node was restarted from
state on disk.
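
A minimal sketch of the flag-based approach described above, in Go. The names below (Node, registerNode, canCampaign, restoredFromDisk, members) are simplified stand-ins for illustration, not swarmkit's actual types or signatures:

```go
package main

import "fmt"

// Node is a pared-down stand-in for swarmkit's raft node; fields and
// method names are illustrative, not the real swarmkit code.
type Node struct {
	id               uint64
	members          map[uint64]bool // membership, updated as config changes apply
	restoredFromDisk bool            // true when WAL/snapshot state was found on start
	addedSelf        bool            // true once the raft state machine knows about us
}

// registerNode runs when a config change (or a snapshot's ConfState)
// adds a member to the raft state machine.
func (n *Node) registerNode(raftID uint64) {
	n.members[raftID] = true
	if raftID == n.id {
		// The state machine now includes the local node, so an
		// immediate Campaign call is safe.
		n.addedSelf = true
	}
}

// canCampaign reports whether to call Campaign right away instead of
// waiting out a full election timeout.
func (n *Node) canCampaign() bool {
	return n.restoredFromDisk && n.addedSelf && len(n.members) == 1
}

func main() {
	n := &Node{id: 1, members: map[uint64]bool{}, restoredFromDisk: true}
	n.registerNode(1) // apply the config change that adds the local node
	fmt.Println("campaign immediately:", n.canCampaign()) // true
}
```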

This PR is against the 1.12.2 branch. The fix does not seem necessary on master. I'm going to open a separate PR there to simplify the code.

cc @mrjana @LK4D4

@aaronlehmann aaronlehmann added this to the 1.12.2 milestone Sep 29, 2016
@aaronlehmann aaronlehmann changed the title raft: Fix campaign on restart [1.12] raft: Fix campaign on restart Sep 29, 2016
aaronlehmann pushed a commit to aaronlehmann/swarmkit that referenced this pull request Sep 29, 2016
This brings the code on master in sync with the 1.12 changes in moby#1588.
The code on master does not have the same problem (since a newer
etcd/raft library lets it skip the problematic check), but the version
being adopted by 1.12 should be a bit more robust.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
@codecov-io commented Sep 29, 2016

Current coverage is 55.06% (diff: 100%)

Merging #1588 into bump_v1.12.2 will increase coverage by 0.08%

@@           bump_v1.12.2      #1588   diff @@
==============================================
  Files                78         78          
  Lines             12559      12561     +2   
  Methods               0          0          
  Messages              0          0          
  Branches              0          0          
==============================================
+ Hits               6905       6917    +12   
+ Misses             4695       4686     -9   
+ Partials            959        958     -1   

Last update de507ff...f56cc6d

@mrjana (Contributor) commented Sep 29, 2016

Tested this patch on top of the 1.12.2-rc1 code, and the delay in cluster restart appears to be fixed:

Sep 29 13:36:29 ip-172-31-27-46.us-west-1.compute.internal dockerd[28350]: time="2016-09-29T13:36:29.222161100-04:00" level=info msg="Loading containers: done."
Sep 29 13:36:29 ip-172-31-27-46.us-west-1.compute.internal dockerd[28350]: time="2016-09-29T13:36:29.245711430-04:00" level=info msg="3f3e8717c9a69bbb became follower at term 2"
Sep 29 13:36:29 ip-172-31-27-46.us-west-1.compute.internal dockerd[28350]: time="2016-09-29T13:36:29.245742196-04:00" level=info msg="newRaft 3f3e8717c9a69bbb [peers: [], term: 2, commit: 19, applied: 0, lastindex: 19, lastterm: 2]"
Sep 29 13:36:29 ip-172-31-27-46.us-west-1.compute.internal dockerd[28350]: time="2016-09-29T13:36:29.246899745-04:00" level=info msg="Listening for connections" addr="[::]:2377" proto=tcp
Sep 29 13:36:29 ip-172-31-27-46.us-west-1.compute.internal dockerd[28350]: time="2016-09-29T13:36:29.247120263-04:00" level=info msg="Listening for local connections" addr="/var/lib/docker/swarm/control.sock" proto=unix
Sep 29 13:36:29 ip-172-31-27-46.us-west-1.compute.internal dockerd[28350]: time="2016-09-29T13:36:29.250285413-04:00" level=info msg="3f3e8717c9a69bbb is starting a new election at term 2"
Sep 29 13:36:29 ip-172-31-27-46.us-west-1.compute.internal dockerd[28350]: time="2016-09-29T13:36:29.250338902-04:00" level=info msg="3f3e8717c9a69bbb became candidate at term 3"
Sep 29 13:36:29 ip-172-31-27-46.us-west-1.compute.internal dockerd[28350]: time="2016-09-29T13:36:29.250349018-04:00" level=info msg="3f3e8717c9a69bbb received vote from 3f3e8717c9a69bbb at term 3"
Sep 29 13:36:29 ip-172-31-27-46.us-west-1.compute.internal dockerd[28350]: time="2016-09-29T13:36:29.250370675-04:00" level=info msg="3f3e8717c9a69bbb became leader at term 3"
Sep 29 13:36:29 ip-172-31-27-46.us-west-1.compute.internal dockerd[28350]: time="2016-09-29T13:36:29.250380316-04:00" level=info msg="raft.node: 3f3e8717c9a69bbb elected leader 3f3e8717c9a69bbb at term 3"

Review thread on manager/state/raft/raft.go (outdated):

    wal         *wal.WAL
    snapshotter *snap.Snapshotter
    restored    bool
    addedSelf   bool // true if the raft state machine knows about the local node

Contributor:

Can't we just check in n.cluster?

Contributor:

Oh, sorry, I went directly to the code without reading the description.

@LK4D4 (Contributor) commented Sep 29, 2016

LGTM

@aaronlehmann (Collaborator, Author)

This seems to break TestApiSwarmManagerRestore, but I don't understand why:

----------------------------------------------------------------------
FAIL: docker_api_swarm_test.go:661: DockerSwarmSuite.TestApiSwarmManagerRestore

[d79866457] waiting for daemon to start
[d79866457] daemon started
[d79866457] exiting daemon
[d79866457] waiting for daemon to start
[d79866457] daemon started
[d13403274] waiting for daemon to start
[d13403274] daemon started
[d13403274] exiting daemon
[d13403274] waiting for daemon to start
[d13403274] daemon started
docker_api_swarm_test.go:677:
    d2.getService(c, id)
github.com/docker/docker/integration-cli/_test/_obj_test/daemon_swarm.go:156:
    ...open github.com/docker/docker/integration-cli/_test/_obj_test/daemon_swarm.go: no such file or directory
... obtained int = 500
... expected int = 200
... output: "{\"message\":\"rpc error: code = 2 desc = raft: no elected cluster leader\"}\n"

[d79866457] exiting daemon
[d13403274] exiting daemon

----------------------------------------------------------------------

Maybe this just exposes flakiness in the test?

@aaronlehmann (Collaborator, Author)

Passed after a second run. So maybe the test is just flaky on my setup. I can't understand how this code change would cause this failure. It shouldn't change the behavior when d2 restarts (and even if it did, that shouldn't cause problems).

@aluzzardi (Member)

Waiting for #1584 (or a similar fix) to land before merging this, since merging now would hide the underlying error.

Commit f56cc6d:

In moby#1221, the code was changed not to call Campaign on restart unless
n.Node.Status().Progress[n.Config.ID] was set. The problem is that this
is never set unless the node is already the leader, so this effectively
disabled the quick leader election on restart for a single-node cluster.

Instead of checking Progress, figure out whether it's safe to call
Campaign by checking if we are in the member list. This list is kept in
sync with the config changes we apply in raft.

Also, only enable the calls to Campaign when the node was restarted from
state on disk.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
@aaronlehmann force-pushed the fix-campaign-on-restart branch from 65a09d7 to f56cc6d on September 29, 2016 22:51
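
Under the same simplified model as the earlier sketch, the updated approach drops the dedicated flag and consults the membership list directly (again with hypothetical names, not swarmkit's real types):

```go
package main

import "fmt"

// Node mirrors the earlier sketch, minus the addedSelf flag.
type Node struct {
	id               uint64
	members          map[uint64]bool // kept in sync with applied raft config changes
	restoredFromDisk bool
}

// canCampaign: membership itself tells us whether the raft state
// machine already knows about the local node.
func (n *Node) canCampaign() bool {
	return n.restoredFromDisk && len(n.members) == 1 && n.members[n.id]
}

func main() {
	n := &Node{
		id:               1,
		members:          map[uint64]bool{1: true},
		restoredFromDisk: true,
	}
	fmt.Println("campaign immediately:", n.canCampaign()) // true
}
```

Because the member list only changes when raft applies a config change, membership implies the state machine already contains the local node, which is exactly the safety condition the addedSelf flag encoded.
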
@aaronlehmann (Collaborator, Author)

Updated based on feedback in #1589.

@LK4D4 (Contributor) commented Sep 30, 2016

Patch LGTM. I think it's probably really cool to have it in 1.12.1

@vieux (Contributor) commented Sep 30, 2016

@LK4D4 you mean 1.12.2 I believe

@LK4D4 (Contributor) commented Sep 30, 2016

@vieux yup

@aluzzardi aluzzardi merged commit 5f934bd into moby:bump_v1.12.2 Sep 30, 2016
@aaronlehmann aaronlehmann deleted the fix-campaign-on-restart branch September 30, 2016 19:45
niau pushed a commit to vasil-yordanov/swarmkit that referenced this pull request Oct 21, 2016