Step down if local_put fails in leader worker#27

Merged
borshop merged 1 commit into develop from
bugfix/step-down-on-local-failure
Jun 24, 2014

Conversation

@andrewjstone
Contributor

If the leader fails to write locally, it cannot obtain a valid quorum, because every valid quorum includes the leader. The leader therefore needs to step down to avoid committing unsafely.

In riak_ensemble_peer:put_obj/4 when local_put/4 fails, step down by
sending a request_failed message to the leader.
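
A minimal sketch of the idea, written as a toy Python model rather than the actual Erlang change. The `Peer` and `Leader` classes, the return values, and the majority rule are all assumptions made for illustration; only the names `put_obj` and `local_put` are borrowed from the description above, and the real code steps down by sending a request_failed message rather than by flipping a flag.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical stand-in for one ensemble member's local storage."""
    name: str
    healthy: bool = True
    store: dict = field(default_factory=dict)

    def local_put(self, key, value):
        # Return True on success, False if the local write failed.
        if not self.healthy:
            return False
        self.store[key] = value
        return True


@dataclass
class Leader:
    """Hypothetical leader that replicates a write to its followers."""
    me: Peer
    followers: list
    is_leader: bool = True

    def put_obj(self, key, value):
        if not self.is_leader:
            return "not_leader"
        # The leader's own write must succeed first: a quorum that does
        # not include the leader is not valid, so follower acks cannot
        # make up for a failed local write.
        if not self.me.local_put(key, value):
            self.is_leader = False  # step down instead of committing
            return "failed"
        # Count the leader's ack plus each follower ack (True counts as 1).
        acks = 1 + sum(f.local_put(key, value) for f in self.followers)
        quorum = (1 + len(self.followers)) // 2 + 1  # assumed simple majority
        # Handling of a missed follower quorum is out of scope for this sketch.
        return "ok" if acks >= quorum else "failed"
```

In this model, a leader whose own `local_put` fails gives up leadership before contacting any follower, so no write can be acknowledged on the strength of follower acks alone.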

@andrewjstone
Contributor Author

Fixes #2

@jtuple
Contributor

jtuple commented Jun 16, 2014

/cc basho/riak#536

@jtuple jtuple assigned andrewjstone and jtuple and unassigned andrewjstone and jtuple Jun 16, 2014
@lordnull

Had to rebase to get ee to compile. The ensemble develop branch introduced a type that does not exist in this branch.

riak_test run:

./riak_test -c ee -t ensemble_basic -t ensemble_basic2 -t ensemble_basic3 -t ensemble_basic4 -t ensemble_interleave -t ensemble_remove_node -t ensemble_remove_node2 -t ensemble_start_without_aae -t ensemble_sync -t ensemble_util -t ensemble_vnode_crash

On the first run, sync, remove_node2, and remove_node all failed. Each of them passed when run individually, and a second run of all the ensemble tests passed.

Code is nicely contained.

I'll run the ensemble suite a couple more times to see if the failure can be replicated.

@lordnull

👍 02ee10c

At least one test will time out when running the full suite, but each test passes individually, so I don't think the changes in this PR are responsible for that. Since that was the only concern, this is ready for merging.

@andrewjstone
Contributor Author

@borshop merge

In riak_ensemble_peer:put_obj/4 when local_put/4 fails, step down by
sending a 'request_failed' message to the leader.
@andrewjstone
Contributor Author

👍 8f6efad

borshop added a commit that referenced this pull request Jun 24, 2014
Step down if local_put fails in leader worker

Reviewed-by: andrewjstone
@andrewjstone
Contributor Author

@borshop merge

@borshop borshop merged commit 8f6efad into develop Jun 24, 2014
@seancribbs seancribbs deleted the bugfix/step-down-on-local-failure branch April 1, 2015 23:15