Improve raftReplDev creation #695

yuwmao · 2025-04-22T15:11:32Z

Use user-defined config(e.g. priority) to create raftReplDev.

codecov-commenter · 2025-04-22T15:50:10Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 94.73684% with 1 line in your changes missing coverage. Please review.

Project coverage is 66.75%. Comparing base (1a0cef8) to head (9ed7ccf).
Report is 179 commits behind head on master.

Files with missing lines	Patch %	Lines
src/lib/replication/repl_dev/solo_repl_dev.h	0.00%	1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@             Coverage Diff             @@
##           master     #695       +/-   ##
===========================================
+ Coverage   56.51%   66.75%   +10.24%     
===========================================
  Files         108      109        +1     
  Lines       10300    11519     +1219     
  Branches     1402     1574      +172     
===========================================
+ Hits         5821     7690     +1869     
+ Misses       3894     3076      -818     
- Partials      585      753      +168

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

xiaoxichen

Reading this doc https://github.com/eBay/NuRaft/blob/1c2587f159775a4924573db64ffac5c973230761/docs/leader_election_priority.md#L16

I am concerning of open the priority of int32 to CM, as without understanding the implementation detail it probabaly set < int32_max, 1, 0> which will be a nightmare for leader election.

Probabaly HO can still provide a list of replica_id_t and HS decide the priority based on the internal knowledge, we can make the K configurable where K = round of election timeout needed.

Decay = (0.8^k) , for example K =2 make it <10000, 6400, 4096 >. (each with 0.64 decay meaning 2 election timeout).

yuwmao · 2025-04-23T03:15:22Z

Reading this doc https://github.com/eBay/NuRaft/blob/1c2587f159775a4924573db64ffac5c973230761/docs/leader_election_priority.md#L16

I am concerning of open the priority of int32 to CM, as without understanding the implementation detail it probabaly set < int32_max, 1, 0> which will be a nightmare for leader election.

Probabaly HO can still provide a list of replica_id_t and HS decide the priority based on the internal knowledge, we can make the K configurable where K = round of election timeout needed.

Decay = (0.8^k) , for example K =2 make it <10000, 6400, 4096 >. (each with 0.64 decay meaning 2 election timeout).

Good point, and the gap should be bigger than 10. Regarding to decide the priority based on the internal knowledge, I think we can only infer who is the leader although it's enough in our NuObject scenario.
There are two thoughts:

Make strict control of the priority setting in HS, make sure they are valid and we can define K as max round the election timeout, leaving some flexibility for CM.
Simplify the contract with CM, change priority to a bool which means who is assigned to the leader(maybe not necessary in proto, but at least CM should keep the intent in its DB). And in HS's replace_member, the in_member inherit out_member's priority directly based on srv_config.

I think option2 is simpler and more controllable.

xiaoxichen · 2025-04-23T03:31:39Z

Yes we discussed this a bit in CM meeting, currently there is no use-case we want to control the secondary-leader.

CM will nominate a leader, and CM will persist the leader in its DB as well as internal PG structure. So that CM is able to tell for any given PG if the leader has drifted.
When CM send create pg request, it will carry the leader intent, either through priority or we change the grpc protocol. But at least for now it is clear the one who receive the PG_CREATE is the leader.
HO/HS should decide the priority setting of NuRaft, based on the K configuration or other internal knowledge.
When replace member, the in_member inherit the priority ---though for the sake of completeness I believe the in_member will have a flag should_be_leader indicating if it supposed to be leader.

So basically it is your approach 2

xiaoxichen · 2025-04-24T16:40:43Z

src/lib/replication/service/raft_repl_service.cpp

+    auto max_wait_round = std::min(raft_priority_election_round_upper_limit,
+                                   HS_DYNAMIC_CONFIG(consensus.max_wait_rounds_of_priority_election));
+    if (max_wait_round == 0) { return raft_leader_priority; }
+    auto priority = 1 + static_cast< int32_t >(


this is a nice formula

xiaoxichen reviewed Apr 22, 2025

View reviewed changes

yuwmao changed the title ~~Improve raftReplDev creation~~ Improve raftReplDev creation[drafting] Apr 23, 2025

yuwmao force-pushed the priority branch from 9a51b37 to c4795c5 Compare April 24, 2025 14:43

xiaoxichen previously approved these changes Apr 24, 2025

View reviewed changes

yuwmao dismissed xiaoxichen’s stale review via 010668a April 25, 2025 00:55

yuwmao force-pushed the priority branch from c4795c5 to 010668a Compare April 25, 2025 00:55

yuwmao changed the title ~~Improve raftReplDev creation[drafting]~~ Improve raftReplDev creation Apr 25, 2025

xiaoxichen previously approved these changes Apr 25, 2025

View reviewed changes

Set priority when create RaftReplDev

9ed7ccf

yuwmao dismissed xiaoxichen’s stale review via 9ed7ccf April 25, 2025 05:24

yuwmao force-pushed the priority branch from 010668a to 9ed7ccf Compare April 25, 2025 05:24

xiaoxichen approved these changes Apr 25, 2025

View reviewed changes

xiaoxichen merged commit 460695b into eBay:master Apr 25, 2025
21 checks passed

hkadayam pushed a commit to hkadayam/HomeStore that referenced this pull request Aug 7, 2025

Set priority when create RaftReplDev (eBay#695)

91442fd

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve raftReplDev creation #695

Improve raftReplDev creation #695

Uh oh!

yuwmao commented Apr 22, 2025

Uh oh!

codecov-commenter commented Apr 22, 2025 •

edited

Loading

Uh oh!

xiaoxichen left a comment

Uh oh!

yuwmao commented Apr 23, 2025

Uh oh!

xiaoxichen commented Apr 23, 2025

Uh oh!

xiaoxichen Apr 24, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Improve raftReplDev creation #695

Improve raftReplDev creation #695

Uh oh!

Conversation

yuwmao commented Apr 22, 2025

Uh oh!

codecov-commenter commented Apr 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

xiaoxichen left a comment

Choose a reason for hiding this comment

Uh oh!

yuwmao commented Apr 23, 2025

Uh oh!

xiaoxichen commented Apr 23, 2025

Uh oh!

xiaoxichen Apr 24, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov-commenter commented Apr 22, 2025 •

edited

Loading