Skip to content

Conversation

@JacksonYao287
Copy link
Contributor

@JacksonYao287 JacksonYao287 commented Nov 13, 2024

when follower hits some error before appending log entries, it will set batch_size_hint_in_bytes to -1 to ask leader do not send more log entries in the next append_log_req.
https://github.com/eBay/NuRaft/blob/eabdeeda538a27370943f79a2b08b5738b697ac3/src/handle_append_entries.cxx#L760

in nuobject case , if a new member is added to a raft group and it tries to append create_shard log entry , which will try to alllocate block from the chunks of the pg, before the create_pg log is committed , which will allocated chunks to this pg, and error will happen and the log batch containing create_shard log entry will be wholy rejected and set batch_size_hint_in_bytes to -1 in the response to leader.

this pr aims to set the log count in the next batch sent to follower to 1, so that:

if the create_pg and create_shard are in the same log batch , the pr will first reject this log batch and leader will send only create_pg in the next batch , which will be accepted by follower , since it will only create this pg.

if the create_pg and create_shard are not in the same log batch, and create_shard is trying to allocate block before the pg it created(chunks of this pg is alllocated), then , with this pr, follower will reject this batch so that it will give more time to creating pg. create_shard log will be resent in the next batch , and at that moment pg has probably already been successfully created.

@codecov-commenter
Copy link

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 66.66667% with 1 line in your changes missing coverage. Please review.

Project coverage is 67.33%. Comparing base (1a0cef8) to head (939e1b2).
Report is 89 commits behind head on master.

Files with missing lines Patch % Lines
.../lib/replication/log_store/home_raft_log_store.cpp 66.66% 0 Missing and 1 partial ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@             Coverage Diff             @@
##           master     #588       +/-   ##
===========================================
+ Coverage   56.51%   67.33%   +10.82%     
===========================================
  Files         108      109        +1     
  Lines       10300    10725      +425     
  Branches     1402     1467       +65     
===========================================
+ Hits         5821     7222     +1401     
+ Misses       3894     2801     -1093     
- Partials      585      702      +117     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@xiaoxichen xiaoxichen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

if (batch_size_hint_in_bytes < 0) end = start + 1;

// for the case where batch_size_hint_in_bytes >= 0, we do not take any size check here for now.
// TODO: limit the size of the returned entries by batch_size_hint_in_bytes int the future if necessary
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yamingk we discussed pressure_back , do you think we can use the batch_size_hint for this purpose?
i.e. follower can respond with batch_size_hint to leader through the response of appendEntries request.

Copy link
Contributor

@yamingk yamingk Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good finding. yes, looks like follower can set the next_batch_size_hint in append_entries_resp and leader will set it in its next batch size and consume it when next batch happens:
https://github.com/eBay/NuRaft/blob/eabdeeda538a27370943f79a2b08b5738b697ac3/src/handle_append_entries.cxx#L1069~1072

@xiaoxichen xiaoxichen merged commit 328cef3 into eBay:master Nov 13, 2024
@JacksonYao287 JacksonYao287 deleted the add-log_entries_ext branch November 13, 2024 06:33
hkadayam pushed a commit to hkadayam/HomeStore that referenced this pull request Aug 7, 2025
when follower hits some error before appending log entries, it will set batch_size_hint_in_bytes to -1 to ask leader do not send more log entries in the next append_log_req.
https://github.com/eBay/NuRaft/blob/eabdeeda538a27370943f79a2b08b5738b697ac3/src/handle_append_entries.cxx#L760

in nuobject case , if a new member is added to a raft group and it tries to append create_shard log entry , which will try to alllocate block from the chunks of the pg, before the create_pg log is committed , which will allocated chunks to this pg, and error will happen and the log batch containing create_shard log entry will be wholy rejected and set batch_size_hint_in_bytes to -1 in the response to leader.

this pr aims to set the log count in the next batch sent to follower to 1, so that:

if the create_pg and create_shard are in the same log batch , the pr will first reject this log batch and leader will send only create_pg in the next batch , which will be accepted by follower , since it will only create this pg.

if if the create_pg and create_shard are not in the same log batch, and create_shard is trying to allocate block before the pg it created(chunks of this pg is alllocated), then , with this pr, follower will reject this batch so that it will give more time to creating pg. create_shard log will be resent in the next batch , and at that moment pg has probably already been successfully be created.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants