IBD Handle Syncer Pruning Movement #702
Conversation
Open Issues

(a) Even if catchup is possible, it might still be worthwhile to attempt to sync from another peer who has not yet pruned and still maintains the missing block-body segment, rather than downloading an entire fresh UTXO set. A more robust implementation could check with various peers and only initiate catchup once no other choice remains. The same can even apply to the syncer as is: while its pruning point has changed, its retention point/root may lag behind (intentionally, or simply due to slow pruning), so it would still be able to send the relevant data if the syncee could recognize this scenario.

(b) The transitional states possibly break some implicit assumptions made by the node and its clients about the status of an active consensus, which demands further thought on how best to address them. Possible solutions involve cloning the consensus, which is expensive in both space and time and relatively complex. A variant of this could perhaps allow consensus instances to share most but not all of the stores, and thus only clone the very limited amount of storage touched by the transitional state (namely, stores affected by body validation).
* Remove temporary dust prevention mechanism
* Disable uninlined_format_args lint
* Apply workspace lints to all crates
* clippy

(branch force-pushed from d755a1e to 989cd5b)
Addendum to the above comment on open issues: in hindsight I came to realize that (b1) probably also applies to the virtual UTXO set, which can end up in limbo if the node is reset at very specific times. This seems more acute than the pruning_utxo_set case, but I believe it can be solved in a similar manner to that described above.
(c) Following a "catchup", a lot of old data can usually be pruned right away. For simplicity's and safety's sake, the current implementation will not start pruning until pruning_depth worth of block data has accumulated on top of the new pruning point. While superficially reasonable, this prevents even the pruning of old data from several pruning depths ago. A user repeatedly "catching up" at the worst possible times (exactly before pruning would trigger, every time) could find themselves with unexpectedly large storage, though it is hard to imagine anything too extreme.
coderofstuff
left a comment
Disembodied Anticone must be changed in another PR, before a release
coderofstuff
left a comment
@someone235 @freshair18 up to you if you want to fix these comments here or in a different PR after this is merged
consensus/src/consensus/mod.rs
Outdated
// Verify that the new pruning point can be safely imported
// and return all new pruning point on path to it that needs to be updated in consensus
fn get_and_verify_novel_pruning_points(&self, new_pruning_point: Hash, syncer_sink: Hash) -> ConsensusResult<VecDeque<Hash>> {
Update the function docs to use /// per the Rust standard.
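For illustration, the same hunk with the comments switched to doc comments (only the comment markers change):

```rust
/// Verify that the new pruning point can be safely imported
/// and return all new pruning point on path to it that needs to be updated in consensus
fn get_and_verify_novel_pruning_points(&self, new_pruning_point: Hash, syncer_sink: Hash) -> ConsensusResult<VecDeque<Hash>> {
```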
consensus/src/consensus/mod.rs
Outdated
}
info!("Setting {new_pruning_point} as the pruning point");
// 4) The pruning points declared on headers on that path must be consistent with those already known by the node:
let pruning_point_read = self.pruning_point_store.upgradable_read();
Why does this need to be an upgradable_read as opposed to read()?
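For reference, a rough sketch of the practical difference, assuming the store lock is a parking_lot-style RwLock (illustrative only, not the PR's code):

```rust
use parking_lot::{RwLock, RwLockUpgradableReadGuard};

fn example(store: &RwLock<Vec<u64>>) {
    // read(): any number of concurrent readers, but the guard can never be
    // turned into a writer; you'd have to drop it and re-acquire a write lock,
    // leaving a window where the state can change underneath you.
    let r = store.read();
    let len = r.len();
    drop(r);

    // upgradable_read(): coexists with plain readers, but only one upgradable
    // reader at a time, and it can be atomically upgraded to a write guard
    // without releasing the lock in between.
    let ur = store.upgradable_read();
    if ur.is_empty() {
        let mut w = RwLockUpgradableReadGuard::upgrade(ur);
        w.push(len as u64);
    }
}
```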
consensus/src/consensus/mod.rs
Outdated
// Verify that the new pruning point can be safely imported
// and return all new pruning point on path to it that needs to be updated in consensus
fn get_and_verify_novel_pruning_points(&self, new_pruning_point: Hash, syncer_sink: Hash) -> ConsensusResult<VecDeque<Hash>> {
Can you split this into two parts:
- is_pruning_point_importable - checks whether the new pruning point can be imported
- get_path_to_pruning_point - gets the pp path to the new pruning point (if possible)
?
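Roughly, the proposed split could look like the following (just a sketch; the bodies and exact signatures are placeholders derived from the current function):

```rust
/// Checks whether `new_pruning_point` can be safely imported.
fn is_pruning_point_importable(&self, new_pruning_point: Hash, syncer_sink: Hash) -> ConsensusResult<()> {
    todo!("validation-only half of the current logic")
}

/// Returns the pruning-point path up to `new_pruning_point`, if one exists.
fn get_path_to_pruning_point(&self, new_pruning_point: Hash, syncer_sink: Hash) -> ConsensusResult<VecDeque<Hash>> {
    todo!("path-collection half of the current logic")
}
```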
Tell me what you think, but it feels a tad unnatural to me: verifying the new pruning point consists of getting these new pruning points and seeing that they form a coherent "chain", so the getting is itself part of the verification.
Hmm, ok, let me explain what I'm thinking. The function signature takes in a pruning point (presumably a new one). The name get_and_verify_novel_pruning_points appears to indicate that you are attempting to verify some set of pruning points. It's inconsistent with the input, since you're just passing in one. So if someone were reading this function signature, you are left to wonder "what pruning points are you referring to when you're just passing in one?"
If the intent is as you describe, can I propose a rename of this to get_and_verify_path_to_new_pruning_point. From this name, you can tell there's a reference to a new pruning point and you know you'll be receiving a path to that new pruning point.
How did you conduct tests for this feature? As in, what is the setup so I can run a similar test locally?
Addresses #679
IBD Type Determination
A new IBD type is added, currently called pruning_catchup. Disregarding some subtleties, this IBD type is triggered when all the following are fulfilled:
Conveniently, negotiate_missing_syncer_chain_segment allows for an easy way to derive the syncer's current pruning point hash.
Validation Before Movement
Before any sensitive and irreversible step, the node first downloads and validates headers from the syncer up to its declared sink. "Destructive" changes only occur when:
1) The syncer pruning point is a valid pruning sample (it satisfies the blue_score requirements to be a pruning point).
2) There are sufficiently many headers built on top of it; specifically, the validated blue_score of the syncer's sink header is greater than P.blue_score + pruning_depth (where P is the syncer's pruning point).
3) The syncer pruning point is on the selected chain from the syncer's sink, and any pruning points declared on headers along that path are consistent with those already known.
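A rough sketch of these checks (the Hash alias, the HeaderView trait, and its methods are illustrative placeholders, not the actual consensus API):

```rust
type Hash = [u8; 32]; // placeholder for the node's hash type

/// Hypothetical read-only view of the header queries the checks below need.
trait HeaderView {
    fn blue_score(&self, block: Hash) -> u64;
    fn is_valid_pruning_sample(&self, block: Hash) -> bool;
    fn is_on_selected_chain_of(&self, sink: Hash, block: Hash) -> bool;
    fn declared_pruning_points_consistent(&self, from: Hash, to: Hash) -> bool;
}

fn can_move_pruning_point<V: HeaderView>(
    view: &V,
    syncer_pp: Hash,
    syncer_sink: Hash,
    pruning_depth: u64,
) -> Result<(), &'static str> {
    // 1) The syncer pruning point must be a valid pruning sample
    //    (it satisfies the blue_score requirements to be a pruning point).
    if !view.is_valid_pruning_sample(syncer_pp) {
        return Err("syncer pruning point is not a valid pruning sample");
    }
    // 2) Sufficiently many headers must be built on top of it: the validated blue
    //    score of the syncer's sink must exceed blue_score(pp) + pruning_depth.
    if view.blue_score(syncer_sink) <= view.blue_score(syncer_pp) + pruning_depth {
        return Err("not enough headers above the syncer pruning point");
    }
    // 3) The pruning point must lie on the selected chain from the syncer's sink,
    //    and pruning points declared on headers along that path must agree with
    //    those already known to the node.
    if !view.is_on_selected_chain_of(syncer_sink, syncer_pp) {
        return Err("syncer pruning point is not on the sink's selected chain");
    }
    if !view.declared_pruning_points_consistent(syncer_pp, syncer_sink) {
        return Err("declared pruning points along the path are inconsistent");
    }
    Ok(())
}
```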
Transitional States During Catchup
Updating to a new pruning point conceptually consists of three stages: moving the pruning point itself, downloading the corresponding pruning UTXO set, and downloading and validating the block bodies of the new pruning point's anticone.
During IBD_with_headers_proof (as previously implemented), these three stages are performed atomically using a "discardable" staging consensus: either it goes through all of them and only then is committed, or the current consensus remains active.
Unlike an IBD with headers proof, pruning_catchup inherently builds on the information in the current consensus rather than starting from scratch.
The current implementation hence introduces transitional states, with corresponding "flags", for the intermediary cases where the pruning point movement has occurred but a new pruning UTXO set has not yet been downloaded, and/or the anticone's block bodies have not all gone through verification. The required anticone in particular is maintained by computing and storing it already during the pruning point movement, computed in relation to the syncer's sink. (In theory this maintained set could be shrunk on the fly as more bodies are synced, but at the moment it is maintained in an all-or-nothing manner, since sending already-validated blocks to validation causes no harm and is fast enough.)
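Conceptually, the transitional bookkeeping looks roughly like the following (field and method names are illustrative placeholders, not the PR's actual flags or stores):

```rust
type Hash = [u8; 32]; // placeholder for the node's hash type

struct CatchupState {
    /// The pruning point has moved, but the matching pruning UTXO set has not
    /// yet been downloaded.
    awaiting_pruning_utxo_set: bool,
    /// Anticone of the new pruning point, computed relative to the syncer's sink
    /// during the movement and kept whole (all-or-nothing) until every body is
    /// validated.
    stored_anticone: Vec<Hash>,
    /// Set once all anticone block bodies have gone through body validation.
    anticone_bodies_verified: bool,
}

impl CatchupState {
    /// The consensus is considered transitional while either condition is pending.
    fn is_transitional(&self) -> bool {
        self.awaiting_pruning_utxo_set || !self.anticone_bodies_verified
    }
}
```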
Since these intermediary states are easy to recognize, they can simply be handled in future syncs. The transitional states are not abusable given the standard security assumption of an honest majority at every pruning period: since we synced sufficiently many headers on top of the pruning point, we know the syncee's DAG on top of it represents the honest network, hence its pruning point corresponds to a valid pruning UTXO set, and the blocks in the anticone must have had block bodies, or the honest network would have "rejected" this DAG (more precisely, the pruning point would not have been on its selected chain). It is worth remarking that the same assumption was used previously when choosing to commit a staging consensus before all synced blocks underwent validation.
Transitional States Security
Pruning: generally, pruning is not activated unless a virtual task has completed, and hence would not be called while in the limbo state of a missing UTXO set. To be on the safe side, it is explicitly confirmed that we are not in a transitional state before attempting to naturally advance the pruning_utxo_set. This could perhaps be turned into an assert.
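Roughly, this guard amounts to the following (building on the hypothetical CatchupState sketch above; the function name is a placeholder):

```rust
fn maybe_advance_pruning_utxo_set(state: &CatchupState) {
    // Never advance the pruning UTXO set while the consensus is still in a
    // transitional catchup state; as noted above, this check could eventually
    // become an assert.
    if state.is_transitional() {
        return;
    }
    // ... advance the pruning point UTXO set as usual ...
}
```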
Block Relay: a check is added for whether the consensus is in a transitional state; if it is, the relayed block is immediately sent to IBD.
HandleRelayBlockRequests: the node will ignore requests to send over its sink while it is in a transitional state, to avoid log clutter and disconnections due to a potential missing-block error.
For simplicity both transitional states are checked in all the above, though at times a distinction could be made between them.
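Put together, the relay-side handling described above amounts to something like the following (again using the hypothetical CatchupState; names are illustrative):

```rust
enum RelayAction {
    /// Block relay while transitional: hand the block over to IBD instead.
    RouteToIbd,
    /// HandleRelayBlockRequests while transitional: don't serve our sink,
    /// avoiding log clutter and disconnects from a missing-block error.
    IgnoreRequest,
    /// Not transitional: proceed with normal relay handling.
    Normal,
}

fn relay_decision(state: &CatchupState, serving_sink_request: bool) -> RelayAction {
    if state.is_transitional() {
        if serving_sink_request {
            return RelayAction::IgnoreRequest;
        }
        return RelayAction::RouteToIbd;
    }
    RelayAction::Normal
}
```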
A final sidenote: advancing the pruning_utxo_set (and pruning in general) is also prevented until the virtual is sufficiently above the pruning point. This ultimately stems from technicalities and may be modified in the future, but it seems fine to prevent pruning while block-body IBD is still taking place.