NDRS-62: Implement deploy validity checks #31
Conversation
```rust
});
// filter out invalid deploys
deploys_to_return.retain(|_deploy_hash, deploy| {
    self.is_deploy_valid(deploy, current_timestamp, max_ttl, max_dependencies, past)
```
In Scala we're purging the deploy buffer periodically.
That makes sense and should probably be added, but I'd say it's out of scope of this task.
Since this PR is already merged, can you create a ticket for this?
```rust
#[derive(Debug, Clone, Default)]
pub(crate) struct DeployBuffer {
    collected_deploys: HashMap<DeployHash, DeployHeader>,
```
Why just the header? Where will we get the body from?
The idea here is not to keep the deploy bodies in memory unnecessarily, only the info that is needed for the buffer to know what deploys we have and which could still be included in blocks. The bodies would be kept in storage and only read when we need to communicate them to someone else.
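To illustrate the split being described here, a minimal sketch (all names and types below are illustrative stand-ins, not the node's actual API): the buffer keeps only `DeployHeader`s keyed by hash, while a separate storage component owns the bodies and is queried by hash only when a body is actually needed.

```rust
use std::collections::HashMap;

// Illustrative stand-ins for the node's real types; the point is only the
// split between in-memory headers and bodies kept in storage.
type DeployHash = u64;

#[derive(Clone, Debug)]
struct DeployHeader {
    timestamp: u64,
    ttl: u64,
}

// The buffer tracks only headers: enough to decide which deploys can still
// be proposed, without holding the (potentially large) bodies in memory.
#[derive(Default)]
struct DeployBuffer {
    collected_deploys: HashMap<DeployHash, DeployHeader>,
}

// A toy storage component; the real one would be the node's storage layer.
#[derive(Default)]
struct Storage {
    bodies: HashMap<DeployHash, Vec<u8>>,
}

impl Storage {
    // Bodies are only read back when a deploy must be sent to a peer.
    fn get_body(&self, hash: &DeployHash) -> Option<&Vec<u8>> {
        self.bodies.get(hash)
    }
}

fn main() {
    let mut buffer = DeployBuffer::default();
    let mut storage = Storage::default();
    let hash: DeployHash = 42;
    buffer
        .collected_deploys
        .insert(hash, DeployHeader { timestamp: 0, ttl: 3_600 });
    storage.bodies.insert(hash, vec![1, 2, 3]);
    // Proposal decisions use headers alone…
    assert!(buffer.collected_deploys.contains_key(&hash));
    // …while the body is fetched from storage only on demand.
    assert_eq!(storage.get_body(&hash), Some(&vec![1, 2, 3]));
}
```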
So deploy buffer doesn't have access direct to deploy storage? How will those two communicate?
Since this PR is already merged, can you create a ticket for this?
```rust
pub(crate) struct DeployBuffer {
    collected_deploys: HashMap<DeployHash, DeployHeader>,
    processed: HashMap<BlockHash, HashMap<DeployHash, DeployHeader>>,
    finalized: HashMap<BlockHash, HashMap<DeployHash, DeployHeader>>,
```
It looks like this BlockHash neither belongs to a finalized block — it's used before execution, and thus before the state root hash is known, which should be part of the finalized block's hash — nor a Highway vote hash, but is rather an identifier for a (possibly pending) batch of deploys?
Should we call it BatchHash or DeployListHash or something?
Or will we actually ask Highway for the vote hash, and use that to identify the list?
It probably depends on whether we go with execution before consensus or not, doesn't it?
Of the two options, I'd prefer renaming, since it would work with both approaches: we need some kind of identifier before a block is finalized anyway.
Isn't this a hash of the SignedWireVote that carries the consensus values (deploy hashes)? And it is "finalized" when consensus outputs Finalized message. It doesn't need to have the post state hash (coming from the batch block).
Yes, that's what I meant with my last question above: We could pick the deploys, pass them into Highway, get back the vote hash (i.e. the hash of SignedWireVote) and use that as an identifier for the list. Not sure if that's a good idea.
Isn't this what happens in added_block? When consensus asks for deploys for the new block, it will call back into added_block.
You're right… so the intention with this is that the block hash is the Highway vote hash, I guess?
```rust
    self.is_deploy_valid(deploy, current_timestamp, max_ttl, max_dependencies, past)
});
// TODO: check gas and block size limits
deploys_to_return
```
Continuing this discussion, in the long run `collected_deploys` will probably be some kind of priority queue, usually much larger than `past`, and possibly even backed by a database. Also, the set of finalized deploys will grow indefinitely. I think:

- We should make sure `collected_deploys` never contains finalized deploys, so we don't need to filter them out here. I don't think a module should defensively program against violating its own invariants.
- We should avoid cloning `collected_deploys` or iterating over all of it. Instead, collect the `past` deploys into a set and use that to `filter` an iterator over `collected_deploys`. Then iterate only until the size or gas limit is reached.

(Doesn't have to be in this PR, of course, or in this sprint!)
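The suggested shape (collect the past deploys into a set, filter an iterator over the collected deploys against it, and stop once the budget is spent) can be sketched as follows. The function name, the size-only budget, and the simplified types are assumptions for illustration, not the node's actual API; a `BTreeMap` is used only to make iteration order deterministic in the sketch.

```rust
use std::collections::{BTreeMap, HashSet};

type DeployHash = u64;

// Pick deploys not seen in past blocks, stopping as soon as the block-size
// budget is exhausted, so we never clone the map or scan past the limit.
fn remaining_deploys(
    collected_deploys: &BTreeMap<DeployHash, usize>, // hash -> serialized size
    past_deploys: &HashSet<DeployHash>,
    max_block_size: usize,
) -> Vec<DeployHash> {
    let mut budget = max_block_size;
    collected_deploys
        .iter()
        // skip anything already included in a past block
        .filter(|&(hash, _)| !past_deploys.contains(hash))
        // stop iterating once the next deploy no longer fits the budget
        .take_while(|&(_, size)| {
            if *size <= budget {
                budget -= *size;
                true
            } else {
                false
            }
        })
        .map(|(hash, _)| *hash)
        .collect()
}

fn main() {
    let mut collected = BTreeMap::new();
    collected.insert(1u64, 10usize);
    collected.insert(2, 20);
    collected.insert(3, 100);
    let past: HashSet<DeployHash> = [2].iter().copied().collect();
    // deploy 2 is filtered out as already seen; deploy 3 exceeds the budget
    assert_eq!(remaining_deploys(&collected, &past, 50), vec![1]);
}
```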
I agree with everything except that collected_deploys will probably be much larger than past. Why would it be? past should include the hashes of all past blocks, and so will grow indefinitely. collected_deploys, on the other hand, would only include the deploys that haven't been finalized and aren't outdated yet (once we implement purging suggested by @goral09 ), so worst case scenario, its size will be roughly proportional to the size of the network. Am I missing something here?
The past should be only the diff between the last finalized block (known to the DeployBuffer) and all the blocks that consensus has proposed since then. So it doesn't grow indefinitely.
Ah, of course! Right, that makes sense, then.
```rust
        .or_else(|| self.processed.get(block_hash))
})
.flat_map(|deploys| deploys.keys())
.collect::<HashSet<_>>();
```
This is another potentially huge collection.
Would be good to keep the finalized deploys in a set in DeployBuffer, so we don't have to collect them each time we check the validity of a deploy.
(And/or use a bloom filter later maybe?)
> Would be good to keep the finalized deploys in a set in DeployBuffer
Would it be a good idea to keep such a huge set in memory at all times? Both ways seem expensive...
I guess we need to decide whether we want to use a lot of memory constantly, or take some time to construct this for a brief moment when we check the validity of some deploys (but at the very least we should do it once per call to remaining_deploys, not once per call to is_deploy_valid).
I think once it's so big that we can't keep it in memory at all times, it's also too big to construct it every time we validate a deploy.
Possibly a bloom filter alone will be able to replace this set?
We probably also can prune deploys that are older than the maximum TTL… so we could have one set and/or bloom filter for each period of length max TTL, and only keep two of those periods in memory at any time.
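The "one set per max-TTL period, keep only two periods" idea could be sketched like this (names and the timestamp representation are illustrative assumptions): finalized hashes go into a current-generation set, and once a full TTL period elapses, the oldest generation is dropped wholesale, since any deploy that old is expired anyway.

```rust
use std::collections::HashSet;

type DeployHash = u64;

// Finalized deploy hashes kept in two generation sets covering the last two
// max-TTL periods; anything older is expired and can be forgotten. A bloom
// filter could replace either set with the same rotation scheme.
struct FinalizedDeploys {
    current: HashSet<DeployHash>,
    previous: HashSet<DeployHash>,
    period_start: u64, // timestamp at which `current` was opened
    max_ttl: u64,
}

impl FinalizedDeploys {
    fn new(now: u64, max_ttl: u64) -> Self {
        FinalizedDeploys {
            current: HashSet::new(),
            previous: HashSet::new(),
            period_start: now,
            max_ttl,
        }
    }

    fn insert(&mut self, now: u64, hash: DeployHash) {
        self.rotate(now);
        self.current.insert(hash);
    }

    fn contains(&mut self, now: u64, hash: &DeployHash) -> bool {
        self.rotate(now);
        self.current.contains(hash) || self.previous.contains(hash)
    }

    // Once a whole period has passed, the old `previous` set only holds
    // hashes of deploys past their TTL, so it can be dropped.
    fn rotate(&mut self, now: u64) {
        while now >= self.period_start + self.max_ttl {
            self.previous = std::mem::take(&mut self.current);
            self.period_start += self.max_ttl;
        }
    }
}

fn main() {
    let mut finalized = FinalizedDeploys::new(0, 100);
    finalized.insert(0, 7);
    assert!(finalized.contains(50, &7)); // same period
    assert!(finalized.contains(150, &7)); // one period later: still in `previous`
    assert!(!finalized.contains(250, &7)); // two periods later: pruned
}
```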
```rust
if !self
    .finalized
    .values()
    .any(|block| block.contains_key(&hash))
```
We should be checking the signature before adding it to the deploy buffer.
Here's the list of all validity checks we do on deploy before it's even added to the deploy buffer.
True, but we can't do that until we have the storage of the deploy bodies implemented, can we? Unless the signature only covers the header, but would that make sense?
Right, we should check the hash and then the signature but it is the responsibility of the component that adds the deploy to the storage/buffer.
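The check order agreed on here (hash first, then signature, done by the component that adds the deploy) could look roughly like the sketch below. It is deliberately dependency-free: the hash uses std's `DefaultHasher` and the "signature" is a toy XOR stand-in, where real code would use the node's cryptographic hash and signature scheme.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy deploy: the claimed hash should commit to the body, and the signature
// (here a fake XOR scheme) should cover the hash.
struct Deploy {
    hash: u64,
    signature: u64,
    body: Vec<u8>,
}

// Stand-in signing key for the toy signature scheme.
const SIGNER_KEY: u64 = 0xDEAD_BEEF;

fn body_hash(body: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    body.hash(&mut hasher);
    hasher.finish()
}

// Returns true only if both the hash and the (toy) signature check out.
fn validate_before_buffering(deploy: &Deploy) -> bool {
    // 1. the claimed hash must match the body we actually received
    if body_hash(&deploy.body) != deploy.hash {
        return false;
    }
    // 2. the signature must verify against the (toy) signing key
    deploy.signature == (deploy.hash ^ SIGNER_KEY)
}

fn main() {
    let body = vec![1u8, 2, 3];
    let hash = body_hash(&body);
    let good = Deploy { hash, signature: hash ^ SIGNER_KEY, body: body.clone() };
    assert!(validate_before_buffering(&good));
    // a deploy whose claimed hash does not match its body is rejected
    let tampered = Deploy { hash: hash.wrapping_add(1), signature: hash ^ SIGNER_KEY, body };
    assert!(!validate_before_buffering(&tampered));
}
```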
Since this PR is already merged, can you create a ticket for this?
```rust
.filter_map(|block_hash| self.processed.get(block_hash))
.chain(self.finalized.values())
.flat_map(|deploys| deploys.keys())
.collect::<HashSet<_>>();
```
There's still `.fold(self.collected_deploys.clone(), …)` above. Shouldn't we just use `self.collected_deploys` below now, instead of `deploys_to_return`?
That's a good question. I guess it depends on how we want to handle re-adding deploys after they are included in a processed block. Right now we allow for it, because processed blocks might not be finalized and adding the deploy again might be justified, but then we have to do this filtering by past blocks that have not been finalized yet.
But aren't you filtering them out twice right now? Once in line 45 and once in is_deploy_valid?
is_deploy_valid doesn't do any filtering, though. It only checks whether the dependencies are satisfied.
Ah, of course! Sorry, ignore my remark…

Although, since we already collect `past_deploys` anyway, couldn't we also use it to get `deploys_to_return` by filtering `collected_deploys` with it, instead of cloning and folding?
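For reference, the dependency check that is_deploy_valid performs (as described above, it only verifies that a deploy's dependencies are satisfied) amounts to something like this sketch; the function name and simplified types are illustrative assumptions:

```rust
use std::collections::HashSet;

type DeployHash = u64;

// A deploy is only eligible for inclusion once every deploy it depends on
// has already appeared in a past block.
fn dependencies_satisfied(
    dependencies: &[DeployHash],
    past_deploys: &HashSet<DeployHash>,
) -> bool {
    dependencies.iter().all(|dep| past_deploys.contains(dep))
}

fn main() {
    let past: HashSet<DeployHash> = [1, 2].iter().copied().collect();
    // all dependencies already in past blocks: satisfied
    assert!(dependencies_satisfied(&[1, 2], &past));
    // deploy 3 has not appeared yet: not satisfied
    assert!(!dependencies_satisfied(&[3], &past));
}
```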
This PR does a few things:

- moves the `deploy_buffer` module from `consensus_service` to `consensus` (TBD if that's the right place for it)
- `types` module

What isn't done:
(Discussions whether and how to implement them in this PR will follow.)