NDRS-62: Implement deploy validity checks#31

Merged
fizyk20 merged 10 commits into casper-network:master from fizyk20:ndrs-62
Jun 19, 2020

Conversation

fizyk20 (Contributor) commented Jun 17, 2020

This PR does a few things:

  • moves the deploy_buffer module from consensus_service to consensus (TBD if that's the right place for it)
  • changes the types used for the deploy data to the ones defined in the types module
  • implements validity checks regarding timestamps and dependencies.

What isn't done:

  • gas and block size limit checks
  • deploy timestamp check on addition
    (Discussion of whether and how to implement them will follow in this PR.)
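The timestamp and dependency checks the PR implements might look roughly like the following minimal sketch. The stand-in types, the u64 second-based timestamps, and the exact field names are assumptions for illustration, not the PR's actual code:

```rust
use std::collections::HashSet;

// Simplified stand-ins for the real DeployHash / DeployHeader types.
type DeployHash = u64;

struct DeployHeader {
    timestamp: u64,                 // creation time, in seconds
    ttl: u64,                       // time-to-live, in seconds
    dependencies: Vec<DeployHash>,  // deploys that must be included first
}

/// A deploy is valid for inclusion if it has not expired, its TTL and
/// dependency count are within the configured limits, and every dependency
/// was already included in a past block.
fn is_deploy_valid(
    deploy: &DeployHeader,
    current_timestamp: u64,
    max_ttl: u64,
    max_dependencies: usize,
    past_deploys: &HashSet<DeployHash>,
) -> bool {
    let ttl = deploy.ttl.min(max_ttl);
    let not_expired = deploy.timestamp + ttl >= current_timestamp;
    let deps_ok = deploy.dependencies.len() <= max_dependencies
        && deploy.dependencies.iter().all(|dep| past_deploys.contains(dep));
    not_expired && deps_ok
}

fn main() {
    let past: HashSet<DeployHash> = [1, 2].into_iter().collect();
    let deploy = DeployHeader { timestamp: 100, ttl: 60, dependencies: vec![1] };
    assert!(is_deploy_valid(&deploy, 120, 3600, 10, &past));
    // Expired: current time is past timestamp + ttl.
    assert!(!is_deploy_valid(&deploy, 200, 3600, 10, &past));
    // A dependency that was never included in a past block.
    let missing = DeployHeader { timestamp: 100, ttl: 60, dependencies: vec![99] };
    assert!(!is_deploy_valid(&missing, 120, 3600, 10, &past));
    println!("ok");
}
```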

});
// filter out invalid deploys
deploys_to_return.retain(|_deploy_hash, deploy| {
    self.is_deploy_valid(deploy, current_timestamp, max_ttl, max_dependencies, past)
Contributor:

In Scala we're purging the deploy buffer periodically.

Contributor Author:

That makes sense and should probably be added, but I'd say it's out of scope of this task.

Contributor:

Since this PR is already merged, can you create a ticket for this?
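A periodic purge along the lines suggested above might be sketched as follows. The `DeployBuffer` shape is simplified and `purge_expired` is a hypothetical method name, not code from this PR:

```rust
use std::collections::HashMap;

type DeployHash = u64;

struct DeployHeader {
    timestamp: u64, // creation time, in seconds
    ttl: u64,       // time-to-live, in seconds
}

struct DeployBuffer {
    collected_deploys: HashMap<DeployHash, DeployHeader>,
}

impl DeployBuffer {
    /// Drop deploys whose TTL has elapsed; intended to be driven
    /// periodically, e.g. from a timer event.
    fn purge_expired(&mut self, current_timestamp: u64) {
        self.collected_deploys
            .retain(|_, header| header.timestamp + header.ttl >= current_timestamp);
    }
}

fn main() {
    let mut buffer = DeployBuffer { collected_deploys: HashMap::new() };
    buffer.collected_deploys.insert(1, DeployHeader { timestamp: 0, ttl: 50 });
    buffer.collected_deploys.insert(2, DeployHeader { timestamp: 100, ttl: 50 });
    buffer.purge_expired(120);
    // Deploy 1 expired at t = 50; deploy 2 is valid until t = 150.
    assert!(!buffer.collected_deploys.contains_key(&1));
    assert!(buffer.collected_deploys.contains_key(&2));
    println!("ok");
}
```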


#[derive(Debug, Clone, Default)]
pub(crate) struct DeployBuffer {
    collected_deploys: HashMap<DeployHash, DeployHeader>,
Contributor:

Why just the header? Where will we get the body from?

Contributor Author:

The idea here is not to keep the deploy bodies in memory unnecessarily, only the info that is needed for the buffer to know what deploys we have and which could still be included in blocks. The bodies would be kept in storage and only read when we need to communicate them to someone else.

Contributor:

So the deploy buffer doesn't have direct access to deploy storage? How will those two communicate?

Contributor:

Since this PR is already merged, can you create a ticket for this?

pub(crate) struct DeployBuffer {
    collected_deploys: HashMap<DeployHash, DeployHeader>,
    processed: HashMap<BlockHash, HashMap<DeployHash, DeployHeader>>,
    finalized: HashMap<BlockHash, HashMap<DeployHash, DeployHeader>>,
Contributor:

It looks like this BlockHash neither belongs to a finalized block — it's used before execution, and thus before the state root hash is known, which should be part of the finalized block's hash — nor a Highway vote hash, but is rather an identifier for a (possibly pending) batch of deploys?
Should we call it BatchHash or DeployListHash or something?
Or will we actually ask Highway for the vote hash, and use that to identify the list?

Contributor Author:

It probably depends on whether we go with execution before consensus or not, doesn't it?
Of the two options, I'd prefer renaming, since that would work with both approaches - we need some kind of identifier before a block is finalized anyway.

goral09 (Contributor) commented Jun 18, 2020:

Isn't this a hash of the SignedWireVote that carries the consensus values (deploy hashes)? And it is "finalized" when consensus outputs Finalized message. It doesn't need to have the post state hash (coming from the batch block).

Contributor:

Ping :)

Contributor:

Yes, that's what I meant with my last question above: We could pick the deploys, pass them into Highway, get back the vote hash (i.e. the hash of SignedWireVote) and use that as an identifier for the list. Not sure if that's a good idea.

Contributor:

Isn't this what happens in added_block? When consensus asks for deploys for the new block, it will call added_block afterwards.

Contributor:

You're right… so the intention with this is that the block hash is the Highway vote hash, I guess?

    self.is_deploy_valid(deploy, current_timestamp, max_ttl, max_dependencies, past)
});
// TODO: check gas and block size limits
deploys_to_return
Contributor:

Continuing this discussion, in the long run collected_deploys will probably be some kind of priority queue, usually much larger than past, and possibly even backed by a database. Also, the set of finalized deploys will grow indefinitely. I think:

  • We should make sure collected_deploys never contains finalized deploys, so we don't need to filter them out here. I don't think a module should defensively program against violating its own invariants.
  • We should avoid cloning collected_deploys or iterating over all of it. Instead, collect the past deploys into a set and use that to filter an iterator over collected_deploys. Then iterate only until the size or gas limit is reached.

(Doesn't have to be in this PR, of course, or in this sprint!)
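The second suggestion (collect the past deploys into a set, filter an iterator over collected_deploys with it, and stop at the limit) could be sketched like this. The types are simplified stand-ins and the per-deploy `gas` field is a hypothetical placeholder for a real gas estimate:

```rust
use std::collections::{HashMap, HashSet};

type DeployHash = u64;

struct DeployHeader {
    gas: u64, // hypothetical per-deploy gas estimate
}

/// Pick deploys for a block proposal: skip anything already seen in past
/// blocks and stop once the accumulated gas would exceed the block limit,
/// without ever cloning the whole collected_deploys map.
fn remaining_deploys(
    collected: &HashMap<DeployHash, DeployHeader>,
    past_deploys: &HashSet<DeployHash>,
    block_gas_limit: u64,
) -> Vec<DeployHash> {
    let mut total_gas = 0;
    collected
        .iter()
        .filter(|(hash, _)| !past_deploys.contains(hash))
        .take_while(|(_, header)| {
            total_gas += header.gas;
            total_gas <= block_gas_limit
        })
        .map(|(hash, _)| *hash)
        .collect()
}

fn main() {
    let mut collected = HashMap::new();
    collected.insert(1, DeployHeader { gas: 10 });
    collected.insert(2, DeployHeader { gas: 10 });
    collected.insert(3, DeployHeader { gas: 10 });
    let past: HashSet<DeployHash> = [2].into_iter().collect();
    // The limit admits two deploys of gas 10; deploy 2 is filtered out first.
    let picked = remaining_deploys(&collected, &past, 20);
    assert_eq!(picked.len(), 2);
    assert!(!picked.contains(&2));
    println!("ok");
}
```

A real version would want a priority order rather than HashMap iteration order, but the shape (filter, then bounded take) is the point here.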

Contributor Author:

I agree with everything except that collected_deploys will probably be much larger than past. Why would it be? past should include the hashes of all past blocks, and so will grow indefinitely. collected_deploys, on the other hand, would only include the deploys that haven't been finalized and aren't outdated yet (once we implement the purging suggested by @goral09), so in the worst-case scenario its size will be roughly proportional to the size of the network. Am I missing something here?

goral09 (Contributor) commented Jun 18, 2020:

The past should be only the diff between the last finalized block (known to the DeployBuffer) and all the blocks that consensus has proposed since then. So it doesn't grow indefinitely.

Contributor Author:

Ah, of course! Right, that makes sense, then.

.or_else(|| self.processed.get(block_hash))
})
.flat_map(|deploys| deploys.keys())
.collect::<HashSet<_>>();
Contributor:

This is another potentially huge collection.
Would be good to keep the finalized deploys in a set in DeployBuffer, so we don't have to collect them each time we check the validity of a deploy.
(And/or use a bloom filter later maybe?)

Contributor Author:

> Would be good to keep the finalized deploys in a set in DeployBuffer

Would it be a good idea to keep such a huge set in memory at all times? Both ways seem expensive...
I guess we need to decide whether we want to use a lot of memory constantly, or take some time to construct this for a brief moment when we check the validity of some deploys (but at the very least we should do it once per call to remaining_deploys, not once per call to is_deploy_valid).

Contributor:

I think once it's so big that we can't keep it in memory at all times, it's also too big to construct it every time we validate a deploy.

Possibly a bloom filter alone will be able to replace this set?
We probably also can prune deploys that are older than the maximum TTL… so we could have one set and/or bloom filter for each period of length max TTL, and only keep two of those periods in memory at any time.
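One way to sketch the "two periods of length max TTL" idea: keep membership sets for the current and previous period, rotate them on a period boundary, and check both. Everything here (type names, the rotation logic, using plain sets instead of bloom filters) is an illustrative assumption:

```rust
use std::collections::HashSet;

type DeployHash = u64;

/// Finalized-deploy membership split into max-TTL-sized periods. Any deploy
/// older than the previous period has expired anyway, so only two periods
/// ever need to be kept in memory.
struct FinalizedDeploys {
    max_ttl: u64,
    current_period: u64, // index of the period `current` covers
    current: HashSet<DeployHash>,
    previous: HashSet<DeployHash>,
}

impl FinalizedDeploys {
    fn new(max_ttl: u64) -> Self {
        FinalizedDeploys {
            max_ttl,
            current_period: 0,
            current: HashSet::new(),
            previous: HashSet::new(),
        }
    }

    /// Rotate the sets if `timestamp` has moved into a new period.
    fn advance(&mut self, timestamp: u64) {
        let period = timestamp / self.max_ttl;
        if period > self.current_period {
            self.previous = std::mem::take(&mut self.current);
            if period > self.current_period + 1 {
                // A whole period was skipped: the old "current" is stale too.
                self.previous.clear();
            }
            self.current_period = period;
        }
    }

    fn insert(&mut self, timestamp: u64, hash: DeployHash) {
        self.advance(timestamp);
        self.current.insert(hash);
    }

    fn contains(&self, hash: &DeployHash) -> bool {
        self.current.contains(hash) || self.previous.contains(hash)
    }
}

fn main() {
    let mut finalized = FinalizedDeploys::new(100);
    finalized.insert(10, 1);  // period 0
    finalized.insert(150, 2); // period 1: the period-0 set becomes "previous"
    assert!(finalized.contains(&1) && finalized.contains(&2));
    finalized.insert(250, 3); // period 2: deploy 1's period is dropped
    assert!(!finalized.contains(&1));
    assert!(finalized.contains(&2) && finalized.contains(&3));
    println!("ok");
}
```

Swapping each HashSet for a bloom filter would trade exactness for memory, at the cost of occasional false positives (a valid deploy wrongly considered already finalized).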

if !self
    .finalized
    .values()
    .any(|block| block.contains_key(&hash))
Contributor:

We should be checking the signature before adding it to the deploy buffer.

Contributor:

Here's the list of all validity checks we do on a deploy before it's even added to the deploy buffer.

Contributor Author:

True, but we can't do that until we have the storage of the deploy bodies implemented, can we? Unless the signature only covers the header, but would that make sense?

Contributor:

Right, we should check the hash and then the signature but it is the responsibility of the component that adds the deploy to the storage/buffer.

Contributor:

Since this PR is already merged, can you create a ticket for this?

afck (Contributor) left a comment:

(Not a blocker.)

.filter_map(|block_hash| self.processed.get(block_hash))
.chain(self.finalized.values())
.flat_map(|deploys| deploys.keys())
.collect::<HashSet<_>>();
Contributor:

There's still:

.fold(self.collected_deploys.clone(),)

above. Shouldn't we just use self.collected_deploys below now, instead of deploys_to_return?

Contributor Author:

That's a good question. I guess it depends on how we want to handle re-adding deploys after they are included in a processed block. Right now we allow for it, because processed blocks might not be finalized and adding the deploy again might be justified, but then we have to do this filtering by past blocks that have not been finalized yet.

Contributor:

But aren't you filtering them out twice right now? Once in line 45 and once in is_deploy_valid?

Contributor Author:

is_deploy_valid doesn't do any filtering, though. It only checks whether the dependencies are satisfied.

Contributor:

Ah, of course! Sorry, ignore my remark…

Although… since we already collect past_deploys anyway, couldn't we also use it to get deploys_to_return by filtering collected_deploys with it, instead of cloning and folding?
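The refactor hinted at here, building past_deploys once and filtering collected_deploys with it instead of cloning and folding, might be sketched as follows (field and helper names are assumptions, and the header type is a trivial stand-in):

```rust
use std::collections::{HashMap, HashSet};

type DeployHash = u64;
type DeployHeader = &'static str; // stand-in payload

/// Build the block proposal set by filtering `collected_deploys` against the
/// deploys already seen in past (processed or finalized) blocks, instead of
/// cloning the whole map and removing entries block by block.
fn deploys_to_return(
    collected_deploys: &HashMap<DeployHash, DeployHeader>,
    past_deploys: &HashSet<DeployHash>,
) -> HashMap<DeployHash, DeployHeader> {
    collected_deploys
        .iter()
        .filter(|(hash, _)| !past_deploys.contains(hash))
        .map(|(hash, header)| (*hash, *header))
        .collect()
}

fn main() {
    let mut collected = HashMap::new();
    collected.insert(1, "transfer");
    collected.insert(2, "install");
    let past: HashSet<DeployHash> = [1].into_iter().collect();
    let ret = deploys_to_return(&collected, &past);
    // Deploy 1 was already included in a past block, so only 2 remains.
    assert_eq!(ret.len(), 1);
    assert!(ret.contains_key(&2));
    println!("ok");
}
```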

@fizyk20 fizyk20 merged commit 41e8185 into casper-network:master Jun 19, 2020
@fizyk20 fizyk20 deleted the ndrs-62 branch June 19, 2020 14:58
rafal-ch pushed a commit that referenced this pull request Sep 11, 2024
The receipts will represent VM events and will be tracked via a merkle
tree of receipts.

The transaction creation will accept an initial receipts merkle state
and will mutate it after the execution of the script, providing the new
merkle root.
rafal-ch pushed two further commits that referenced this pull request Sep 11, 2024
casperlabs-bors-ng bot pushed a commit that referenced this pull request Nov 6, 2024
zajko added a commit to zajko/casper-node that referenced this pull request Oct 13, 2025
zajko added a commit to zajko/casper-node that referenced this pull request Oct 13, 2025

3 participants