Skip to content
This repository was archived by the owner on Nov 15, 2023. It is now read-only.

Allow for customisation of chain selection systems#2240

Merged
gavofyork merged 28 commits intomasterfrom
ben-custom-chainselection-strategy
May 10, 2019
Merged

Allow for customisation of chain selection systems#2240
gavofyork merged 28 commits intomasterfrom
ben-custom-chainselection-strategy

Conversation

@gnunicorn
Copy link
Contributor

@gnunicorn gnunicorn commented Apr 9, 2019

Infrastructure ref #1125 . Superseeds #1443.

This breaks off the SelectChain into its own trait, externalised from client, in common-consensus and then provides the current implementation as LongestChain for convenience. In order to allow others to specify a custom chain, this now also needs to be handled separately in the service definition and custom setups for the consensus defined in there. Thus, this breaks compatibility with existing implementations.

The general infrastructure is there, but this PR still has a few issues that needs resolving before it can be merged. However, reviewing (especially around the architecture in general) can already proceed.

The known issues to fix are:

@gnunicorn gnunicorn requested review from bkchr and rphmeier and removed request for rphmeier April 10, 2019 00:24
slot_duration: SlotDuration,
local_key: Arc<P>,
client: Arc<C>,
select_chain: SC,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately, this breaks API. We will have to find a way to do this that doesn't.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this change will inevitably break the api - as it needs to be specified and at the service-intend-layer and somehow pushed down to the authoring and finalisation. Even if we tried to still have client facilitate this somehow, this would mean adding the same SC on every Where C: Client<...SC>, which would also break this same call.

I am afraid there is just no way around this breaking the API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing we could do is have client implement the new trait as well and hard-code that to the LongestChain implementation and directly deprecate its usage. But then still, would the where clause later on have to change to the new trait and thus, this would still break the API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately, this breaks API. We will have to find a way to do this that doesn't.

There are hacky ways to do it by stuffing the information into mutable state, such as a thread-local variable. None of those are good, but we can avoid future breaking changes by switching to the builder pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently the blocker for me on this PR. I am not sure how we even can resolve this without creating a huge other mess elsewhere.

The only way I can currently think of is by having a compatibility layer, that is Client implements LongestChain SC, however in that case, a) the where clause still needs to be extended (and that is "breaking the API", as it now constrains more than it was before) and b) it would mean this function doesn't actually allow any other implementation and thus isn't really implementing the ticket (or solving our problem).

Not sure what to do about this?!?

}

/// Get best block header for finalisation
fn best_block_header_for_finalisation(&self) -> Result<<Block as BlockT>::Header, Error> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should an instant-finality authoring system use this or _for_authoring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Was also wondering whether _finalisation should fallback to _authoring in the main trait to make this easier...

Copy link
Contributor

@rphmeier rphmeier Apr 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to express it as a subset relationship. The best_for_authoring must always be equal to or a descendent of the best_for_finalization. We may want a marker trait for chain selection algorithms where they are always equal (and this is the only kind that instant-finality engines can really support, because anything else implies forks). The only reason we wouldn't want a marker trait is because of swappable consensus.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rphmeier I think anything that does not support swappable consensus is a non-starter. Ideally, consensus could be implemented in the runtime.

gnunicorn and others added 3 commits April 12, 2019 15:58
@gnunicorn gnunicorn added the A0-please_review Pull request needs code review. label Apr 12, 2019
//! or prune any signaled changes based on whether the signaling block is
//! included in the newly-finalized chain.
#![forbid(warnings)]
/// FIXME
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll happily put this into a ticket, once review is done.

(Why is this here: because I officially deprecated the usage of client.backend() and the forbid then refuses to compile this file. However the deprecation, nor the changes needed in this file are strictly part of this PR and can be done as a nice starter-ticket)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ticket now at #2532

@gnunicorn gnunicorn requested review from Demi-Marie and rphmeier and removed request for Demi-Marie May 6, 2019 11:26
@gnunicorn
Copy link
Contributor Author

I've updated the PR to latest master (including changes to babe) and changed the interface as we've discussed, rather than having the weird four methods, we split up on a generic best_chain (primarily for authoring) but also let SelectChain choose a different block for finalization (which we will need for Polkadot). Makes the entire interface a bit cleaner.

Also addressed the other comments. Please review again, @demimarie-parity, @rphmeier

@gnunicorn gnunicorn requested a review from Demi-Marie May 6, 2019 11:42
@gavofyork
Copy link
Member

bump @demimarie-parity @rphmeier

fn best_chain(&self) -> Result<<Block as BlockT>::Header, Error>;

/// Get the most recent block hash of the best chain that contain block
/// with the given `target_hash` for finalisation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a break from what we discussed while designing, which was that finality_target would find an ancestor of target_hash. So to find a block to finalize, you do finality_target(best_chain()). For a lot of chain selection systems, finality_target is the identity function, but for some we might not be able to finalize all the blocks in that chain, so it would look backwards a bit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must have mixed something up then, because I thought this was the API we discussed. I think I just copied the docs over and didn't update them properly. But the idea was that given best_chain() (which is what most finality gadget will use prior) the SC would pick the most recent block that can be/should be finalized.

I'll update the doc to reflect that.

Copy link
Contributor Author

@gnunicorn gnunicorn May 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rphmeier I changed the docs to reflect closer what we had in mind. Please confirm, that this the API we had discussed.

One thing I wasn't quite sure about is, how this is supposed to act upon the target_hash already being finalized. Is the expectation, that the finality_target is always a not-yet-finalized block? This would fail in the case where we only have the Gensis-Block or on all instant-finalising consensus systems -- maybe that is a specific error case we want to mention. Or we state that the returned block might be finalized already, in that the case the finality gadget would have to deal with that though, so we better make that explicit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnunicorn This is a good point. I think the only solution is to specify that the finality_target is the block we would like to see finalized, which excludes blocks that are already finalized. So a finality_target should be able to return None to indicate that there is no viable chain to finalize.

debug!(target: "afg", "Finding best chain containing block {:?} with number limit {:?}", block, limit);

match self.inner.best_containing(block, None) {
match self.select_chain.finality_target(block, None) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I guess the main thing is that we need a way to get the best chain containing a given hash. I thought of doing it as a separate operation, where we would use sc.finality_target(sc.best_containing(included_hash))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the API in this PR is workable for GRANDPA, but I think we need to introduce a split if we want something that would work for finality gadgets and instant-finality authorship and probablistic finality authorship.

With the current API I'm not sure exactly what a Tendermint implementation would build on.

Copy link
Contributor Author

@gnunicorn gnunicorn May 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was mostly because of the same (wrong) API docs. I updated those.

The way I'd see this work is, that if you have a finality gadget, you'd pass the best_chain to finality_target and it has some way to find the ancestor that you should vote on.

For instant finality systems (like Tendermint or Rhododendron) I don't see why you'd ever need to use finality_target (or any finality gadget for that matter) at all. Because best_chain will already return a finalised block, thus the default implementation just returns the same hash again. Same for any probabilistic finality system but the other way around: no block is ever "surely finalized", thus the system would not use the finality_target function -- or given a hash and a certain risk profile could return some ancestor it is certain to not be reverted, but as there won't be any finality gadget, no one would ask for it...

Copy link
Contributor

@rphmeier rphmeier May 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnunicorn I think if we would apply e.g. the polkadot chain-selection rule to an instant-finality authorship system, the behavior we'd desire is that

  • we have a best block which is finalized
  • but we refuse to propose or vote on a new block until we've received the erasure-coded chunks for that chain. This would be accomplished by doing finality_target(best_chain()) == best_chain().

Whereas if you only used best_chain(), you would end up with nodes attempting to finalize new blocks where they haven't received all the availability chunks for the parent block.

@rphmeier rphmeier added A3-needsresolving and removed A0-please_review Pull request needs code review. labels May 9, 2019
@gavofyork gavofyork merged commit 77c10cd into master May 10, 2019
@gavofyork gavofyork deleted the ben-custom-chainselection-strategy branch May 10, 2019 12:08
MTDK1 pushed a commit to bdevux/substrate that referenced this pull request Jul 10, 2019
* move SelectChain trait out of client

* Extend SelectChain, move longest chain implementation into it

* Bring SelectChain into service

* implement LongestChain SelectChain

* implement longest chain for node

* update Cargo.lock's

* in between erroring tests

* deprecate ::backend and ::import_lock

* Remove unneded space

Co-Authored-By: gnunicorn <ben.kampmann@googlemail.com>

* Remove unneded space

Co-Authored-By: gnunicorn <ben.kampmann@googlemail.com>

* Fixes test compilation

* remove todo

* re-enable client test

* add doc

* fixing tests

* Clarify SelectChain Interface, intended implementation and usage

* minor components cleanups

* minor cleanups

* Update lock files

* Implement cleaner interface for SelectChain

* addressing comments

* Updating tests

* bump node runtime impl version

* address grumbles
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants