client: Reorg to new best when finalizing divergent branch by andresilva · Pull Request #2869 · paritytech/substrate

andresilva · 2019-06-14T14:52:33Z

When we finalize a block on a divergent branch from the currently tracked best we need to figure out what the new best block is on the finalized branch and we need to re-org to it. This is done by having finalize_block optionally take a SelectChain which is the pluggable logic for best fork selection (currently only longest chain is implemented). If no SelectChain is provided and the finalized block is on a divergent branch then we set the finalized block as best (this is the case for the light client since it doesn't track leaves). When importing a block that is finalized the same thing can happen but in this case we don't need SelectChain since the given block that's being imported must be the new best.
I also renamed the finality_target in SelectChain since there's nothing relating to finality there (e.g. GRANDPA has further voting rules that restrict what we vote on), instead SelectChain is strictly concerned with defining the notion of a best block, regardless of what we do with it.

Fix #1442.

Demi-Marie

This looks good to me, but I won’t approve it myself, as I am not familiar with this part of the codebase.

rphmeier · 2019-06-14T18:46:02Z

core/consensus/common/src/select_chain.rs

-	/// to finalize next.
-	fn finality_target(
+	/// Get the best block in the fork containing `target_hash`, if any.
+	fn best_containing<'a>(


That's not the goal of finality_target -- the previous goal was to return the highest ancestor block which was suitable to finalize. This is typically the identity function applied to best_containing, but for e.g. Polkadot we need to have additional voting rules based on what erasure-coded pieces we've seen.

SelectChain is meant to make a distinction between what we optimistically think is best and what we are willing to commit to finalizing.

rphmeier · 2019-06-14T18:55:25Z

core/client/src/client.rs

+				// find the new best block that is a descendent of the finalized
+				// block. we already have the import lock acquired so there's no
+				// need to re-acquire it in `select_chain`.
+				Some(select_chain) => select_chain.best_containing(block, None, None)?


If this were finality_target, this wouldn't work. Neither would it if best_containing were implemented for Polkadot in a way that limited to only chains where availability pieces have been seen.

rphmeier · 2019-06-14T18:57:12Z

core/client/src/client.rs

+				Some(select_chain) => select_chain.best_containing(block, None, None)?
+					.unwrap_or(block),
+				// if no `select_chain` is provided set the finalized block as best
+				None => block,


In some situations, this is clearly OK, but I would worry that None is passed even in situations where we need it. Rather than use an Option, I recommend creating a dummy SelectChain for the cases where we currently return None, which always returns the known best block for best_containing.

rphmeier · 2019-06-14T18:57:35Z

core/client/src/client.rs

+		&self,
+		id: BlockId<Block>,
+		justification: Option<Justification>,
+		select_chain: Option<&dyn SelectChain<Block>>,


Same here. I don't know why the select_chain would ever be optional.

rphmeier

The finality_target interface doesn't support the needs we wanted it to anymore. And the select_chain being optional can lead to weird corner cases that I'd prefer to avoid.

Otherwise looks good.

andresilva · 2019-06-14T21:23:24Z

My thoughts regarding SelectChain (related to #2556):

The client should have a notion of what is the "best" block (and optimize its backend accordingly);
The client doesn't define what is "best", it is defined externally when interacting with the client;
Whatever the client reports as "best" is canonical and we can rely on it (e.g. for block authoring on top of);

I wanted to somehow unfiy ForkChoiceStrategy and SelectChain since both strategies need to be consistent with one another (in our case longest chain), I don't think that's possible right now since the light client doesn't have a SelectChain. Currently this is because leaves aren't tracked by the light client and therefore we can't implement the LongestSelectChain. This is also the reason why finalize_block takes an optional SelectChain.

I also wanted to further simplify SelectChain to only have the best_containing method, everything else can be removed if we make the assumption that we always author on top of what is the current "best" in the client (which was defined previously by SelectChain or ForkChoiceStrategy).

For Polkadot what I would imagine is that when we receive a new availability piece we check whether there's any leaf that has all the pieces, in which case it should be the new "best" and we set it explicitly through the client (this doesn't really work for the case where select_chain is None in finalize_block though). So my question is, in Polkadot when we call client.info().best_hash should we have an invariant that the given block must always have all availability pieces available?

I take suggestions wrt how to deal with the light client integration. Should we just start tracking leaves there as well?

gavofyork · 2019-06-25T10:10:19Z

looks stale... @rphmeier @andresilva what's happening here?

andresilva · 2019-06-25T10:13:34Z

@svyatonik could you have a look at this and do you have any suggestions regarding light client integration?

svyatonik · 2019-06-25T10:18:53Z

Oh, I thought @rphmeier had complains - that's why I have ignored that PR until now :/ Will take a look

rphmeier · 2019-06-25T10:30:27Z

So my question is, in Polkadot when we call client.info().best_hash should we have an invariant that the given block must always have all availability pieces available?

Not necessarily. It should be equal to or a descendent of the block we would select to build upon. Which is going to be at most the K'th descendent of a chain we've seen every availability piece for.

At heart, we need a couple different things:

A fork choice rule for ordering leaves. (for Polkadot, this is something like IMD-GHOST but with availability pieces instead of casper votes). We have best_containing that uses these leaves without further constraint.
Ways of constraining chains further. Basically a fn constrain(&self, block_hash) -> Hash, which for many cases is just the identity function, but for Polkadot voting rules we'll walk backwards until we reach a chain with enough availability pieces.

The fork choice abstraction is tricky. We need to apply it upon block import but also upon "external events" like importing availability messages. It would be nice to do this lazily, but there doesn't seem to be an elegant way to integrate something like that with every call to best. That was kind of the goal of SelectChain: to phase out calls to client.best_block() and rather use SelectChain::best_block which would do any necessary lazy application of fork-choice stuff as necessary.

svyatonik · 2019-06-25T11:16:30Z

From what I've understood, in Polkadot we'll need some complicated best-block/block-to-finalize selection strategies, which require some additional data that won't be available on light client (availability pieces?). So, no matter what, we can't guarantee that even if both light and full client have synced the same blocks, they still have same best-block (just because full client will have some additional data for decision making).

So given that, imo:

there's no point in using best-block on light client for anything else than logging + providing defaults for RPC calls;
on light client it is better stick to strategy that never returns best-block that is not the best block on full client AND is not the ancestor of this block. And the only strategy I could think of is to make finalized block the best block.

So my vote is not to track leaves on light && instead always make last finalized block the best block on light nodes. Obviously, I can't be a final judge here - that's just my opinion.

andresilva · 2019-07-01T13:23:04Z

instead always make last finalized block the best block on light nodes.

@svyatonik This is the approach that's implemented in this PR, but we only update the best block when finalizing if it's from a divergent branch compared to the current best block, otherwise we keep the current best block as is.

We need to apply it upon block import but also upon "external events" like importing availability messages. It would be nice to do this lazily, but there doesn't seem to be an elegant way to integrate something like that with every call to best.

@rphmeier Doing it lazily makes sense since it might be potentially expensive, and given that constraint I better understand the original idea behind SelectChain. In that case, I think we can assume that the best block as reported by the client is always a "best effort". Any operation that requires accurate knowledge of best block (authoring or voting) will have to go through SelectChain. Assuming that I think we should be able to use the same approach that we have for the light client and just set the finalized block to be the best block in case of divergence (no need for SelectChain in finalize_block). This means that the client might be a bit behind on the best block, but that should also take care of itself during the next block import. WDYT?

andresilva · 2019-07-05T18:09:34Z

@rphmeier Updated this PR with your suggestions. SelectChain is now mandatory, the light client uses a DummySelectChain which always returns the best block that's stored in the backend and for best_containing returns the given best hash. Also reverted some changes to SelectChain, it now contains finality_target and best_containing methods. The default implementation of finality_target is to call best_containing. The GRANDPA voter appropriately uses finality_target.

The light client code is ready to deal with an arbitrary SelectChain even though it is static and cannot be configured at this point.

rphmeier

We may want to rethink the BlockImport stuff a bunch. Originally they were intended to be composable. Maybe the SelectChain thing should be passed in with the ImportBlock parameter as opposed to being owned by the BlockImport.

andresilva · 2019-07-15T19:24:18Z

I talked to @rphmeier about this and we decided the extra complexity added with finalize_block taking SelectChain wasn't worth it. Any usage that requires tracking the best block accurately (e.g. block authoring or voting), should go through SelectChain. The best block as reported by the client is mainly a DB optimization and should be considered possible outdated (e.g. probably fine for usage by the informant).

I updated the PR so that when finalizing a block that diverges from the current best branch we set the finalized block as the best block (much simpler). I also needed to make some changes to the LongestChain implementation, so that it didn't rely so closely on the best block reported by the client.

andresilva added A0-please_review Pull request needs code review. M4-core labels Jun 14, 2019

andresilva force-pushed the andre/reorg-on-finality branch from be88c6f to 068f966 Compare June 14, 2019 15:02

andresilva requested review from rphmeier and svyatonik June 14, 2019 15:37

Demi-Marie self-requested a review June 14, 2019 18:23

Demi-Marie reviewed Jun 14, 2019

View reviewed changes

rphmeier reviewed Jun 14, 2019

View reviewed changes

rphmeier suggested changes Jun 14, 2019

View reviewed changes

rphmeier added A5-grumble and removed A0-please_review Pull request needs code review. labels Jun 14, 2019

andresilva force-pushed the andre/reorg-on-finality branch from c9a3fbb to 3ff701c Compare July 5, 2019 17:32

andresilva added A0-please_review Pull request needs code review. and removed A5-grumble labels Jul 5, 2019

andresilva force-pushed the andre/reorg-on-finality branch from b69bd29 to da12388 Compare July 5, 2019 18:38

rphmeier approved these changes Jul 15, 2019

View reviewed changes

andresilva added 3 commits July 15, 2019 19:00

client: add tests for reorging on diverged finality

76e1ae9

client: mark finalized block as best if diverged from current best chain

80a4165

client: update meta on set_head

10682fe

andresilva added 4 commits July 15, 2019 19:14

core: add docs about SelectChain to finalize_block

052554a

client: improve finality reorg test

dca006f

client: LongestChain doesn't return client best block

71b552f

client: LongestChain searches canonical chain

26424a1

andresilva force-pushed the andre/reorg-on-finality branch from e51d404 to 26424a1 Compare July 15, 2019 19:18

gavofyork merged commit b6e6707 into master Jul 15, 2019

gavofyork deleted the andre/reorg-on-finality branch July 15, 2019 23:50

andresilva mentioned this pull request Sep 30, 2019

grandpa: pluggable voting rules #3673

Merged

Conversation

andresilva commented Jun 14, 2019

Uh oh!

Demi-Marie left a comment

Choose a reason for hiding this comment

Uh oh!

rphmeier Jun 14, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rphmeier Jun 14, 2019

Choose a reason for hiding this comment

Uh oh!

rphmeier Jun 14, 2019

Choose a reason for hiding this comment

Uh oh!

rphmeier Jun 14, 2019

Choose a reason for hiding this comment

Uh oh!

rphmeier left a comment

Choose a reason for hiding this comment

Uh oh!

andresilva commented Jun 14, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gavofyork commented Jun 25, 2019

Uh oh!

andresilva commented Jun 25, 2019

Uh oh!

svyatonik commented Jun 25, 2019

Uh oh!

rphmeier commented Jun 25, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

svyatonik commented Jun 25, 2019

Uh oh!

andresilva commented Jul 1, 2019

Uh oh!

andresilva commented Jul 5, 2019

Uh oh!

rphmeier left a comment

Choose a reason for hiding this comment

Uh oh!

andresilva commented Jul 15, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

rphmeier Jun 14, 2019 •

edited

Loading

andresilva commented Jun 14, 2019 •

edited

Loading

rphmeier commented Jun 25, 2019 •

edited

Loading