Skip to content

Conversation

@aymkhalil
Copy link
Contributor

@aymkhalil aymkhalil commented Dec 29, 2022

Continuation of #18518 and #17609 to ensure the ns policy deleted is in consistent state between brokers when performing and isAllowAutoTopicCreationAsync

Motivation

Modifications

Leveraged the MetadataStore sync api to have a consistent view of the ns policy that will feed into the isAllowAutoTopicCreationAsync.

Verifying this change

  • Make sure that the change passes the CI checks.

This change is already covered by existing tests, such as (please describe tests).

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: aymkhalil#5

@github-actions
Copy link

@aymkhalil Please add the following content to your PR description and select a checkbox:

- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

@aymkhalil aymkhalil marked this pull request as draft December 29, 2022 00:39
@github-actions github-actions bot added doc-not-needed Your PR changes do not impact docs and removed doc-label-missing labels Dec 29, 2022
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that you found a valid case in which we need to use "refresh" (sync()), but the current patch is using refresh too eagerly (probably)

I have left one comment, please check

}

return pulsar.getPulsarResources().getNamespaceResources()
.getPoliciesAsync(topicName.getNamespaceObject(), true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that there maybe some case in which it is good to use the local version of the Policies object.

We need to use "refresh" only when we reach here from a HTTP API call,
I know that is looks pretty convoluted, but with this change in the current form we are going to use "refresh" (that is a heavyweight operation) probably in code paths that don't need it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I traced http vs binary protocol calls that invoke this code:
http calls:

binary protocol calls:

  • handlePartitionMetadataRequest
  • handleSubscribe
  • handleProducer

I isolated the calls with as minimum code changes as possible. Please evaluate the value of the change vs the introduced code branches.

PTAL.

@aymkhalil aymkhalil requested review from eolivelli and poorbarcode and removed request for eolivelli December 30, 2022 00:54
@aymkhalil aymkhalil marked this pull request as ready for review December 30, 2022 00:54
});
}

public static CompletableFuture<PartitionedTopicMetadata> getPartitionedTopicMetadata(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not used anywhere, removed it to minimize the code paths we need to think about.

@aymkhalil aymkhalil requested review from eolivelli and poorbarcode and removed request for eolivelli and poorbarcode December 30, 2022 00:58
@eolivelli eolivelli dismissed their stale review December 31, 2022 10:41

I will review later

@eolivelli
Copy link
Contributor

At first glance now the patch looks good, but I will review it more carefully next week

@github-actions
Copy link

The pr had no activity for 30 days, mark with Stale label.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@dlg99 dlg99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I don't think this change alone can completely prevent the topic creation during ns deletion: one can sync the policy and get the latest one, then ns deletion starts, then we create a topic - there is no locking/distributed locking for these operations.
It might reduce frequency of the problem occurrence.

@dlg99
Copy link
Contributor

dlg99 commented Jan 31, 2023

Have you considered the double-check:
use cached policies to decide on eligibility as without your changes, check again with refresh right before managedLedgerFactory.asyncOpen call in the BrokerService, fail if ns.deleted.

@aymkhalil
Copy link
Contributor Author

FWIW I don't think this change alone can completely prevent the topic creation during ns deletion: one can sync the policy and get the latest one, then ns deletion starts, then we create a topic - there is no locking/distributed locking for these operations. It might reduce frequency of the problem occurrence.

+1. If fact, any "read" based solution to achieve consistency will be best effort only. I don't see how it could be solved for good without conditional writes which are not well supported. For example, in ZK the setData API does not support something like "fail if any version exists" but also there will be need to coordinate updates because the metadata for policy and topic (managed ledger) are under different nodes.

So, I couldn't find an easy way to do it that fits in the scope/purpose of this patch...

@aymkhalil
Copy link
Contributor Author

aymkhalil commented Jan 31, 2023

Have you considered the double-check:
use cached policies to decide on eligibility as without your changes, check again with refresh right before managedLedgerFactory.asyncOpen call in the BrokerService, fail if ns.deleted.

I didn't explicitly, the approach was to trace all http calls that touches the ns policy (here) so we cover more cases, but not cases where this could add unnecessary overhead (hence the distinction between http vs binary calls).

But you made me think, actually the automatic topic creation triggered by a producer is actually on the binary protocol path so I either have to make all isAllowAutoTopicCreationAsync use the refresh API, or consider limiting the scope to the ML open async as you proposed...

Regarding the latter approach, any overhead concerns to refreshing the policy metadata before ANY attempt to call managedLedgerFactory.asyncOpen? It seems to me topic create is a control/management API call that won't be highly sensitive to latency but I wanna avoid making assumptions. WDYT @dlg99 @eolivelli ?

Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern about the solution is that it will degrade performance significantly for any hot paths where there's Zookeeper sync calls. We don't have performance benchmarks as part of the Pulsar CI pipeline and the performance benchmark would have to be done separately.

This potential performance problem is already introduced by

in #18518 so this PR isn't really where this performance problem has been introduced.

I guess that the original goal was to prevent new topics from being created while the namespace is being deleted?

I believe that we could find a solution where we could combine eventual consistency for performance and use a way to ensure strong consistency with the tradeoff of having some delays in reaching this consensus. My assumption is that requiring strong consistency at all times using Zookeeper's "sync" would be too slow. I could be wrong.

I'm not exactly sure of the details, but I was thinking that perhaps the part that needs to delete the namespace waits a period of time until the eventual consistent data can be considered to be eventually reached a certain state. A better approach would be to have an explicit way to ensure that all distributed state has been replicated, but with the current Metadata Store design, waiting for a period of time could be a reasonable solution.

I suggest that we don't proceed further in merging this PR before this potential performance issue is resolved.

It seems that Matteo's comment #18518 (comment) in the original PR that introduced the ZK sync wasn't fully addressed.
I consider isAllowAutoTopicCreationAsync to be on the hot path and therefore using ZK sync is not a proper solution unless we are fine with performance regressions.

@lhotari lhotari requested review from hangc0276 and merlimat February 3, 2023 17:03
@aymkhalil
Copy link
Contributor Author

After collecting feedback, I'm closing this PR for now bc:

  1. It does not address the consistency problem for auto created topics since those happen on the binary protocol path (i.e. a producer)
  2. Concerns about extra overhead of calling the sync API on multiple code paths
  3. Even if we limit the sync API to when ML.asyncOpen() call is attempted, it means that for 100% of topic creation attempt we enforce the sync although the problem happens only in corner cases. The actual impact here is not measured due to lack of data points regarding the sync API latency && the frequency of the delete problem (clarified in the next point).
  4. The underlying issue (which is creating topics while ns delete is in progress) could be already mitigated to an extent with the delete retries implemented in [fix][broker] Fix delete namespace fail by a In-flight topic #19374

So the idea would be to:

  1. Monitor if the problem happens again after merging [fix][broker] Fix delete namespace fail by a In-flight topic #19374
  2. Reproduce the problem
  3. Submit another PR for the fix in-light of the above information (please note that all solutions are best effort only as explained here: [improve] Refresh ns policy when deciding auto topic creation eligibility #19097 (comment)

@aymkhalil aymkhalil closed this Feb 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-not-needed Your PR changes do not impact docs ready-to-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants