Skip to content

[enh][transaction] Optimize to reuse transaction buffer snapshot writer#15015

Closed
gaoran10 wants to merge 8 commits intoapache:masterfrom
gaoran10:tb-snapshot-writer-reuse
Closed

[enh][transaction] Optimize to reuse transaction buffer snapshot writer#15015
gaoran10 wants to merge 8 commits intoapache:masterfrom
gaoran10:tb-snapshot-writer-reuse

Conversation

@gaoran10
Copy link
Copy Markdown
Contributor

@gaoran10 gaoran10 commented Apr 3, 2022

Motivation

The transaction buffer(short for TB) snapshot topic is used to persistent snapshot data for TB, the snapshot can reduce read data when recovering.

Currently, if enable the transaction feature, every topic will create a TB snapshot producer, the topic of the producer is a namespace system topic, so we don't need to create a producer for every topic, one producer per namespace is enough.

Modification

Add a new inner class ReferenceCountedWriter, if the ReferenceCountedWriter for one namespace does not exist, it will be initialized, or else its reference count will be increased, if the reference count reduces to 0, it will be removed from map-cache.

Verifying this change

Adjust existing tests to cover this change.

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API: (no)
  • The schema: (no)
  • The default values of configurations: (no)
  • The wire protocol: (no)
  • The rest endpoints: (no)
  • The admin cli options: (no)
  • Anything that affects deployment: (no)

Documentation

Check the box below or label this PR directly (if you have committer privilege).

Need to update docs?

  • doc-required

    (If you need help on updating docs, create a doc issue)

  • no-need-doc

    (Please explain why)

  • doc

    (If this PR contains doc changes)

@github-actions github-actions Bot added the doc-not-needed Your PR changes do not impact docs label Apr 3, 2022
@gaoran10 gaoran10 added type/feature The PR added a new feature or issue requested a new feature area/transaction labels Apr 3, 2022
@gaoran10 gaoran10 changed the title [enh][transaction] Optimize to reuse transaction buffer snapshot writer [WIP][enh][transaction] Optimize to reuse transaction buffer snapshot writer Apr 4, 2022
@gaoran10 gaoran10 changed the title [WIP][enh][transaction] Optimize to reuse transaction buffer snapshot writer [enh][transaction] Optimize to reuse transaction buffer snapshot writer Apr 6, 2022

private synchronized void initWriterFuture() {
this.future = service.getTransactionBufferSystemTopicClient(namespaceName).newWriterAsync();
this.future.thenRunAsync(this.backoff::reset).exceptionally(throwable -> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use thenRunAsync with common-pool here? I'm not sure we should use a specific executor?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the common pool have some potential problems?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's OK here.
Just to prevent others from using common-pool to do some IO-intensive tasks and affecting there.
After deep thinking, I think we need to check all abuses of common-pool, not against it.

// Resolve potential race condition problem, if retain method encounter reference count exception
// or other exceptions, create a new `ReferenceCountedWriter`, when the `ReferenceCountedWriter` release
// but didn't remove from `writerFutureMap`.
return new ReferenceCountedWriter(namespaceName, this);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Should we close the original writer?

Copy link
Copy Markdown
Contributor Author

@gaoran10 gaoran10 Apr 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous writer will be closed when its reference count reduces to 0, if its reference count reduces to 0, it must call the method deallocate.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for your explanation.


// The class ReferenceCountedWriter will maintain the reference count,
// when the reference count decrement to 0, it will be removed from writerFutureMap, the writer will be closed.
public static class ReferenceCountedWriter extends AbstractReferenceCounted {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest don't extends AbstractReferenceCounted, deallocate seem not be used. and is complicated to implement

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the reference count reduces to 0, the deallocate will be called, maybe using it to compute the reference count is simple, it's thread-safe.

private synchronized void initWriterFuture() {
this.future = service.getTransactionBufferSystemTopicClient(namespaceName).newWriterAsync();
this.future.thenRunAsync(this.backoff::reset).exceptionally(throwable -> {
long delay = backoff.next();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seem we only use backoff.next() once and then reset

Copy link
Copy Markdown
Contributor Author

@gaoran10 gaoran10 Apr 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If encounter exceptions the backoff will increase, if the writer creates successfully, it will be reset.

@gaoran10 gaoran10 changed the title [enh][transaction] Optimize to reuse transaction buffer snapshot writer [WIP][enh][transaction] Optimize to reuse transaction buffer snapshot writer Apr 8, 2022
@gaoran10 gaoran10 changed the title [WIP][enh][transaction] Optimize to reuse transaction buffer snapshot writer [enh][transaction] Optimize to reuse transaction buffer snapshot writer Apr 12, 2022
The transaction buffer(short for TB) snapshot topic is used to persistent snapshot data for TB, the snapshot can reduce read data when recovering.

Currently, if enable transaction feature, every topic will create a TB snapshot producer, the topic of the producer is a namespace system topic, so we don't need to create producer for every topic, one producer per namespace is enough.

# Modification

Add a new inner class `ReferenceCountedWriter`, it extends `AbstractReferenceCounted`, if the `ReferenceCountedWriter` is not exist, it will be initilied, or else its reference count will be increased, after the reference count reduce to 0, it will be removed from map cache.
@gaoran10 gaoran10 force-pushed the tb-snapshot-writer-reuse branch from a3c098b to 7cac709 Compare April 12, 2022 16:01
Copy link
Copy Markdown
Contributor

@congbobo184 congbobo184 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work! left some comments

}

private void closePendingCloseWriter() {
Iterator<Writer<TransactionBufferSnapshot>> iterator = pendingCloseWriterList.stream().iterator();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: why are you creating the stream?
We can iterate on the LinkedList directly, creating less garbage and also making the code simpler

@github-actions
Copy link
Copy Markdown

The pr had no activity for 30 days, mark with Stale label.

@github-actions
Copy link
Copy Markdown

The pr had no activity for 30 days, mark with Stale label.

@gaoran10
Copy link
Copy Markdown
Contributor Author

This PR is too old, I create a new one #19641.

@gaoran10 gaoran10 closed this Feb 28, 2023
@gaoran10 gaoran10 deleted the tb-snapshot-writer-reuse branch March 21, 2023 01:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/transaction doc-not-needed Your PR changes do not impact docs lifecycle/stale Stale type/feature The PR added a new feature or issue requested a new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants