@massakam
Contributor

Fixes #6437

Motivation

In a replicated subscription with no consumers connected, the number of marker messages in the backlog will continue to increase. If at least one consumer is connected, the marker messages will be acknowledged and deleted by the dispatcher.

if (msgMetadata == null || Markers.isServerOnlyMarker(msgMetadata)) {
    PositionImpl pos = (PositionImpl) entry.getPosition();
    // Message metadata was corrupted or the message was a server-only marker
    if (Markers.isReplicatedSubscriptionSnapshotMarker(msgMetadata)) {
        processReplicatedSubscriptionSnapshot(pos, metadataAndPayload);
    }
    entries.set(i, null);
    entry.release();
    subscription.acknowledgeMessage(Collections.singletonList(pos), AckType.Individual,
            Collections.emptyMap());
    continue;
}

However, if no consumers are connected, the dispatcher either does not exist or has stopped reading entries. As a result, the marker messages accumulate in the backlog without being acknowledged by anyone.
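
The effect described above can be reproduced with a small, self-contained model. This is only a sketch: `MarkerBacklogDemo`, `Entry`, and `Subscription` are hypothetical stand-ins rather than Pulsar classes, and the only behavior taken from the description is that markers are acknowledged solely by a dispatcher that is actively reading.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model; none of these types are Pulsar classes.
class MarkerBacklogDemo {
    // One topic entry: either a user message or a server-only marker.
    static final class Entry {
        final long position;
        final boolean serverOnlyMarker;

        Entry(long position, boolean serverOnlyMarker) {
            this.position = position;
            this.serverOnlyMarker = serverOnlyMarker;
        }

        boolean isServerOnlyMarker() {
            return serverOnlyMarker;
        }
    }

    static final class Subscription {
        final List<Entry> backlog = new ArrayList<>();
        boolean consumerConnected;

        // Imitates the dispatcher: it reads entries only while a consumer is
        // connected and acknowledges (drops) server-only markers it encounters.
        void dispatch() {
            if (!consumerConnected) {
                return; // nothing reads the entries, so nothing is acknowledged
            }
            backlog.removeIf(Entry::isServerOnlyMarker);
        }
    }

    // Publishes the given number of markers and returns the remaining backlog size.
    static int backlogAfterMarkers(boolean consumerConnected, int markers) {
        Subscription sub = new Subscription();
        sub.consumerConnected = consumerConnected;
        for (long pos = 0; pos < markers; pos++) {
            sub.backlog.add(new Entry(pos, true));
            sub.dispatch();
        }
        return sub.backlog.size();
    }

    public static void main(String[] args) {
        System.out.println("with consumer:    " + backlogAfterMarkers(true, 100));
        System.out.println("without consumer: " + backlogAfterMarkers(false, 100));
    }
}
```

With a consumer connected the backlog stays empty; without one, every marker published remains in the backlog.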

Modifications

There are four types of marker messages:

  • ReplicatedSubscriptionsSnapshotRequest
  • ReplicatedSubscriptionsSnapshotResponse
  • ReplicatedSubscriptionsSnapshot
  • ReplicatedSubscriptionsUpdate

Of these, all except ReplicatedSubscriptionsSnapshot are not used in the local cluster. They are published to local topics only so that they can be sent to remote clusters. Therefore, I modified the ReplicatedSubscriptionsController class to acknowledge these marker messages on all subscriptions immediately after publishing them to a local topic.

On the other hand, ReplicatedSubscriptionsSnapshot is used only in the local cluster and does not need to be sent to remote clusters. Therefore, ReplicatedSubscriptionsSnapshot is no longer published to topics.

In addition, marker messages sent from remote clusters are now acknowledged by the replicator on all subscriptions.

With these changes, marker messages no longer continue to accumulate in the replicated subscription backlog.
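
The three changes can be pictured with the following sketch. It is an illustration under stated assumptions, not the code in this PR: `MarkerAckSketch`, `Topic`, `Subscription`, `writeMarker`, and `onRemoteMarker` are hypothetical names, and positions are plain `long` values instead of ledger/entry pairs.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; these types only imitate the roles described above.
class MarkerAckSketch {
    enum MarkerType { SNAPSHOT_REQUEST, SNAPSHOT_RESPONSE, SNAPSHOT, UPDATE }

    static final class Subscription {
        final Set<Long> unacked = new HashSet<>(); // positions still in the backlog
        int snapshotsApplied;                      // snapshots handed over directly
    }

    static final class Topic {
        final Map<String, Subscription> subscriptions = new LinkedHashMap<>();
        long nextPosition = 0;

        // Appends an entry; every subscription initially sees it as backlog.
        long publish() {
            long pos = nextPosition++;
            subscriptions.values().forEach(s -> s.unacked.add(pos));
            return pos;
        }

        void ackOnAllSubscriptions(long pos) {
            subscriptions.values().forEach(s -> s.unacked.remove(pos));
        }
    }

    // Controller side: markers meant for remote clusters are published and then
    // acknowledged at once. The snapshot marker is applied to the subscriptions
    // directly and is never published.
    static void writeMarker(Topic topic, MarkerType type) {
        if (type == MarkerType.SNAPSHOT) {
            topic.subscriptions.values().forEach(s -> s.snapshotsApplied++);
            return;
        }
        long pos = topic.publish();
        topic.ackOnAllSubscriptions(pos);
    }

    // Replicator side: a marker arriving from a remote cluster is written to the
    // local topic and acknowledged on all subscriptions in the same way.
    static void onRemoteMarker(Topic topic) {
        long pos = topic.publish();
        topic.ackOnAllSubscriptions(pos);
    }

    static int totalBacklog(Topic topic) {
        return topic.subscriptions.values().stream().mapToInt(s -> s.unacked.size()).sum();
    }

    public static void main(String[] args) {
        Topic topic = new Topic();
        topic.subscriptions.put("sub-a", new Subscription());
        for (MarkerType type : MarkerType.values()) {
            writeMarker(topic, type);
        }
        onRemoteMarker(topic);
        System.out.println("backlog: " + totalBacklog(topic));
    }
}
```

In this model no subscription needs a connected consumer for its backlog to stay empty, because acknowledgement no longer depends on the dispatcher.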

@massakam massakam added this to the 2.6.0 milestone Mar 23, 2020
@massakam massakam self-assigned this Mar 23, 2020
@sijie sijie requested a review from codelipenghui March 26, 2020 05:18
@sijie
Member

sijie commented Mar 26, 2020

@jiazhai @codelipenghui @merlimat can you review this pull request?

Comment on lines +126 to +137
try {
    ReplicatedSubscriptionsSnapshot snapshot = Markers.instantiateReplicatedSubscriptionsSnapshot(snapshotId,
            controller.localCluster(), p.getLedgerId(), p.getEntryId(), responses);
    controller.topic().getSubscriptions().forEach((subName, sub) -> {
        if (sub != null) {
            sub.processReplicatedSubscriptionSnapshot(snapshot);
        }
    });
} catch (Throwable t) {
    log.warn("[{}] Failed to process replicated subscription snapshot {} -- {}", controller.topic().getName(),
            snapshotId, t.getMessage());
}
Contributor

I don't quite understand this part. If we don't store snapshots in the topic, how can we recover them after a broker restart?

Contributor Author

I don't think that snapshots need to be persisted because they are constantly being updated with new ones, is that wrong? What issues do you think would arise if we don't persist ReplicatedSubscriptionsSnapshot messages?
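
To make the question concrete, here is a minimal sketch of a purely in-memory snapshot store. It is an assumption-laden illustration, not the broker's implementation: `InMemorySnapshotCacheSketch`, `addSnapshot`, and `advanceMarkDelete` are hypothetical names, and positions are simplified to `long` values.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of an in-memory snapshot store; not a Pulsar class.
class InMemorySnapshotCacheSketch {
    // Snapshots keyed by the local position at which they were taken.
    private final TreeMap<Long, String> snapshotsByPosition = new TreeMap<>();

    void addSnapshot(long localPosition, String snapshotId) {
        snapshotsByPosition.put(localPosition, snapshotId);
    }

    // Returns the newest snapshot at or before the mark-delete position and
    // drops it together with all older ones; returns null if none qualifies.
    String advanceMarkDelete(long markDeletePosition) {
        Map.Entry<Long, String> entry = snapshotsByPosition.floorEntry(markDeletePosition);
        if (entry == null) {
            return null;
        }
        String snapshotId = entry.getValue();
        snapshotsByPosition.headMap(entry.getKey(), true).clear();
        return snapshotId;
    }

    int size() {
        return snapshotsByPosition.size();
    }
}
```

In this model a restart corresponds to starting with a fresh, empty instance: `advanceMarkDelete` returns null until the next snapshot is added, after which it works as before. Whether that gap is acceptable is exactly the point under discussion here.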

@massakam massakam changed the title from "[Issue 6437][broker] Prevent marker messages from accumulating in backlog of replicated subscription" to "[WIP][Issue 6437][broker] Prevent marker messages from accumulating in backlog of replicated subscription" Apr 22, 2020
@codelipenghui
Contributor

Moved to 2.7.0; feel free to move it back if needed.

@codelipenghui codelipenghui modified the milestones: 2.6.0, 2.7.0 Jun 4, 2020
@massakam massakam force-pushed the repl-sub-backlog branch from b402e99 to 6815fad Compare June 9, 2020 04:36
@massakam
Contributor Author

massakam commented Jun 9, 2020

Closing this PR for now.

Development

Successfully merging this pull request may close these issues.

Backlog messages increase if no consumer is connected to replicated subscription