Skip to content

KAFKA-13783; Remove reason prefixing in JoinGroupRequest and LeaveGroupRequest#11971

Merged
dajac merged 4 commits intoapache:trunkfrom
dajac:minor-remove-reason-prefix
Mar 31, 2022
Merged

KAFKA-13783; Remove reason prefixing in JoinGroupRequest and LeaveGroupRequest#11971
dajac merged 4 commits intoapache:trunkfrom
dajac:minor-remove-reason-prefix

Conversation

@dajac
Copy link
Copy Markdown
Member

@dajac dajac commented Mar 30, 2022

KIP-800 introduced a mechanism to pass a reason in the join group request and in the leaver group request. A default reason is used unless one is provided by the user. In this case, the custom reason is prefixed by the default one.

When we tried to used this in Kafka Streams, we noted a significant degradation of the performances, see #11873. It is not clear wether the prefixing is the root cause of the issue or not. To be on the safe side, I think that we should remove the prefixing. It does not bring much anyway as we are still able to distinguish a custom reason from the default one on the broker side.

This patch removes prefixing the user provided reasons. So if a the user provides a reason, the reason is used directly. If the reason is empty or null, the default reason is used.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@dajac dajac requested a review from showuon March 30, 2022 08:20
@showuon
Copy link
Copy Markdown
Member

showuon commented Mar 30, 2022

I'll take a look tomorrow. Thanks.

@dajac
Copy link
Copy Markdown
Member Author

dajac commented Mar 30, 2022

cc @cadonna I'd like to get this one in 3.2.

@cadonna
Copy link
Copy Markdown
Member

cadonna commented Mar 30, 2022

@dajac Please go ahead!

List<MemberIdentity> membersToRemove = new ArrayList<>();
for (final MemberDescription member : members) {
MemberIdentity memberIdentity = new MemberIdentity()
.setReason(reason);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QQ: seems reason is also used in equals, is it ok to add this? Will this cause membersToRemove not found from somewhere?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Let me double check this. For the context, we were doing this before this patch as well.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not seem to be an issue. I have extended the unit test to ensure that the look up works as we expect.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: does .setReason() have to be in its own line?

Copy link
Copy Markdown
Contributor

@jeffkbkim jeffkbkim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! left a few comments

List<MemberIdentity> membersToRemove = new ArrayList<>();
for (final MemberDescription member : members) {
MemberIdentity memberIdentity = new MemberIdentity()
.setReason(reason);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: does .setReason() have to be in its own line?


// a first rebalance to get the assignment, we need two poll calls since we need two round trips to finish join / sync-group
consumer.poll(Duration.ZERO);
consumer.poll(Duration.ZERO);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the test passes without the second poll. the first poll finishes the sync

INFO Successfully synced group in generation

before the second poll is triggered.
The second poll notifies the assignor and gets committed offsets which i don't think is necessary in this test

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what we want is to wait for rebalance complete. we can explicitly wait like this:

TestUtils.waitForCondition(() -> {
            ConsumerRecords<String, String> recs = consumer.poll(Duration.ofMillis(100L));
            return consumer.assignment().equals(Utils.mkSet(tp0, t2p0));
});

WDYT?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just refactored this test based on @jeffkbkim's comment. In the end, we only care about the join group call in this context so I have changed the test to only care about this part.

Comment on lines +3743 to +3752
String reason = options.reason() == null || options.reason().isEmpty() ?
DEFAULT_LEAVE_GROUP_REASON : options.reason();

List<MemberIdentity> members;
if (options.removeAll()) {
members = getMembersFromGroup(groupId);
members = getMembersFromGroup(groupId, reason);
} else {
members = options.members().stream().map(MemberToRemove::toMemberIdentity).collect(Collectors.toList());
members = options.members().stream()
.map(m -> m.toMemberIdentity().setReason(reason))
.collect(Collectors.toList());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this have been done as part of KIP-800?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a small refactoring of the code that we did in KIP-800. It basically set the reason when the member identity is created instead of re-iterating over the members afterwards. I think that it is a bit clearer this way. I guess that we could have done like this in the original implementation but I did not think about it last time.

Copy link
Copy Markdown
Member

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. LGTM! Left a minor comment. Thanks.


// a first rebalance to get the assignment, we need two poll calls since we need two round trips to finish join / sync-group
consumer.poll(Duration.ZERO);
consumer.poll(Duration.ZERO);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what we want is to wait for rebalance complete. we can explicitly wait like this:

TestUtils.waitForCondition(() -> {
            ConsumerRecords<String, String> recs = consumer.poll(Duration.ofMillis(100L));
            return consumer.assignment().equals(Utils.mkSet(tp0, t2p0));
});

WDYT?

@dajac
Copy link
Copy Markdown
Member Author

dajac commented Mar 31, 2022

@jeffkbkim @showuon Thanks for your comments. I have addressed them.

Copy link
Copy Markdown
Member

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks.

@dajac dajac merged commit ce7788a into apache:trunk Mar 31, 2022
dajac added a commit that referenced this pull request Mar 31, 2022
…upRequest (#11971)

KIP-800 introduced a mechanism to pass a reason in the join group request and in the leave group request. A default reason is used unless one is provided by the user. In this case, the custom reason is prefixed by the default one.

When we tried to used this in Kafka Streams, we noted a significant degradation of the performances, see #11873. It is not clear wether the prefixing is the root cause of the issue or not. To be on the safe side, I think that we should remove the prefixing. It does not bring much anyway as we are still able to distinguish a custom reason from the default one on the broker side.

This patch removes prefixing the user provided reasons. So if a the user provides a reason, the reason is used directly. If the reason is empty or null, the default reason is used.

Reviewers: Luke Chen <showuon@gmail.com>, <jeff.kim@confluent.io>, Hao Li <hli@confluent.io>
@dajac dajac deleted the minor-remove-reason-prefix branch March 31, 2022 12:34
@dajac
Copy link
Copy Markdown
Member Author

dajac commented Mar 31, 2022

Merged to trunk and to 3.2.

@ijuma
Copy link
Copy Markdown
Member

ijuma commented Mar 31, 2022

Can we make sure this actually helps with the benchmark that previously regressed?

@cadonna
Copy link
Copy Markdown
Member

cadonna commented Mar 31, 2022

@lihaosky Could you please try to re-add the leave reason in Streams and re-run the benchmarks?

@lihaosky
Copy link
Copy Markdown
Contributor

Sure thing!

@dajac
Copy link
Copy Markdown
Member Author

dajac commented Apr 7, 2022

We have found the root cause of the original issue. It was a bug in our benchmark setup so it had nothing to do with this change in the end. We can still keep this change as it looks better to me like this.

@showuon
Copy link
Copy Markdown
Member

showuon commented Apr 7, 2022

We have found the root cause of the original issue. It was a bug in our benchmark setup so it had nothing to do with this change in the end. We can still keep this change as it looks better to me like this.

Thanks for the update. Great to hear that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants