KIP-229: DeleteGroups API#4479

Merged
hachikuji merged 8 commits into apache:trunk from vahidhashemian:KAFKA-6275
Jan 31, 2018

@vahidhashemian
Contributor

@vahidhashemian vahidhashemian commented Jan 26, 2018

This PR implements KIP-229.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@vahidhashemian
Contributor Author

@hachikuji, due to the fast-approaching feature freeze, I thought I'd ask your opinion: the KIP proposes an error at the high level and then per-group errors. But I now think that the high-level error is not really something that could apply to all groups that are requested to be deleted. For example, Errors.COORDINATOR_NOT_AVAILABLE would apply to some groups (the coordinator of some may be available while the coordinator of others may not), Errors.GROUP_AUTHORIZATION_FAILED would apply to some others, and so on. So it sounds to me like the high-level error code should be removed from the protocol.

Just wanted to get your feedback on whether my understanding is correct and, if so, on the course of action. Thanks a lot.

@hachikuji
Contributor

@vahidhashemian Yes, I was in fact wondering about that when I read through the KIP. The only case I could think of when we'd take advantage of it would be unhandled errors. If we don't have a good use case for it at the moment, I think it would be fine to drop it.

@vahidhashemian
Contributor Author

@hachikuji thanks a lot for the quick response. Should I just update the KIP with this? Any notification/revote required?

@hachikuji
Contributor

I'd suggest updating the KIP and sending a message to the discussion thread. We often change minor details during implementation, so I don't think a revote will be needed, but we can see if anyone has feedback.

@vahidhashemian vahidhashemian force-pushed the KAFKA-6275 branch 3 times, most recently from 8dd6874 to ae247cf Compare January 27, 2018 00:58
@vahidhashemian vahidhashemian changed the title KIP-229: DeleteGroups API (WIP) KIP-229: DeleteGroups API Jan 27, 2018
@vahidhashemian
Contributor Author

@hachikuji, would appreciate your feedback on this PR when you get a chance. Thanks!

Contributor

@hachikuji hachikuji left a comment

Thanks for the patch. Left some comments.

@Override
public AbstractResponse getErrorResponse(int throttleTimeMs, Throwable e) {
Errors error = Errors.forException(e);
Map<String, Errors> groupErrors = new HashMap<>();
Contributor

nit: may as well initialize with the right size

return new DeleteGroupsResponse(throttleTimeMs, groupErrors);
default:
throw new IllegalArgumentException(String.format("Version %d is not valid. Valid versions for %s are 0 to %d",
version(), this.getClass().getSimpleName(), ApiKeys.DELETE_GROUPS.latestVersion()));
Contributor

nit: instead of the class name, maybe use ApiKeys.DELETE_GROUPS.name.



/**
* Possible error codes:
Contributor

I think NOT_COORDINATOR should also be possible?

Contributor Author

Correct. Also COORDINATOR_LOAD_IN_PROGRESS if I'm not mistaken?


def deleteConsumerGroups(groups: List[String]): Map[String, Errors] = {
var errors: Map[String, Errors] = Map()
val groupsPerCoordinator = groups.map { group =>
Contributor

I think I'd suggest moving coordinator lookup to a separate function. You might also consider using the Either class to distinguish errors since the mixture of functional logic and updates to the mutable errors collection is a little odd.
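
A minimal sketch of what this suggestion could look like, assuming simplified stand-in types: `Node`, `LookupError`, `findCoordinator`, and the lookup table are hypothetical and are not the classes used in the patch. It uses the conventional orientation in which `Right` carries the coordinator and `Left` carries the per-group error, so both results are built without a mutable `errors` variable.

```scala
// Hypothetical stand-ins; these are not the actual Kafka classes.
sealed trait LookupError
case object CoordinatorNotAvailable extends LookupError

final case class Node(id: Int)

object CoordinatorLookupSketch {
  // Assumed lookup table standing in for a coordinator lookup round trip.
  val knownCoordinators: Map[String, Node] =
    Map("group1" -> Node(1), "group2" -> Node(1), "group3" -> Node(2))

  // Right carries the coordinator, Left carries the per-group error.
  def findCoordinator(group: String): Either[LookupError, Node] =
    knownCoordinators.get(group).toRight(CoordinatorNotAvailable)

  // Returns (per-group lookup errors, groups bucketed by coordinator).
  def groupByCoordinator(
      groups: List[String]): (Map[String, LookupError], Map[Node, List[String]]) = {
    val lookups = groups.map(g => g -> findCoordinator(g))
    val errors = lookups.collect { case (g, Left(e)) => g -> e }.toMap
    val byNode = lookups
      .collect { case (g, Right(n)) => n -> g }
      .groupBy(_._1)
      .map { case (n, pairs) => n -> pairs.map(_._2) }
    (errors, byNode)
  }
}
```

Keeping the lookup in its own function means the caller only has to send one request per coordinator and merge the returned errors with the lookup errors.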

Contributor Author

I tried to improve upon this in the new commit. Please let me know what you think.

}

override def deleteGroups(): Map[String, Errors] = {
val groupsToDelete = opts.options.valuesOf(opts.groupOpt).asScala.toList
Contributor

Hmm.. It's a little weird that we allow multiple groups to be passed when using the new consumer, but we expect a single group for the old consumer. If we're to stay consistent, do you think it would be restrictive in practice to only support deletion of a single group at a time?

Contributor Author

While I look at your other feedback (thanks, btw): regarding this one, it seems the old consumer also supports deleting multiple groups, i.e. ... --delete --group group1 --group group2 works and attempts to remove both groups.

I originally wanted to support single group deletion only, but after considering the existing behavior for old consumer decided otherwise.

Contributor

Ah, you are right. The name deleteForGroup is kind of misleading. Maybe it just needs to be pluralized. I wouldn't hate it if we came up with better names for all of these deleteForXXX APIs.

Contributor Author

Sure, I gave this a quick try. Let me know if you have better suggestions.

var result: Map[String, Errors] = Map()

groupIds.foreach { groupId =>
if (!groupMetadataCache.contains(groupId))
Contributor

This "check and act" is not safe since we're not holding a lock. It would be better to get the GroupMetadata object and check if it is null. If it is not null, then we need to grab the group lock before checking its state and attempting to delete its state.

Contributor Author

That's correct. It seems because of this lock we need to delete groups one by one then (as in the new commit)?

}

if (eligibleGroups.nonEmpty) {
cleanupGroupMetadata(None, eligibleGroups, Long.MaxValue)
Contributor

I don't think passing None works. Looking at cleanupGroupMetadata, that would just result in removal of the expired offsets.

Contributor Author

Correct, but since we pass Long.MaxValue as the current time, all offsets in the passed groups expire. Would that work?

def cleanupGroupMetadata(deletedTopicPartitions: Option[Seq[TopicPartition]]) {
val startMs = time.milliseconds()
def cleanupGroupMetadata(deletedTopicPartitions: Option[Seq[TopicPartition]],
groups: Iterable[GroupMetadata] = groupMetadataCache.values,
Contributor

It would be better not to have optional arguments. Let's make the caller provide the values explicitly.

groups.foreach { group =>
if (!authorize(request.session, Delete, new Resource(Group, group))) {
unauthorizedGroupsDeletionResult += (group -> Errors.GROUP_AUTHORIZATION_FAILED)
groups -= group
Contributor

Perhaps we can partition to split the incoming group list into the authorized and unauthorized groups.
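
A minimal sketch of the `partition` approach, assuming a hypothetical allow-list in place of the real `authorize(request.session, Delete, ...)` call and plain strings in place of `Errors` values. It splits the incoming groups in one pass instead of mutating `groups` while iterating over it.

```scala
object AuthorizationSplitSketch {
  // Hypothetical authorizer: a fixed allow-list stands in for the broker's ACL check.
  val allowed: Set[String] = Set("group1", "group3")
  def authorize(group: String): Boolean = allowed.contains(group)

  // Returns the authorized groups plus an error entry for every unauthorized one.
  def split(groups: Set[String]): (Set[String], Map[String, String]) = {
    val (authorized, unauthorized) = groups.partition(authorize)
    val errors = unauthorized.map(_ -> "GROUP_AUTHORIZATION_FAILED").toMap
    (authorized, errors)
  }
}
```

The authorized set can then be handed to the coordinator, and the two error maps concatenated for the response.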

Contributor Author

Thanks for the suggestion, makes a lot of sense.

}

@Test
def testDeleteEmptyGroup() {
Contributor

We should have a test case which tests removal when there are stored offsets.

Contributor Author

I added one in the new commit.

Contributor Author

@vahidhashemian vahidhashemian left a comment

@hachikuji thanks for the feedback. I tried to address them in the new commit.



else if (opts.options.has(opts.topicOpt))
deleteAllForTopic()

Map()
Contributor Author

Sounds good. I updated this in the new commit.

Contributor

@omkreddy omkreddy left a comment

LGTM

authorize(request.session, Delete, new Resource(Group, group))
}

val groupDeletionResult = groupCoordinator.handleDeleteGroups(authorizedGroups)._2 ++
Contributor

looks like we are ignoring groupCoordinator.handleDeleteGroups(authorizedGroups)._1 error here.
handleDeleteGroups(authorizedGroups) can return Errors.COORDINATOR_NOT_AVAILABLE.

Contributor Author

Thanks for catching this. I missed updating this with the recent change to the protocol. Will update it in the next commit.

Contributor

@hachikuji hachikuji left a comment

Left a few more comments. I think we still need a little work to make the deletion safe for edge cases around coordinator failover.

}.filter(_._1 != null)
}

val groupCoordinator = groups.map(group => (group -> coordinatorLookup(group)))
Contributor

Seems this is unused

errors += (group -> error)
case Left(coordinator) =>
groupsPerCoordinator.get(coordinator) match {
case Some(gList: List[String]) =>
Contributor

nit: I don't think you need the type.

val responseBody = send(coordinator, ApiKeys.DELETE_GROUPS, new DeleteGroupsRequest.Builder(groups.toSet.asJava))
val response = responseBody.asInstanceOf[DeleteGroupsResponse]
groups.foreach {
case group if (response.hasError(group)) => errors += (group -> response.errors.get(group))
Contributor

nit: unneeded parenthesis around response.hasError(group)

deleteAllForTopic()
deleteAllGroupsInfoForTopic()

Map()
Contributor

Seems like you were intending to use the results of the deleteGroupsInfo and such. We should probably have a test case (could be done in a follow-up).

Contributor Author

Sure, I'll submit a separate PR with proper test(s) after this is merged.

if (AdminUtils.deleteConsumerGroupInfoForTopicInZK(zkUtils, group, topic)) {
println(s"Deleted consumer group information for group '$group' topic '$topic' in zookeeper.")
else
(group -> Errors.NONE)
Contributor

nit: unneeded parenthesis. A few more like this.

}
}

val groupCoordinator = groups.map(group => (group -> coordinatorLookup(group)))
Contributor

unused?

groupIds.foreach { groupId =>
if (!validGroupId(groupId))
groupErrors += (groupId -> Errors.INVALID_GROUP_ID)
else if (!isCoordinatorForGroup(groupId)) {
Contributor

nit: looks weird that only this branch has braces

var groupErrors: Map[String, Errors] = Map()
var eligibleGroups: Seq[String] = Seq()

groupIds.foreach { groupId =>
Contributor

In fact, this is also a "check and act." It is possible for an eligible group to be unloaded between these checks and the call to deleteGroups.

I think we should follow a structure more similar to the other API handlers. I would suggest moving the state checking that we currently have in GroupMetadataManager.deleteGroups into the else case below. We should do the following:

  1. Check if there is no group or if the group is Dead. If so, it could mean that it has already been moved to another broker or it could mean that the group doesn't exist. I am not sure we have a bulletproof way to distinguish these cases, but maybe we could just check again if the coordinator is still correct?
  2. Check if the group is not empty. If so, return the GROUP_NOT_EMPTY error code.
  3. If the group is empty, we should transition to Dead. Once we do so, we are ensured that no other thread will attempt to use the GroupMetadata object. We can then collect this eligible group in a collection (as is currently done) and send it to cleanupGroupMetadata outside of the lock.
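
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Group`, `State`, the `groups` map, and `cleanup` are hypothetical simplifications of `GroupMetadata`, its states, the metadata cache, and `cleanupGroupMetadata`; results are plain strings named after the error codes used in this thread; and the sketch does not try to distinguish a migrated group from a nonexistent one.

```scala
import java.util.concurrent.locks.ReentrantLock
import scala.collection.mutable

object DeleteFlowSketch {
  sealed trait State
  case object Empty extends State
  case object Stable extends State
  case object Dead extends State

  // Hypothetical, simplified stand-in for GroupMetadata with its own lock.
  final class Group(val id: String, var state: State) {
    private val lock = new ReentrantLock()
    def inLock[T](body: => T): T = { lock.lock(); try body finally lock.unlock() }
  }

  // Single-threaded stand-ins for the metadata cache and the cleanup log.
  val groups = mutable.Map.empty[String, Group]
  val cleaned = mutable.ListBuffer.empty[String]

  // Stand-in for cleanupGroupMetadata; called after the group locks are released.
  def cleanup(eligible: Seq[Group]): Unit = eligible.foreach { g =>
    groups.remove(g.id)
    cleaned += g.id
  }

  def handleDeleteGroups(groupIds: Seq[String]): Map[String, String] = {
    val eligible = mutable.ListBuffer.empty[Group]
    val results = groupIds.map { id =>
      id -> (groups.get(id) match {
        case None => "GROUP_ID_NOT_FOUND"        // step 1: no group
        case Some(group) => group.inLock {
          group.state match {
            case Dead => "GROUP_ID_NOT_FOUND"    // step 1: already Dead
            case Empty =>
              group.state = Dead                 // step 3: transition, then collect
              eligible += group
              "NONE"
            case _ => "GROUP_NOT_EMPTY"          // step 2: still has members
          }
        }
      })
    }.toMap
    cleanup(eligible.toList)                     // outside of any group lock
    results
  }
}
```

Because a group is marked Dead before its lock is released, any other thread that later inspects it sees a terminal state, which is what makes the deferred cleanup safe in this simplified model.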

Contributor Author

I'm working on this and have a couple of questions for now:

  1. It seems all this can be done here and we could get rid of GroupMetadataManager.deleteGroups(). Do you see an issue with it?
  2. Could you please clarify what you mean by "check again if the coordinator is still correct" when group cannot be found or is Dead?

Thanks.

Contributor

@hachikuji hachikuji Jan 30, 2018

  1. Seems reasonable to me.

  2. We are trying to address the case in which a group gets deleted or migrated in between the time that we check if the coordinator is assigned and the time we delete the group metadata. We have the Dead state for this purpose, so whenever we check GroupMetadata, the first thing we should check is whether it is already Dead. If it is, then we know it was either already deleted or already migrated. My suggestion is to check again whether we are still the coordinator for the group to disambiguate the two cases.

Note that I am not sure that this is 100% bulletproof. For example, it may not handle the case when the coordinator is migrated away and then back very quickly. A spurious NOT_COORDINATOR error is not a big deal because clients are expected to handle it, but I am not too sure about the GROUP_ID_NOT_FOUND error. Maybe clients just have to treat it with the same skepticism that they treat the UNKNOWN_TOPIC_OR_PARTITION errors.

Contributor Author

Thanks for clarifying this. I'll try to do just that in the new patch. Will submit shortly.

Contributor Author

@vahidhashemian vahidhashemian left a comment

@hachikuji thanks for another review. Could you please clarify a couple of questions inline before I submit another patch? Thanks.

Contributor

@hachikuji hachikuji left a comment

Thanks for the updates. A few more comments.

else {
groupManager.getGroup(groupId) match {
case None =>
groupErrors += groupId -> Errors.GROUP_ID_NOT_FOUND
Contributor

This case should be handled the same as if the group is dead. You can probably add a little helper to avoid the duplication.

Contributor

@hachikuji hachikuji Jan 30, 2018

On second thought, this probably doesn't solve the underlying issue. I'm trying to think how we can be sure that we're returning this error code correctly. Maybe we need to check for group existence while holding the ownedPartitions lock in GroupMetadataManager.

Contributor

If ownedPartitions includes the corresponding topic partition for the group, and if the cached group either doesn't exist or is Dead, then I think it is safe to return GROUP_ID_NOT_FOUND. Maybe we can just add a method like the following to GroupMetadataManager:

// return true iff group is owned and the group doesn't exist
def groupNotExists(groupId: String) = inLock(partitionLock) {
  isGroupLocal(groupId) && (!groupMetadataCache.contains(groupId) || groupMetadataCache.get(groupId).is(Dead))
}

Then we can use this function here instead of checking the coordinator again. The name could probably be improved.

Contributor Author

For both case None and case Dead we already know the second part ((!groupMetadataCache.contains(groupId) || groupMetadataCache.get(groupId).is(Dead))) is true. So, it suffices to check isGroupLocal(groupId) (to avoid redundant checks). Is that correct? If so, we wouldn't need this helper (at least here).

Contributor

The point is to check it while holding the partition lock so that it is an atomic operation. This ensures that we will not have any race conditions with partition loading/unloading.

Contributor Author

Aah, right. That makes sense.

}

@Test
def testDeleteNonEmptyGroup() {
Contributor

Why remove these test cases?

Contributor Author

They made a call to GroupMetadataManager.deleteGroups(...) that we just deleted. Similar tests exist in GroupCoordinatorTest.

}

if (eligibleGroups.nonEmpty) {
groupManager.cleanupGroupMetadata(None, eligibleGroups, Long.MaxValue)
Contributor

This still feels a bit hacky. As an alternative, maybe we can let the offset selector be provided as a function. Something like this:

def cleanupGroupMetadata(
 groups: Iterable[GroupMetadata], 
 collectOffsetsToRemove: Group => Map[TopicPartition, OffsetAndMetadata])

What do you think?

Contributor Author

I'm not sure which part you consider hacky, and am trying to understand your suggestion.

For the sake of deleteGroups functionality, we can use group.allOffsets that conforms to the function signature above. But how about the existing functionality, where we want to delete specific topic partitions from a group: groupManager.cleanupGroupMetadata(Some(topicPartitions), groupManager.currentGroups, time.milliseconds()) and populate the corresponding OffsetAndMetadata values? I'm assuming we want to reuse the same cleanupGroupMetadata method for both cases.

On the same assumption, we also need to factor in the concept of current time so we can determine the expired offsets for the existing functionality.

On the other hand, if you are proposing to create a new cleanupGroupMetadata method that calls on the existing method, we should make this call once per group (since topic partitions are group-specific).

Or maybe I'm missing the point :)

Contributor

It's not that big of a deal. I just thought it was a mild abuse to reuse the expiration logic to delete all offsets. Alternatively, what I was suggesting is to let the caller choose the offsets to delete.
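
To make the alternative concrete, here is a minimal sketch of a cleanup routine that takes the offset selector as a function. Everything in it is an assumption for illustration: the case classes are simplified stand-ins for `TopicPartition`, `OffsetAndMetadata`, and `GroupMetadata`, the expiry rule (`expireTimestamp < nowMs`) is assumed, and `cleanup` only computes the surviving offsets rather than writing tombstones.

```scala
object OffsetSelectorSketch {
  // Hypothetical, simplified stand-ins for the Kafka types of the same names.
  final case class TopicPartition(topic: String, partition: Int)
  final case class OffsetAndMetadata(offset: Long, expireTimestamp: Long)
  final case class Group(id: String, offsets: Map[TopicPartition, OffsetAndMetadata])

  type Selector = Group => Map[TopicPartition, OffsetAndMetadata]

  // Expiration path: select only offsets that expired before `nowMs` (assumed rule).
  def expiredAt(nowMs: Long): Selector =
    group => group.offsets.filter { case (_, o) => o.expireTimestamp < nowMs }

  // Deletion path: select everything, with no reliance on a fake "current time".
  val allOffsets: Selector = _.offsets

  // Returns, per group, the offsets left after removing whatever the caller selected.
  def cleanup(
      groups: Iterable[Group],
      select: Selector): Map[String, Map[TopicPartition, OffsetAndMetadata]] =
    groups.map(g => g.id -> (g.offsets -- select(g).keys)).toMap
}
```

With this shape, group deletion passes `allOffsets` while the periodic expiration task passes `expiredAt(time)`, so neither caller needs `Long.MaxValue` as a sentinel.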

@hachikuji
Contributor

@vahidhashemian If you can update the patch this morning, we may still be able to get it into this release. The main thing from my perspective is ensuring that the GROUP_ID_NOT_FOUND error code is returned correctly as discussed above.

@vahidhashemian
Contributor Author

@hachikuji I just updated the patch, without the improvement on cleanupGroupMetadata. I can work on it in the meantime, and submit a patch separately (perhaps under a different PR).


// return true iff group is owned and the group doesn't exist
def groupNotExists(groupId: String) = inLock(partitionLock) {
isGroupLocal(groupId) && (!groupMetadataCache.contains(groupId) || groupMetadataCache.get(groupId).is(Dead))
Contributor

Should have mentioned before, but we do need to grab the group lock to check the state.

Contributor Author

Correct, thanks for catching. Hopefully the new commit works.

// return true iff group is owned and the group doesn't exist
def groupNotExists(groupId: String) = inLock(partitionLock) {
isGroupLocal(groupId) && (!groupMetadataCache.contains(groupId) || groupMetadataCache.get(groupId).is(Dead))
isGroupLocal(groupId) && (!groupMetadataCache.contains(groupId) || {
Contributor

Can you write a short test case to make sure this function works correctly? Also, I think this is a bit more concise:

    isGroupLocal(groupId) && getGroup(groupId).forall { group =>
      group.inLock(group.is(Dead))
    }
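
The concise form relies on `Option.forall` returning `true` for `None`, so a missing group and a Dead group are handled by the same expression. A small lock-free sketch of that logic, assuming hypothetical stand-ins (`owned` for `isGroupLocal`, an immutable `cache` for the group metadata cache, and a simplified `Group`); the partition and group locks from the real method are deliberately left out:

```scala
object GroupNotExistsSketch {
  sealed trait State
  case object Stable extends State
  case object Dead extends State

  // Hypothetical, simplified stand-in for GroupMetadata.
  final case class Group(id: String, state: State)

  // True iff the group is owned locally and is either absent from the cache or Dead.
  def groupNotExists(groupId: String, owned: Set[String], cache: Map[String, Group]): Boolean =
    owned.contains(groupId) && cache.get(groupId).forall(_.state == Dead)
}
```

The four cases worth exercising are: not owned, owned but absent, owned and live, and owned and Dead.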

Contributor Author

Thanks for the code improvement suggestion. I added a basic unit test in the new commit.

// group is not owned
assertFalse(groupMetadataManager.groupNotExists(groupId))

groupMetadataManager.addPartitionOwnership(groupPartitionId)
Contributor

Following this and prior to adding the group, we should see groupNotExists return true?

Contributor Author

Yes, I'll add that. Thanks!

Contributor

@hachikuji hachikuji left a comment

LGTM. Thanks for the patch!

@hachikuji
Contributor

The test failures appear unrelated. Merging to trunk.

@hachikuji hachikuji merged commit 1ed6da7 into apache:trunk Jan 31, 2018
@vahidhashemian
Contributor Author

Great, and thanks for the quick reviews!

@asfgit

asfgit commented Jan 31, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-test-coverage/203/
