Skip to content

KAFKA-6718 / Rack aware standby task assignor#10851

Merged
showuon merged 59 commits intoapache:trunkfrom
lkokhreidze:KAFKA-6718-standby-task-assignment
Mar 3, 2022
Merged

KAFKA-6718 / Rack aware standby task assignor#10851
showuon merged 59 commits intoapache:trunkfrom
lkokhreidze:KAFKA-6718-standby-task-assignment

Conversation

@lkokhreidze
Copy link
Copy Markdown
Contributor

@lkokhreidze lkokhreidze commented Jun 9, 2021

This PR is part of KIP-708 and adds rack aware standby task assignment logic.

Rack aware standby task assignment won't be functional until all parts of this KIP gets merged.

Splitting PRs into three smaller PRs to make the review process easier to follow. Overall plan is the following:

👉 Rack aware standby task assignment logic.
⏭️ Protocol change, add clientTags to SubscriptionInfoData #10802
⏭️ Add required configurations to StreamsConfig (public API change, at this point we should have full functionality) #11837

This PR implements first point of the above mentioned plan.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@lkokhreidze
Copy link
Copy Markdown
Contributor Author

Call for review @cadonna @vvcephei @ableegoldman

@lkokhreidze
Copy link
Copy Markdown
Contributor Author

gentle nudge @cadonna @vvcephei @ableegoldman

@mjsax mjsax added the streams label Jul 13, 2021
@cadonna
Copy link
Copy Markdown
Member

cadonna commented Jul 13, 2021

@lkokhreidze Sorry for the long silence! Some other tasks got in my way. I plan to review this PR tomorrow.

Copy link
Copy Markdown
Member

@cadonna cadonna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lkokhreidze Thank you for the PR!

I did a first pass. Here is my feedback.

lkokhreidze and others added 3 commits July 14, 2021 16:25
@lkokhreidze
Copy link
Copy Markdown
Contributor Author

Thanks @cadonna for the feedback.
I've replied/addressed all of your comments.

Copy link
Copy Markdown
Member

@cadonna cadonna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lkokhreidze Thank you for the updates!

Here my next round of feedback!

…ignature change of HighAvailabilityTaskAssignor#assignActiveStatefulTasks
@lkokhreidze
Copy link
Copy Markdown
Contributor Author

Thanks @cadonna for the valuable feedback 🙇
I've replied/addressed your comments.

@cadonna
Copy link
Copy Markdown
Member

cadonna commented Jul 23, 2021

@lkokhreidze FYI: I will be offline the next two weeks. I am sorry that I haven't had time to review this PR this week.

Copy link
Copy Markdown
Member

@cadonna cadonna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @lkokhreidze !

Thank you for your patience!

Here my feedback.

@lkokhreidze
Copy link
Copy Markdown
Contributor Author

Hi @cadonna !
Thanks for the feedback. I will address your comments this week.

@lkokhreidze
Copy link
Copy Markdown
Contributor Author

Thanks @showuon for the valuable feedback.
I've addressed your comments, please have a look when you got time.

Copy link
Copy Markdown
Member

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lkokhreidze , thanks for the update. Overall LGTM! Only one question for partial rack awareness assignment. Thanks.

Comment on lines +266 to +269
// If we have used more clients than all the tag's unique values,
// we can't filter out clients located on that tag, because it'll exclude all the clients.
// Instead, we need to discard all the clients that were marked as "used" on that tag key values.
// Please check ClientTagAwareStandbyTaskAssignorTest#shouldDoThePartialRackAwareness test for more info.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is quite challenging to understand. After reading it a couple of times I understood that if we've used a number of clients that is equal to or greater than the number of unique values of the tag, we cannot guarantee that each standby is on a different tag value than the active and other standbys. So the rack-awareness becomes partial. Is that correct?
Could you reformulate it, so that it states that the rack-awareness guarantee does not hold anymore.
And why "more clients than all tag's unique values"? When the number of used clients is equal to the unique tag values, we are already in the partial rack-awareness situation, right?
Maybe you should give here an example as in the mentioned test. I find referencing the test is a bit cumbersome, because if the test gets renamed this comment becomes useless.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Bruno, I'll add the example to the comment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed the changes. I added detailed explanation with an example. Also tests have the similar example. Hopefully this change makes logic more clear.

final int numStandbyReplicas = configs.numStandbyReplicas;
final Set<String> rackAwareAssignmentTags = new HashSet<>(configs.rackAwareAssignmentTags);

final Map<TaskId, Integer> tasksToRemainingStandbys = computeTasksToRemainingStandbys(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could not find where you decrement the number of remaining standbys. If you get a value from this map and put it into an int variable, you do not have a reference to the Integer value in the map anymore. This might become a problem in StandbyTaskAssignmentUtils#pollClientAndMaybeAssignRemainingStandbyTasks().

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! And maybe we should add a test to address this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a test is absolutely necessary!

Copy link
Copy Markdown
Contributor Author

@lkokhreidze lkokhreidze Feb 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated tests for the StandbyTaskAssignmentUtils#pollClientAndMaybeAssignAndUpdateRemainingStandbyTasks and also added separate test for the ClientTagAwareStandbyTaskAssignor.

For the ClientTagAwareStandbyTaskAssignor I decided to make few things package private to be able to test this logic. As otherwise, there was no easy way to test if tasksToRemainingStandbys was getting updated properly. Hope this is okay.

…nals/assignment/ClientTagAwareStandbyTaskAssignor.java

Co-authored-by: Bruno Cadonna <cadonna@apache.org>
Copy link
Copy Markdown
Member

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update in partial rack awareness assignment. LGTM! Left some minor comments. Thanks.

final int numStandbyReplicas = configs.numStandbyReplicas;
final Set<String> rackAwareAssignmentTags = new HashSet<>(configs.rackAwareAssignmentTags);

final Map<TaskId, Integer> tasksToRemainingStandbys = computeTasksToRemainingStandbys(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! And maybe we should add a test to address this.

@lkokhreidze
Copy link
Copy Markdown
Contributor Author

Hi @cadonna @showuon
I've addressed your comments. Please have a look.
Thank you!

Copy link
Copy Markdown
Member

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. I'll check the test code tomorrow.

Copy link
Copy Markdown
Member

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for the improvement!

@lkokhreidze
Copy link
Copy Markdown
Contributor Author

Hi @cadonna
Mind having another look? Hoping to finalise this PR as soon as possible :)
Thanks

@cadonna
Copy link
Copy Markdown
Member

cadonna commented Mar 1, 2022

@lkokhreidze I will try to have a look this week. I would also like to get this PR merged as soon as possible. Since @showuon has already approved it. If I will not manage to have a look, @showuon can merge it.

@lkokhreidze
Copy link
Copy Markdown
Contributor Author

Thanks @cadonna, appreciate it.
@showuon please also note that next PR, the protocol change, is also ready to be reviewed. #10802

Whenever it's possible, please have a look at that too.

Copy link
Copy Markdown
Member

@cadonna cadonna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lkokhreidze Thanks a lot for this PR!

Sorry for all the delay!

LGTM! I had just some minor comments that can be addressed in a follow-up PR.

As I pointed out in one of my comments, I am not a big fan of // Visible for testing because that often indicates some deficiencies in the modularization. But this is nothing that cannot be addressed in a follow-up PR. So I am fine with merging it.

}

if (!tasksToRemainingStandbys.isEmpty()) {
log.debug("Rack aware standby task assignment was not able to assign all standby tasks. " +
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think info log would also be OK here. I imagine users that are wondering why their standbys are not distributed as they would expect. With this information they can at least try to fix it on the config level. This log message should only happen at rebalance time, which should usually be rather seldom.
If we decide to put the log message on info level, you should also change a bit the wording and not use variable names in it. Maybe some hints what the users can do to fix this would also be nice.

Is it possible to separate the concerns of this log message and the one on line 135? Something along the lines of here the rack-aware standby assignment did not work due the tag config and on line 135 the assignment did not work due to too low number of instances. We can then put both on warn or info (do not forget to also check the related log message in the default standby assignor).

Comment on lines +145 to +153
// Visible for testing
static StandbyTaskAssignor createStandbyTaskAssignor(final AssignmentConfigs configs) {
if (!configs.rackAwareAssignmentTags.isEmpty()) {
return new ClientTagAwareStandbyTaskAssignor();
} else {
return new DefaultStandbyTaskAssignor();
}
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be done in a follow-up PR:
I am not a big fan of // Visible for testing because it often means that we missed to extract code to separate classes. Here I would definitely extract this code to a factory class.

@cadonna
Copy link
Copy Markdown
Member

cadonna commented Mar 2, 2022

@lkokhreidze Let me know if you want to address my minor comments in this PR. After that @showuon or I can merge this PR.

Nice work!

lkokhreidze and others added 2 commits March 2, 2022 15:27
…nals/assignment/ClientTagAwareStandbyTaskAssignor.java

Co-authored-by: Bruno Cadonna <cadonna@apache.org>
…nals/assignment/ClientTagAwareStandbyTaskAssignor.java

Co-authored-by: Bruno Cadonna <cadonna@apache.org>
@lkokhreidze
Copy link
Copy Markdown
Contributor Author

Hi @cadonna , thanks for the review.
I agree with what you said and made a note to myself to address your comments in the follow-up PRs. So if it's okay, I think we can merge this one.

Thank you!

@showuon
Copy link
Copy Markdown
Member

showuon commented Mar 3, 2022

Failed tests are unrelated and also failed in trunk build.

Build / ARM / org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveOneNamedTopologyWhileAnotherContinuesProcessing
Build / JDK 11 and Scala 2.13 / kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize()
Build / JDK 11 and Scala 2.13 / kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize()
Build / JDK 11 and Scala 2.13 / org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveOneNamedTopologyWhileAnotherContinuesProcessing
Build / JDK 17 and Scala 2.13 / kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize()
Build / JDK 17 and Scala 2.13 / kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize()
Build / JDK 17 and Scala 2.13 / org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveOneNamedTopologyWhileAnotherContinuesProcessing
Build / JDK 8 and Scala 2.12 / org.apache.kafka.connect.integration.RebalanceSourceConnectorsIntegrationTest.testDeleteConnector
Build / JDK 8 and Scala 2.12 / org.apache.kafka.connect.integration.RebalanceSourceConnectorsIntegrationTest.testDeleteConnector
Build / JDK 8 and Scala 2.12 / org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveOneNamedTopologyWhileAnotherContinuesProcessing

@cadonna
Copy link
Copy Markdown
Member

cadonna commented Mar 3, 2022

🥳

cadonna pushed a commit that referenced this pull request Mar 16, 2022
This PR is part of KIP-708 and adds rack aware standby task assignment logic.

Rack aware standby task assignment won't be functional until all parts of this KIP gets merged.

Splitting PRs into three smaller PRs to make the review process easier to follow. Overall plan is the following:

⏭️ Rack aware standby task assignment logic #10851
⏭️ Protocol change, add clientTags to SubscriptionInfoData #10802
👉 Add required configurations to StreamsConfig (public API change, at this point we should have full functionality)

This PR implements last point of the above mentioned plan.

Reviewers: Luke Chen <showuon@gmail.com>, Bruno Cadonna <cadonna@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants