Skip to content

KAFKA-13763 (1): Improve unit testing coverage and flexibility for IncrementalCooperativeAssignor#11974

Merged
mimaison merged 17 commits intoapache:trunkfrom
C0urante:kafka-13763
Apr 5, 2022
Merged

KAFKA-13763 (1): Improve unit testing coverage and flexibility for IncrementalCooperativeAssignor#11974
mimaison merged 17 commits intoapache:trunkfrom
C0urante:kafka-13763

Conversation

@C0urante
Copy link
Copy Markdown
Contributor

Jira

This is strictly a testing refactor. No functional changes are made; this can be easily verified by confirming that the only affected file is the IncrementalCooperativeAssignorTest.java test suite.

These changes were initially discussed during review of #10367, which partially focused on improving readability in the unit tests for incremental rebalancing in Connect.

The goals here include:

  1. Simplify the logic that has to be manually specified on a per-test-case basis for simulating a rebalance (accomplished by extracting common logic into reusable utility methods such as performRebalance and addNewEmptyWorkers)
  2. Reduce the cognitive burden for following testing logic by removing unnecessary fields (like expectedMemberConfigs and assignments) and assertions (like the redundant checks for leader and leader URL)
  3. Add powerful, granular, and reusable utility methods that can provide stronger guarantees about the state of a cluster across successive rebalances without forcing people to track this state in their heads (accomplished by replacing the existing assertAssignments method with assertWorkers, assertEmptyAssignment, assertConnectorAllocations, and assertTaskAllocations, and by adding the new assertBalancedAllocation and assertCompleteAllocation methods)
  4. Refactor common logic for testing utilities to be more concise and reduce the number of Java 8 streams statements that have to be understood in order to read through a test case
  5. Fix a bug in the assertion logic for checking for duplicates currently present here and here (List::removeAll removes all occurrences of any element contained in the collection passed to the method)
  6. Remove an incorrect assertion in the assertNoReassignments (now renamed to assertNoRedundantAssignments) method that there should be no duplicated connectors or tasks in the assignments reported by each worker to the leader during rebalance (this is unnecessary and even contradicts logic used for testing in cases like this where we intentionally simulate a worker with a duplicated set of connectors and/or tasks rejoining a cluster; the only reason this bug wasn't surfaced sooner is because the bug mentioned in the prior point covers it)

Once merged, this should allow for cleaner, faster test writing when adding new cases for incremental rebalancing, such as with #10367.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

C0urante added 14 commits March 23, 2022 18:00
…emove unnecessary expectedMemberConfigs field
… logic for examining aggregated assignment data
…ertConnectorAllocations, and assertTaskAllocations methods
@C0urante
Copy link
Copy Markdown
Contributor Author

There's still room for improvement with the incremental rebalancing testing logic, but many of the remaining changes would involve modifications to the IncrementalCooperativeAssignor class as well. In order to reduce PR size and keep things focused, I've decided to leave that to a follow-up pull request, but since it would accomplish basically the same goal, I'll attach it to KAFKA-13763 as well.

@C0urante
Copy link
Copy Markdown
Contributor Author

C0urante commented Apr 1, 2022

Filed #11983 as a more aggressive follow-up that touches on the IncrementalCooperativeAssignor class.

Copy link
Copy Markdown
Member

@mimaison mimaison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @C0urante for the PR, this is a nice cleanup!

I've not looked at everything in detail yet. I'll make another pass tomorrow. I've left a few small suggestions.

import static org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeConnectProtocol.CONNECT_PROTOCOL_V2;
import static org.apache.kafka.connect.runtime.distributed.WorkerCoordinator.WorkerLoad;
import static org.hamcrest.CoreMatchers.hasItems;
import static org.hamcrest.CoreMatchers.is;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we remove hamcrest now and only use junit?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, done 👍

import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertNull;
import static org.junit.Assert.assertTrue;
import static org.junit.runners.Parameterized.Parameter;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to changes in this PR. I see we are defining a test parameter with mode() but the class is not annotated with @RunWith(value = Parameterized.class) so protocolVersion is never initialized and always set to 0.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch! I've corrected this in the test, although it's worth noting that in #11983 I'm proposing that we remove the protocolVersion parameter entirely since it's not accomplishing a whole lot here.

private Map<String, ExtendedWorkerState> memberConfigs(String givenLeader,
long givenOffset,
int start,
int connectorNum) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be workerNum instead of connectorNum?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good catch 👍

return IntStream.range(start, connectorNum + 1)
.mapToObj(i -> new SimpleEntry<>("connector" + i, state))
.collect(Collectors.toMap(SimpleEntry::getKey, SimpleEntry::getValue));
return fillMap(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if building this map explicitly may be more readable. What about something like:

private Map<String, ExtendedWorkerState> memberConfigs(String givenLeader, long givenOffset, String ... workers)

Then you call it with memberConfigs(leader, offset, W0, W1). It's slightly less flexible, but since all tests only use a few workers I think this could make the code a bit simpler.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that it's more readable, and in some ways it's actually more flexible since you can now specify an arbitrary set of worker names instead of having them generated for you. Done 👍

@Parameters
public static Iterable<?> mode() {
return Arrays.asList(new Object[][] {{CONNECT_PROTOCOL_V1, CONNECT_PROTOCOL_V2}});
return Arrays.asList(new Object[][] {{CONNECT_PROTOCOL_V1}, {CONNECT_PROTOCOL_V2}});
Copy link
Copy Markdown
Member

@mimaison mimaison Apr 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simplified into return Arrays.asList(CONNECT_PROTOCOL_V1, CONNECT_PROTOCOL_V2); but since you're removing it in #11983, we can ignore it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blegh, thanks. I'll simplify it anyways; rather leave trunk in a healthy place and not risk FUD from someone else copy+pasting this style.

Copy link
Copy Markdown
Member

@mimaison mimaison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @C0urante, the PR looks good. I left a few more very minor suggestions.

private void expectGeneration(boolean expectMismatch) {
when(coordinator.generationId())
.thenReturn(assignor.previousGenerationId + 1)
.thenReturn(assignor.previousGenerationId + 1);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we return the same value, we can remove one of these thenReturn()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good call. I figured the existing style was necessary since WorkerCoordinator::generationId would be invoked twice for every call to expectGeneration but it looks like Mockito handles this just fine with a single thenReturn.

memberConfigs);

assertThat("Wrong set of workers for reassignments",
assertEquals("Wrong set of workers for reassignments",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about simplifying these assertions into:

assertTrue("Wrong set of workers for reassignments", assignor.candidateWorkersForReassignment.isEmpty());

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually prefer assertEquals when possible since it shows the difference between actual and expected, which in this case would mean the complete set of incorrect candidate workers for reassignment.

@C0urante
Copy link
Copy Markdown
Contributor Author

C0urante commented Apr 5, 2022

Thanks @mimaison. I hope you don't mind but I've pushed one more change that simplifies how the configured set of connectors and tasks is simulated (this came in handy when designing tests for one of the scenarios described in KAFKA-13764).

It should improve readability by removing the existing clusterConfigState methods (and the fillMap method that was introduced in this PR to clean those up) and making changes to the set of configured connectors and tasks more explicit. However, if this is too much then let me know and I can revert the commit from this PR and add it in a follow-up.

Copy link
Copy Markdown
Member

@mimaison mimaison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM

@mimaison mimaison merged commit 88e5f13 into apache:trunk Apr 5, 2022
@C0urante C0urante deleted the kafka-13763 branch April 5, 2022 19:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants