Skip to content

KAFKA-10199: Implement adding active tasks to the state updater#12128

Merged
guozhangwang merged 1 commit intoapache:trunkfrom
cadonna:impl-adding_active_tasks_state_updater
May 5, 2022
Merged

KAFKA-10199: Implement adding active tasks to the state updater#12128
guozhangwang merged 1 commit intoapache:trunkfrom
cadonna:impl-adding_active_tasks_state_updater

Conversation

@cadonna
Copy link
Copy Markdown
Member

@cadonna cadonna commented May 5, 2022

This PR adds the default implementation of the state updater.
The implementation only implements adding active tasks to the
state updater.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

This PR adds the default implementation of the state updater.
The implementation only implements adding active tasks to the
state updater.
@cadonna cadonna requested a review from guozhangwang May 5, 2022 19:00
Copy link
Copy Markdown
Contributor

@guozhangwang guozhangwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cadonna ! I made a pass on that, and I will merge the PR as-is once we get green builds.

Please feel free to reply those comments later on and if some makes sense, we can have incorporated in the next PR.

*/
Set<TopicPartition> completedChangelogs();

boolean allChangelogsCompleted();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: also add a javadoc explanation, especially emphasize that standby task's changelogs would never be completed, and hence this function would only return true if there's no standby tasks.


public class DefaultStateUpdater implements StateUpdater {

private final static String BUG_ERROR_MESSAGE = "This indicates a bug. " +
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can move this static field to the StateUpdater interface?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not do that since it is actually an implementation detail. I would rather move it to a global location where it is accessible from multiple classes since it is a quite general message. I am not aware of a class for global constants in Streams.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.


private final ChangelogReader changelogReader;
private final AtomicBoolean isRunning = new AtomicBoolean(true);
private final java.util.function.Consumer<Set<TopicPartition>> offsetResetter;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why not also import java.util.function.Consumer?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw that we did it in other places like #10000, but I cannot remember why..

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it is not needed and I removed it. In other places, I think there might be a name clash between java.util.function.Consumer and org.apache.kafka.clients.consumer.Consumer.


@Override
public void add(final Task task) {
if (stateUpdaterThread == null) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in the end state, we may want to have more than the state-updater as the prefix since we would have multiple such threads. But this can be added later.

}
}

private void performActionsOnTasks() throws InterruptedException {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming throws InterruptedException is for other actions for the future PRs, while ADD does not throw any yet.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same assumption elsewhere.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those were indeed not needed.

}
}

private boolean isStateless(final Task task) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a meta thought: I think in the end state, we should let the stream thread to make the call and skip sending them to the tasksAndActions queue, than letting such stateless task ping-pong between the two sides, wdyt?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that is an option. Let's see how we feel when we integrate the state updater with the rest of the code.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool.

return task.changelogPartitions().isEmpty() && task.isActive();
}

private void endRestorationIfChangelogsCompletelyRead(final Task task,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rename to CompleteTaskRestorationIfPossible?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed it to maybeCompleteRestoration(). It is a bit shorter 🙂.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

restoredActiveTasksLock.lock();
try {
while (restoredActiveTasks.isEmpty() && now <= deadline) {
final boolean elapsed = restoredActiveTasksCondition.await(deadline - now, TimeUnit.MILLISECONDS);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems elapsed not used.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, but IIRC I got a spotbug error if I did not use the return value.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I think we have lots of places here return values are not used, don't know why spotbug is sensitive about this one (maybe we just disabled this rule in other classes :P ) thanks for letting me know.

final boolean elapsed = restoredActiveTasksCondition.await(deadline - now, TimeUnit.MILLISECONDS);
now = time.milliseconds();
}
while (!restoredActiveTasks.isEmpty()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just result.addAll and then restoredActiveTasks.clear() with just a if condition?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! We even do not need the if condition.

stateUpdaterThread.isRunning.set(false);
stateUpdaterThread.interrupt();
try {
stateUpdaterThread.join(timeout.toMillis());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally we use a latch to check if the thread terminates successfully in time since the join function returns void and we cannot check if it did die within the timeout. If it did not terminate within the timeout, we could consider throw an exception for it.

@guozhangwang guozhangwang merged commit ced5989 into apache:trunk May 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants