KAFKA-9441: Unify committing within TaskManager#8218

Merged
mjsax merged 6 commits into apache:trunk from mjsax:kafka-9441-kip-447-3-unify-commit-in-TM
Mar 19, 2020

Conversation

@mjsax
Member

@mjsax mjsax commented Mar 4, 2020

With KIP-447 we need to commit all tasks at once when using eos-beta, as all tasks share the same producer. Right now, each task is committed individually (by making multiple calls to consumer.commitSync, or individual calls to producer.commitTransaction for the corresponding task producer).

To allow for a unified commit logic, we move the commit logic into TaskManager. For non-eos, we collect all offsets to be committed and do a single call to consumer.commitSync -- for eos-alpha we still commit each transaction individually (but now triggered by the TaskManager, to have all code in one place).

To allow for a unified commit logic, we need to split existing methods on the Task interface into pre/post parts to allow committing in between, in particular:

  • commit() -> prepareCommit() and postCommit()
  • suspend() -> prepareSuspend() and suspend() (we keep the name as it still does suspend the task)
  • closeClean() -> prepareCloseClean() and closeClean(Map<TopicPartition, Long> checkpoint) (we keep the name as it still does close the task)
  • closeDirty() -> prepareCloseDirty() and closeDirty() (we keep the name as it still does close the task)

prepareCloseClean() returns checkpoint information to allow checkpointing after the TaskManager did the commit within closeClean().
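The described split can be illustrated with a minimal sketch. All names here (Task, StubTask, TaskManager, the string-keyed offset maps) are toy stand-ins for illustration, not the actual Kafka Streams classes or signatures:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: collect offsets from every task via prepareCommit(),
// commit once, then let each task run postCommit() (e.g. checkpointing).
interface Task {
    Map<String, Long> prepareCommit(); // offsets this task needs committed
    void postCommit();                 // e.g. write the checkpoint afterwards
}

class StubTask implements Task {
    final Map<String, Long> offsets;
    boolean postCommitted = false;

    StubTask(String partition, long offset) {
        this.offsets = Map.of(partition, offset);
    }

    public Map<String, Long> prepareCommit() { return offsets; }
    public void postCommit() { postCommitted = true; }
}

class TaskManager {
    final Map<String, Long> committed = new HashMap<>();

    // Unified commit: one commit call for all tasks instead of one per task.
    void commitAll(Iterable<? extends Task> tasks) {
        Map<String, Long> offsets = new HashMap<>();
        for (Task task : tasks) {
            offsets.putAll(task.prepareCommit());
        }
        if (!offsets.isEmpty()) {
            committed.putAll(offsets); // stand-in for consumer.commitSync(offsets)
        }
        for (Task task : tasks) {
            task.postCommit();
        }
    }
}
```

The point of the sketch is only the ordering: every task's prepare step runs before the single commit, and every post step runs after it.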

In a follow-up PR, we will introduce eos-beta and commit a single transaction over all tasks using the shared producer.

Call for review @guozhangwang @abbccdda

@mjsax mjsax added the streams label Mar 4, 2020
Member Author

This is not really relevant for this PR, but we need to add it for KIP-447 eventually, thus I just include it in this PR.

Member Author

We moved this to TaskManager

Contributor

Please see my other comment above -- I think it is cleaner to just call foreach(active-task) task.recordCollector.commit inside the task-manager, and inside RecordCollectorImpl check that eosEnabled is always true, otherwise throw an illegal-state exception.

In the next PR where we have the thread-producer, we could then create a single recordCollector object that is shared among all active tasks and wraps the thread-producer; the caller taskManager code can then just get one active task and call its record-collector's commit function, knowing that is sufficient to commit for all tasks since everyone uses the same record-collector.

WDYT?

Contributor

After syncing offline about this, I think I'm convinced now that moving this logic into TaskManager is better.

Member Author

On suspend() and prepareCommit() we don't commit yet, but return the offsets that need to be committed

Member Author

We don't commit and thus don't throw any longer

Member Author

Frankly, not sure if this is correct any longer. What do we want to record with this sensor exactly? Flushing can be expensive and we might want to record it as part of committing -- but I am not sure.

Member Author

I am not happy with this rewrite (but as I know that John did some changes in this class in another PR, I just did this hack here for now -- needs some cleanup after a rebase).

Member Author

We could also do a second loop over all tasks, after calling commit(..) below -- not sure if this is ok as-is?


I would prefer a second loop to guarantee a consistent reflection on the task committed state.

Member Author

Moved both tests to TaskManagerTest

Member Author

Moved all 4 tests to TaskManagerTest


Checkmark for proving the 6 tests are all migrated.

@abbccdda

abbccdda commented Mar 4, 2020

Could we have a summary for this PR?

@mjsax
Member Author

mjsax commented Mar 4, 2020

The summary is not longer than the PR title. We make TaskManager the class that is responsible for committing (to allow committing all tasks at once, instead of each task individually).

@abbccdda

abbccdda commented Mar 4, 2020

The title sounds pretty vague to me. The description could at least include what the committing behavior looks like under TaskManager, what the motivation is, etc., as we are already overloading JIRA 9441. In general we could try to be more specific about the changes in the PR description, as I can see you are also adding upgrade flags inside this PR. Major side effects are better documented up front, IMHO.

Member Author

We need to allow committing in SUSPENDED state now, as we first suspend all tasks and then commit. Cf. TaskManager#handleRevocation()

Member Author

Minor improvement: we include writing the checkpoint, and the caller can indicate if it should be written or not.

Member Author

This issue was introduced in the PR that introduced StreamsProducer -- we forgot to close them. Fixing this on the side.


Sounds good, just mark that depending on John's fix, we probably don't need to handle this.

Member Author

@mjsax mjsax Mar 4, 2020

We call closeClean below -- just fixing the comment here for now (\cc @guozhangwang)

Note that we don't commit offsets for this case any longer -- previously, committing offsets "might" have been done with closeClean() (even if I believe that the task would be marked as "commitNeeded == false"). We don't let the TaskManager commit offsets here, as it should not be required.

Member Author

Similar to above: this issue was introduced in the StreamsProducer PR. We need to close the producer when we remove it.


Probably need to change after rebase

Member Author

as above

Member Author

Not sure why we use an iterator here. Simplifying the code with a for-loop

Member Author

We need to commit explicitly in TTD now to mimic the TaskManager. Hence, we need access to the consumer and streamsProducer.


Makes sense to me.

Contributor

@guozhangwang guozhangwang left a comment

I'm still in the middle of reviewing this PR, just left a meta question here: previously the commit ordering is

  1. flush store, flush producer
  2. commit offset
  3. write checkpoint file

Now it becomes 1/3/2. It means if we have a crash between 2 and 3, in the past we would likely read from an old checkpointed offset and replay the changelogs, which is fine with or without EOS (with EOS we would just read committed data still so it is safe).

But now if we crash between 3 and 2, it means the checkpoint file has been written, but the offsets are not committed yet, which means we could potentially lose some data. Right?
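The crash-safety concern above can be made concrete with a toy model. Everything here is illustrative (the class, fields, and "crash" flag are invented for the sketch); a crash is simulated by returning before the remaining steps run:

```java
// Toy model of the two commit orderings discussed above.
class CommitOrdering {
    long committedOffset = 0;    // what the consumer group has committed
    long checkpointedOffset = 0; // what the local checkpoint file records

    // Old ordering: flush (1), commit offsets (2), write checkpoint (3).
    // A crash before step 3 leaves a stale checkpoint, so we replay more
    // of the changelog than strictly needed -- wasteful but safe.
    void commitThenCheckpoint(long offset, boolean crashBeforeCheckpoint) {
        committedOffset = offset;
        if (crashBeforeCheckpoint) return;
        checkpointedOffset = offset;
    }

    // New ordering: flush (1), write checkpoint (3), commit offsets (2).
    // A crash before the commit leaves the checkpoint ahead of the
    // committed offsets -- the potential gap raised in the comment above.
    void checkpointThenCommit(long offset, boolean crashBeforeCommit) {
        checkpointedOffset = offset;
        if (crashBeforeCommit) return;
        committedOffset = offset;
    }
}
```

Under the old ordering a crash can only leave the checkpoint behind the committed offsets; under the new ordering it can leave the checkpoint ahead of them.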

@abbccdda abbccdda left a comment

Thanks for the PR, left some comments


Did you already update the KIP for the new config?


What's the benefit of building this as a static helper?

Member Author

We will need this later (i.e., in the follow-up PR) and it reduces code duplication.


Why do we start to suppress warnings?

Member Author

We should have done this from the beginning on... (it's just a "side fix")



Add a comment describing the new return statement.


What's the reasoning here for only wrapping the consumer offset commit case, not the EOS case?

Member Author

You mean exception handling? For the producer all exception handling is done within StreamsProducer (note that threadProducer above is a StreamsProducer, not a KafkaProducer)


👍


Always feels better for one less parameter :)




@mjsax mjsax force-pushed the kafka-9441-kip-447-3-unify-commit-in-TM branch 2 times, most recently from b083c8d to 86aa7f6 Compare March 7, 2020 04:37
@mjsax mjsax force-pushed the kafka-9441-kip-447-3-unify-commit-in-TM branch from 86aa7f6 to 0b5ae1b Compare March 11, 2020 02:10
Member Author

This is an open question: we don't want to remove this sensor; however, it was unclear to me how to handle this metric after we split "task committing" into three steps (prepareCommit; taskManager#commit; postCommit).


Should we attempt to add more fine-grained metrics for 3 stages then?

Member Author

Frankly, I have no good idea atm... Also, if we change metrics, we need to update the KIP and it gets more complicated. If possible, I would prefer to not change any metrics, but I'm not sure if that is possible...

Contributor

I actually think that we can remove this DEBUG-level per-task commit metric, since we already have the INFO-level per-thread commit metric and this one does not provide much additional information?

Member Author

Simplification to avoid passing in eosEnabled and to reduce the constructor parameter list -- we just piggy-back on the application.id, which shall be null for non-eos.

Contributor

It seems a bit roundabout to have to remember we should send a null application.id as the constructor argument to indicate that eos is enabled. What's wrong with saying "eos is enabled" when you want eos to be enabled?

Contributor

Subjectively I'd +1 that adding one more parameter to avoid piggy-backing on the applicationId is better.

Member Author

@mjsax mjsax Mar 13, 2020

It's a personal preference I guess. But seems you don't like it. Will revert it.

Member Author

Avoid redundant logging.

Member Author

Side cleanup: all those methods can actually be package-private.

Member Author

Removing this state -- this is an open question if I did this correctly. \cc @vvcephei

Member Author

After we have addressed the question of how we want to do metrics, we can update these tests.

Member Author

Because we make the methods in StreamsProducer package-private but need access to commit(), we add TestDriverProducer to get access.

Member Author

Just added to give TTD public access to commitTransaction() (it's more elegant than making StreamsProducer#commitTransaction public, IMHO).

Contributor

Why is making commitTransaction public less elegant? I thought that was fine since StreamsProducer is inside the internals package anyway. In fact, in TTD we have access to InternalTopologyBuilder and call its functions (we used to also have a wrapper of InternalTopologyBuilder, which we removed later), so I thought that was the agreed pattern.

Member Author

It's obviously subjective -- personally, even if something is internal, we should not just declare stuff as public but try to keep visibility to a minimum to follow the idea of encapsulation (not always possible). If you want me to remove this class and make the method public, I can do it in a follow-up PR. Not sure if we have an agreed pattern, though.

Contributor

Cool, in that sense let's just keep it then -- do not add it in one PR and remove it immediately in the next.

@mjsax mjsax force-pushed the kafka-9441-kip-447-3-unify-commit-in-TM branch from 0b5ae1b to 2aeb998 Compare March 11, 2020 02:31

nit: we could log thread-id here for easier log search.

Member Author

The threadId is already added to the log prefix when the log object is created in StreamThread.


Could we internalize this state check inside the task to simplify the logic here?

Member Author

prepareCloseClean() already does a state check and returns an empty map if the state is CREATED.

The point of this check is that we don't add anything to the consumedOffsetsAndMetadataPerTask map -- this is important for the corner case in which all tasks are in state CREATED and thus no transaction was initialized. In this case we cannot call producer.addOffsetsToTransaction() and must skip this step entirely. Note that we have a corresponding check below to not call commitOffsetsOrTransaction if the map is empty.
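A minimal sketch of that guard, with all names illustrative (the nested map shape and the boolean stand-in are invented for the example):

```java
import java.util.Map;

// Sketch of the corner case described above: if every task was still in
// CREATED state, no offsets were collected and no transaction was begun,
// so the commit step must be skipped entirely.
class CommitGuard {
    boolean committed = false;

    void commitOffsetsOrTransaction(Map<String, Map<String, Long>> offsetsPerTask) {
        if (offsetsPerTask.isEmpty()) {
            return; // nothing was consumed; there is no transaction to commit
        }
        // stand-in for producer.sendOffsetsToTransaction(...) + commitTransaction()
        committed = true;
    }
}
```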

Contributor

See my other comments: we should not commit in CREATED, RESTORING, or SUSPENDED state, and it's better to just let the prepareXX function indicate if there's anything to commit based on its state internally, rather than letting the task-manager branch on the task state -- more specifically, the prepareClose call here should not return the map of checkpoints but the map of partition -> partition-timestamps (if empty, it means there is nothing to commit), since the checkpoint map is not needed at the task-manager at all, and post commit, if the offsets are empty the map would still be empty.


Similarly here, this state check could be internalized.

Contributor

I think we should just let the prepareXX function return the map of partitions -> partition-timestamp to indicate if it should be included in the map of committing offsets, so that we do not need to leak the state into the task-manager here. Also, we only need to call mainConsumer.position once for all tasks -- please see my other comment above.

Also: we should not try to commit state if we are in RESTORING, but only flush the store and write checkpoints (I think this is already the behavior in trunk), since the partitions are paused on the main-consumer before restoration is done --- maybe it is partly causing some unit test flakiness.


nit: let's order the functions as

prepareCloseClean
closeClean
prepareCloseDirty
closeDirty


Prepare to uncleanly close a task that we may not own.


Why do we need to move these tests?

Member Author

@mjsax mjsax Mar 12, 2020

I try to keep "order" and group test methods to keep an overview of whether test coverage is complete.

// generic tests
    // functional
    // exception handling
// non-EOS tests
    // functional
    // exception handling
// EOS tests
    // functional
    // exception handling


Looks like we lack test coverage for TimeoutException and KafkaException cases

Member Author

Yeah, this PR does not yet add all required tests...

Member Author

Covered via shouldCommitNextOffsetFromQueueIfAvailable and shouldCommitConsumerPositionIfRecordQueueIsEmpty


We don't have unit test coverage for this exception case

Member Author

Added test shouldThrowWhenHandlingClosingTasksOnProducerCloseError


We lack unit test coverage for this case

Member Author

I know...


Could we verify the assignment stack and lost stack separately, by verifying handleAssignment first before calling handleLost?

Member Author

Not sure if I can follow? The comments just mark which setup calls belong to which test call, nothing more. All setup is done upfront before we call the actual methods under test.

Contributor

@guozhangwang guozhangwang left a comment

This is also a meta comment: I'm trying to think of a way in which we do not need to mess with the hierarchy of the task / task-manager classes, which were just cleaned up.

One wild thought following my previous comment, with the goal of avoiding all of this prepare/post ordering etc., is to abstract this logic away from the task / task-manager and handle it inside the record-collector. In other words, the task still follows the same pattern as in today's trunk, e.g. in commitState, calling recordCollector.commit(consumedOffsetsAndMetadata);. However, the recordCollector does not call the corresponding commitTxn immediately but waits until all expected commit calls have been triggered, and then calls the function with the aggregated offset map. More specifically:

  1. We let the txn-manager keep a reference to the shared record-collector.
  2. We add a "pre-commit" function inside the record-collector which passes in the set of expected tasks (or partitions?) to be committed; then, when commit is called, if there is no set of expected tasks, the record-collector triggers immediately; otherwise, it waits until all the expected elements have been passed in, and then triggers.
  3. In these scenarios:

3.a) Suspend: Although we may only suspend a subset of tasks, we'd still have to commit ALL tasks under eos-beta, so we just call record-collector.preCommit with all the tasks, and then for-loop task.suspend.

3.b) Commit: we would have to commit all tasks, so we just call record-collector.preCommit with all the tasks, and then forloop task.commit.

3.c) CloseClean: no matter what task(s) we are closing, we need to commit for all, so the same as above.

3.d) CloseDirty: we do not need to commit at all, so we do not need to call record-collector.preCommit since we know its commit function would not be triggered.

I admit it is not ideal, since we are sort of poking a hole inside the record-collector to make it tasks-aware, but it saves all the code changes we'd have to introduce in the task and most of the task-manager messiness.

@mjsax sorry for going back and forth on the high-level code design here, I know changing 1000+ LOC is not an easy job... but I just want to make sure we introduce as little technical complexity in our code base as possible to achieve the same thing.

Contributor

nit: add a check that taskId exists in taskProducers to make sure we do not return null.

Contributor

Actually on a second thought, I'm wondering if the following inside TaskManager is cleaner:

for (task <- taskManager.activeTasks)
    task.recordCollector().commit(taskToCommit.getTask(task.id));

Instead of:

activeTaskCreator.streamsProducerForTask(taskToCommit.getKey()).commitTransaction(taskToCommit.getValue());

My gut feeling is that it is cleaner to not access the task creator for its created stream-producers (and hence here we would need to change the task-producer map to streamsProducers), but just access each task's record collector and call its commit --- today we already have a StreamTask#recordCollector method.

Member Author

@mjsax mjsax Mar 13, 2020

I think it's unclean to let the RecordCollector commit (note that this PR removes this from RecordCollector not as a side refactoring but on purpose) -- to me, the RecordCollector has the responsibility to bridge the gap between the runtime code (that is typed) and the Producer that uses <byte[],byte[]> (i.e., it serializes the data and manages errors from send) -- why would a collector know anything about committing (for which it also needs a handle to the consumer)?

About accessing the ActiveTaskCreator: we could also expose the StreamsProducer via the RecordCollector though (or directly via the task)? That would be cleaner I guess.



Contributor

This is a meta comment: I think we can consolidate prepareCommit, prepareClose, and prepareSuspend here by introducing a clean parameter to the function, since their logic is very similar (the part where they differ a bit can be pushed into the post logic); then on the task-manager during commit:

  1. for each task -> task.prepareCommit(true)
  2. commit
  3. for each task -> task.postCommit(true)

During close:

if (clean)
1) for each task -> task.prepareCommit(true)
2) commit()
3) for each task -> task.postCommit(true)
else
1) for each task -> task.prepareCommit(false)
2) // do not commit
3) for each task -> task.postCommit(false)
4) tasks.close(flag)

And the same for suspension.
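The consolidated flow proposed above can be sketched as follows. All names (LifecycleTask, CloseFlow, the boolean commit stand-in) are illustrative, not the actual Kafka Streams API:

```java
import java.util.List;

// Sketch of the proposal: one prepareCommit/postCommit pair taking a `clean`
// flag replaces separate prepareSuspend/prepareClose variants.
interface LifecycleTask {
    void prepareCommit(boolean clean);
    void postCommit(boolean clean);
    void close(boolean clean);
}

class CloseFlow {
    boolean committed = false;

    void closeTasks(List<? extends LifecycleTask> tasks, boolean clean) {
        for (LifecycleTask task : tasks) {
            task.prepareCommit(clean);
        }
        if (clean) {
            committed = true; // stand-in for the single unified commit
        }
        for (LifecycleTask task : tasks) {
            task.postCommit(clean);
        }
        for (LifecycleTask task : tasks) {
            task.close(clean);
        }
    }
}
```

The dirty path runs the same prepare/post hooks with clean == false and simply skips the commit in between, which is the symmetry the proposal is after.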

Member Author

Will do this in a follow up PR.

Contributor

SG.

I think in this PR we can still make the change to let prepareXX return the map of partitions -> partition-timestamp to indicate whether this task should be included in committing.
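A hedged sketch of that suggestion, with plain strings and longs standing in for the real TopicPartition/offset types: prepareCommit() returns the map to be committed, an empty map means the task has nothing to commit and is skipped, and the task-manager merges all maps into one commit call.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface: an empty map signals "nothing to commit, skip me".
interface CommittableTask {
    Map<String, Long> prepareCommit();
}

class CommitOffsetCollector {
    static Map<String, Long> collectOffsets(final List<CommittableTask> tasks) {
        final Map<String, Long> allOffsets = new HashMap<>();
        for (final CommittableTask task : tasks) {
            // tasks returning an empty map contribute nothing and are effectively skipped
            allOffsets.putAll(task.prepareCommit());
        }
        // non-EOS: a single consumer.commitSync(allOffsets) then commits everything at once
        return allOffsets;
    }
}
```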

Contributor

I actually think that we can remove these DEBUG-level per-task commit metrics, since we already have the INFO-level per-thread commit metric and this one does not provide much additional information?

@mjsax
Member Author

mjsax commented Mar 17, 2020

Addressed review comments and fixed a bug exposed by the system tests. Also added more tests.

New system test run: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/3834/

@guozhangwang
Contributor

LGTM. Please feel free to merge after green builds.

@abbccdda

The failed test seems to be just flaky
org.apache.kafka.streams.processor.internals.StateDirectoryTest.shouldReturnEmptyArrayIfListFilesReturnsNull

@mjsax
Member Author

mjsax commented Mar 18, 2020

Actually, the same test failed in both previous runs -- but it passes locally for me. As this PR has nothing to do with the failing test, I might just go ahead and merge. The last PR that changed the failing test was merged recently: 605d55d

There is one concerning comment on that PR: "Merged to trunk and 2.5 after tests green locally." ?

Seems the nightly Jenkins job also fails on this test: -

Did this PR break the Jenkins job? \cc @guozhangwang

@abbccdda

@mjsax My understanding is that this PR will fix the problem: #8310

You could choose to merge it first if it looks good, and rebase the current one.

@guozhangwang
Contributor

@mjsax I just double checked: the failure in trunk is fixed by #8310, while the failure (a different one) in 2.5 is also fixed by @abbccdda 's PR. We should be fine now.

@guozhangwang
Contributor

test this please

@mjsax mjsax force-pushed the kafka-9441-kip-447-3-unify-commit-in-TM branch from 3daa9a4 to 693b33d Compare March 18, 2020 17:04
@mjsax
Member Author

mjsax commented Mar 18, 2020

System test failures in StreamsEosTest (4 different tests, as mentioned by Sophie above).

Given that some more fixes got pushed to trunk, I rebased. This should give us a green Jenkins run and we should be able to merge.

@ableegoldman
Member

@mjsax only two StreamsEOSTest system tests were failing when I ran them. test_rebalance_complex and test_rebalance_simple are either newly failing or due to this PR

@ableegoldman
Member

ableegoldman commented Mar 18, 2020

Also, the two already-failing tests (test_failure_and_recovery) failed with a different error than they did on trunk:

java.lang.IllegalStateException: Unknown TaskId: 0_0
	at org.apache.kafka.streams.processor.internals.ActiveTaskCreator.streamsProducerForTask(ActiveTaskCreator.java:118)
	at org.apache.kafka.streams.processor.internals.TaskManager.commitOffsetsOrTransaction(TaskManager.java:749)
	at org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:706)
	at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:772)
	at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:645)
	at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:501)
	at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:475)

@mjsax
Member Author

mjsax commented Mar 19, 2020

@ableegoldman Thanks for pointing it out! Pushed a fix and updated the unit test accordingly. \cc @guozhangwang

New system test run: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/3835/

@mjsax
Member Author

mjsax commented Mar 19, 2020

StreamsEosTest.test_failure_and_recovery and StreamsEosTest.test_failure_and_recovery_complex failed. I only see:

[2020-03-19 12:12:47,231] ERROR stream-thread [EosTest-e1bb7363-be65-43b7-a35b-326c73533bdf-StreamThread-1] Encountered the following exception during processing and the thread is going to shut down:  (org.apache.kafka.streams.processor.internals.StreamThread)
org.apache.kafka.streams.errors.ProcessorStateException: Error opening store KSTREAM-AGGREGATE-STATE-STORE-0000000003 at location /mnt/streams/EosTest/0_0/rocksdb/KSTREAM-AGGREGATE-STATE-STORE-0000000003
	at org.apache.kafka.streams.state.internals.RocksDBTimestampedStore.openRocksDB(RocksDBTimestampedStore.java:87)
	at org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:195)
	at org.apache.kafka.streams.state.internals.RocksDBStore.init(RocksDBStore.java:231)
	at org.apache.kafka.streams.state.internals.WrappedStateStore.init(WrappedStateStore.java:48)
	at org.apache.kafka.streams.state.internals.ChangeLoggingKeyValueBytesStore.init(ChangeLoggingKeyValueBytesStore.java:44)
	at org.apache.kafka.streams.state.internals.WrappedStateStore.init(WrappedStateStore.java:48)
	at org.apache.kafka.streams.state.internals.MeteredKeyValueStore.lambda$init$0(MeteredKeyValueStore.java:101)
	at org.apache.kafka.streams.processor.internals.metrics.StreamsMetricsImpl.maybeMeasureLatency(StreamsMetricsImpl.java:806)
	at org.apache.kafka.streams.state.internals.MeteredKeyValueStore.init(MeteredKeyValueStore.java:101)
	at org.apache.kafka.streams.processor.internals.StateManagerUtil.registerStateStores(StateManagerUtil.java:81)
	at org.apache.kafka.streams.processor.internals.StandbyTask.initializeIfNeeded(StandbyTask.java:87)
	at org.apache.kafka.streams.processor.internals.TaskManager.tryToCompleteRestoration(TaskManager.java:331)
	at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:586)
	at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:501)
	at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:475)
Caused by: org.rocksdb.RocksDBException: lock : /mnt/streams/EosTest/0_0/rocksdb/KSTREAM-AGGREGATE-STATE-STORE-0000000003/LOCK: No locks available
	at org.rocksdb.RocksDB.open(Native Method)
	at org.rocksdb.RocksDB.open(RocksDB.java:286)
	at org.apache.kafka.streams.state.internals.RocksDBTimestampedStore.openRocksDB(RocksDBTimestampedStore.java:75)
	... 14 more

followed by

java.lang.NullPointerException
	at org.apache.kafka.streams.processor.internals.StoreChangelogReader.remove(StoreChangelogReader.java:801)
	at org.apache.kafka.streams.processor.internals.TaskManager.cleanupTask(TaskManager.java:549)
	at org.apache.kafka.streams.processor.internals.TaskManager.shutdown(TaskManager.java:567)
	at org.apache.kafka.streams.processor.internals.StreamThread.completeShutdown(StreamThread.java:839)
	at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:484)

Seems to be unrelated to this PR. Are we good to merge?

@abbccdda

@mjsax We should be good. I have another PR to fix those #8307

@mjsax mjsax merged commit 89cd2f2 into apache:trunk Mar 19, 2020
@mjsax mjsax deleted the kafka-9441-kip-447-3-unify-commit-in-TM branch June 11, 2020 01:12
@mjsax mjsax added the kip Requires or implements a KIP label Jun 12, 2020
cadonna added a commit that referenced this pull request Nov 9, 2023
The task-level commit metrics were removed without deprecation in KIP-447 and the corresponding PR #8218. However, we forgot to update the docs and to remove the code that creates the task-level commit sensor.
This PR removes the task-level commit metrics from the docs and removes the code that creates the task-level commit sensor. The code was effectively dead since it was only used in tests.

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <matthias@confluent.io>
rreddy-22 pushed a commit to rreddy-22/kafka-rreddy that referenced this pull request Jan 2, 2024
yyu1993 pushed a commit to yyu1993/kafka that referenced this pull request Feb 15, 2024
AnatolyPopov pushed a commit to aiven/kafka that referenced this pull request Feb 16, 2024
clolov pushed a commit to clolov/kafka that referenced this pull request Apr 5, 2024
