KAFKA-9441: Unify committing within TaskManager #8218
Conversation
This is not really relevant for this PR, but we need to add it for KIP-447 eventually, thus I just include it in this PR.
We moved this to TaskManager
Please see my other comment above --- I think it is cleaner to just call foreach(active-task) task.recordCollector.commit inside the task-manager; and inside RecordCollectorImpl we check that eosEnabled is always true, otherwise an IllegalStateException is thrown.
In the next PR where we have the thread-producer, we could then create only a single recordCollector object that is shared among all active tasks and wraps the thread-producer; the calling taskManager code can then just get one active task and call its record-collector's commit function, knowing that this is sufficient to actually commit for all tasks since everyone is using the same record-collector.
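The shape of this suggestion can be sketched as follows. This is a hedged illustration only: `RecordCollector`, the `String -> Long` offset map, and the exposed `committed` field are simplified stand-ins, not the actual Kafka Streams classes.

```java
import java.util.HashMap;
import java.util.Map;

// The task manager would loop over all active tasks and call each task's
// record collector's commit(); the collector itself guards against being
// called without EOS enabled.
interface RecordCollector {
    void commit(Map<String, Long> offsetsAndMetadata);
}

class RecordCollectorImpl implements RecordCollector {
    private final boolean eosEnabled;
    Map<String, Long> committed; // exposed for illustration only

    RecordCollectorImpl(final boolean eosEnabled) {
        this.eosEnabled = eosEnabled;
    }

    @Override
    public void commit(final Map<String, Long> offsetsAndMetadata) {
        if (!eosEnabled) {
            // committing through the collector only makes sense with EOS
            throw new IllegalStateException("commit() called although EOS is disabled");
        }
        // stand-in for producer.sendOffsetsToTransaction(...) + commitTransaction()
        committed = new HashMap<>(offsetsAndMetadata);
    }
}
```

With a shared thread-producer, every active task's collector would wrap the same producer, so calling any one of them commits for all tasks.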
WDYT?
After syncing offline about this, I think I'm convinced now that moving this logic into TaskManager is better.
On suspend() and prepareCommit() we don't commit yet, but return the offsets that need to be committed
We don't commit and thus don't throw any longer
Frankly, not sure if this is correct any longer. What do we want to record with this sensor exactly? Flushing can be expensive and we might want to record it as part of committing -- but I am not sure.
I am not happy with this rewrite (but as I know that John did some changes in this class in another PR, I just did this hack here for now -- needs some cleanup after a rebase)
We could also do a second loop over all tasks, after calling commit(..) below -- not sure if this is ok as-is?
I would prefer a second loop to guarantee a consistent reflection on the task committed state.
Moved both tests to TaskManagerTest
move all 4 tests to TaskManagerTest
Checkmark for proving the 6 tests are all migrated.
Could we have a summary for this PR?

The summary is not longer than the PR title. We make

The title sounds pretty vague to me. The description could at least include what the committing behavior looks like under
We need to allow committing in SUSPENDED state now as we first suspend all tasks and then commit. Cf. TaskManager#handleRevocation()
Minor improvement: we include writing the checkpoint, and the caller can indicate if it should be written or not.
This issue was introduced in the PR that introduced StreamsProducer -- we forgot to close them. Fixing this on the side.
Sounds good, just mark that depending on John's fix, we probably don't need to handle this.
We call closeClean below -- just fixing the comment here for now (\cc @guozhangwang)
Note that we don't commit offsets for this case any longer -- previously, committing offsets "might" have been done with closeClean() (even if I believe that the task would be marked as "commitNeeded == false"). We don't let the TaskManager commit offsets here, as it should not be required.
Similar to above: this issue was introduced in the StreamsProducer PR. We need to close the producer when we remove it.
Not sure why we use an iterator here. Simplifying the code with a for-loop
We need to commit explicitly in TTD now to mimic the TaskManager. Hence, we need access to the consumer and streamsProducer.
guozhangwang left a comment
I'm still in the middle of reviewing this PR, just left a meta question here: previously the commit ordering is
- flush store, flush producer
- commit offset
- write checkpoint file
Now it becomes 1/3/2. It means if we have a crash between 2 and 3, in the past we would likely read from an old checkpointed offset and replay the changelogs, which is fine with or without EOS (with EOS we would just read committed data still so it is safe).
But now if we crash between 3 and 2, it means the checkpoint file has been written, but the offsets are not committed yet, which means we could potentially lose some data. Right?
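The two orderings under discussion can be sketched as follows; this is an illustration only, with `flush`/`commitOffsets`/`writeCheckpoint` as stand-ins for the actual store flush, offset commit, and checkpoint-file write, and the method bodies merely recording each step so the crash windows are easy to see.

```java
import java.util.ArrayList;
import java.util.List;

class CommitOrdering {
    final List<String> steps = new ArrayList<>();

    void flush()           { steps.add("flush"); }            // flush store + producer
    void commitOffsets()   { steps.add("commit-offsets"); }
    void writeCheckpoint() { steps.add("write-checkpoint"); }

    // previous order (1/2/3): a crash after the commit but before the
    // checkpoint only means replaying the changelog from an older
    // checkpoint -- safe with or without EOS
    void previousOrder() { flush(); commitOffsets(); writeCheckpoint(); }

    // new order (1/3/2): a crash after the checkpoint but before the
    // commit leaves a checkpoint ahead of the committed offsets -- the
    // potential data-loss concern raised in the comment above
    void newOrder() { flush(); writeCheckpoint(); commitOffsets(); }
}
```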
abbccdda left a comment
Thanks for the PR, left some comments
Did you already update the KIP for the new config?
What's the benefit of building this as a static helper?
We will need this later (i.e., in a follow-up PR) and it reduces code duplication.
We should have done this from the beginning... (it's just a "side fix")
Add a comment describing the new return statement.
What's the reasoning here for only wrapping the consumer offset commit case, not the EOS case?
You mean exception handling? For the producer all exception handling is done within StreamsProducer (note that threadProducer above is a StreamsProducer, not a KafkaProducer)
Always feels better for one less parameter :)
b083c8d to 86aa7f6
86aa7f6 to 0b5ae1b
This is an open question: we don't want to remove this sensor; however, it was unclear to me how to handle this metric after we split "task committing" into three steps (prepareCommit; taskManager#commit; postCommit).
Should we attempt to add more fine-grained metrics for 3 stages then?
Frankly, I have no good idea atm... Also, if we change metrics, we need to update the KIP and it's getting more complicated. If possible, I would prefer to not change any metric, but not sure if it is possible...
I actually think that we can remove this DEBUG-level per-task commit metrics, since we already have the INFO-level per-thread commit metric and this one does not provide much more additional information?
Simplification to avoid passing in eosEnabled and to reduce the constructor parameter list -- we just piggy-back on the application.id, which shall be null for non-eos.
It seems a bit roundabout to have to remember we should send a null application.id as the constructor argument to indicate that eos is enabled. What's wrong with saying "eos is enabled" when you want eos to be enabled?
Subjectively I'd +1 that adding one more parameter to avoid piggy-backing on the applicationId is better.
It's a personal preference I guess. But seems you don't like it. Will revert it.
Side cleanup: all those methods can actually be package private.
Removing this state -- this is an open question if I did this correctly. \cc @vvcephei
After we have addressed the question of how we want to do metrics, we can update these tests.
Because we make all methods in StreamsProducer package private but need access to commit(), we add TestDriverProducer to get access.
Just added to give public access to commitTransaction() to TTD (it's more elegant than making StreamsProducer#commitTransaction public IMHO)
Why is making commitTransaction public less elegant? I thought that was fine since StreamsProducer is inside the internals package anyway? In fact, in TTD we have access to InternalTopologyBuilder and access its functions (we used to also have a wrapper of InternalTopologyBuilder which we removed later), so I thought that was the agreed pattern.
It's obviously subjective -- personally, even if something is internal, we should not just declare stuff as public but try to keep it to a minimum to follow the idea of encapsulation (not always possible). If you want me to remove this class and make the method public I can do it in a follow up PR. Not sure if we have an agreed pattern, though.
Cool, in that sense let's just keep it then -- do not add it in one PR and remove it immediately in the next.
0b5ae1b to 2aeb998
nit: we could log thread-id here for easier log search.
The threadId is already added to the log prefix when the log object is created in StreamsThread
Could we internalize this state check inside the task to simplify the logic here?
prepareCloseClean() already does a state check and returns emptyMap if state is CREATED.
The point of this check is that we don't add anything to the consumedOffsetsAndMetadataPerTask map -- this is important for the corner case in which all tasks are in state CREATED and thus no transaction was initialized. For this case we cannot call producer.addOffsetsToTransaction() and must skip this step entirely. Note that we have a corresponding check below to not call commitOffsetsOrTransaction if the map is empty.
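A minimal sketch of that guard; this is illustrative only, with a `String -> Long` offset map standing in for the real `TopicPartition -> OffsetAndMetadata` map and a boolean flag standing in for the actual commit call.

```java
import java.util.Map;

// If all tasks are still in CREATED state, no offsets were collected and
// no transaction was ever initialized, so producer.addOffsetsToTransaction()
// must not be called and the whole commit step is skipped.
class CommitSkipSketch {
    boolean commitTriggered = false;

    void maybeCommit(final Map<String, Long> consumedOffsetsAndMetadataPerTask) {
        if (!consumedOffsetsAndMetadataPerTask.isEmpty()) {
            commitTriggered = true; // stand-in for commitOffsetsOrTransaction(...)
        }
    }
}
```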
See my other comments: we should not commit in CREATED, RESTORING and SUSPENDED state, and it's better just to let the prepareXX function to indicate if there's anything to commit based on its state internally than letting task-manager to branch on the task state -- more specifically, here the prepareClose call should not return the map of checkpoints but the map of partition -> partition-timestamps (if empty it means nothing to commit), since the checkpoint map are not needed at task-manager at all and post commit, if the offsets should be empty it would still be empty.
Similarly here, this state check could be internalized.
I think we should just let the prepareXX function to return the map of partitions -> partition-timestamp to indicate if it should be included in the map of committing offsets, so that we do not need to leak the state into task-manager here. Also we only need to call mainConsumer.position once for all tasks -- please see my other comment above.
Also: we should not try to commit state if we are in RESTORING but only flush the store and write checkpoints (I think this is already the behavior in trunk), since the partitions are paused on the main-consumer before restoration is done --- maybe it is partly causing some unit test flakiness.
nit: let's order the functions as
- prepareCloseClean
- closeClean
- prepareCloseDirty
- closeDirty
Prepare to uncleanly close a task that we may not own.
I try to keep "order" and group test methods to keep an overview if test coverage is complete.
// generic tests
// functional
// exception handling
// non-EOS tests
// functional
// exception handling
// EOS tests
// functional
// exception handling
Looks like we lack test coverage for TimeoutException and KafkaException cases
Yeah, this PR does not yet add all required tests...
Covered via shouldCommitNextOffsetFromQueueIfAvailable and shouldCommitConsumerPositionIfRecordQueueIsEmpty
We don't have unit test coverage for this exception case
Added test shouldThrowWhenHandlingClosingTasksOnProducerCloseError
We lack unit test coverage for this case
Could we verify the assignment stack and lost stack separately, by verifying handleAssignment first before calling handleLost?
Not sure if I can follow? The comments just mark which setup calls belong to which test call, nothing more. All setup is done upfront before we call the actual methods under test.
This is also a meta comment: I'm trying to think of a way where we do not need to mess up the hierarchy of task / task-manager classes which was just cleaned.
One wild thought following my previous comment, with the goal to avoid all of those prepare/post orderings etc., is to abstract this logic away from the task / task-manager and handle it inside the record-collector. In other words, the task still follows the same pattern as in today's trunk, e.g. in commitState, calling recordCollector.commit(consumedOffsetsAndMetadata);. However, the recordCollector does not call the corresponding commitTxn immediately but waits until all expected commit calls have been triggered, and then calls the function with the aggregated offset map. More specifically:
- We let the txn-manager keep a reference to the shared record-collector.
- We add a "pre-commit" function inside record-collector which is passed the set of expected tasks (or partitions?) to be committed; then when commit is called, if there is no set of expected tasks, record-collector triggers immediately; otherwise, it waits until all the expected elements have been passed in, and then triggers.
- In these scenarios:
  3.a) Suspend: Although we may only suspend a subset of tasks, we'd still have to commit ALL tasks under eos-beta, so we just call record-collector.preCommit with all the tasks, and then forloop task.suspend.
  3.b) Commit: we would have to commit all tasks, so we just call record-collector.preCommit with all the tasks, and then forloop task.commit.
  3.c) CloseClean: no matter what task(s) we are closing, we need to commit for all, so the same as above.
  3.d) CloseDirty: we do not need to commit at all, so we do not need to call record-collector.preCommit since we know its commit function would not be triggered.
I admit it is not ideal since we are sort of poking a hole inside record-collector to be task-aware, but it saves all the code changes we'd have to introduce in task and most of the task-manager messiness.
@mjsax sorry for going back and forth on the high-level code design here, I know changing 1000+ LOC is not an easy job... but I just want to make sure we introduce as little tech complexity in our code base as possible to achieve the same thing.
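The pre-commit aggregation idea above can be sketched roughly as follows. This is a hedged illustration only: `AggregatingCollector`, the `preCommit`/`commit` signatures, and the `String`/`Long` types are invented stand-ins, not actual Kafka Streams APIs.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// The shared record collector buffers per-task offsets after
// preCommit(expectedTasks) and only triggers the actual (transactional)
// commit once every expected task has called commit().
class AggregatingCollector {
    private Set<String> expectedTasks = Collections.emptySet();
    private final Map<String, Long> pending = new HashMap<>();
    Map<String, Long> lastCommitted; // exposed for illustration only

    void preCommit(final Set<String> tasks) {
        expectedTasks = new HashSet<>(tasks);
    }

    void commit(final String taskId, final Map<String, Long> offsets) {
        if (expectedTasks.isEmpty()) {
            doCommit(new HashMap<>(offsets)); // no aggregation requested
            return;
        }
        pending.putAll(offsets);
        expectedTasks.remove(taskId);
        if (expectedTasks.isEmpty()) { // all expected tasks reported in
            doCommit(new HashMap<>(pending));
            pending.clear();
        }
    }

    private void doCommit(final Map<String, Long> offsets) {
        // stand-in for producer.sendOffsetsToTransaction + commitTransaction
        lastCommitted = offsets;
    }
}
```

As the comment itself concedes, this makes the collector tasks-aware, which is the main design cost of this alternative.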
nit: add a check that taskId exists in taskProducers to make sure we do not return null.
Actually, on second thought, I'm wondering if the following inside TaskManager is cleaner:

for (task <- taskManager.activeTasks)
    task.recordCollector().commit(taskToCommit.getTask(task.id));

instead of:

activeTaskCreator.streamsProducerForTask(taskToCommit.getKey()).commitTransaction(taskToCommit.getValue());

My gut feeling is that it is cleaner to not access the task creator for its created stream-producers (and hence here we would need to change the task-producer map to streamsProducers), but just access each task's record collector and call its commit --- today we already have a StreamTask#recordCollector method.
I think it's unclean to let the RecordCollector commit (note that this PR removes RecordCollector not as a side refactoring but on purpose) -- to me the RecordCollector has the responsibility to bridge the gap between the runtime code (that is typed) and the Producer that uses <byte[],byte[]> (i.e., it serializes the data and manages errors from send) -- why would a collector know anything about committing (for which it also needs a handle to the consumer)?
About accessing the ActiveTaskCreator: we could also expose the StreamsProducer via the RecordCollector, though (or directly via the task)? That would be cleaner, I guess.
This is a meta comment: I think we can consolidate prepareCommit, prepareClose, and prepareSuspend here by introducing a clean parameter to the function, since their logic is very similar (where they differ a bit, it can be pushed to the post logic). On task-manager during commit:
- for each task -> task.prepareCommit(true)
- commit
- for each task -> task.postCommit(true)

During close:

if (clean)
  1) for each task -> task.prepareCommit(true)
  2) commit()
  3) for each task -> task.postCommit(true)
else
  1) for each task -> task.prepareCommit(false)
  2) // do not commit
  3) for each task -> task.postCommit(false)
4) tasks.close(flag)

And the same for suspension.
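The consolidated close flow described above can be sketched as follows; this is an illustrative sketch, with `SimplifiedTask`, `CloseFlow`, and the `Runnable` commit hook as invented stand-ins for the real `Task`/`TaskManager` types.

```java
import java.util.List;

// One prepareCommit(clean) / postCommit(clean) pair replaces the separate
// prepareSuspend / prepareCloseClean / prepareCloseDirty variants.
interface SimplifiedTask {
    void prepareCommit(boolean clean);
    void postCommit(boolean clean);
    void close(boolean clean);
}

class CloseFlow {
    static void closeTasks(final List<SimplifiedTask> tasks,
                           final boolean clean,
                           final Runnable commit) {
        for (final SimplifiedTask t : tasks) {
            t.prepareCommit(clean);
        }
        if (clean) {
            commit.run(); // only a clean close commits
        }
        for (final SimplifiedTask t : tasks) {
            t.postCommit(clean);
        }
        for (final SimplifiedTask t : tasks) {
            t.close(clean);
        }
    }
}
```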
Will do this in a follow up PR.
SG.
I think in this PR we can still do the change to let prepareXX return the map of partitions -> partition-timestamp to indicate whether this task should be included in committing.
Addressed review comments and fixed a bug exposed by system tests. Also added more tests. New system test run: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/3834/
LGTM. Please feel free to merge after green builds.
The failed test seems to be just flaky.
Actually, the same test failed in both previous runs -- but it passes locally for me. As this PR has nothing to do with the failing test, I might just go ahead and merge. The last PR that changed the failing test was merged recently: 605d55d. There is one concerning comment on that PR: seems the nightly Jenkins job also fails on this test:
Did this PR break the Jenkins job? \cc @guozhangwang
test this please
3daa9a4 to 693b33d
System test failure. Given that some more fixes got pushed to
@mjsax only two
Also, the two already-failing tests (
@ableegoldman Thanks for pointing it out! Pushed a fix and updated the unit test accordingly. \cc @guozhangwang New system test run: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/3835/
Seems to be unrelated to this PR. Are we good to merge?
The task-level commit metrics were removed without deprecation in KIP-447 and the corresponding PR #8218. However, we forgot to update the docs and to remove the code that creates the task-level commit sensor. This PR removes the task-level commit metrics from the docs and removes the code that creates the task-level commit sensor. The code was effectively dead since it was only used in tests. Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <matthias@confluent.io>
With KIP-447 we need to commit all tasks at once using eos-beta, as all tasks share the same producer. Right now, each task is committed individually (by making multiple calls to `consumer.commitSync`, or individual calls to `producer.commitTransaction` for the corresponding task producer).

To allow for a unified commit logic, we move the commit logic into `TaskManager`. For non-eos, we collect all offsets to be committed and do a single call to `consumer.commitSync` -- for eos-alpha we still commit each transaction individually (but now triggered by the `TaskManager` to have all code in one place).

To allow for a unified commit logic, we need to split existing methods on the `Task` interface into pre/post parts to allow committing in between, in particular:
- `commit()` -> `prepareCommit()` and `postCommit()`
- `suspend()` -> `prepareSuspend()` and `suspend()` (we keep the name as it still suspends the task)
- `closeClean()` -> `prepareCloseClean()` and `closeClean(Map<TopicPartition, Long> checkpoint)` (we keep the name as it still closes the task)
- `closeDirty()` -> `prepareCloseDirty()` and `closeDirty()` (we keep the name as it still closes the task)

`prepareCloseClean()` returns checkpoint information to allow checkpointing after the `TaskManager` did the commit within `closeClean()`.

In a follow-up PR, we will introduce eos-beta and commit a single transaction over all tasks using the shared producer.
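The split described above can be summarized as a minimal interface sketch. Method names follow the PR description; the `Map` types are simplified stand-ins (the real code uses `TopicPartition` keys and `OffsetAndMetadata` / `Long` values), so treat this as an illustration, not the actual `Task` interface.

```java
import java.util.Map;

interface SplitTask {
    // commit() -> prepareCommit() + postCommit()
    Map<String, Long> prepareCommit();   // returns what needs to be committed
    void postCommit();

    // suspend() -> prepareSuspend() + suspend()
    Map<String, Long> prepareSuspend();
    void suspend();

    // closeClean() -> prepareCloseClean() + closeClean(checkpoint);
    // prepareCloseClean() returns checkpoint information so the TaskManager
    // can commit in between and pass the checkpoint back in
    Map<String, Long> prepareCloseClean();
    void closeClean(Map<String, Long> checkpoint);

    // closeDirty() -> prepareCloseDirty() + closeDirty()
    void prepareCloseDirty();
    void closeDirty();
}
```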
Call for review @guozhangwang @abbccdda