KAFKA-6834: Handle compaction with batches bigger than max.message.bytes #4953
Conversation
ijuma
left a comment
Thanks for the PR, left a few comments.
This should be CorruptRecordException too.
We only ever invoke this method from nextBatch(), which has already done this check, or when growing log cleaner buffers beyond max.message.bytes. So we don't expect this check to fail, hence IllegalStateException seems better?
We also invoke it from MemoryRecords, no?
We invoke it from MemoryRecords#nextBatchSize() for use in LogCleaner since this class is package-private.
Right, so it seems a bit brittle to assume that some checks have already been done.
Since we have the check in nextBatch already, would it make sense to grab the record size there and pass it to this function as an argument?
I changed the method to return null if there isn't enough data in the buffer, making it consistent with nextBatch. Added comments and test.
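That contract can be sketched stand-alone. This is an illustrative simplification, not the real ByteBufferLogInputStream: the class and method names are assumptions, and the only real detail borrowed from Kafka's format is the 12-byte log overhead (8-byte base offset followed by a 4-byte batch length).

```java
import java.nio.ByteBuffer;

public class BatchSizeSketch {
    // Kafka's log overhead: 8-byte base offset + 4-byte batch length
    static final int LOG_OVERHEAD = 12;

    // Returns the total size of the next batch, or null if the buffer does
    // not yet contain a complete header -- mirroring nextBatch(), which also
    // returns null on insufficient data rather than throwing.
    static Integer nextBatchSize(ByteBuffer buffer) {
        if (buffer.remaining() < LOG_OVERHEAD)
            return null;
        // Absolute getInt: reads the length field without moving position()
        int batchLength = buffer.getInt(buffer.position() + 8);
        return batchLength + LOG_OVERHEAD;
    }

    public static void main(String[] args) {
        ByteBuffer partial = ByteBuffer.allocate(4);   // header incomplete
        System.out.println(nextBatchSize(partial));    // null

        ByteBuffer full = ByteBuffer.allocate(64);
        full.putLong(0L);   // base offset
        full.putInt(40);    // batch length after the header
        full.flip();
        System.out.println(nextBatchSize(full));       // 52
        System.out.println(full.position());           // 0 -- unchanged
    }
}
```

The absolute `getInt` (with an explicit index) is what keeps the buffer's position untouched, which is the property the unit test mentioned above verifies.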
What if the max message size is reduced? Is this check too strict?
We seem to only create ByteBufferLogInputStream with a maxMessageSize of Integer.MAX_VALUE.
But this is a general class, right? It may be used in different ways in the future. We should try to write robust code that makes sense conceptually. If there are assumptions, we should document them and ideally have tests so that we can't break them.
Shouldn't we do something if this evaluates to false?
When invoked from nextBatch, we return null and would do this again when there is sufficient data in the buffer (that was the existing behaviour). For the log cleaner, this should always succeed since we are growing the buffer beyond max.message.bytes.
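The buffer-growing path being discussed can be sketched roughly as follows. The real logic lives in the Scala LogCleaner; this is a hypothetical stand-alone Java version (the method name and doubling strategy are assumptions) that captures the point of the fix: growth is driven by the actual batch size, not capped at max.message.bytes.

```java
import java.nio.ByteBuffer;

public class GrowBufferSketch {
    // Grow the cleaner's read buffer so it can hold one over-sized batch.
    // Batches written under an older, larger max.message.bytes can then
    // still be cleaned after the limit is reduced.
    static ByteBuffer growBufferTo(ByteBuffer buffer, int neededSize) {
        if (buffer.capacity() >= neededSize)
            return buffer; // already big enough
        // Double until the batch fits, capping at exactly neededSize
        // (assumes a non-zero starting capacity)
        int newSize = Math.max(buffer.capacity(), 1);
        while (newSize < neededSize)
            newSize = Math.min(newSize * 2, neededSize);
        return ByteBuffer.allocate(newSize);
    }

    public static void main(String[] args) {
        ByteBuffer small = ByteBuffer.allocate(1024);
        System.out.println(growBufferTo(small, 5000).capacity()); // 5000
        System.out.println(growBufferTo(small, 512) == small);    // true
    }
}
```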
We should include the log segment base offset if we can in messages such as this. Also, at this point we have already allocated the batch, so is checking crc even helpful?
I was thinking that if there is a lot of corrupt data in the logs and we managed to allocate a buffer for the first one because the size was small enough to allocate, it may be worth checking CRC to detect corruption and avoid allocating even larger buffers later.
Updated exception message.
Should we be duplicating the buffer?
Also, not really sure we need to expose nextBatchSize. Couldn't we do nextBatch().size()?
We don't change the position in the buffer in nextBatchSize, so duplicate() is not required? The unit test verifies that the position is not changed.
nextBatch() returns null if the buffer is not large enough to hold the batch. nextBatchSize returns the batch size as long as the header is present, so we can allocate the buffer based on this size. Have I misunderstood the comment?
The notion of "next batch" is a little weird for MemoryRecords since it just represents a chunk of messages. Is firstBatchSize the intent?
Good point, changed to firstBatchSize.
Hmm.. should we validate the CRC before growing the buffers? The length is not protected by the CRC of course, but corruption may impact both the length and parts of the batch.
Discussed with @ijuma offline. This probably doesn't make much sense because checking the CRC requires pulling the record into memory in the first place (which requires the length). In that case, I feel like the CRC check on the first batch is just kind of weird. Do we really get a lot of benefit from it? Maybe a simple validation we could do is at least ensure that the size of the batch is not bigger than the remaining size of the segment.
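The simple validation suggested here could look like the following. This is illustrative only: the method name and parameters are assumptions, and the exception type is a stand-in (Kafka would more likely raise CorruptRecordException).

```java
public class BatchSizeCheck {
    // Reject a batch whose claimed size exceeds the bytes remaining in the
    // segment: a corrupted length field would otherwise drive a huge (and
    // pointless) buffer allocation before any CRC check could even run,
    // since checking the CRC requires reading the batch into memory first.
    static void ensureBatchFitsSegment(int batchSize,
                                       long segmentRemainingBytes,
                                       long baseOffset) {
        if (batchSize > segmentRemainingBytes)
            throw new IllegalStateException("Batch of size " + batchSize +
                " at base offset " + baseOffset + " exceeds the " +
                segmentRemainingBytes +
                " bytes left in the segment; data is likely corrupt");
    }

    public static void main(String[] args) {
        ensureBatchFitsSegment(100, 1000, 0L); // fits: no exception
        try {
            ensureBatchFitsSegment(5000, 1000, 42L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Including the base offset in the message follows the earlier review suggestion to identify where in the log the corruption was seen.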
It seems that we need to grow buffers in a similar way in buildOffsetMapForSegment() too?
2e234b1 to 43aa1dc
@ijuma @hachikuji @junrao Thanks for the reviews. I have addressed the comments so far.
hachikuji
left a comment
LGTM, thanks for the fix. Just one comment which does not need re-review.
def createLogWithMessagesLargerThanMaxSize(largeMessageSize: Int): (Log, FakeOffsetMap) = {
  // Create cleaner with very small default max message size
This comment seems misplaced?
@hachikuji Thanks for the review, merging to trunk.
…-record-version

* apache-github/trunk:
  KAFKA-6894: Improve err msg when connecting processor with global store (apache#5000)
  KAFKA-6893; Create processors before starting acceptor in SocketServer (apache#4999)
  MINOR: Fix typo in ConsumerRebalanceListener JavaDoc (apache#4996)
  MINOR: Remove deprecated valueTransformer.punctuate (apache#4993)
  MINOR: Update dynamic broker configuration doc for truststore update (apache#4954)
  KAFKA-6870 Concurrency conflicts in SampledStat (apache#4985)
  KAFKA-6361: Fix log divergence between leader and follower after fast leader fail over (apache#4882)
  KAFKA-6813: Remove deprecated APIs in KIP-182, Part II (apache#4976)
  KAFKA-6878 Switch the order of underlying.init and initInternal (apache#4988)
  KAFKA-6299; Fix AdminClient error handling when metadata changes (apache#4295)
  KAFKA-6878: NPE when querying global state store not in READY state (apache#4978)
  KAFKA 6673: Implemented missing override equals method (apache#4745)
  KAFKA-6834: Handle compaction with batches bigger than max.message.bytes (apache#4953)
…tes (apache#4953)

Grow buffers in log cleaner to hold one message set after sanity check, even if the message set is bigger than max.message.bytes.

Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>