Skip to content

KAFKA-8729: Change PartitionResponse to include all troubling records#7612

Merged
guozhangwang merged 4 commits intoapache:trunkfrom
tuvtran:enhanced-validation
Nov 6, 2019
Merged

KAFKA-8729: Change PartitionResponse to include all troubling records#7612
guozhangwang merged 4 commits intoapache:trunkfrom
tuvtran:enhanced-validation

Conversation

@tuvtran
Copy link
Copy Markdown
Contributor

@tuvtran tuvtran commented Oct 29, 2019

Background:

Currently, whenever a batch is dropped because ofInvalidRecordException or InvalidTimestampException, only the culprit record appears in ProduceResponse.PartitionResponse.recordErrors. However, after users try to resend that batch excluding the rejected message, the latter records are not guaranteed to be free of problems.

Changes:

To address this issue, I changed the function signature of validateKey, validateRecord and validateTimestamp to return a Scala's Option object. Specifically, this object will hold the reason/message the current record in iteration fails and leaves to the callers (convertAndAssignOffsetsNonCompressed, assignOffsetsNonCompressed, validateMessagesAndAssignOffsetsCompressed) to gathered all troubling records into one place. Then, all these records will be returned along with the PartitionResponse object. As a result, if a batch contains more than one record errors, users see exactly which records cause the failure. PartitionResponse.recordErrors is a list of RecordError objects introduced by #7167 which include batchIndex denoting the relative position of a record in a batch and message indicating the reason of failure.

Gotchas:

Things are particularly tricky when a batch has records rejected because of both InvalidRecordException and InvalidTimestampException. In this case, the InvalidTimestampException takes higher precedence. Therefore, the Error field in PartitionResponse will be encoded with INVALID_TIMESTAMP.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@tuvtran tuvtran changed the title change function signatures KAFKA-8729: Change PartitionResponse to include all troubling records Oct 29, 2019
@guozhangwang
Copy link
Copy Markdown
Contributor

retest this please

@tuvtran tuvtran force-pushed the enhanced-validation branch from eea3dde to b4f12d2 Compare October 31, 2019 21:21
@tuvtran
Copy link
Copy Markdown
Contributor Author

tuvtran commented Oct 31, 2019

This PR will likely have some conflicts with #7628 , which is a blocker for 2.4 so I think we should wait for it to be merged first.

@tuvtran
Copy link
Copy Markdown
Contributor Author

tuvtran commented Nov 4, 2019

Hi @guozhangwang , I have merged the changes from trunk to this PR. The build should be green now.

@guozhangwang
Copy link
Copy Markdown
Contributor

LGTM!

@guozhangwang guozhangwang merged commit 16f1ce1 into apache:trunk Nov 6, 2019
@ijuma
Copy link
Copy Markdown
Member

ijuma commented Nov 7, 2019

Have we verified that this doesn't cause any performance regressions?

@tuvtran
Copy link
Copy Markdown
Contributor Author

tuvtran commented Nov 8, 2019

I’ve been running kafka-producer-perf-test with 1M messages of size 102400 bytes each and observed RequestHandlerAvgIdlePercent. I haven’t seen any significant difference on that metric. Producer throughputs are also the same

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants