
KAFKA-9579 Fetch implementation for records in the remote storage through a specific purgatory. #13535

Merged
satishd merged 30 commits into apache:trunk from satishd:rlm-consumer-fetch on May 18, 2023

Conversation

@satishd
Member

@satishd satishd commented Apr 11, 2023

This PR includes

  • Recognize fetch requests with out-of-range local log offsets
  • Add a fetch implementation for data lying in the range [logStartOffset, localLogStartOffset]
  • Add a new purgatory for async remote read requests, which are served through a dedicated thread pool

We have an extended version of remote fetch that can read from multiple remote partitions in parallel; we will raise it as a follow-up PR.

A few tests for the newly introduced changes are added in this PR. Tests for these scenarios already exist in 2.8.x; we will refactor them against trunk and add them in follow-up PRs.
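The first two bullet points above can be sketched as follows. This is a minimal illustration with hypothetical names, not the PR's actual code: offsets below the local log start offset but at or above the overall log start offset exist only in tiered storage, so a fetch for them must go through the remote read path.

```java
// Hypothetical sketch: deciding whether a fetch offset must be served from
// remote storage. Names are illustrative, not Kafka's actual API.
public class RemoteFetchCheck {
    static boolean servedFromRemote(long fetchOffset, long logStartOffset, long localLogStartOffset) {
        // Offsets in [logStartOffset, localLogStartOffset) have been tiered
        // off local disk and are only available in remote storage.
        return fetchOffset >= logStartOffset && fetchOffset < localLogStartOffset;
    }

    public static void main(String[] args) {
        // With logStartOffset=0 and localLogStartOffset=100, offsets 0..99 are remote-only.
        System.out.println(servedFromRemote(50, 0, 100));   // true
        System.out.println(servedFromRemote(150, 0, 100));  // false
    }
}
```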

Other contributors:
kamal.chandraprakash@gmail.com - Further improvements and adding a few tests
showuon@gmail.com - Added a few test cases for these changes.

PS: This functionality is pulled out of internal branches that contain other functionality related to this feature in 2.8.x. We did not pull all the changes at once because that would make the PR huge and burdensome to review; the remaining work also needs other metrics, minor enhancements (including perf), and minor test changes. We will raise follow-up PRs to cover all of those.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@satishd satishd force-pushed the rlm-consumer-fetch branch 4 times, most recently from ca16e79 to 9ba10cf Compare April 12, 2023 09:43
Member

@showuon showuon left a comment


Thanks for the PR. Left some comments.

Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java Outdated
Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java Outdated
Comment thread core/src/main/java/kafka/log/remote/RemoteLogReader.java Outdated
Comment thread core/src/main/java/kafka/log/remote/RemoteLogReader.java Outdated
Comment thread core/src/main/scala/kafka/server/DelayedRemoteFetch.scala Outdated
Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
@satishd satishd marked this pull request as ready for review April 12, 2023 15:49
@satishd satishd requested a review from junrao April 12, 2023 15:49
@satishd satishd force-pushed the rlm-consumer-fetch branch from 9acff1f to fb613c8 Compare April 12, 2023 15:56
@satishd satishd changed the title [DRAFT] KAFKA-9579 Fetch implementation for records in the remote storage through a specific purgatory. KAFKA-9579 Fetch implementation for records in the remote storage through a specific purgatory. Apr 12, 2023
Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
RecordBatch firstBatch = findFirstBatch(remoteLogInputStream, offset);

if (firstBatch == null)
return new FetchDataInfo(new LogOffsetMetadata(offset), MemoryRecords.EMPTY, false,
Member


I think we need to log something in this case.
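A minimal sketch of the suggested logging, with hypothetical names (the real code returns a FetchDataInfo and uses the broker's logger): record the condition before returning empty records so it is visible in broker logs.

```java
import java.util.logging.Logger;

// Hypothetical sketch: log when no record batch at or beyond the target
// offset is found in the remote segment, instead of silently returning empty data.
public class EmptyFetchLogging {
    private static final Logger LOG = Logger.getLogger("RemoteLogReader");

    static String fetchOrEmpty(Object firstBatch, long offset) {
        if (firstBatch == null) {
            LOG.warning("No batch found for offset " + offset
                    + " in remote segment; returning empty records");
            return "EMPTY"; // stands in for FetchDataInfo with MemoryRecords.EMPTY
        }
        return "DATA";
    }

    public static void main(String[] args) {
        System.out.println(fetchOrEmpty(null, 42L)); // logs a warning, prints EMPTY
    }
}
```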

@satishd satishd force-pushed the rlm-consumer-fetch branch from fb613c8 to c2873f5 Compare April 13, 2023 11:41
Comment on lines +1129 to +1174
// The 1st topic-partition that has to be read from remote storage
var remoteFetchInfo: Optional[RemoteStorageFetchInfo] = Optional.empty()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand a new PR will come to overcome this, but could we provide further context (on the source code or PR) about the implications of using the first topic-partition only?



Agreed - there are consumption patterns which diverge from the local case with this approach: uneven progress across the partitions consumed from a topic, even when those partitions are alike w.r.t. record batch size and overall size.

It may be preferable not to diverge from the local approach and read from all the remote partitions found in the fetchInfos. Then, a different read pattern which provides greater performance for a specific operational environment and workload could be enforced via a configuration property.
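The two strategies under discussion can be contrasted with a small sketch (hypothetical types and names, not the PR's code): picking only the first remote-eligible partition versus collecting every partition whose fetch offset falls in the remote-only range, which a follow-up could read in parallel.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch contrasting first-partition-only remote reads with
// collecting all remote-eligible partitions from the fetch request.
public class RemoteCandidates {
    record FetchInfo(String partition, long fetchOffset) {}

    // All partitions whose fetch offset lies in the tiered (remote-only) range.
    static List<FetchInfo> remoteCandidates(List<FetchInfo> fetchInfos,
                                            long logStartOffset, long localLogStartOffset) {
        return fetchInfos.stream()
                .filter(f -> f.fetchOffset() >= logStartOffset && f.fetchOffset() < localLogStartOffset)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<FetchInfo> infos = List.of(new FetchInfo("t-0", 10),
                new FetchInfo("t-1", 500), new FetchInfo("t-2", 20));
        // First-partition-only would read just t-0; collecting all yields t-0 and t-2.
        System.out.println(remoteCandidates(infos, 0, 100).size()); // 2
    }
}
```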

Member Author


As called out in the PR description, this will be addressed in a follow-up PR. We will describe the config options there, with their respective scenarios. The default will be to fetch from multiple partitions, as is done with local log segments.



Got it, thanks.

Contributor


Sure, thanks.

Contributor

@junrao junrao left a comment


@satishd : Thanks for the PR. A few comments below.

// may arrive and hence make this operation completable.
delayedFetchPurgatory.tryCompleteElseWatch(delayedFetch, delayedFetchKeys)

if (remoteFetchInfo.isPresent) {
Contributor


In line 1082, we should further test !remoteFetchInfo.isPresent, right?

Member Author

@satishd satishd Apr 18, 2023


I am not sure line 1082 still points to what you meant, as the file could have been updated since. Please clarify.

Contributor


In the following code, we should go into that branch only if remoteFetchInfo is empty, right? Otherwise, we could get into a situation where a remote partition is never served because the fetch request is always satisfied by new local data on other partitions.

    if (params.maxWaitMs <= 0 || fetchInfos.isEmpty || bytesReadable >= params.minBytes || errorReadingData ||
      hasDivergingEpoch || hasPreferredReadReplica) {

Member Author


Do you mean to say that we should not return immediately if remoteFetchInfo exists, because that fetch should be served; otherwise remote fetches may starve as long as there is enough local data immediately available to send? So the condition becomes:

    if (!remoteFetchInfo.isPresent && (params.maxWaitMs <= 0 || fetchInfos.isEmpty 
            || bytesReadable >= params.minBytes || errorReadingData || hasDivergingEpoch 
            || hasPreferredReadReplica))
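
This condition can be restated as a runnable predicate (hypothetical boolean flags standing in for the real request state) to make the starvation-avoidance explicit: a fetch completes immediately only when no remote fetch is pending AND one of the usual immediate-return conditions holds.

```java
// Hypothetical restatement of the completion condition discussed above.
// Flags are simplified stand-ins for ReplicaManager's actual state.
public class CompletePredicate {
    static boolean completeImmediately(boolean remoteFetchPresent, long maxWaitMs,
                                       boolean fetchInfosEmpty, long bytesReadable, long minBytes,
                                       boolean errorReadingData, boolean hasDivergingEpoch,
                                       boolean hasPreferredReadReplica) {
        return !remoteFetchPresent && (maxWaitMs <= 0 || fetchInfosEmpty
                || bytesReadable >= minBytes || errorReadingData
                || hasDivergingEpoch || hasPreferredReadReplica);
    }

    public static void main(String[] args) {
        // Enough local bytes but a pending remote fetch: do NOT return immediately,
        // so the remote partition cannot be starved by readily available local data.
        System.out.println(completeImmediately(true, 500, false, 2048, 1, false, false, false));  // false
        System.out.println(completeImmediately(false, 500, false, 2048, 1, false, false, false)); // true
    }
}
```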

Contributor


Yes

Member Author

@satishd satishd May 9, 2023


Sure, that check was missed while pulling the changes. Good catch. Updated it with the latest commit.

Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
Comment thread core/src/main/scala/kafka/server/DelayedRemoteFetch.scala Outdated
Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java
Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java Outdated
Comment thread core/src/main/scala/kafka/server/DelayedRemoteFetch.scala
Comment thread core/src/main/scala/kafka/server/DelayedRemoteFetch.scala Outdated
Comment thread core/src/main/java/kafka/log/remote/RemoteLogReader.java
Comment thread core/src/main/java/kafka/log/remote/RemoteStorageThreadPool.java Outdated
Comment thread core/src/main/java/kafka/log/remote/RemoteStorageThreadPool.java Outdated
InputStream remoteSegInputStream = null;
try {
// Search forward for the position of the last offset that is greater than or equal to the target offset
remoteSegInputStream = remoteLogStorageManager.fetchLogSegment(rlsMetadata.get(), startPos);
Contributor


Would it be possible to send the endOffset as well? Without it, the input stream may contain the whole log segment and not be consumed to the end.
In the case of S3, when the input stream is not consumed to the end, the HTTP connection is aborted.
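
The point can be illustrated with a small sketch (in-memory stream standing in for a remote object store; names are hypothetical): with only a start position the stream may span the rest of the segment, and closing it early can abort the underlying connection, whereas a bounded read only pulls the bytes that will actually be consumed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch of a bounded segment read: only endPos - startPos bytes
// are ever requested, so the stream is always fully consumed before close.
public class RangedReadSketch {
    static byte[] readRange(byte[] segment, int startPos, int endPos) {
        try (InputStream in = new ByteArrayInputStream(segment, startPos, endPos - startPos)) {
            return in.readAllBytes(); // never more than the requested range
        } catch (IOException e) { // cannot happen for an in-memory stream
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] segment = new byte[1024]; // pretend this is a remote log segment
        System.out.println(readRange(segment, 100, 228).length); // 128
    }
}
```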

Member Author


We will look into it in a follow-up PR.

Contributor


Sure, thanks.

Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java
Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java Outdated
Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala
@satishd satishd force-pushed the rlm-consumer-fetch branch 2 times, most recently from b9c6ef8 to 666fd8d Compare April 17, 2023 11:13
Member Author

@satishd satishd left a comment


Thanks @junrao for your review. Addressed your comments inline and/or with the latest commits.

Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java
Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
}

if (searchInLocalLog) {
txnIndexOpt = (localLogSegments.hasNext()) ? Optional.of(localLogSegments.next().txnIndex()) : Optional.empty();
Member Author


Right, it can have duplicates, but the consumer already handles duplicate aborted transactions. Updated the code to remove duplicates in case a consumer implementation cannot handle them.
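
The de-duplication idea can be sketched as follows (hypothetical types; Kafka's actual AbortedTxn lives in the storage module): when remote and local transaction indexes overlap, collect the entries into an order-preserving set so each aborted transaction is reported once.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical sketch of removing duplicate aborted-transaction entries
// collected from overlapping remote and local txn indexes.
public class AbortedTxnDedup {
    // record gives value-based equals/hashCode, which the set relies on.
    record AbortedTxn(long producerId, long firstOffset) {}

    static List<AbortedTxn> dedup(List<AbortedTxn> collected) {
        // LinkedHashSet drops duplicates while preserving first-seen order.
        return new ArrayList<>(new LinkedHashSet<>(collected));
    }

    public static void main(String[] args) {
        List<AbortedTxn> txns = List.of(new AbortedTxn(1, 10),
                new AbortedTxn(1, 10), new AbortedTxn(2, 30));
        System.out.println(dedup(txns).size()); // 2
    }
}
```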

Comment thread core/src/main/java/kafka/log/remote/RemoteLogReader.java
@satishd satishd force-pushed the rlm-consumer-fetch branch from 4a8f67f to b8a3c83 Compare April 19, 2023 05:05
@satishd satishd requested a review from junrao April 25, 2023 06:11
Member

@showuon showuon left a comment


LGTM! @satishd, will the other tests mentioned in the PR description be added here, or in follow-up PRs?

@showuon
Member

showuon commented Apr 28, 2023

@Hangleton @junrao @jeqo, any other comments on this PR? We hope to merge it early in the release cycle, so that we have enough time to test its stability and make further improvements. Thanks.

Member

@divijvaidya divijvaidya left a comment


Overall looks good to me, with one major comment about correctly shutting down the delayed fetch thread pool.

Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java Outdated
Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java Outdated
Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java Outdated
Comment thread core/src/main/scala/kafka/server/DelayedRemoteFetch.scala Outdated
Comment thread core/src/main/scala/kafka/server/DelayedRemoteFetch.scala Outdated
Comment thread core/src/main/scala/kafka/server/DelayedRemoteFetch.scala Outdated
Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java Outdated
Member


Is this RejectedExecutionException propagated to the consumer fetch? If yes, is this a change in the existing interface with the consumer? (Please correct me if I am wrong, but I am not aware of the consumer handling or expecting RejectedExecutionException today.)

Member Author


This error is propagated to the consumer client as an unexpected error (UnknownServerException), which the client already handles.
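
The mapping described here can be sketched as follows (hypothetical enum and mapping; the real broker uses org.apache.kafka.common.protocol.Errors): an exception without a dedicated error code falls through to the unknown-server-error bucket before being returned to the client.

```java
import java.util.concurrent.RejectedExecutionException;

// Hypothetical sketch of broker-side error mapping: exceptions with no
// specific code (e.g. RejectedExecutionException from a saturated
// remote-read pool) surface to the client as an unknown server error.
public class ErrorMapping {
    enum Errors { OFFSET_OUT_OF_RANGE, UNKNOWN_SERVER_ERROR }

    static Errors forException(Throwable t) {
        if (t instanceof IndexOutOfBoundsException)
            return Errors.OFFSET_OUT_OF_RANGE; // illustrative specific mapping
        return Errors.UNKNOWN_SERVER_ERROR;    // fallback for unmapped exceptions
    }

    public static void main(String[] args) {
        System.out.println(forException(new RejectedExecutionException("pool saturated")));
        // UNKNOWN_SERVER_ERROR
    }
}
```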

Member


Thank you. That answers my question.


Should we add a log if the nature of the error is not propagated?

@satishd
Member Author

satishd commented May 1, 2023

LGTM! @satishd , will there be other tests added as in the PR description said, or there will be follow-up PRs to add them?

@showuon Those will be added as follow-ups.

Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java Outdated
Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java Outdated

nit: instead of NOT_AVAILABLE, maybe the message could report that the log start offset is strictly greater than the fetch offset?

Member

@divijvaidya divijvaidya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing the previous comments Satish. I have some additional ones about how we are handling shutdown.

Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java Outdated
Member


How did we decide on 2 min here? I don't think we should block broker shutdown on this, because there are other limits associated with clean vs. unclean shutdown. If we do plan to block, we should tie it to the overall shutdown timeout. As an example, clean shutdown is expected to complete within 5 min; see lifecycleManager.controlledShutdownFuture.get(5L, TimeUnit.MINUTES) in BrokerServer.scala.

Member Author

@satishd satishd May 4, 2023


That does not require this to complete within 5 minutes. lifecycleManager.controlledShutdownFuture is about sending the controlled-shutdown event to the controller for this broker; it waits up to 5 minutes before proceeding with the remaining shutdown sequence, and that is not affected by the code introduced here.
The logging subsystem handles unclean shutdown for log segments, and it will already have finished before RemoteLogManager is closed, so it is not affected by this timeout either. But we can use a short duration here, like 10 secs, and revisit introducing a config if one is really needed for closing the remote log subsystem.
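
The bounded-shutdown pattern under discussion can be sketched like this (hypothetical helper; not the PR's actual close logic): stop accepting new remote-read tasks, wait a short fixed time, then force shutdown rather than blocking broker close indefinitely.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of closing a remote-read thread pool with a short,
// bounded wait instead of blocking broker shutdown indefinitely.
public class BoundedShutdown {
    static boolean close(ExecutorService pool, long timeoutSecs) {
        pool.shutdown(); // reject new remote-read tasks
        try {
            if (pool.awaitTermination(timeoutSecs, TimeUnit.SECONDS))
                return true; // clean termination within the bound
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow(); // interrupt in-flight reads and move on
        return false;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> { }); // a trivial task that finishes immediately
        System.out.println(close(pool, 10)); // true: an idle pool terminates promptly
    }
}
```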

Comment thread core/src/main/java/kafka/log/remote/RemoteLogReader.java Outdated
Comment thread core/src/main/java/kafka/log/remote/RemoteLogReader.java
Comment thread core/src/test/java/kafka/log/remote/RemoteLogReaderTest.java Outdated
Member

@divijvaidya divijvaidya left a comment


This change looks good to me! (assuming tests will be merged in separate PR)

Contributor

@junrao junrao left a comment


@satishd : Thanks for the updated PR. Haven't looked at the testing code. A few more comments.

Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java Outdated
Comment thread core/src/main/java/kafka/log/remote/RemoteLogManager.java Outdated
@satishd
Member Author

satishd commented May 15, 2023

@junrao We are not sure whether those failures are related to this change; they do not fail on a laptop or other hosts. We are looking into them.

@kamalcph
Contributor

@satishd : Thanks for the updated PR. Are the test failures related to this PR, especially the following ones related to the remote store?

[Build / JDK 11 and Scala 2.13 / kafka.server.ListOffsetsRequestWithRemoteStoreTest.testResponseIncludesLeaderEpoch()](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13535/30/testReport/junit/kafka.server/ListOffsetsRequestWithRemoteStoreTest/Build___JDK_11_and_Scala_2_13___testResponseIncludesLeaderEpoch___2/)
[Build / JDK 8 and Scala 2.12 / kafka.server.ListOffsetsRequestWithRemoteStoreTest.testResponseIncludesLeaderEpoch()](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13535/30/testReport/junit/kafka.server/ListOffsetsRequestWithRemoteStoreTest/Build___JDK_8_and_Scala_2_12___testResponseIncludesLeaderEpoch__/)
[Build / JDK 8 and Scala 2.12 / kafka.server.ListOffsetsRequestWithRemoteStoreTest.testResponseIncludesLeaderEpoch()](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13535/30/testReport/junit/kafka.server/ListOffsetsRequestWithRemoteStoreTest/Build___JDK_8_and_Scala_2_12___testResponseIncludesLeaderEpoch___2/)

It is not because of the changes in this PR. #10389 attempted to stabilize this test, but it can still fail if the machine is slow.

Contributor

@junrao junrao left a comment


@kamalcph : Thanks for the investigation. The PR LGTM. Just a couple of minor comments.

Also, should we reopen https://issues.apache.org/jira/browse/KAFKA-12384 since it's still flaky?

Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
@satishd
Member Author

satishd commented May 17, 2023

Thanks @junrao for the updated review. Addressed your latest minor review comments.

Contributor

@junrao junrao left a comment


@satishd : Thanks for the updated PR. A couple of minor comments.

Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
Comment thread core/src/main/scala/kafka/server/ReplicaManager.scala Outdated
@satishd
Member Author

satishd commented May 17, 2023

Thanks @junrao for the latest comments, addressed them with the latest commit.

Contributor

@junrao junrao left a comment


@satishd : Thanks for the updated PR. LGTM

@satishd satishd merged commit 6f19730 into apache:trunk May 18, 2023

@Hangleton Hangleton left a comment


Thanks for the PR Satish!

@dajac
Member

dajac commented May 22, 2023

@satishd I see many failed tests here. Are they related to the changes made in this PR? This commit seems to be the only recent change in this area.

@satishd
Member Author

satishd commented May 23, 2023

@dajac They do not seem to be related to this PR. Please take a look at the comment.

@dajac
Member

dajac commented May 23, 2023

@satishd testResponseIncludesLeaderEpoch fails locally. Does it pass for you? It does not seem to be related to a slow CI.

@satishd
Member Author

satishd commented May 23, 2023

@dajac It passed locally on my laptop.

Gradle Test Run :core:test > Gradle Test Executor 65 > ListOffsetsRequestTest > testResponseDefaultOffsetAndLeaderEpochForAllVersions() PASSED

Gradle Test Run :core:test > Gradle Test Executor 65 > ListOffsetsRequestTest > testListOffsetsMaxTimeStampOldestVersion() PASSED

Gradle Test Run :core:test > Gradle Test Executor 65 > ListOffsetsRequestTest > testListOffsetsErrorCodes() PASSED

Gradle Test Run :core:test > Gradle Test Executor 65 > ListOffsetsRequestTest > testCurrentEpochValidation() PASSED

Gradle Test Run :core:test > Gradle Test Executor 65 > ListOffsetsRequestTest > testResponseIncludesLeaderEpoch() PASSED

BUILD SUCCESSFUL in 1m 8s
55 actionable tasks: 6 executed, 49 up-to-date
➜  kafka git:(apache-trunk) date
Tue May 23 18:00:01 IST 2023
➜  kafka git:(apache-trunk) git rev-parse --verify HEAD
15f8705246e094f7825b76a38d9f12f95d626ee5
➜  kafka git:(apache-trunk)
> Task :core:test

Gradle Test Run :core:test > Gradle Test Executor 71 > ListOffsetsRequestWithRemoteStoreTest > testResponseDefaultOffsetAndLeaderEpochForAllVersions() PASSED

Gradle Test Run :core:test > Gradle Test Executor 71 > ListOffsetsRequestWithRemoteStoreTest > testListOffsetsMaxTimeStampOldestVersion() PASSED

Gradle Test Run :core:test > Gradle Test Executor 71 > ListOffsetsRequestWithRemoteStoreTest > testListOffsetsErrorCodes() PASSED

Gradle Test Run :core:test > Gradle Test Executor 71 > ListOffsetsRequestWithRemoteStoreTest > testCurrentEpochValidation() PASSED

Gradle Test Run :core:test > Gradle Test Executor 71 > ListOffsetsRequestWithRemoteStoreTest > testResponseIncludesLeaderEpoch() PASSED

BUILD SUCCESSFUL in 1m 9s
55 actionable tasks: 6 executed, 49 up-to-date
➜  kafka git:(apache-trunk) date
Tue May 23 18:05:20 IST 2023
➜  kafka git:(apache-trunk) git rev-parse --verify HEAD
15f8705246e094f7825b76a38d9f12f95d626ee5
➜  kafka git:(apache-trunk)

@dajac
Member

dajac commented May 23, 2023

@satishd Weird... It fails all the time on my laptop.

Gradle Test Run :core:test > Gradle Test Executor 9 > ListOffsetsRequestTest > testResponseIncludesLeaderEpoch() FAILED
    org.opentest4j.AssertionFailedError: expected: <(10,1,0)> but was: <(-1,-1,78)>
        at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)
        at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1142)
        at app//kafka.server.ListOffsetsRequestTest.testResponseIncludesLeaderEpoch(ListOffsetsRequestTest.scala:210)
% git rev-parse --verify HEAD
15f8705246e094f7825b76a38d9f12f95d626ee5

@dajac
Member

dajac commented May 23, 2023

@satishd I raised a PR to fix this: #13747. Could you take a look?

quota: ReplicaQuota,
responseCallback: Seq[(TopicIdPartition, FetchPartitionData)] => Unit
): Unit = {
def fetchMessages(params: FetchParams,
Member


A small comment: we should avoid completely changing the code style without reason. The format of the method was not a mistake; it is the format we mainly use in this class nowadays.

Comment on lines +1436 to +1439
private def handleOffsetOutOfRangeError(tp: TopicIdPartition, params: FetchParams, fetchInfo: PartitionData,
                                        adjustedMaxBytes: Int, minOneMessage: Boolean, log: UnifiedLog,
                                        fetchTimeMs: Long, exception: OffsetOutOfRangeException): LogReadResult = {
Member


We usually don't format methods like this. Could we put one argument per line?

Comment on lines +1460 to +1461
val fetchDataInfo =
new FetchDataInfo(new LogOffsetMetadata(offset), MemoryRecords.EMPTY, false, Optional.empty(),
Member


nit: new FetchDataInfo should be on previous line or indented.

Labels

tiered-storage Related to the Tiered Storage feature

9 participants