KAFKA-12552: Introduce LogSegments class abstracting the segments map #10401

Merged
junrao merged 4 commits into apache:trunk from kowshik:refactor_extract_LogSegments_class on Mar 30, 2021

@kowshik kowshik commented Mar 25, 2021

This PR is a precursor to the recovery logic refactor work (KAFKA-12553).

In this PR, I've extracted the behavior surrounding segments map access within the kafka.log.Log class into a new class: kafka.log.LogSegments. This class encapsulates a thread-safe navigable map of kafka.log.LogSegment instances and provides the required read and write behavior on the map. The Log class now encapsulates an instance of the LogSegments class.

A couple of advantages of this PR:

  • Makes the Log class a bit more modular, as it moves out certain private behavior that otherwise lives within the Log class.
  • This is a precursor to refactoring the recovery logic (KAFKA-12553). In the future, the logic for recovery and loading of segments from disk (during Log init) will reside outside the Log class. Such logic would need to instantiate and access an instance of the newly added LogSegments class.

Tests:
Added a new test suite: kafka.log.LogSegmentsTest covering the APIs of the newly introduced class.
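
To illustrate the shape of the abstraction described above, here is a minimal Java sketch of a class that wraps a thread-safe navigable map of segments keyed by base offset. It is not the actual kafka.log.LogSegments code; the names SegmentsSketch and Segment, and the choice of methods, are assumptions made purely for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Illustrative only: a hypothetical wrapper, not the real kafka.log.LogSegments.
final class SegmentsSketch {

    // Hypothetical stand-in for kafka.log.LogSegment; only the base offset is modeled.
    static final class Segment {
        final long baseOffset;

        Segment(long baseOffset) {
            this.baseOffset = baseOffset;
        }
    }

    // Thread-safe map, sorted by base offset. Callers never touch it directly.
    private final ConcurrentNavigableMap<Long, Segment> segments = new ConcurrentSkipListMap<>();

    // Adds a segment and returns the one it replaced, or null if there was none.
    Segment add(Segment segment) {
        return segments.put(segment.baseOffset, segment);
    }

    void remove(long baseOffset) {
        segments.remove(baseOffset);
    }

    // The segment with the greatest base offset <= the given offset, if any.
    Optional<Segment> floorSegment(long offset) {
        return Optional.ofNullable(segments.floorEntry(offset)).map(Map.Entry::getValue);
    }

    // The segment with the greatest base offset, if any.
    Optional<Segment> lastSegment() {
        return Optional.ofNullable(segments.lastEntry()).map(Map.Entry::getValue);
    }

    int numberOfSegments() {
        return segments.size();
    }
}
```

Keeping the map private means that code outside the owning class, such as future recovery logic, can only go through the wrapper's methods.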

kowshik commented Mar 25, 2021

@junrao, @dhruvilshah3, @satishd - this PR is ready for review.

@junrao junrao left a comment

@kowshik : Thanks for the PR. Just one comment below.

Is this the same as lastSegment()?

@kowshik kowshik Mar 29, 2021

Yes. That's a good question. I'll eliminate this method.

Done.

@kowshik kowshik force-pushed the refactor_extract_LogSegments_class branch from 5830b78 to 4d73714 on March 29, 2021 18:36

kowshik commented Mar 29, 2021

Thanks for the review, @junrao! I've addressed the review comment(s) in 8f1e292 and 4d73714.

@dhruvilshah3 dhruvilshah3 left a comment

PR looks reasonable to me. Thanks @kowshik.

nit: should this return an Iterable instead of Seq?

Contributor Author

The suggestion feels reasonable to me. I've addressed it in 43bf65c.
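
The thread does not spell out the reasoning behind the nit, so the following is only one plausible reading: a narrower return type lets a method hand back a view of the underlying map instead of a copy. The Java sketch below mirrors that idea with Iterable versus List; the class ValuesViewSketch and its methods are hypothetical and are not taken from the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

// Illustrative only: contrasts a narrow (Iterable) and a wide (List) return type.
final class ValuesViewSketch {
    private final ConcurrentSkipListMap<Long, String> segments = new ConcurrentSkipListMap<>();

    void put(long baseOffset, String name) {
        segments.put(baseOffset, name);
    }

    // Narrow type: callers can only iterate, and no copy is made.
    // The result is a live view, so later insertions show up in new iterations.
    Iterable<String> values() {
        return segments.values();
    }

    // Wide type: callers get indexed access, at the cost of copying a snapshot.
    List<String> valuesAsList() {
        return new ArrayList<>(segments.values());
    }
}
```

With the narrow type the implementation stays free to change how the values are produced, since callers cannot depend on indexing or size.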

@dhruvilshah3 dhruvilshah3 left a comment

LGTM, thanks for the PR

@junrao junrao left a comment

@kowshik : Thanks for the updated PR. LGTM. Do you know why the jenkins tests didn't run?

kowshik commented Mar 29, 2021

@junrao For Jenkins, let me try bumping this PR with a rebase to see if that starts the build. I'm not sure why there is a delay. The pr-merge checker seems to have just started building at 23:47 UTC, i.e. ~5m ago: https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-10430/2/pipeline . There may have been a delay there too; I'm not sure exactly why.

@kowshik kowshik force-pushed the refactor_extract_LogSegments_class branch from 43bf65c to 9dbb679 on March 29, 2021 23:58

kowshik commented Mar 30, 2021

@junrao The Jenkins run has started now. We can wait for the tests to pass.

@satishd satishd left a comment

thanks @kowshik for the PR, LGTM.

kowshik commented Mar 30, 2021

@junrao The Jenkins tests have finished. I checked the test failures; they seem to be unrelated to this PR:

kafka.server.RaftClusterTest.testCreateClusterAndCreateListDeleteTopic()
org.apache.kafka.connect.mirror.integration.MirrorConnectorsIntegrationSSLTest.testReplication()

@junrao junrao merged commit d92d464 into apache:trunk Mar 30, 2021
Terrdi pushed a commit to Terrdi/kafka that referenced this pull request Apr 1, 2021
…apache#10401)

Reviewers: Satish Duggana <satishd@apache.org>, Dhruvil Shah <dhruvil@confluent.io>, Jun Rao <junrao@gmail.com>
junrao pushed a commit that referenced this pull request May 27, 2021
…10684)

In Log.collectAbortedTransactions() I've restored the previously used logic so that it handles the case where the starting segment could be null. This was the case previously, but PR #10401 accidentally changed the behavior, causing the code to assume that the starting segment won't be null.

In Log.rebuildProducerState() I've removed usage of the allSegments local variable. The logic looks a bit simpler after I removed it.

I've introduced a new LogSegments.higherSegments() API. This is now used to make the logic a bit more readable in the Log.collectAbortedTransactions() and Log.deletableSegments() APIs.
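
As a rough illustration of the two points above, the Java sketch below shows one way a higherSegments-style lookup and a null-tolerant starting-segment lookup could be written on top of a navigable map. It is not the code from this commit: the class HigherSegmentsSketch, the helper segmentsToScan, and the use of tailMap are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentNavigableMap;

// Illustrative only: hypothetical helpers, not the real Kafka implementation.
final class HigherSegmentsSketch {

    // All values whose key (base offset) is strictly greater than baseOffset.
    static <V> Iterable<V> higherSegments(ConcurrentNavigableMap<Long, V> segments, long baseOffset) {
        return segments.tailMap(baseOffset, false).values();
    }

    // Collects the segment containing startOffset (if one exists) plus all higher ones.
    static <V> List<V> segmentsToScan(ConcurrentNavigableMap<Long, V> segments, long startOffset) {
        List<V> result = new ArrayList<>();
        // floorEntry returns null when startOffset is below every base offset,
        // so the starting segment must be treated as optional.
        Map.Entry<Long, V> floor = segments.floorEntry(startOffset);
        if (floor != null) {
            result.add(floor.getValue());
        }
        for (V higher : higherSegments(segments, startOffset)) {
            result.add(higher);
        }
        return result;
    }
}
```

Because the floor entry is the greatest key at or below startOffset, the strictly-higher view never repeats it, so the result contains each segment at most once.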

I've removed the unnecessary use of java.lang.Long in LogSegments class' segments map definition.

I've converted a few LogSegments APIs from public to private, as they need not be public.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Cong Ding <cong@ccding.com>, Jun Rao <junrao@gmail.com>