Skip to content

Improve performance of checkpointHighWatermarks, patch 2/2#6742

Closed
gardnervickers wants to merge 8 commits intoapache:trunkfrom
gardnervickers:checkpoint-highwatermarks-alloc-2
Closed

Improve performance of checkpointHighWatermarks, patch 2/2#6742
gardnervickers wants to merge 8 commits intoapache:trunkfrom
gardnervickers:checkpoint-highwatermarks-alloc-2

Conversation

@gardnervickers
Copy link
Copy Markdown
Contributor

@gardnervickers gardnervickers commented May 15, 2019

This PR builds on #6741 to improve the performance of checkpointHighWatermarks().

Commit gardnervickers/kafka-1@387fce0

This is a pretty invasive/significant refactor of the CheckpointFile code and accompanying classes. The goal was to avoid unnecessary buffering and allocation when constructing and writing out the checkpoint file. Also, I attempted to encourage inlining as much as possible. The change set is pretty drastic but the improvements are attractive.

Improvements over gardnervickers/kafka-1@4307a73 (baseline)

Topic Count Ops/ms MB/sec allocated
100 + 61% - 481%
1000 + 251% - 177%
2000 + 211% - 239%

noop log4j logger to avoid warnings when benchmarking.
when constructing the parent directory path. Avoid extra copying
of the in `ReplicaManager.checkpointHighWatermarks`
These include:
1. Use an object, OffsetCheckpointFileEntry constructed in
ReplicaManager.checkpointHighWatermarks() to pass to the CheckpointFile.
This removes the need for copying into an intermediate tuple.

2. Avoid recreating the CheckpointFile string builder between
write invocations.

3. Make OffsetCheckpointFile and LeaderEpochCheckpointFile AnyVal
wrappers which just reify the generic CheckpointFile.

4. Simplify the CheckpointFile.read logic, making the reading logic
decoupled from the object being read.
…rvickers/kafka-1 into checkpoint-highwatermarks-alloc-2
@gardnervickers
Copy link
Copy Markdown
Contributor Author

retest this please

2 similar comments
@gardnervickers
Copy link
Copy Markdown
Contributor Author

retest this please

@gardnervickers
Copy link
Copy Markdown
Contributor Author

retest this please

ijuma pushed a commit that referenced this pull request Feb 3, 2020
Refactors CheckpointFile such that buffers can be read in lieu of files. This
is a relatively simple refactoring as we already create a buffered reader over
the checkpoint file.

#6742, which improves the performance of the checkpointing code, requires
a similar refactoring of code, although it does go further than this.

Reviewers: Colin Patrick McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant