-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-25392 Direct insert compacted HFiles into data directory. #3389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
675c889
bbb851e
a017422
4f7aba7
440b66a
10586d8
948b6e1
304226c
da72e28
9bd929f
ed21c95
db80e29
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -28,6 +28,7 @@ | |
| import java.util.Collection; | ||
| import java.util.List; | ||
| import java.util.Map; | ||
| import java.util.function.Function; | ||
|
|
||
| import org.apache.hadoop.conf.Configuration; | ||
| import org.apache.hadoop.fs.Path; | ||
|
|
@@ -136,6 +137,12 @@ protected static class FileDetails { | |
| public long minSeqIdToKeep = 0; | ||
| /** Total size of the compacted files **/ | ||
| private long totalCompactedFilesSize = 0; | ||
|
|
||
| public long getTotalCompactedFilesSize() { | ||
| return totalCompactedFilesSize; | ||
| } | ||
|
|
||
|
|
||
| } | ||
|
|
||
| /** | ||
|
|
@@ -266,7 +273,9 @@ public InternalScanner createScanner(ScanInfo scanInfo, List<StoreFileScanner> s | |
| * @param fd The file details. | ||
| * @return Writer for a new StoreFile in the tmp dir. | ||
| * @throws IOException if creation failed | ||
| * @deprecated Use initWriter instead. | ||
petersomogyi marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| */ | ||
| @Deprecated | ||
| protected final StoreFileWriter createTmpWriter(FileDetails fd, boolean shouldDropBehind, boolean major) | ||
| throws IOException { | ||
| // When all MVCC readpoints are 0, don't write them. | ||
|
|
@@ -278,6 +287,21 @@ protected final StoreFileWriter createTmpWriter(FileDetails fd, boolean shouldDr | |
| HConstants.EMPTY_STRING); | ||
| } | ||
|
|
||
| /** | ||
| * Default method for initializing a StoreFileWriter in the compaction process, this creates the | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this right if we have the StoreEngine in the mix? Its supposed to do this right?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. StoreEngine delegates the work to a
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is creating them in the temp directory really "default"? Or is it "default" just because the direct-insert stuff isn't available yet? ;) When you say other compactors may not use it, it makes me wonder if it makes sense to provide it here.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It is the default now, if we don't explicitly specify
What I meant is that other compactors may overwrite the method with a different logic.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I'm just thinking about what other potential compaction algorithms may choose to do. I would expect that any future compactions would still want to leverage the "commit logic" here to avoid that rename. Just keeping tabs on "pre-optimization" without a specific use-case in mind. Although, as long as other compaction algorithms can extend DirectStoreCompactor and get the new shiny solution, it's not a concern. |
||
| * resulting files on a temp directory. Therefore, upon compaction commit time, these files | ||
| * should be renamed into the actual store dir. | ||
| * @param fd the file details. | ||
| * @param shouldDropBehind boolean for the drop-behind output stream cache settings. | ||
| * @param major if compaction is major. | ||
| * @return Writer for a new StoreFile in the tmp dir. | ||
| * @throws IOException if it fails to initialise the writer. | ||
| */ | ||
| protected StoreFileWriter initWriter(FileDetails fd, boolean shouldDropBehind, boolean major) | ||
| throws IOException { | ||
| return createTmpWriter(fd, shouldDropBehind, major); | ||
| } | ||
|
|
||
| protected final StoreFileWriter createTmpWriter(FileDetails fd, boolean shouldDropBehind, | ||
| String fileStoragePolicy, boolean major) throws IOException { | ||
| return store.createWriterInTmp(fd.maxKeyCount, | ||
|
|
@@ -533,4 +557,57 @@ protected InternalScanner createScanner(HStore store, ScanInfo scanInfo, | |
| return new StoreScanner(store, scanInfo, scanners, smallestReadPoint, earliestPutTs, | ||
| dropDeletesFromRow, dropDeletesToRow); | ||
| } | ||
|
|
||
| /** | ||
| * Default implementation for committing store files created after a compaction. Assumes new files | ||
| * had been created on a temp directory, so it renames those files into the actual store dir, | ||
| * then create a reader and cache it into the store. | ||
| * @param cr the compaction request. | ||
| * @param newFiles the new files created by this compaction under a temp dir. | ||
| * @param user the running user. | ||
| * @param fileProvider a lambda expression with logic for loading a HStoreFile given a Path. | ||
| * @return A list of the resulting store files already placed in the store dir and loaded into the | ||
| * store cache. | ||
| * @throws IOException if the commit fails. | ||
| */ | ||
| public List<HStoreFile> commitCompaction(CompactionRequestImpl cr, List<Path> newFiles, | ||
| User user, StoreFileProvider fileProvider) throws IOException { | ||
| List<HStoreFile> sfs = new ArrayList<>(newFiles.size()); | ||
| for (Path newFile : newFiles) { | ||
| assert newFile != null; | ||
| this.store.validateStoreFile(newFile); | ||
petersomogyi marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| // Move the file into the right spot | ||
| HStoreFile sf = createFileInStoreDir(newFile, fileProvider); | ||
| if (this.store.getCoprocessorHost() != null) { | ||
| this.store.getCoprocessorHost().postCompact(this.store, sf, cr.getTracker(), cr, user); | ||
| } | ||
| assert sf != null; | ||
| sfs.add(sf); | ||
| } | ||
| return sfs; | ||
| } | ||
|
|
||
| /** | ||
| * Assumes new file was created initially on a temp folder. | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Don't we want the Compactor asking the StoreEngine to do the writing for us? Rather than doing it here?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We still need to vary the logic of "where" the store file is originally created. Default behaviour is to create it under temp (that's done by the call from lines #591/#592), whilst
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same way above, I think it makes sense to not have the abstract Compactor dealing with where storefiles are put as a part of the compaction algorithm. That's an implementation detail of the algorithm itself. Can we reasonably push down where paths/files are stored into the implementation itself?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Originally, the logic of creating files in temp dir was already implemented in the abstract
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually, when I tried to move it out of
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Devil's advocate: do you think that's correct for StripeC and DTC to do?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I honestly had not looked further into these two. Am only worried to not break current logic used by those. Was planning to take care of those later on another PR, if we are to make it compatible with direct store/renameless writing. |
||
| * Moves the new file from temp to the actual store directory, then create the related | ||
| * HStoreFile instance | ||
| * @param newFile the new file created. | ||
| * @param fileProvider a lambda expression with logic for creating a HStoreFile given a Path. | ||
| * @return an HStoreFile instance. | ||
| * @throws IOException if the file store creation fails. | ||
| */ | ||
| protected HStoreFile createFileInStoreDir(Path newFile, StoreFileProvider fileProvider) | ||
| throws IOException { | ||
| Path destPath = this.store.getRegionFileSystem(). | ||
| commitStoreFile(this.store.getColumnFamilyName(), newFile); | ||
| return fileProvider.createFile(destPath); | ||
| } | ||
|
|
||
| /** | ||
| * Functional interface to allow callers provide the logic specific for creating StoreFiles. | ||
| */ | ||
| @FunctionalInterface | ||
| public interface StoreFileProvider { | ||
| HStoreFile createFile(Path path) throws IOException; | ||
| } | ||
| } | ||
Uh oh!
There was an error while loading. Please reload this page.