-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-29131 Introduce the option for post-compaction validation of HFiles #6700
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ndimiduk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sketched out a basic implementation.
Still working out how to add a test for this -- seems like I'll need a unit-test context DataBlockEncoder, which our Enum system for codec registration does not support.
It's probably also interesting to add a metric here.
| } | ||
|
|
||
| /** | ||
| * We are trying to remove / relax the region read lock for compaction. Let's see what are the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment was helpful for understanding the compaction logic. I cleaned up the formatting to how it was before spotless, using proper javadoc syntax.
| * <p> | ||
| * Compaction event should be idempotent, since there is no IO Fencing for the region directory in | ||
| * hdfs. A region server might still try to complete the compaction after it lost the region. That | ||
| * is why the following events are carefully ordered for a compaction: 1. Compaction writes new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with this comment.
| this.openStoreFileThreadPoolCreator = store.getHRegion()::getStoreFileOpenAndCloseThreadPool; | ||
| this.storeFileTracker = createStoreFileTracker(conf, store); | ||
| assert compactor != null && compactionPolicy != null && storeFileManager != null | ||
| && storeFlusher != null && storeFileTracker != null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Static analysis says storeFileTracker is always null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was the nit that IntelliJ provided, yes. Reading down through createStoreFileTracker(), it never returns null. If it fails, it throws.
| HStoreFile storeFile = null; | ||
| try { | ||
| storeFile = createStoreFileAndReader(path); | ||
| if (conf.getBoolean(READ_FULLY_ON_VALIDATE_KEY, DEFAULT_READ_FULLY_ON_VALIDATE)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the core of the new feature. Anywhere that we previously would invoke validateStoreFile() will now conditionally include the more rigorous check. There are a couple places where we do NOT call validateStoreFile after writing it, I'm not yet sure why.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
57bb4a4 to
5457de6
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
petersomogyi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this safeguard!
| this.openStoreFileThreadPoolCreator = store.getHRegion()::getStoreFileOpenAndCloseThreadPool; | ||
| this.storeFileTracker = createStoreFileTracker(conf, store); | ||
| assert compactor != null && compactionPolicy != null && storeFileManager != null | ||
| && storeFlusher != null && storeFileTracker != null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never null?
|
I ran a naive test of the performance impact of this feature and shared some results over on the JIRA. |
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreEngine.java
Outdated
Show resolved
Hide resolved
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…iles Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
…iles (apache#6700) Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor. Signed-off-by: Peter Somogyi <psomogyi@apache.org >
…iles (apache#6700) Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor. Signed-off-by: Peter Somogyi <psomogyi@apache.org >
…iles (apache#6700) Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor. Signed-off-by: Peter Somogyi <psomogyi@apache.org >
…iles (apache#6700) Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor. Signed-off-by: Peter Somogyi <psomogyi@apache.org >
…iles (#6700) Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor. Signed-off-by: Peter Somogyi <psomogyi@apache.org >
…iles (#6700) Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor. Signed-off-by: Peter Somogyi <psomogyi@apache.org >
…iles (#6700) Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor. Signed-off-by: Peter Somogyi <psomogyi@apache.org >
…iles (#6700) Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor. Signed-off-by: Peter Somogyi <psomogyi@apache.org >
…n validation of HFiles (apache#6700) (will be in 2.6.3) Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor. Signed-off-by: Peter Somogyi <psomogyi@apache.org >
…n validation of HFiles (apache#6700) (will be in 2.6.3) Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor. Signed-off-by: Peter Somogyi <psomogyi@apache.org >
…n validation of HFiles (apache#6700) (will be in 2.6.3) Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor. Signed-off-by: Peter Somogyi <psomogyi@apache.org >
…iles (apache#6700) Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor. Signed-off-by: Peter Somogyi <psomogyi@apache.org >
…iles (apache#6700) Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction. To enable this feature, set `hbase.hstore.validate.read_fully=true`. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor. Signed-off-by: Peter Somogyi <psomogyi@apache.org >
Introduces the option for an HStore to fully read the file it just wrote after a flush or compaction.
To enable this feature, set
hbase.hstore.validate.read_fully=true. This is an HStore configuration feature, so it can be enabled in hbase-site.xml, in the TableDescriptor, or in the ColumnFamilyDescriptor.