Raw SST File Iterator & reader #12370
|
@swamirishi Thank you for this work! Just curious, is the motivation for this class to be able to read the sequence number and type of each entry from a table file? In that case, RocksDB's rocksdb/table/sst_file_dumper.cc Line 500 in cb4f438
|
@jowlyzhang In Ozone we use RocksDB as the metadata store. We implemented snapshots in Ozone on top of the RocksDB checkpoint functionality. To perform efficient snapshot diffs, we currently need all the tombstone entries written to the SST files, so that we can figure out which keys changed across multiple checkpoints. At the moment we patch the RocksDB code to build this tool, and we wrote our own JNI layer to access the tombstone entries and the sequence numbers. It would be really great to have this functionality in the SST file reader. Does this seem like a valid use case and a reasonable feature request? I can work on augmenting the SST file reader to do this by adding a flag to the read options. You can take a look at this PR to get a better understanding: apache/ozone#6182 |
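To make the use case above concrete, here is a minimal sketch of why tombstones and sequence numbers matter for a diff between two checkpoints. It is not Ozone's implementation and uses no RocksDB API: `RawEntry`, `Change`, and `ChangedKeys` are hypothetical names, and the entries are assumed to have already been read from the SST files by some raw iterator.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical record for one raw SST entry (assumed to be produced
// elsewhere by a raw table iterator; not a RocksDB type).
struct RawEntry {
  std::string user_key;
  uint64_t sequence;
  bool is_tombstone;
};

enum class Change { kWritten, kDeleted };

// Returns the keys touched after `from_seq`, classified by their newest
// entry. A regular DB iterator would hide the deleted keys entirely.
std::map<std::string, Change> ChangedKeys(const std::vector<RawEntry>& entries,
                                          uint64_t from_seq) {
  std::map<std::string, uint64_t> newest_seq;
  std::map<std::string, Change> result;
  for (const RawEntry& e : entries) {
    if (e.sequence <= from_seq) continue;  // already visible in the older checkpoint
    auto it = newest_seq.find(e.user_key);
    if (it != newest_seq.end() && it->second >= e.sequence) continue;
    newest_seq[e.user_key] = e.sequence;
    result[e.user_key] = e.is_tombstone ? Change::kDeleted : Change::kWritten;
  }
  return result;
}
```

With only user-visible keys, a key deleted between the two checkpoints would simply be absent from the scan; the tombstone entry is what lets the diff report it as deleted.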
Currently the db_iter skips non user keys Line 289 in 003197f
|
sst file reader should be returning a table iterator that iterates the table, not a DBIter: rocksdb/table/sst_file_reader.cc Line 90 in cb4f438
For block based table, this would be a
This iterator iterates over the whole table file, so tombstones are surfaced too. |
rocksdb/table/sst_file_reader.cc Line 94 in cb4f438
makes it a db_iter |
|
I see, so you need an iterator to programmatically iterate the raw SST file to get the tombstones. So you want to define a public iterator class. Would a separate tool like sst_dump work for your flow? This tool can be augmented to print out the type and sequence number, but you would need to parse its output. We also have a feature that allows users to collect table properties, and it sounds like it could work for your use case. You can use it to programmatically collect the tombstones from each SST file. You can define a rocksdb/include/rocksdb/table_properties.h Line 153 in cb4f438
This factory is responsible for creating a
When each SST file is being created, this rocksdb/include/rocksdb/table_properties.h Lines 110 to 112 in cb4f438
You can define your own Later on when this SST file needs to be processed, you can use |
We actually wanted a JNI API that would let us iterate through the SST file and get the key, value, and sequence number of each record, including the tombstone entries. We are not particularly looking to get the table properties. |
This function |
RocksDB has some existing implementations that can be leveraged to create such a raw table iterator. Some of the changes in this PR are not required, so I have prototyped a simpler change based on these implementations to achieve a similar effect. It's at: main...jowlyzhang:rocksdb:internal_key_interator Let me know if this looks OK to you, and I can work on checking it in. |
Yup this change definitely solves my purpose |
|
Hello @swamirishi, I have put together a PR to support a raw table iterator in #12385. Please feel free to comment if any functionality is missing. This PR doesn't include the JNI wrapper yet. I will add it after we have settled that the functionality added there is sufficient. |
Summary: This PR adds support for programmatically iterating a raw table file with an iterator returned by `SstFileReader::NewTableIterator`. It is intended for third-party tools that need to inspect SST files created by RocksDB. The original feature request came from this pull request: #12370

Since keys returned by raw table iterators are internal keys, this PR also adds a struct `ParsedEntryInfo` and a utility method `ParseEntry` to help users parse an internal key, as well as `GetInternalKeyForSeek` and `GetInternalKeyForSeekForPrev` to help users create internal keys for seek operations with this raw table iterator.

Pull Request resolved: #12385
Test Plan: Added unit tests
Reviewed By: cbi42
Differential Revision: D55662855
Pulled By: jowlyzhang
fbshipit-source-id: 0716a173ee95924fbd4e1f9b6cccf06525c40049
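For readers unfamiliar with internal keys, the following is a minimal, self-contained sketch of the layout that parsing helpers such as `ParseEntry` have to deal with: the user key followed by an 8-byte little-endian trailer that packs `(sequence << 8) | type`. It does not call RocksDB; `ParsedKey`, `MakeInternalKey`, and `ParseInternalKey` are hypothetical stand-ins for illustration, and the sketch assumes user-defined timestamps are not in use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Type bytes as used in RocksDB's internal key trailer.
constexpr uint8_t kTypeDeletion = 0x0;  // tombstone
constexpr uint8_t kTypeValue = 0x1;     // regular put

// Hypothetical stand-in for the parsed form of an internal key.
struct ParsedKey {
  std::string user_key;
  uint64_t sequence = 0;
  uint8_t type = 0;
};

// Appends the 8-byte trailer (sequence << 8 | type, little-endian).
std::string MakeInternalKey(const std::string& user_key, uint64_t sequence,
                            uint8_t type) {
  assert(sequence < (uint64_t{1} << 56));  // sequence must fit in 56 bits
  const uint64_t packed = (sequence << 8) | type;
  std::string out = user_key;
  for (int i = 0; i < 8; ++i) {
    out.push_back(static_cast<char>((packed >> (8 * i)) & 0xFF));
  }
  return out;
}

// Splits an internal key back into user key, sequence number, and type.
// Returns false if the input is too short to contain a trailer.
bool ParseInternalKey(const std::string& internal_key, ParsedKey* out) {
  if (internal_key.size() < 8) return false;
  const std::size_t n = internal_key.size() - 8;
  uint64_t packed = 0;
  for (int i = 0; i < 8; ++i) {
    const unsigned char byte = static_cast<unsigned char>(internal_key[n + i]);
    packed |= static_cast<uint64_t>(byte) << (8 * i);
  }
  out->user_key = internal_key.substr(0, n);
  out->sequence = packed >> 8;
  out->type = static_cast<uint8_t>(packed & 0xFF);
  return true;
}
```

In real code the helpers added by the PR should be preferred over hand-rolled parsing, since they also take the comparator into account; the sketch only shows why a raw iterator's `key()` cannot be treated as a plain user key.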
|
Hello @swamirishi, since we have merged #12385, shall we close this PR? |
Yeah this can be closed in favour of #12385 |
Raw SST File Reader for reading tombstone entries from sst file along with sequence number & type of the data in rocksdb