PARQUET-1300: [C++] Implement encrypted Parquet read and write support #4826
Closed
207 commits
a3a8d41
encryption (from apache/parquet-cpp github repo)
thamht4190 ab65b1d
update thrift change and update encrypted footer
thamht4190 0ad09da
add encryption source files into CMakeLists.txt
thamht4190 5c295c2
add example from old PR of parquet-cpp
thamht4190 138b896
change due to new update in crypto package
thamht4190 31cf871
pass EncryptionProperties into parquet_encryption::Encrypt()/Decrypt(…
thamht4190 3ebebb5
fix issue of wrong column name in encryption-example and remove FileE…
thamht4190 189b2f2
get column path from ColumnCryptoMetadata when column is encrypted wi…
thamht4190 8dcf50c
let encryption examples to be able to cover more cases
thamht4190 bd96d43
footer plaintext mode
thamht4190 6a2a918
footer plaintext mode example
thamht4190 e428fda
fix compiling issue
thamht4190 537b5e0
fix plaintext mode verification
thamht4190 5d495c3
fix memory issue when serializing plaintext mode footer
thamht4190 e1e9470
protection of sensitive metadata
thamht4190 9040fbc
fix duplication of variable i
thamht4190 b177e8c
column metadata encryption: read algorithm, aad from FileCryptoMetada…
thamht4190 921830c
keep redacted metadata version for old readers
thamht4190 4f727f2
hidden column exception
thamht4190 7717ac3
remove log
thamht4190 e9ed8c8
add example for hidden column
thamht4190 a193daa
handle row group file_offset and total_compressed_size
thamht4190 e42cc4a
Apply API changes
revit13 8a80946
Add AAD calculation
revit13 053c2a6
Fix parquet tests to work with the changes required to support AAD
revit13 613056e
verify plaintext footer depends on config of decryption properties
thamht4190 a195bd5
Fix code style
revit13 b176ba5
Code style fixes in properties.h
revit13 ca1bd2b
revert change in parquet.thrift
thamht4190 a5eee07
Move all encrypted related classes from properties.h to new files: en…
4d4aef1
update crypto API change
thamht4190 fbeeff2
fix issue when column is encrypted in footer plaintext mode
thamht4190 2eb4f3f
remove EncryptionProperties
thamht4190 3cf56be
Change HiddenColumnException message
97aad7b
Fix indentation in encryption_properties.cc
9dabd99
Rename functions in DecryptionKeyRetriever
e7871d6
Add check for aad_prefix to withoutAADPrefixStorage
99ca2a6
Add exception to FromThrift in thrift.h
baa162c
Fix prefix aad calculation
f92df57
Remove fileAAD from ReaderProperties
f2c000e
Remove column_map from ReaderProperties
8b9574f
Fix check for encryption and the existence of file_decryption in file…
dc25ba1
Save footer_key_metadata, algorithm, footer_decryptor and footer_sign…
5fdb0c7
Rename file_decryption to file_decryption_properties in properties.h
931829b
Do not pass file_decryption as function parameter
fde0627
Rename is_plaintext_mode to is_encryption_algorithm_set
85dd7ee
fix function naming
thamht4190 af3aeef
fix const&
thamht4190 1d51d12
make format
thamht4190 f7ea94d
Add plaintext_files_allowed
f8d03dc
Remove file_crypto_metadata_ field from SerializedRowGroup and Serial…
5838185
Pass file_aad, algorithm and key_metadata to InternalFileDecryptor co…
444a95e
Fixes to previous commits
b894de3
Put encryption_properties.h/cc content in encryption.h/cc
5c661dd
Remove encryption_properties.cc from CMakeLists.txt
b988c60
Add column_metadata_map_, column_data_map_, footer_signing_encryptor_…
5382b02
Add column_data_map_, column_metadata_map_, footer_data_decryptor_ an…
11e68f1
Rename aad to update_aad in Encryptor and Decryptor classes
9273de0
Move PARQUET_EMAGIC and PARQUET_MAGIC to file_writer.h and use it in …
37461ca
Rename file_encryption to file_encryption_properties in file_writer.cc
bba24b8
Remove unused footer_decryptor_ from InternalFileDecryptor class and …
4bb0238
Fix format
a1f7039
Change implementation of NULL_STRING
b8f5fba
Change ParquetException message format in file_reader.cc
5d023b2
Make format
04cda18
Add comments to encryption-reader-writer.cc example
5d7b271
Rename enable_plaintext_footer to set_plaintext_footer
d676693
Rename aad variable in NextPage function
59d4abb
Change comment in GetColumnPageReader
a36cbf8
Change additional comments in GetColumnPageReader
7810db3
Add comments in file_writer.cc
ab94416
Create both data and metadata decryptors to avoid redundant retrieval…
0a8e030
Fix metadata parameter sent to parquet_encryption::AesDecryptor
ac5a96d
Rename aad in GetFooterEncryptor and GetFooterSigningEncryptor
1bc3329
Rename verify to verify_signature
00e68ab
Add comments to void WriteTo
b951882
Add additional comment in void WriteTo
fb38044
Rename file_encryption to file_encryption_properties in WriterProperties
b949797
Use encrypted_footer instead of footer_signing_key when checking for …
bd989f6
Rename column_encryption_props to column_encryption_properties
b6ff133
Add comments in thrift.h
1aa5747
Change parameters order in ColumnChunkMetaData::Make
329633f
Change parameters order in PageReader::Open
0dfd5f2
Remove footer_encryption_key and footer_signing_key
6bf0d62
Remove ParquetException in GetFooterSigningEncryptor and GetFooterEnc…
16f5d78
make format
db94e05
make format in thrift.h
18876be
fix rebase mistake in parquet.thrift
thamht4190 96af8cb
Fix aad settings in thrift.h
3da31c2
Port key erasure mechanism
57c4840
Fix columnMetaData
e2f7cab
Minor fixes to previous code
b6dfe9c
fix build issue on MacOS
thamht4190 049d69c
apply change from crypto package
thamht4190 9910905
format code
thamht4190 4f0796a
post-rebase change
thamht4190 8eb339b
add unit tests for encryption properties
thamht4190 72c4554
write unit tests for metadata
thamht4190 c0585f9
Add encryption samples
7017089
fix lint and format issue
thamht4190 a3d924c
fix metadata set, statistics set issues
thamht4190 daeb600
Various changes to encryption-reader-writer-all-crypto-options test a…
980a8f5
Fix logging error
cbcac60
post-rebase change
thamht4190 ca71bab
fix isset of column chunk metadata and statistics
thamht4190 8c2d449
temporarily remove encryption-metadata-test
thamht4190 d6f30e1
fix windows compiling issue
thamht4190 512fd1f
fix issue of parquet-encryption-example
thamht4190 e738932
rename encryption-test.cc to encryption-properties-test.cc
thamht4190 85c08a5
use isset instead of creating a copy of column chunk metadata
thamht4190 4419abc
Address review comments
9397cd2
Fix SerializedPageReader initialization
6e5d7ec
Fix Format
fb7ac18
let parquet encryption be able to be off (when openssl is not found)
thamht4190 2e0ef53
Fix LogicalType
376b4ad
keep encryption parameters at method declaration
thamht4190 36cd316
add PARQUET_EXPORT into Builder class of encryption properties
thamht4190 e5a771a
Change assert to ASSERT_EQ in encryption-configurations-test.cc
0458d0d
fix cmake format
thamht4190 2a2a50c
Add MemoryPool field to Decryptors/Encryptors
f2ff1d7
keep encryption parameters at method declaration (column_writer.cc/.h)
thamht4190 7bc5635
Write to parquet stream to file in encryption test
5f7503c
Add file reader and file writer Close to encryption-configurations-te…
79372c6
Change encryption-configuration-test
fe773e1
Delete encryption-configuration-encrypted-columns-plaintext-footer.cc…
b6f4d22
Remove FooterSigningEncryptor class
0a0e0f8
remove some PARQUET_ENCRYPTION define check
thamht4190 00d4ad8
remove PARQUET_ENCRYPTION ifdefs and add encryption_internal-nossl.cc…
thamht4190 91a3197
remove PARQUET_ENCRYPTION defines from CMakeLists.txt
thamht4190 b93d791
fix comments for encryption_internal_nossl.cc
thamht4190 da1acf1
Format fixes and check that all columns in columnEncryptionProperties…
ab76ece
Add encryption tests
657609f
Throw exception when files are missing from parquet-testing repo
f466781
update parquet_testing submodule with new encrypted files
thamht4190 55cd2bf
add crypto dependency to R build
1a508f2
Print location of OpenSSL library
01b4c48
try adding crypto dependency to R build again
bccb0fe
add missing crypto deps
cf74661
fix ci openssl url
75915e8
add crypt32 lib
8e148c2
Applying revital's const-fix patch & Addressing Deepak's review comme…
e9be805
post-rebase change
thamht4190 9fa1967
fix comments
thamht4190 cbc3e0e
rename test file using underscore
thamht4190 1f6479e
fix make lint
thamht4190 487b329
fix a bad merge in r/configure.win
thamht4190 40e1e10
merge master to parquet encryption
thamht4190 72d39d8
merge master to parquet encryption (2)
thamht4190 861ef2c
add PARQUET_EXPORT to Encryptor, Decryptor
thamht4190 b6f6b2b
fix errors when encryption is disabled
bdded5e
do not build encryption support by default when Parquet is built
ef0583b
specify PARQUET_REQUIRE_ENCRYPTION in doc, build, CI, etc.
657b886
Merge branch 'master' into master
majetideepak 8e03784
fix format
f6472ce
refactor column reader
1a39821
ARROW-6610: [C++] Add cmake option to disable filesystem layer
pitrou 58e1144
ARROW-6564: [Python] Do not require pandas for invoking ChunkedArray.…
jorisvandenbossche 4aaa211
ARROW-6729: [C++] Prevent data copying in StlStringBuffer
st-pasha c95aaab
ARROW-6646: [Go] Write no IPC buffer metadata for NullType
sbinet 610deb7
ARROW-6685: [C++] Ignore trailing slashes in S3FS
pitrou df9fc54
ARROW-6740: [C++] Unmap MemoryMappedFile as soon as possible
pitrou 9e0e1a2
ARROW-6708: [C++] Fix hardcoded boost library names
pitrou 98d8a6d
ARROW-6722: [Java] Provide a uniform way to get vector name
liyafan82 155415c
ARROW-6655: [Python] Filesystem bindings for S3
kszucs 3b262f6
ARROW-6648: [Go] Expose the bitutil package
jsternberg 8231fcb
ARROW-5831: [Release] Add Python program to download binary artifacts…
wesm 871aedb
ARROW-6751: [CI] Fix ccache setup on Travis-CI
pitrou 9694200
ARROW-6745: [Rust] Fix a variety of minor typos.
waywardmonkeys d75d186
ARROW-6730: [CI] Use GitHub Actions for "C++ with clang 7" docker image
fsaintjacques e72a0da
ARROW-6752: [Go] make Null array implement Stringer, add tests for Nu…
sbinet 7d18c1c
ARROW-6750: [Python] Silence S3 error logs by default
pitrou 5f93f85
ARROW-6755: [Release] Improve Windows release verification script
wesm b70f04a
ARROW-6614: [C++][Dataset] Add DataSourceDiscovery class
fsaintjacques 48b56bd
ARROW-6581: [C++] Fix fuzzit job submission
pitrou fda549a
ARROW-6761: [Rust] Travis build now uses the correct Rust toolchain
andygrove ad4eccb
ARROW-6777: [GLib][CI] Unpin gobject-introspection gem
kou 5050d87
ARROW-6767: [JS] Lazily bind batches in scan/scanReverse
1165cdb
ARROW-6686: [CI] Pull and push docker images to speed up the nightly …
kszucs 560a597
ARROW-6770: [CI][Travis] Download Minio quietly
kszucs f2e8f85
ARROW-6773: [C++] Fix filter kernel when filtering with a boolean Arr…
nealrichardson 31a3259
ARROW-6762: [C++] Support reading JSON files with no newline at end
pitrou a4738cf
ARROW-6613: [C++] Minimize usage of boost::filesystem
pitrou 227a33f
ARROW-6494: [C++][Dataset] Implement PartitionSchemes
bkietz b5ccbd2
ARROW-6771: [Packaging][Python] Missing pytest dependency from conda …
kszucs cc05a89
ARROW-6785: [JS] Remove superfluous child assignment
akre54 21636fa
ARROW-6744: [Rust] Publicly expose JsonEqual
d3ba809
ARROW-6091: [Rust] [DataFusion] Implement physical execution plan for…
andygrove e0efdbd
ARROW-6634: [C++] Vendor Flatbuffers and check in compiled sources
wesm e9f7457
ARROW-6688: [Packaging] Include s3 support in the conda packages
kszucs a98a61d
ARROW-3808: [R] Array extract, including Take method
nealrichardson 399ab8f
ARROW-6736: [Rust] [DataFusion] Evaluate the input to the aggregate e…
andygrove 1e2cf1f
ARROW-6580: [Java] Support comparison for unsigned integers
liyafan82 368562b
ARROW-6657: [Rust] [DataFusion] Add Count Aggregate Expression
sinistersnare 86eaa6b
ARROW-6760: [C++] More informative error messages for JSON parsing er…
bkietz 461ff53
ARROW-6437: [R] Add AWS SDK to Homebrew formulae
nealrichardson afdb86f
Merge remote-tracking branch 'upstream/master' into PARQUET-1300
0f6e0e5
refactor ColumnPath
c0e0d8a
make format
3275cbe
fix tests
ed693e8
merge master
af1bd12
Fix file_reader
6bc493b
refactor metadata
fa9683d
make lint
d26a119
cpp cli lint
99b9713
fix multi-rowgroup aad
c59daf7
Merge branch 'master' into master
majetideepak 4e9b9af
Stylistic fixes, remove cruft
wesm