Closed
Description
A mechanism for modular encryption and decryption of Parquet files. It keeps data fully encrypted in storage while still enabling efficient analytics: readers extract, authenticate, and decrypt only the data subsets required by columnar projection and predicate push-down.
Enables fine-grained access control to column data by encrypting different columns with different keys.
Supports a number of encryption algorithms, to account for different security and performance requirements.
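To illustrate the idea of per-column keys, here is a minimal Java sketch that uses only the JDK's `javax.crypto` AES-GCM cipher. It is not the parquet-mr API and does not follow the Parquet encryption format: the class name `ColumnKeySketch`, its methods, the `nonce || ciphertext || tag` layout, and the use of the column name as additional authenticated data (AAD) are all assumptions made for illustration. It only shows that a reader holding the key for one column can authenticate and decrypt that column, while other columns stay opaque.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Optional;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration only; not part of parquet-mr.
final class ColumnKeySketch {
    private static final int NONCE_LEN = 12;   // bytes, the usual GCM nonce size
    private static final int TAG_BITS = 128;   // GCM authentication tag length
    private static final SecureRandom RNG = new SecureRandom();

    // Generates a random 128-bit AES key (a real deployment would fetch keys from a KMS).
    static byte[] newKey() {
        byte[] key = new byte[16];
        RNG.nextBytes(key);
        return key;
    }

    // Encrypts one column's bytes. Output layout (assumed): nonce || ciphertext || tag.
    static byte[] encryptColumn(byte[] key, String columnName, byte[] plaintext) {
        try {
            byte[] nonce = new byte[NONCE_LEN];
            RNG.nextBytes(nonce);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                        new GCMParameterSpec(TAG_BITS, nonce));
            // Binding the column name as AAD prevents swapping ciphertexts between columns.
            cipher.updateAAD(columnName.getBytes(StandardCharsets.UTF_8));
            byte[] sealed = cipher.doFinal(plaintext);
            byte[] out = new byte[NONCE_LEN + sealed.length];
            System.arraycopy(nonce, 0, out, 0, NONCE_LEN);
            System.arraycopy(sealed, 0, out, NONCE_LEN, sealed.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES-GCM unavailable", e);
        }
    }

    // Returns the plaintext, or empty if the key or column name fails authentication.
    static Optional<byte[]> tryDecryptColumn(byte[] key, String columnName, byte[] blob) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                        new GCMParameterSpec(TAG_BITS, blob, 0, NONCE_LEN));
            cipher.updateAAD(columnName.getBytes(StandardCharsets.UTF_8));
            return Optional.of(cipher.doFinal(blob, NONCE_LEN, blob.length - NONCE_LEN));
        } catch (AEADBadTagException e) {
            return Optional.empty();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES-GCM unavailable", e);
        }
    }

    public static void main(String[] args) {
        byte[] ssnKey = newKey();    // held only by privileged readers
        byte[] cityKey = newKey();   // held by a wider group of readers
        byte[] ssnBlob = encryptColumn(ssnKey, "ssn",
                "123-45-6789".getBytes(StandardCharsets.UTF_8));
        byte[] cityBlob = encryptColumn(cityKey, "city",
                "Haifa".getBytes(StandardCharsets.UTF_8));

        // A reader that holds only cityKey can project "city" but not "ssn".
        System.out.println("city readable: "
                + tryDecryptColumn(cityKey, "city", cityBlob).isPresent());
        System.out.println("ssn readable:  "
                + tryDecryptColumn(cityKey, "ssn", ssnBlob).isPresent());
    }
}
```

Because each column is sealed independently, a reader only needs to decrypt the columns its query projects, which is the property the description relies on for efficient analytics over encrypted files.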
Reporter: Gidon Gershinsky / @ggershinsky
Assignee: Gidon Gershinsky / @ggershinsky
Subtasks:
- Thrift crypto metadata structures
- parquet-format-structures encryption
- parquet-mr code changes for encryption support
- Document the modular encryption in parquet-format
- Crypto package in parquet-mr
- Separate iv_prefix for GCM and CTR modes
- RowGroup offset and total compressed size fields
- Enable old readers to access unencrypted columns in files with plaintext footer
- Detailed crypto specification
- Thrift crypto updates
- Update encryption spec for Bloom filter encryption
- Merge crypto spec and structures to format master
- Encryption: Interop and Function test suite for Java version
- Merge encryption branch into master
Related issues:
- Passing Field Metadata to Parquet (Blocked)
- Upgrade Parquet to 1.12.0 (relates to)
- [C++] Parquet modular encryption (depends upon)
- CLONE - [C++] Parquet modular encryption (depends upon)
- Data obfuscation layer for encryption (is depended upon by)
- [C++] Data set integrity tool (is depended upon by)
- Encryption key management tools (is depended upon by)
- High level interface to Parquet encryption (is depended upon by)
- Sample of usage Parquet-1396 and Parquet-1178 for column level encryption with pluggable key access (is depended upon by)
PRs and other links:
Note: This issue was originally created as PARQUET-1178. Please see the migration documentation for further details.