Open
Description
Some personal data columns need to be masked instead of being pruned (PARQUET-1791). We need a tool that replaces the raw values in those columns with masked values. The masked value could be a hash, a null, a redacted value, etc. Unchanged columns should be copied as a whole, as the 'merge' and 'prune' commands in Parquet-tools do.
Implementing this feature at the file-format level is 10x faster than rewriting the table data in the query engine.
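To make the intended behaviour concrete, here is a minimal in-memory sketch of the masking semantics, not the proposed tool itself. It works on a plain dict of column lists rather than on Parquet files, and the names `mask_value`, `mask_columns`, and the mode strings `"hash"`, `"null"`, `"redact"` are hypothetical choices for illustration. SHA-256 and asterisk redaction are assumptions as well; the issue does not specify them.

```python
import hashlib


def mask_value(value, mode):
    """Return the masked form of a single value (mode names are hypothetical)."""
    if value is None or mode == "null":
        # Nulls stay null; "null" mode drops the value entirely.
        return None
    if mode == "hash":
        # Assumed hash function: SHA-256 over the value's string form.
        return hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    if mode == "redact":
        # Assumed redaction: same length, every character replaced by '*'.
        return "*" * len(str(value))
    raise ValueError(f"unknown mask mode: {mode}")


def mask_columns(table, masks):
    """Mask the columns named in `masks`; pass every other column through as-is."""
    out = {}
    for name, values in table.items():
        mode = masks.get(name)
        if mode is None:
            # Unchanged column: reuse the same list without touching its values.
            out[name] = values
        else:
            out[name] = [mask_value(v, mode) for v in values]
    return out


table = {
    "id": [1, 2],
    "email": ["a@x.org", None],
    "ssn": ["123-45", "678-90"],
    "phone": ["555-0100", "555-0101"],
}
masked = mask_columns(table, {"email": "hash", "ssn": "redact", "phone": "null"})
```

Here the `id` column is handed over without inspecting its values, which loosely mirrors moving unchanged columns as a whole. A file-level implementation would copy the stored column data for such columns instead of decoding and re-encoding it.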
Reporter: Xinli Shang / @shangxinli
Assignee: Xinli Shang / @shangxinli
Note: This issue was originally created as PARQUET-1792. Please see the migration documentation for further details.