Add 'mask' command to parquet-tools/parquet-cli #2455

@asfimport

Description

Some personal data columns need to be masked instead of being pruned (PARQUET-1791). We need a tool that replaces the raw values in those columns with masked values. The masked value could be a hash, null, a redacted value, etc. Unchanged columns should be moved as a whole, the way the 'merge' and 'prune' commands in parquet-tools do it.
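To illustrate the intended behaviour, here is a minimal sketch in plain Python. It is not the parquet-tools implementation and does not read or write Parquet files. The function names `mask_value` and `mask_columns`, the mode strings `"hash"`, `"null"` and `"redact"`, the choice of SHA-256, and the `"REDACTED"` placeholder are all assumptions made for illustration. The table is modelled as a dict of column name to list of values, and unchanged columns are passed through without touching their values, loosely mirroring the idea of moving them as a whole.

```python
import hashlib


def mask_value(value, mode):
    """Return the masked form of a single value (hypothetical modes)."""
    if mode == "null":
        return None
    if mode == "redact":
        # Fixed placeholder, chosen only for this sketch.
        return "REDACTED"
    if mode == "hash":
        if value is None:
            return None  # keep nulls as nulls
        # SHA-256 is an assumption; the issue does not name a hash function.
        return hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    raise ValueError(f"unknown mask mode: {mode!r}")


def mask_columns(table, masks):
    """Mask selected columns of an in-memory table.

    table: dict mapping column name -> list of values
    masks: dict mapping column name -> mask mode
    """
    unknown = set(masks) - set(table)
    if unknown:
        raise KeyError(f"columns not in table: {sorted(unknown)}")

    out = {}
    for name, values in table.items():
        if name in masks:
            # Only masked columns are rewritten value by value.
            out[name] = [mask_value(v, masks[name]) for v in values]
        else:
            # Unchanged columns are carried over as-is (same list object).
            out[name] = values
    return out


if __name__ == "__main__":
    table = {
        "id": [1, 2],
        "email": ["a@example.com", "b@example.com"],
        "ssn": ["111", "222"],
    }
    masked = mask_columns(table, {"email": "hash", "ssn": "null"})
    print(masked["id"], masked["ssn"])
```

In a real file-level tool the pass-through branch would correspond to copying the unchanged column chunks without decoding them, which is where the speed advantage over a full table rewrite would come from.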

 

Implementing this feature at the file-format level is 10X faster than rewriting the table data in a query engine.

Reporter: Xinli Shang / @shangxinli
Assignee: Xinli Shang / @shangxinli

Note: This issue was originally created as PARQUET-1792. Please see the migration documentation for further details.
