Data obfuscation layer for encryption #2211

@asfimport

Description

Add data obfuscation (masking) for sensitive columns, so that users without access to the column encryption keys can still read a masked version of the data.

  1. Implement on top of basic Parquet encryption
  2. Provide built-in support for multiple masking mechanisms, with different trade-offs between data utility, leakage, and size/throughput overhead
  3. Provide an interface for plugging in custom masking mechanisms
  4. Enable storing multiple masked versions of the same column in a file
  5. Provide readers with an explicit list of a column’s masked versions in a file
  6. Enable readers to select a masked version of a column
  7. Stretch: Implement tools for analysis of file data privacy properties and information leakage
  8. Stretch: Leverage privacy analysis tools for tuning file data anonymity
  9. Optional: Support aggregated obfuscation
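
To make items 3 to 6 concrete, here is a minimal sketch in Python of what a plug-in masking interface and per-column masked versions could look like. It is not Parquet code and does not reflect any existing Parquet API: `MaskingMechanism`, `Redact`, `SaltedHash`, `KeepLast4`, `write_column`, `list_masked_versions`, and `read_column` are all hypothetical names, the "file" is a plain dictionary, and the encrypted column is only simulated by keeping the clear values behind a `has_key` check.

```python
import hashlib
from abc import ABC, abstractmethod


class MaskingMechanism(ABC):
    """Hypothetical plug-in interface: maps a clear value to a masked one."""

    name: str  # identifier under which the masked version is stored

    @abstractmethod
    def mask(self, value: str) -> str:
        ...


class Redact(MaskingMechanism):
    """No leakage, no utility: every value becomes the same token."""

    name = "redact"

    def mask(self, value: str) -> str:
        return "****"


class SaltedHash(MaskingMechanism):
    """Preserves equality (joins, group-by) but hides the value itself."""

    name = "sha256"

    def __init__(self, salt: bytes):
        self._salt = salt

    def mask(self, value: str) -> str:
        digest = hashlib.sha256(self._salt + value.encode("utf-8"))
        return digest.hexdigest()[:16]


class KeepLast4(MaskingMechanism):
    """More utility, more leakage: only the last four characters survive."""

    name = "keep_last_4"

    def mask(self, value: str) -> str:
        keep = value[-4:]
        return "*" * (len(value) - len(keep)) + keep


def write_column(name, values, mechanisms):
    """Model one column: the protected data plus one masked copy per mechanism."""
    return {
        "name": name,
        # Stand-in only: a real file would hold encrypted bytes here.
        "encrypted": list(values),
        "masked": {m.name: [m.mask(v) for v in values] for m in mechanisms},
    }


def list_masked_versions(column):
    """Item 5: the explicit list of masked versions stored for a column."""
    return sorted(column["masked"])


def read_column(column, has_key, version=None):
    """Item 6: readers without the key must select a masked version."""
    if has_key:
        return column["encrypted"]
    if version not in column["masked"]:
        raise KeyError(f"no masked version {version!r} for {column['name']}")
    return column["masked"][version]


column = write_column(
    "ssn",
    ["123-45-6789", "987-65-4321"],
    [Redact(), SaltedHash(b"demo-salt"), KeepLast4()],
)
print(list_masked_versions(column))
print(read_column(column, has_key=False, version="keep_last_4"))
```

The three sample mechanisms illustrate the trade-off named in item 2: redaction leaks nothing but is useless for analysis, a salted hash keeps equality comparisons, and partial masking keeps some of the value readable at the cost of more leakage. Storing one extra copy per mechanism is also where the size overhead comes from.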

Reporter: Gidon Gershinsky / @ggershinsky
Assignee: Gidon Gershinsky / @ggershinsky

Related issues:

PRs and other links:

Note: This issue was originally created as PARQUET-1376. Please see the migration documentation for further details.
