Abort dictionary encoding if HLL estimation turned out to be incorrect #5006

@westonpace

Description

Currently we use HLL (HyperLogLog) to decide whether to apply dictionary encoding. However, the HLL estimate is sometimes wrong, and we choose dictionary encoding even when the resulting encoded array is larger than the unencoded array (or not smaller by enough to justify the decode cost).

We should check for this scenario and, if dictionary encoding was not as helpful as we had hoped, abandon the dictionary-encoded data.
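
A rough sketch of what such a check could look like, written as standalone Rust and not against the actual encoder code. Everything here is an assumption for illustration: the `Encoded` enum, the `encode_with_fallback` helper, the `max_ratio` threshold, and the size accounting (string bytes plus a 4-byte offset per entry, 4-byte indices). The idea is only to compare the real encoded size to the unencoded size after encoding, instead of trusting the cardinality estimate.

```rust
use std::collections::HashMap;

/// Outcome of attempting dictionary encoding on one array of strings.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum Encoded {
    /// Dictionary encoding paid off: unique values plus one index per row.
    Dictionary { values: Vec<String>, indices: Vec<u32> },
    /// Dictionary encoding was abandoned; the data stays unencoded.
    Plain(Vec<String>),
}

/// Assumed layout: the string bytes plus a 4-byte offset per entry.
fn plain_size(values: &[String]) -> usize {
    values.iter().map(|v| v.len() + 4).sum()
}

/// Dictionary-encode `data`, then keep the result only if
/// `encoded_size <= max_ratio * unencoded_size`. Otherwise return the
/// original data untouched.
fn encode_with_fallback(data: Vec<String>, max_ratio: f64) -> Encoded {
    // Build the dictionary in its own scope so the borrow of `data` ends
    // before we may need to hand `data` back to the caller.
    let (values, indices) = {
        let mut lookup: HashMap<&str, u32> = HashMap::new();
        let mut values: Vec<String> = Vec::new();
        let mut indices: Vec<u32> = Vec::with_capacity(data.len());
        for item in &data {
            let idx = match lookup.get(item.as_str()) {
                Some(&i) => i,
                None => {
                    let i = values.len() as u32;
                    values.push(item.clone());
                    lookup.insert(item.as_str(), i);
                    i
                }
            };
            indices.push(idx);
        }
        (values, indices)
    };

    // Measure the actual sizes now that encoding is done.
    let dict_size = plain_size(&values) + indices.len() * 4;
    let raw_size = plain_size(&data);

    if (dict_size as f64) <= max_ratio * (raw_size as f64) {
        Encoded::Dictionary { values, indices }
    } else {
        // The estimate was too optimistic: drop the dictionary.
        Encoded::Plain(data)
    }
}
```

With a threshold such as `max_ratio = 0.8` (an arbitrary value chosen for this sketch), a column of highly repetitive strings would stay dictionary encoded, while a column of mostly unique strings would come back as `Encoded::Plain`. A real implementation would probably want to abort early, before the full dictionary is built, once it is clear the threshold cannot be met.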
