Currently we use HLL to decide if we should apply dictionary encoding or not. However, sometimes HLL is wrong, and it thinks we should use dictionary encoding even in cases where the resulting encoded array is larger than the unencoded array (or not significantly smaller enough to justify the decode cost)
We should check this scenario and, if dictionary encoding was not as helpful as we had hoped, we should abandon the dictionary encoded data.
Currently we use HLL to decide if we should apply dictionary encoding or not. However, sometimes HLL is wrong, and it thinks we should use dictionary encoding even in cases where the resulting encoded array is larger than the unencoded array (or not significantly smaller enough to justify the decode cost)
We should check this scenario and, if dictionary encoding was not as helpful as we had hoped, we should abandon the dictionary encoded data.