In #5365 we found, via a UBSan failure, that decoding RLE-encoded, dictionary-encoded Parquet data with nulls does not initialize the slots of the output data array that correspond to null entries. Since the output data array is generally a freshly allocated memory buffer, those slots will contain uninitialized memory.
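To illustrate the problem, here is a minimal sketch of a "spaced" dictionary decode that writes a defined value into null slots. It is not the actual Parquet decoder code: `IsValid` and `DecodeSpaced` are hypothetical names, the values are assumed to be `int32_t`, and the validity bitmap is assumed to be LSB-first with 1 meaning valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an Arrow/Parquet API): reads bit i of an
// LSB-first validity bitmap, where 1 means valid and 0 means null.
inline bool IsValid(const std::vector<uint8_t>& bitmap, std::size_t i) {
  return ((bitmap[i / 8] >> (i % 8)) & 1) != 0;
}

// Hypothetical spaced decode. `indices` holds one dictionary index per
// non-null row; `out` must have room for `num_rows` values and may point
// to uninitialized memory.
void DecodeSpaced(const std::vector<int32_t>& dictionary,
                  const std::vector<int32_t>& indices,
                  const std::vector<uint8_t>& validity,
                  std::size_t num_rows, int32_t* out) {
  std::size_t next = 0;  // position of the next unread dictionary index
  for (std::size_t row = 0; row < num_rows; ++row) {
    if (IsValid(validity, row)) {
      assert(next < indices.size());
      out[row] = dictionary[indices[next++]];
    } else {
      // Without this write, the null slot keeps whatever bytes the
      // freshly allocated buffer happened to contain.
      out[row] = 0;
    }
  }
  assert(next == indices.size());
}
```

Calling `DecodeSpaced` on a buffer pre-filled with a sentinel value and checking that the null slots come back as 0 is one way to test for the behavior described here.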
Reporter: Antoine Pitrou / @pitrou
Assignee: Antoine Pitrou / @pitrou
Note: This issue was originally created as ARROW-6572. Please see the migration documentation for further details.