
Pass empty vectors as min/max for all null pages when building ColumnIndex #6316

Merged
alamb merged 3 commits into apache:master from etseidl:issue_6315
Aug 31, 2024
etseidl commented Aug 27, 2024

Which issue does this PR close?

Closes #6315.

Rationale for this change

Pages whose values are all null should write an empty (zero-length) byte array for both min and max to the ColumnIndex. The current behavior is to write a single zero byte for each.

What changes are included in this PR?

Pass empty vectors to ColumnIndexBuilder::append.
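As a rough illustration of the change, here is a minimal, self-contained sketch. `SketchColumnIndex` and `append_page` are hypothetical stand-ins written for this example, not the parquet crate's types; the `append` parameters (null-page flag, min bytes, max bytes, null count) and the little-endian `i32` encoding are assumptions chosen to keep the example small.

```rust
/// Hypothetical stand-in for a column index builder (not the parquet crate's type).
/// Each vector holds one entry per page.
#[derive(Default)]
struct SketchColumnIndex {
    null_pages: Vec<bool>,
    min_values: Vec<Vec<u8>>,
    max_values: Vec<Vec<u8>>,
    null_counts: Vec<i64>,
}

impl SketchColumnIndex {
    /// Record the index entry for one page.
    fn append(&mut self, null_page: bool, min: Vec<u8>, max: Vec<u8>, null_count: i64) {
        self.null_pages.push(null_page);
        self.min_values.push(min);
        self.max_values.push(max);
        self.null_counts.push(null_count);
    }
}

/// Hypothetical helper: compute the entry for a page of optional i32 values.
fn append_page(index: &mut SketchColumnIndex, values: &[Option<i32>]) {
    let null_count = values.iter().filter(|v| v.is_none()).count() as i64;
    // `flatten` skips the `None` entries, so min/max only see non-null values.
    let min = values.iter().flatten().min();
    let max = values.iter().flatten().max();
    match (min, max) {
        (Some(min), Some(max)) => index.append(
            false,
            min.to_le_bytes().to_vec(),
            max.to_le_bytes().to_vec(),
            null_count,
        ),
        // No non-null values: pass empty vectors rather than a one-byte placeholder.
        _ => index.append(true, Vec::new(), Vec::new(), null_count),
    }
}
```

In this sketch, a page such as `[None, None]` yields a null-page entry whose min and max are both zero-length, while a page with at least one non-null value stores the encoded bounds as usual.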

Are there any user-facing changes?

No, since min/max statistics should be ignored for pages with all nulls.
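To see why a reader should not notice the difference, consider a sketch of a reader that consults the null-page flag before touching min/max. `PageEntry`, `decode_i32`, and `page_may_contain` are hypothetical names invented for this example and do not come from the parquet crate; the little-endian `i32` encoding is likewise an assumption.

```rust
/// Hypothetical per-page view of a column index entry.
struct PageEntry<'a> {
    null_page: bool,
    min: &'a [u8],
    max: &'a [u8],
}

/// Decode a little-endian i32, or return None if the slice is not 4 bytes long.
fn decode_i32(bytes: &[u8]) -> Option<i32> {
    if bytes.len() != 4 {
        return None;
    }
    Some(i32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// Returns true if the page might contain a non-null value equal to `target`.
fn page_may_contain(entry: &PageEntry, target: i32) -> bool {
    if entry.null_page {
        // The page holds only nulls, so min/max are never inspected.
        return false;
    }
    match (decode_i32(entry.min), decode_i32(entry.max)) {
        (Some(min), Some(max)) => min <= target && target <= max,
        // Bounds could not be decoded: stay conservative and keep the page.
        _ => true,
    }
}
```

Under these assumptions, an all-null page gives the same answer whether its min/max hold a single zero byte or are empty, because the flag check returns first.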

github-actions bot added the parquet (Changes to the parquet crate) label on Aug 27, 2024

alamb left a comment

Thank you @etseidl -- would it be possible to get a test case for this?

Development

Successfully merging this pull request may close these issues.

Parquet writer should not write any min/max data to ColumnIndex when all values are null

2 participants