@yuzelin yuzelin commented Apr 1, 2025

Purpose

Fix the bug reported in #4780.
Why the bug occurs:
ColumnIndexFilter#applyPredicate:

if (!columns.contains(columnPath)) {
    return rangesForMissingColumns;
}

It returns an empty range. Then, in VectorizedParquetRecordReader:

this.totalRowCount = reader.getFilteredRecordCount(); // always 0

Finally, VectorizedParquetRecordReader#nextBatch:

if (rowsReturned >= totalRowCount) { // 0 >= 0
    return false;
}
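The chain above can be reproduced with a small, self-contained sketch. The class `EmptyRangeSketch` and its methods are hypothetical stand-ins for the three code locations quoted above, not the real Paimon or Parquet classes. It assumes, as described, that a filter on a column missing from the read columns yields an empty range.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of the three steps quoted above.
class EmptyRangeSketch {

    // Stands in for ColumnIndexFilter#applyPredicate followed by
    // reader.getFilteredRecordCount(): returns how many rows survive the filter.
    static long filteredRecordCount(Set<String> columns, String columnPath, long rowCount) {
        if (!columns.contains(columnPath)) {
            return 0L; // empty range for a missing column, as described in this PR
        }
        return rowCount; // assumption: the predicate keeps every row
    }

    // Stands in for the check at the top of VectorizedParquetRecordReader#nextBatch.
    static boolean nextBatch(long rowsReturned, long totalRowCount) {
        if (rowsReturned >= totalRowCount) {
            return false; // nothing left to read
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> readColumns = new HashSet<>(Arrays.asList("a", "b"));

        // Filter on "c", which is not among the read columns.
        long totalRowCount = filteredRecordCount(readColumns, "c", 100L);
        System.out.println("totalRowCount = " + totalRowCount);
        System.out.println("nextBatch = " + nextBatch(0L, totalRowCount));
    }
}
```

With a filter on a column outside the read set, the first `nextBatch` call already reports that there is nothing to read, even though the file has rows.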

Tests

API and Format

Documentation

yuzelin added 2 commits April 1, 2025 20:38
@JingsongLi

Which compute engine?

@JingsongLi

When converting predicates to Parquet filters, Paimon should consider only the fields that remain after projection and discard filters on other fields.
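A minimal sketch of that suggestion, under stated assumptions: `FieldPredicate` and `keepProjected` are hypothetical names introduced here for illustration and are not Paimon APIs. The idea is to drop any predicate whose field is not in the projected set before building the Parquet filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not the actual Paimon predicate classes.
class ProjectedFilterSketch {

    // Stand-in for a predicate that references exactly one field.
    static final class FieldPredicate {
        final String field;
        final String expression;

        FieldPredicate(String field, String expression) {
            this.field = field;
            this.expression = expression;
        }
    }

    // Keep only predicates whose field survives projection; discard the rest.
    static List<FieldPredicate> keepProjected(
            List<FieldPredicate> predicates, Set<String> projectedFields) {
        List<FieldPredicate> kept = new ArrayList<>();
        for (FieldPredicate p : predicates) {
            if (projectedFields.contains(p.field)) {
                kept.add(p);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> projected = new HashSet<>(Arrays.asList("a", "b"));
        List<FieldPredicate> predicates = Arrays.asList(
                new FieldPredicate("a", "a > 1"),
                new FieldPredicate("c", "c = 5")); // "c" is not projected

        for (FieldPredicate p : keepProjected(predicates, projected)) {
            System.out.println(p.expression);
        }
    }
}
```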

@yuzelin yuzelin closed this Apr 8, 2025
@yuzelin yuzelin deleted the parquet_filter_fix branch April 10, 2025 03:39