@kaka11chen
[Fix] Revert the computeRGIdx() changes from #309; this will be fixed in the next PR.

@morningman morningman merged commit 77e8f92 into apache:orc Apr 27, 2025
kaka11chen added a commit to kaka11chen/doris-thirdparty that referenced this pull request Apr 29, 2025
morningman pushed a commit to apache/doris that referenced this pull request May 7, 2025
…sent stream failing to access repeatedly when late materialization occurs. (#50358)

### What problem does this PR solve?

Related PR: apache/doris-thirdparty#309
apache/doris-thirdparty#310

Problem Summary:
When an ORC file is written with an older version of pyorc (e.g.,
pyorc-0.3.0) and the data contains null values, a present stream is
generated for the top-level struct column.
This does not happen with newer versions of pyorc (e.g., pyorc-0.10.0)
or with ORC files generated by tools like Hive or Spark.
As a result, the present stream generated by the older version is read
twice during late materialization, and the second read fails with the
error 'bad read in next buffer'. The current solution is to skip the
present stream when it belongs to the top-level struct column.
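
The failure mode and the workaround described in the summary can be illustrated with a toy model. This is only a sketch and not the ORC reader's actual code: `OneShotStream`, `make_streams`, `read_pass`, `late_materialize`, and the `skip_root_present` flag are hypothetical names, and the model simply assumes that late materialization makes two passes that each visit the root struct's streams.

```python
ROOT_COLUMN_ID = 0  # in ORC, the top-level struct is column 0


class OneShotStream:
    """Hypothetical stand-in for a stream that cannot be read a second time."""

    def __init__(self, data):
        self._data = data
        self._consumed = False

    def read(self):
        if self._consumed:
            # Mirrors the error message quoted in the summary.
            raise IOError("bad read in next buffer")
        self._consumed = True
        return self._data


def make_streams():
    """Streams keyed by (column id, stream kind), with a root present stream."""
    return {
        (ROOT_COLUMN_ID, "PRESENT"): OneShotStream(b"\xff"),
        (1, "DATA"): OneShotStream(b"filter"),
        (2, "DATA"): OneShotStream(b"lazy"),
    }


def read_pass(streams, column_ids, skip_root_present):
    """Read one pass; the root struct's streams are visited on every pass."""
    out = {}
    for (column_id, kind), stream in streams.items():
        is_root = column_id == ROOT_COLUMN_ID
        if not is_root and column_id not in column_ids:
            continue
        if skip_root_present and is_root and kind == "PRESENT":
            continue  # the workaround: never touch the root present stream
        out[(column_id, kind)] = stream.read()
    return out


def late_materialize(streams, skip_root_present):
    """Two passes: filter columns first, then the remaining (lazy) columns."""
    filter_pass = read_pass(streams, {1}, skip_root_present)
    lazy_pass = read_pass(streams, {2}, skip_root_present)
    return filter_pass, lazy_pass
```

In this model, `late_materialize(make_streams(), skip_root_present=False)` raises on the second pass because the root present stream is read again, while `skip_root_present=True` lets both passes complete.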
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…sent stream failing to access repeatedly when late materialization occurs. (apache#50358)
