[GLUTEN-11330][VL] Make PartialProject support array and map with null values #11331

Merged
jinchengchenghh merged 2 commits into apache:main from jiangjiangtian:access_null_value_in_map
Jan 2, 2026

Contributor

@jiangjiangtian jiangjiangtian commented Dec 25, 2025

What changes are proposed in this pull request?

This PR introduces a new class, ArrowColumnarArray. Its implementation is copied from Spark 4.0, except that the handleNull parameter is set to true when get calls SpecializedGettersReader.read. With this change, accessing an array element first checks whether that element is null, so reading a null element no longer throws an exception.
This PR also introduces a second class, ArrowColumnarMap, which holds two ArrowColumnarArray fields, one for the keys and one for the values. With this class, accessing a null value in a map no longer throws an exception either.
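
To illustrate the null check described above, here is a minimal, self-contained sketch. It is not the code in this PR and does not use the Spark or Arrow APIs: SketchIntVector, SketchNullAwareArray, and SketchNullAwareMap are hypothetical stand-ins that only model the idea of checking the null bitmap before reading a slot, and of building a map view from two such arrays.

```java
import java.util.BitSet;

/** Hypothetical stand-in for a column vector: raw values plus a null bitmap. */
final class SketchIntVector {
    private final int[] values;
    private final BitSet nulls = new BitSet();

    SketchIntVector(Integer... data) {
        values = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            if (data[i] == null) {
                nulls.set(i);          // mark the slot null; values[i] stays 0
            } else {
                values[i] = data[i];
            }
        }
    }

    boolean isNullAt(int i) { return nulls.get(i); }

    /** Unchecked read: does not consult the null bitmap. */
    int getInt(int i) { return values[i]; }
}

/** Hypothetical null-aware array view over a slice of a vector. */
final class SketchNullAwareArray {
    private final SketchIntVector data;
    private final int offset;
    private final int length;

    SketchNullAwareArray(SketchIntVector data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    int numElements() { return length; }

    /** Checks for null first and returns null instead of reading the raw slot. */
    Integer get(int ordinal) {
        if (ordinal < 0 || ordinal >= length) {
            throw new IndexOutOfBoundsException("ordinal " + ordinal);
        }
        int i = offset + ordinal;
        return data.isNullAt(i) ? null : Integer.valueOf(data.getInt(i));
    }
}

/** Hypothetical map view: two parallel null-aware arrays for keys and values. */
final class SketchNullAwareMap {
    private final SketchNullAwareArray keys;
    private final SketchNullAwareArray values;

    SketchNullAwareMap(SketchNullAwareArray keys, SketchNullAwareArray values) {
        if (keys.numElements() != values.numElements()) {
            throw new IllegalArgumentException("keys and values must have equal length");
        }
        this.keys = keys;
        this.values = values;
    }

    int numElements() { return keys.numElements(); }
    SketchNullAwareArray keyArray() { return keys; }
    SketchNullAwareArray valueArray() { return values; }
}
```

In this sketch, a map built from keys (1, 2, 3) and values (10, null, 30) returns null from valueArray().get(1) rather than failing or exposing the raw slot contents.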

How was this patch tested?

Unit tests.

Related issue: #11330

@github-actions github-actions bot added the VELOX label Dec 25, 2025
@jiangjiangtian jiangjiangtian changed the title from "[GLUTEN-11330][VL]Make PartialProject support array and map with null values" to "[GLUTEN-11330][VL] Make PartialProject support array and map with null values" Dec 25, 2025
@github-actions github-actions bot added the CORE works for Gluten Core label Dec 25, 2025
@github-actions

Run Gluten Clickhouse CI on x86

@jiangjiangtian jiangjiangtian force-pushed the access_null_value_in_map branch from f9c5f70 to 55c5764 December 25, 2025 11:30
@github-actions

Run Gluten Clickhouse CI on x86

@jiangjiangtian jiangjiangtian force-pushed the access_null_value_in_map branch from 55c5764 to 16ab9d6 December 25, 2025 11:53
@github-actions

Run Gluten Clickhouse CI on x86

Contributor

@jinchengchenghh jinchengchenghh left a comment

Thanks for your enhancement!

import org.apache.spark.sql.vectorized.ColumnVector;

/**
* Because `get` method in `ColumnarArray` don't check whether the data to get is null and arrow
Contributor

Is this a Spark shortcoming or is it by design? How does Spark use ColumnarArray with null values?

Contributor Author

I think it is a Spark shortcoming. If null values exist, ColumnarArray will still return a value (possibly a default value or a previously set value), because calling get on ColumnarArray eventually calls getXXX on ColumnVector, and getXXX does not check whether the value is null either.
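
A small model of that behavior, as a sketch only: StaleSlotVector is a hypothetical class, not Spark's ColumnVector, and it assumes a layout where marking a slot null leaves the underlying storage untouched. Under that assumption, an unchecked getter returns either the default or whatever was written earlier, while a checked getter returns null.

```java
import java.util.BitSet;

/** Hypothetical vector whose putNull only flips a bit and leaves the data as-is. */
final class StaleSlotVector {
    private final int[] values;
    private final BitSet nulls = new BitSet();

    StaleSlotVector(int capacity) {
        values = new int[capacity];    // every slot starts at the default 0
    }

    void putInt(int i, int v) {
        values[i] = v;
        nulls.clear(i);
    }

    /** Marks the slot null without clearing the stored value. */
    void putNull(int i) {
        nulls.set(i);
    }

    boolean isNullAt(int i) { return nulls.get(i); }

    /** Unchecked read: ignores the null bitmap, so null slots leak raw contents. */
    int getInt(int i) { return values[i]; }

    /** Checked read: consults the null bitmap before touching the data. */
    Integer getIntOrNull(int i) {
        return isNullAt(i) ? null : Integer.valueOf(values[i]);
    }
}
```

For example, after putInt(0, 42) followed by putNull(0), getInt(0) in this model still returns 42, whereas getIntOrNull(0) returns null.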

Contributor

Could you also raise an issue in Spark?

Contributor Author

OK, I will. Thanks.

@jinchengchenghh jinchengchenghh merged commit 41073d5 into apache:main Jan 2, 2026
106 of 109 checks passed
QCLyu pushed a commit to QCLyu/incubator-gluten that referenced this pull request Jan 8, 2026
…l values (apache#11331)

---------

Co-authored-by: jiangtian <JT2677636391@outlook.com>