[VL] IllegalStateException thrown when accessing null values of map in partial project #11330

@jiangjiangtian

Description

Backend

VL (Velox)

Bug description

The exception is as follows:

Reason: Operator::getOutput failed for [operator: ValueStream, plan node ID: 0]: Error during calling Java code from native code: java.lang.IllegalStateException: Value at index is null
	at org.apache.gluten.shaded.org.apache.arrow.vector.BigIntVector.get(BigIntVector.java:105)
	at org.apache.gluten.vectorized.ArrowWritableColumnVector$LongAccessor.getLong(ArrowWritableColumnVector.java:989)
	at org.apache.gluten.vectorized.ArrowWritableColumnVector.getLong(ArrowWritableColumnVector.java:639)
	at org.apache.spark.sql.vectorized.ColumnarArray.getLong(ColumnarArray.java:133)
	at org.apache.spark.sql.catalyst.expressions.SpecializedGettersReader.read(SpecializedGettersReader.java:49)
	at org.apache.spark.sql.vectorized.ColumnarArray.get(ColumnarArray.java:180)
	at org.apache.spark.sql.catalyst.util.MapData.foreach(MapData.scala:38)
	at org.apache.spark.sql.hive.HiveInspectors.$anonfun$wrapperFor$49(HiveInspectors.scala:433)
	at org.apache.spark.sql.hive.HiveInspectors.$anonfun$withNullSafe$1(HiveInspectors.scala:262)
	at org.apache.spark.sql.hive.HiveSimpleUDFEvaluator.setArg(hiveUDFEvaluators.scala:92)
	at org.apache.spark.sql.hive.HiveSimpleUDF.$anonfun$eval$1(hiveUDFs.scala:67)
	at org.apache.spark.sql.hive.HiveSimpleUDF.$anonfun$eval$1$adapted(hiveUDFs.scala:66)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.spark.sql.hive.HiveSimpleUDF.eval(hiveUDFs.scala:66)
	at org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:158)
	at org.apache.gluten.expression.InterpretedArrowProjection.apply(InterpretedArrowProjection.scala:76)
	at org.apache.gluten.expression.InterpretedArrowProjection.apply(InterpretedArrowProjection.scala:33)
	at org.apache.gluten.execution.ColumnarPartialProjectExec.$anonfun$getProjectedBatchArrow$3(ColumnarPartialProjectExec.scala:233)
	at org.apache.gluten.execution.ColumnarPartialProjectExec.$anonfun$getProjectedBatchArrow$3$adapted(ColumnarPartialProjectExec.scala:231)
	at scala.collection.immutable.Range.foreach(Range.scala:158)
	at org.apache.gluten.execution.ColumnarPartialProjectExec.org$apache$gluten$execution$ColumnarPartialProjectExec$$getProjectedBatchArrow(ColumnarPartialProjectExec.scala:231)
	at org.apache.gluten.execution.ColumnarPartialProjectExec$$anon$1.next(ColumnarPartialProjectExec.scala:176)
	at org.apache.gluten.execution.ColumnarPartialProjectExec$$anon$1.next(ColumnarPartialProjectExec.scala:165)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)

The cause is that the get method of ColumnarArray does not handle null values: it does not check whether the requested element is null before reading it, and reading a null element through get is illegal. The exception is therefore thrown whenever we try to access a null value of a map.
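
To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use the Spark or Arrow classes: NullableLongColumn is a hypothetical stand-in for an Arrow-backed long column, and readUnchecked / readNullSafe are illustrative names that are not part of Gluten or Spark. The sketch only assumes what the stack trace shows, namely that reading a null slot raises IllegalStateException, and shows one possible guard: checking isNullAt before calling the typed getter.

```java
import java.util.ArrayList;
import java.util.List;

class NullSafeMapRead {
    /** Hypothetical stand-in for an Arrow-backed long column with a validity mask. */
    static final class NullableLongColumn {
        private final long[] values;
        private final boolean[] isNull;

        NullableLongColumn(Long... boxed) {
            values = new long[boxed.length];
            isNull = new boolean[boxed.length];
            for (int i = 0; i < boxed.length; i++) {
                if (boxed[i] == null) {
                    isNull[i] = true;
                } else {
                    values[i] = boxed[i];
                }
            }
        }

        int size() {
            return values.length;
        }

        boolean isNullAt(int i) {
            return isNull[i];
        }

        /** Mirrors the behaviour in the stack trace: reading a null slot is illegal. */
        long getLong(int i) {
            if (isNull[i]) {
                throw new IllegalStateException("Value at index is null");
            }
            return values[i];
        }
    }

    /** Reads every element without a null check; fails on the first null slot. */
    static List<Long> readUnchecked(NullableLongColumn col) {
        List<Long> out = new ArrayList<>();
        for (int i = 0; i < col.size(); i++) {
            out.add(col.getLong(i));
        }
        return out;
    }

    /** Checks isNullAt first and maps null slots to a Java null. */
    static List<Long> readNullSafe(NullableLongColumn col) {
        List<Long> out = new ArrayList<>();
        for (int i = 0; i < col.size(); i++) {
            out.add(col.isNullAt(i) ? null : col.getLong(i));
        }
        return out;
    }

    public static void main(String[] args) {
        // Map values with a null in the middle, e.g. map('a', 1, 'b', null, 'c', 3).
        NullableLongColumn mapValues = new NullableLongColumn(1L, null, 3L);
        System.out.println(readNullSafe(mapValues));
        try {
            readUnchecked(mapValues);
        } catch (IllegalStateException e) {
            System.out.println("unchecked read failed: " + e.getMessage());
        }
    }
}
```

Whether the guard belongs in the array accessor itself or at the call site that iterates the map is a design choice this sketch does not settle.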

Gluten version

No response

Spark version

Spark-3.5.x

Spark configurations

No response

System information

No response

Relevant logs

Metadata

    Labels

    bug (Something isn't working), triage
