Implement eq comparison for StructArray #5960

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
arrow::compute::kernels::cmp::eq does not support StructArray: calling it on two struct arrays returns an error instead of a BooleanArray.

Here is an example:

use arrow::array::{ArrayRef, Int32Array, StructArray};
use arrow::compute::kernels::cmp::eq;
use arrow::datatypes::{DataType, Field, Schema};
use std::sync::Arc;

fn main() {
    let nested_schema = Arc::new(Schema::new(vec![
        Field::new("id", DataType::Int32, true),
        Field::new("lat", DataType::Int32, true),
        Field::new("long", DataType::Int32, true),
    ]));
    let schema = Arc::new(Schema::new(vec![
        Field::new("value", DataType::Int32, true),
        Field::new(
            "nested",
            DataType::Struct(nested_schema.fields.clone()),
            true,
        ),
    ]));

    let arr1: ArrayRef = Arc::new(StructArray::from(vec![
        (
            Arc::new(Field::new("id", DataType::Int32, true)),
            Arc::new(Int32Array::from(vec![1, 2, 3])) as ArrayRef,
        ),
        (
            Arc::new(Field::new("lat", DataType::Int32, true)),
            Arc::new(Int32Array::from(vec![1, 2, 3])) as ArrayRef,
        ),
        (
            Arc::new(Field::new("long", DataType::Int32, true)),
            Arc::new(Int32Array::from(vec![1, 2, 3])) as ArrayRef,
        ),
    ]));

    let arr2: ArrayRef = Arc::new(StructArray::from(vec![
        (
            Arc::new(Field::new("id", DataType::Int32, true)),
            Arc::new(Int32Array::from(vec![1, 2, 3])) as ArrayRef,
        ),
        (
            Arc::new(Field::new("lat", DataType::Int32, true)),
            Arc::new(Int32Array::from(vec![1, 2, 3])) as ArrayRef,
        ),
        (
            Arc::new(Field::new("long", DataType::Int32, true)),
            Arc::new(Int32Array::from(vec![1, 2, 3])) as ArrayRef,
        ),
    ]));

    let eq_result = eq(&arr1, &arr2).unwrap();
    println!("{:?}", eq_result);
}

It currently fails like this:

thread 'main' panicked at src/main.rs:56:38:
called `Result::unwrap()` on an `Err` value: InvalidArgumentError("Invalid comparison operation: Struct([Field { name: \"id\", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"lat\", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"long\", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]) == Struct([Field { name: \"id\", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"lat\", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"long\", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }])")

Describe the solution you'd like
I would like eq and the other comparison kernels to support StructArray, so that the example above runs without error.

Describe alternatives you've considered
We can implement comparison downstream via arrow::array::make_comparator, as done in apache/datafusion#11117.

Additional context
This was reported upstream in datafusion: apache/datafusion#10749

@jayzhan211 added comparison to the dynamic comparator in #5792
