[C++][Compute] Support casting a smaller struct to a bigger one if extra fields are nullable #44555

@Tom-Newton

Describe the enhancement requested

Arrow can already cast to a schema with an extra column if that column is nullable, but it can't do the same for structs.

More precisely, I want the following unit tests to pass:

TEST(Cast, StructToBiggerStruct) {
  std::vector<std::string> field_names = {"a", "b"};
  std::shared_ptr<Array> a, b;
  a = ArrayFromJSON(int8(), "[1, 2]");
  b = ArrayFromJSON(int8(), "[3, 4]");
  ASSERT_OK_AND_ASSIGN(auto src, StructArray::Make({a, b}, field_names));

  const auto dest =
      arrow::struct_({std::make_shared<Field>("a", int8()),
                      std::make_shared<Field>("b", int8()),
                      std::make_shared<Field>("c", int8(), /*nullable=*/false)});
  const auto options = CastOptions::Safe(dest);

  EXPECT_RAISES_WITH_MESSAGE_THAT(
      TypeError,
      ::testing::HasSubstr("struct fields don't match or are in the wrong order"),
      Cast(src, options));
}

TEST(Cast, StructToBiggerNullableStruct) {
  std::vector<std::string> field_names = {"a", "b"};
  std::shared_ptr<Array> a, b, c;
  a = ArrayFromJSON(int8(), "[1, 2]");
  b = ArrayFromJSON(int8(), "[3, 4]");
  ASSERT_OK_AND_ASSIGN(auto src, StructArray::Make({a, b}, field_names));

  const auto type_dest = arrow::struct_(
      {std::make_shared<Field>("a", int8()), std::make_shared<Field>("b", int8()),
       std::make_shared<Field>("c", int8(), /*nullable=*/true)});
  const auto options = CastOptions::Safe(type_dest);

  c = ArrayFromJSON(int8(), "[null, null]");
  ASSERT_OK_AND_ASSIGN(auto dest, StructArray::Make({a, b, c}, {"a", "b", "c"}));
  CheckCast(src, dest);
}
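
The rule these two tests encode can be sketched without Arrow. This is only an illustration under simplifying assumptions, not Arrow's implementation: `FieldSpec`, `CastPlan`, and `PlanStructCast` are hypothetical names, fields are matched purely by name, and field order and child type conversion are ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an Arrow field: only a name and a nullability flag.
struct FieldSpec {
  std::string name;
  bool nullable;
};

// Result of planning a struct cast.
// source_index[i] is the source field feeding destination field i,
// or -1 when the destination field is absent and would be filled with nulls.
struct CastPlan {
  bool ok;
  std::vector<int> source_index;
};

// Assumption: a destination field missing from the source is acceptable
// only if it is nullable; otherwise the whole cast is rejected.
CastPlan PlanStructCast(const std::vector<std::string>& src_names,
                        const std::vector<FieldSpec>& dest_fields) {
  CastPlan plan{true, {}};
  plan.source_index.reserve(dest_fields.size());
  for (const FieldSpec& dest : dest_fields) {
    int found = -1;
    for (std::size_t i = 0; i < src_names.size(); ++i) {
      if (src_names[i] == dest.name) {
        found = static_cast<int>(i);
        break;
      }
    }
    if (found < 0 && !dest.nullable) {
      // Missing and non-nullable: there is no valid value to fill in.
      return CastPlan{false, {}};
    }
    plan.source_index.push_back(found);
  }
  return plan;
}
```

With source fields `a`, `b` and destination fields `a`, `b`, `c`, this sketch yields the plan `{0, 1, -1}` when `c` is nullable and rejects the cast when `c` is non-nullable, mirroring the two tests.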

Does this sound like a reasonable thing to support? The motivation comes from delta-io/delta-rs#1610, but I think the fix would require a change to Arrow.

Component(s)

C++
