Skip to content

[VARIANT] Add support for DataType::Utf8/LargeUtf8/Utf8View for cast_to_variant#8089

Merged
alamb merged 5 commits intoapache:mainfrom
cmu-db:utf8cast
Aug 14, 2025
Merged

[VARIANT] Add support for DataType::Utf8/LargeUtf8/Utf8View for cast_to_variant#8089
alamb merged 5 commits intoapache:mainfrom
cmu-db:utf8cast

Conversation

@carpecodeum
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Rationale for this change

Add support for DataType::Utf8/LargeUtf8/Utf8View for cast_to_variant

What changes are included in this PR?

Added support for casting and added tests as well

Are these changes tested?

Yes

Are there any user-facing changes?

yes casting to variant is a user facing issue

Props to @mprammer!!

@carpecodeum carpecodeum changed the title [ADD] Add support for DataType::Utf8/LargeUtf8/Utf8View for cast_to_variant [VARIANT] Add support for DataType::Utf8/LargeUtf8/Utf8View for cast_to_variant Aug 7, 2025
Comment thread parquet-variant-compute/src/cast_to_variant.rs Outdated
Comment on lines 487 to 1185
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not using StringArray::from?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reuse the macro cast_conversion here?

Copy link
Copy Markdown
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @carpecodeum -- I think @klion26 and @Weijun-H have some excellent comments and suggestions for this PR

@github-actions github-actions Bot added arrow Changes to the arrow crate parquet-variant parquet-variant* crates labels Aug 12, 2025
@carpecodeum carpecodeum force-pushed the utf8cast branch 2 times, most recently from 76d5cd5 to 5570608 Compare August 12, 2025 19:51
@github-actions github-actions Bot removed the arrow Changes to the arrow crate label Aug 12, 2025
@alamb
Copy link
Copy Markdown
Contributor

alamb commented Aug 13, 2025

Hi @carpecodeum -- I can't merge this PR because github says it has a conflict (and since I can't push commits to your fork I can't resolve them myself)

Screenshot 2025-08-13 at 12 33 55 PM

Could you

  1. Please resolve the conflicts
  2. Let me know if you plan to address @klion26 's comment [VARIANT] Add support for DataType::Utf8/LargeUtf8/Utf8View for cast_to_variant #8089 (comment) (I don't think it is necessary but I do think it would improve the code -- we can also do it as a follow on PR too)

@carpecodeum
Copy link
Copy Markdown
Contributor Author

Hi @carpecodeum -- I can't merge this PR because github says it has a conflict (and since I can't push commits to your fork I can't resolve them myself)

Screenshot 2025-08-13 at 12 33 55 PM Could you
  1. Please resolve the conflicts
  2. Let me know if you plan to address @klion26 's comment [VARIANT] Add support for DataType::Utf8/LargeUtf8/Utf8View for cast_to_variant #8089 (comment) (I don't think it is necessary but I do think it would improve the code -- we can also do it as a follow on PR too)

Yes, I will make those changes as well, totally missed this

@alamb alamb dismissed Weijun-H’s stale review August 14, 2025 16:56

Requested changes are made

@alamb alamb merged commit 536ccf5 into apache:main Aug 14, 2025
12 checks passed
@alamb
Copy link
Copy Markdown
Contributor

alamb commented Aug 14, 2025

Thank you @carpecodeum @klion26 and @Weijun-H

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

parquet-variant parquet-variant* crates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Variant]: Implement DataType::Utf8/LargeUtf8/Utf8View support for cast_to_variant kernel

4 participants