Skip to content

[Variant]: Implement DataType::Interval support for cast_to_variant kernel#8125

Merged
alamb merged 6 commits intoapache:mainfrom
codephage2020:feat/cast-to-variant-interval
Aug 14, 2025
Merged

[Variant]: Implement DataType::Interval support for cast_to_variant kernel#8125
alamb merged 6 commits intoapache:mainfrom
codephage2020:feat/cast-to-variant-interval

Conversation

@codephage2020
Copy link
Copy Markdown
Contributor

@codephage2020 codephage2020 commented Aug 13, 2025

Which issue does this PR close?

We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax.

What changes are included in this PR?

Added Interval Support:

  • IntervalYearMonth → Variant::Int32
  • IntervalDayTime → Variant::Binary (8 bytes)
  • IntervalMonthDayNano → Variant::Binary (16 bytes)

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

Signed-off-by: codephage2020 <tingwangyan2020@163.com>
@github-actions github-actions Bot added the parquet-variant parquet-variant* crates label Aug 13, 2025
Signed-off-by: codephage2020 <tingwangyan2020@163.com>
@codephage2020 codephage2020 force-pushed the feat/cast-to-variant-interval branch from 3e4a677 to 7cc9eb7 Compare August 13, 2025 02:31
@codephage2020 codephage2020 marked this pull request as ready for review August 13, 2025 02:31
Copy link
Copy Markdown
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @codephage2020

let mut bytes = Vec::with_capacity(8);
bytes.extend_from_slice(&days_bytes);
bytes.extend_from_slice(&millis_bytes);
bytes
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you made this Variant::Binary(bytes) you could then use the existing run_test function rather than adding run_binary_interval_test

Some(i32::MIN),
])),
vec![
Some(Variant::Int32(0)),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I double checked that there doesn't seem to be a interval / duration type in Variant: https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#encoding-types

Converting to Int32 / VariantBinary is about the best we can do for now. However, it might be misleading

Another behavior might be to throw a "InvalidArgument" Error and make it clear that casting from Intervals to Variant is not well defined / supported.

What do you think?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I think throwing an InvalidArgument error is the better choice.
It's more honest and prevents silent data corruption or misinterpretation.

Let's modify it.

Copy link
Copy Markdown
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @codephage2020

@alamb alamb merged commit e410cd0 into apache:main Aug 14, 2025
12 checks passed
@codephage2020 codephage2020 deleted the feat/cast-to-variant-interval branch August 14, 2025 11:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

parquet-variant parquet-variant* crates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Variant]: Implement DataType::Interval support for cast_to_variant kernel

2 participants