@the-sakthi
Member

@the-sakthi the-sakthi commented Mar 17, 2025

What changes were proposed in this pull request?

This PR adds support for extracting the minute component from TIME (TimeType) values in Spark SQL.

scala> spark.sql("SELECT minute(TIME'07:01:09.12312321231232');").show()
+------------------------------+
|minute(TIME '07:01:09.123123')|
+------------------------------+
|                             1|
+------------------------------+

Why are the changes needed?

  • Spark previously supported minute() only for TIMESTAMP values.
  • TIME support was missing, so Spark attempted an implicit cast to TIMESTAMP, which was incorrect.
  • This PR ensures that minute(TIME'HH:MM:SS.######') behaves correctly without unnecessary type coercion.

Does this PR introduce any user-facing change?

Yes

  • Before this PR, calling minute(TIME'HH:MM:SS.######') resulted in a type mismatch error or an implicit cast attempt to TIMESTAMP, which was incorrect.
  • With this PR, minute(TIME'HH:MM:SS.######') now works correctly for TIME values without implicit casting.
  • Users can now extract the minute component from TIME values natively.

How was this patch tested?

By running new tests:

$ build/sbt "test:testOnly *TimeExpressionsSuite.scala"

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Mar 17, 2025
@the-sakthi
Member Author

@MaxGekk looking forward to hearing your thoughts on this one!

Member

@MaxGekk MaxGekk left a comment

For sure, it would be nice to re-use the existing code for getting minutes, but I believe it would be better to implement minute from a TIME value as a separate expression.

@the-sakthi
Member Author

Thanks for the review @MaxGekk
That totally makes sense. I’ll introduce a separate MinuteForTime expression for TIME inputs that doesn’t depend on GetTimeField or time zones. For TIMESTAMP inputs, I’ll keep using the existing Minute expression (or rename it to MinuteForTimestamp). Then, in the function registry, I’ll dispatch to either MinuteForTime or MinuteForTimestamp based on the input data type. Let me know if that doesn’t align with what you had in mind.

@MaxGekk
Member

MaxGekk commented Mar 17, 2025

> I’ll keep using the existing Minute expression (or rename it to MinuteForTimestamp).

@the-sakthi Let's leave it as is.

> Then, in the function registry, I’ll dispatch to either MinuteForTime or MinuteForTimestamp based on the input data type.

Could you introduce a new expression, MinutesOfTime, and place it in timeExpressions.scala (see #50287)? MinutesOfTime should extend RuntimeReplaceable and use StaticInvoke, which simply invokes getMinutesOfTime from DateTimeUtils.
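
For illustration only, the helper that StaticInvoke would call can be sketched in plain Scala. This is not the code from the PR: the object name TimeUtilsSketch is hypothetical, and the sketch assumes a TIME value is represented as microseconds since midnight.

```scala
import java.time.LocalTime

// Hypothetical stand-in for the DateTimeUtils helper named in the review.
// Assumption: a TIME value is stored as microseconds since midnight.
object TimeUtilsSketch {
  private val NanosPerMicro = 1000L

  // Convert the microsecond count to a LocalTime and read its minute field.
  // No time zone is involved, unlike minute extraction from a TIMESTAMP.
  def getMinutesOfTime(micros: Long): Int =
    LocalTime.ofNanoOfDay(Math.multiplyExact(micros, NanosPerMicro)).getMinute
}

// 46739000000 microseconds after midnight is 12:58:59.
println(TimeUtilsSketch.getMinutesOfTime(46739000000L)) // prints 58
```

Keeping the logic in a static helper like this is what lets a RuntimeReplaceable expression delegate to it rather than carry its own evaluation and codegen paths.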

@the-sakthi
Member Author

Let me know if this updated PR aligns better with your suggestions, @MaxGekk.
Also, I'm looking for guidance, if possible, on fixing the codegen issue that I pointed out above.

@the-sakthi the-sakthi requested a review from MaxGekk March 20, 2025 00:20
@MaxGekk
Member

MaxGekk commented Mar 20, 2025

@the-sakthi Could you update the PR's description:

scala> spark.sql("SELECT minute(TIME'12:58:59');").show()
+-------------------+
|minute(46739000000)|

PR #50299 fixed the column name.
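
As a side note, the number in that column header matches the literal expressed in microseconds since midnight. The short check below is only an illustration of that arithmetic; reading the header as the internal microsecond value is an assumption, not something stated in the thread.

```scala
import java.time.LocalTime

// Assumption: the raw number in the column header is the TIME literal
// expressed as microseconds since midnight.
val literal = LocalTime.of(12, 58, 59)
val micros: Long = literal.toNanoOfDay / 1000L

println(micros) // prints 46739000000

// Decoding the number again recovers the original literal.
val decoded = LocalTime.ofNanoOfDay(micros * 1000L)
println(decoded) // prints 12:58:59
```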

@the-sakthi
Member Author

Thanks for the review, @MaxGekk. I'll update the PR in a few minutes. Major changes:

  • Will remove the LiteralGenerator part of the change and move it to a separate PR.
  • Will remove the checkConsistencyBetweenInterpretedAndCodegen part of the testing.
  • Will address the other review comments and update the PR description.

@the-sakthi
Member Author

Updated the revision with the suggested changes @MaxGekk ! Let me know how this one looks.

Comment on lines 151 to 220
Member Author

Moved the function usage, description, and example to the builder to fix a failing test that was complaining about this.

@the-sakthi
Member Author

the-sakthi commented Mar 22, 2025

Rebased onto the current master branch!
Let me know how this revision looks, @MaxGekk !

@the-sakthi the-sakthi requested a review from MaxGekk March 22, 2025 05:30
@MaxGekk
Member

MaxGekk commented Mar 22, 2025

+1, LGTM. Merging to master.
Thank you, @the-sakthi.

@MaxGekk MaxGekk closed this in f137f6a Mar 22, 2025
@the-sakthi
Member Author

Thank you very much for helping with this one and merging it, @MaxGekk.
Much appreciated!

MaxGekk pushed a commit that referenced this pull request Apr 11, 2025
… minute function

### What changes were proposed in this pull request?
- Followup to the original PR: #50296
- Extend the minute(...) function (MinutesOfTime) to handle TIME types of any precision from 0 to 6.
- Add tests verifying that minute(...) works for all valid TIME precisions.

### Why are the changes needed?
- Previously, minute(...) did not consistently support TIME type inputs with arbitrary precision.
- Users need the minute function to handle TIME(0) through TIME(6).

### Does this PR introduce _any_ user-facing change?
- Yes. Users can now call minute(...) on TIME(p) columns or literals with any valid precision.

### How was this patch tested?
By running new tests:
```
$ build/sbt "test:testOnly *TimeExpressionsSuite.scala"
```

By manual tests:
```
scala> spark.sql("select minute(cast('12:30' as time(0)));").show()
+------------------------------+
|minute(CAST(12:30 AS TIME(0)))|
+------------------------------+
|                            30|
+------------------------------+

scala> spark.sql("select minute(cast('12:30' as time(2)));").show()
+------------------------------+
|minute(CAST(12:30 AS TIME(2)))|
+------------------------------+
|                            30|
+------------------------------+

scala> spark.sql("select minute(cast('12:30' as time(5)));").show()
+------------------------------+
|minute(CAST(12:30 AS TIME(5)))|
+------------------------------+
|                            30|
+------------------------------+
```
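
The manual tests above return the same minute at every precision. A minimal sketch of why that is expected, assuming a TIME(p) value is microseconds since midnight and that lowering the precision only drops fractional-second digits (the object and method names below are hypothetical, not Spark APIs):

```scala
// Hypothetical helpers, not Spark APIs. Assumption: TIME values are
// microseconds since midnight and precision p keeps p fractional digits.
object TimePrecisionSketch {
  def truncateToPrecision(micros: Long, precision: Int): Long = {
    require(precision >= 0 && precision <= 6, s"invalid TIME precision: $precision")
    // 10^(6 - precision) microseconds is the smallest unit kept at this precision.
    val unit = Seq.fill(6 - precision)(10L).product
    micros - micros % unit
  }

  // Minute of the hour, computed with integer arithmetic only.
  def minuteOf(micros: Long): Int = ((micros / 60000000L) % 60L).toInt
}

val sample = 45059123456L // 12:30:59.123456
val minutes = (0 to 6).map { p =>
  TimePrecisionSketch.minuteOf(TimePrecisionSketch.truncateToPrecision(sample, p))
}
println(minutes.distinct) // prints Vector(30)
```

Because truncation only affects digits below one second, the minute field cannot change across precisions 0 through 6.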

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #50551 from the-sakthi/SPARK-51420-FOLLOWUP.

Authored-by: Sakthi <sakthi@apache.org>
Signed-off-by: Max Gekk <max.gekk@gmail.com>