[SPARK-51456][SQL] Add the to_time function
#50287
Conversation
```scala
@ExpressionDescription(
  usage = """
    _FUNC_(str[, format]) - Parses the `str` expression with the `format` expression to
    a time. Returns null with invalid input. By default, it follows casting rules to a time if
```
> Returns null with invalid input

Should this behavior follow `[try_]to_timestamp`?
I will re-formulate it to something like: "If `format` is malformed or its application does not result in a well-formed time, the function raises an error."
```scala
override lazy val replacement: Expression = format match {
  case None => invokeParser()
  case Some(expr) if expr.foldable => invokeParser(Some(expr.eval().toString), Seq(str))
```
Will it be possible to hit an NPE here? e.g. `Literal.create(null).eval().toString`.
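The concern can be illustrated with a small Python analogue (a toy model, not the actual Catalyst code): calling a method on the result of evaluating a null literal fails, so the null must be checked before stringifying the format.

```python
class Literal:
    """Toy stand-in for a Catalyst Literal: eval() returns the wrapped value."""
    def __init__(self, value):
        self.value = value

    def eval(self):
        return self.value

def format_string_unsafe(expr):
    # Mirrors expr.eval().toString: blows up (like a Scala NPE) when
    # the literal evaluates to null. upper() stands in for toString.
    return expr.eval().upper()

def format_string_safe(expr):
    # Safe variant: check for null before dereferencing, as in the fix below.
    value = expr.eval()
    return None if value is None else value.upper()

print(format_string_safe(Literal("HH:mm:ss")))  # HH:MM:SS
print(format_string_safe(Literal(None)))        # None
```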
Let me check and write a test.
Fixed and added new checks for nulls.
dongjoon-hyun left a comment
yaooqinn left a comment
LGTM
Thank you @MaxGekk @dongjoon-hyun, merged to master
* Move GlutenStreamingQuerySuite to correct package
* Add Spark 4.1 new test suites for Gluten
* Enable new and existing Gluten test suites for Spark 4.1 UT
* Update workflow trigger paths to exclude Spark 4.0 and 4.1 shims directories for clickhouse backend
* Add support for Spark 4.1 in build script
* Merge Spark 4.1.0 sql-tests into Gluten Spark 4.1 (three-way merge performed using Git):
  * Base: Spark 4.0.1 (29434ea766b)
  * Left: Spark 4.1.0 (e221b56be7b)
  * Right: Gluten Spark 4.1 backends-velox
  * Summary:
    * Auto-merged: 165 files
    * New tests added: 31 files (collations, edge cases, recursion, spatial, etc.)
    * Modified tests: 134 files
    * Deleted tests: 2 files (collations.sql -> split into 4 files, timestamp-ntz.sql)
  * Conflicts resolved:
    * inputs/timestamp-ntz.sql: Right deleted + Left modified -> DELETED (per resolution rule)
  * New test suites from Spark 4.1.0:
    * Collations (4 files): aliases, basic, padding-trim, string-functions
    * Edge cases (6 files): alias-resolution, extract-value, join-resolution, etc.
    * Advanced features: cte-recursion, generators, kllquantiles, thetasketch, time
    * Name resolution: order-by-alias, session-variable-precedence, runtime-replaceable
    * Spatial functions: st-functions (ANSI and non-ANSI variants)
    * Various resolution edge cases
  * Total files after merge: 671 (up from 613)
* Enable additional Spark 4.1 SQL tests by resolving TODOs
* Add new Spark 4.1 test files to VeloxSQLQueryTestSettings
* [Fix] Replace `RuntimeReplaceable` with its `replacement` to fix UT. See apache/spark#50287
* [4.1.0] Exclude "infer shredding with mixed scale". See apache/spark#52406
* [Fix] Implement Kryo serialization for CachedColumnarBatch. See apache/spark#50599
* [4.1.0] Exclude GlutenMapStatusEndToEndSuite and configure parallelism. See apache/spark#50230
* [4.1.0] Exclude Spark Structured Streaming tests in Gluten. See apache/spark#52473, apache/spark#52870, apache/spark#52891
* [4.1.0] Exclude failing SQL tests on Spark 4.1
* Replace SparkException.require with standard require in ColumnarCachedBatchSerializer to work across different Spark versions
* Exclude Spark 4.0 and 4.1 paths in clickhouse_be_trigger using `!` prefix
* [Fix] Update GlutenShowNamespacesParserSuite to use GlutenSQLTestsBaseTrait
What changes were proposed in this pull request?
In the PR, I propose to add the new function `to_time()`. It casts a `STRING` input value to `TIME` using an optional format.

Syntax
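The body of the Syntax section did not survive extraction; based on the usage string in the expression description (`_FUNC_(str[, format])`), the signature is presumably:

```
to_time(str[, format])
```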
Arguments
* `expr`: A `STRING` expression representing a time.
* `fmt`: If `fmt` is supplied, it must conform with the datetime patterns, see https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html. If `fmt` is not supplied, the function is a synonym for `cast(expr AS TIME)`.

Returns
A TIME(n) where n is always 6 in the proposed implementation.
Examples
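The example queries did not survive extraction. As a rough, purely illustrative analogue of the documented semantics (parse a string with an optional pattern, return null on invalid input), here is a Python sketch using `datetime.strptime`; note that Python patterns such as `%H` are not Spark's `HH`-style datetime patterns:

```python
from datetime import datetime

def to_time(s, fmt=None):
    # Hypothetical analogue of Spark's to_time: parse a string to a time,
    # returning None (null) on invalid input. The default pattern stands in
    # for the cast-to-TIME behavior used when no format is supplied.
    pattern = "%H:%M:%S" if fmt is None else fmt
    try:
        return datetime.strptime(s, pattern).time()
    except ValueError:
        return None

print(to_time("14:30:45"))              # 14:30:45
print(to_time("14.30.45", "%H.%M.%S"))  # 14:30:45
print(to_time("not a time"))            # None
```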
Why are the changes needed?
* To allow constructing values of `TIME` from strings.
* For consistency with the similar existing functions where only `to_time` is missing: `to_timestamp()` for `TIMESTAMP` and `to_date()` for `DATE`.

Does this PR introduce any user-facing change?
No.
How was this patch tested?
By running the related test suites:
Was this patch authored or co-authored using generative AI tooling?
No.