Backend
VL (Velox)
Bug description
`GlutenSQLQueryTestSuite` is a replacement for `SQLQueryTestSuite` in Spark. It runs the sample SQL queries provided by Spark under the path spark/sql/core/src/test/resources/sql-tests/inputs and verifies the results. However, `spark.sql.optimizer.excludedRules` is not set in the test suite to prevent static evaluation of literal inputs, so many SQL expressions with literal inputs are evaluated by Spark before being offloaded to Gluten.
To ensure that the expressions are effectively offloaded, we should set:
```
spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation,org.apache.spark.sql.catalyst.optimizer.ConstantFolding,org.apache.spark.sql.catalyst.optimizer.NullPropagation
```
in `GlutenSQLQueryTestSuite`. With this setting, many queries fail due to result mismatches and need to be fixed to ensure consistency with Spark. For queries that produce result mismatches, we record and track them in this issue (see the list below) and temporarily overwrite the sample file by removing the failed SQL queries until all failures in that file are resolved.
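For reproducing a single case outside the suite, the same exclusions can be applied at the session level; a minimal sketch (the query is just one example taken from the list below):

```sql
-- Disable the rules that would pre-evaluate literal expressions on the driver.
SET spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation,org.apache.spark.sql.catalyst.optimizer.ConstantFolding,org.apache.spark.sql.catalyst.optimizer.NullPropagation;

-- With ConstantFolding excluded, this literal expression stays in the plan
-- and is offloaded to the backend instead of being folded by the optimizer.
SELECT to_binary('GG');
```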
Additionally, there are three other types of failures that are not considered result mismatches:
- Exception Message Mismatches: Gluten may throw different exceptions and messages for invalid SQL queries. This is not a result-mismatch issue and should be addressed by overwriting the original files with the updated SQL output.
- Queries That Also Fail in Vanilla Spark: If a query fails in vanilla Spark when the optimizer rules `org.apache.spark.sql.catalyst.optimizer.ConstantFolding` or `org.apache.spark.sql.catalyst.optimizer.NullPropagation` are excluded, we should overwrite the original files and ignore these queries. (Vanilla Spark does not need to set `spark.sql.optimizer.excludedRules` in `SQLQueryTestSuite`.)
- Legacy Date/Timestamp Function Behavior: For some date/timestamp functions, Spark uses the configuration `spark.sql.legacy.timeParserPolicy=LEGACY` to control the output for invalid inputs. When `spark.sql.legacy.timeParserPolicy=LEGACY` is set, queries return NULL for invalid inputs; otherwise, an exception is thrown. Gluten's behavior is mostly consistent with `spark.sql.legacy.timeParserPolicy=LEGACY`. We need to update the documentation to explain the scope of the supported date/timestamp functions.
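As a sketch of the difference (following the behavior described above; the exact error class varies by Spark version):

```sql
-- Default policy: an invalid input, such as Feb 29 with no year, raises an error.
SELECT to_timestamp('02-29', 'MM-dd');

-- LEGACY policy: the same invalid input yields NULL, which is the behavior
-- Gluten is mostly consistent with.
SET spark.sql.legacy.timeParserPolicy=LEGACY;
SELECT to_timestamp('02-29', 'MM-dd');
```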
Below are the failing sample files and their SQL queries:
- string-functions.sql
  - select to_binary('GG')
  - select to_binary('01 AF', 'hex')
  - select to_binary(' ab cdef= = ', replace('HEX0', '0'))
- date.sql
  - select to_date("16", "dd")
  - select to_date("02-29", "MM-dd")
- datetime-parsing.sql / datetime-parsing-legacy.sql
  - select to_timestamp('009999', 'y')
  - select to_timestamp('99', 'yy')
  - select to_timestamp('009999', 'yyy')
  - select to_timestamp('9999', 'yyyy')
  - select to_timestamp('09999', 'yyyyy')
  - select to_timestamp('009999', 'yyyyyy')
  - select to_timestamp('9', 'D')
  - select to_timestamp('300', 'D')
  - select to_timestamp('09', 'DD')
  - select to_timestamp('99', 'DD')
  - select to_timestamp('009', 'DDD')
  - select to_timestamp('365', 'DDD')
  - select to_timestamp('31-365', 'dd-DDD')
  - select to_timestamp('12-365', 'MM-DDD')
  - select to_timestamp('12-31-365', 'MM-dd-DDD')
- datetime-parsing-invalid.sql
  - select to_timestamp('1', 'yy')
  - select to_timestamp('-12', 'yy')
  - select to_timestamp('123', 'yy')
  - select to_timestamp('1', 'yyy')
  - select to_timestamp('366', 'D')
  - select to_timestamp('9', 'DD')
  - select to_timestamp('9', 'DDD')
  - select to_timestamp('99', 'DDD')
  - select to_timestamp('30-365', 'dd-DDD')
  - select to_timestamp('11-365', 'MM-DDD')
  - select to_timestamp('12-30-365', 'MM-dd-DDD')
  - select to_timestamp('2020-01-365', 'yyyy-dd-DDD')
  - select to_timestamp('2020-10-350', 'yyyy-MM-DDD')
- datetime-legacy.sql
  - select to_timestamp("2019-10-06S10:11:12.12345", "yyyy-MM-dd'S'HH:mm:ss.SSSSSS")
  - select to_timestamp("12.12342019-10-06S10:11", "ss.SSSSyyyy-MM-dd'S'HH:mm")
  - select to_timestamp("12.1232019-10-06S10:11", "ss.SSSSyyyy-MM-dd'S'HH:mm")
  - select to_timestamp("12.1232019-10-06S10:11", "ss.SSSSyy-MM-dd'S'HH:mm")
  - select to_timestamp("12.1234019-10-06S10:11", "ss.SSSSy-MM-dd'S'HH:mm")
  - select to_timestamp("2019-10-06T10:11:12'12", "yyyy-MM-dd'T'HH:mm:ss''SSSS")
  - select to_timestamp("16", "dd")
  - select to_timestamp("02-29", "MM-dd")
  - select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE')
  - select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE')
  - select unix_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE')
- datetime-formatting.sql / datetime-formatting-legacy.sql
  - All queries fail. Exclude from the supported list.
- try_datetime_functions.sql
  - select try_to_timestamp("02-29", "MM-dd")
  - select try_to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE')
- try_cast.sql
  - SELECT TRY_CAST('1.23' AS int)
  - SELECT TRY_CAST('1.23' AS long)
  - SELECT TRY_CAST('-4.56' AS int)
  - SELECT TRY_CAST('-4.56' AS long)
- try_arithmetic.sql
  - SELECT try_add(1, (2147483647 + 1))
  - SELECT try_add(1L, (9223372036854775807L + 1L))
- regexp-functions.sql
  - SELECT regexp_extract_all('1a 2b 14m', '(\d+)?', 1)
  - SELECT regexp_extract_all('a 2b 14m', '(\d+)?', 1)
  - SELECT regexp_extract_all('a 2b 14m', '(\d+)?([a-z]+)', 1)
- window.sql
  - SELECT val_date, cate, avg(val_timestamp) OVER(PARTITION BY cate ORDER BY val_date RANGE BETWEEN CURRENT ROW AND interval '1-1' year to month FOLLOWING) FROM testData ORDER BY cate, val_date;
Spark version
Spark-3.5.x
Spark configurations
No response
System information
No response
Relevant logs