Fix ODBC datetime literal parsing in IN clauses by tkaunlaky-e6 · Pull Request #1 · tkaunlaky-e6/sqlglot

tkaunlaky-e6 · 2025-08-07T11:35:54Z

Problem

ODBC datetime literals like {d '2025-05-31'} were failing to parse when used inside IN clauses in Databricks SQL queries, throwing "Expecting )" errors. The parser's _parse_primary() method would fall back to _parse_paren() which only handles L_PAREN tokens, not the L_BRACE tokens used by ODBC literals. This caused complex Databricks queries with multiple date literals in IN clauses to fail parsing.

Fix

Added ODBC literal detection directly in _parse_primary() method before falling back to _parse_paren(), checking for the exact pattern: {, followed by a valid ODBC type (d, t, or ts), followed by a STRING token. This leverages the existing _parse_odbc_datetime_literal() method that was already implemented but unreachable from the primary expression parsing path. The fix also ensures struct/map literals like {d: Map(...)} are not incorrectly identified as ODBC literals by verifying the third token is a STRING.
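
The three-token lookahead described above can be sketched in isolation. This is a minimal illustration, not the code from the PR: `Token`, `ODBC_TYPES`, and `looks_like_odbc_literal` are hypothetical names, the token kinds are plain strings rather than sqlglot's `TokenType` members, and the case-insensitive match on the type prefix is an assumption.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str  # hypothetical token kinds: "L_BRACE", "VAR", "STRING", "COLON", ...
    text: str


# ODBC escape prefixes named in the description: date, time, timestamp.
ODBC_TYPES = {"d", "t", "ts"}


def looks_like_odbc_literal(tokens: List[Token], pos: int) -> bool:
    """Return True if tokens[pos:pos + 3] is `{`, an ODBC type, then a string."""
    if pos + 2 >= len(tokens):
        return False  # fewer than three tokens left to inspect
    brace, odbc_type, value = tokens[pos : pos + 3]
    return (
        brace.kind == "L_BRACE"
        and odbc_type.kind == "VAR"
        and odbc_type.text.lower() in ODBC_TYPES  # assumption: prefix is case-insensitive
        and value.kind == "STRING"  # rules out struct/map literals such as {d: ...}
    )


# {d '2025-05-31'} matches the pattern.
odbc = [
    Token("L_BRACE", "{"),
    Token("VAR", "d"),
    Token("STRING", "2025-05-31"),
    Token("R_BRACE", "}"),
]

# {d: Map(...)} does not: the third token is a colon, not a string.
struct = [
    Token("L_BRACE", "{"),
    Token("VAR", "d"),
    Token("COLON", ":"),
    Token("VAR", "Map"),
]

print(looks_like_odbc_literal(odbc, 0))
print(looks_like_odbc_literal(struct, 0))
```

Checking the third token before consuming anything is what keeps the brace-delimited struct/map syntax on its existing parsing path.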

Testing

Added comprehensive test cases covering all ODBC datetime formats (date, time, timestamp)
Tests verify parsing in various SQL contexts: SELECT, WHERE, IN clauses, BETWEEN
All tests pass successfully with proper transpilation to E6 and other dialects

* feat(teradata): parse column format syntax
  Support Teradata FORMAT column syntax
  Add Teradata format tests
  Add comments and docs for Teradata FORMAT column
  Modified Comments
* style and linter modifications
* feat(optimizer)!: annotate type for SHA and SHA2 (tobymao#5346)
* chore(optimizer)!: annotate type SHA1, SHA256, SHA512 for BigQuery (tobymao#5347)
* additional linter cleanup

---------

Co-authored-by: Giorgos Michas <geomichas96@gmail.com>

…ning_collabrative

# Conflicts:
# apis/utils/supported_functions_in_all_dialects.json
# sqlglot/dialects/databricks.py
# sqlglot/dialects/e6.py
# sqlglot/dialects/spark.py
# sqlglot/expressions.py
# tests/dialects/test_e6.py

- Add TIMESTAMP_MILLIS and TIMESTAMP_MICROS functions to Spark parser (Spark 3.1+)
- Add TIMESTAMP_MILLIS and TIMESTAMP_MICROS functions to Databricks parser
- Convert microseconds to milliseconds by dividing by 1000 in E6 dialect
- Use proper UnixToTime scale constants for consistency across dialects

There were some merge conflicts when the rebase was merged, including changes in transforms.py. While resolving them I added some things on my branch that led to an error; that is also sorted now. I also ran make check.

…ONCAT_WS
- Replace hardcoded 'ARRAY' string with proper exp.Array expression node
- Use self.func('ARRAY_TO_STRING', ...) directly to avoid ARRAY_JOIN mapping
- Maintain all Databricks CONCAT_WS behaviors (NULL filtering, array flattening)
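
The two behaviors this commit says it preserves, NULL filtering and array flattening, can be illustrated with a small reference function. This is a sketch of the intended semantics only, not the E6 generator code: `concat_ws` here is a hypothetical plain-Python helper, `None` stands in for SQL NULL, and skipping NULL elements inside arrays is an assumption about the Databricks behavior.

```python
from typing import List, Optional, Union

# An argument is a string, NULL (None), or an array of possibly-NULL strings.
Arg = Union[Optional[str], List[Optional[str]]]


def concat_ws(sep: str, *args: Arg) -> str:
    """Join the arguments with `sep`, skipping NULLs and flattening arrays."""
    parts: List[str] = []
    for arg in args:
        if arg is None:
            continue  # NULL filtering: NULL arguments contribute nothing
        if isinstance(arg, list):
            # Array flattening: each non-NULL element becomes its own part.
            parts.extend(item for item in arg if item is not None)
        else:
            parts.append(arg)
    return sep.join(parts)


print(concat_ws("-", "a", None, ["b", None, "c"]))  # a-b-c
```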

Previously, ODBC datetime literals like {d '2025-05-31'} would fail to parse when used inside IN clauses, throwing 'Expecting )' errors. This occurred because _parse_primary() only called _parse_paren() for unmatched tokens, but _parse_paren() only handles L_PAREN tokens, not L_BRACE tokens used by ODBC literals.

The fix adds ODBC literal detection directly in _parse_primary() before falling back to _parse_paren(). This leverages the existing _parse_odbc_datetime_literal() method that was already implemented but unreachable from the primary expression parsing path.

Changes:
- Added ODBC literal check in _parse_primary() method (lines 5813-5821)
- Supports {d 'date'}, {t 'time'}, and {ts 'timestamp'} formats
- Works in all SQL contexts: SELECT, WHERE, IN clauses, etc.

Fixes parsing errors for Databricks queries with ODBC escape sequences in complex expressions.

Enhanced the ODBC datetime literal parsing to be more robust by checking for the exact pattern (L_BRACE, VAR, STRING, R_BRACE) to avoid false positives with struct/map literals.

Changes:
- Updated parser to verify the third token is STRING to distinguish ODBC literals from struct literals like {d: Map(...)}
- Added comprehensive test cases for all ODBC datetime formats (date, time, timestamp)
- Tests cover various SQL contexts: SELECT, WHERE, IN clauses, BETWEEN
- Included tests for single and multiple ODBC literals in IN clauses
- Added test case similar to the original complex Databricks query

This ensures ODBC datetime literals work correctly in all SQL contexts while avoiding conflicts with other brace-delimited syntax.

github-actions · 2025-08-07T11:40:47Z

Benchmark for `0bb5632`

| Test | Base | PR | % |
|------|------|----|---|
| long | 212.7±2.42µs | 216.8±2.07µs | +1.93% |

geooo109 and others added 30 commits July 3, 2025 18:15

feat(optimizer)!: annotate type for SHA and SHA2 (tobymao#5346)

0337c4d

chore(optimizer)!: annotate type SHA1, SHA256, SHA512 for BigQuery (tobymao#5347)

cc389fa

Feat: add case-insensitive uppercase normalization strategy (tobymao#5349)

835d9e6

feat(exasol): Add TO_CHAR function support in exasol dialect (tobymao#5350)

f80493e

Chore(exasol): clean up TO_CHAR

194850a

Fix!: preserve multi-arg DECODE function instead of converting to CASE (tobymao#5352)

509b741

Clean up Teradata FORMAT phrase logic

eeddeae

Feat(duckdb): support new lambda syntax (tobymao#5359)

eae64e1

* Feat(duckdb): support new lambda syntax
* Improve testing coverage, fix multi-arg version

feat(duckdb): Add support for SET VARIABLE (tobymao#5360)

e77991d

fix(optimizer): downstream column for PIVOT (tobymao#5363)

188d446

feat(optimizer)!: annotate type for CORR (tobymao#5364)

c1d3d61

feat(optimizer)!: annotate type for COVAR_POP (tobymao#5365)

c1e8677

feat(optimizer)!: annotate type for COVAR_SAMP (tobymao#5367)

e110ef4

feat(optimizer)!: annotate type for DATETIME (tobymao#5369)

5b59c16

* feat(optimizer)!: annotate type for DATETIME
* fix format

feat(optimizer)!: annotate type for ENDS_WITH (tobymao#5370)

47176ce

* feat(optimizer)!: annotate type for ENDS_WITH
* minor test refactor

feat(fabric): Ensure TIMESTAMPTZ is used with AT TIME ZONE (tobymao#5362)

1fd757e

* feat(fabric): Treat TIMESTAMPTZ as TIMESTAMP if not used with AT TIME ZONE
* fix(fabric): simplify TIMESTAMPTZ handling
* fix(fabric): Convert TIMESTAMPTZ to UTC if not within AT TIME ZONE

Cleanup fabric tests

800a82c

feat(optimizer)!: annotate type for LAG (tobymao#5371)

2cce53d

Feat!: improve transpilation of ROUND(x, y) to Postgres (tobymao#5368)

a3227de

* Override round for postgres generator
* Code style changes
* Include `ROUND(x, y)` test

---------

Co-authored-by: Jo <46752250+georgesittas@users.noreply.github.com>

Clean up Postgres ROUND logic

e705e8e

Chore: bump min. supported version to python 3.9 (tobymao#5353)

1abd461

docs: update API docs, CHANGELOG.md for v27.0.0 [skip ci]

debd616

fix(duckdb)!: week/quarter support (tobymao#5374)

d7ccb48

feat(optimizer)!: parse and annotate type for ASCII (tobymao#5377)

b368fba

* feat(optimizer)!: parse and annotate type for ASCII
* fix annotation

chore(postgres, hive): use ASCII node instead of UNICODE node (tobymao#5380)

71b1349

* chore(postgres, hive): use ASCII node instead of UNICODE node
* refactor tests

feat(optimizer)!: annotate type for UNICODE (tobymao#5381)

7f19b31

feat(dremio): Add TIME_MAPPING for Dremio dialect (tobymao#5378)

f035bf0

* feat(dremio): Add TIME_MAPPING for Dremio dialect
* Fix linter checks
* Address comments

---------

Co-authored-by: Mateusz Poleski <Mateusz.Poleski@imc.com>

Chore: improve error msg for PIVOT with missing aggregation

a5c2245

fix(snowflake): transpile bigquery CURRENT_DATE with timezone (tobymao#5387)

252469d

* fix(snowflake): transpile bigquery CURRENT_DATE with timezone
* PR feedback 1 (vag)
* fix test

NiranjGaurav and others added 28 commits July 31, 2025 13:57

JSON issue

bf7689b

P1 - SATURDAY & SUNDAY keyword issue

b78de56

P2 - INTERVAL '5 hours 30 minutes' (as discussed on Zepto channel, you mentioned it has to be developed and you will check on it)

P1 - SATURDAY & SUNDAY keyword issue

c920949

P2 - INTERVAL '5 hours 30 minutes' (as discussed on Zepto channel, you mentioned it has to be developed and you will check on it)

Ran make check and removed the comments added in the databricks parser for FIND_IN_SET

6d85489

Rebase issues solved and ran make check

cb0a212

Ran Make Check and improved logic for ARRAY_SLICE

cead918

Merge pull request e6data#142 from tkaunlaky-e6/learning_collabrative

896c312

Add TRANSLATE, TYPEOF, and FIND_IN_SET function mappings

[FIX]: Make check formatting

033bbf9

Merge branch 'main' into rebase_demo

c824763

Resolve merge conflicts with main branch

4141a20

- Merge statistical functions support (CORR, COVAR_POP, COVAR_SAMP, VARIANCE_SAMP, VAR_SAMP)
- Add GROUP BY ALL support
- Integrate TYPEOF, TIMEDIFF, and INTERVAL functions
- Update supported functions list for e6 dialect

Merge pull request e6data#139 from tkaunlaky-e6/mapping-statistical-functions

32065ce

Add GROUP BY ALL and statistical functions support

Merge pull request e6data#135 from tkaunlaky-e6/Timestamp_Millis

496b24b

Map TIMESTAMP_MILLIS to FROM_UNIXTIME

Mapped the QUARTER to EXTRACT and also ran make check.

ce03770

I also fixed the comma missed in the json file.

Merge branch 'main' into rebase_demo

4621f15

Merge pull request e6data#148 from e6data/rebase_demo

e92e8fb

Rebase demo

Merge branch 'main' into extract-issue

8ffb0c2

Merge pull request e6data#152 from e6data/extract-issue

ae0a6b5

EXTRACT- ISSUE

Added CONCAT_WS mapping

bd71e69

Added CONCAT_WS in apis/utils/supported_functions_in_all_dialects.json

0501b84

Remove CONCAT_WS

bafb043

Merge pull request e6data#137 from tkaunlaky-e6/Mapping-CONCAT_SW

21b7285

feat: implement CONCAT_WS function for E6 dialect

Fix the : issue

5c60446

tkaunlaky-e6 closed this Sep 19, 2025

tkaunlaky-e6 wants to merge 438 commits into main from fix-odbc-date-literals-in-clause

20 participants
