Add TRANSLATE, TYPEOF, and FIND_IN_SET function mappings#142

Merged
ranjanankur314 merged 54 commits into e6data:main from tkaunlaky-e6:learning_collabrative on Jul 31, 2025

@NiranjGaurav NiranjGaurav commented Jul 23, 2025

Summary

  • Added TYPEOF function mapping support
  • Added FIND_IN_SET function mapping from Databricks to E6
  • Note: TYPEOF function doesn't support custom datatypes created at runtime

Changes

  • Added TYPEOF and FIND_IN_SET to the supported functions list
  • Implemented FIND_IN_SET mapping in E6 dialect
  • Added comprehensive test cases for all three functions
  • Updated Databricks, Hive, and Spark dialects to support these functions

NOTE:

Two functions, SUBSTRING_INDEX and RANDSTR, have no direct or indirect mapping that I could find. Support for SUBSTRING_INDEX could be added, but how to handle a negative index still needs to be discussed.
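To make the negative-index question concrete, here is a minimal Python sketch of the semantics SUBSTRING_INDEX is generally documented to have in Spark and Databricks: a positive count keeps everything left of the count-th delimiter, a negative count keeps everything right of the count-th delimiter counted from the end. The helper name `substring_index` is illustrative only and not part of this PR; returning an empty string for an empty delimiter is an assumption, and overlapping delimiters may behave differently in the real engines.

```python
def substring_index(text: str, delim: str, count: int) -> str:
    """Model of SUBSTRING_INDEX(text, delim, count) for non-overlapping delimiters."""
    # Assumption: a zero count or an empty delimiter yields an empty string.
    if count == 0 or delim == "":
        return ""
    parts = text.split(delim)
    if count > 0:
        # Keep the first `count` pieces (everything left of the count-th delimiter).
        return delim.join(parts[:count])
    # Negative count: keep the last `abs(count)` pieces.
    return delim.join(parts[count:])


print(substring_index("www.apache.org", ".", 2))   # www.apache
print(substring_index("www.apache.org", ".", -2))  # apache.org
```

The negative branch is the part that would need an agreed E6 expression before a mapping could be added.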

NiranjGaurav and others added 5 commits July 22, 2025 16:30
- Implemented FindInSet expression class in expressions.py
- Added parser support in databricks.py FUNCTIONS dictionary
- Created E6 transformation using ARRAY_POSITION + SPLIT approach
- Added comprehensive tests for literal and column reference cases
- Updated supported functions JSON with E6 dialect support

FIND_IN_SET returns 1-based position of search string in comma-separated list,
or 0 if not found. E6 implementation uses ARRAY_POSITION(search, SPLIT(list, ','))
to achieve equivalent functionality.
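
The equivalence described in this commit message can be illustrated with a small pure-Python model. It is a sketch of the semantics only, not the code in e6.py: the helper names are hypothetical, the argument order follows the commit message (search value first), NULL handling is not modelled, and it assumes ARRAY_POSITION returns 0 when the element is absent.

```python
def array_position(search: str, items: list) -> int:
    """Model of ARRAY_POSITION: 1-based index of the first match, 0 if absent (assumed)."""
    for index, item in enumerate(items, start=1):
        if item == search:
            return index
    return 0


def find_in_set_via_split(search: str, csv_list: str) -> int:
    """Model of ARRAY_POSITION(search, SPLIT(list, ','))."""
    return array_position(search, csv_list.split(","))


print(find_in_set_via_split("b", "a,b,c"))    # 2
print(find_in_set_via_split("z", "a,b,c"))    # 0
print(find_in_set_via_split("a,b", "a,b,c"))  # 0
```

The last call shows that a search string containing a comma can never equal a single split element, so the model returns 0 for it.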
…ing. But TYPEOF function doesn't support custom datatypes created at runtime
"COUNT_IF"
"COUNT_IF",
"TRANSLATE",
"SPACE",

@NiranjGaurav SPACE is not supported directly in E6, right? Kindly check.

Author

@Adithyak-0926 I got confused. Should I list the functions that are supported, or the functions that are mapped? Please clarify and I will change it accordingly.

@NiranjGaurav A function should be added to the E6 supported functions list here if and only if it runs when executed directly on the E6 engine. If SPACE runs directly, you can add it; if not, remove it.
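
The rule in this comment can be sketched as a simple filter. Everything here is hypothetical: `runs_directly` stands in for whatever check confirms that a function executes on the E6 engine, and the set `natively_runnable` is made-up sample data, not the real E6 function list.

```python
from typing import Callable, Iterable, List


def filter_supported(candidates: Iterable[str],
                     runs_directly: Callable[[str], bool]) -> List[str]:
    """Keep only the functions that pass the 'runs directly on the engine' check."""
    return [name for name in candidates if runs_directly(name)]


# Made-up sample data for illustration; not the actual E6 capabilities.
natively_runnable = {"COUNT_IF", "TRANSLATE"}

supported = filter_supported(
    ["COUNT_IF", "TRANSLATE", "SPACE"],
    lambda name: name in natively_runnable,
)
print(supported)  # ['COUNT_IF', 'TRANSLATE']
```

A function that only works through a transpiler mapping would fail the predicate and stay out of the list.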

Author

@Adithyak-0926 OK, I will fix it now.

Author

@Adithyak-0926 Done, you can check.

"COUNT_IF"
"COUNT_IF",
"TRANSLATE",
"typeof"

@NiranjGaurav "typeof" should be written in capitals.
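
A casing slip like this one could be caught before review with a small normalization step. The sketch below is only an illustration: `normalize_function_names` is a hypothetical helper, and the flat list of strings is an assumption about how the entries might be represented once loaded from the JSON file.

```python
from typing import Iterable, List


def normalize_function_names(names: Iterable[str]) -> List[str]:
    """Upper-case every name and drop repeats, keeping the first-seen order."""
    seen = set()
    normalized = []
    for name in names:
        upper = name.upper()
        if upper not in seen:
            seen.add(upper)
            normalized.append(upper)
    return normalized


print(normalize_function_names(["COUNT_IF", "TRANSLATE", "typeof"]))
# ['COUNT_IF', 'TRANSLATE', 'TYPEOF']
```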

P2 - INTERVAL '5 hours 30 minutes' (as discussed on Zepto channel, you mentioned it has to be developed and you will check on it)
Tanay Kulkarni and others added 13 commits July 31, 2025 13:51
P2 - INTERVAL '5 hours 30 minutes' (as discussed on Zepto channel, you mentioned it has to be developed and you will check on it)
Adithyak-0926 and others added 27 commits July 31, 2025 13:54
…rted_functions_in_all_dialects.json, implemented tests for them and ran make check.
P2 - INTERVAL '5 hours 30 minutes' (as discussed on Zepto channel, you mentioned it has to be developed and you will check on it)
- Removed TimestampSeconds class from expressions.py to avoid unnecessary class proliferation
- Updated Databricks parser to map TIMESTAMP_SECONDS directly to UnixToTime with scale='seconds'
- Enhanced E6 generator from_unixtime_sql to handle both 'seconds' and 'milliseconds' scale parameters
- Added TIMESTAMP_SECONDS to E6 supported functions list
- All existing tests pass, confirming backward compatibility
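
The scale handling described in this commit message can be pictured with a short Python model. This is a sketch of the conversion semantics only, not the generator code in e6.py: the function name `unix_to_time` and the divisor table are illustrative assumptions, and only the two scale names mentioned in the commit ('seconds' and 'milliseconds') are modelled.

```python
from datetime import datetime, timezone

# Divisor applied to the raw epoch value for each supported scale.
SCALE_DIVISORS = {"seconds": 1, "milliseconds": 1000}


def unix_to_time(value: int, scale: str = "seconds") -> datetime:
    """Convert an epoch value in the given scale to a UTC timestamp."""
    if scale not in SCALE_DIVISORS:
        raise ValueError(f"unsupported scale: {scale!r}")
    return datetime.fromtimestamp(value / SCALE_DIVISORS[scale], tz=timezone.utc)


print(unix_to_time(0).isoformat())  # 1970-01-01T00:00:00+00:00
print(unix_to_time(1_700_000_000) == unix_to_time(1_700_000_000_000, "milliseconds"))  # True
```

Routing TIMESTAMP_SECONDS through the existing expression with an explicit scale, as the commit describes, avoids a separate class for each input unit.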
…ning_collabrative

# Conflicts:
#	apis/utils/supported_functions_in_all_dialects.json
#	sqlglot/dialects/databricks.py
#	sqlglot/dialects/e6.py
#	sqlglot/dialects/spark.py
#	sqlglot/expressions.py
#	tests/dialects/test_e6.py

@ranjanankur314 ranjanankur314 left a comment

Looks good to me

@ranjanankur314 ranjanankur314 merged commit 896c312 into e6data:main Jul 31, 2025
4 participants