Skip to content

fix REGEXP_LIKE, CONTAINS_STRING, ICONTAINS_STRING for correct 3vl behavior#15963

Merged
clintropolis merged 1 commit intoapache:masterfrom
clintropolis:3vl-regex-contains-icontains
Mar 20, 2024
Merged

fix REGEXP_LIKE, CONTAINS_STRING, ICONTAINS_STRING for correct 3vl behavior#15963
clintropolis merged 1 commit intoapache:masterfrom
clintropolis:3vl-regex-contains-icontains

Conversation

@clintropolis
Copy link
Copy Markdown
Member

Description

Fixes regexp_like, contains_string, and icontains_string to return null instead of false in SQL compatible null handling mode to have correct three-value logic behavior. I did not put this behavior behind a flag, though if we are worried about the bug fix changing behavior I suppose we could associate it with NullHandling.useThreeValueLogic(), though kind of weird since that is currently tied to use for native filters, or ExpressionProcessing.useStrictBooleans() which traditionally controls the other expression related 3vl stuff.

Release note

REGEXP_LIKE, CONTAINS_STRING, and ICONTAINS_STRING correctly return null for null value inputs in SQL compatible null handling mode (the default configuration).


This PR has:

  • been self-reviewed.
  • a release note entry in the PR description.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

…tead of false for null inputs in sql compatible mode
@abhishekagarwal87
Copy link
Copy Markdown
Contributor

let's put it behind one of these feature flags, please :) My experience is that long-time users often adjust to this weird behavior and then complain about their workflows breaking.

@clintropolis
Copy link
Copy Markdown
Member Author

let's put it behind one of these feature flags, please :) My experience is that long-time users often adjust to this weird behavior and then complain about their workflows breaking.

The problem is that neither of these feature flags seem very appropriate. druid.generic.useThreeValueLogicForNativeFilters definitely seems not right because it has 'native filters' in the name. druid.expressions.useStrictBooleans also doesn't really feel right because the primary thing that config does is make 'LONG' be the standard boolean type, instead of a mix of boolean-ish types. It by side-effect has some three-value logic stuff that it fixed with AND and OR filters, but doesn't fully control SQL compatible three-value logic behavior for the rest of expressions, for the most part, that just happens by default if druid.generic.useDefaultValueForNull=false.

I'm pretty strongly opposed to adding yet another feature flag for a bug fix, it isn't really sustainable. This fix already only has any effect if druid.generic.useDefaultValueForNull=false...

Copy link
Copy Markdown
Contributor

@abhishekagarwal87 abhishekagarwal87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The SQL compatibility feature flag is good enough.

@clintropolis clintropolis merged commit 48b8d42 into apache:master Mar 20, 2024
@clintropolis clintropolis deleted the 3vl-regex-contains-icontains branch March 20, 2024 05:13
@adarshsanjeev adarshsanjeev added this to the 30.0.0 milestone May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants