Skip to content

fix bugs with expression virtual column indexes for expression virtual columns which refer to other virtual columns#15633

Merged
clintropolis merged 3 commits intoapache:masterfrom
clintropolis:bug-fix-expression-indexes
Jan 8, 2024
Merged

fix bugs with expression virtual column indexes for expression virtual columns which refer to other virtual columns#15633
clintropolis merged 3 commits intoapache:masterfrom
clintropolis:bug-fix-expression-indexes

Conversation

@clintropolis
Copy link
Copy Markdown
Member

@clintropolis clintropolis commented Jan 6, 2024

Description

Fixes some bugs with #15585 involving expression virtual columns which refer to other virtual columns, as well as an issue that can happen during SQL planning when extractionFn is specified for inputs to equality and null filters, but virtualColumnRegistry is null.

changes:

  • ColumnIndexSelector now extends ColumnSelector. The only real implementation of ColumnIndexSelector, ColumnSelectorColumnIndexSelector, already has a ColumnSelector, so this isn't very disruptive
  • removed getColumnNames from ColumnSelector since it was not used
  • VirtualColumns and VirtualColumn method getIndexSupplier now needs argument of ColumnIndexSelector instead of ColumnSelector, which allows expression virtual columns to correctly recognize other virtual columns, fixing an issue which would incorrectly handle other virtual columns as non-existent columns instead
  • fixed a bug with sql planner incorrectly not using expression filter for equality filters on columns with extractionFn and no virtual column registry

Release note


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

…l columns which refer to other virtual columns

changes:
* ColumnIndexSelector now extends ColumnSelector. The only real implementation of ColumnIndexSelector, ColumnSelectorColumnIndexSelector, already has a ColumnSelector, so this isn't very disruptive
* removed getColumnNames from ColumnSelector since it was not used
* VirtualColumns and VirtualColumn getIndexSupplier method now needs argument of ColumnIndexSelector instead of ColumnSelector, which allows expression virtual columns to correctly recognize other virtual columns, fixing an issue which would incorrectly handle other virtual columns as non-existent columns instead
* fixed a bug with sql planner incorrectly not using expression filter for equality filters on columns with extractionFn and no virtual column registry
@Nullable
default ColumnIndexSupplier getIndexSupplier(String columnName, ColumnSelector columnSelector)
default ColumnIndexSupplier getIndexSupplier(
String columnName,

Check notice

Code scanning / CodeQL

Useless parameter

The parameter 'columnName' is never used.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added columnName parameter to this method originally just in case stuff was using the 'usesDotNotation', unsure if the column name would be useful in such a situation, though perhaps it isn't really necessary and could be removed...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There hasn't been much use that comes up for dot notation other than the original map virtual column. The JSON stuff could have used it, but we ended up going the more standard JSON_VALUE way instead. IMO it's fine to say simplify this and say that dot-notation virtual columns don't get to supply indexes. I'm OK either way though really.

@clintropolis
Copy link
Copy Markdown
Member Author

code coverage failure is related to code that isn't run in default value mode so i think can be ignored

Copy link
Copy Markdown
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, although if you want to remove columnName from getIndexSupplier, that's ok by me too.

@clintropolis clintropolis merged commit df5bcd1 into apache:master Jan 8, 2024
@clintropolis clintropolis deleted the bug-fix-expression-indexes branch January 8, 2024 21:10
@LakshSingla LakshSingla added this to the 29.0.0 milestone Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants