Skip to content

fix filtering nested field virtual column when used with non nested column input#13779

Merged
clintropolis merged 7 commits intoapache:masterfrom
clintropolis:fix-nested-virtual-column-index-stuff
Feb 9, 2023
Merged

fix filtering nested field virtual column when used with non nested column input#13779
clintropolis merged 7 commits intoapache:masterfrom
clintropolis:fix-nested-virtual-column-index-stuff

Conversation

@clintropolis
Copy link
Copy Markdown
Member

Description

This PR fixes an issue when filtering on a NestedFieldVirtualColumn that is using an input column that is not a nested column, such as a standard string, long, float, or double column. #13732 updated NestedFieldVirtualColumn to be permissive and still able to create selectors when used with other column types, but I forgot to add this handling to the getIndexSupplier method, so attempting to filter these columns with a filter which would typically use an index, would result in an exception being thrown about the column being the wrong type, something like

Column [__time] is invalid type, found [class org.apache.druid.segment.column.LongsColumn] instead of [NestedDataComplexColumn]

This PR fixes this method to use the same logic as the selector methods, returning the underlying columns index supplier directly if selecting the 'root' path, else null as if the column didn't exist.


This PR has:
  • been self-reviewed.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

theColumn.makeVectorValueSelector(offset),
capabilities.toColumnType(),
expectedType
expectedType != null ? expectedType : capabilities.toColumnType() // cast to itself in case the underlying column doesn't support object selector...
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here isn't adding any value for me?


if (capabilities != null) {
if (!capabilities.isPrimitive() && capabilities.isDictionaryEncoded().isTrue()) {
// if the underlying column is a nested column (and persisted to disk, re: the dictionary encoded check)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really have no better way of determining if we have been persisted than checking isDictionaryEncoded()? Given that we are already a NestedFieldVirtualColumn and supposedly working with a NestedField that we understand the class of, I wonder if we couldn't have something more specific. Maybe for another day, but perhaps we could have a NestedFieldColumnCapabilities that we could cast to and get access to more specific methods?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, im certain there is a cooler way to do this, especially when the underlying inspector is the segment, but i'll save that for another day

@clintropolis
Copy link
Copy Markdown
Member Author

i cancelled travis since it seems redundant

@clintropolis clintropolis merged commit ffeda72 into apache:master Feb 9, 2023
@clintropolis clintropolis deleted the fix-nested-virtual-column-index-stuff branch February 9, 2023 11:16
@clintropolis clintropolis added this to the 26.0 milestone Apr 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants