fix issue with nested column null value index incorrectly matching non-null values #13211

Merged
cheddar merged 1 commit into apache:master from clintropolis:nested-column-null-index-fix
Oct 11, 2022

@clintropolis
Member

Description

Fixes a mistake in NestedFieldLiteralColumnIndexSupplier when creating NullValueIndex: it always returned the bitmap for the first dictionary entry without first confirming that this entry is actually null. While entry 0 of the global value dictionary is always null, within the local dictionary of a nested literal column the first dictionary id must point to global id 0 for the first bitmap entry to be the null value bitmap. If it does not, an empty bitmap should be used, since the column contains no null values.
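The check described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Druid code: the class `NullIndexSketch`, the methods `nullRows`, `nullRowsBeforeFix`, and `rows`, and the use of `int[]` and `java.util.BitSet` in place of Druid's dictionary and bitmap types are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Illustrative stand-in only; these are not Druid's actual classes or signatures.
final class NullIndexSketch {
    // Global dictionary id 0 is always the null value (as stated in the description).
    static final int GLOBAL_DICTIONARY_NULL_ID = 0;

    /**
     * localToGlobalIds[i] is the global dictionary id that local dictionary entry i points to.
     * bitmaps.get(i) marks the rows that hold local dictionary entry i.
     * Returns the rows that are null, or an empty bitmap if the column has no null entry.
     */
    static BitSet nullRows(int[] localToGlobalIds, List<BitSet> bitmaps) {
        if (localToGlobalIds.length > 0 && localToGlobalIds[0] == GLOBAL_DICTIONARY_NULL_ID) {
            return (BitSet) bitmaps.get(0).clone();
        }
        // First local entry is a real value, so nothing in this column is null.
        return new BitSet();
    }

    /** The behavior before the fix: always treat the first bitmap as the null bitmap. */
    static BitSet nullRowsBeforeFix(List<BitSet> bitmaps) {
        return (BitSet) bitmaps.get(0).clone();
    }

    /** Helper that builds a bitmap with the given row ids set. */
    static BitSet rows(int... rowIds) {
        BitSet set = new BitSet();
        for (int rowId : rowIds) {
            set.set(rowId);
        }
        return set;
    }

    public static void main(String[] args) {
        // A column with no nulls: its first local entry points to global id 3, not 0.
        int[] noNulls = {3, 7};
        List<BitSet> bitmaps = Arrays.asList(rows(0, 2), rows(1));
        System.out.println("before fix: " + nullRowsBeforeFix(bitmaps));
        System.out.println("after fix:  " + nullRows(noNulls, bitmaps));
    }
}
```

In the sketch, the pre-fix variant reports the rows of the first non-null value as null, which is the kind of mismatch the description attributes to the bug.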

This fixes queries using IS NULL or IS NOT NULL filters on nested columns that do not contain a null value in every segment. The bug presents as a single value per segment being matched and returned in the results.

Screen Shot 2022-10-11 at 12 26 35 AM

Screen Shot 2022-10-11 at 12 26 50 AM


This PR has:

  • been self-reviewed.
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

// null index is always 0 in the global dictionary, even if there are no null rows in any of the literal columns
return (T) (NullValueIndex) () -> new SimpleImmutableBitmapIndex(bitmaps.get(0));
final BitmapColumnIndex nullIndex;
if (dictionary.get(0) == 0) {
Contributor

It would have been nicer to have these constants (the right-hand side) defined at the top, something like GLOBAL_DICTIONARY_NULL_INDEX. Some other day perhaps.

@cheddar cheddar merged commit 9688674 into apache:master Oct 11, 2022
@clintropolis clintropolis deleted the nested-column-null-index-fix branch October 12, 2022 00:12
clintropolis added a commit to clintropolis/druid that referenced this pull request Oct 28, 2022
@kfaraz kfaraz added this to the 24.0.1 milestone Nov 1, 2022
kfaraz pushed a commit that referenced this pull request Nov 1, 2022
* use object[] instead of string[] for vector expressions to be consistent with vector object selectors (#13209)

* fix issue with nested column null value index incorrectly matching non-null values (#13211)

* fix json_value sql planning with decimal type, fix vectorized expression math null value handling in default mode (#13214)