Skip to content

fix sql compatible null handling config work with runtime.properties#8908

Closed
clintropolis wants to merge 3 commits intoapache:masterfrom
clintropolis:null-handling-config-fix-alt
Closed

fix sql compatible null handling config work with runtime.properties#8908
clintropolis wants to merge 3 commits intoapache:masterfrom
clintropolis:null-handling-config-fix-alt

Conversation

@clintropolis
Copy link
Copy Markdown
Member

Description

This PR is an alternate to #8876, which was just awful, but I was afraid of introducing an if statement to NullHandling. replaceWithDefault in something that is called often on the query path. However, curiosity got the best of me, so I added a new benchmark schema to help determine how much of a performance hit introducing an if statement. It turns it that it isn't really noticeable, so I'm closing #8876 in favor of this PR instead.

before PR:

Benchmark                                              (defaultStrategy)  (initialBuckets)  (numProcessingThreads)  (numSegments)  (queryGranularity)  (rowsPerSegment)  (schemaAndQuery)  (vectorize)  Mode  Cnt      Score      Error  Units
GroupByBenchmark.queryMultiQueryableIndexWithSerde                    v2                -1                       2              4                 all            100000    simple-nulls.A        false  avgt   15   8610.792 ±  260.568  us/op
GroupByBenchmark.queryMultiQueryableIndexWithSpilling                 v2                -1                       2              4                 all            100000    simple-nulls.A        false  avgt   15   7769.964 ±  184.177  us/op
GroupByBenchmark.queryMultiQueryableIndexX                            v2                -1                       2              4                 all            100000    simple-nulls.A        false  avgt   15   8424.455 ±  268.964  us/op
GroupByBenchmark.querySingleIncrementalIndex                          v2                -1                       2              4                 all            100000    simple-nulls.A        false  avgt   15  15782.469 ± 3313.596  us/op
GroupByBenchmark.querySingleQueryableIndex                            v2                -1                       2              4                 all            100000    simple-nulls.A        false  avgt   15   3978.851 ±  284.614  us/op

after PR:

Benchmark                                              (defaultStrategy)  (initialBuckets)  (numProcessingThreads)  (numSegments)  (queryGranularity)  (rowsPerSegment)  (schemaAndQuery)  (vectorize)  Mode  Cnt      Score      Error  Units
GroupByBenchmark.queryMultiQueryableIndexWithSerde                    v2                -1                       2              4                 all            100000    simple-nulls.A        false  avgt   15   8104.835 ±  275.995  us/op
GroupByBenchmark.queryMultiQueryableIndexWithSpilling                 v2                -1                       2              4                 all            100000    simple-nulls.A        false  avgt   15   8215.000 ±  228.912  us/op
GroupByBenchmark.queryMultiQueryableIndexX                            v2                -1                       2              4                 all            100000    simple-nulls.A        false  avgt   15   8294.836 ±  189.671  us/op
GroupByBenchmark.querySingleIncrementalIndex                          v2                -1                       2              4                 all            100000    simple-nulls.A        false  avgt   15  16709.884 ± 3195.983  us/op
GroupByBenchmark.querySingleQueryableIndex                            v2                -1                       2              4                 all            100000    simple-nulls.A        false  avgt   15   3882.968 ±   93.530  us/op

This PR has:

  • been self-reviewed.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths.
  • added integration tests.
  • been tested in a test Druid cluster.

Key changed/added classes in this PR
  • NullHandling

);
private static NullValueHandlingConfig INSTANCE;

private static volatile Boolean isReplaceWithDefault = null;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure having a volatile static is a good idea for something that will be used by multiple threads while processing every row. Your benchmark checked single-threaded performance but this could degrade under multithreading.

Could you please either look into this, or do a hybrid of your last two approaches (require explicit initialization in tests, of a non-volatile static, and if-guard it in methods that use it).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion, maybe the explicit initialization is better in order to rid magic from our world, closing this PR and re-opening #8876

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants