Monomorphic processing of TopN queries with 1 and 2 aggregators (key part of #3798) #3889
a2afcbc to befef43
…ShapeInspector, SpecializationService. Specialize topN queries with 1 or 2 aggregators. Add Cursor.advanceUninterruptibly() and isDoneOrInterrupted() for exception-free query processing.
f239fb0 to 76093b6
@leventov, nice work. I looked over the patch and am still reviewing.
@jihoonson thank you. Runtime inspection overhead is about 150 ns per
public IntSet.IntIterator clone()
{
  return invertedIndex.get(currRow);
  return new EmptyIntIterator();
Should clone return this, rather than allocating a new instance? It seems like the original intent was for EmptyIntIterator.INSTANCE to be the only object of its class.
I left it because the Object.clone() contract implies that a new object is returned. I think this is not very harmful. It could be fixed later if IntIterator is refactored to have a custom copy() method instead of inheriting Object.clone().
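A minimal sketch of the copy() idea mentioned above, assuming a hypothetical `CopyableIntIterator` interface (this is not the real IntIterator API). Because the empty iterator holds no state, copy() can safely return the shared instance, which clone() conventionally should not do.

```java
import java.util.NoSuchElementException;

// Hypothetical interface for illustration; not the real IntIterator.
interface CopyableIntIterator
{
  boolean hasNext();

  int next();

  // Unlike Object.clone(), copy() carries no expectation that a distinct object is returned.
  CopyableIntIterator copy();
}

// Hypothetical stateless singleton: every "copy" is the same shared instance.
final class EmptyCopyableIntIterator implements CopyableIntIterator
{
  static final EmptyCopyableIntIterator INSTANCE = new EmptyCopyableIntIterator();

  private EmptyCopyableIntIterator()
  {
  }

  @Override
  public boolean hasNext()
  {
    return false;
  }

  @Override
  public int next()
  {
    throw new NoSuchElementException();
  }

  @Override
  public CopyableIntIterator copy()
  {
    // No mutable state, so sharing the instance is safe and allocation-free.
    return this;
  }
}
```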
{
  private final String name;
  private final List<ColumnSelectorPlus<CardinalityAggregatorColumnSelectorStrategy>> selectorPlusList;
  private final ColumnSelectorPlus<CardinalityAggregatorColumnSelectorStrategy>[] selectorPluses;
In some places, you replace a selectorPlusList with an array (for performance?). Would it make sense to make the same change elsewhere? E.g., it looks like CardinalityAggregatorFactory specifically converts an array to a List, then passes the List to CardinalityAggregator, which just copies the List into a new array.
I replaced the List with an array because RuntimeShapeInspector accepts only arrays, not lists, as a series of objects. This is partly because RuntimeShapeInspector may need to accept a List as a "singular" field (e.g. to distinguish ArrayList from Collections.singletonList() or Guava's ImmutableList), and partly to enforce the conversion done in CardinalityAggregator and CardinalityBufferAggregator, yes, for performance.
Removed unnecessary array -> list -> array conversion in CardinalityAggregatorFactory.
…t and back to array in CardinalityAggregatorFactory
@jeffs thanks for the review. Also regarding
{
  if (prototypeClassBytecode == null) {
    ClassLoader cl = prototypeClass.getClassLoader();
    InputStream prototypeClassBytecodeStream = cl.getResourceAsStream(prototypeClassBytecodeName + ".class");
Need to close the stream.
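One way to address this is try-with-resources, sketched below. The class and method names (`BytecodeLoader`, `readClassBytes`) are hypothetical and not taken from the PR. Only the use of `getResourceAsStream` with a ".class" suffix follows the snippet above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not the PR's code: reads a class file's bytes and always closes the stream.
final class BytecodeLoader
{
  static byte[] readClassBytes(Class<?> prototypeClass)
  {
    // Resource names use '/' separators, e.g. "java/lang/String.class".
    String resourceName = prototypeClass.getName().replace('.', '/') + ".class";
    ClassLoader cl = prototypeClass.getClassLoader();
    if (cl == null) {
      // Classes loaded by the bootstrap loader report a null class loader.
      cl = ClassLoader.getSystemClassLoader();
    }
    // try-with-resources closes the stream on both normal and exceptional exit.
    try (InputStream in = cl.getResourceAsStream(resourceName)) {
      if (in == null) {
        throw new IllegalStateException("Class file not found: " + resourceName);
      }
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buffer = new byte[8192];
      int read;
      while ((read = in.read(buffer)) != -1) {
        out.write(buffer, 0, read);
      }
      return out.toByteArray();
    }
    catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

For example, `BytecodeLoader.readClassBytes(String.class)` returns the raw class file, whose first bytes are the 0xCAFEBABE magic number.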
  Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
  theUnsafe.setAccessible(true);
  UNSAFE = (Unsafe) theUnsafe.get(null);
} catch (Exception e) {
The catch block needs to be moved to one line below.
I don't understand this comment
I mean that a catch block should start on a new line. Please see <option name="CATCH_ON_NEW_LINE" value="true" /> here.
Going to add some tests.
@jihoonson addressed comments, testing specialization now.
.aggregators(Lists.<AggregatorFactory>newArrayList(QueryRunnerTestHelper.rowsCount))
.aggregators(duplicateAggregators(
    QueryRunnerTestHelper.rowsCount,
    new CountAggregatorFactory("rows1", "rows")
I think simply changing its name from rows to rows1 should work. The changes in CountAggregatorFactory are not necessary.
Good observation, reverted back.
 * to specialize class for the specific runtimeShape. The default value is chosen to be so that the specialized
 * class will likely be compiled with C2 HotSpot compiler with the default values of *BackEdgeThreshold options.
 */
private static final long triggerSpecializationIterationsThreshold =
Why is the type long when we are reading an int for sure?
Also, system properties are generally not used for configuration in Druid. Can you explain why they were necessary, e.g. fakeSpecialize too?
Or, more importantly, do users need to tune it?
Changed type of triggerSpecializationIterationsThreshold to int.
SpecializationService is not an established thing, so it's quite possible that when this code is battle-tested more, the design of SpecializationService will change so that triggerSpecializationIterationsThreshold doesn't make sense anymore or should have different semantics. So I don't want to expose it as an "official" configuration yet, because it would impose compatibility constraints.
fakeSpecialize is needed exclusively in Druid development; it allows analyzing the generated assembly with JITWatch. Users don't need to touch it.
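Reading an int-valued system property without exposing it as official configuration can be sketched with `Integer.getInteger`, which falls back to the default when the property is unset or not a valid number. The property name and default value below are placeholders, not the ones used in the PR.

```java
// Hypothetical holder class; the property name and default are placeholders for illustration.
final class SpecializationThresholds
{
  static final String PROPERTY = "exampleTriggerSpecializationIterationsThreshold";
  static final int DEFAULT_THRESHOLD = 10_000;

  static int readThreshold()
  {
    // Returns DEFAULT_THRESHOLD if the property is missing or cannot be parsed as an integer.
    return Integer.getInteger(PROPERTY, DEFAULT_THRESHOLD);
  }
}
```

Such a property would be passed on the JVM command line, for example `-DexampleTriggerSpecializationIterationsThreshold=500`.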
  }
}

private long addAndGetTotalIterations(long newIterations)
Can you add a comment that it does so for the last hour? It took me a bit to figure out.
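The "last hour" behavior can be illustrated with per-minute buckets. This is a sketch under assumptions, not the PR's implementation: the class name is hypothetical, and the current minute is passed in as a parameter to keep the example deterministic.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not the PR's code: counts iterations in per-minute buckets
// and reports the total over the most recent 60 minutes.
final class LastHourIterations
{
  private final ConcurrentMap<Long, AtomicLong> perMinute = new ConcurrentHashMap<>();

  /** Adds newIterations to the bucket for currentMinute and returns the total for the last hour. */
  long addAndGetTotal(long currentMinute, long newIterations)
  {
    perMinute.computeIfAbsent(currentMinute, minute -> new AtomicLong()).addAndGet(newIterations);
    long oldestKept = currentMinute - 59;
    // Drop buckets that have fallen out of the 60-minute window.
    perMinute.keySet().removeIf(minute -> minute < oldestKept);
    long total = 0;
    for (AtomicLong count : perMinute.values()) {
      total += count.get();
    }
    return total;
  }
}
```

Iterations recorded more than an hour ago stop contributing to the total, so a query shape that was hot yesterday does not trigger specialization today.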
    !Boolean.getBoolean("dontSpecializeGeneric1AggPooledTopN");
@VisibleForTesting
static boolean specializeGeneric2AggPooledTopN =
    !Boolean.getBoolean("dontSpecializeGeneric2AggPooledTopN");
Do you intend these to be user-configurable? In that case they should really be part of TopNQueryConfig.
It's not for users, it's for performance comparison in benchmarks
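The negated `Boolean.getBoolean` pattern in the snippet above gives an opt-out switch that defaults to enabled. A small sketch of that semantics follows, with a hypothetical helper class and whatever property name the caller supplies.

```java
// Hypothetical helper for illustration; the PR reads its flags directly in static fields.
final class BenchmarkSwitches
{
  static boolean isEnabled(String optOutProperty)
  {
    // Boolean.getBoolean is true only when the property exists and equals "true" (ignoring case),
    // so the feature stays on unless the opt-out property is explicitly set to true.
    return !Boolean.getBoolean(optOutProperty);
  }
}
```

A benchmark would then compare a default run against one started with the opt-out property set, e.g. `-DdontSpecializeGeneric1AggPooledTopN=true`.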
@himanshug any other comments?
@leventov LGTM aside from #3889 (review). However, this PR updates BufferAggregator to implement HotLoopCallee. BufferAggregator is a Druid extension point, and many users have custom aggregator extensions that will break with this. Given that change, this can only be released in 0.10.0 or 0.11.0.
I think this is too big to pull into 0.10.0; we branched it off already and the branch should get bug fixes only. We could jump directly to 0.11.0 if we need to, I guess. Although, would this really be incompatible with existing extension jars? HotLoopCallee has no methods, so does adding it to BufferAggregator require extensions to be recompiled?
@gianm Another way is to move to Java 8 in 0.10.1 and provide a default implementation of
Oh, I missed that. A default implementation sounds good if that means an extension compiled for 0.10.0 would work in 0.10.1 (sorry, I'm not that familiar with what interface changes will and will not require recompiles).
https://docs.oracle.com/javase/specs/jls/se8/html/jls-13.html#jls-13.5.6
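The default-implementation idea can be illustrated at the source level as follows. All names here are hypothetical stand-ins rather than Druid's real interfaces, and the method name is a placeholder, since the comment above does not show which method would get the default. The sketch only shows that an implementer written before the method existed still satisfies the interface; the binary-compatibility rules themselves are in the JLS section linked above.

```java
// Hypothetical stand-in for an interface that gains a method in a later release.
interface ShapeAware
{
  // Added with a default body, so existing implementers need no changes.
  default String describeShape()
  {
    return getClass().getSimpleName();
  }
}

// Hypothetical stand-in for an extension point that starts extending ShapeAware.
interface ExampleBufferAggregator extends ShapeAware
{
  long aggregate(long value);
}

// Plays the role of an extension written before describeShape() existed.
final class LegacySumAggregator implements ExampleBufferAggregator
{
  private long sum;

  @Override
  public long aggregate(long value)
  {
    sum += value;
    return sum;
  }
}
```

Default methods require Java 8 language support, which is why this option is tied to the Java 8 discussion in the thread.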
@leventov prior conversations were about requiring a Java 8 JVM, not about compiling all of Druid for a Java 8 target. Requiring a Java 8 target for all code would need community discussion.
@drcrallen why is compiling Druid for a Java 8 target riskier than requiring a Java 8 JVM?
…, for compatibility with extensions
@himanshug now
@himanshug could this PR be merged now?
👍
if (existingValue != null) {
  existingValue.addAndGet(newIterations);
}
perMinuteIterations.computeIfAbsent(currentMinute, AtomicLong::new).addAndGet(newIterations);
This line actually contains a bug, fixed here: 37b04c3
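The comment does not say what the bug is. One pitfall the quoted line does exhibit, assuming the map is keyed by `Long`, is that `AtomicLong::new` used as a mapping function binds to the `AtomicLong(long initialValue)` constructor, so a new counter starts at the key value instead of zero. The sketch below demonstrates that behavior with hypothetical class and method names; whether this is what the linked commit changed is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demonstration class, not taken from the PR.
final class ConstructorReferencePitfall
{
  static long seededByKey(long key, long delta)
  {
    Map<Long, AtomicLong> counters = new HashMap<>();
    // As a Function<Long, AtomicLong>, AtomicLong::new resolves to AtomicLong(long initialValue),
    // so the new counter starts at the key rather than at zero.
    return counters.computeIfAbsent(key, AtomicLong::new).addAndGet(delta);
  }

  static long startingFromZero(long key, long delta)
  {
    Map<Long, AtomicLong> counters = new HashMap<>();
    // An explicit lambda ignores the key and uses the no-arg constructor.
    return counters.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(delta);
  }
}
```

With a key of 100 and a delta of 5, the first method yields 105 while the second yields 5.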
- HotLoopCallee, @CalledFromHotLoop, RuntimeShapeInspector, StringRuntimeShape, SpecializationService
- Cursor.advanceUninterruptibly() and isDoneOrInterrupted() for exception-free query processing, to avoid deoptimization of compiled code.
- NoopAggregator and NoopBufferAggregator
- DistinctCountBufferAggregator: use Int2ObjectOpenHashMap instead of HashMap<Integer, ..>

The key part of #3798.
This PR has a minor intersection with #3883 (EmptyIntIterator added), but they can be merged in any order.