Monomorphic processing of TopN queries with 1 and 2 aggregators (key part of #3798) by leventov · Pull Request #3889 · apache/druid

leventov · 2017-01-28T06:41:06Z

Monomorphic processing: add HotLoopCallee, @CalledFromHotLoop, RuntimeShapeInspector, StringRuntimeShape, SpecializationService
Specialized topN queries with 1 or 2 aggregators
Added Cursor.advanceUninterruptibly() and isDoneOrInterrupted() for exception-free query processing, to avoid deoptimization of compiled code.
Made some stateless classes singletons, such as NoopAggregator and NoopBufferAggregator
Optimization of DistinctCountBufferAggregator: use Int2ObjectOpenHashMap instead of HashMap<Integer, ..>
Removed several unused classes
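
The exception-free loop mentioned in the list above could look roughly like the sketch below. Only the method names `advanceUninterruptibly()` and `isDoneOrInterrupted()` come from this description; `ArrayCursor`, `HotLoop`, and the use of `IllegalStateException` are hypothetical stand-ins, not Druid's actual classes.

```java
// Hypothetical minimal cursor; only the two method names mirror the PR description.
final class ArrayCursor {
  private final int[] rows;
  private int pos;

  ArrayCursor(int[] rows) {
    this.rows = rows;
  }

  int current() {
    return rows[pos];
  }

  // Advances without checking the interrupt flag and without throwing.
  void advanceUninterruptibly() {
    pos++;
  }

  boolean isDone() {
    return pos >= rows.length;
  }

  // A single boolean check replaces an exception thrown from inside the hot loop.
  boolean isDoneOrInterrupted() {
    return isDone() || Thread.currentThread().isInterrupted();
  }
}

final class HotLoop {
  static long sum(ArrayCursor cursor) {
    long total = 0;
    while (!cursor.isDoneOrInterrupted()) {
      total += cursor.current();
      cursor.advanceUninterruptibly();
    }
    // The interruption is reported once, outside the loop body.
    if (!cursor.isDone()) {
      throw new IllegalStateException("interrupted before the cursor was exhausted");
    }
    return total;
  }
}
```

Keeping the throw outside the loop means the loop body itself has no exceptional exit.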

The key part of #3798.

This PR has minor intersection with #3883 (EmptyIntIterator added), but they can be merged in any order.

jihoonson · 2017-02-02T10:46:53Z

@leventov, nice work. I looked over the patch and am still reviewing.
So far, the patch looks good and I have only one question. How large is the runtime inspection overhead?

leventov · 2017-02-02T23:14:13Z

@jihoonson thank you.

Runtime inspection overhead is about 150 ns per PooledTopNAlgorithm.scanAndAggregate() call.

jeffs · 2017-02-02T20:05:51Z

+  public IntSet.IntIterator clone()
  {
-    return invertedIndex.get(currRow);
+    return new EmptyIntIterator();


Should clone return this, rather than allocating a new instance? It seems like the original intent was for EmptyIntIterator.INSTANCE to be the only object of its class.

I left it because Object.clone() contract implies that a new object is returned. I think this is not very harmful. This could be fixed later, if IntIterator is refactored to have a custom copy() method instead of inheriting Object.clone().
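
The trade-off in this exchange can be sketched as follows. `EmptyIterator` is a hypothetical stand-in (not Druid's `EmptyIntIterator`), and `copy()` is the kind of custom method the reply suggests as a possible later refactoring.

```java
// Hypothetical stand-in for a stateless iterator; not Druid's EmptyIntIterator.
final class EmptyIterator implements Cloneable {
  static final EmptyIterator INSTANCE = new EmptyIterator();

  private EmptyIterator() {
  }

  boolean hasNext() {
    return false;
  }

  // Follows the usual Object.clone() expectation that x.clone() != x.
  @Override
  public EmptyIterator clone() {
    return new EmptyIterator();
  }

  // A custom copy method carries no such expectation, so it can share the singleton.
  EmptyIterator copy() {
    return this;
  }
}
```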

jeffs · 2017-02-06T23:47:09Z

 {
  private final String name;
-  private final List<ColumnSelectorPlus<CardinalityAggregatorColumnSelectorStrategy>> selectorPlusList;
+  private final ColumnSelectorPlus<CardinalityAggregatorColumnSelectorStrategy>[] selectorPluses;


In some places, you replace a selectorPlusList with an array (for performance?). Would it make sense to make the same change elsewhere? E.g., it looks like CardinalityAggregatorFactory specifically converts an array to a List, then passes the List to CardinalityAggregator, which just copies the List into a new array.

I replaced List with an array because RuntimeShapeInspector accepts only arrays as series of objects, not lists. This is partly because RuntimeShapeInspector may need to accept a List as a "singular" field (e.g. to distinguish ArrayList from Collections.singletonList() or Guava's ImmutableList), and partly to enforce conversions like the ones made in CardinalityAggregator and CardinalityBufferAggregator, yes, for performance.

Removed unnecessary array -> list -> array conversion in CardinalityAggregatorFactory.
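
To illustrate the distinction between a "singular" field and a series of objects, here is a minimal sketch. `ShapeInspector` and its `visit` signatures are simplified assumptions for illustration and are not the actual RuntimeShapeInspector API.

```java
// Simplified, hypothetical stand-in for RuntimeShapeInspector.
final class ShapeInspector {
  private final StringBuilder sb = new StringBuilder();

  // A singular field: only its concrete class contributes to the shape,
  // so an ArrayList and a Collections.singletonList() give different shapes.
  void visit(String fieldName, Object value) {
    sb.append(fieldName).append('=').append(className(value)).append(';');
  }

  // A series of objects is accepted only as an array and is visited element by element.
  void visit(String fieldName, Object[] values) {
    sb.append(fieldName).append("=[");
    for (Object v : values) {
      sb.append(className(v)).append(',');
    }
    sb.append("];");
  }

  private static String className(Object o) {
    return o == null ? "null" : o.getClass().getSimpleName();
  }

  String shape() {
    return sb.toString();
  }
}

final class ShapeDemo {
  static String shapeOf(Object single, Object[] series) {
    ShapeInspector inspector = new ShapeInspector();
    inspector.visit("single", single);
    inspector.visit("series", series);
    return inspector.shape();
  }
}
```

With separate overloads, a caller that holds a List must decide explicitly whether it is one field or a series, which is the ambiguity the reply describes.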

leventov · 2017-02-07T00:46:52Z

@jeffs thanks for the review. Also, regarding @CalledFromHotLoop: I decided to annotate only interface methods, not implementation methods, because IntelliJ doesn't copy the annotation automatically and it's too tedious to insert it manually everywhere.

jihoonson

Thanks @leventov, 150 ns per call sounds reasonable.
I left some comments, and it would be great if the class specialization is tested as well. Maybe TopNQueryRunnerTest can be run with/without the class specialization.

jihoonson · 2017-02-07T01:45:32Z

+    {
+      if (prototypeClassBytecode == null) {
+        ClassLoader cl = prototypeClass.getClassLoader();
+        InputStream prototypeClassBytecodeStream = cl.getResourceAsStream(prototypeClassBytecodeName + ".class");


Need to close the stream.
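
One common way to address this is try-with-resources. The sketch below is an assumption about how such a fix might look, not the code committed in this PR; `BytecodeReader` and `readBytecode` are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class BytecodeReader {
  // Reads the .class resource of the given class; the stream is closed even if reading fails.
  static byte[] readBytecode(Class<?> cls) {
    String resource = cls.getName().replace('.', '/') + ".class";
    ClassLoader cl = cls.getClassLoader();
    if (cl == null) {
      // Classes loaded by the bootstrap loader report a null class loader.
      cl = ClassLoader.getSystemClassLoader();
    }
    try (InputStream in = cl.getResourceAsStream(resource)) {
      if (in == null) {
        throw new IOException("Resource not found: " + resource);
      }
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buffer = new byte[8192];
      int read;
      while ((read = in.read(buffer)) != -1) {
        out.write(buffer, 0, read);
      }
      return out.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```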

jihoonson · 2017-02-07T01:47:03Z

+      Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
+      theUnsafe.setAccessible(true);
+      UNSAFE = (Unsafe) theUnsafe.get(null);
+    } catch (Exception e) {


The catch block needs to be moved to one line below.

I don't understand this comment

I mean, a catch block should start on a new line. Please see <option name="CATCH_ON_NEW_LINE" value="true" /> here.

@jihoonson thanks, fixed

leventov · 2017-02-07T02:53:33Z

Going to add some tests.

leventov · 2017-02-08T23:02:13Z

@jihoonson addressed comments, testing specialization now.

jihoonson

@leventov, thanks for the update. I left one more comment. Everything else looks good.

jihoonson · 2017-02-09T02:40:13Z

-        .aggregators(Lists.<AggregatorFactory>newArrayList(QueryRunnerTestHelper.rowsCount))
+        .aggregators(duplicateAggregators(
+            QueryRunnerTestHelper.rowsCount,
+            new CountAggregatorFactory("rows1", "rows")


I think simply changing its name from rows to rows1 should work. The changes in CountAggregatorFactory are not necessary.

Good observation, reverted back.

himanshug · 2017-03-09T21:42:30Z

+   * to specialize class for the specific runtimeShape. The default value is chosen to be so that the specialized
+   * class will likely be compiled with C2 HotSpot compiler with the default values of *BackEdgeThreshold options.
+   */
+  private static final long triggerSpecializationIterationsThreshold =


why is type long when we are reading an int for sure?

also generally system properties are not used for configuration in druid? can you explain why they were necessary e.g. fakeSpecialize too ?

or, more importantly, do users need to tune it?

Changed type of triggerSpecializationIterationsThreshold to int.

SpecializationService is not an established thing, so it's quite possible that when this code is battle-tested more, the design of SpecializationService will change so that triggerSpecializationIterationsThreshold no longer makes sense or needs different semantics. So I don't want to expose it as an "official" configuration yet, because that would impose compatibility constraints.

fakeSpecialize is needed exclusively for Druid development; it makes it possible to analyze the generated assembly with JITWatch. Users don't need to touch it.
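
For readers unfamiliar with this style of development-only switch, a minimal sketch follows. The property name `demo.iterationsThreshold` and the default of 10,000 are illustrative assumptions; `fakeSpecialize` and `dontSpecializeGeneric1AggPooledTopN` are the names mentioned in this thread. Unlike the static fields in the PR, the sketch reads the properties on every call.

```java
final class DevFlags {
  // Integer.getInteger parses the system property, falling back to the default if unset or invalid.
  // The property name and the default value are hypothetical.
  static int iterationsThreshold() {
    return Integer.getInteger("demo.iterationsThreshold", 10_000);
  }

  // Boolean.getBoolean is true only if the property is set to "true" (ignoring case).
  static boolean fakeSpecialize() {
    return Boolean.getBoolean("fakeSpecialize");
  }

  // A negated "dont..." flag keeps the feature enabled unless explicitly switched off.
  static boolean specializeGeneric1Agg() {
    return !Boolean.getBoolean("dontSpecializeGeneric1AggPooledTopN");
  }
}
```

Such flags would be passed on the JVM command line, e.g. `-DfakeSpecialize=true`.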

himanshug · 2017-03-09T22:14:30Z

+      }
+    }
+
+    private long addAndGetTotalIterations(long newIterations)


can u add a comment, that it does so for last one hour... took me a bit to figure out
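
A rough sketch of what "total iterations over the last hour" could mean, using per-minute buckets. This is an assumption for illustration only: `HourlyIterationCounter` is a hypothetical class, and the current minute is passed in explicitly so the behavior is easy to follow.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

final class HourlyIterationCounter {
  // One bucket per minute, keyed by the minute number.
  private final ConcurrentSkipListMap<Long, AtomicLong> perMinuteIterations =
      new ConcurrentSkipListMap<>();

  // Adds newIterations to the current minute's bucket and returns the total
  // over the last 60 minutes, including the current one.
  long addAndGetTotalIterations(long currentMinute, long newIterations) {
    perMinuteIterations
        .computeIfAbsent(currentMinute, minute -> new AtomicLong())
        .addAndGet(newIterations);
    // Drop buckets that are older than one hour (headMap's bound is exclusive).
    perMinuteIterations.headMap(currentMinute - 59).clear();
    long total = 0;
    for (AtomicLong bucket : perMinuteIterations.values()) {
      total += bucket.get();
    }
    return total;
  }
}
```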

himanshug · 2017-03-09T22:15:57Z

+      !Boolean.getBoolean("dontSpecializeGeneric1AggPooledTopN");
+  @VisibleForTesting
+  static boolean specializeGeneric2AggPooledTopN =
+      !Boolean.getBoolean("dontSpecializeGeneric2AggPooledTopN");


do you intend these to be user configurable, in that case they should be part of TopNQueryConfig really

It's not for users, it's for performance comparison in benchmarks

leventov · 2017-03-09T23:34:36Z

@himanshug any other comments?

himanshug · 2017-03-10T21:00:10Z

@leventov LGTM aside from #3889 (review)

However, this PR updates BufferAggregator to implement HotLoopCallee. BufferAggregator is a Druid extension point, and many users have custom aggregator extensions that will break with this change. Given that, this can only be released in 0.10.0 or 0.11.0.
We haven't released 0.10.0 yet, so we can try to pull this one into the 0.10.0 milestone. @gianm any objections?

gianm · 2017-03-10T22:27:05Z

I think this is too big to pull into 0.10.0, we branched it off already and the branch should get bug fixes only. We could jump directly to 0.11.0 if we need to I guess.

Although, would this really be incompatible with existing extension jars? HotLoopCallee has no methods, so does adding it to BufferAggregator require extensions to be recompiled?

leventov · 2017-03-10T22:30:44Z

@gianm HotLoopCallee has method inspectRuntimeShape().

Another way is to move to Java 8 in 0.10.1 and provide a default implementation of inspectRuntimeShape() in BufferAggregator.

gianm · 2017-03-11T01:21:22Z

Oh, I missed that. A default implementation sounds good if that means an extension compiled for 0.10.0 would work in 0.10.1 (sorry, I'm not that familiar with what interface changes will and will not require recompiles).

leventov · 2017-03-11T02:03:17Z

@gianm

https://docs.oracle.com/javase/specs/jls/se8/html/jls-13.html#jls-13.5.6

In other words, adding a default method is a binary-compatible change because it does not introduce errors at link time.
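
A minimal sketch of the idea, with hypothetical, simplified interfaces (`Inspector`, `Callee`, `BufferAgg`); only the method name `inspectRuntimeShape` comes from this thread, and the empty default body is an assumption. Since everything here is compiled together, the sketch shows only that an implementation without the method is still valid; the binary-compatibility claim itself rests on the JLS section linked above.

```java
// Hypothetical, simplified shapes of the interfaces discussed in the thread.
interface Inspector {
  void visit(String fieldName, Object value);
}

interface Callee {
  void inspectRuntimeShape(Inspector inspector);
}

interface BufferAgg extends Callee {
  void aggregate();

  // Default body: implementations written before this method existed need not define it.
  @Override
  default void inspectRuntimeShape(Inspector inspector) {
    // Intentionally empty: reports nothing about the implementation's shape.
  }
}

// Stands in for an extension class written against the old interface.
final class LegacyExtensionAgg implements BufferAgg {
  int calls;

  @Override
  public void aggregate() {
    calls++;
  }
}
```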

drcrallen · 2017-03-11T17:15:43Z

@leventov prior conversations were about requiring a java8 JVM, not about compiling all of druid for java 8 target. Requiring a java8 target for all code would need community discussion.

leventov · 2017-03-11T17:46:07Z

@drcrallen why is compiling Druid for a Java 8 target riskier than requiring a Java 8 JVM?

leventov · 2017-03-15T06:30:25Z

@himanshug now BufferAggregator.inspectRuntimeShape() has default implementation.

leventov · 2017-03-17T08:49:10Z

@himanshug could this PR be merged now?

himanshug · 2017-03-17T19:44:32Z

👍

leventov · 2017-03-22T03:29:46Z

-        if (existingValue != null) {
-          existingValue.addAndGet(newIterations);
-        }
+        perMinuteIterations.computeIfAbsent(currentMinute, AtomicLong::new).addAndGet(newIterations);


This line actually contains a bug, fixed here: 37b04c3
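
The comment does not say what the bug was. One pitfall that this exact expression does have in Java is that `AtomicLong::new`, used as a `Function<Long, AtomicLong>`, binds to the `AtomicLong(long initialValue)` constructor, so a new counter starts at the key value instead of zero. The sketch below demonstrates that behavior; whether it is what 37b04c3 fixed is an assumption, and `ComputeIfAbsentPitfall` is a hypothetical class.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

final class ComputeIfAbsentPitfall {
  // The constructor reference receives the key, so the counter is initialized to the key.
  static long withConstructorRef(long key, long delta) {
    ConcurrentMap<Long, AtomicLong> counters = new ConcurrentHashMap<>();
    return counters.computeIfAbsent(key, AtomicLong::new).addAndGet(delta);
  }

  // The lambda ignores the key, so the counter starts at zero.
  static long withLambda(long key, long delta) {
    ConcurrentMap<Long, AtomicLong> counters = new ConcurrentHashMap<>();
    return counters.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(delta);
  }
}
```

For example, with key 100 and delta 5 the first variant returns 105, while the second returns 5.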

leventov force-pushed the monomorphic-processing branch from a2afcbc to befef43 Compare January 28, 2017 07:33

leventov added the Performance label Jan 28, 2017

leventov assigned jon-wei Feb 1, 2017

leventov added 2 commits February 1, 2017 16:55

Monomorphic processing: add HotLoopCallee, CalledFromHotLoop, Runtime…

d6a7b33

…ShapeInspector, SpecializationService. Specialize topN queries with 1 or 2 aggregators. Add Cursor.advanceUninterruptibly() and isDoneOrInterrupted() for exception-free query processing.

Use Execs.singleThreaded()

76093b6

leventov force-pushed the monomorphic-processing branch from f239fb0 to 76093b6 Compare February 1, 2017 22:57

leventov and others added 2 commits February 3, 2017 19:26

Merge branch 'master' into monomorphic-processing

9964299

RuntimeShapeInspector to support nullable fields

3340f0f

jeffs approved these changes Feb 6, 2017

leventov added 2 commits February 6, 2017 18:25

Make CalledFromHotLoop annotation Inherited

1b9a658

Remove unnecessary conversion of array of ColumnSelectorPluses to lis…

86c927d

…t and back to array in CardinalityAggregatorFactory

jihoonson requested changes Feb 7, 2017

Close InputStream in SpecializationService

661e9ec

leventov added the WIP label Feb 7, 2017

leventov added 5 commits February 8, 2017 15:14

Formatting

9b33411

Test specialized PooledTopNScanners

b249a09

Set flags in PooledTopNAlgorithm directly

00fddcd

Fix tests, dependent on CountAggragatorFactory toString() form

4652266

Fix

94c93b1

leventov removed the WIP label Feb 8, 2017

jihoonson reviewed Feb 9, 2017

leventov and others added 4 commits February 9, 2017 11:59

Revert CountAggregatorFactory changes

519a2b0

Merge branch 'master' into monomorphic-processing

7af0f4e

Implement inspectRuntimeShape() for LongWrappingDimensionSelector and…

14dc4da

… FloatWrappingDimensionSelector

Merge branch 'master' into monomorphic-processing

c6d1b8a

himanshug reviewed Mar 9, 2017

leventov added 2 commits March 9, 2017 15:58

Make triggerSpecializationIterationsThreshold an int

f1411f7

Remove SpecializationService.PerPrototypeClassState.of()

cc54534

himanshug reviewed Mar 9, 2017

Add comments

ae00e9b

Limit the amount of specializations that SpecializationService could …

de8d713

…make

leventov mentioned this pull request Mar 11, 2017

Allow compilation as Java8 source and target #3328

Merged

leventov mentioned this pull request Mar 13, 2017

Mark interfaces which are Druid extension points in code #4044

Closed

leventov added 2 commits March 14, 2017 22:26

Merge remote-tracking branch 'upstream/master' into monomorphic-proce…

205025f

…ssing

Add default implementation for BufferAggregator.inspectRuntimeShape()…

f0110cc

…, for compatibility with extensions

Use more efficient ConcurrentMap's idioms in SpecializationService

77c3981

himanshug merged commit 84fe91b into apache:master Mar 17, 2017

leventov mentioned this pull request Mar 18, 2017

Monomorphic processing of TopN queries with simple double aggregators over historical segments (part of #3798) #4079

Merged

leventov deleted the monomorphic-processing branch March 21, 2017 21:35

leventov commented Mar 22, 2017

clambertus unassigned himanshug and jon-wei Jul 6, 2018

gianm mentioned this pull request Jan 8, 2019

Query vectorization. #6794

Merged
