Monomorphic processing of TopN queries with simple double aggregators over historical segments (part of #3798)#4079
Conversation
… and historical segments
…bleBufferAggregator
|
When we reach the global limit here, we begin to spam to logs on each call |
|
@egor-ryashin thanks, fixed to emit only once. |
| @Override | ||
| public void inspectRuntimeShape(RuntimeShapeInspector inspector) | ||
| { | ||
| inspector.visit("theBuffer", theBuffer); |
There was a problem hiding this comment.
Is it ok that theBuffer is not final and changed afterwards?
There was a problem hiding this comment.
Changed theBuffer to be final
| @Override | ||
| public void inspectRuntimeShape(RuntimeShapeInspector inspector) | ||
| { | ||
| inspector.visit("baseList", baseList); |
There was a problem hiding this comment.
What about the clazz field?
There was a problem hiding this comment.
It's not used in get() (the only @CalledFromHotLoop method in Indexed), so it shouldn't be visited in inspectRuntimeShape()
There was a problem hiding this comment.
I would create an annotation like:
@Inspect
private final List<T> baseList;
make an abstract super class with a method
inspectRuntimeShape(RuntimeShapeInspector inspector) {
for(Field field : this.getClass().getDeclaredFields()){
field.isAnnotationPresent(Inspect.class);
field.setAccessible(true);
inspector.visit(field.getName(), field.get(this))
}
}
and forget about overriding methods.
It's up to you though.
There was a problem hiding this comment.
It would make it too easy to forget to annotate some fields. For the same reason inspectRuntimeShape() method doesn't have default Java 8 implementation with empty body, although it is always correct and a lot of implementations just do that: to force developers to implement and think about the runtime shape.
It's just my opinion, though. Maybe it makes sense to add annotation-based runtime shape inspection in some form. But anyway, not as part of this PR.
| public abstract void aggregate(ByteBuffer buf, int position, double value); | ||
|
|
||
| @Override | ||
| public final void aggregate(ByteBuffer buf, int position) |
There was a problem hiding this comment.
Should this method be annotated with @CalledFromHotLoop?
There was a problem hiding this comment.
No, because this method is annotated in BufferAggregator
| Historical1AggPooledTopNScanner defaultHistoricalSingleValueDimSelector1SimpleDoubleAggScanner = | ||
| new HistoricalSingleValueDimSelector1SimpleDoubleAggPooledTopNScannerPrototype(); | ||
|
|
||
| private final Capabilities capabilities; |
There was a problem hiding this comment.
Why do we need additional ref to Capabilities here?
There was a problem hiding this comment.
It is used to create BaseArrayProvider in makeInitParams().
There was a problem hiding this comment.
Capabilities is protected field in BaseTopNAlgorithm
| SpecializationService.getSpecializationState( | ||
| prototypeScanner.getClass(), | ||
| runtimeShape, | ||
| ImmutableMap.<Class<?>, Class<?>>of(Offset.class, historicalCursor.getOffset().getClass()) |
There was a problem hiding this comment.
<Class<?>, Class<?>> could be removed
| ) | ||
| { | ||
| return getSpecializationState(prototypeClass, runtimeShape, ImmutableMap.<Class<?>, Class<?>>of()); | ||
| return getSpecializationState(prototypeClass, runtimeShape, ImmutableMap.of()); |
There was a problem hiding this comment.
is synchronization required on call to perPrototypeClassState.get() from multiple threads? or call to computeValue()?
Could you please clarify, why WindowedLoopIterationCounter#getSpecialized() always returns null?
There was a problem hiding this comment.
is synchronization required on call to perPrototypeClassState.get() from multiple threads? or call to computeValue()?
According to ClassValue documentation, it should be safe: "The actual installation of the value on the class is performed atomically. At that point, if several racing threads have computed values, one is chosen, and returned to all the racing threads."
There was a problem hiding this comment.
Could you please clarify, why WindowedLoopIterationCounter#getSpecialized() always returns null?
Added comment
| @Override | ||
| public void inspectRuntimeShape(RuntimeShapeInspector inspector) | ||
| { | ||
| inspector.visit("delegate", delegate); |
There was a problem hiding this comment.
should we add lookupCacheSize to runtimeShape?
| } | ||
|
|
||
| @Override | ||
| public void inspectRuntimeShape(RuntimeShapeInspector inspector) |
There was a problem hiding this comment.
baseArray is not inspected, but is used in get()
There was a problem hiding this comment.
It's used just for indexing, there should be no difference for Hotspot for any Object array type. So no need to distinguish between String[] and Object[], for example.
| @Override | ||
| public void inspectRuntimeShape(RuntimeShapeInspector inspector) | ||
| { | ||
| inspector.visit("cachedValues", cachedValues != null); |
There was a problem hiding this comment.
I wonder why we does not care about cachedValues element type?
There was a problem hiding this comment.
Because CachingIndexed.get() doesn't call methods on instances of this type directly. So for Hotspot, it's just an object, doesn't matter of what type. If it does matter in delegate.get(), delegate's runtime shape should reflect this, delegate is inspected in CachingIndexed.inspectRuntimeShape().
| @Override | ||
| public void inspectRuntimeShape(RuntimeShapeInspector inspector) | ||
| { | ||
| // ideally should inspect buffer, but at the moment of inspectRuntimeShape() call buffer might be null although |
There was a problem hiding this comment.
So if we get those fields with subclasses different from the original object then we still get the first mapping and ignore the second implementation and eventually get unexpected/different results on dash, right?
There was a problem hiding this comment.
Didn't understand this question. However any implementation of inspectRuntimeShape() anywhere doesn't affect correctness.
The purpose of inspectRuntimeShape() is to ensure that some copy of the code is always called with the same runtime shape (monomorphic), and (ideally) there is only one copy of the code for each runtime shape, not to make JIT perform the same compilation work twice and waste code cache (on Hotspot level) and instruction cache (on hardware level).
Not reflecting all fields which actually belong to runtime shape in inspectRuntimeShape() could lead to the situation when some copy of the code is called with non-monomorphic runtime shape, that will make Hotspot JIT to generate slower (but still correct) code.
Reflecting some fields which actually don't belong to runtime shape in inspectRuntimeShape() could lead to the situation when there are several copies of the code which are called with the same runtime shape, that will pollute code cache and will make JIT to perform the same work several times, but the code is still correct.
| @Override | ||
| public void inspectRuntimeShape(RuntimeShapeInspector inspector) | ||
| { | ||
| inspector.visit("buffer", buffer); |
There was a problem hiding this comment.
As I see this is a mutable field. Could have "null", then something else, but description will be 'null' for all the instances, right?
Also 'bigEndian' is changed during processing when loadBuffer() is called.
There was a problem hiding this comment.
Yes, that's bad that buffer couldn't be properly inspected here. Probably this situation could be improved later.
Also 'bigEndian' is changed during processing when loadBuffer() is called.
Thanks, removed inspection of bigEndian too.
| } | ||
|
|
||
| return new DimensionSelector() | ||
| return new SingleValueDimensionSelector() |
There was a problem hiding this comment.
There is an inspection of extractionFn below, but I don't see it plays as a 'boolean flag'.
io/druid/query/groupby/RowBasedColumnSelectorFactory.java:180
There was a problem hiding this comment.
It is used in lookupName() which is @CalledFromHotLoop
| Offset offset = (Offset) TopNUtils.copyOffset(cursor); | ||
| long scannedRows = 0; | ||
| int positionToAllocate = 0; | ||
| while (offset.withinBounds() && !Thread.currentThread().isInterrupted()) { |
There was a problem hiding this comment.
if this is interrupted, where is it intended to be caught?
There was a problem hiding this comment.
In PooledTopNAlgorithm.scanAndAggregate(), BaseQuery.checkInterrupted() is called in the end of each branch.
| Offset offset = (Offset) TopNUtils.copyOffset(cursor); | ||
| long scannedRows = 0; | ||
| int positionToAllocate = 0; | ||
| while (offset.withinBounds() && !Thread.currentThread().isInterrupted()) { |
There was a problem hiding this comment.
It feels like we should be able to use an Iterator or IntStream here, but I'm not finding anything reasonable in the standard java libraries at a cursory glance.
There was a problem hiding this comment.
There is IntIterator from fastutil, and even https://docs.oracle.com/javase/8/docs/api/java/util/PrimitiveIterator.OfInt.html.
I don't think such change should be part of this PR.
| if (specializeGeneric1AggPooledTopN && theAggregators.length == 1) { | ||
| scanAndAggregateGeneric1Agg(params, positions, theAggregators[0], cursor); | ||
| } else if (specializeGeneric2AggPooledTopN && theAggregators.length == 2) { | ||
| if (theAggregators.length == 1) { |
There was a problem hiding this comment.
Can we abstract the predicate / function checking into a List of Pairs of predicates and runnables or similar? and loop through the list until it finds a matching predicate to run? Then to deny specializeHistoricalSingleValueDimSelector1SimpleDoubleAggPooledTopN is simply to not add it in the list.
There was a problem hiding this comment.
Those BaseQuery.checkInterrupted(); + return; are pretty buried and would be easy to miss in future additions. So having a loop and finishing on those two items feels more sustainable.
| ) | ||
| { | ||
| String runtimeShape = StringRuntimeShape.of(aggregator); | ||
| HistoricalCursor historicalCursor = (HistoricalCursor) cursor; |
There was a problem hiding this comment.
suggest only taking in HistoricalCursor and doing the cast in the caller
|
|
||
| @Override | ||
| public void inspectRuntimeShape(RuntimeShapeInspector inspector) | ||
| { |
There was a problem hiding this comment.
If really nothing suggest also adding comment as per the other examples below
| public void inspectRuntimeShape(RuntimeShapeInspector inspector) | ||
| { | ||
| inspector.visit("firstBaseMatcher", baseMatchers[0]); | ||
| inspector.visit("secondBaseMatcher", baseMatchers[1]); |
There was a problem hiding this comment.
What guarantees baseMatchers.length() > 1?
There was a problem hiding this comment.
ah, I see, >0 and !=1 ok
| public void inspectRuntimeShape(RuntimeShapeInspector inspector) | ||
| { | ||
| inspector.visit("baseSelector", baseSelector); | ||
| inspector.visit("extractionFn", extractionFn); |
drcrallen
left a comment
There was a problem hiding this comment.
A lot of neat added functionality that seems to be ok, but there is a severe lack of unit tests proving to future developers that they don't mess it up somehow. Please add unit tests for the new features.
Also, there seems to be a lot of discussion or changes around what needs to be included in inspectRuntimeShape. Can checks for that be automated or at least tested in unit tests somehow?
…type and HistoricalSingleValueDimSelector1SimpleDoubleAggPooledTopNScannerPrototype, cover them with tests
|
@drcrallen I've changed existing tests so that they cover all added "prototypes" (and it helped to find bugs). Regarding testing Manual testing of
Ways to test |
| @Override | ||
| public Offset clone() | ||
| { | ||
| FilteredOffset offset = (FilteredOffset) super.clone(); |
There was a problem hiding this comment.
What guarantees an Offset clone is castable like this?
There was a problem hiding this comment.
It delegates to Object.clone(), which returns instance of the same class as this.
| import io.druid.segment.historical.OffsetHolder; | ||
| import org.roaringbitmap.IntIterator; | ||
|
|
||
| final class FilteredOffset extends Offset |
There was a problem hiding this comment.
This class seems to have no unit tests
There was a problem hiding this comment.
This class if factored out of QueryableIndexStorageAdapter. It didn't exist before this PR, because QueryableIndexStorageAdapter used to handle post filters using a special cursor, and now there is the same cursor accepting any provided Offset, post-filtered like FilteredOffset, or not. So this class is tested by all the same tests which test QueryableIndexStorageAdapter directly or indirectly, mostly tests of specific query types.
| advance(); | ||
| count++; | ||
| if (!Thread.currentThread().isInterrupted()) { | ||
| cursorOffset.increment(); |
| public int lookupId(ActualType name); | ||
| public int getCardinality(); | ||
|
|
||
| HistoricalDimensionSelector makeDimensionSelector(OffsetHolder offsetHolder, ExtractionFn extractionFn); |
There was a problem hiding this comment.
I'm curious if this will screw up @cheddar / @himanshug 's lucine index stuff
There was a problem hiding this comment.
Is there a way to architect this so there isn't a historical-specific method in DictionaryEncodedColumn?
There was a problem hiding this comment.
Changed to return just DimensionSelector
| return cachedLookups.size(); | ||
| } | ||
|
|
||
| @Override |
There was a problem hiding this comment.
This seems to have no unit tests
There was a problem hiding this comment.
The same as for FilteredOffset, this method is factored out of QueryableIndexStorageAdapter and tested through it
| return true; | ||
| } | ||
| }; | ||
| return makeMatcher(matchers.toArray(new ValueMatcher[0])); |
There was a problem hiding this comment.
new ValueMatcher[0] can be a static constant
|
|
||
| private ValueMatcher makeMatcher(final ValueMatcher[] baseMatchers) | ||
| { | ||
| Preconditions.checkState(baseMatchers.length > 0); |
There was a problem hiding this comment.
Can this support a 0 length array and just return BooleanValueMatcher.of(false);?
There was a problem hiding this comment.
We prohibit 0 length in constructor, so here it shouldn't appear too
| public ValueMatcher makeMatcher(ColumnSelectorFactory factory) | ||
| { | ||
| if (filters.size() == 0) { | ||
| return BooleanValueMatcher.of(false); |
There was a problem hiding this comment.
This is odd because on one hand you end up with a value matcher that returned false here, but makeMatcher has a fall-through case of true.
There was a problem hiding this comment.
makeMatcher(ValueMatcher[]) has Preconditions.checkState(baseMatchers.length > 0);, so it should end up throwing an exception anyway
| public void inspectRuntimeShape(RuntimeShapeInspector inspector) | ||
| { | ||
| inspector.visit("firstBaseMatcher", baseMatchers[0]); | ||
| inspector.visit("secondBaseMatcher", baseMatchers[1]); |
There was a problem hiding this comment.
ah, I see, >0 and !=1 ok
| /** | ||
| * Public bridge is needed, because MagicAccessorImpl is a package-private class. | ||
| */ | ||
| public class MagicAccessorBridge extends MagicAccessorImpl |
There was a problem hiding this comment.
Doesn't extending sun.* packages violate the java license agreement?
There was a problem hiding this comment.
Seems so (http://www.oracle.com/technetwork/java/javase/terms/license/index.html). Among high-profile projects, I found Groovy extends MagicAccessorImpl:
There was a problem hiding this comment.
but org.codehaus.groovy.reflection is not in sun.reflect
There was a problem hiding this comment.
It creates "sun/reflect/GroovyMagic" class. Actually an interesting question, asked here: https://twitter.com/leventov/status/863155057113669632
|
On #4079 (comment): no, because it's in the method called "advanceUninterruptibly" |
…storical-topn-monomorphic-processing
Cursorspecialization:HistoricalCursorDimensionSelectorspecializations:SingleValueDimensionSelector,HistoricalDimensionSelector,SingleValueHistoricalDimensionSelectorFloatColumnSelectorspecialization:HistoricalFloatColumnSelectorIndexedandValueMatcherinterfaces to extendHotLoopCalleeSpecializationServiceFollow-up of #3889, part of #3798