Conversation
| import org.apache.druid.query.rowsandcols.RowsAndColumns; | ||
| import org.apache.druid.segment.column.RowSignature; | ||
|
|
||
| public interface FrameMaker |
There was a problem hiding this comment.
I think it's not necessarily needed to introduce an interface if there will be only one implementation
I wonder why not somehow make RowsAndColumns.as(Frame.class) and
RowsAndColumns.as(RowSignature.class) work for this?
There was a problem hiding this comment.
While this currently only has toColumnBasedFrame(), it's equally plausible to have a toRowBasedFrame() and allow implementations to ask for the one that they expect to be working with. The notion of which kind of Frame that you are getting is not something that can be relayed with RowsAndColumns.as(Frame.class).
That said, one thing that does annoy me about the .as() and the semantic interfaces is that you have these factory-style interfaces for when you are building a thing. I don't know of another way to deal with it and maintain the extensibility, so I've just been thinking that it's the price we pay.
There was a problem hiding this comment.
currently the DefaultFrameMaker class has no internal state - and with these methods I think there won't be any - unless it tries to cache the frame or something...
I think one of the key problems with Frame is that it's decoupled from RowSignature; however, most of the time those two should be together. Why not introduce a class which could hold a Frame along with the RowSignature, or extend Frame somehow to optionally add a RowSignature?
that way a single as method could be implemented, and there won't be a need for the FrameMaker
my recent experiences suggest that the production of a frame from a rac may not need to be aligned 1-1 with it, so I think a class to which racs would be written to produce a frame would be more like what we will need in the future. Another way to get to more or less the same result is to use ConcatRowAndColumns
would it be possible to use the approach you end up with at the places where we have this frame creation logic copied-to already?
LazilyDecoratedRowsAndColumns (2 places)
StorageAdapterRowsAndColumns
There was a problem hiding this comment.
I like this idea. Aligning a RowSignature with a Frame is definitely a really annoying thing when dealing with Frames, so having the Frame carry it forward is nice. That said, one of the reasons for this interface is that, if we assume that a Frame is the serialization format that we want for the wire, then we will want some "standard" means of converting from RowsAndColumns to the concrete wire-serializable form.
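For illustration only, the "Frame carries its RowSignature" idea could look roughly like the sketch below. FrameStub, RowSignatureStub and FrameWithSignatureSketch are hypothetical stand-ins invented for this example; they are not Druid classes and nothing here reflects what the PR ended up doing.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-ins: the real Frame and RowSignature are Druid classes with much richer APIs.
final class FrameStub
{
  final int numRows;

  FrameStub(int numRows)
  {
    this.numRows = numRows;
  }
}

final class RowSignatureStub
{
  final List<String> columnNames;

  RowSignatureStub(List<String> columnNames)
  {
    // Defensive copy so the signature cannot change after construction.
    this.columnNames = Collections.unmodifiableList(new ArrayList<>(columnNames));
  }
}

// Holds a frame together with the signature needed to interpret it,
// so callers never have to pass the two around separately.
final class FrameWithSignatureSketch
{
  private final FrameStub frame;
  private final RowSignatureStub signature;

  FrameWithSignatureSketch(FrameStub frame, RowSignatureStub signature)
  {
    this.frame = Objects.requireNonNull(frame, "frame");
    this.signature = Objects.requireNonNull(signature, "signature");
  }

  FrameStub getFrame()
  {
    return frame;
  }

  RowSignatureStub getSignature()
  {
    return signature;
  }
}
```

With a holder like this, a single as(FrameWithSignatureSketch.class)-style conversion could return both pieces at once, which is the point raised in the comment above.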
|
|
||
| public class SemanticUtils | ||
| { | ||
| private static final Map<Class<?>, Map<Class<?>, Function<?, ?>>> OVERRIDES = new LinkedHashMap<>(); |
There was a problem hiding this comment.
why do we need to put everything into a centralized map?
there is no point in caching this, as all the fields that call makeAsMap are static fields, so that method will be invoked only once for each of them
There was a problem hiding this comment.
This exists to allow extensions to register new interfaces and implementations without needing to impact the core code. This is probably worthy of javadoc.
There was a problem hiding this comment.
Added a javadoc
There was a problem hiding this comment.
you mean to override default behaviour?
could you give an example? how is this supposed to work if 2-3 classes want to override the same thing?
what's the problem you are trying to solve?
There was a problem hiding this comment.
Essentially, we could have a class A that has a SemanticCreator toB(). This allows A.as(B.class) to work.
However, there could be extensions that want a different implementation to be bound. The extension would want to use that implementation only if the extension is loaded and also, it would not be able to change the first binding. Overrides will allow the extension to bind something like:
SemanticUtils.registerAsOverride(
A.class,
B.class,
(a) -> new C(a) // C extends B
);
to modify
If there are multiple bindings for A and B from different places, SemanticUtils would throw an exception.
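A minimal sketch of the behaviour described above, assuming a registry keyed by source class and target class that rejects a second binding for the same pair. OverrideRegistrySketch and its lookup method are hypothetical names for this example and are not the actual SemanticUtils implementation; only the registerAsOverride name and the shape of the OVERRIDES map come from the thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the registry discussed above; not Druid's actual SemanticUtils.
final class OverrideRegistrySketch
{
  private static final Map<Class<?>, Map<Class<?>, Function<?, ?>>> OVERRIDES = new LinkedHashMap<>();

  private OverrideRegistrySketch()
  {
  }

  // Binds a conversion from `from` to `to`. A second binding for the same pair is rejected,
  // so two extensions cannot silently replace each other's override.
  static synchronized <F, T> void registerAsOverride(
      Class<F> from,
      Class<T> to,
      Function<F, ? extends T> fn
  )
  {
    Map<Class<?>, Function<?, ?>> byTarget = OVERRIDES.computeIfAbsent(from, k -> new LinkedHashMap<>());
    if (byTarget.putIfAbsent(to, fn) != null) {
      throw new IllegalStateException(
          "Override already registered for " + from.getName() + " -> " + to.getName()
      );
    }
  }

  // Returns the registered override for the pair, or null if there is none.
  @SuppressWarnings("unchecked")
  static synchronized <F, T> Function<F, T> lookup(Class<F> from, Class<T> to)
  {
    Map<Class<?>, Function<?, ?>> byTarget = OVERRIDES.get(from);
    return byTarget == null ? null : (Function<F, T>) byTarget.get(to);
  }
}
```

A class's as() implementation would presumably consult such a lookup before falling back to its built-in conversions; how the real code orders those two steps is not shown in this thread.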
| @Nullable | ||
| private final Metadata metadata; | ||
| @NotNull | ||
| private final Supplier<Metadata> metadataSupplier; |
There was a problem hiding this comment.
this looks odd - an index with named columns, but the Metadata is softened with a supplier
why soften Metadata with a Supplier?
There was a problem hiding this comment.
I'm not 100% certain that this is why it was done, but one benefit of softening it could be to avoid holding the object in memory which potentially softens memory pressure. There's probably a legitimate question to be asked if this should really be a Supplier<> or if SimpleQueryableIndex should be made an abstract class.
There was a problem hiding this comment.
Converted it to an abstract class
Looking at the failures, this needs improved coverage. For |
| */ | ||
| public interface ColumnarInts extends IndexedInts, Closeable | ||
| { | ||
| default void get(int[] out, int offset, int start, int length) |
There was a problem hiding this comment.
is there any benefit to implementing this method here, and not in IndexedInts, which already has a get method?
| } | ||
|
|
||
| final int limit = Math.min(length - p, sizePer - bufferIndex); | ||
| reader.read(out, p, bufferIndex, limit); |
There was a problem hiding this comment.
are there any tests that pass a non-zero offset?
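As a rough illustration of what such a check might exercise, here is a sketch of a bulk get with an output offset. IntsSketch and ArrayIntsSketch are hypothetical types for this example, and the parameter meaning (offset is the position in out, start is the first row to read) is an assumption inferred from the get(out, 0, start, length) call in the diff.

```java
// Hypothetical minimal interface; the real IndexedInts and ColumnarInts have more methods.
interface IntsSketch
{
  int size();

  int get(int index);

  // Copies `length` values starting at row `start` into `out`, beginning at out[offset].
  // Assumed semantics: `offset` indexes the output array, `start` indexes the column.
  default void get(int[] out, int offset, int start, int length)
  {
    for (int i = 0; i < length; i++) {
      out[offset + i] = get(start + i);
    }
  }
}

// Simple array-backed implementation that relies on the default bulk get.
final class ArrayIntsSketch implements IntsSketch
{
  private final int[] values;

  ArrayIntsSketch(int[] values)
  {
    this.values = values.clone();
  }

  @Override
  public int size()
  {
    return values.length;
  }

  @Override
  public int get(int index)
  {
    return values[index];
  }
}
```

A test with a non-zero offset would fill the tail of a larger output array and assert that the leading slots stay untouched.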
| @Override | ||
| public Metadata getMetadata() | ||
| { | ||
| return null; |
There was a problem hiding this comment.
if it's valid to have null metadata, it could be the default implementation in the interface
| @Nullable | ||
| default <T> T as(Class<? extends T> clazz) |
There was a problem hiding this comment.
since this PR is about to make the SemanticCreator more mainstream: could you please introduce an interface for it and use it everywhere?
There was a problem hiding this comment.
I didn't understand this. Do you mean adding a new interface like "SemanticClass" that all other classes would implement if they have a SemanticCreator?
| import java.util.Map; | ||
| import java.util.function.Function; | ||
|
|
||
| public class SemanticUtils |
There was a problem hiding this comment.
this Utils could be renamed to be the interface containing the as method for these things; the static methods could remain, as those are providing services...
please also add API docs on how this as mechanism works, etc.
There was a problem hiding this comment.
I like the idea of creating the interface for .as() and having these utils be static helpers on it. As a refactor, that should probably be done for all of the places that are using an .as(), so I'd like to let this current PR go through (as it's following the current pattern in the code), and one of us can take on the refactor as you defined it, as I do think that's an improvement.
| @Nullable | ||
| public <T> T as(Class<? extends T> clazz) | ||
| { | ||
| return column.as(clazz); |
There was a problem hiding this comment.
this is a non-overridable as method....or not?
how will I be able to override this with SemanticUtils.registerOverride?
There was a problem hiding this comment.
The class does not use SemanticUtils.makeAsMap so it would not be directly overridable. I think the decision of using overrides or not currently belongs to the class itself, in this way. The extension can still override it by adding an override for ColumnarLongs which this class is pointing to.
| @Nullable | ||
| default <T> T as(Class<? extends T> clazz) | ||
| { | ||
| return null; |
There was a problem hiding this comment.
this is a non-overridable as method....or not?
I think registerOverride should also work when it was not provided before....
how will I be able to override this with SemanticUtils.registerOverride?
There was a problem hiding this comment.
The intention is to have implementations of base column be classes that have semantic creators and have possible conversions. Overrides can be registered against them.
If a class does not have any conversion to another class, and does not override this function, it defaults to null.
Adding a map and allowing functions to register overrides to it may cause it to use the conversions from the base class, if it is not overridden. I am not sure if that behavior is expected.
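To make the "defaults to null" behaviour concrete, here is a small sketch assuming a base interface with a default as() and one implementation that carries its own conversion map. BaseColumnSketch, PlainColumnSketch and LongsColumnSketch are hypothetical names for this example; the real classes build their maps through SemanticCreator and makeAsMap, which is not reproduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical base interface: a class with no conversions simply inherits the null default.
interface BaseColumnSketch
{
  default <T> T as(Class<? extends T> clazz)
  {
    return null;
  }
}

// Does not override as(), so every conversion request yields null.
final class PlainColumnSketch implements BaseColumnSketch
{
}

// Declares its own conversions in a per-class map, standing in for a makeAsMap-style map.
final class LongsColumnSketch implements BaseColumnSketch
{
  private static final Map<Class<?>, Function<LongsColumnSketch, ?>> AS_MAP = new LinkedHashMap<>();

  static {
    AS_MAP.put(long[].class, c -> c.values.clone());
  }

  private final long[] values;

  LongsColumnSketch(long[] values)
  {
    this.values = values.clone();
  }

  @Override
  @SuppressWarnings("unchecked")
  public <T> T as(Class<? extends T> clazz)
  {
    Function<LongsColumnSketch, ?> fn = AS_MAP.get(clazz);
    return fn == null ? null : (T) fn.apply(this);
  }
}
```

Because the map lives on the implementing class, an override registered for one class does not leak into unrelated classes that only inherit the default, which is the concern raised in the last paragraph above.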
| get(out, 0, start, length); | ||
| } | ||
|
|
||
| default void get(long[] out, int offset, int start, int length) |
There was a problem hiding this comment.
what's driving this change?
why the need for this here and not in other classes like ColumnarDoubles ?
| public Metadata getMetadata() | ||
| { | ||
| try { | ||
| ByteBuffer metadataBB = smooshedFiles.mapFile("metadata.drd"); |
There was a problem hiding this comment.
I don't really know these parts, but this will mean that the Metadata is read from disk every time it's asked for; is that desired?
There was a problem hiding this comment.
Not necessarily. The smooshedFiles is memory mapped and this is returning a memoized memory-mapped buffer. It will do some less than fast operations, but not necessarily brand new disk IO. The getMetadata() call is used seldom (only a few specific introspection-oriented calls like the segmentMetadata query) so it's better to have it be lazy and not consume on-heap memory.
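The trade-off described here can be sketched as an abstract getMetadata() whose implementation re-reads on each call instead of holding the result in a field. IndexSketch and LazyMetadataIndexSketch are hypothetical names, the String return type stands in for Metadata, and the Supplier stands in for the memory-mapped read; none of this is the actual SimpleQueryableIndex code.

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical stand-in for an index type whose metadata access is left to subclasses.
abstract class IndexSketch
{
  abstract String getMetadata();
}

// Reads metadata on demand. Nothing is cached on-heap between calls,
// at the cost of repeating the read each time getMetadata() is invoked.
final class LazyMetadataIndexSketch extends IndexSketch
{
  private final Supplier<String> metadataReader;

  LazyMetadataIndexSketch(Supplier<String> metadataReader)
  {
    this.metadataReader = Objects.requireNonNull(metadataReader, "metadataReader");
  }

  @Override
  String getMetadata()
  {
    // Delegates to the reader every time; the result is not stored in a field.
    return metadataReader.get();
  }
}
```

This only pays off if the call is rare, which matches the claim above that getMetadata() is used by a few introspection-oriented calls.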
|
I had some off-line conversations with Zoltan about his comments. He's got some great comments for cleaning things up. I like eliminating SemanticUtils and just putting it on the interface that we use for the |
Refactors the SemanticCreator annotation.
This PR has: