Support for disabling bitmap indexes. #5402
Can save space for columns where bitmap indexes are pointless (like free-form text).
| { | ||
| "type": "string", | ||
| "name": "comment", | ||
| "bitmapIndex": false |
Sure, why not. Changed it.
| { | ||
| private VERSION version = null; | ||
| private int flags = NO_FLAGS; | ||
| private int flags = Feature.NO_BITMAP_INDEX.getMask(); |
Suggest adding a method maskOf(Collection<Feature>) and a constant DEFAULT_FEATURES = Collections.singletonList(NO_BITMAP_INDEX).
I made a STARTING_FLAGS constant, but I didn't add the maskOf function; it doesn't seem useful enough to exist at this point.
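For reference, a minimal sketch of what the suggested maskOf helper could look like. The Feature enum below is a stand-in, and the assumption that each mask is `1 << ordinal()` is for illustration only; neither maskOf nor DEFAULT_FEATURES was added in this patch.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.List;

final class FeatureMaskSketch
{
  // Stand-in for the real Feature enum; masks are ASSUMED to be one bit per constant.
  enum Feature
  {
    MULTI_VALUE,
    MULTI_VALUE_V3,
    NO_BITMAP_INDEX;

    int getMask()
    {
      return 1 << ordinal();
    }

    boolean isSet(int flags)
    {
      return (getMask() & flags) != 0;
    }
  }

  // Hypothetical constant from the review suggestion.
  static final List<Feature> DEFAULT_FEATURES = Collections.singletonList(Feature.NO_BITMAP_INDEX);

  // Hypothetical helper from the review suggestion: OR together the masks of all given features.
  static int maskOf(Collection<Feature> features)
  {
    int mask = 0;
    for (Feature feature : features) {
      mask |= feature.getMask();
    }
    return mask;
  }
}
```

With a single default feature the helper reduces to `Feature.NO_BITMAP_INDEX.getMask()`, which is why a plain STARTING_FLAGS constant is enough for now.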
@leventov thanks for the review; updated the patch.
| ); | ||
| if (!Feature.NO_BITMAP_INDEX.isSet(rFlags)) { | ||
| GenericIndexed<ImmutableBitmap> rBitmaps = GenericIndexed.read( | ||
| buffer, bitmapSerdeFactory.getObjectStrategy(), builder.getFileMapper() |
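The intent of this hunk, sketched in isolation: when NO_BITMAP_INDEX is set, no bitmaps were written, so the reader must not consume any bytes for them. The byte layout below (an int count followed by length-prefixed payloads) and the mask value are assumptions made up for this sketch; the real code reads through GenericIndexed.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class BitmapReadSketch
{
  // ASSUMED mask value, standing in for Feature.NO_BITMAP_INDEX.getMask().
  static final int NO_BITMAP_INDEX_MASK = 1 << 2;

  // Returns the bitmap payloads, or null when the column was written without a bitmap index.
  static List<byte[]> readBitmaps(ByteBuffer buffer, int flags)
  {
    if ((flags & NO_BITMAP_INDEX_MASK) != 0) {
      // Nothing was written for this column, so leave the buffer position untouched.
      return null;
    }
    // Hypothetical layout: int count, then for each bitmap an int length and its bytes.
    int count = buffer.getInt();
    List<byte[]> bitmaps = new ArrayList<>(count);
    for (int i = 0; i < count; i++) {
      byte[] payload = new byte[buffer.getInt()];
      buffer.get(payload);
      bitmaps.add(payload);
    }
    return bitmaps;
  }
}
```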
nishantmonu51 left a comment
Left a few comments.
Also, it would be great if you could modify the dimensionSchema in an integration test so that this feature is tested end to end.
| private static final Map<Interval, DimensionSchema> MIXED_TYPE_COLUMN_MAP = ImmutableMap.of( | ||
| Intervals.of("2017-01-01/2017-02-01"), | ||
| new StringDimensionSchema(MIXED_TYPE_COLUMN, null), | ||
| new StringDimensionSchema(MIXED_TYPE_COLUMN, null, null), |
Minor nit: StringDimensionSchema has a constructor that takes just the name; we can use that here.
Sure, I changed these.
| @@ -287,6 +287,7 @@ protected IncrementalIndex( | |||
| if (dimSchema.getTypeName().equals(DimensionSchema.SPATIAL_TYPE_NAME)) { | |||
| capabilities.setHasSpatialIndexes(true); | |||
Noticed above that in NewSpatialDimensionSchema the constructor passes true for hasBitmapIndexes,
so we probably need to set capabilities.setBitmapIndex(true) here.
Moved this outside the if/else.
| @@ -165,6 +167,12 @@ public SerializerBuilder withBitmapSerdeFactory(BitmapSerdeFactory bitmapSerdeFa | |||
| public SerializerBuilder withBitmapIndex(GenericIndexedWriter<ImmutableBitmap> bitmapIndexWriter) | |||
| if (Feature.MULTI_VALUE.isSet(flags)) { | ||
| return VSizeColumnarMultiInts.readFromByteBuffer(buffer); | ||
| } else { | ||
| throw new IAE("Unrecognized multi-value flag[%d] for version[%s]", flags, version); |
Missing MULTI_VALUE_V3 handling?
This is missing on purpose; from the code it looks like MULTI_VALUE_V3 is only supported for compressed formats.
Please add a comment to the code.
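A sketch of how the requested comment might read. The mask values and the string return type are placeholders for illustration; the real branch returns VSizeColumnarMultiInts.readFromByteBuffer(buffer) and throws IAE.

```java
final class UncompressedMultiValueSketch
{
  // ASSUMED masks, standing in for Feature.MULTI_VALUE / Feature.MULTI_VALUE_V3.
  static final int MULTI_VALUE = 1;
  static final int MULTI_VALUE_V3 = 1 << 1;

  // Returns the name of the reader that would be used for an uncompressed multi-value column.
  static String chooseUncompressedMultiValueReader(int flags)
  {
    if ((flags & MULTI_VALUE) != 0) {
      return "VSizeColumnarMultiInts";
    }
    // MULTI_VALUE_V3 is intentionally not handled here: from the existing code it
    // appears to be supported only for compressed formats, so on this uncompressed
    // path it is treated like any other unrecognized flag.
    throw new IllegalArgumentException(
        String.format("Unrecognized multi-value flag[%d]", flags)
    );
  }
}
```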
| private static boolean mustWriteFlags(final int flags) | ||
| { | ||
| return flags != NO_FLAGS && flags != Feature.MULTI_VALUE.getMask(); |
MULTI_VALUE_V3 must be written, so that's why it's not listed here.
Please add a comment to the code.
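Something along these lines would cover it. The constants are stand-ins with assumed values (one bit per feature); only the comparison itself comes from the diff.

```java
final class MustWriteFlagsSketch
{
  static final int NO_FLAGS = 0;
  // ASSUMED masks, standing in for Feature.X.getMask().
  static final int MULTI_VALUE = 1;
  static final int MULTI_VALUE_V3 = 1 << 1;
  static final int NO_BITMAP_INDEX = 1 << 2;

  static boolean mustWriteFlags(final int flags)
  {
    // Flags are omitted only when there are none, or when MULTI_VALUE is the sole flag.
    // MULTI_VALUE_V3 is deliberately not listed: it must always be written.
    // Every other value, including any combination with NO_BITMAP_INDEX, is written too.
    return flags != NO_FLAGS && flags != MULTI_VALUE;
  }
}
```

Note that the check compares the whole flags value, so MULTI_VALUE combined with NO_BITMAP_INDEX still counts as "must write".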
LGTM after addressing the comments of @nishantmonu51.
@nishantmonu51 @leventov updated per your comments & added an integration test.
thanks @gianm
Can save space for columns where bitmap indexes are pointless (like
free-form text).
Requires adding a new flag and version code to dictionary encoded string
columns. So, segments written with this option will not be backwards
compatible with older versions of Druid.