Skip to content

Support for disabling bitmap indexes.#5402

Merged
gianm merged 6 commits intoapache:masterfrom
gianm:feature-disable-index
Mar 1, 2018
Merged

Support for disabling bitmap indexes.#5402
gianm merged 6 commits intoapache:masterfrom
gianm:feature-disable-index

Conversation

@gianm
Copy link
Copy Markdown
Contributor

@gianm gianm commented Feb 20, 2018

Can save space for columns where bitmap indexes are pointless (like
free-form text).

Requires adding a new flag and version code to dictionary encoded string
columns. So, segments written with this option will not be backwards
compatible with older versions of Druid.

Can save space for columns where bitmap indexes are pointless (like
free-form text).
Comment thread docs/content/ingestion/index.md Outdated
{
"type": "string",
"name": "comment",
"bitmapIndex": false
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "createBitmapIndex"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, why not. Changed it.

{
private VERSION version = null;
private int flags = NO_FLAGS;
private int flags = Feature.NO_BITMAP_INDEX.getMask();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested to make a method maskOf(Collection<Feature>), and DEFAULT_FEATURES = Collections.singletonList(NO_BITMAP_INDEX)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a STARTING_FLAGS constant but I didn't do the maskOf function- it doesn't seem useful enough to exist at this point.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Feb 20, 2018

@leventov thanks for review- updated the patch.

);
if (!Feature.NO_BITMAP_INDEX.isSet(rFlags)) {
GenericIndexed<ImmutableBitmap> rBitmaps = GenericIndexed.read(
buffer, bitmapSerdeFactory.getObjectStrategy(), builder.getFileMapper()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One arg at one line

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it.

Copy link
Copy Markdown
Member

@nishantmonu51 nishantmonu51 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left few comments,
Also, would be great if you can also modify dimensionSchema in some integration test too so that this feature is tested end to end.

private static final Map<Interval, DimensionSchema> MIXED_TYPE_COLUMN_MAP = ImmutableMap.of(
Intervals.of("2017-01-01/2017-02-01"),
new StringDimensionSchema(MIXED_TYPE_COLUMN, null),
new StringDimensionSchema(MIXED_TYPE_COLUMN, null, null),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit - StringDimensionSchema has a constr for just name, we can use that here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I changed these.

@@ -287,6 +287,7 @@ protected IncrementalIndex(
if (dimSchema.getTypeName().equals(DimensionSchema.SPATIAL_TYPE_NAME)) {
capabilities.setHasSpatialIndexes(true);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed above in NewSpatialDimensionSchema the const passes true for hasBitmapIndexes,
probably need to set capabilities.setBitmapIndex(true) here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this outside the if/else.

@@ -165,6 +167,12 @@ public SerializerBuilder withBitmapSerdeFactory(BitmapSerdeFactory bitmapSerdeFa

public SerializerBuilder withBitmapIndex(GenericIndexedWriter<ImmutableBitmap> bitmapIndexWriter)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit : @nullable annotation for bitmapIndexWriter.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

if (Feature.MULTI_VALUE.isSet(flags)) {
return VSizeColumnarMultiInts.readFromByteBuffer(buffer);
} else {
throw new IAE("Unrecognized multi-value flag[%d] for version[%s]", flags, version);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing multi value V3 handling ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing on purpose; from the code it looks like MULTI_VALUE_V3 is only supported for compressed formats.

Copy link
Copy Markdown
Member

@nishantmonu51 nishantmonu51 Feb 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a comment to code.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add a comment


private static boolean mustWriteFlags(final int flags)
{
return flags != NO_FLAGS && flags != Feature.MULTI_VALUE.getMask();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi value V3 mask ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MUTLI_VALUE_V3 must be written so that's why it's not listed here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a comment to code.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment.

@jihoonson
Copy link
Copy Markdown
Contributor

LGTM after addressing the comments of @nishantmonu51.

Copy link
Copy Markdown
Member

@nishantmonu51 nishantmonu51 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left minor comment about adding small comment to code, after that 👍

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Feb 27, 2018

@nishantmonu51 @leventov updated per your comments & added an integration test.

@nishantmonu51
Copy link
Copy Markdown
Member

thanks @gianm

@fjy fjy added this to the 0.13.0 milestone Feb 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants