nested columns + arrays = array columns! #13803
Conversation
changes:
* add support for storing nested arrays of string, long, and double values as specialized nested columns instead of breaking them into separate element columns
* nested column type mimic behavior means that columns ingested with only root arrays of primitive values will be ARRAY typed columns
* neat test stuff
```java
if (o instanceof Object[]) {
  Object[] array = (Object[]) o;
  if (elementNumber < array.length) {
    return array[elementNumber];
```
Check failure
Code scanning / CodeQL
Improper validation of user-provided array index
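The CodeQL finding above flags the unvalidated `elementNumber` index. A minimal sketch of the kind of bounds check it asks for; the class and method names here are illustrative, not the actual Druid code:

```java
// Illustrative sketch: guard the user-provided element index against
// negative and out-of-range values before dereferencing the array.
public class ArrayElementAccess
{
  public static Object getArrayElement(Object o, int elementNumber)
  {
    if (o instanceof Object[]) {
      Object[] array = (Object[]) o;
      // validate the index on both ends, not just the upper bound
      if (elementNumber >= 0 && elementNumber < array.length) {
        return array[elementNumber];
      }
    }
    // out-of-range, negative, or non-array input yields null
    return null;
  }
}
```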
I want to get #13809 finished first and pull those changes into this PR to make things consistent with null value coercion
imply-cheddar
left a comment
Only got through a few of the files so far, but need to go and come back to it.
```diff
  List<InputRow> rows = readAllRows(reader);
- Assert.assertEquals(ImmutableList.of("dim1", "metric1", "timestamp"), rows.get(0).getDimensions());
+ Assert.assertEquals(ImmutableList.of("dim1", "metric1"), rows.get(0).getDimensions());
```
Why the change in expectation?
this is related to the change in dimension filtering in MapBasedInputRow to always filter out the timestamp spec column.
In 'normal' production code the timestamp spec is added to the dimension exclusions, so the code in MapBasedInputRow that computes dimensions would never include the time column in the dimensions list. However, in test code, especially in processing, which doesn't have access to the methods that take a DataSchema and transform it into an InputRowSchema, it's pretty easy to not explicitly add the timestamp column to the dimension exclusion list. As a result, the timestamp column would end up in the dimensions list in schema discovery mode, as a string (or nested column, depending on config), which in rollup tests means it ends up as part of the rollup key, and so on. (Again, this doesn't happen in current production code because it goes through that translator utility method.)
I made the change to always filter the timestamp spec column from the dimensions list to make it harder to write wrong tests for schema discovery mode, which caused the change here and in other places.
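The filtering behavior described above can be sketched as follows; this is an illustrative stand-in, not the actual MapBasedInputRow code, and the method and parameter names are hypothetical:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch: the timestamp column and any explicit exclusions
// are always dropped from the discovered dimensions list.
public class DimensionFilter
{
  public static List<String> computeDimensions(
      Set<String> rowKeys,
      String timestampColumn,
      Set<String> exclusions
  )
  {
    LinkedHashSet<String> dims = new LinkedHashSet<>();
    for (String field : rowKeys) {
      // always drop the timestamp spec column, even when the test forgot
      // to add it to the exclusion list
      if (timestampColumn.equals(field) || exclusions.contains(field)) {
        continue;
      }
      dims.add(field);
    }
    return new ArrayList<>(dims);
  }
}
```

This mirrors the test-expectation change above: "timestamp" no longer appears among the discovered dimensions.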
```java
public static DelimitedInputFormat ofColumns(String... columns)
{
  return new DelimitedInputFormat(
      Arrays.asList(columns),
      null,
      null,
      false,
      false,
      0
  );
}
```
Location nit: does this need to be in main rather than test?
removed since there was only one caller
```java
public static final JsonInputFormat DEFAULT = new JsonInputFormat(
    JSONPathSpec.DEFAULT,
    null,
    null,
    null,
    null
);
```
Location nit: does this need to be in main rather than test?
```java
LinkedHashSet<String> dimensions = new LinkedHashSet<>(dimensionsSpec.getDimensionNames());
dimensions.addAll(Sets.difference(rawInputRow.keySet(), dimensionsSpec.getDimensionExclusions()));
for (String field : rawInputRow.keySet()) {
  if (timestampSpec.getTimestampColumn().equals(field) || dimensionsSpec.getDimensionExclusions().contains(field)) {
```
can we just get these two things once?
yeah, can adjust, though after an earlier discussion I just want to remove getDimensions from InputRow entirely; I haven't decided whether to do that in this PR or a later change.
```java
    @Nullable Object maybeArrayOfLiterals
)
{
  ExprEval<?> eval = ExprEval.bestEffortOf(maybeArrayOfLiterals);
```
Would it make sense to break ExprEval.bestEffortOf into a bunch of checks for different groups of expected types (i.e. ExprEval.maybeLiteral(), ExprEval.maybeArray(), etc.)? Calls to bestEffortOf could cascade through them, but places like this that already know some of what they expect could call the one that more closely aligns with their expectations.
i split out bestEffortArray(@Nullable List<?> theList) and changed the StructuredDataProcessor method to always pass a List when attempting to process arrays
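A rough sketch of the split just described, with plain strings standing in for ExprEval results; everything except the bestEffortOf/bestEffortArray names is illustrative:

```java
import java.util.List;

// Illustrative sketch: a dedicated entry point for values already known
// to be lists, which the general bestEffortOf can cascade into.
public class EvalSplit
{
  public static String bestEffortArray(List<?> theList)
  {
    // null or empty lists are treated as string arrays, mirroring the
    // "empty arrays will always be called 'string' arrays" note below
    if (theList == null || theList.isEmpty()) {
      return "ARRAY<STRING>";
    }
    Object first = theList.get(0);
    if (first instanceof Long) {
      return "ARRAY<LONG>";
    }
    if (first instanceof Double) {
      return "ARRAY<DOUBLE>";
    }
    return "ARRAY<STRING>";
  }

  public static String bestEffortOf(Object o)
  {
    // the general entry point cascades to the narrower one; callers that
    // already hold a List can call bestEffortArray directly
    if (o instanceof List) {
      return bestEffortArray((List<?>) o);
    }
    return "STRING";
  }
}
```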
```java
final ExprEval<?> maybeLiteralArray = ExprEval.bestEffortOf(maybeArrayOfLiterals);
if (maybeLiteralArray.type().isArray() && maybeLiteralArray.type().getElementType().isPrimitive()) {
  final String fieldName = NestedPathFinder.toNormalizedJsonPath(fieldPath);
  LiteralFieldIndexer fieldIndexer = fieldIndexers.get(fieldName);
  if (fieldIndexer == null) {
    estimatedFieldKeySize += StructuredDataProcessor.estimateStringSize(fieldName);
    fieldIndexer = new LiteralFieldIndexer(globalDictionary);
    fieldIndexers.put(fieldName, fieldIndexer);
  }
  return fieldIndexer.processValue(maybeLiteralArray);
}
return null;
```
This looks a lot like code in NestedDataExpressions, except this doesn't return the NULL_LITERAL. I find myself wondering if the NestedDataExpressions code shouldn't look like this?
the contract of processArrayOfLiteralsField is to return a ProcessedLiteral if and only if the value was an array of literals (it is marked @Nullable). processLiteralField is not nullable, and must always return a ProcessedLiteral.
When the StructuredDataProcessor encounters an array while processing input, it first attempts processArrayOfLiteralsField; if that returns something, the value was an array of literals, otherwise it processes the array elements recursively. processLiteralField is called on everything that isn't a map or array.
I'll see if I can clarify it better.
i modified this stuff a bit and updated the javadocs so it is hopefully clearer. The new abstract method names are processField and processArrayField, and the latter's contract is that returning a non-null value halts further processing of the array; otherwise the processor continues for each element of the array.
```java
final LiteralFieldIndexer root = fieldIndexers.get(NestedPathFinder.JSON_PATH_ROOT);
if (root.getTypes().getSingleType().isArray()) {
  throw new UnsupportedOperationException("Not supported");
}
```
I'm reading this as "if all we have are root-level entries and they are always arrays, then throw a UOE". I'm pretty sure I'm reading it wrong, but I wish the error message gave me more context without me feeling like I need to expand lines on the review to know what this is validating.
heh, you are reading it correctly. Currently the 'single typed root' dimension selector should only be used for scalar string columns; everything else should use the column value selector instead. Will try to clarify with comments.
Or put more words into the exception message so it's clear what's not supported?
clarified exception
```diff
  final StructuredData data = (StructuredData) dims[dimIndex];
  if (data != null) {
-   return data.getValue();
+   return ExprEval.bestEffortOf(data.getValue()).value();
```
Is this counting on coercion or something?
yeah, I did this to make it consistent with the value that is stored in the dictionary (since the selector is used for merging/persisting)
```java
case ARRAY:
  // skip empty arrays for now, they will always be called 'string' arrays, which isn't very helpful here since
  // it will pollute the type set
  Preconditions.checkNotNull(columnType.getElementType(), "Array element type must not be null");
```
If this were to ever happen, I think I'd want to know which field was the bad one.
```java
// skip empty arrays for now, they will always be called 'string' arrays, which isn't very helpful here since
// it will pollute the type set
```
How does it do the skipping of empties?
it was missing some code to do that; updated and added tests for nulls, empties, and arrays of nulls
```java
} else if (adapter instanceof QueryableIndexIndexableAdapter) {
  dimValues = getSortedIndexesFromQueryableAdapter((QueryableIndexIndexableAdapter) adapter, mergedFields);
} else {
  throw new ISE("Unable to merge columns of unsupported adapter %s", adapter.getClass());
}

boolean allNulls = dimValues == null || allNull(dimValues.getSortedStrings()) &&
                   allNull(dimValues.getSortedLongs()) &&
                   allNull(dimValues.getSortedDoubles()) &&
                   dimValues.getArrayCardinality() == 0;
```
This seems like a nice check to delegate to GlobalDictionarySortedCollector instead of implementing here?
moved to GlobalDictionarySortedCollector
```java
defaultSerializer.serializeArrayDictionary(() -> new ArrayDictionaryMergingIterator(
    sortedArrayLookups,
    defaultSerializer.getGlobalLookup()
));
```
Why can't this one just be sortedLookup.getSortedArrays() like the other 3?
```java
column.getArraysIterable(),
column.getArrayDictionary().size()
```
Why the 2-argument pair of an Iterable and a size() instead of a single collection-style object like the others?
it felt strange to try to wire this iterable up to wrap in an Indexed when all we need it for is to iterate
Sure, it doesn't have to be an Indexed; it just also seemed weird to be passing in fully encapsulated objects above and then suddenly start passing in 2 arguments to wrap a new thing that seems similar to the ones above...
```java
if (next[i] == null) {
  newIdsWhoDis[i] = 0;
} else if (next[i] instanceof String) {
  newIdsWhoDis[i] = idLookup.lookupString((String) next[i]);
} else if (next[i] instanceof Long) {
  newIdsWhoDis[i] = idLookup.lookupLong((Long) next[i]);
} else if (next[i] instanceof Double) {
  newIdsWhoDis[i] = idLookup.lookupDouble((Double) next[i]);
} else {
  newIdsWhoDis[i] = -1;
}
Preconditions.checkArgument(
    newIdsWhoDis[i] >= 0,
    "unknown global id [%s] for value [%s]",
    newIdsWhoDis[i],
    next[i]
);
```
Given that the global dictionaries, once merged, will be in type-sorted order, do we really need to convert back into the actual values instead of just converting the dictionary id?
So, we deal with them like this so we can look up the new global id from the newly merged lower scalar value dictionaries. Otherwise we would need the mappings of old ids to new ids, which we don't currently have anywhere, and that is a lot more complicated since it is per segment. This way we just look up the old values, and after the lower dictionaries are merged, look up the array elements for the newly sorted values.
i think since this stuff is all working I'd like to save further changes/optimizations for a follow-up
```java
Preconditions.checkNotNull(inputFormat, "inputFormat");
Preconditions.checkNotNull(inputSourceTmpDir, "inputSourceTmpDir");

TransformSpec tranformer = transformSpec != null ? transformSpec : TransformSpec.NONE;
```
spell-check: transformer
```java
    .setIndexSchema(schema)
    .setMaxRowCount(maxRows)
    .build();
TransformSpec tranformer = transformSpec != null ? transformSpec : TransformSpec.NONE;
InputSourceReader transformingReader = tranformer.decorate(reader);
try (CloseableIterator<InputRow> rowIterator = transformingReader.read()) {
  while (rowIterator.hasNext()) {
    incrementalIndex.add(rowIterator.next());
```
When maxRows is hit, are we expecting an exception? Generally speaking, setting maxRows on the tests is done as a way to force running queries against multiple segments, so I had expected to see a check for numRows and incremental persists in anything that takes maxRows.
Hmm, good question. This looks wired up to the maxRowCount, but callers are not checking canAppendRow or anything like that, and buildIncrementalIndex can only return a single IncrementalIndex, so I think all it can do is explode. There is also intermediatePersistSize, which can be used to force a bunch of incremental indexes to be written when mergeIndexes is called, to make sure that segment merging happens.
Nothing much seems to be explicitly setting either of these things; maybe they could be removed or reworked in a follow-up.
```diff
  final DruidExpression rightExpr = druidExpressions.get(1);

- if (leftExpr.isSimpleExtraction()) {
+ if (leftExpr.isSimpleExtraction() && !(leftExpr.getDruidType() != null && leftExpr.getDruidType().isArray())) {
```
What if it's an array_contains() over just a normal single-valued String column? Shouldn't that also match the filter, pretending that each row contains an array of size 1?
the type shouldn't show up as an array here if it isn't an array column
```java
public void testQuery(
    final String sql,
    final Map<String, Object> queryContext,
    final List<Query<?>> expectedQueries,
    final List<Object[]> expectedResults,
    final RowSignature expectedResultSignature
)
{
  testBuilder()
      .sql(sql)
      .queryContext(queryContext)
      .expectedQueries(expectedQueries)
      .expectedResults(expectedResults)
      .expectedSignature(expectedResultSignature)
      .run();
}
```
Part of me wonders if, instead of adding more functions with extra arguments to their signatures, we should switch the call-sites to using the builder directly?
removed in favor of just calling testBuilder. This should probably just be done everywhere, but it's kind of tedious to switch.
I've fixed the behavior of but currently we are treating them as
| } | ||
| } else { | ||
| index++; | ||
| if (++index >= unnestListForCurrentRow.size()) { |
Check failure
Code scanning / CodeQL
User-controlled data in arithmetic expression
```java
// empty
NonnullPair<ExpressionType, Object[]> coerced = coerceListToArray(theList, false);
if (coerced == null) {
  return bestEffortOf(null);
```
Calling bestEffortOf(null) looks to me like it's going to go through like 15 if statements, failing them all before just returning new StringExprEval(null). Why not just return the good thing here, given that we already know what it should be, and avoid the potential branches?
I think it would probably be best to have null handled first in bestEffortOf, so that if we ever decide to represent null as something more sensible (like introducing a null type), we won't be making a 'string' here, not to mention saving a run through a bunch of checks for a null value.
Sure, I actually expected you to create a method for makeNullNode() or something in ExprEval instead of newing it up directly here. Just moving around the ordering of the statements is 🤷. Though checking for null first is probably better than last.
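The null-first ordering being discussed can be sketched like this, with plain strings standing in for ExprEval instances; the ofNull factory name is hypothetical:

```java
// Illustrative sketch: handle null before any other type check, through a
// dedicated factory, instead of letting it fall through every branch.
public class BestEffort
{
  // one place to change if null is ever given its own type
  public static String ofNull()
  {
    return "STRING:null";
  }

  public static String bestEffortOf(Object val)
  {
    // null handled first: no walk through the instanceof chain
    if (val == null) {
      return ofNull();
    }
    if (val instanceof Long) {
      return "LONG:" + val;
    }
    if (val instanceof Double) {
      return "DOUBLE:" + val;
    }
    return "STRING:" + val;
  }
}
```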
```java
Preconditions.checkNotNull(
    columnType.getElementType(),
    "Array type [%s] for value [%s] missing element type, how did this possibly happen?",
    eval.type(),
    eval.valueOrDefault()
);
```
Nit: I think this would be better as an if statement. If this ever gets thrown, someone is gonna want to attach a debugger to the point it gets thrown from, and they are gonna need to convert it to an if statement to be able to do that without setting conditions and stuff on their debug breakpoint.
this shouldn't ever happen, but I guess I can wrap it in an if
Just for clarity, I was suggesting that instead of using Preconditions you use an if statement.
```java
    sizeEstimate = globalDimensionDictionary.addStringArray(stringArray);
    return new StructuredDataProcessor.ProcessedValue<>(stringArray, sizeEstimate);
  default:
    throw new IAE("Unhandled type: %s", columnType);
```
Just double checking, is this where we expect an array-of-arrays or an array-of-objects to end up?
We don't really expect it to happen, because the StructuredDataProcessor should have done its job correctly and only defined fields for primitives and arrays of primitives, so this is more of a sanity check.
Ah, so array-of-arrays should be handled in StructuredDataProcessor instead?
Even after this PR, the handling of array-of-arrays and array-of-objects is still a bit muddy, right? (It's falling back to the "an array is just an object with numbers for field names" behavior?)
```java
    throw new IAE("Unhandled type: %s", columnType);
  }
  case STRING:
  default:
```
What's an example of when we hit the default case? I'm legitimately asking because I do not know the answer, and find myself wondering if believing that it's a String is really the right thing to do versus generating a parsing error or something of that nature.
I don't think we should hit the default case; I'm not entirely sure why I wrote it like this originally.
```java
    cardinality,
    System.currentTimeMillis() - dimStartTime
);
catch (Throwable ioe) {
```
Catching Throwable is dangerous, why cast such a wide net?
hmm, I don't remember why I added this. I think I was debugging something and needed an easy place to catch and tell me what was failing for which column, and catching here seemed the best way to tell me what was messed up. It could probably be narrowed down to IOException and just rethrown.
```java
// due to vbyte encoding, the null value is not actually stored in the bucket (no negative values), so we adjust
// the index
```
If I understood correctly, can I suggest updating the comment to be
```java
// For arrays, there is a conundrum of how to encode null differently from the empty array. In this code,
// we take the following approach:
// 1) the 0 entry in our dictionary is assumed to be indicative of a completely null entry
// 2) the 1 value in our dictionary is assumed to be the empty array
// 3) instead of storing the 0 value in this dictionary, we employ an index-shifting technique, where the
//    dictionary never stores null, and starts with the empty array (index 1), but shifted by 1 such that
//    what is persisted on disk is actually 0 instead of 1.
// adjustIndex represents whether the null existed and thus whether we actually need to adjust the value
```
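The shifting arithmetic this comment describes can be reduced to a one-line sketch; this assumes reads add the shift back when the column contained a null, and the method name is illustrative:

```java
// Illustrative sketch of the index-shifting scheme: the on-disk array
// dictionary omits the null entry, so reads re-apply the shift when a
// null existed (logical id 0 is null, logical id 1 is the empty array).
public class ArrayDictShift
{
  public static int storedToLogicalIndex(int storedIndex, boolean hasNull)
  {
    // with a null present, on-disk entry 0 is really the empty array
    // (logical id 1), entry 1 is logical id 2, and so on
    return hasNull ? storedIndex + 1 : storedIndex;
  }
}
```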
```java
  return ProcessedValue.NULL_LITERAL;
}
catch (IOException e) {
  throw new RE(e, "Failed to write field [%s] value [%s]", fieldPath, array);
```
This exception leaks the content of the data if it ever gets built. I think the "best" we can do here is mention which field it was trying to write.
yeah, that's fair, I can change it. I was thinking in terms of helping myself debug what wasn't handled correctly, but it is leaky.
```java
private static final byte STRING_ARRAY_MASK = 1 << 4;

private static final byte LONG_ARRAY_MASK = 1 << 5;

private static final byte DOUBLE_ARRAY_MASK = 1 << 6;
```
I dunno if I'm being stingy, but couldn't we have an array mask that is applied and then check the type mask after that? That would mean we could add array support and only consume 1 more bit instead of 3 more bits.
that would work if we only set 1 type, or if any row being an array meant all rows are arrays, but it doesn't currently, so we sort of need to know what type of array, at least the way things currently work
What about composite arrays? Like if the array has a String and a double? Or a String and an Object?
I had interpreted these as "this type exists in the column", not as "every row is this type". In which case, if there is an array, then it is true that the array type exists in the column, no? And then if the only scalar type is a long, you would know that the arrays are all longs, right?
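For reference, a sketch of the mask layout under discussion: three dedicated array bits alongside hypothetical scalar type bits. Collapsing them into a single "is array" bit would only work if element types were recoverable from the scalar bits, which the thread notes is not guaranteed when scalar and array types mix in one column. The array mask values mirror the snippet above; the scalar masks and helper are illustrative:

```java
// Illustrative sketch of a type-flag byte with separate scalar and array
// bits; only the *_ARRAY_MASK values come from the code under review.
public class TypeMasks
{
  public static final byte STRING_MASK = 1;       // assumed scalar bits
  public static final byte LONG_MASK = 1 << 1;
  public static final byte DOUBLE_MASK = 1 << 2;
  public static final byte STRING_ARRAY_MASK = 1 << 4;
  public static final byte LONG_ARRAY_MASK = 1 << 5;
  public static final byte DOUBLE_ARRAY_MASK = 1 << 6;

  // with dedicated array bits, "has string arrays" is answerable even
  // when the column also contains scalar longs
  public static boolean hasStringArray(byte typeMask)
  {
    return (typeMask & STRING_ARRAY_MASK) != 0;
  }
}
```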
```java
public static final Map<String, Object> QUERY_CONTEXT_NO_STRINGIFY_ARRAY =
    DEFAULT_QUERY_CONTEXT_BUILDER.put(QueryContexts.CTX_SQL_STRINGIFY_ARRAYS, false)
                                 .put(PlannerContext.CTX_ENABLE_UNNEST, true)
```
Not sure how much I care about this, but I expected this to be put on the Unnest/Array tests instead of across all of the tests in this base class.
QUERY_CONTEXT_NO_STRINGIFY_ARRAY is afaik only used in CalciteArrayQueryTest and CalciteNestedDataQueryTest, so I think this is fine.
changes:
* introduce ColumnFormat to separate physical storage format from logical type. ColumnFormat is now used instead of ColumnCapabilities to get column handlers for segment creation
* introduce new 'auto' type indexer and merger which produces a new common nested format of columns, which is the next logical iteration of the nested column stuff. Essentially this is an automatic type column indexer that produces the most appropriate column for the given inputs, making either STRING, ARRAY<STRING>, LONG, ARRAY<LONG>, DOUBLE, ARRAY<DOUBLE>, or COMPLEX<json>.
* revert NestedDataColumnIndexer, NestedDataColumnMerger, NestedDataColumnSerializer to their pre-#13803 behavior (v4) for backwards compatibility
* fix a bug in RoaringBitmapSerdeFactory if anything actually ever wrote out an empty bitmap using toBytes and then later tried to read it (the nerve!)



Description

This PR adds support for storing arrays of primitive values (`ARRAY<STRING>`, `ARRAY<LONG>`, and `ARRAY<DOUBLE>`) as specialized nested columns, instead of the current strategy which stores a separate nested column for each array element.

Basically it works by adding an additional value dictionary for all array values to our current `STRING` -> `LONG` -> `DOUBLE` dictionary stack. The array dictionary stores `int[]` values, where the elements are globalIds that point to values in the `STRING`/`LONG`/`DOUBLE` value dictionaries. So instead of the 2-phase lookup that scalar nested columns have, nested array values have a 3-phase lookup: first localId -> globalId, which gives an `int[]` of global ids which can then be translated into the actual values.

This dictionary is stored with a newly added `FrontCodedIntArrayIndexed`, which is pretty similar to `FrontCodedIndexed` added in #12277, but storing `int[]` instead of `byte[]`. There is probably more code that could be shared between the two implementations (currently there is none, and `FrontCodedIntArrayIndexed` and its writer are complete copies).

additional description TBD
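A toy illustration of the 3-phase lookup described above, with hard-coded tables standing in for the real localId -> globalId mapping, the `int[]` array dictionary, and the merged scalar value dictionaries; all dictionary contents are invented for the example:

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the 3-phase array lookup:
// localId -> globalId -> int[] of element globalIds -> element values.
public class ArrayLookup
{
  // phase 1: column-local ids to global array ids
  static final int[] LOCAL_TO_GLOBAL = {0, 2, 1};
  // phase 2: global array id -> element global ids
  static final int[][] ARRAY_DICTIONARY = {{0, 1}, {1}, {0, 2}};
  // phase 3: element global id -> value (the shared scalar dictionaries)
  static final String[] GLOBAL_VALUES = {"a", "b", "c"};

  public static List<String> lookup(int localId)
  {
    int[] elementIds = ARRAY_DICTIONARY[LOCAL_TO_GLOBAL[localId]];
    String[] out = new String[elementIds.length];
    for (int i = 0; i < elementIds.length; i++) {
      out[i] = GLOBAL_VALUES[elementIds[i]];
    }
    return Arrays.asList(out);
  }
}
```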