bitwise aggregators, better null handling options for expression agg#11280
clintropolis merged 8 commits into apache:master
this.initialValueExpressionString = initialValue;
this.initialCombineValueExpressionString = initialCombineValue == null ? initialValue : initialCombineValue;
this.initiallyNull = initiallyNull == null ? NullHandling.sqlCompatible() : initiallyNull;
This could be named as useNullInitially or isInitiallyNull to make it more clear in code. As above, I would also be fine with some other better name.
renamed to isNullUnlessAggregated to more clearly indicate that it is a boolean and hopefully indicate its main role in determining aggregator behavior. initiallyNull seemed confusing alongside initialValue.
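To make the flag's role concrete, here is a minimal sketch of the behavior the name is meant to convey, using the count and sum cases from the PR description. The class and the `aggregate` helper are hypothetical and written only for illustration; they are not Druid's `ExpressionLambdaAggregatorFactory` API.

```java
import java.util.List;
import java.util.function.LongBinaryOperator;

class NullUnlessAggregatedSketch
{
  // Hypothetical helper (not Druid code): folds rows into an accumulator that
  // starts at initialValue. If no rows are seen and isNullUnlessAggregated is
  // true, the result is null instead of initialValue.
  static Long aggregate(
      List<Long> rows,
      long initialValue,
      boolean isNullUnlessAggregated,
      LongBinaryOperator fold
  )
  {
    if (rows.isEmpty() && isNullUnlessAggregated) {
      return null;
    }
    long acc = initialValue;
    for (long row : rows) {
      acc = fold.applyAsLong(acc, row);
    }
    return acc;
  }

  public static void main(String[] args)
  {
    // count-like: flag is false, so an empty input still yields the initial value
    System.out.println(aggregate(List.of(), 0L, false, (acc, row) -> acc + 1)); // prints 0
    // sum-like: flag is true, so an empty input yields null
    System.out.println(aggregate(List.of(), 0L, true, Long::sum)); // prints null
    // OR-style bitwise fold over two rows: 5 | 3
    System.out.println(aggregate(List.of(5L, 3L), 0L, true, (a, b) -> a | b)); // prints 7
  }
}
```

Once at least one row has been folded, the flag no longer affects the result; it only decides what an aggregator that never saw a row reports.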
  {
    // | expression type (byte) | expression bytes |
-   ExprType type = ExprType.fromByte(buffer.get(position));
+   ExprType type = ExprType.fromByte((byte) (buffer.get(position) & TYPE_MASK));
I think it would be better to add a comment noting that only BufferLambdaExpressionAggregator calls this, which is why we clear the sign bit here: it is an artifact of the aggregator's implementation.
I reworked this to do the masking in the buffer aggregator instead of here
  return initialCombineValueExpressionString;
}

@JsonProperty("initiallyNull")
this should also be changed to isNullUnlessAggregated
{
  private static final short NOT_AGGREGATED_BIT = 1 << 7;
  private static final short IS_AGGREGATED_MASK = 0x3F;
  private static final byte TYPE_MASK = 0x0F;
Is it possible to drop either TYPE_MASK or IS_AGGREGATED_MASK and use a common mask whose value is 0x0F?
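For readers following the mask discussion, here is a small sketch of the general technique: one flag in the high bit of a type byte, plus a single low-nibble mask to recover the type. The class name, method names, and constant values are illustrative assumptions, not the code that was merged.

```java
class TypeByteFlagSketch
{
  // Illustrative constants (assumed for this sketch, not the merged values):
  // the top bit marks "nothing aggregated yet", the low 4 bits hold the type id.
  static final int NOT_AGGREGATED_BIT = 1 << 7;
  static final int TYPE_MASK = 0x0F;

  // Set the flag when the buffer slot is initialized.
  static byte markNotAggregated(byte typeByte)
  {
    return (byte) (typeByte | NOT_AGGREGATED_BIT);
  }

  // Clear the flag the first time (and every time) a row is aggregated.
  static byte clearNotAggregated(byte typeByte)
  {
    return (byte) (typeByte & ~NOT_AGGREGATED_BIT);
  }

  // True while no row has been aggregated into this slot.
  static boolean isNotAggregated(byte typeByte)
  {
    return (typeByte & NOT_AGGREGATED_BIT) != 0;
  }

  // Strip the flag so the remaining bits can be decoded as a type id.
  static byte typeOf(byte typeByte)
  {
    return (byte) (typeByte & TYPE_MASK);
  }
}
```

Because a Java `byte` is signed, setting bit 7 makes the stored value negative, so the type has to be masked before it is decoded. With the type confined to the low nibble, a single mask serves both purposes, which is what the question above is getting at.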
thanks for review @jihoonson and @rohangarg 🤘
…pache#11280)
* bitwise aggregators, better nulls for expression agg
* correct behavior
* rework deserialize, better names
* fix json, share mask
Description
Builds on top of #11104 and #10605 to add bitwise aggregator functions:
* BIT_AND(expr): null if druid.generic.useDefaultValueForNull=false, otherwise 0
* BIT_OR(expr): null if druid.generic.useDefaultValueForNull=false, otherwise 0
* BIT_XOR(expr): null if druid.generic.useDefaultValueForNull=false, otherwise 0

In the process of adding this, I've also modified ExpressionLambdaAggregatorFactory to have an additional JSON property, initiallyNull, which determines if the aggregator will produce a null value or initialValue / initialCombineValue. For example, an SQL compatible count aggregator would have initiallyNull set to false and have initialValue set to 0, so that it would always return 0 even if no rows were aggregated, while a sum would have it set to true so that it would return null in the same case. For the buffer aggregator, this is tracked by setting a bit in the expression type byte which prefixes all of the serialized expressions, which is then cleared whenever the aggregate function is called. This change simplifies ARRAY_AGG since it was previously using a finalize expression to coerce empty results back to null, but now it can just naturally be initialized to null.

This PR has: