enrich expression cache key information to support expressions which depend on external state#11358
Conversation
…depend on external state such as lookups
| @Override | ||
| default byte[] getCacheKey() | ||
| { | ||
| return new CacheKeyBuilder(EXPR_CACHE_KEY).appendString(stringify()).build(); |
There was a problem hiding this comment.
stringify is somewhat expensive and this will be called once per segment. Can we cache it, like we cache the original and parsed expressions? Best way I can think of right now is to implement that caching in ExpressionFilter, ExpressionVirtualColumn, etc.
Btw, that caching should be lazy, for two reasons:
- otherwise we'll waste time computing it when it'll never be needed
- sometimes the cache key will be incomputable (lookup doesn't exist, or error retrieving it) and we don't want that to totally prevent instantiating the VirtualColumn or DimFilter objects
There was a problem hiding this comment.
added caching similar to the Expr themselves
|
|
||
| public class LookupExprMacro implements ExprMacroTable.ExprMacro | ||
| { | ||
| private static final byte LOOKUP_EXPR_KEY = 0x01; |
There was a problem hiding this comment.
Can we centralize these? Otherwise I worry about accidentally reusing IDs.
There was a problem hiding this comment.
I was worried about that too, but hesitated since Expr is in core and this is in processing and also some others such as the DimensionSpec have this problem, but yeah I agree and fixed it and moved both of them to Exprs because wasn't sure where else to put it and it seemed sort of funny on Expr.
| public class Exprs | ||
| { | ||
| public static final byte EXPR_CACHE_KEY = 0x00; | ||
| public static final byte LOOKUP_EXPR_KEY = 0x01; |
There was a problem hiding this comment.
Naming? LOOKUP_EXPR_KEY should be something like LOOKUP_CACHE_KEY, I think. For symmetry.
There was a problem hiding this comment.
oops, meant to rename to LOOKUP_EXPR_CACHE_KEY when i moved here, fixed
| return columnSelectorFactory.makeValueSelector(fieldName); | ||
| } | ||
|
|
||
| public static Supplier<byte[]> getSimpleAggregatorCacheKeySupplier( |
Description
This PR fixes an issue where expressions which depend on some external state which is also typically encoded into a cache key, such as a lookup, can have the incorrect cache key because the expression was only taking into account the string form of the expression. This causes incorrect results if the underlying lookup has changed, as the cache key did not encode this information and so remains unchanged, despite that the expression would result in a different value with the newer lookup.
This has been resolved by making
ExprextendCacheableand providing a defaultgetCacheKeyimplementation which uses the stringified form, but is overridden by the lookup expression to form a composite key of the stringified expression and the lookup extraction functions cache key.I moved
CacheKeyBuilderfromdruid-processingtodruid-core, which was marked with@PublicApi, so tagging with release notes (Cacheablewas already defined indruid-core).This PR has: