Math expressional parameters for aggregator #2783
Do you allow quoting of column names? Are column names case-sensitive? I suggest you make the syntax as close to SQL as possible. That would also mean supporting quoting, and also using 'iif' rather than 'if'.
@julianhyde Currently it does not allow quoting, and identifiers are case-sensitive. I've used double quotes for string literals in #2836, but it would be better to change that to single quotes. I'll handle your comments in #2836.
@navis Thanks! We know that this expression language is going to be popular, and then we're going to wish that it were as close to SQL as we can manage; otherwise we'll spend years trying to undo the differences.
@julianhyde we made the decision back in Druid 0.7 to have column names be case-sensitive to keep things simple implementation-wise. As far as I know, SQL doesn't prescribe that column names be case-insensitive. For instance, PostgreSQL is case-sensitive, but has its SQL layer fold column names to lower-case if they are not quoted. I think it would be worthwhile to have a similar behavior for Druid, where any case-folding is left to the SQL layer doing the query translation.
@xvrl I didn't change the case-sensitivity of identifiers and have no intention of doing so, though I could possibly add an option for it.
It's totally valid and workable to have case-sensitive identifiers and leave their case intact (i.e. not convert them to lower case as PostgreSQL does, or to upper case as Oracle does). The only other thing is spaces in column names. If you allow space (and other punctuation) in column names then you need a way to use these in expressions. You might use double quotes (as Oracle and PostgreSQL), brackets (as SQL Server) or back-quotes (as MySQL). |
I think this is a duplicate of #1965. @himanshug could you review this?
Squashed and merged with #2820, since it had become hard to maintain after being abandoned for so long.
gianm
left a comment
@navis, thanks for the patch, looks great modulo some comments.
@himanshug, you started the original math expression, are you interested in taking a look at this patch? This would also be the first time the math expression stuff is actually exposed as an external API - do you have any concerns?
We will need documentation for expressions at some point, but I think it's ok to merge this without that, and leave it as an undocumented feature for now in master. Ideally before 0.9.3 we will write up some docs, firm up the API, and make this a documented feature.
IMO this would be nicer as a method on Expr like List<String> requiredBindings(). instanceof generally feels brittle and better to avoid if possible.
replaced with visitor, which seemed more elegant
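For readers following the exchange above, here is a minimal sketch of how a visitor can collect required bindings without `instanceof` checks. The class names (`IdentifierExpr`, `BinaryExpr`, `BindingCollector`, and so on) are hypothetical and simplified for illustration; they are not the classes from this patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified expression tree; not Druid's actual Expr classes.
interface Expr {
  void accept(Visitor visitor);

  // One callback per node type, so no instanceof is needed.
  interface Visitor {
    void visitLiteral(long value);
    void visitIdentifier(String name);
    void visitBinary(String op, Expr left, Expr right);
  }
}

final class LiteralExpr implements Expr {
  final long value;
  LiteralExpr(long value) { this.value = value; }
  @Override public void accept(Visitor visitor) { visitor.visitLiteral(value); }
}

final class IdentifierExpr implements Expr {
  final String name;
  IdentifierExpr(String name) { this.name = name; }
  @Override public void accept(Visitor visitor) { visitor.visitIdentifier(name); }
}

final class BinaryExpr implements Expr {
  final String op;
  final Expr left;
  final Expr right;
  BinaryExpr(String op, Expr left, Expr right) {
    this.op = op;
    this.left = left;
    this.right = right;
  }
  @Override public void accept(Visitor visitor) { visitor.visitBinary(op, left, right); }
}

// Collects identifier names in first-seen order, without duplicates.
final class BindingCollector implements Expr.Visitor {
  private final Set<String> names = new LinkedHashSet<>();

  @Override public void visitLiteral(long value) { /* literals need no binding */ }
  @Override public void visitIdentifier(String name) { names.add(name); }
  @Override public void visitBinary(String op, Expr left, Expr right) {
    left.accept(this);
    right.accept(this);
  }

  List<String> names() { return new ArrayList<>(names); }
}

final class RequiredBindingsSketch {
  static List<String> requiredBindings(Expr expr) {
    BindingCollector collector = new BindingCollector();
    expr.accept(collector);
    return collector.names();
  }
}
```

Calling `RequiredBindingsSketch.requiredBindings` on a tree for `a + b / c` would yield the names `a`, `b`, `c`.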
Suggest naming this objectBindings
ok. no one will use druid with javafx.
Suggest naming this supplierBindings
Suggest calling this "expression" to be consistent with the aggregators' "fieldExpression"
Would be nice to verify here that exactly one of fieldName / fieldExpression is set, but not both.
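The check suggested here could look roughly like the sketch below. The helper class and method names are hypothetical, and the exception type and message are assumptions; only the `fieldName` / `fieldExpression` parameter names come from the discussion.

```java
// Hypothetical helper; not code from this patch.
final class FieldOrExpressionCheck {
  // Passes only when exactly one of the two parameters is non-null.
  static void checkExactlyOne(String fieldName, String fieldExpression) {
    // The two null checks are equal when both are set or both are missing.
    if ((fieldName == null) == (fieldExpression == null)) {
      throw new IllegalArgumentException(
          "Must have exactly one of fieldName and fieldExpression");
    }
  }
}
```

An aggregator factory constructor could call this before storing either value, so that a misconfigured query fails early instead of at evaluation time.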
Yep, there was a double type here, and it looks like I removed it altogether by mistake. This would be rewritten with #2836, as commented above.
It looks like the intent is that missing/mistyped columns will be bound as null. Does the math expression library handle that well? What happens exactly?
It'll throw an NPE during evaluation. MathExpressionSelector should be rewritten with the following PR, #2836, anyway.
With that, null is handled as a null string and translated to 0 whenever a numeric type is needed.
null -> druid-defaults (empty string or 0) sounds good to me
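As a rough illustration of the null handling agreed on above (null becomes an empty string, or 0 when a number is needed), a coercion helper might look like this. The class and method names are hypothetical, and treating an empty string as 0 is an assumption made for this sketch rather than something stated in the thread.

```java
// Hypothetical coercion helper; not code from this patch or from #2836.
final class NullDefaultsSketch {
  // null is treated as the empty string.
  static String asString(Object value) {
    return value == null ? "" : value.toString();
  }

  // null (and, by assumption, the empty string) is treated as 0.
  static double asDouble(Object value) {
    if (value == null) {
      return 0.0d;
    }
    if (value instanceof Number) {
      return ((Number) value).doubleValue();
    }
    String s = value.toString();
    return s.isEmpty() ? 0.0d : Double.parseDouble(s);
  }
}
```

With defaults like these, an expression that references a missing column evaluates to a number instead of throwing a NullPointerException.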
throw UnsupportedOperationException preferred to returning null IMO.
All the others return null. It's named DUMMY_COLUMN_SELECTOR_FACTORY.
this could take Expr to avoid parsing overhead for each segment's query runner.
did a quick scan, overall looks fine to me... can give a more careful look next week.
Will there also be some documentation updates for this?
@erikdubbelboer yes, although likely in a future PR (the expression stuff this is based on was never documented).
How come this NumericColumnSelector reads everything as floats? Should there be a check for long type as well?
similar comment, should this check for long type and call getLongMetric() in that case?
similar suggestion to above could work here too.
Agreed, we should be supporting long in all of these places.
Can you elaborate on the choice for this ordering?
Should infinite > numeric? That seems more natural to me
The default is Long.compare / Double.compare, this code only gets used if you say "ordering" : "numericFirst". So I think that's okay.
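To make the ordering under discussion concrete, here is a sketch of a comparator in which finite values compare greater than NaN or infinite ones, falling back to `Double.compare` otherwise. That this is what "numericFirst" does is an assumption inferred from the question above, and the class name is hypothetical.

```java
import java.util.Comparator;

// Hypothetical sketch; assumes "numericFirst" ranks finite values above
// non-finite ones (NaN, +Infinity, -Infinity).
final class NumericFirstSketch {
  static final Comparator<Double> NUMERIC_FIRST = (lhs, rhs) -> {
    boolean lhsFinite = Double.isFinite(lhs);
    boolean rhsFinite = Double.isFinite(rhs);
    if (lhsFinite && !rhsFinite) {
      return 1;   // finite beats non-finite
    }
    if (!lhsFinite && rhsFinite) {
      return -1;  // non-finite loses to finite
    }
    // Both finite or both non-finite: use the natural double ordering.
    return Double.compare(lhs, rhs);
  };
}
```

Under this assumption a descending sort puts real numbers ahead of infinities and NaN, which is the opposite of the "infinite > numeric" ordering the reviewer found more natural.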
Hmm, this could probably do something like,
Object val = row.getRaw(columnName);
return val instanceof Number ? val : row.get().getFloatMetric(columnName);
Most of the time, val should be an instanceof number. The getFloatMetric should only trigger if it's a string we're trying to cast to a number.
Still 👍 on this, although consider https://github.com/druid-io/druid/pull/2783/files#r83581162.
not sure why we can't keep these private anymore?
We need to access fields when traversing the expression.
static here is probably not needed as enums are static classes anyways
It's copied from ArithmeticPostAggregator$Ordering. Would it be better to separate this into an independent class?
Ah ha, I knew I saw this somewhere before :)
Yeah, I think we could have a set of NumberComparators utilities sort of like we have StringComparators. Separate PR would be fine by me though.
These two are repeated everywhere. Would it be appropriate to change the method signature to take the arguments (Expression expr, List requiredBindings)?
I think it's not necessary.
…r) also includes math post aggregator (was apache#2820)
👍 to changes, thx @navis.
new changes LGTM, 👍
thx @jon-wei for taking a look. @himanshug, any further comments?
* Supports expression-paramed aggregator (squashed and rebased on master) also includes math post aggregator (was apache#2820)
* Addressed comments
* addressed comments
Someone wanted Druid to support an expression parameter for aggregators, something like sum(a + b / c). I don't know why it's needed either, but this is it. I will make a proposal on the dev mailing list soon after. Based on the great work (#2090) of @himanshug.
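To illustrate what sum(a + b / c) means for an aggregator, the sketch below evaluates the expression once per row and adds up the results. It uses a plain Java lambda as a stand-in for a parsed expression; the class, method, and field names are hypothetical and this is not how the patch wires expressions into Druid's aggregators.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch of an expression-driven sum; not Druid's API.
final class ExpressionSumSketch {
  // Stand-in for the parsed expression "a + b / c" (division binds tighter).
  static final ToDoubleFunction<Map<String, Double>> A_PLUS_B_OVER_C =
      row -> row.get("a") + row.get("b") / row.get("c");

  // Evaluates the expression against each row and sums the results.
  static double sum(List<Map<String, Double>> rows,
                    ToDoubleFunction<Map<String, Double>> expression) {
    double total = 0.0d;
    for (Map<String, Double> row : rows) {
      total += expression.applyAsDouble(row);
    }
    return total;
  }

  // Convenience builder for a row with columns a, b and c.
  static Map<String, Double> row(double a, double b, double c) {
    Map<String, Double> row = new HashMap<>();
    row.put("a", a);
    row.put("b", b);
    row.put("c", c);
    return row;
  }
}
```

For the rows (a=1, b=4, c=2) and (a=2, b=9, c=3), the per-row values are 3 and 5, so the aggregated sum is 8.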