Support string type in math expression#2836
Conversation
efe86e9 to
7d9feab
Compare
7d9feab to
10cbe62
Compare
|
Rebased and refactored so it looks less awful. |
1fc6e6b to
c2bd31d
Compare
|
@himanshug can you take a look? |
fjy
left a comment
Took a high level look and left high level comments
Rather than checking the type everywhere, can we use polymorphic deserialization instead? That is, different ExprEval subtypes would be instantiated automatically based on the Expr type.
I think with how this all works, it makes sense to have a type() method rather than polymorphism, since with polymorphism the value types would need to know how to operate on each other and that could get messy.
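A minimal sketch of the type()-based approach described above, for illustration only. The class and enum names here are hypothetical and are not taken from the patch. The values only carry a type tag, and the operator decides how two operands combine, so no value class needs to know about any other.

```java
// Hypothetical sketch; names are illustrative, not Druid's actual classes.
final class TypeDispatchSketch {
    enum ExprType { LONG, DOUBLE }

    // A value tagged with its type. It knows nothing about other value types.
    static final class ExprEval {
        private final ExprType type;
        private final Number value;

        ExprEval(ExprType type, Number value) {
            this.type = type;
            this.value = value;
        }

        ExprType type() { return type; }
        long longValue() { return value.longValue(); }
        double doubleValue() { return value.doubleValue(); }
    }

    // The operator owns the type logic: long + long stays long,
    // any other combination is widened to double.
    static ExprEval plus(ExprEval x, ExprEval y) {
        if (x.type() == ExprType.LONG && y.type() == ExprType.LONG) {
            return new ExprEval(ExprType.LONG, x.longValue() + y.longValue());
        }
        return new ExprEval(ExprType.DOUBLE, x.doubleValue() + y.doubleValue());
    }

    public static void main(String[] args) {
        ExprEval one = new ExprEval(ExprType.LONG, 1L);
        ExprEval half = new ExprEval(ExprType.DOUBLE, 0.5);
        System.out.println(plus(one, one).type());
        System.out.println(plus(one, half).type());
    }
}
```

With a polymorphic design, each value subclass would instead need a method per operand type it can be combined with, which is the messiness the comment refers to.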
can we call this class EvaluatedExpr as opposed to ExprEval?
Can we keep this? It's a trivial change, but it causes so many conflicts in the following PRs. We can discuss it anytime later.
Why can't you just return value here?
I think this class should use polymorphic deserialization to avoid so many type checks everywhere
It can also avoid passing in type everywhere
It's a name generated by ANTLR.
Why does the type have to change here?
The expression recognizes this as a long and produces a long result, unlike the arithmetic post-aggregator, which treats all numbers as doubles.
c2bd31d to
f421015
Compare
|
@fjy did you mean something like this? (f421015) |
It's for the 'case' function, which is not included in this PR. It seems I've split that part into another PR. I'll remove it.
Any reason we need asLong() and asDouble() in addition to longValue() and doubleValue()? These aren't currently being used. Can StringExprEval just override longValue() and doubleValue() (and probably intValue() and numericValue() as well)? This would also avoid the ClassCastException you'd get right now calling say longValue() on a StringExprEval.
Also having a method called asDouble() and one called doubleValue() is confusing to me as I'm not sure when I'd use each one, so I suggest consolidating them if possible.
If StringExprEval overrode longValue() and doubleValue() these wouldn't be necessary
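A rough sketch of what overriding the numeric accessors could look like, to make the suggestion concrete. The base class below is a hypothetical stand-in, and returning 0 for null or unparseable strings is purely an assumption for illustration; the patch may choose a different fallback.

```java
// Hypothetical sketch; not the classes from the patch.
final class StringEvalSketch {
    // Stand-in base class: numeric accessors simply cast the stored value.
    static class ExprEval {
        protected final Object value;

        ExprEval(Object value) { this.value = value; }

        // Throws ClassCastException when the stored value is a String.
        long longValue() { return ((Number) value).longValue(); }
        double doubleValue() { return ((Number) value).doubleValue(); }
    }

    // String-valued eval that parses instead of casting.
    static final class StringExprEval extends ExprEval {
        StringExprEval(String value) { super(value); }

        @Override
        long longValue() {
            try {
                // Assumption: null and unparseable strings map to 0.
                return value == null ? 0L : Long.parseLong((String) value);
            } catch (NumberFormatException e) {
                return 0L;
            }
        }

        @Override
        double doubleValue() {
            try {
                return value == null ? 0.0 : Double.parseDouble((String) value);
            } catch (NumberFormatException e) {
                return 0.0;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(new StringExprEval("42").longValue());
        System.out.println(new StringExprEval("1.5").doubleValue());
    }
}
```

With the overrides in place, callers can use longValue() and doubleValue() on any eval without a separate asLong() or asDouble().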
I have three or four different versions of the cast function and this seems to be the oldest one. I'll address that.
Should be able to just call x.asString() which has the same logic
Yep, that is the difference between 'stringValue()' and 'asString()'. If the user has already checked that the type of the expr is X, they can use 'Xvalue()' to access the value without checking the type again. Would it still be better to remove them?
@navis IMO if there is no significant performance hit, it feels better to consolidate them as it is confusing to me which one I should use without looking through the code, but I'm open for other suggestions
Same idea: do we need both stringValue() and asString()? Can we just use this logic (using String.valueOf()) in stringValue() instead of the typecast, or are there performance implications?
There is some historical reason for this, but OK, I'll remove some methods.
Druid favors JodaTime over java.util.Date. Any reason we can't implement this method using JodaTime?
Might as well check that either 1 or 2 arguments were supplied instead of > 0
IllegalArgumentException for consistency, needs 2 arguments
I think this is different behavior from SQL NVL which doesn't treat empty strings as null. If this behavior is required, would it make sense to make an additional function to handle the nullOrEmpty case?
A following PR will use dimensions as input to expressions (which is the main purpose of this PR). In that case it's not so clear whether an empty string should be considered null or not.
@navis I was wrong, seems that Oracle does treat empty strings as null (http://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_elements005.htm). I'm good with this.
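For reference, the behavior agreed on in this thread can be sketched as follows. This is a hypothetical standalone helper, not the code from Function.java: it treats both null and the empty string as missing, and it rejects any argument count other than two with an IllegalArgumentException.

```java
// Hypothetical sketch of nvl semantics; not the implementation from the patch.
final class NvlSketch {
    // Returns the second argument when the first is null or empty.
    static String nvl(String... args) {
        if (args.length != 2) {
            throw new IllegalArgumentException("function 'nvl' needs 2 arguments");
        }
        String first = args[0];
        return (first == null || first.isEmpty()) ? args[1] : first;
    }

    public static void main(String[] args) {
        System.out.println(nvl("", "fallback"));
        System.out.println(nvl("value", "fallback"));
    }
}
```

A strict variant that only replaces null would drop the isEmpty() check, which is the alternative raised earlier in the thread.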
Would be great to see tests for the rest of the functions in Function.java if possible, or at least for the ones that were added (cast, unix_timestamp, nvl)
|
Also please add documentation for the new functions |
f421015 to
6896510
Compare
|
Committed the patch addressing the comments first. I think I should squash these to rebase on #3630, which is really painful. |
|
On @dclim's "please add documentation" comment, IMO, we don't need this right now. None of the expression stuff is documented currently. I raised #3634 for this earlier today. We need to start those docs and pay down that debt before 0.9.3, but I think it doesn't need to be in this PR since the API is still in flux. I think it makes sense to work out the API details through further PRs and then write the docs. |
|
@gianm that sounds fine to me, but FYI there is a documentation page (but I don't think it's reachable from anywhere right now!): http://druid.io/docs/0.9.1.1/misc/math-expr.html |
|
@dclim oops, I missed that. Nevermind, in that case I agree this PR should have some additions there. But let's keep it unlinked until we're happy with the API. |
|
btw, that conflict between #3630 and here makes me wonder how we should deal with null strings in the expression language. My first instinct (not having thought about it too much) is that we should take the attitude that Druid doesn't actually generate any nulls from its columns. I'm intending to do something similar in the druid-sql I'm working on, see https://groups.google.com/d/msg/druid-development/3npt9Qxpjr0/IeJQP0YXBQAJ. i.e. if the dimension selector or an extractionFn returns |
86ed3ea to
c53bdcc
Compare
f09cf16 to
048797e
Compare
Should this be x.type() == ExprType.STRING || y.type() == ExprType.STRING?
It'll be very funny when someone forgets to override at least one of these 😄
Hmm, I think this means all nulls, even nulls of columns we want to treat as numbers (e.g. missing columns), are getting represented as StringExprEvals. I guess that works but seems strange. It might make sense for null to have its own datatype.
I guess I don't have strong feelings on this now for this PR but we should consider null handling before finalizing the expressions API.
Raised issue in #3645 to consider this further.
|
@navis the point of the IAE and ISE exceptions is that they let you pass a format string and its arguments directly, rather than formatting the message yourself before constructing the exception. That style is preferred in Druid. |
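The snippets from this comment were not preserved here, so the following is only an illustrative sketch of the pattern. The IAE class below is a local stand-in written for this example, assuming a constructor that takes a format string followed by its arguments. The checkArgs helper and its message wording are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; IAE here is a local stand-in, not Druid's class.
final class FormattedExceptionSketch {
    // An IllegalArgumentException that formats its own message.
    static final class IAE extends IllegalArgumentException {
        IAE(String formatText, Object... arguments) {
            super(String.format(formatText, arguments));
        }
    }

    // Rejects any argument count other than 1 or 2.
    static void checkArgs(String name, List<String> args) {
        if (args.size() != 1 && args.size() != 2) {
            // Without the helper this would read:
            //   throw new IllegalArgumentException(
            //       String.format("function '%s' needs 1 or 2 arguments", name));
            throw new IAE("function '%s' needs 1 or 2 arguments", name);
        }
    }

    public static void main(String[] args) {
        checkArgs("example", Arrays.asList("a", "b")); // accepted
        try {
            checkArgs("example", Arrays.asList("a", "b", "c"));
        } catch (IAE e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because IAE extends IllegalArgumentException in this sketch, callers that catch the standard exception type keep working.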
stringValue() is only being used in unit tests now. If it is not needed for a follow-on PR, it can be removed.
should just say needs 1 or 2 arguments, instead of 'at least'
|
I'm good with this after the last few comments and Gian's comments have been addressed. |
addressed comments addressed comments Addressed comments
048797e to
7c93ccb
Compare
7c93ccb to
ef40017
Compare
|
👍 |
* Support string type in math expression
  addressed comments
  addressed comments
  Addressed comments
* Updated math function document
* Addressed comments
Wanted to do something like
__time > timestamp("2010-03-12T11:27:00")
with math expr, but that is not currently supported. This is a first attempt at supporting the string type in math expr.