Update the docs for EARLIEST_BY/LATEST_BY aggregators with the newly added numeric capabilities #15670
The earliest/latest/earliest_by/latest_by aggregators should all be deprecated anyway. Please redirect this work to finishing up this PR: #14195
    || SerializablePairLongDoubleComplexMetricSerde.TYPE_NAME.equals(complexColumnTypeName)
    || SerializablePairLongStringComplexMetricSerde.TYPE_NAME.equals(complexColumnTypeName))) {
  plannerContext.setPlanningError(
      "Cannot call %s with an explicit 'timeExpr' column for pre-aggregated metric of type [%s]. Use %s instead "
This doesn't look right to me: EARLIEST/LATEST are rewritten to their *_BY variants by #15095.
This message would suggest that something like:
  @Test
  public void testEarliestWorks1()
  {
    testBuilder()
        .sql("SELECT EARLIEST(long_last_added) FROM wikipedia_first_last")
        .run();
  }
should pass. However, it fails with the same error message.
Thanks for the comment. I tried to add an isReplaced flag to distinguish between the two calls, but during Calcite's SQL planning phase the call is rewritten in a way that loses that information, and the same SQL function is used for both variants (earliest, earliest_by). I cannot do a type check on the custom __time operand that was passed, so the next best option is to just check whether the identifier is __time or not.
This would mean EARLIEST_BY(complexMetric, __time) would also pass, but I couldn't come up with an easy fix for that.
Since these will be deprecated at some point, I think it's an acceptable compromise.
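For readers following along, here is a minimal, self-contained sketch of the identifier-based check described above. It is not the code from this PR: the class name, the method name, the literal type names (placeholders standing in for the serde TYPE_NAME constants), and the choice of replacement function in the message are all assumptions made for illustration.

```java
import java.util.Optional;
import java.util.Set;

public final class TimeExprCheckSketch
{
  // Assumed placeholder names standing in for the serde TYPE_NAME constants.
  private static final Set<String> PRE_AGGREGATED_PAIR_TYPES =
      Set.of("serializablePairLongDouble", "serializablePairLongString");

  private static final String DEFAULT_TIME_COLUMN = "__time";

  /**
   * Returns a planning error when a pre-aggregated pair metric is combined with
   * a time operand whose identifier is anything other than __time.
   * Hypothetical helper; only the name of the operand is compared, not its type.
   */
  public static Optional<String> planningError(
      String functionName,
      String timeOperandName,
      String complexColumnTypeName
  )
  {
    // Set.of(...) rejects null lookups, so guard non-complex columns explicitly.
    boolean preAggregated = complexColumnTypeName != null
                            && PRE_AGGREGATED_PAIR_TYPES.contains(complexColumnTypeName);
    boolean defaultTime = DEFAULT_TIME_COLUMN.equals(timeOperandName);

    if (preAggregated && !defaultTime) {
      // Assumption: the suggested replacement is the variant without the _BY suffix.
      return Optional.of(String.format(
          "Cannot call %s with an explicit 'timeExpr' column for pre-aggregated metric of type [%s]. Use %s instead",
          functionName,
          complexColumnTypeName,
          functionName.replace("_BY", "")
      ));
    }
    return Optional.empty();
  }

  public static void main(String[] args)
  {
    // Explicit non-__time column on a pre-aggregated metric: an error is reported.
    System.out.println(planningError("EARLIEST_BY", "timestampCol2", "serializablePairLongString"));
    // __time as the identifier: accepted, whether or not the user typed it explicitly.
    System.out.println(planningError("EARLIEST_BY", "__time", "serializablePairLongString"));
  }
}
```

Because the sketch compares names only, an explicit EARLIEST_BY(complexMetric, __time) is indistinguishable from the rewritten EARLIEST(complexMetric), which is the compromise mentioned above.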
Since it's a tiny PR, and with the addition of the pairLongNumeric types I wanted to tighten up the SQL handling, I am cool with working on both PRs. Testing things out with MSQ is a way to ensure that we have parity with the native engine for as long as the old aggregators exist.
vtlim left a comment
Just need to fix the missing > on the first <br />. Otherwise docs look good.
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
…added numeric capabilities (apache#15670)
Description
This PR:
First/Last aggregators call .toString() on complex metrics (those that aren't of type pairLongLong, pairLongString...) and on array types, which is also weird. However, that hasn't been changed, because it has been supported for a long time and is also documented implicitly.
Disallowing EARLIEST_BY(aggregatedMetric, timestampCol2) will require users to change their queries. However, the equivalent call is EARLIEST(aggregatedMetric), which is a lot clearer, because no column explicitly typed by the user is silently ignored.
Release note
EARLIEST_BY and LATEST_BY cannot be used with complex objects created during ingestion (with rollup) with the first/last aggregators.
Key changed/added classes in this PR
This PR has: