Skip to content

Windowed min aggregates null-s as 0#15371

Merged
clintropolis merged 27 commits intoapache:masterfrom
kgyrtkirk:window-min-null-to-0
Dec 8, 2023
Merged

Windowed min aggregates null-s as 0#15371
clintropolis merged 27 commits intoapache:masterfrom
kgyrtkirk:window-min-null-to-0

Conversation

@kgyrtkirk
Copy link
Copy Markdown
Member

@kgyrtkirk kgyrtkirk commented Nov 14, 2023

Correct the implementation of ObjectColumnSelector#isNull

Fixes #15370

This PR has:

  • been self-reviewed.

Copy link
Copy Markdown
Contributor

@soumyava soumyava left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Minor comments

* calling. "Polymorphism" of ObjectColumnSelector should be used only when operating on {@link ColumnValueSelector}
* objects.
*/
@Deprecated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm this is not deprecated anymore ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why it was marked like that

  • the implemented interface methods can be deprecated - but they aren't
  • it did provided a working implementation (no idea why...)
  • it has the @Deprecated annotation for >5 years - should we really carry it forward?
    • is anyone committed to fix this deprecation?
    • what are benefits of doing so?
  • the apidoc's were a bit contradicting; but the `is convenient for implementation of "object-sourcing" seemed the most accurate.
  • the class itself is not @Deprecated so anyone can just extend it and use it

because of the above I've just removed them...

I think instead of @Deprecated: if the implementation should reject calls to getLong and friends it should just throw an Exception in the default implementation explaining the situation - but even in that case it can't really be deprecated as this class is not the one declaring those methods.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the main reason it was marked deprecated is so that intellij makes it pretty obvious if you're accidentally calling it on an object selector, its not really possible to remove it without reworking the column selector interfaces to split them up

T value = getObject();
if (value == null) {
return 0;
throw DruidException.defensive(COLUMN_IS_NULL_ERROR_MESSAGE);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add an unit test as mentioned in the issue

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've extended windowed_long_null.sqlTest a bit to cover one there too - but the drill related cases were also been fixed by this change

Copy link
Copy Markdown
Contributor

@somu-imply somu-imply left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes, lets just rebase with master and I'll merge this

public final boolean isNull()
{
return false;
return getObject() == null;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while I guess this is harmless, nothing calling getObject should ever be using isNull, instead they should be checking that getObject returns null

if this change is fixing something then whatever it is fixing is probably wrong

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that said, i guess we could change the contract of isNull to always work, its just also sort of pointless since getObject can just return the null (while getLong/getDouble/getFloat cannot since primitives), so why call two methods when you only need to call one

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method should be either implemented correctly or throw an exception that its not supported - that will make sure that there won't be correctness issues if somehow it gets called

it is called from places like this - how should we fix that?

I don't understand clearly what you mean by "changing the contract of isNull" ; I believe that's defined here; and returning false have kinda violated it

Copy link
Copy Markdown
Member

@clintropolis clintropolis Nov 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method should be either implemented correctly or throw an exception that its not supported - that will make sure that there won't be correctness issues if somehow it gets called

That is what i was getting at, if getLong/getFloat/getDouble are going to throw exceptions, then this should too, since it should only be called if you are planning on calling one of those methods.

it is called from places like this - how should we fix that?

NullableNumericAggregator is only for things that are going to be calling getLong/getFloat/getDouble, which are the only things that are supposed to call isNull, because those methods return primitives which cannot be null. That aggregator and the things it wraps should never be using a selector derived from this one, since it is designed to aggregate primitive numeric values.

I don't understand clearly what you mean by "changing the contract of isNull" ; I believe that's defined here; and returning false have kinda violated it

This does not violate it, because no one should ever call getLong/getFloat/getDouble on that value selector, which means they should also not be using isNull, and should instead just call getObject and see if the result is null. Calling isNull if you intend to call getObject is violating the current contract.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is what i was getting at, if getLong/getFloat/getDouble are going to throw exceptions, then this should too, since it should only be called if you are planning on calling one of those methods.

I think it should throw an exception because null can't be represented as a primitive; it could highlight that you should call isNull first - I can revert that change; but I think those default values may only help to hide bugs....

[...] NullableNumericAggregator is only for things that are going to be calling [...]

that sounds rather odd - there are no interfaces enforcing that contract; it only requires that the selector implement BaseNullableColumnValueSelector ; and because of that object which implement that should work with it

This ObjectColumnSelector class provides getLong and friends based on getObject which is exactly how its services are used here

This does not violate it, because no one should ever call

"no one should ever call" : that can't be enforced; it
providing a return false implementation is wrong - and a violation of the contract in this case.

Calling isNull if you intend to call getObject is violating the contract.
now you lost me...then why implement ColumnValueSelector at all? or not just throw exceptions from all of them?
this seems like some kind of design issue...

I think a possible alternate way to fix this could be to not implement isNull in ObjectColumnSelector and force classes extending it to provide their own implementation - what do you think ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current design is not great because it cannot enforce how callers use stuff, I agree completely and have some ambition of reworking how this stuff works to drop all of this polymorphism if possible, but its also a gigantic task since ColumnValueSelector is use in a ton of places, so it will take some time.

ObjectColumnSelector picks up BaseNullableColumnValueSelector because BaseLongColumnValueSelector etc implement it. It seems controversial to me that they provide default implementations of the get primitive methods at all because it seems like it is easy to misuse.

The javadocs of BaseNullableColumnValueSelector explain the current contract of isNull, and specify that things calling getObject should not call isNull because it is not for them.

I actually kind of think maybe ObjectColumnSelector should throw exceptions by default for getLong/getDouble/getFloat/isNull, because it isn't really intended to be used by things which should also maybe be long or float selectors, but i have no idea how many things that might break. For ColumnValueSelector` that do need to act as all types of selectors potentially, such as the example you linked:

This ObjectColumnSelector class provides getLong and friends based on getObject which is exactly how its services are used here

They should probably implement all of the methods themselves, because that is not that common of a pattern and only happens in some special cases. Usually ColumnValueSelector are specialized for the type they are selecting over.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think

return new ObjectColumnSelector()
{
@Override
public void inspectRuntimeShape(RuntimeShapeInspector inspector)
{
}
@Nullable
@Override
public Object getObject()
{
return results[index];
}
@Override
@Nonnull
public Class classOfObject()
{
return results[index].getClass();
}
};
is mistake, and this should not be making an ObjectColumnSelector and instead should implement ColumnValueSelector directly.

As far as I can tell, everything else using ObjectColumnSelector is truly something that will only call getObject like complex types of various sorts.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there are two ways to provide a ColumnValueSelector for Object -t:

  • one which doesn't want to be treated like something which could answer a call like getLong() or isNull() as it wants to enforce that people will use the getObject() call
  • another one which provides the getLong and other methods based on a backing getObject which could be even a primitive

The old ObjectColumnSelector was kinda tring to do both of the above; in the latest changeset I've modified it to simply throw an Exception for every getX call to prevent incorrect usage - I needed to make some changes to some tests as they were asserting on the isNull method (which was implemented by return false).

I've added ta second class which does the 2nd kind of thing named ObjectBasedColumnSelector - and its used in the windowing stuff and a few other places; I don't see much value of copy-pasting that implementation to every place as if you have an array of objects. If someone might want to call getLong() on it there is not really a alternate ways...if this ends up being slow or something - and identified as a performance bottleneck...we could roll ArrayBasedTypeColumnSelector classes via a strategy pattern and done...but right now I doubt that around windowing this would be the 1st bottleneck :D

* calling. "Polymorphism" of ObjectColumnSelector should be used only when operating on {@link ColumnValueSelector}
* objects.
*/
@Deprecated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the main reason it was marked deprecated is so that intellij makes it pretty obvious if you're accidentally calling it on an object selector, its not really possible to remove it without reworking the column selector interfaces to split them up

…IndexerV4Test,AutoTypeColumnIndexerTest,ConcurrentGrouperTest,DoubleAnyAggregationTest,FastLineIteratorTest,FloatAnyAggregationTest,LongAnyAggregationTest,NestedDataColumnIndexerV4Test,ParallelCombinerTest
*
* This abstract class provides a {@link ColumnValueSelector} based on the methods of {@link BaseObjectColumnValueSelector}.
*/
public abstract class ObjectBasedColumnSelector<T> implements ColumnValueSelector<T>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a better name for this because this isn't strictly an object selector like ObjectColumnSelector is, and instead since it also implements BaseLongColumnValueSelector, BaseDoubleColumnValueSelector,
BaseFloatColumnValueSelector and BaseNullableColumnValueSelector, so its sort of a multi-type selector. It is also important to call the contract of this class - that it doesn't do any real type coercion and only satisfies the numeric selectors if the value is already a Number.

@Override
public final boolean isNull()
{
return getObject() == null;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should also probably be checking that the result of getObject() is instanceof Number. The things that are going to call isNull() will do so in service of calling it prior to calling getLong/getFloat/getDouble, but if Object is not a Number then this will result in a class cast exception and there is no restriction on this class that the things it provides are always instances of a Number.

Things that are calling getObject should already know not to be checking isNull for null checks, so I don't think that would cause any problems if isNull returns true but getObject returns a non-null value.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the added value of such a check as the classcastexception will tell the story pretty nicely where/why the issue happened - but I've added some ifs

* DoubleColumnSelector} solely in order to make {@link #isNull()} method final.
* Restricts selector usage to only allow {@link #getObject()}.
*/
public abstract class ObjectColumnSelector<T> implements ColumnValueSelector<T>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think might have gone a bit overboard in removing the javadoc contract of this class, some of that stuff still seems relevant about only extending it

Comment thread processing/src/test/java/org/apache/druid/segment/AutoTypeColumnIndexerTest.java Outdated
Comment thread processing/src/test/java/org/apache/druid/segment/AutoTypeColumnIndexerTest.java Outdated
@clintropolis clintropolis merged commit c353ccf into apache:master Dec 8, 2023
@317brian
Copy link
Copy Markdown
Contributor

317brian commented Jan 3, 2024

@clintropolis @soumyava should this have the bug fix label?

@bsyk
Copy link
Copy Markdown
Contributor

bsyk commented Jan 5, 2024

This seems to break NullableNumericBufferAggregator
https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/query/aggregation/NullableNumericBufferAggregator.java#L70

Nevermind - it was just my use of ComplexColumn that was throwing it off.

@LakshSingla LakshSingla added this to the 29.0.0 milestone Jan 29, 2024
@maytasm
Copy link
Copy Markdown
Contributor

maytasm commented May 31, 2024

Hmmm…..this PR might be causing a regression from c353ccf#diff-1aa22bcd62ddd4457f128832de7d4a008b954d015d8b3be6a03e3b3471aad6e3R57
So….seems like if we have a COMPLEX column, the complex column can no longer be an input for numeric aggregator like doubleSum in the sql compatibility mode. In the sql compatibility mode, it will wrap the COMPLEX column selector (ObjectColumnSelector) in a NullableNumericBufferAggregator and then this call: https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/query/aggregation/NullableNumericBufferAggregator.java#L70 would fails.

@kgyrtkirk
Copy link
Copy Markdown
Member Author

I feel like its a bit odd to have a COMPLEX column as an input to an aggregate which accepts literals like doubleSum - shouldn't that be a compilation (or runtime) error instead?

Could you share a testcase to get a clearer understanding?

@bsyk
Copy link
Copy Markdown
Contributor

bsyk commented May 31, 2024

Why is it odd for a complex column to support SUM?
In our current use-case, we have a complex column that represents a histogram. That can be queried with a custom aggregator to get an aggregated histogram or compute percentiles. It can also be queried as a SUM to get the population of the histogram.
It seems totally reasonable to me to allow complex columns to support SUM, and better to use standard aggregators than force the column to be queried using custom aggregators that are effectively doing the same thing as the standard ones.

Could we get IncrementalIndex.makeMetricColumnValueSelector() to use the actual ColumnValueSelector for the appropriate complex type?

@maytasm
Copy link
Copy Markdown
Contributor

maytasm commented May 31, 2024

@kgyrtkirk
Here is the documentation regarding what @bsyk is talking about: https://druid.apache.org/docs/latest/development/extensions-contrib/spectator-histogram/#longsum-doublesum-and-floatsum-aggregators
The COMPLEX can be an input to longSum, doubleSum and floatSum aggregators, which will return the population size (count of events contributing to the histogram).

@kgyrtkirk
Copy link
Copy Markdown
Member Author

I think it would be better to not provide a sketch and literal values at the same time from the same aggregator.
I'm a little bit unsure if that is a recommended way to do things - but @gianm might know better.

I also suspect that these aggregators will not work well with the windowed codepaths - and in case of a windowed sum a ClassCastException would be thrown.

I was not able to uncover why the need for spectatorHistogramTimer and spectatorHistogramDistribution aliases - as they seem to be just pure aliases with no difference in behaviour (or I missed it?) - but if they do change anything: I wonder if they could be implmeneted as a postagg method which gets a sketch as input like SpectatorHistogramPercentilePostAggregator?

I think if there is a real need to have pure aliases to an existing aggregator - we should find an alternate way which doesn't duplicate class/types/etc.

I feel like its very unfortunate that this was documented ; but not ensured to work with a test...
I don't know if this could be considered a regression as #15340 was merged a month before #15371 - so this was never worked like the documentation suggested.

I've spent some time trying to uncover what your expectations/options might be in this case - If you could write a testcase to get a better understanding - I think that will most likely speed up our communication speed to fix this.

gianm added a commit to gianm/druid that referenced this pull request Jun 5, 2024
PR apache#15371 eliminated ObjectColumnSelector's built-in implementations of
numeric methods, which had been marked deprecated. Some complex types, like
SpectatorHistogram, can be successfully coerced to number.

An alternative approach would be to restore (and un-deprecate) the built-in
implementations of numeric methods in ObjectColumnSelector.
gianm added a commit to gianm/druid that referenced this pull request Jun 6, 2024
PR apache#15371 eliminated ObjectColumnSelector's built-in implementations of
numeric methods, which had been marked deprecated.

However, some complex types, like SpectatorHistogram, can be successfully coerced
to number. The documentation for spectator histograms encourages taking advantage of
this by aggregating complex columns with doubleSum and longSum. Currently, this
doesn't work properly for IncrementalIndex, where the behavior relied on those
deprecated ObjectColumnSelector methods.

This patch fixes the behavior by making two changes:

1) SimpleXYZAggregatorFactory (XYZ = type; base class for simple numeric aggregators;
   all of these extend NullableNumericAggregatorFactory) use getObject for STRING
   and COMPLEX. Previously, getObject was only used for STRING.

2) NullableNumericAggregatorFactory (base class for simple numeric aggregators)
   has a new protected method "useGetObject". This allows the base class to
   correctly check for null (using getObject or isNull).

The patch also adds a test for SpectatorHistogram + doubleSum + IncrementalIndex.
gianm added a commit to gianm/druid that referenced this pull request Jun 6, 2024
PR apache#15371 eliminated ObjectColumnSelector's built-in implementations of
numeric methods, which had been marked deprecated.

However, some complex types, like SpectatorHistogram, can be successfully coerced
to number. The documentation for spectator histograms encourages taking advantage of
this by aggregating complex columns with doubleSum and longSum. Currently, this
doesn't work properly for IncrementalIndex, where the behavior relied on those
deprecated ObjectColumnSelector methods.

This patch fixes the behavior by making two changes:

1) SimpleXYZAggregatorFactory (XYZ = type; base class for simple numeric aggregators;
   all of these extend NullableNumericAggregatorFactory) use getObject for STRING
   and COMPLEX. Previously, getObject was only used for STRING.

2) NullableNumericAggregatorFactory (base class for simple numeric aggregators)
   has a new protected method "useGetObject". This allows the base class to
   correctly check for null (using getObject or isNull).

The patch also adds a test for SpectatorHistogram + doubleSum + IncrementalIndex.
gianm added a commit to gianm/druid that referenced this pull request Jun 6, 2024
PR apache#15371 eliminated ObjectColumnSelector's built-in implementations of
numeric methods, which had been marked deprecated.

However, some complex types, like SpectatorHistogram, can be successfully coerced
to number. The documentation for spectator histograms encourages taking advantage of
this by aggregating complex columns with doubleSum and longSum. Currently, this
doesn't work properly for IncrementalIndex, where the behavior relied on those
deprecated ObjectColumnSelector methods.

This patch fixes the behavior by making two changes:

1) SimpleXYZAggregatorFactory (XYZ = type; base class for simple numeric aggregators;
   all of these extend NullableNumericAggregatorFactory) use getObject for STRING
   and COMPLEX. Previously, getObject was only used for STRING.

2) NullableNumericAggregatorFactory (base class for simple numeric aggregators)
   has a new protected method "useGetObject". This allows the base class to
   correctly check for null (using getObject or isNull).

The patch also adds a test for SpectatorHistogram + doubleSum + IncrementalIndex.
gianm added a commit that referenced this pull request Jul 25, 2024
* Coerce COMPLEX to number in numeric aggregators.

PR #15371 eliminated ObjectColumnSelector's built-in implementations of
numeric methods, which had been marked deprecated.

However, some complex types, like SpectatorHistogram, can be successfully coerced
to number. The documentation for spectator histograms encourages taking advantage of
this by aggregating complex columns with doubleSum and longSum. Currently, this
doesn't work properly for IncrementalIndex, where the behavior relied on those
deprecated ObjectColumnSelector methods.

This patch fixes the behavior by making two changes:

1) SimpleXYZAggregatorFactory (XYZ = type; base class for simple numeric aggregators;
   all of these extend NullableNumericAggregatorFactory) use getObject for STRING
   and COMPLEX. Previously, getObject was only used for STRING.

2) NullableNumericAggregatorFactory (base class for simple numeric aggregators)
   has a new protected method "useGetObject". This allows the base class to
   correctly check for null (using getObject or isNull).

The patch also adds a test for SpectatorHistogram + doubleSum + IncrementalIndex.

* Fix tests.

* Remove the special ColumnValueSelector.

* Add test.
sreemanamala pushed a commit to sreemanamala/druid that referenced this pull request Aug 6, 2024
…6564)

* Coerce COMPLEX to number in numeric aggregators.

PR apache#15371 eliminated ObjectColumnSelector's built-in implementations of
numeric methods, which had been marked deprecated.

However, some complex types, like SpectatorHistogram, can be successfully coerced
to number. The documentation for spectator histograms encourages taking advantage of
this by aggregating complex columns with doubleSum and longSum. Currently, this
doesn't work properly for IncrementalIndex, where the behavior relied on those
deprecated ObjectColumnSelector methods.

This patch fixes the behavior by making two changes:

1) SimpleXYZAggregatorFactory (XYZ = type; base class for simple numeric aggregators;
   all of these extend NullableNumericAggregatorFactory) use getObject for STRING
   and COMPLEX. Previously, getObject was only used for STRING.

2) NullableNumericAggregatorFactory (base class for simple numeric aggregators)
   has a new protected method "useGetObject". This allows the base class to
   correctly check for null (using getObject or isNull).

The patch also adds a test for SpectatorHistogram + doubleSum + IncrementalIndex.

* Fix tests.

* Remove the special ColumnValueSelector.

* Add test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Windowed min aggregates null-s as 0

8 participants