Skip to content

add substituteCombiningFactory implementations for datasketches aggs#17314

Merged
kfaraz merged 3 commits intoapache:masterfrom
clintropolis:projection-complex-aggs
Oct 10, 2024
Merged

add substituteCombiningFactory implementations for datasketches aggs#17314
kfaraz merged 3 commits intoapache:masterfrom
clintropolis:projection-complex-aggs

Conversation

@clintropolis
Copy link
Copy Markdown
Member

Follow up to #17214, adds implementations for substituteCombiningFactory so that more datasketches aggs can match projections, along with some projections tests for datasketches.

public final CloserRule closer = new CloserRule(false);

public DatasketchesProjectionTest(
String name,

Check notice

Code scanning / CodeQL

Useless parameter

The parameter 'name' is never used.
that.fieldName
)) {
if (lgK <= that.lgK &&
tgtHllType.ordinal() <= that.tgtHllType.ordinal() &&
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the TgtHllType needs to be part of the check here, since according to the datasketches docs:

These three target types are isomorphic representations of the same underlying HLL algorithm. Thus, given the same value of lgConfigK and the same input, all three HLL target types will produce identical estimates and have identical error distributions.

& I think that the aggregator is able to merge two different HLL types together.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah good catch 👍

}

@Override
public AggregatorFactory substituteCombiningFactory(AggregatorFactory preAggregated)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does SketchMergeAggregatorFactory need to override this with an impl that checks isInputThetaSketch?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, i suppose it wouldn't hurt, though i can't imagine the field would be the same name and isInputThetaSketch be true in query agg but not in projection agg, or the other way, can add just in case since i need to make some other changes


ArrayOfDoublesSketchAggregatorFactory that = (ArrayOfDoublesSketchAggregatorFactory) preAggregated;
if (nominalEntries <= that.nominalEntries &&
numberOfValues <= that.numberOfValues &&
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think numberOfValues needs to match. There's some code in ArrayOfDoublesUnion like:

    if (gadget_.getNumValues() != tupleSketch.getNumValues()) {
      throw new SketchesArgumentException("Incompatible sketches: number of values mismatch "
          + gadget_.getNumValues() + " and " + tupleSketch.getNumValues());
    }

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yeah, good catch

@kfaraz kfaraz merged commit a6236c3 into apache:master Oct 10, 2024
@kfaraz kfaraz added this to the 31.0.0 milestone Oct 10, 2024
kfaraz pushed a commit to kfaraz/druid that referenced this pull request Oct 10, 2024
…pache#17314)

Follow up to apache#17214, adds implementations for substituteCombiningFactory so that more
datasketches aggs can match projections, along with some projections tests for datasketches.
kfaraz added a commit that referenced this pull request Oct 10, 2024
…17314) (#17323)

Follow up to #17214, adds implementations for substituteCombiningFactory so that more
datasketches aggs can match projections, along with some projections tests for datasketches.

Co-authored-by: Clint Wylie <cwylie@apache.org>
@clintropolis clintropolis deleted the projection-complex-aggs branch October 10, 2024 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants