Skip to content

Support complex variance object inputs for variance SQL agg function#14463

Merged
jon-wei merged 4 commits intoapache:masterfrom
jon-wei:variance_sql_fix
Jun 28, 2023
Merged

Support complex variance object inputs for variance SQL agg function#14463
jon-wei merged 4 commits intoapache:masterfrom
jon-wei:variance_sql_fix

Conversation

@jon-wei
Copy link
Copy Markdown
Contributor

@jon-wei jon-wei commented Jun 21, 2023

This PR fixes an issue with the VARIANCE and related SQL aggregation functions, allowing them to accept variance aggregator objects as inputs (currently the SQL function only accepts numeric operands).

The new SqlAggFunction definitions are the same as SqlAvgAggFunction from SqlStdOperatorTable but with the operand type broadened to include variance aggregator objects in addition to numerics.

Note: This pulls in the RowSignatures.complexTypeChecker changes from #14195 for tighter operand type checking

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@jon-wei jon-wei force-pushed the variance_sql_fix branch from c772a77 to 7b390e8 Compare June 22, 2023 21:05
Copy link
Copy Markdown
Contributor

@abhishekagarwal87 abhishekagarwal87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments. but lgtm otherwise

);
}

private static class VarPopSqlAggFunction extends SqlAggFunction
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a note at the top that these various aggregate functions are same as SqlAvgAggFunction except that these versions pass OperandTypes.Any while standard function in calcite pass OperandTypes.Numeric

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this to use aggregatorBuilder and added a comment about this

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a lot of boilerplate code here can be removed if you add a class called DruidSqlAvgAggFunction and in that class, the constructor passes the Any operand type. Then all the variance and std dev aggregation functions can extend that class and simply pass the kind value

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the approach of #14249 which is adding the SqlAggFunction equivalent of OperatorConversions.OperatorBuilder is probably the better way to do this

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's replacing the constructor with fluent API but you will still need to reuse stuff.

Copy link
Copy Markdown
Member

@clintropolis clintropolis Jun 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I guess i was thinking with that pattern can just have a function that makes SqlAggFunction based on the parts that change instead of a bunch of separate classes. The other PR is removing many of the custom SqlAggFunction to replace with the builder so making a more complicated setup here with base class seems like it would conflict even more with the goals there

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this have a function that makes SqlAggFunction and uses the aggregator builder

SqlKind.VAR_POP,
ReturnTypes.AVG_AGG_FUNCTION,
null,
OperandTypes.ANY, // Can be more specific after https://github.com/apache/druid/pull/14195 is merged
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this mean instead of OperandTypes.ANY you would use OperandTypes.or to check for either numeric or RowSignatures.complexTypeChecker with the appropriate native complex type? Any reason not to pull that change into this PR? it seems generally useful for validating any complex input types

Copy link
Copy Markdown
Member

@clintropolis clintropolis Jun 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i should have read the PR description, you covered this 😛

I still think it would be nicer to pull in those changes since this PR seems to be easier to merge and i think lots of other things could take advantage of this helper method existing, but I guess this isn't a blocker

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pulled in the changes for RowSignatures.complexTypeChecker

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the approach of #14249 which is adding the SqlAggFunction equivalent of OperatorConversions.OperatorBuilder is probably the better way to do this

@jon-wei jon-wei force-pushed the variance_sql_fix branch from 7b390e8 to 79417b6 Compare June 26, 2023 23:35
Copy link
Copy Markdown
Member

@clintropolis clintropolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

}

/**
* Creates a SqlAggFunction that is the same as SqlAvgAggFunction but with an operand type that accepts
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: javadoc links

@jon-wei jon-wei merged commit c36f12f into apache:master Jun 28, 2023
@abhishekagarwal87 abhishekagarwal87 added this to the 27.0 milestone Jul 19, 2023
sergioferragut pushed a commit to sergioferragut/druid that referenced this pull request Jul 21, 2023
…pache#14463)

* Support complex variance object inputs for variance SQL agg function

* Add test

* Include complexTypeChecker, address PR comments

* Checkstyle, javadoc link
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants