Skip to content

SQL OperatorConversions: Introduce.aggregatorBuilder, allow CAST-as-literal.#14249

Merged
gianm merged 3 commits intoapache:masterfrom
gianm:sql-allow-cast-as-literal
Jun 23, 2023
Merged

SQL OperatorConversions: Introduce.aggregatorBuilder, allow CAST-as-literal.#14249
gianm merged 3 commits intoapache:masterfrom
gianm:sql-allow-cast-as-literal

Conversation

@gianm
Copy link
Copy Markdown
Contributor

@gianm gianm commented May 10, 2023

Four main changes:

  1. Provide aggregatorBuilder, a more consistent way of defining the
    SqlAggFunction we need for all of our SQL aggregators. The mechanism
    is analogous to the one we already use for SQL functions
    (OperatorConversions.operatorBuilder).

  2. Allow CASTs of constants to be considered as "literalOperands". This
    fixes an issue where various of our operators are defined with
    OperandTypes.LITERAL as part of their checkers, which doesn't allow
    casts. However, in these cases we generally do want to allow casts.
    The important piece is that the value must be reducible to a constant,
    not that the SQL text is literally a literal.

  3. Update DataSketches SQL aggregators to use the new aggregatorBuilder
    functionality. The main user-visible effect here is [2]: the aggregators
    would now accept, for example, "CAST(0.99 AS DOUBLE)" as a literal
    argument. Other aggregators could be updated in a future patch.

  4. Rename "requiredOperands" to "requiredOperandCount", because the
    old name was confusing. (It rhymes with "literalOperands" but the
    arguments mean different things.)

…iteral.

Four main changes:

1) Provide aggregatorBuilder, a more consistent way of defining the
   SqlAggFunction we need for all of our SQL aggregators. The mechanism
   is analogous to the one we already use for SQL functions
   (OperatorConversions.operatorBuilder).

2) Allow CASTs of constants to be considered as "literalOperands". This
   fixes an issue where various of our operators are defined with
   OperandTypes.LITERAL as part of their checkers, which doesn't allow
   casts. However, in these cases we generally _do_ want to allow casts.
   The important piece is that the value must be reducible to a constant,
   not that the SQL text is literally a literal.

3) Update DataSketches SQL aggregators to use the new aggregatorBuilder
   functionality. The main user-visible effect here is [2]: the aggregators
   would now accept, for example, "CAST(0.99 AS DOUBLE)" as a literal
   argument. Other aggregators could be updated in a future patch.

4) Rename "requiredOperands" to "requiredOperandCount", because the
   old name was confusing. (It rhymes with "literalOperands" but the
   arguments mean different things.)
@gianm gianm changed the title SQL OperatorConversions: Introduce.aggregatorBuilder, allow CAST-as-l… SQL OperatorConversions: Introduce.aggregatorBuilder, allow CAST-as-literal. May 10, 2023
return new DefaultOperandTypeChecker(
operandTypes,
requiredOperands == null ? operandTypes.size() : requiredOperands,
requiredOperandCount == null ? operandTypes.size() : requiredOperandCount,

Check warning

Code scanning / CodeQL

Dereferenced variable may be null

Variable [operandTypes](1) may be null at this access as suggested by [this](2) null guard.
Copy link
Copy Markdown
Member

@clintropolis clintropolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice 🤘

private static final SqlAggFunction FUNCTION_INSTANCE = new HllSketchApproxCountDistinctSqlAggFunction();
private static final SqlAggFunction FUNCTION_INSTANCE =
OperatorConversions.aggregatorBuilder(NAME)
.operandTypes(SqlTypeFamily.ANY, SqlTypeFamily.NUMERIC, SqlTypeFamily.STRING)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a big deal, but does this lose the nice signature that provides the parameter names in the validation error? similar comment on other changes that drop the OperandTypes.sequence.

I personally find the more verbose OperandTypes.or form of using operandTypeChecker to be a bit easier to understand the shape of the function at a glance than trying to piece it together from looking at all of operandTypes, literalOperands and requiredOperandCount, but i guess it isn't a big deal and is also consistent with what is allowed when using the builder to spit out SqlFunction.

Copy link
Copy Markdown
Contributor Author

@gianm gianm Jun 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the latest patch, I added operandNames which are used to generate the signature for validation errors.

About the syntax, yeah, I was wanting to match it to SqlFunction. Feel free to introduce a new syntax in a future patch 🙂. IMO it would make sense to have a builder for DefaultOperandTypeChecker that builds up the operand list one at a time. That would enable an alternate syntax like:

OperatorConversions
  .aggregatorBuilder(NAME)
  .operandTypeChecker(
      OperandConversions
        .checkerBuilder()
        .operand("column", SqlTypeFamily.ANY)
        .operand("lgK", SqlTypeFamily.NUMERIC).optional().literal()
        .operand("tgtHllType", SqlTypeFamily.STRING).optional().literal()        
        .build()
  )
  .returnTypeNonNull(SqlTypeName.BIGINT)
  .functionCategory(SqlFunctionCategory.NUMERIC)
  .build();

Copy link
Copy Markdown
Contributor

@jon-wei jon-wei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gianm gianm merged commit 3d19b74 into apache:master Jun 23, 2023
@gianm gianm deleted the sql-allow-cast-as-literal branch June 23, 2023 23:25
@abhishekagarwal87 abhishekagarwal87 added this to the 27.0 milestone Jul 19, 2023
sergioferragut pushed a commit to sergioferragut/druid that referenced this pull request Jul 21, 2023
…iteral. (apache#14249)

* SQL OperatorConversions: Introduce.aggregatorBuilder, allow CAST-as-literal.

Four main changes:

1) Provide aggregatorBuilder, a more consistent way of defining the
   SqlAggFunction we need for all of our SQL aggregators. The mechanism
   is analogous to the one we already use for SQL functions
   (OperatorConversions.operatorBuilder).

2) Allow CASTs of constants to be considered as "literalOperands". This
   fixes an issue where various of our operators are defined with
   OperandTypes.LITERAL as part of their checkers, which doesn't allow
   casts. However, in these cases we generally _do_ want to allow casts.
   The important piece is that the value must be reducible to a constant,
   not that the SQL text is literally a literal.

3) Update DataSketches SQL aggregators to use the new aggregatorBuilder
   functionality. The main user-visible effect here is [2]: the aggregators
   would now accept, for example, "CAST(0.99 AS DOUBLE)" as a literal
   argument. Other aggregators could be updated in a future patch.

4) Rename "requiredOperands" to "requiredOperandCount", because the
   old name was confusing. (It rhymes with "literalOperands" but the
   arguments mean different things.)

* Adjust method calls.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants