Skip to content

multi-value string expression transformation fix#8019

Merged
fjy merged 8 commits intoapache:masterfrom
clintropolis:expr-multi-val-selector-fix
Jul 4, 2019
Merged

multi-value string expression transformation fix#8019
fjy merged 8 commits intoapache:masterfrom
clintropolis:expr-multi-val-selector-fix

Conversation

@clintropolis
Copy link
Copy Markdown
Member

@clintropolis clintropolis commented Jul 3, 2019

Description

This PR fixes an issue with the automatic translation of expressions for multi-value string column inputs, where if an identifier was duplicated, the expression would not transform correctly.

This has been resolved by modifying expression parsing to uniquely identify all non lambda bound IdentifierExpr, at parsing transformation time in ExprListenerImpl. LambdaExpr identifiers are marked on entering a lambda expression, and all non-lambda identifiers are given a unique identifier to accompany the identifier to use to get the value from the Expr.ObjectBinding.

Expr.BindingDetails has been reworked to accommodate this change, as well as simplified. Further refactoring took place in Parser to handle the transformation required to create LambdaExpr binding IdentifierExpr to ensure the correct transformation.

Example:
before patch:

concat(x, '-lol-', x)

where x is a multi-value string dimension would result in a transformation to something like

map((x) -> concat(x, '-lol-', x), x)

after patch:
this will correctly map to something like

cartesian_map((x, x_0) -> concat(x, '-lol-', x_0), x, x)

This PR has:

  • been self-reviewed.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths.

this.rootNodeKey = rootNodeKey;
this.macroTable = macroTable;
this.nodes = new HashMap<>();
this.lambadIdentifiers = new HashSet<>();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lambad -> lambda

}
List<String> unapplied = toApply.stream()
.filter(x -> bindingDetails.getFreeVariables().contains(x))
.filter(x -> bindingDetails.getRequiredColumnsList().contains(x))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest using getRequiredColumns here, since contains should be cheaper on the Set vs ArrayList if there are lots of columns (and skip the ArrayList creation too)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, getRequiredColumns originally returned a list, but I changed it to getRequiredColumnsList and added getRequiredColumns to return a set for this reason but forgot to make sure I updated all the places.

}

/**
* Set of identifiers which are used with scalar operators and functions
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest adding more docs here about how these differ from the get*Variables() methods

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated javadoc to specify if returning IdentifierExpr.identifier or IdentifierExpr.bindingIdentifier

Copy link
Copy Markdown
Contributor

@jon-wei jon-wei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 after CI

@fjy fjy merged commit e6ba258 into apache:master Jul 4, 2019
@clintropolis clintropolis deleted the expr-multi-val-selector-fix branch July 4, 2019 06:11
gianm pushed a commit to implydata/druid-public that referenced this pull request Jul 4, 2019
* multi-value string expression transformation fix

* fixes

* more docs and test

* revert unintended doc change

* formatting

* change tostring to print binding identifier

* review fixup

* oops
@clintropolis clintropolis added this to the 0.16.0 milestone Aug 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants