Skip to content

string -> expression -> string -> expression#9367

Merged
jon-wei merged 11 commits intoapache:masterfrom
clintropolis:reversible-expressions
Feb 21, 2020
Merged

string -> expression -> string -> expression#9367
jon-wei merged 11 commits intoapache:masterfrom
clintropolis:reversible-expressions

Conversation

@clintropolis
Copy link
Copy Markdown
Member

@clintropolis clintropolis commented Feb 15, 2020

Description

This PR adds a new method to the Expr interface, Expr.stringify(). which produces parseable expression strings so that any Expr tree can be converted back into a String which can later be parsed into an equivalent expression.

Prior to this PR, not all Expr which could exist at evaluation time were actually parseable, specifically empty numeric arrays and arrays with null elements. To make all Expr able to satisfy the stringify contract, the grammar has been updated to support these constructs. Empty arrays may now be defined like so: <STRING>[], <DOUBLE>[], <LONG>[], and arrays like [null, 1, 2] are now able to be correctly parsed from an expression string. The explicit typing syntax for arrays extends beyond empty arrays, so things like <LONG>[1, 2, 3] are also valid (and equivalent to [1, 2, 3]).

For testing, I modified the majority of expression unit tests assertions to also ensure that parsing the stringified form of a parsed expression produces the same results with and without flattening, so that nearly every expression that is tested also tests round tripping back to string.

Tagged with design review to discuss syntax for the empty arrays.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added unit tests or modified existing tests to cover new code paths.
  • added integration tests.
  • been tested in a test Druid cluster.

Key changed/added classes in this PR
  • Expr.java
  • Expr.g4
  • ExprListenerImpl.java

…r support for null values in arrays, and parser support for empty numeric arrays
@Override
public String stringify()
{
return value == null ? NULL_LITERAL : StringUtils.format("'%s'", StringEscapeUtils.escapeJavaScript(value));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment about why escapeJavaScript is used (or escapeJava for IdentifierExpr )?

Double[] values = new Double[ctx.doubleElement().size()];
for (int i = 0; i < values.length; i++) {
values[i] = Double.parseDouble(ctx.DOUBLE(i).getText());
if (ctx.doubleElement(i).getText().equalsIgnoreCase("null")) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and similar blocks could use the NULL_LITERAL constant

Copy link
Copy Markdown
Contributor

@jon-wei jon-wei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, the new syntax looks fine to me

return StringUtils.format(
"%s(%s)",
name,
Joiner.on(", ").join(args.stream().map(Expr::stringify).iterator())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should it use Expr.ARG_JOINER?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

assertExpr("fold((a, acc) -> concat(a, acc), a, '')", "foobarbazbarfoo");
assertExpr("fold((a, acc) -> array_append(acc, a), a, [])", new String[]{"foo", "bar", "baz", "foobar"});
assertExpr("fold((a, acc) -> array_append(acc, a), b, cast([], 'LONG_ARRAY'))", new Long[]{1L, 2L, 3L, 4L, 5L});
assertExpr("fold((a, acc) -> array_append(acc, a), b, <LONG>[])", new Long[]{1L, 2L, 3L, 4L, 5L});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

if (ctx.longArrayBody().longElement(i).getText().equalsIgnoreCase(Expr.NULL_LITERAL)) {
values[i] = null;
} else {
values[i] = Long.parseLong(ctx.longArrayBody().longElement(i).getText());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if the array elements should be casted if they are not longs.

Copy link
Copy Markdown
Member Author

@clintropolis clintropolis Feb 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it seems more druidish to do that, I've modified explicitly typed numeric arrays to use a more permissive parsing with a couple of methods I've added to Numbers utility class, so <LONG>[1, 1.2, 3] and <DOUBLE>[1, 2, 3] will coerce the elements as new Long[]{1L, 1L, 3L} and new Double[]{1.0, 2.0, 3.0} respectively, as one might hope. However, I am not converting string literals, so things like <LONG>[1, 2, '3'] are still not valid at this time.

I also made explicit string arrays more permissive, and can accept any type of literal as an element and will convert them all to strings, <STRING>['hello', 1, 1.2] -> new String[]{"hello", "1", "1.2"}.

String and long implicitly typed arrays are unchanged, but I did modify the parser to allow double implicitly typed arrays to also accept longs.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, sounds good.

@lgtm-com
Copy link
Copy Markdown

lgtm-com Bot commented Feb 21, 2020

This pull request introduces 1 alert when merging 7efa7d6 into 30c24df - view on LGTM.com

new alerts:

  • 1 for Unused format argument

Copy link
Copy Markdown
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@jon-wei jon-wei merged commit 6d8dd5e into apache:master Feb 21, 2020
@clintropolis clintropolis deleted the reversible-expressions branch February 21, 2020 23:48
@jihoonson jihoonson added this to the 0.18.0 milestone Mar 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants