METRON-1563 [DISCUSS] Add Assignment to Stellar Language #687

ottobackwards · 2017-08-06T18:49:00Z

This PR adds support for Assignment to the Stellar Language. It is dependent on METRON-711 before it can be merged.

I still need to do the doc work, and more testing.

Concept

Allow for assignment of values to variables only
Pre and Post increment and decrement for variables
Assignment both returns the assigned value AND updates the variable
VariableResolver implementation may or may not choose to support updates
The Shell handles all assignments ( foo := 'val' ) as ("foo = 'val'"). Exposing the whole operation to Stellar allows catching illegal values and grammar
The LambdaExpression does attempt to update variables, but this would have effect if the state was using a VariableResolver instance that supported updates AND the variable was not a 'lambda' variable.

 String expr  = "MAP([ foo, bar, baz ], (item) -> count *= 2  )";
      Map<String,Object> map = new HashMap<String,Object>(){{
        put("foo",1);
        put("bar",1);
        put("baz",1);
        put("count", 1);
      }};
      for(int i = 0 ; i < 5; i++){
        run(expr, map);
      }
      Assert.assertEquals(new DecimalFormat("#").format(Math.pow(2,15)), map.get("count").toString());

Support for:
= , +=, -=, *=, /=, ++var, --var, var++, var--

Side effects:

This is opt-in for the resolvers. The MapVariableResolver does NOT support update, so there is no change to Enrichment, Parsers, Profile, and Shell ( beyond what has been mentioned ).

Testing:

See BasicStellarTest:test*Assign() for some usages.

Existing profiles, enrichments, parser transforms would work
Run different statements in the shell ( including those that would have worked before like = := 6.
more to come

NOTE

this is like the rebound relationship, the PR that leads to the PR that actually lands
this issues around the variable resolvers and such will dictate a different approach

This PR is to get the concept as a proposal out for review and start the discussion. I will start a
discuss thread as well.

Note: This PR resolves METRON-693 : Stellar REPL lets reserved tokens be used as unusable variable names as well.

Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.
Please refer to our Development Guidelines for the complete guide to follow for contributions.
Please refer also to our Build Verification Guidelines for complete smoke testing guides.

In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

For all changes:

Is there a JIRA ticket associated with this PR? If not one needs to be created at Metron Jira.
Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically master)?

and METRON-711

For code changes:

Have you included steps to reproduce the behavior or problem that is being changed or addressed?
Have you included steps or a guide to how the change may be verified and tested manually?
Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
```
mvn -q clean integration-test install && build_utils/verify_licenses.sh 
```
Have you written or updated unit tests and or integration tests to verify your changes?
If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

For documentation related changes:

Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via site-book/target/site/index.html:
```
cd site-book
mvn site
```

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
It is also recommended that travis-ci is set up for your personal repository such that your branches are built there before submitting a pull request.

ottobackwards · 2017-08-07T16:25:59Z

Added pre and post increment and decrement

JonZeolla · 2017-10-23T14:09:39Z

Is this ready for review? If so, can you deconflict?

ottobackwards · 2017-10-24T19:33:08Z

should be set @JonZeolla

ottobackwards · 2017-10-24T19:34:55Z

It is fair to say that this PR is more for conceptual review than anything else.

ottobackwards · 2017-10-24T19:35:05Z

@cestella

ottobackwards · 2018-01-06T14:29:08Z

Reviewer Note: Would a default method on the interface be better for update()?

cestella · 2018-01-18T16:25:05Z

Hmm, so this is an interesting one. We sorta have a de facto assignment outside of stellar, but I'd be interested in having it moved into stellar like this. I'd also be sorta ok with it being =, but we chose := earlier, so that is an open question as to whether it should be consistent.

Also, it seems like processing a stellar statement should return 2 things now:

the output value
optionally a variable assigned

We should also refactor the StellarAssignment class to use this somehow. Other than that, this seems extremely promising!

cestella · 2018-01-18T16:29:23Z

I would like to mention that I'm actually much more open minded to moving to = than my text may seem. I just want the conversation, of course :)

ottobackwards · 2018-01-18T16:35:14Z

I just removed the conflicts and changed StellarAssignment to account for this

ottobackwards · 2018-01-18T16:36:21Z

The question for me @cestella is the resolvers, and the consequences of having them support updating etc. I left it so that they don't by default because I didn't understand all the consequences and thought we could think it through

cestella · 2018-01-18T16:37:38Z

Yeah, I'm ok with them not updating at the moment (though I think that's where we may want to go). I think a stop-gap is to just be able to pass back the variable assigned and the value from a stellar statement. We manage updating the resolvers out of band in the existing code.

ottobackwards · 2018-01-18T17:15:57Z

@cestella also: := is in the shell not the language. The mechanics of what is LANG and what is MACHINE is part of the issue.

cestella · 2018-01-18T17:20:09Z

well, := is in the shell and also in the enrichment configuration (see here. Seems like if we're promoting this to the language, we should make things consistent in one way or the other.

ottobackwards · 2018-01-18T17:24:46Z

POV issue I think. Stellar has the LANGUAGE ( grammar ), the default implementation, and the 'host' execution mechanics.

:= is part of the host execution mechanics.

The issue may not be being the same as much as the necessity for := as a host construct at all if things are done inside stellar.

Does that make sense?

cestella · 2018-01-18T17:28:20Z

Agreed, this is a POV issue. Though from the user perspective they aren't distinguishing between "this is in the REPL" vs "this is in the enrichment topology", so I think we do need to manage the experience if we're moving functionality from the MACHINE to the LANGUAGE to ensure it's consistent.

Beyond that, it's unclear whether we need an resolver update function as part of THIS PR, but I think we very much will need it if we decide to make stellar multi-expression. I suspect we should punt on it until that effort.

ottobackwards · 2018-01-18T17:41:03Z

By punt you mean leave this implementation that doesn't implement update in the resolvers?

ottobackwards · 2018-01-18T18:20:11Z

also, we aren't consistent, in some places ( transformations ) we use the configuration logical structure to imply assignment, in enrichment it is :=, but those are configuration mechanisms not stellar mechanisms

nickwallen · 2018-01-18T19:23:07Z

I have not looked at the code yet, but wanted to comment on the assignment discussion. IMO focusing on user experience is the way to go. To that end...

We should use = because that is just more obvious.
We should remove := from the REPL environment (very easy, just delete AssignmentCommand).
All references in the docs to := need to be changed to =. (Ugh)

All of that can happen in this PR or as multiple PRs (however @ottobackwards sees fit), but that is the right end state.

Alternative Option: The only other reasonable option IMO is to support both = and := (for some undetermined period of time.) The only snafu is that both should have the same implementation. So support for := would have to be added to the 'language' and we should NOT rely on the AssignmentCommand implementation in the REPL.

I am mildly against the "Alternative Option", but wanted to offer it up for discussion.

cestella · 2018-01-18T19:31:30Z

Yeah, I'm totally cool with going with =, but I will point out one snag. What about users with existing stellar enrichments using :=, would we provide a migration path beyond saying, "Hey, move your stuff over" in the Upgrading docs?

Migration, sadly sometimes, is also part of user experience. :)

nickwallen · 2018-01-18T19:35:51Z

@cestella Migration, sadly sometimes, is also part of user experience. :)

Very true. That's why I offered up the "Alternative Option".

Only my own laziness makes me "mildly against" that :) . But based on user experience (like you mention), I think we should probably do it.

cestella · 2018-01-18T19:40:13Z

hah, well @nickwallen, laziness is often a good trait in a developer. Honestly, I am ok with not migrating here in favor of more intuitive approaches. I chose := because I needed the functionality and worried about making the language more complex (there's something different about a language that has assignments versus an expression language. I wanted to keep it an expression language at the time.), so I was forced to choose a character sequence that didn't exist as a lexable token. Now, honestly, we're already halfway down that road, so probably the right thing to do is deprecate it and rip the bandaid fast. I honestly could be convinced either way.

nickwallen · 2018-01-18T19:47:31Z

@cestella: What about users with existing stellar enrichments using := ...

Oh, and one clarification. No users will have production code (like an enrichment) that uses :=. That has only ever been available in the REPL. So that should not be a concern, right? The only concern would be documented uses of the REPL itself; like READMEs.

ottobackwards · 2018-01-18T19:47:45Z

@cestella @nickwallen
I have changed stellar assignment to cover the := already. I think it will continue to work.
89f682f

cestella · 2018-01-18T19:50:38Z

@nickwallen No, that's not correct. Stellar enrichments can contain :=. Consider the enrichments here in one of our use-cases. Enrichments can either take a map of enrichments or a list of strings. If you pass a list of strings (which you would need to do to create temporary variables that you then remove before passing through, like in the example), you'd use an assignment, which is currently :=.

mattf-apache · 2018-01-18T19:52:02Z

As a professed language nut, when you mention the word "variable", I think "scope". And it seems to me that by placing the responsibility for variable assignment/update in the VariableResolver, you've actually tied into a very natural definition of scope given the way MapVariableResolver already works in association with a target dictionary or stack of dictionaries.

A. I think you should explicitly discuss scope in the documentation. There are several scopes already being used in practice, and I think users need to understand how it all works together, otherwise they will misunderstand and be bewildered. I am, no doubt, a good example, so dense that I have to ask the questions below :-)

B. I don't understand why you don't go ahead and enable update in MapVariableResolver, since IIRC that's the resolver for essentially everything interesting that uses variables in Stellar.

C. Do I understand correctly from the patch, that variable assignment IS enabled in lambda expressions? Presumably the scope will be the same instance dictionary where the input variables are stored, thus making them local, correct? Can new variables be created as well as input variables be updated?

D. The following scopes come to mind, with associated need for clarification:

Lambdas save their input variables in their ephemeral instance context, initialized by call syntax.
- Comment: questions already asked above
The REPL used one global dict for scope of its variables, defined by ':='.
- Comment: If I understand correctly, when in the REPL, '=' and ':=' assignments will go to the same global context dictionary, is that correct?
Profilers, transformations, and several other xml-defined Stellar-using features, have a persistent instance context, with variables initialized and updated via an xml syntax.
- Comment: Apparently here we've specifically exclude assignment, even though this seems to be where it's needed most. Why is that? The instance context seems ready-made for local variables.

mattf-apache · 2018-01-18T19:58:19Z

So a bunch more comments came in while I was writing the above. Nick's comment about enrichments using assignment already of course modifies my comment about Profilers, etc. Just making it uniform would give users a lot less to wonder about.

As for := vs =, it's not so terrible to have both as exact synonyms. This gives you both your preferred aesthetics and backward compatibility.

nickwallen · 2018-01-18T20:06:08Z

@cestella: No, that's not correct. Stellar enrichments can contain :=. Consider the enrichments here in one of our use-cases. Enrichments can either take a map of enrichments

Thanks for clarifying, but 'ugh'. I'll have to look at how that is implemented. I didn't realize we had a separate implementation of assignment. Does this PR address that? Shouldn't we have one implementation of the assignment mechanism?

Good commentary from @mattf-horton . I knew there was a reason why I didn't want to touch this. :)

ottobackwards · 2018-01-18T20:08:08Z

OK,

The others can comment, but I think proper scope would be implemented within the execution of the call stack, not arbitrarily by the resolvers. Each resolver should be a 'scope' and new scopes should be created.

A. Because it is a resolver thing, where would you document it in a way that makes sense? I have been thinking that we should refactor to explicitly support scoping in the processor/evaluations and then have the resolvers used in scope as scope implementations. So the nesting of scopes would not be done in the resolver.

B. As noted in the PR description, I was hoping for discussion ( THIS discussion as a matter of fact ). Having had some issues with my other PRs and scope, I chose to explicitly NOT go that far until we talked it through. I did not feel all the way back then that I had enough of a grasp of the different resolvers to change their behavior and understand the consequences either way.

C. The LambaExpression actually delegates to the state variable resolver:

  VariableResolver variableResolver = new DefaultVariableResolver(
        variable -> lambdaVariables.getOrDefault(variable
            , state.variableResolver.resolve(variable)
        ), variable -> true,
        (variable, value) -> {
          if (state.variableResolver.exists(variable)) {
            state.variableResolver.update(variable, value);
          }
        });

D.

I think I answered above.
I think so, maybe @cestella or @nickwallen can answer that
Again, I stopped short until I got some kind of feedback.
I have changed StellarAssignment, I don't think we need both = and :=, and if there are := floating around they will work, unless we remove StellarAssignment

mattf-apache · 2018-01-18T20:59:12Z

@ottobackwards And yet this discussion primarily relates to whether the assignment operation as defined in Enrichment should be unified with the assignment operation being added to core stellar, presumably due to user demand.

cestella · 2018-01-18T21:01:52Z

So, I was wrong when I initially looked at this PR, this adds variable updates to the resolver. Should we perhaps separate parsing assignments (e.g. move StellarAssignment.from into the language like you've done here) from actually updating the variables. I think I'd like to see Enrichments and the REPL refactored to not hand-code the variable update logic when we get around to updating the resolver to allow updates. Thoughts?

ottobackwards · 2018-01-18T21:08:49Z

@mattf-horton if by 'due to user demand' you mean to work around the fact that assignment was not in the language in the first place, ok.

ottobackwards · 2018-04-03T17:00:24Z

Bump

nickwallen · 2018-04-04T17:21:50Z

@cestella Should we perhaps separate parsing assignments (e.g. move StellarAssignment.from into the language like you've done here) from actually updating the variables.

@ottobackwards I didn't see you respond to this. I would agree with @cestella on this.

ottobackwards · 2018-04-04T18:22:05Z

@nickwallen @cestella @mattf-horton , Sorry, I thought that question was more to @mattf-horton at the time.

@cestella I'm not sure I understand what you are saying. Are you saying that we should not do assignment until we overhaul the current mechanisms? That this is too much a 1/2 measure?

ottobackwards · 2018-04-10T13:59:07Z

This is pretty old at this point, and there has been a lot of discussion all over the place about it. I'm not sure I can sum up where we are. @cestella can you sum up your current feeling on how we can move this forward?

ottobackwards · 2018-05-07T12:50:24Z

Bump @cestella

cestella · 2018-05-09T15:58:18Z

Sorry, @ottobackwards this got lost in the shuffle. What I was saying was that I think we should not adjust the VariableResolver infrastructure to update variables as the result of assignments, but rather return back the variable and the updated object. For instance, in the BaseStellarProcessor class we have a parse method that returns the output of the expression. I'd suggest we have a parseExpression and parseAssignment where the first acts as it does now and the second returns the variable name and the result. As a follow-on, I think we can adjust the variableresolver infrastructure to support variable updates.

I guess ultimately, I suggest a tactical solution that leads us to the right direction rather than trying to fix it all at once.

Thoughts?

ottobackwards · 2018-05-09T16:43:04Z

I don't quite get what you are saying. I can't wrap my head around it. If you look at the example from the PR description under concept, that doesn't work without the resolver being updated.

I have to be missing something

cestella · 2018-05-09T18:40:50Z

So, I guess I'm suggesting that this is like..3/4 of the way to a complete solution, which is for stellar to handle variable assignments completely.

Specifically, I'd say that our dominantly used variable resolver (MapVariableResolver) doesn't support updates, so nothing actually happens when you update outside of the DefaultVariableResolver, so assignments don't actually function yet and an error isn't thrown.
So, if someone reads the updated docs and uses ++ in the profiler or enrichments, it won't actually work and it'll not be apparent why.

What I'm suggesting is one of these:

refactoring the MapVariableResolver to support updates and refactoring the places where we use assignments to NOT handle variable assignment outside of stellar.
Making a feature branch for stellar assignments and I'd +1 this into that branch and we can do the final PR in that branch
pull back the scope of this PR and just have assignments supported in the parser without the unary operators. This is what I was getting here in my comment before:

For instance, in the BaseStellarProcessor class we have a parse method that returns the output of the expression. I'd suggest we have a parseExpression and parseAssignment where the first acts as it does now and the second returns the variable name and the result.

Ultimately, I LOVE this PR and I don't want you to think that I'm poo-pooing it. I think this is great work and I look forward to the feature being added. It just seemed ALMOST done, but not quite enough to stand by itself in master because the gap may leave people confused.

ottobackwards · 2018-05-09T18:57:25Z

Ok, so we are on the same page. I purposely did not put the update in the map resolver because I knew it warranted more discussion etc, and this is it. I just wasn't sure I was understanding what you were saying exactly, and honestly, every time i go back to read from the start @mattf-horton 's comments make me mush minded ;)

Now we have to pick one of the options above to continue with.

cestella · 2018-05-09T19:01:45Z

I think I have a minor preference for 2, honestly. I like this work and don't want it to languish or grow too big. Entirely as an aside, we should probably have a set of microbenchmark tests for stellar to ensure no performance regressions given changes to the grammar.

ottobackwards · 2018-05-14T19:25:33Z

I have renamed in preparation for the Feature branch based on the original jira ( create a new subtask to land this on feature and change to that ID )

=, +=, -=, *=, /= This are valid for assignment to variables. All assignments return the value assigned, as other scripting languages do. VariableResolver extended with an update() method, though it may not be supported by all resolvers, this gives the ability to update variables and save between calls. The shell was updated for this, such that assignment calls More testing need and doc but I wanted to get it out there. This is dependent on PR 686

fixes for handling validate with null values

…foo++, foo-- )

ottobackwards · 2018-05-15T17:21:33Z

Ok everyone, this has moved to the new PR. Thanks!

ottobackwards · 2018-10-26T14:00:49Z

btw, if you haven't checked the new fb and pr, the initial pr has landed, and has added support for := to go along with =

ottobackwards changed the title ~~METRON-1090 [NO MERGE UNTIL METRON-711] Add Assignment to Stellar Language~~ METRON-1090 Add Assignment to Stellar Language Aug 17, 2017

nickwallen mentioned this pull request Jan 3, 2018

METRON-1382 Run Stellar in a Zeppelin Notebook #884

Closed

9 tasks

ottobackwards changed the title ~~METRON-1090 Add Assignment to Stellar Language~~ METRON-1090 [DISCUSS] Add Assignment to Stellar Language May 9, 2018

ottobackwards changed the title ~~METRON-1090 [DISCUSS] Add Assignment to Stellar Language~~ METRON-1563 [DISCUSS] Add Assignment to Stellar Language May 14, 2018

ottobackwards added 9 commits May 14, 2018 16:02

add reserved symbols to readme

a453605

tests for assigning var1 = var2

b3314f4

fixes for handling validate with null values

fix spelling

f44b6c2

add support for pre and post increment and decrement ( ++foo, --foo, …

2ac2ef5

…foo++, foo-- )

added readme line for pre-post

1ab91d5

fix variable used

f18748a

account of variables on assignment now being seen by stellar

9689487

conflict resolution

2e94e14

ottobackwards force-pushed the stellar_assign branch from 375df1d to 2e94e14 Compare May 14, 2018 21:10

ottobackwards closed this May 14, 2018

ottobackwards mentioned this pull request May 14, 2018

METRON-1563 : Base Stellar assign for feature branch #1014

Closed

METRON-1563 [DISCUSS] Add Assignment to Stellar Language #687

METRON-1563 [DISCUSS] Add Assignment to Stellar Language #687

Uh oh!

Conversation

ottobackwards commented Aug 6, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Concept

Side effects:

Testing:

NOTE

Pull Request Checklist

For all changes:

For code changes:

For documentation related changes:

Note:

Uh oh!

ottobackwards commented Aug 7, 2017

Uh oh!

JonZeolla commented Oct 23, 2017

Uh oh!

ottobackwards commented Oct 24, 2017

Uh oh!

ottobackwards commented Oct 24, 2017

Uh oh!

ottobackwards commented Oct 24, 2017

Uh oh!

ottobackwards commented Jan 6, 2018

Uh oh!

cestella commented Jan 18, 2018

Uh oh!

cestella commented Jan 18, 2018

Uh oh!

ottobackwards commented Jan 18, 2018

Uh oh!

ottobackwards commented Jan 18, 2018

Uh oh!

cestella commented Jan 18, 2018

Uh oh!

ottobackwards commented Jan 18, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cestella commented Jan 18, 2018

Uh oh!

ottobackwards commented Jan 18, 2018

Uh oh!

cestella commented Jan 18, 2018

Uh oh!

ottobackwards commented Jan 18, 2018

Uh oh!

ottobackwards commented Jan 18, 2018

Uh oh!

nickwallen commented Jan 18, 2018

Uh oh!

cestella commented Jan 18, 2018

Uh oh!

nickwallen commented Jan 18, 2018

Uh oh!

cestella commented Jan 18, 2018

Uh oh!

nickwallen commented Jan 18, 2018

Uh oh!

ottobackwards commented Jan 18, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cestella commented Jan 18, 2018

Uh oh!

mattf-apache commented Jan 18, 2018

Uh oh!

mattf-apache commented Jan 18, 2018

Uh oh!

nickwallen commented Jan 18, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ottobackwards commented Jan 18, 2018

Uh oh!

mattf-apache commented Jan 18, 2018

Uh oh!

cestella commented Jan 18, 2018

Uh oh!

ottobackwards commented Jan 18, 2018

ottobackwards commented Aug 6, 2017 •

edited

Loading

ottobackwards commented Jan 18, 2018 •

edited

Loading

ottobackwards commented Jan 18, 2018 •

edited

Loading

nickwallen commented Jan 18, 2018 •

edited

Loading