METRON-1339 Stellar Shell functionality to verify stored stellar statements #856
ottobackwards wants to merge 22 commits into apache:master
This will allow users to check their deployed statements, say after upgrade, when they are at rest ( and would fail on use ). In other words, they were valid when stored, but are not now because of stellar changes, such as new keywords. The interface StellarConfiguredStatementReporter, which is @IndexSubclasses marked, allows the shell to discover reporters that can provide statements for validation. This discovery allows de-coupling of stellar and 'hosts' that know about the location of the stored statements, and the configuration structure details. We do mention the configurations in the shell output at this time. metron-common implements this interface, and can run through visiting all the configurations.
|
Any chance we can add a |
|
Also, it might be useful for |
|
@cestella I would say that proposed validate function has to be very much in a namespace. It feels like a name that would be much more useful for a function replacing our current approach to global validation in the future than config validation, other than that it sounds like a good idea. |
|
So, the scenario here is checking things that were valid when uploaded, but have been invalidated by external changes ( language changes ). I would like to keep the magic specific. I think the functionality for the management functions is valid, but can we do that as a separate Jira/PR? I'll do it, I just want to keep this tight. If you create the jira and assign it to me that would be super. I would do the files on disk using the management functions as well. So we just need to think of the stellar interface for calling Does that make sense? |
|
@simonellistonball, yes, the namespace should be part of the jira and interface design |
| } | ||
| List<String> names = Arrays | ||
| .asList(getName(), ENRICHMENT.toString(), topicName, "THREAT_TRIAGE", name); | ||
| visitor.visit(names, r.getRule()); |
There was a problem hiding this comment.
I like what you are doing with this PR.
There is just one downside that bugs me. How do we ensure that as our configuration evolves over time that we also update this to ensure it gets validated?
We have Stellar expressions as configurations all over the place; parsing, enrichment, triage, indexing, and profiler. I feel like any/all of these will evolve and change over time. I would definitely forget to come here and update this to make sure it gets validated.
As a motivating example, right here you are only validating the rule field. But triage also has the reason field that is a Stellar expression. That would need to be validated also. That was something we added later and likely the scenario where I would forget to add that expression to your validation logic.
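To make the gap concrete, here is a minimal sketch of an explicit visit that covers both the rule and the reason. `StatementVisitorSketch`, `RiskLevelRuleSketch`, and `TriageVisitSketch` are hypothetical stand-ins, not the classes in this PR, and the path layout only loosely follows the hunk above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical callback: receives the path to a statement plus the statement itself.
interface StatementVisitorSketch {
  void visit(List<String> path, String statement);
}

// Hypothetical, trimmed-down triage rule holding both Stellar-bearing fields.
class RiskLevelRuleSketch {
  final String name;
  final String rule;
  final String reason; // optional, may be null

  RiskLevelRuleSketch(String name, String rule, String reason) {
    this.name = name;
    this.rule = rule;
    this.reason = reason;
  }
}

class TriageVisitSketch {
  // Visits every Stellar expression on the rule, not just the 'rule' field.
  static void visitRule(String topicName, RiskLevelRuleSketch r, StatementVisitorSketch visitor) {
    visitor.visit(Arrays.asList("ENRICHMENT", topicName, "THREAT_TRIAGE", r.name, "rule"), r.rule);
    if (r.reason != null) {
      visitor.visit(Arrays.asList("ENRICHMENT", topicName, "THREAT_TRIAGE", r.name, "reason"), r.reason);
    }
  }
}
```

Every new Stellar-bearing field needs another hand-written `visit` call like these, which is exactly the maintenance burden being described.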
There was a problem hiding this comment.
And let me follow-on with one proposal that might help address this. There are probably other (better?) ways to solve this, but here is one approach to chew on.
All of our configuration gets deserialized from JSON into POJOs; EnrichmentConfig, ProfilerConfig, etc. What if we had an annotation that we marked the fields that are required to be valid Stellar expressions? The annotation would go in those 'config' POJO classes. Based on the annotation we can then validate each of those fields.
Since the annotation is directly in the configuration classes, it is less likely I'm going to forget that annotation. And it also remains decoupled, which is a good benefit of your current approach.
There was a problem hiding this comment.
I need to think about this, but I will add the reason check
There was a problem hiding this comment.
This also makes it simple to validate the configuration, no matter where it rests.
To validate the configuration in Zk, you just read all the configs in from Zk, deserialize into POJOs, iterate all the fields looking for the annotation, if you find the annotation, then run your validation logic on the expression.
To validate the configuration on the file system, you just read it all in from the FS, then repeat the same steps I described in the Zk scenario.
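A rough sketch of that idea, to make the steps concrete. Everything here is hypothetical: the `@StellarExpression` annotation, the `TriageRuleSketch` POJO, and the `parensBalanced` check, which is only a placeholder for real Stellar compilation. It walks the top-level fields of one object and does not handle nested maps or collections.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical marker: this field must hold a valid Stellar expression.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface StellarExpression {}

// Hypothetical stand-in for a deserialized config POJO.
class TriageRuleSketch {
  @StellarExpression String rule;
  @StellarExpression String reason;
  double score; // not Stellar, so not annotated and never checked

  TriageRuleSketch(String rule, String reason, double score) {
    this.rule = rule;
    this.reason = reason;
    this.score = score;
  }
}

class AnnotatedFieldValidator {
  // Placeholder check (balanced parentheses only); a real validator would compile the expression.
  static boolean parensBalanced(String expr) {
    int depth = 0;
    for (char c : expr.toCharArray()) {
      if (c == '(') depth++;
      if (c == ')' && --depth < 0) return false;
    }
    return depth == 0;
  }

  // Returns "Class.field" for each annotated String field whose value fails the check.
  static List<String> findInvalid(Object config, Predicate<String> isValid) {
    List<String> failures = new ArrayList<>();
    for (Field f : config.getClass().getDeclaredFields()) {
      if (!f.isAnnotationPresent(StellarExpression.class)) continue;
      try {
        f.setAccessible(true);
        Object value = f.get(config);
        if (value instanceof String && !isValid.test((String) value)) {
          failures.add(config.getClass().getSimpleName() + "." + f.getName());
        }
      } catch (IllegalAccessException e) {
        throw new IllegalStateException("Cannot read " + f.getName(), e);
      }
    }
    return failures;
  }
}
```

The same `findInvalid` call works whether the POJO was deserialized from Zookeeper or from the file system, since it only sees the object.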
There was a problem hiding this comment.
So - if I'm understanding you correctly, as it applies to what I am doing ->
when visiting configs, instead of explicitly validating fields, we would want to 'visit' all the members per pojo by attribute. We would only need to worry about tracking the pojos but not the fields ( we need to keep the context ). Am I understanding you correctly?
There was a problem hiding this comment.
I think we eventually want to get to that point, with a consistent, comprehensive mechanism to validate stellar statements ( short of compiling ;) ).
There was a problem hiding this comment.
I think I know how to do it and not have it be terrible. The question is if it should be in this PR or after
There was a problem hiding this comment.
I think I can do it such that the stellar-common holds the code for doing the traversals as well as the interfaces and callbacks, and the reporter just provides the configs.
I'm thinking it through, but now that I'm thinking of it I will be miserable until I do it so I'll do it.
There was a problem hiding this comment.
I think you are really going to like this @nickwallen
There was a problem hiding this comment.
but if you complain about scope or the review size @nickwallen I will be very cross
|
@simonellistonball Agree to the namespace idea. My bad :) |
|
I am glad for the interest in this PR, and that it seems to have sparked some great ideas for continuing on. What I would like to do is line it up as follows
Over the course of that work, and other work identified through working and reviewing it, I will iteratively refactor to a common code and reusable approach. |
|
Per the conversation above, i'm going to take a stab at the attributed approach. |
Although the original implementation was functional, it required maintenance to keep current. The suggested 'best state' was for the validation system to be able to handle any config, regardless of composition, using annotations. That would leave it up to the implementor to properly annotate their configurations, and would allow for support of new fields. This is an implementation of that. I have refactored the implementations and details, but kept the discovery and mechanics ( loading and visitation ) somewhat the same, hopefully keeping the good and reworking to a more sustainable solution. Several annotations were created to mark certain Stellar configuration objects or scenarios. A holder object was added that holds the configuration object and knows how to process the annotations and run the visitation. This holder object and the annotations have parameters and handling for several special scenarios, such as 2x nested maps. This implementation should facilitate follow-on work to validate files, streams, and blobs by implementing the StellarValidator interface and re-using the holder concept ( replacing the providers ).
|
65278a6 introduces conceptually what @nickwallen and I have been discussing. From the commit -> Refactor based on review and inspiration from review. This is an implementation of that. I have refactored the implementations and details, but kept the discovery and mechanics ( loading and visitation ) somewhat the same. Several annotations were created to mark certain Stellar configuration objects or scenarios. This implementation should facilitate follow-on work to validate files, streams, and blobs by implementing the StellarValidator interface.
|
@cestella @simonellistonball |
|
@nickwallen @mattf-horton I think we can use the annotation approach to resolve METRON-989 as well. Thoughts? |
|
Closing to test build in travis |
|
Bump? |
|
@nickwallen any feedback, does the annotated approach match what you imagined? |
|
@ottobackwards Yes, definitely. I like it at a 50k foot level. The only thing that struck me was the need for the different annotation types. But I haven't had a chance to dig into it yet. |
|
@nickwallen yeah, we need to cover a bunch of scenarios
|
Why is this validation process driven by a %magic command? Magics were made for functionality that cannot be implemented directly within a Stellar execution environment. Often for answering questions 'about' the execution environment itself. For example This logic doesn't seem like a good fit for a %magic, unless there is a limitation that I am not understanding. This could be implemented, I would argue more simply, as a regular Stellar function. |
|
It did not seem appropriate to me for this to be a stellar function, and %magic is the other way to execute things from the shell. At the time at least. |
|
In other words, a stellar function that called the stellar compilation stuff, did not seem correct. |
|
Do you feel strongly that this should be a Function? @cestella ? I'm not opposed to changing it if you are. I would like to hear some more feedback
|
If that question is to me too, yes, I feel strongly it should be a function. A function that is part of the management functions. That was also suggested previously here. |
|
The justification that you mentioned just doesn't seem strong enough to me. Unless there is more that I am missing. IMHO We should only use magic commands for things that can't be accomplished using the preferred extension mechanism; aka defining Stellar functions. |
|
Ok, I'll change it. Feels a little crossing the streams, but we'll see |
|
I don't think it should be in management necessarily though. |
|
@nickwallen I have refactored to a function. |
| * | ||
| * @return Field Name or empty String | ||
| */ | ||
| String qualify_with_field() default ""; |
There was a problem hiding this comment.
These variables_go_against_the_style convention, no? qualify_with_field, qualify_with_field_type, inner_map_keys
| try { | ||
| children = client.getChildren().forPath(PARSER.getZookeeperRoot()); | ||
| } catch (Exception nne) { | ||
| return; |
There was a problem hiding this comment.
Need to log and comment here. We are silently eating the exception. Seems especially problematic because of the overly generic Exception declaration that the Curator library gives us.
| try { | ||
| children = client.getChildren().forPath(ENRICHMENT.getZookeeperRoot()); | ||
| } catch (Exception nne) { | ||
| return; |
There was a problem hiding this comment.
Need to log and comment here. We are silently eating the exception.
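Something along these lines is what I have in mind; a sketch only. `ChildLookupSketch` and `childrenOrEmpty` are hypothetical names, the `Callable` stands in for `client.getChildren().forPath(...)`, and `java.util.logging` is used just to keep the example self-contained rather than because it is the logger the project uses.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

class ChildLookupSketch {
  private static final Logger LOG = Logger.getLogger(ChildLookupSketch.class.getName());

  // 'lookup' stands in for the Curator call, which declares a bare Exception.
  static List<String> childrenOrEmpty(String root, Callable<List<String>> lookup) {
    try {
      return lookup.call();
    } catch (Exception e) {
      // A missing node may only mean nothing is configured for this type,
      // so we carry on, but we leave a trace instead of swallowing the error.
      LOG.log(Level.WARNING, "Unable to list children of " + root + "; skipping", e);
      return Collections.emptyList();
    }
  }
}
```

Returning an empty list keeps the caller's loop simple while the log line records why nothing was visited.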
| List<ExpressionConfigurationHolder> holders = new LinkedList<>(); | ||
| visitParserConfigs(client, holders, errorConsumer); | ||
| visitEnrichmentConfigs(client, holders, errorConsumer); | ||
| visitProfilerConfigs(client, holders, errorConsumer); |
There was a problem hiding this comment.
There is no stellar in indexing is there? It is not in the readme
There was a problem hiding this comment.
I think you're right. Would suggest just a comment to clarify that point and maybe help prompt us should that change
// indexing contains no stellar to validate
| // discover all the StellarConfigurationProvider | ||
| Set<StellarConfigurationProvider> providerSet = new HashSet<>(); | ||
|
|
||
| for (Class<?> c : ClassIndex.getSubclasses(StellarConfigurationProvider.class, |
There was a problem hiding this comment.
What is the following code block doing? Why do we need to discover all of the StellarConfigurationProvider classes?
There was a problem hiding this comment.
If I understand this correctly, it seems that the StellarConfigurationProvider interface allows us to extend where configuration values get pulled in from. In your current default implementation ConfigurationProvider you reach out to Zookeeper to pull in the config.
If I wanted to validate configuration located on a file system, I would just create a FilesystemConfigurationProvider implementation of this interface.
The decision as to whether I want to validate the config in Zookeeper, on the file system or both, needs to be user driven. A user should make that decision based on how they call your new Stellar function.
Based on your discovery logic here, just having a FilesystemConfigurationProvider on the classpath (or any other implementation) will cause the configuration in the file system to get validated. We don't want that to happen. We want the user to control that behavior.
So I don't think we really need this discovery logic, which nicely simplifies things. I think we could just alter the StellarValidator interface to make this relationship simpler and more straightforward. The StellarConfigurationProvider just gets passed in.
public interface StellarValidator {
...
Iterable<ValidationResult> validate(StellarConfigurationProvider provider);
...
}
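To illustrate the caller-driven shape, here is a simplified sketch. `ProviderSketch` and `ValidatorSketch` are hypothetical and much thinner than the PR's interfaces: they return plain strings instead of `ValidationResult`, and the validity check is injected as a predicate rather than calling the Stellar compiler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical source of expressions; Zookeeper, a file system, or a test fixture could implement it.
interface ProviderSketch {
  List<String> expressions();
}

class ValidatorSketch {
  private final Predicate<String> isValid;

  ValidatorSketch(Predicate<String> isValid) {
    this.isValid = isValid;
  }

  // The caller picks the source by passing it in; nothing is discovered from the classpath.
  List<String> validate(ProviderSketch provider) {
    List<String> invalid = new ArrayList<>();
    for (String expr : provider.expressions()) {
      if (!isValid.test(expr)) {
        invalid.add(expr);
      }
    }
    return invalid;
  }
}
```

With this shape, validating Zookeeper, the file system, or both is just a matter of which providers the calling function constructs and passes in.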
There was a problem hiding this comment.
The idea is that not only can we not know the details of the classes that hold the rules, but also, that stellar may be hosted by other things than metron, that 'know' how to provide those configurations.
The problem with this isn't the discovery per se, but that it is not correct given its purpose and the implementation
There was a problem hiding this comment.
I am not following your explanation of why we need the discovery logic. Can you try to explain it again? In the latest commits, I still see the discovery logic in StellarSimpleValidator.
There was a problem hiding this comment.
I will try to think of a different way to put it
There was a problem hiding this comment.
Maybe that is where we are missing each other?
There was a problem hiding this comment.
You mean couple it to metron? We don't want to do that anymore.
I don't see your justification. Maybe another reviewer will understand this better. As is, I am a +0 on this.
There was a problem hiding this comment.
https://issues.apache.org/jira/browse/METRON-989
https://issues.apache.org/jira/browse/METRON-876
@mattf-horton @cestella
I have implemented this as to not increase the amount of tie-in between stellar and metron, and support future non-metron configuration sources.
Thus, the sources of the configuration are discovered and not coupled. I believe this is in the spirit if not the letter of the design discussions that we have had.
Can you take a look?
There was a problem hiding this comment.
It is true that Stellar should exist, as much as possible, independent from Metron; that was the aim of the effort to move stellar out of metron-common and into its own top level component in the project. I'll look closer at this PR and the (rather long, but seemingly coherent...so congrats ;) comment thread and weigh in later.
| * {@code ConfigurationProvider} is used to report all of the configured / deployed Stellar statements in | ||
| * the system. | ||
| */ | ||
| public class ConfigurationProvider implements StellarConfigurationProvider { |
There was a problem hiding this comment.
This implementation of a StellarConfigurationProvider is one that talks to Zookeeper, right? It retrieves Stellar configuration from Zookeeper.
Should we call it ZookeeperConfigurationProvider or something more descriptive?
There was a problem hiding this comment.
That is a good idea
| * StellarConfiguredStatementProviders are used provide stellar statements | ||
| * and the context around those statements to the caller | ||
| */ | ||
| public interface StellarConfiguredStatementContainer { |
There was a problem hiding this comment.
ExpressionConfigurationHolder is an implementation of this interface. That being said, I don't understand the point of this interface.
In all the code that I see, you use the implementation class ExpressionConfigurationHolder rather than this interface. For example, in StellarZookeeperBasedValidator and other places.
We should either use the interface or get rid of it.
There was a problem hiding this comment.
The issue is not using the interface, I'll address that.
| * Returns: The String representation of the enrichment config | ||
|
|
||
| ### Validation Functions | ||
| * `VALIDATE_STELLAR_RULE_CONFIGS` |
There was a problem hiding this comment.
This name confuses me, why Rules? VALIDATE_STELLAR seems simple and to the point to me.
There was a problem hiding this comment.
If you search the tree for 'rules' you will see that we call them rule or rules in various places. It 'seemed like the thing to do'.
There was a problem hiding this comment.
There is a reference to "rules" because that was the first use for Stellar. @cestella called them rules when he first implemented it. But that is ancient history. We use Stellar everywhere now. It has long since outgrown that.
|
|
||
|
|
||
| @Override | ||
| public Iterable<ValidationResult> validate(Optional<LineWriter> writer) { |
There was a problem hiding this comment.
Why do we need a 'writer'? It doesn't make sense to me why we need it nor do we use it. I think it could be removed from the method signature completely in StellarValidator.
There was a problem hiding this comment.
Sorry, I was going to take this out, this is from the prior shell based integration
Attempt to correct the validator from assuming zookeeper. It is more correct that there will be multiple types of validators. We may support injection of more than one type of validator; make that more clear.
|
Refactored based on feedback for some things, based on making what I was trying for more correct in others. |
|
Resolved conflicts, in the event we ever get around to this |
|
Should I close this? |
|
This has gone from a small thing to at least 'say' we have a way to check if we broke all your stellar stuff after upgrade, to stretching it based on feedback, which was a mistake, to a run around, to abandoned review status. I'm going to shelve this
|
The next attempt at this, if there is one, should start off with some sort of consensus first. And some agreement on initial scope. This PR would have been smaller and less ambitious, if that were true. Or at least everyone would have been on the same page as to the 'why' of certain things. |
|
It has been long enough that I don't even like this PR any more. -1 |

This will allow users to check their deployed statements, say after upgrade, when they are at rest ( and would fail on use ).
In other words, they were valid when stored, but are not now because of stellar changes, such as new keywords.
The interface StellarConfiguredStatementReporter, which is @IndexSubclasses ( ClassIndex ) marked, allows the shell to discover reporters that can provide statements for validation. This discovery allows de-coupling of stellar and 'hosts' that know about the location of the stored statements, and the configuration structure details.
metron-common implements this interface, and can run through visiting all the configurations.
A management function was added: VALIDATE_STELLAR_RULE_CONFIGS()
When executed, the shell
I'm getting this out there, still a couple of things todo:
- [x] full dev run. I have been testing with stellar external to full dev iteratively
- [x] readme
- [x] steps to test
- [x] unit test
- [x] ThreatTriage Rule Reason
Testing
=in zookeeper
{
  "parserClassName": "org.apache.metron.parsers.GrokParser",
  "sensorTopic": "squid",
  "parserConfig": {
    "grokPath": "/patterns/squid",
    "patternLabel": "SQUID_DELIMITED",
    "timestampField": "timestamp"
  },
  "fieldTransformations" : [
    {
      "transformation" : "STELLAR"
      ,"output" : [ "full_hostname", "domain_without_subdomains" ]
      ,"config" : {
        "full_hostname" : "URL_TO_HOST(url) ="
        ,"domain_without_subdomains" : "DOMAIN_REMOVE_SUBDOMAINS(full_hostname)"
      }
    }
  ]
}
))
{
  "enrichment" : {
    "fieldMap": {
      "geo": ["ip_dst_addr", "ip_src_addr"],
      "host": ["host"]
    }
  },
  "threatIntel" : {
    "fieldMap": {
      "hbaseThreatIntel": ["ip_src_addr", "ip_dst_addr"]
    },
    "fieldToTypeMap": {
      "ip_src_addr" : ["malicious_ip"],
      "ip_dst_addr" : ["malicious_ip"]
    },
    "triageConfig" : {
      "riskLevelRules" : [
        {
          "rule" : "not(IN_SUBNET(ip_dst_addr, '192.168.0.0/24')) )",
          "score" : 10
        }
      ],
      "aggregator" : "MAX"
    }
  }
}
Working with zookeeper
I am not a zk cli maestro, so I took the easy way out and used ZK-WEB.
Following the readme instructions it was very simple to clone, edit the config for full dev, and run from source. If you log in with the creds in the config you can edit the nodes.
Results
When you run the function, it will report the failed stellar statements.
Examples with and without :
For all changes:
For code changes:
Have you included steps to reproduce the behavior or problem that is being changed or addressed?
Have you included steps or a guide to how the change may be verified and tested manually?
Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
Have you written or updated unit tests and or integration tests to verify your changes?
If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?
For documentation related changes:
Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via
site-book/target/site/index.html: