METRON-366: Add MODEL_APPLY to Stellar #210

cestella · 2016-08-15T12:48:22Z

The preferred method of applying models should be via Stellar integration. This should be added as a function and made available as a FieldTransformation and as part of Threat Triage.

This has been run on full-dev. Testing instructions pending...

cestella · 2016-08-15T15:03:22Z

Testing Instructions

Free Up Space on SNV

First, let's free up some headroom on SNV. If you are running this on a multinode cluster, you would not have to do this.

Kill monit via service monit stop
Kill tcpreplay via for i in $(ps -ef | grep tcpreplay | awk '{print $2}');do kill -9 $i;done
Kill existing parser topologies via
- storm kill snort
- storm kill bro
Kill flume via for i in $(ps -ef | grep flume | awk '{print $2}');do kill -9 $i;done
Kill yaf via for i in $(ps -ef | grep yaf | awk '{print $2}');do kill -9 $i;done
Kill bro via for i in $(ps -ef | grep bro | awk '{print $2}');do kill -9 $i;done

Install Prerequisites and Mock DGA Service

Now let's install some prerequisites:

Flask via yum install python-flask
Jinja2 via yum install python-jinja2
Squid client via yum install squid
ES Head plugin via /usr/share/elasticsearch/bin/plugin install mobz/elasticsearch-head

Start Squid via service squid start

Now that we have flask and jinja, we can create a mock DGA service to deploy with MaaS:

Download the files in this gist into the /root/mock_dga directory
Make rest.sh executable via chmod +x /root/mock_dga/rest.sh

This service will treat yahoo.com and amazon.com as legit and everything else as malicious. The contract is that the REST service exposes an endpoint /apply and returns a JSON map with a single key, is_malicious, whose value is either malicious or legit.
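The gist is not reproduced in this thread, so the following is only a minimal sketch of the contract just described, not the actual service. It assumes the domain arrives as a `host` query parameter (as in the curl sanity check in these instructions); the names `LEGIT_DOMAINS`, `classify`, and `handle_apply` are hypothetical.

```python
import json
from urllib.parse import urlparse, parse_qs

# Assumption: the only domains the mock treats as legit, per the description.
LEGIT_DOMAINS = {"yahoo.com", "amazon.com"}


def classify(host):
    """Return the single-key map the contract describes for one host."""
    label = "legit" if host in LEGIT_DOMAINS else "malicious"
    return {"is_malicious": label}


def handle_apply(request_path):
    """Handle a request path such as '/apply?host=example.com'.

    Returns (status_code, json_body) so the logic can sit behind any
    HTTP framework (hypothetical helper, not part of the gist).
    """
    parsed = urlparse(request_path)
    if parsed.path != "/apply":
        return 404, json.dumps({"error": "unknown endpoint"})
    host = parse_qs(parsed.query).get("host", [""])[0]
    return 200, json.dumps(classify(host))
```

Keeping the classification in a plain function separates the contract (one key, two possible values) from whatever web framework serves it.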

Deploy Mock DGA Service via MaaS

Now let's start MaaS and deploy the Mock DGA Service:

Start MaaS via /usr/metron/0.2.0BETA/bin/maas_service.sh -zq node1:2181
Start one instance of the mock DGA model with 512M of memory via /usr/metron/0.2.0BETA/bin/maas_deploy.sh -zq node1:2181 -lmp /root/mock_dga -hmp /user/root/models -mo ADD -m 512 -n dga -v 1.0 -ni 1
As a sanity check:
- Ensure that the model is running via /usr/metron/0.2.0BETA/bin/maas_deploy.sh -zq node1:2181 -mo LIST. You should see Model dga @ 1.0 displayed and, under it, a URL such as (but not exactly) http://node1:36161
- Try to hit the model via curl, substituting the port reported by LIST: curl 'http://localhost:36161/apply?host=caseystella.com' and ensure that it returns a JSON map indicating the domain is malicious.

Adjust Configurations for Squid to Call Model

Now that we have a deployed model, let's adjust the configurations for the Squid topology to annotate the messages with the output of the model.

Edit the squid parser configuration at /usr/metron/0.2.0BETA/config/zookeeper/parsers/squid.json in your favorite text editor and add a new FieldTransformation to indicate a threat alert based on the model (note the addition of is_malicious and is_alert):

{
  "parserClassName": "org.apache.metron.parsers.GrokParser",
  "sensorTopic": "squid",
  "parserConfig": {
    "grokPath": "/patterns/squid",
    "patternLabel": "SQUID_DELIMITED",
    "timestampField": "timestamp"
  },
  "fieldTransformations": [
    {
      "transformation": "STELLAR",
      "output": [ "full_hostname", "domain_without_subdomains", "is_malicious", "is_alert" ],
      "config": {
        "full_hostname": "URL_TO_HOST(url)",
        "domain_without_subdomains": "DOMAIN_REMOVE_SUBDOMAINS(full_hostname)",
        "is_malicious": "MAP_GET('is_malicious', MAAS_MODEL_APPLY(MAAS_GET_ENDPOINT('dga'), {'host' : domain_without_subdomains}))",
        "is_alert": "if is_malicious == 'malicious' then 'true' else null"
      }
    }
  ]
}
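To make the data flow of the four config entries easier to follow, here is a rough Python emulation of the chain. It is a sketch under assumptions, not how Stellar evaluates the config: `url_to_host` and `domain_remove_subdomains` are naive stand-ins for the Stellar functions of similar name, `fake_model` is a hypothetical replacement for the MaaS call, and a null result is modeled by leaving the key unset.

```python
from urllib.parse import urlparse


def url_to_host(url):
    """Rough stand-in for URL_TO_HOST: the host portion of a URL."""
    return urlparse(url).hostname


def domain_remove_subdomains(hostname):
    """Naive stand-in for DOMAIN_REMOVE_SUBDOMAINS: keep the last two labels.

    Assumption: ignores multi-part suffixes such as .co.uk.
    """
    return ".".join(hostname.split(".")[-2:])


def fake_model(args):
    """Hypothetical stand-in for the deployed dga model."""
    legit = args["host"] in {"yahoo.com", "amazon.com"}
    return {"is_malicious": "legit" if legit else "malicious"}


def transform(message, model_apply):
    """Apply the config entries in order; later entries read earlier outputs."""
    out = dict(message)
    out["full_hostname"] = url_to_host(out["url"])
    out["domain_without_subdomains"] = domain_remove_subdomains(out["full_hostname"])
    result = model_apply({"host": out["domain_without_subdomains"]})
    out["is_malicious"] = result.get("is_malicious")
    # Mirrors: if is_malicious == 'malicious' then 'true' else null
    if out["is_malicious"] == "malicious":
        out["is_alert"] = "true"
    return out
```

For example, `transform({"url": "http://www.cnn.com/index.html"}, fake_model)` yields a message carrying `is_malicious` and `is_alert`, while a yahoo.com URL yields one without `is_alert`.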

Edit the squid enrichment configuration at /usr/metron/0.2.0BETA/config/zookeeper/enrichments/squid.json (this file will not exist, so create a new one) to make the threat triage adjust the level of risk based on the model output:

{
  "index": "squid",
  "batchSize": 1,
  "enrichment" : {
    "fieldMap": {}
  },
  "threatIntel" : {
    "fieldMap":{},
    "triageConfig" : {
      "riskLevelRules" : {
        "is_malicious == 'malicious'" : 100
      },
      "aggregator" : "MAX"
    }
  }
}
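The triage block above pairs a rule with a score and names an aggregator. The following Python sketch shows one way to read that: every rule that matches contributes its score, and the aggregator (here MAX) reduces them to one level. The predicate functions stand in for Stellar expressions, and returning None when nothing matches is a choice made for this sketch, not a statement about Metron's behavior.

```python
def is_malicious_rule(message):
    """Stand-in for the Stellar rule: is_malicious == 'malicious'."""
    return message.get("is_malicious") == "malicious"


# Mirrors riskLevelRules: (rule, score) pairs.
RULES = [(is_malicious_rule, 100)]


def triage(message, rules, aggregator=max):
    """Collect the score of each matching rule, then aggregate them."""
    scores = [score for rule, score in rules if rule(message)]
    return aggregator(scores) if scores else None
```

With several rules, MAX means the message takes the highest matching score rather than a sum.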

Upload new configs via /usr/metron/0.2.0BETA/bin/zk_load_configs.sh --mode PUSH -i /usr/metron/0.2.0BETA/config/zookeeper -z node1:2181
Make the Squid topic in kafka via /usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper node1:2181 --create --topic squid --partitions 1 --replication-factor 1

Start Topologies and Send Data

Now we need to start the topologies and send some data:

Start the squid topology via /usr/metron/0.2.0BETA/bin/start_parser_topology.sh -k node1:6667 -z node1:2181 -s squid
Generate some data via the squid client:
- Generate a legit example: squidclient http://yahoo.com
- Generate a malicious example: squidclient http://cnn.com
Send the data to kafka via cat /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid
Browse the data in Elasticsearch via the ES Head plugin @ http://node1:9200/_plugin/head/ and verify that the squid index contains two documents:
- One from yahoo.com which does not have is_alert set and does have is_malicious set to legit
- One from cnn.com which does have is_alert set to true, is_malicious set to malicious and threat:triage:level set to 100
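If you would rather check the outcome programmatically than by eye, the two expectations above can be written as a small Python check. This is a sketch: it assumes you have already fetched the documents as dicts and that each carries the `domain_without_subdomains` field produced by the parser config; `check_squid_docs` is a hypothetical helper.

```python
def check_squid_docs(docs):
    """Return a list of problems; an empty list means both docs look right."""
    by_domain = {d.get("domain_without_subdomains"): d for d in docs}
    problems = []

    yahoo = by_domain.get("yahoo.com")
    if yahoo is None:
        problems.append("no document for yahoo.com")
    else:
        if yahoo.get("is_malicious") != "legit":
            problems.append("yahoo.com should have is_malicious == legit")
        if yahoo.get("is_alert") is not None:
            problems.append("yahoo.com should not have is_alert set")

    cnn = by_domain.get("cnn.com")
    if cnn is None:
        problems.append("no document for cnn.com")
    else:
        if cnn.get("is_malicious") != "malicious":
            problems.append("cnn.com should have is_malicious == malicious")
        if str(cnn.get("is_alert")).lower() != "true":
            problems.append("cnn.com should have is_alert == true")
        # Compare numerically in case the level is stored as a string.
        if float(cnn.get("threat:triage:level", 0)) != 100:
            problems.append("cnn.com should have threat:triage:level == 100")

    return problems
```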

cestella · 2016-08-15T16:12:36Z

Please note that it is non-optimal to only be able to reference models from the beginning (i.e. parsers) and end (i.e. threat triage) of the pipeline. As a follow-on, I'll be adding an enrichment adapter which can be called from the enrichment or threat triage phase and perform arbitrary Stellar statement transformations. This should fill in the gap and allow the user to apply their models anywhere in the pipeline.

merrimanr · 2016-08-19T14:58:14Z

+1 by inspection. Will try to run it up on full-dev later today or this weekend. Nice job!

nickwallen · 2016-08-19T15:44:38Z

Are the model classes in org.apache.metron.maas documented anywhere? These classes seem to represent core abstractions.

A few questions come to mind that some simple javadoc might help with. But one example... A Model has a version and a ModelEndpoint has a version. Why do they both have versions? Wouldn't they evolve together and so have the same version?

nickwallen · 2016-08-19T18:22:13Z

...ics/metron-maas-common/src/main/java/org/apache/metron/maas/discovery/ServiceDiscoverer.java

+                            .concurrencyLevel(4)
+                            .weakKeys()
+                            .expireAfterWrite(10, TimeUnit.MINUTES)
+                            .build();


Should we parameterize these settings?

cestella · 2016-08-19T18:31:23Z

Happy to add javadoc to that package. To answer your question, Model and ModelEndpoint are different in the sense that Model is a reference to the model. The ModelEndpoint is a reference to where the model is currently being served. You generally search for a Model and are returned a set of ModelEndpoints. If you do not specify a version in the search, you will get ModelEndpoints of multiple versions of the same model.
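A rough illustration of that distinction, written as a Python sketch rather than the actual Java classes: the field names are assumptions (url and container id are suggested by the getters in the ModelSubmission diff), and `find_endpoints` is a hypothetical stand-in for the discovery lookup.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    """A reference to a model; version None means 'any version' in a search."""
    name: str
    version: Optional[str] = None


@dataclass(frozen=True)
class ModelEndpoint:
    """Where one concrete version of a model is currently being served."""
    name: str
    version: str
    url: str
    container_id: str


def find_endpoints(registry, model):
    """Return every endpoint serving the model, across versions if unspecified."""
    return [
        ep for ep in registry
        if ep.name == model.name
        and (model.version is None or ep.version == model.version)
    ]
```

So the version on Model is a search constraint, while the version on ModelEndpoint records what a given container is actually serving.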

dlyle65535 · 2016-08-19T18:42:08Z

...alytics/metron-maas-service/src/main/java/org/apache/metron/maas/submit/ModelSubmission.java

          for(ModelEndpoint endpoint : kv.getValue()){
-            System.out.println("\t" + endpoint.getContainerId() + " at " + endpoint.getUrl());
+            System.out.println(endpoint);
          }


Not a today thing, but do you think this would be better off as a logger?

Actually, the LIST operation is intended to output the list of endpoints returned. What I think I will do is make some of the logging debug level because it's getting quite chatty in practice and redundant.

Got it, makes sense to me.

dlyle65535 · 2016-08-19T19:36:33Z

+1, worked like a champ!

I'm going to run it up one more time with some additional skip tags. Since we're close on memory headroom, it doesn't make sense to me to take time to install a bunch of stuff I'll just have to shut down. If I have any success, I'll put the commands here.

I'm also using the new quick-dev image (vagrant box update).

dlyle65535 · 2016-08-19T20:41:25Z

Also successfully tested doing the following:

Rather than run.sh, I ran:

vagrant --ansible-tags="hdp-deploy,metron" --ansible-skip-tags="solr,sensors,start,report,monit" up

Started enrichment and indexing topologies:

/usr/metron/0.2.0BETA/bin/start_enrichment_topology.sh
/usr/metron/0.2.0BETA/bin/start_elasticsearch_topology.sh

Followed your instructions

This dropped the startup time to just over 14 minutes.

cestella · 2016-08-19T20:43:39Z

14 minutes?! Woah! @dlyle65535 is my hero.

METRON-366: Add MODEL_APPLY to Stellar

3888a19

Update README.md

3064d74

nickwallen reviewed Aug 19, 2016

dlyle65535 reviewed Aug 19, 2016

Reacted to PR requests for better docs and code clarity improvements.

411d344

asfgit closed this in 1ea8d9e Aug 22, 2016

cestella mentioned this pull request Sep 19, 2016

METRON-429 Profiler Missing Dependencies When Uber Jar Deployed #259

Closed

mmiklavc mentioned this pull request Oct 21, 2016

METRON-495: Upgrade Storm to 1.0.x #318

Closed
