METRON-119 Move PCAP infrastructure from HBase #93
Conversation
command: storm jar {{ metron_directory }}/lib/{{ metron_parsers_jar_name }} org.apache.storm.flux.Flux --filter {{ metron_parsers_properties_config_path }} --remote {{ item }}
command: "{{ metron_directory }}/bin/start_parser_topology.sh {{ item }}"
with_items:
  - "{{ storm_parser_topologies }}"
Everything worked well in EC2. Could you add an auto-start capability to deployment? Perhaps just add pcap to the list of parser topologies?
So, adding pcap to the list of parser topologies won't do it because pcap has a special script (start_pcap_topology.sh) due to it having a different config file (all of the parser topologies share the same config). Also, it's just a different sort of beast than a parser topology (i.e. we don't actually parse anything, we just take the raw data, slap on a header and put it in HDFS).
That being said, what I think we need to do is start the pcap topology when pycapa is installed. I'll have to look into where and how to do that in ansible. If you have any thoughts or suggestions, I'd be all ears. ;)
In retrospect, why don't we push this to a follow-on JIRA?
I'm for that. It lets us have a bit of a think on it without holding this up.
Since we're fixed on Java 8 after this, do you think it would make sense to get rid of all of the JVM parameters that cause these kinds of warnings during the Maven run: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
@dlyle65535 Definitely agreed; I'll submit a change this morning to remove the warnings.
+1 on this, looks great! Ran it up in EC2 with pycapa enabled (the default) after starting the topology; everything just worked.
+1 Deployed successfully on EC2. All existing feeds worked out of the box. Followed the manual instructions to deploy the topology. Was able to successfully open and validate the pcap file produced by the Metron UI.
As it stands, the existing approach to handling PCAP data has trouble keeping up with high-volume packet capture. With the advent of a DPDK plugin for capturing packet data, we are going to hit limits on consumption throughput if we continue to try to push packet data into HBase at line speed.
Furthermore, storing PCAP data in HBase limits the range of filter queries that we can perform (i.e. only those expressible within the key). As of now, we require all fields to be present (source IP/port, destination IP/port, and protocol) rather than allowing any wildcards.
To address these issues, we should create a higher-performance topology which attaches the appropriate header to the raw packet and timestamp read from Kafka (as placed onto Kafka by the packet capture sensor) and appends the packet to a sequence file in HDFS. The sequence file will be rolled based on the number of packets or on time (e.g. one hour's worth of packets per sequence file).
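To make the write path concrete, here is a minimal sketch of the SequenceFile-based storage described above, assuming packets are keyed by timestamp with the headered packet bytes as the value. The class name and the roll threshold are hypothetical illustrations, not the actual Metron topology code; rolling by packet count is shown for brevity, and a time-based roll would work the same way.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;

// Hypothetical sketch of the write side: append timestamp-keyed packets to a
// rolled SequenceFile in HDFS.
public class PcapSequenceFileSketch {
  private static final long MAX_PACKETS_PER_FILE = 100_000; // hypothetical roll threshold
  private SequenceFile.Writer writer;
  private long packetsWritten = 0;

  // Append one packet (already prefixed with its header) keyed by its timestamp.
  public void write(long timestamp, byte[] headeredPacket) throws IOException {
    if (writer == null || packetsWritten >= MAX_PACKETS_PER_FILE) {
      roll(timestamp);
    }
    writer.append(new LongWritable(timestamp), new BytesWritable(headeredPacket));
    packetsWritten++;
  }

  // Close the current file and open a new one named by the first timestamp it will hold.
  private void roll(long firstTimestamp) throws IOException {
    if (writer != null) {
      writer.close();
    }
    Configuration conf = new Configuration();
    Path path = new Path("/apps/metron/pcap/pcap_" + firstTimestamp + ".seq"); // hypothetical naming
    writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(path),
        SequenceFile.Writer.keyClass(LongWritable.class),
        SequenceFile.Writer.valueClass(BytesWritable.class));
    packetsWritten = 0;
  }
}
```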
On the query side, we should adjust the middle-tier service layer to start an MR job over the appropriate set of sequence files to extract the matching packets. NOTE: the UI modifications to make this reasonable for the end user will need to be done in a follow-on JIRA.
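For the query side, here is a hedged sketch of the kind of MapReduce filter the service layer could launch over the rolled files, assuming the timestamp-keyed layout above. The class name and configuration keys are hypothetical, and a real filter would also parse the packet bytes to match source/destination IP, port, and protocol rather than just the time window.

```java
import java.io.IOException;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical sketch: keep only the packets whose timestamp falls inside the
// requested query window; everything else is dropped.
public class PcapFilterMapper
    extends Mapper<LongWritable, BytesWritable, LongWritable, BytesWritable> {

  private long startTs;
  private long endTs;

  @Override
  protected void setup(Context context) {
    // Query bounds passed in by whatever service layer launches the job (hypothetical keys).
    startTs = context.getConfiguration().getLong("pcap.filter.start", Long.MIN_VALUE);
    endTs = context.getConfiguration().getLong("pcap.filter.end", Long.MAX_VALUE);
  }

  @Override
  protected void map(LongWritable timestamp, BytesWritable packet, Context context)
      throws IOException, InterruptedException {
    if (timestamp.get() >= startTs && timestamp.get() <= endTs) {
      context.write(timestamp, packet);
    }
  }
}
```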
In order to test this PR, I would suggest doing the following as the "happy path":
1. Kill the existing parser topologies: `for i in bro enrichment yaf snort; do storm kill $i; done`
2. Start the pcap topology: `/usr/metron/0.1BETA/bin/start_pcap_topology.sh`
3. Start pycapa to push packets onto the Kafka topic: `/usr/bin/pycapa --producer --topic pcap -i eth1 -k node1:6667`
4. Confirm that sequence files are being written to HDFS: `hadoop fs -ls /apps/metron/pcap`
5. Inspect one of the files with the pcap_inspector utility via `/usr/metron/0.1BETA/bin/pcap_inspector.sh -i $FILE -n 5` (see the sketch after this list).

If the payload is not valid PCAP, then please look at the job history and note the reason for job failure, if any.
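For context on the inspection step, here is a hedged sketch of reading records back out of one of the rolled files, in the spirit of pcap_inspector.sh; the actual utility's output format may differ, and the class name here is hypothetical. It assumes the same LongWritable/BytesWritable layout sketched above.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;

// Hypothetical sketch: print the timestamp key and payload size of the first N records.
public class PcapInspectSketch {
  public static void main(String[] args) throws Exception {
    Path file = new Path(args[0]);          // e.g. a file under /apps/metron/pcap
    int limit = Integer.parseInt(args[1]);  // number of records to print, e.g. 5

    Configuration conf = new Configuration();
    try (SequenceFile.Reader reader =
             new SequenceFile.Reader(conf, SequenceFile.Reader.file(file))) {
      LongWritable timestamp = new LongWritable();
      BytesWritable packet = new BytesWritable();
      for (int i = 0; i < limit && reader.next(timestamp, packet); i++) {
        System.out.println(timestamp.get() + " -> " + packet.getLength() + " bytes");
      }
    }
  }
}
```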
Also, please note the changes and additions to the documentation for the pcap service and pcap backend.