
Conversation

@cestella (Member)

As it stands, the existing approach to handling PCAP data has trouble with high-volume packet capture. With the advent of a DPDK plugin for capturing packet data, we are going to hit limits on consumption throughput if we continue to try to push packet data into HBase at line speed.

Furthermore, storing PCAP data in HBase limits the range of filter queries we can perform to those expressible within the key. As of now, we require all fields to be present (source IP/port, destination IP/port, and protocol) rather than allowing any wildcards.

To address these issues, we should create a higher-performance topology which attaches the appropriate header to the raw packet and timestamp read from Kafka (as placed onto Kafka by the packet capture sensor) and appends this packet to a sequence file in HDFS. The sequence file will be rolled based on the number of packets or on time (e.g. 1 hour's worth of packets in a given sequence file).
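For illustration, a minimal sketch of what that write path could look like, assuming packets are keyed by a LongWritable timestamp with the raw bytes as a BytesWritable value. The class name, the thresholds, and the file-naming scheme below are hypothetical, not the actual Metron implementation:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;

/**
 * Hypothetical sketch of the write side: each packet pulled from Kafka is
 * appended to a SequenceFile, which is rolled once either a packet-count
 * or an elapsed-time threshold is hit.
 */
public class RollingPcapWriter {
  private final Configuration conf = new Configuration();
  private final long maxPackets;     // roll after this many packets...
  private final long maxIntervalMs;  // ...or after this much time
  private SequenceFile.Writer writer;
  private long packetsWritten = 0;
  private long fileOpenedAt = 0;

  public RollingPcapWriter(long maxPackets, long maxIntervalMs) {
    this.maxPackets = maxPackets;
    this.maxIntervalMs = maxIntervalMs;
  }

  public void write(long tsNanos, byte[] rawPacket) throws IOException {
    long now = System.currentTimeMillis();
    if (writer == null || packetsWritten >= maxPackets
        || now - fileOpenedAt >= maxIntervalMs) {
      roll(now);
    }
    writer.append(new LongWritable(tsNanos), new BytesWritable(rawPacket));
    packetsWritten++;
  }

  private void roll(long now) throws IOException {
    if (writer != null) {
      writer.close();
    }
    // Illustrative naming only; the real path/name convention may differ.
    Path path = new Path("/apps/metron/pcap/pcap_" + now + ".seq");
    writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(path),
        SequenceFile.Writer.keyClass(LongWritable.class),
        SequenceFile.Writer.valueClass(BytesWritable.class));
    packetsWritten = 0;
    fileOpenedAt = now;
  }
}
```

Rolling on both packet count and elapsed time bounds the size of any one file while still guaranteeing that a slow feed produces a new file every interval.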

On the query side, we should adjust the middle-tier service layer to start an MR job over the appropriate set of sequence files and filter for the matching packets. NOTE: the UI modifications to make this reasonable for the end-user will need to be done in a follow-on JIRA.
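To make the query side concrete, here is a hedged sketch of the kind of filtering mapper such an MR job could run. PcapFilterMapper and the pcap.filter.* configuration keys are invented for illustration, and the sketch assumes the stored bytes begin at the IPv4 header (the real on-disk layout may include link-layer framing). The point is that any field left blank acts as a wildcard, which the HBase key scheme could not express:

```java
import java.io.IOException;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Mapper;

/** Hypothetical mapper: pass through only the packets matching the filter. */
public class PcapFilterMapper
    extends Mapper<LongWritable, BytesWritable, LongWritable, BytesWritable> {

  private String srcIp, dstIp;
  private int srcPort, dstPort;

  @Override
  protected void setup(Context ctx) {
    // Filter fields arrive via the job configuration; blank or -1 means
    // "wildcard" -- the flexibility the HBase key approach lacked.
    srcIp   = ctx.getConfiguration().get("pcap.filter.srcIp", "");
    dstIp   = ctx.getConfiguration().get("pcap.filter.dstIp", "");
    srcPort = ctx.getConfiguration().getInt("pcap.filter.srcPort", -1);
    dstPort = ctx.getConfiguration().getInt("pcap.filter.dstPort", -1);
  }

  @Override
  protected void map(LongWritable ts, BytesWritable value, Context ctx)
      throws IOException, InterruptedException {
    if (matches(value.copyBytes())) {
      ctx.write(ts, value);
    }
  }

  // Assumes the payload starts at the IPv4 header (no link-layer framing).
  private boolean matches(byte[] p) {
    int ihl = (p[0] & 0x0F) * 4;  // IPv4 header length in bytes
    String src = ipToString(p, 12);
    String dst = ipToString(p, 16);
    int sPort = ((p[ihl] & 0xFF) << 8) | (p[ihl + 1] & 0xFF);
    int dPort = ((p[ihl + 2] & 0xFF) << 8) | (p[ihl + 3] & 0xFF);
    return (srcIp.isEmpty() || srcIp.equals(src))
        && (dstIp.isEmpty() || dstIp.equals(dst))
        && (srcPort < 0 || srcPort == sPort)
        && (dstPort < 0 || dstPort == dPort);
  }

  private static String ipToString(byte[] p, int off) {
    return (p[off] & 0xFF) + "." + (p[off + 1] & 0xFF) + "."
         + (p[off + 2] & 0xFF) + "." + (p[off + 3] & 0xFF);
  }
}
```

The surviving (timestamp, packet) pairs would then be collected and reassembled by the service layer into the single PCAP payload that is returned to the caller.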

In order to test this PR, I would suggest doing the following as the "happy path":

  1. Install the pycapa library & utility via instructions here
  2. (if using singlenode vagrant) Kill the enrichment and sensor topologies via for i in bro enrichment yaf snort;do storm kill $i;done
  3. Start the pcap topology via /usr/metron/0.1BETA/bin/start_pcap_topology.sh
  4. Start the pycapa packet capture producer on eth1 via /usr/bin/pycapa --producer --topic pcap -i eth1 -k node1:6667
  5. Watch the topology in the Storm UI and kill the packet capture utility from before when the number of packets ingested is over 1k.
  6. Ensure that at least 2 files exist on HDFS by running hadoop fs -ls /apps/metron/pcap
  7. Choose a file (denoted by $FILE) and dump a few of the contents using the pcap_inspector utility via /usr/metron/0.1BETA/bin/pcap_inspector.sh -i $FILE -n 5
  8. Choose one of the lines and note the source IP/port and destination IP/port
  9. Go to the Kibana app at http://node1:5000 on the singlenode vagrant (YMMV on EC2) and enter that query in the Kibana PCAP panel.
  10. Wait patiently while the MR job completes; the results are sent back in the form of a valid PCAP payload suitable for opening in Wireshark.
  11. Open the payload in Wireshark to ensure it is valid.

If the payload is not valid PCAP, then please look at the job history and note the reason for the job failure, if any.

Also, please note the changes and additions to the documentation for the pcap service and pcap backend.

```yaml
command: storm jar {{ metron_directory }}/lib/{{ metron_parsers_jar_name }} org.apache.storm.flux.Flux --filter {{ metron_parsers_properties_config_path }} --remote {{ item }}
command: "{{ metron_directory }}/bin/start_parser_topology.sh {{ item }}"
with_items:
  - "{{ storm_parser_topologies }}"
```
Contributor

Everything worked well in EC2. Could you add an auto-start capability to the deployment? Perhaps just add pcap to the list of parser topologies?

Member Author

So, adding pcap to the list of parser topologies won't do it, because pcap has a special script (start_pcap_topology.sh) owing to its different config file (all of the parser topologies share the same config). Also, it's just a different sort of beast from a parser topology (i.e. we don't actually parse anything; we just take the raw data, slap on a header, and put it in HDFS).

That being said, what I think we need to do is start the pcap topology when pycapa is installed. I'll have to look into where and how to do that in ansible. If you have any thoughts or suggestions, I'd be all ears. ;)

Member Author

In retrospect, why don't we push this to a follow-on JIRA?

Contributor

I'm for that. It lets us have a bit of a think on it without holding this up.

@dlyle65535 (Contributor)

Since we're fixed on Java 8 after this, do you think it would make sense to get rid of all of the JVM parameters that cause these kinds of warnings during the Maven run:

```
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseSplitVerifier; support was removed in 8.0
```

@cestella (Member Author)

@dlyle65535 Definitely agreed, I'll submit a change this morning to remove the warnings.

@dlyle65535 (Contributor)

+1 on this, looks great!

Ran it up in EC2 with pycapa enabled (the default) after starting the topology; everything just worked.

@nickwallen (Contributor)

+1 Deployed successfully on EC2. All existing feeds worked out of the box. Followed the manual instructions to deploy the topology. Was able to successfully open and validate the pcap file produced by the Metron UI.
