This repository was archived by the owner on Aug 20, 2025. It is now read-only.

Contributor

@mmiklavc mmiklavc commented Sep 16, 2016

This builds on efforts from #217

This closes https://issues.apache.org/jira/browse/METRON-257

The purpose for this PR is to give the user the ability to specify how many records per file should be written by the PCAP CLI tool. For example, if 1,000 records are returned by a PCAP query and the user specifies 200 records per file, then the user should expect 5 PCAP files to be written to the current working directory.
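The arithmetic in the example above can be sketched as follows. This is only an illustration of the expected chunking behavior, not the code in this PR; the class name `RecordsPerFileSketch` and the `partition` helper are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class RecordsPerFileSketch {

    /**
     * Splits the records into consecutive chunks of at most recordsPerFile
     * elements. Each chunk stands in for one output PCAP file.
     * (Hypothetical helper, not part of the Metron code base.)
     */
    static <T> List<List<T>> partition(List<T> records, int recordsPerFile) {
        if (recordsPerFile <= 0) {
            throw new IllegalArgumentException("recordsPerFile must be positive");
        }
        List<List<T>> files = new ArrayList<>();
        for (int start = 0; start < records.size(); start += recordsPerFile) {
            int end = Math.min(start + recordsPerFile, records.size());
            files.add(new ArrayList<>(records.subList(start, end)));
        }
        return files;
    }

    public static void main(String[] args) {
        // 1,000 placeholder records, 200 per file
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            records.add(i);
        }
        List<List<Integer>> files = partition(records, 200);
        System.out.println(files.size() + " files"); // prints "5 files"
    }
}
```

When the record count is not an exact multiple of the per-file limit, the last chunk simply holds the remainder, which matches the "last file has <= 500 records" expectation in the testing steps.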

Note - I tested this on quick-dev

Testing
Get PCAP data into Metron: Install and set up pycapa. The instructions below mirror those in PR-93.

  1. Install the pycapa library & utility $ cd /opt/pycapa/pycapa && pip install -r requirements.txt && python setup.py install
  2. (if using singlenode vagrant) Kill the enrichment and sensor topologies via for i in bro enrichment yaf snort;do storm kill $i;done
  3. Start the pcap topology via /usr/metron/0.2.0BETA/bin/start_pcap_topology.sh
  4. Start the pycapa packet capture producer on eth1 via /usr/bin/pycapa --producer --topic pcap -i eth1 -k node1:6667
  5. Watch the topology in the Storm UI. Once more than 3,000 packets have been ingested, kill the pycapa producer started in the previous step.
  6. Ensure that at least 3 files exist on HDFS by running hadoop fs -ls /apps/metron/pcap
  7. Choose a file (denoted by $FILE) and dump a few of the contents using the pcap_inspector utility via /usr/metron/0.2.0BETA/bin/pcap_inspector.sh -i $FILE -n 5
  8. Choose one of the lines and note the protocol.
  9. Note that when you run the commands below, the resulting files will be written to the directory from which you launched the job.

Fixed filter

  1. Run a fixed filter query by executing the following command with the values noted above (match your start_time format to the date format provided - default is to use millis since epoch)
  2. /usr/metron/0.2.0BETA/bin/pcap_query.sh fixed -st <start_time> -df "yyyyMMdd" -p <protocol_num> -rpf 500
  3. Verify the MR job finishes successfully. Upon completion, you should see multiple files named with relatively current datestamps in your current directory, e.g. pcap-data-20160617160549737+0000.pcap
  4. Copy the files to your local machine and verify you can open them in Wireshark. I chose a middle file and the last file. The middle file should have 500 records (per the records_per_file option), and the last one will likely have <= 500 records.

Query filter

  1. Run a Stellar query filter query by executing a command similar to the following, with the values noted above (match your start_time format to the date format provided - default is to use millis since epoch)
  2. /usr/metron/0.2.0BETA/bin/pcap_query.sh query -st "20160617" -df "yyyyMMdd" -query "protocol == '6'" -rpf 500
  3. Verify the MR job finishes successfully. Upon completion, you should see multiple files named with relatively current datestamps in your current directory, e.g. pcap-data-20160617160549737+0000.pcap
  4. Copy the files to your local machine and verify you can open them in Wireshark. I chose a middle file and the last file. The middle file should have 500 records (per the records_per_file option), and the last one will likely have <= 500 records.
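As an alternative to counting packets by eye in Wireshark, the per-file record counts from the verification steps can be checked with a small standalone program. The sketch below assumes the output files use the classic libpcap layout (a 24-byte global header followed by records that each start with a 16-byte header whose third 32-bit field is the captured length). The class name `PcapRecordCounter` is hypothetical and this is not part of Metron.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PcapRecordCounter {

    /** Counts the packet records in a classic libpcap stream. */
    static int countRecords(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);

        // Global header: 24 bytes, starting with the magic number.
        byte[] global = new byte[24];
        in.readFully(global);
        int magic = ByteBuffer.wrap(global, 0, 4).order(ByteOrder.BIG_ENDIAN).getInt();

        // The magic number tells us the byte order of the remaining fields
        // (microsecond and nanosecond timestamp variants are both accepted).
        ByteOrder order;
        if (magic == 0xa1b2c3d4 || magic == 0xa1b23c4d) {
            order = ByteOrder.BIG_ENDIAN;
        } else if (magic == 0xd4c3b2a1 || magic == 0x4d3cb2a1) {
            order = ByteOrder.LITTLE_ENDIAN;
        } else {
            throw new IOException("Not a classic pcap file");
        }

        byte[] header = new byte[16];
        int count = 0;
        while (true) {
            int first = in.read();
            if (first < 0) {
                break; // clean end of file between records
            }
            header[0] = (byte) first;
            in.readFully(header, 1, 15);

            // Record header: ts_sec, ts_usec, incl_len, orig_len (4 bytes each).
            int inclLen = ByteBuffer.wrap(header, 8, 4).order(order).getInt();
            if (inclLen < 0) {
                throw new IOException("Corrupt record length: " + inclLen);
            }
            in.readFully(new byte[inclLen]); // skip over the packet bytes
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PcapRecordCounter pcap-data-*.pcap
        for (String path : args) {
            try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
                System.out.println(path + ": " + countRecords(in) + " records");
            }
        }
    }
}
```

With -rpf 500, every file except the last would be expected to report 500 records, and the last to report 500 or fewer.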

References:

reader = null;
}
} catch (IOException e) {
// ah well, we tried...
Contributor

Can we log this Exception? I'm not sure we can do anything about it, but it would be nice to be able to ensure we can see it.

@nickwallen
Contributor

CI failure seems unrelated. Probably something we need to address, but a re-run will probably fix it for this PR.

Failed tests:
  StellarStatisticsFunctionsTest.testMergeProviders:215 Percentile mismatch for 60.0th %ile expected:<0.22611711437989881> but was:<0.23631231837333944>

@cestella
Member

It's troubling because that RNG is seeded and the results should be deterministic. I wonder if the t-digest merge has some non-determinism. Anyway, it needs to be fixed, but definitely not here.

Great job on the PR, @mmiklavc. +1 pending CI build. Just close and reopen the PR and everything SHOULD be kosher.

@cestella
Member

FYI: That test nondeterminism should be fixed as of METRON-426 aka #257


4 participants