This repository was archived by the owner on Aug 20, 2025. It is now read-only.

Member

@JonZeolla JonZeolla commented Apr 25, 2017

Contributor Comments

This PR is a follow-on of #545.

The primary change here resolves a thread-safety issue that appears only under load. It has been reported in numerous places, but I've seen it best documented here.

Testing

The following steps can be used to validate the PR.

  1. Create a working directory.
    mkdir metron-858
    cd metron-858
    
  2. Launch a CentOS host.
    vagrant init bento/centos-6.7
    vagrant up
    vagrant ssh
    
  3. Install some dependencies.
    sudo su -
    yum -y install epel-release
    yum -y install "@Development tools" java-1.8.0-openjdk cmake libpcap-devel openssl-devel python-devel
    
  4. Create a new HDP.repo Yum repository; this will allow us to install Kafka.
    cat << EOF > /etc/yum.repos.d/HDP.repo
    [HDP-2.5]
    name=HDP-2.5
    baseurl=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.5.3.0
    path=/
    enabled=1
    gpgcheck=0
    EOF
    
  5. Install and start Kafka.
    yum -y install kafka
    export PATH=$PATH:/usr/hdp/current/kafka-broker/bin
    zookeeper-server start
    kafka start
    
  6. Install Librdkafka 0.9.4.
    wget https://github.com/edenhill/librdkafka/archive/v0.9.4.tar.gz  -O - | tar -xz
    cd librdkafka-0.9.4/
    ./configure --prefix=/usr
    make
    make install
    
  7. Add Librdkafka to our default load path.
    echo "/usr/lib" >> /etc/ld.so.conf.d/bro-plugin.conf
    ldconfig -v
    
  8. Build and install Bro.
    yum -y install cmake libpcap-devel openssl-devel python-devel
    wget https://www.bro.org/downloads/release/bro-2.4.1.tar.gz -O ~/bro-2.4.1.tar.gz
    tar -xzf ~/bro-2.4.1.tar.gz -C ~
    cd ~/bro-2.4.1
    ./configure --prefix=/usr
    make
    make install
    
  9. Fetch the code from this PR.
    git clone https://github.com/apache/metron ~/metron
    cd ~/metron
    git pull origin pull/547/head
    
  10. Install the Bro Plugin.
    cd metron-sensors/bro-plugin-kafka
    ./configure --bro-dist=/root/bro-2.4.1 --install-root=/usr/lib/bro/plugins/ --with-librdkafka=/usr
    make
    make install
    
  11. Modify your /usr/share/bro/site/local.bro:
    cat << EOF >> /usr/share/bro/site/local.bro
    
    @load Bro/Kafka/logs-to-kafka.bro
    redef Kafka::logs_to_send = set(HTTP::LOG, DNS::LOG);
    redef Kafka::topic_name = "bro";
    redef Kafka::tag_json = T;
    redef Kafka::kafka_conf = table( ["metadata.broker.list"] = "localhost:9092" );
    EOF
    
  12. Create a virtual interface called tap0 to listen on.
    yum install -y tunctl
    tunctl -p
    ifconfig tap0 10.0.0.1 up
    ip link set tap0 promisc on
    
  13. Configure Bro to listen on virtual interface.
    sed -i 's/eth0/tap0/g' /usr/etc/node.cfg
    
  14. Create a Kafka topic called bro.
    kafka-topics.sh --zookeeper localhost:2181 --create --topic bro --partitions 1 --replication-factor 1
    
  15. Make sure the Bro changes are installed and start Bro.
    broctl deploy
    
  16. Grab an example pcap file and replay some packet data through tap0. Keep this running in a separate session.
    yum -y install tcpreplay
    wget https://github.com/apache/metron/raw/master/metron-deployment/roles/sensor-test-mode/files/example.pcap
    tcpreplay -i tap0 --loop=0 --stats=5 example.pcap
    
  17. Ensure that data is hitting the bro topic in Kafka.
    # kafka-console-consumer.sh --zookeeper localhost:2181 --topic bro --from-beginning
    OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
    {metadata.broker.list=localhost:9092, request.timeout.ms=30000, client.id=console-consumer-99442, security.protocol=PLAINTEXT}
    {"dns": {"ts":1493145915.795376,"uid":"CNfwFh1xJrsdwezojd","id.orig_h":"192.168.138.158","id.orig_p":60078,"id.resp_h":"192.168.138.2","id.resp_p":53,"proto":"udp","trans_id":18350,"query":"va872g.g90e1h.b8.642b63u.j985a2.v33e.37.pa269cc.e8mfzdgrf7g0.groupprograms.in","qclass":1,"qclass_name":"C_INTERNET","qtype":1,"qtype_name":"A","rcode":0,"rcode_name":"NOERROR","AA":false,"TC":false,"RD":true,"RA":true,"Z":0,"answers":["62.75.195.236"],"TTLs":[29.0],"rejected":false}}
    {"dns": {"ts":1493145916.433874,"uid":"CL3LrkiZoYceFU2Nh","id.orig_h":"192.168.138.158","id.orig_p":65315,"id.resp_h":"192.168.138.2","id.resp_p":53,"proto":"udp","trans_id":27248,"query":"ubb67.3c147o.u806a4.w07d919.o5f.f1.b80w.r0faf9.e8mfzdgrf7g0.groupprograms.in","qclass":1,"qclass_name":"C_INTERNET","qtype":1,"qtype_name":"A","rcode":0,"rcode_name":"NOERROR","AA":false,"TC":false,"RD":true,"RA":true,"Z":0,"answers":["62.75.195.236"],"TTLs":[29.0],"rejected":false}}
    {"dns": {"ts":1493145916.434025,"uid":"CbNL2S3VggZKyweUA6","id.orig_h":"192.168.138.158","id.orig_p":50683,"id.resp_h":"192.168.138.2","id.resp_p":53,"proto":"udp","trans_id":62139,"query":"r03afd2.c3008e.xc07r.b0f.a39.h7f0fa5eu.vb8fbl.e8mfzdgrf7g0.groupprograms.in","qclass":1,"qclass_name":"C_INTERNET","qtype":1,"qtype_name":"A","rcode":0,"rcode_name":"NOERROR","AA":false,"TC":false,"RD":true,"RA":true,"Z":0,"answers":["62.75.195.236"],"TTLs":[29.0],"rejected":false}}
    
  18. Do some load testing to ensure bro doesn't segfault.
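Steps 17 and 18 can be scripted loosely. The sketch below is not part of this PR, and both function names are hypothetical helpers. `count_by_log_type` assumes the tagged JSON layout shown in step 17 (`{"dns": {...}}`), and `any_node_crashed` assumes that `broctl status` prints the word `crashed` in its status column for a node that has died.

```shell
# Tally records per Bro log type from tagged JSON lines read on stdin.
# Lines that do not start with {"<type>": { (JVM warnings, consumer config) are ignored.
count_by_log_type() {
    sed -n 's/^{"\([a-z_]*\)": *{.*/\1/p' | sort | uniq -c | awk '{print $2 "=" $1}'
}

# Succeed (exit 0) if the status text on stdin reports any crashed node.
# Assumption: the status column uses the literal word "crashed".
any_node_crashed() {
    grep -qw crashed
}
```

One way to use these is to save a sample of the consumer output to a file and run `count_by_log_type < sample.txt`, then poll `broctl status | any_node_crashed && echo "Bro crashed under load"` while traffic is replayed. Raising the replay rate, for instance with tcpreplay's `--topspeed` option, is one way to generate load, though how much load is needed to trigger the original issue is not stated here.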

Pull Request Checklist

To streamline the review of the contribution, we ask that you follow these guidelines and double-check the following:

For all changes:

  • Is there a JIRA ticket associated with this PR? If not one needs to be created at Metron Jira.
  • Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
  • Has your PR been rebased against the latest commit within the target branch (typically master)?

For code changes:

  • [N/A] Have you included steps to reproduce the behavior or problem that is being changed or addressed? (See Contributor Comments)

  • Have you included steps or a guide to how the change may be verified and tested manually?

  • Have you ensured that the full suite of tests and checks has been executed in the root incubating-metron folder via:

    mvn -q clean integration-test install && build_utils/verify_licenses.sh 
    
  • [N/A] Have you written or updated unit tests and/or integration tests to verify your changes?

  • [N/A] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?

  • Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

For documentation related changes:

  • Have you ensured that the format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not, run the following commands and then verify the changes via site-book/target/site/index.html:

    cd site-book
    bin/generate-md.sh
    mvn site:site
    

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
It is also recommended that travis-ci is set up for your personal repository such that your branches are built there before submitting a pull request.

Contributor

@nickwallen nickwallen left a comment

Thanks, Jon!

$pred(rec: Conn::Info) = { return ! (( |rec$id$orig_h| == 128 || |rec$id$resp_h| == 128 )); },
$config = table(["stream_id"] = fmt("%s", Conn::LOG))
]);
}
Contributor

With this script, I would expect a total of 6 log filters to be created: the first 3 by Bro/Kafka/logs-to-kafka.bro and the last 3 by your bro_init() function. To avoid this, I think you want to do something more like this...

@load Bro/Kafka/logs-to-kafka.bro
redef Kafka::topic_name = "";
redef Kafka::tag_json = T;

event bro_init() &priority=-5
{
    # handles HTTP
    Log::add_filter(HTTP::LOG, [
        $name = "kafka-http",
        $writer = Log::WRITER_KAFKAWRITER,
        $pred(rec: HTTP::Info) = { return ! (( |rec$id$orig_h| == 128 || |rec$id$resp_h| == 128 )); },
        $config = table(
            ["stream_id"] = fmt("%s", HTTP::LOG),
            ["metadata.broker.list"] = "localhost:9092"
        )
    ]);

    # handles DNS
    Log::add_filter(DNS::LOG, [
        $name = "kafka-dns",
        $writer = Log::WRITER_KAFKAWRITER,
        $pred(rec: DNS::Info) = { return ! (( |rec$id$orig_h| == 128 || |rec$id$resp_h| == 128 )); },
        $config = table(
            ["stream_id"] = fmt("%s", DNS::LOG),
            ["metadata.broker.list"] = "localhost:9092"
        )
    ]);
}
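
In Bro, `|addr|` yields the address size in bits (32 for IPv4, 128 for IPv6), so the predicates above drop any record with an IPv6 endpoint. A rough way to confirm this from the Kafka side is to count records that still carry an IPv6 address. The helper below is a hypothetical sketch, not part of the plugin: it assumes the JSON field layout seen in the consumer output (`"id.orig_h":"..."`) and treats any address value containing a colon as IPv6.

```shell
# Count JSON records on stdin whose id.orig_h or id.resp_h value contains a colon,
# which is taken here as a sign of an IPv6 address.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
count_ipv6_records() {
    grep -Ec '"id\.(orig|resp)_h":"[^"]*:[^"]*"' || true
}
```

With the filters in place, piping a captured sample of the topic through `count_ipv6_records` would be expected to print `0`.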

The goal in this example is to send all HTTP and DNS records to a Kafka topic named `bro`.
* Any configuration value accepted by librdkafka can be added to the `kafka_conf` configuration table.
* By defining `topic_name` all records will be sent to the same Kafka topic.
* By providing a set of logs via `logs_to_send`.
Contributor

This doesn't sound like a complete thought to me. Maybe this?

"Defining logs_to_send will ensure that only HTTP and DNS records are sent."

As documented in [METRON-285](https://issues.apache.org/jira/browse/METRON-285) and [METRON-286](https://issues.apache.org/jira/browse/METRON-286), various components in Metron do not currently support IPv6. Because of this, you may not want to send bro logs that contain IPv6 source or destination IPs into Metron. In this example, we are assuming a somewhat standard bro configuration for sending logs into a Metron cluster, such that:
* Each type of bro log is sent to the `bro` topic, but is tagged with the appropriate log type (such as `http`, `dns`, or `conn`). This is done by setting `topic_name` to `bro`, setting `$path` to an empty string (or leaving it unset), and by setting `tag_json` to true.
* The Kafka writer is set appropriately to send logs to the `bro` Kafka topic being used in your Metron cluster. This requires that your `kafka_conf` and `$config` tables are appropriately configured.

Contributor

The effect of this paragraph is saying, "this is like example 1, but excludes IPv6", right?

Member Author

Yes. Do you think it's too wordy?

Contributor

In my humble opinion, yes, but it is subjective. If you like what you have, we can keep it. I think it's a great example to add.

Member Author

My goal was just to be explicit. I can take another stab at it tomorrow.

Member Author

I took a stab at being more concise. Take a look and let me know. I think I like it more this way than before, so thanks for the critique.

case "$1" in
--with-librdkafka=*)
- append_cache_entry LibRdKafka_ROOT_DIR PATH $optarg
+ append_cache_entry LibRDKafka_ROOT_DIR PATH $optarg
Contributor

Good catch.
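
The one-letter change matters because CMake variable names are case-sensitive, so a cache entry named `LibRdKafka_ROOT_DIR` is a different variable from `LibRDKafka_ROOT_DIR`. The following is a minimal sketch of the option handling under some assumptions: `append_cache_entry` here is a hypothetical stand-in that only prints the definition it would record, `parse_option` is a made-up wrapper, and `optarg` is derived with shell parameter expansion rather than whatever the real configure script does.

```shell
# Hypothetical stand-in for the plugin's append_cache_entry helper:
# it prints the cache definition (name, type, value) instead of storing it.
append_cache_entry() {
    printf -- '-D %s:%s=%s\n' "$1" "$2" "$3"
}

# Handle one command-line option, mirroring the case statement in the diff.
parse_option() {
    case "$1" in
        --with-librdkafka=*)
            optarg="${1#*=}"   # assumption: keep everything after the first "="
            append_cache_entry LibRDKafka_ROOT_DIR PATH "$optarg"
            ;;
        *)
            echo "unknown option: $1" >&2
            return 1
            ;;
    esac
}
```

Calling `parse_option --with-librdkafka=/usr` prints `-D LibRDKafka_ROOT_DIR:PATH=/usr`, which is the spelling the corrected line uses.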


```
@load Bro/Kafka/logs-to-kafka.bro
redef Kafka::topic_name = "bro";
Contributor

We have to set topic_name to empty string otherwise logs-to-kafka.bro will create its own filters.

redef Kafka::topic_name = "";

Member Author

Not sure I follow. If Kafka::logs_to_send is empty why would logs-to-kafka.bro make its own filters?

Contributor

Yes, exactly. It needs to be an empty string. I am seeing Example 3 in your README setting it to 'bro'. It needs to be set to empty string.

Member Author

@JonZeolla JonZeolla Apr 27, 2017

I'm obviously missing something. logs_to_send is not topic_name? In my example logs_to_send is not set. I also note below that "logs_to_send is mutually exclusive with $pred, thus you must individually add a filter per log that you would like to send into Metron if you want to configure a predicate."

Contributor

Sorry, Jon. You are 100% right. I am the one confused. I had misunderstood what you are showing with the example. Ignore me. This looks good.

@nickwallen
Contributor

@JonZeolla Let me know when you have your "Outstanding Items" complete. Once you're happy, I'll run it through some testing. It is looking real good and seems ready to go.

@JonZeolla
Member Author

Will do, thanks.

@JonZeolla JonZeolla closed this May 13, 2017
@JonZeolla JonZeolla reopened this May 13, 2017
@JonZeolla
Member Author

JonZeolla commented May 20, 2017

@nickwallen This should be ready to test now. Sorry about the delay.

@ottobackwards
Contributor

OK, I tried this and was flummoxed by nonexistent directories or files:

  • cat << EOF >> /usr/share/bro/site/local.bro ( bro/site doesn't exist )

  • sed -i 's/eth0/tap0/g' /usr/etc/node.cfg ( node.cfg does not exist )

Are the instructions up to date?
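
A quick preflight check can show which expected files are missing before steps 11 and 13 are run. This is only a sketch; `missing_paths` is a hypothetical helper and not something shipped with Bro or Metron.

```shell
# Print each argument that does not exist on disk, one per line.
missing_paths() {
    for p in "$@"; do
        [ -e "$p" ] || printf '%s\n' "$p"
    done
}
```

For the two paths mentioned above, `missing_paths /usr/share/bro/site/local.bro /usr/etc/node.cfg` prints nothing when both are present. Any output suggests an earlier step, such as the Bro build with `--prefix=/usr`, did not complete as expected.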

@JonZeolla
Member Author

I just ran through the instructions from scratch and they work for me. Can you give it another shot from the beginning?

@ottobackwards
Contributor

+1 - although I can't do the stress testing part. Steps followed correctly work perfectly.

@nickwallen
Contributor

+1 Works like a charm. Tested basic functionality on a multi-node cluster against 1 Gbps of canned traffic. Thanks for the contribution, @JonZeolla!

@asfgit asfgit closed this in 85872bd Jun 1, 2017
JonZeolla added a commit to JonZeolla/jzeolla-metron-bro-plugin-kafka that referenced this pull request Sep 18, 2017