
Conversation

@nickwallen (Contributor) commented Jul 5, 2019

This change updates the Profiler to function with HBase 2.0.2.

  • This PR is for the feature/METRON-2088-support-HDP-3.1 feature branch.

This PR is dependent on other PRs; the diff will show those changes here until they are merged.

Changes

  • Added a new method to the Stellar function resolver, FunctionResolver.withInstance. This makes it possible to instrument and set up a Stellar function for testing while still relying on the existing function resolution system (see the sketch after this list).

  • Altered HBaseProfilerClient to use the HBaseClient abstraction.

  • Created the ProfilerClientFactory and HBaseProfilerClientFactory, which contain the common code for creating a ProfilerClient. This code was being duplicated across several of the Profiler-related Stellar functions; centralizing it also makes it simpler to test.

  • Updated the PROFILE_GET and PROFILE_VERBOSE functions to use the ProfilerClientFactory.
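
To illustrate the new testing pattern, here is a rough sketch of how FunctionResolver.withInstance might be used. The wiring below is an assumption for illustration only; the actual method signature and test code in this PR may differ.

    // Hypothetical sketch of the withInstance testing pattern; not the PR's actual test code.
    import org.apache.metron.profiler.client.stellar.FixedLookback;
    import org.apache.metron.profiler.client.stellar.GetProfile;
    import org.apache.metron.stellar.common.StellarProcessor;
    import org.apache.metron.stellar.dsl.Context;
    import org.apache.metron.stellar.dsl.DefaultVariableResolver;
    import org.apache.metron.stellar.dsl.functions.resolver.SimpleFunctionResolver;

    public class ProfileGetResolutionSketch {
      public static void main(String[] args) {
        // Build the function under test by hand so that collaborators, like a
        // ProfilerClient, can be mocked, then register the pre-built instance
        // so the normal function resolution machinery still applies.
        GetProfile profileGet = new GetProfile();
        SimpleFunctionResolver resolver = new SimpleFunctionResolver()
            .withInstance(profileGet)           // assumed: registers a pre-built instance
            .withClass(FixedLookback.class);    // PROFILE_FIXED resolves as usual

        // Execute a Stellar expression against the instrumented resolver.
        Object result = new StellarProcessor().parse(
            "PROFILE_GET('hello-world', '10.0.0.1', PROFILE_FIXED(30, 'DAYS'))",
            new DefaultVariableResolver(v -> null, v -> false),
            resolver,
            Context.EMPTY_CONTEXT());
        System.out.println(result);
      }
    }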

Acceptance Testing

Note: This needs to be tested in the centos6 environment built against HDP 2.6. Some functionality, like the Batch Profiler, will not work fully until other dependencies, like HDFS, are upgraded to the HDP 3.1 versions.

  1. Ensure that we can continue to parse, enrich, and index telemetry. Launch the development environment and ensure that telemetry is visible within the Alerts UI.

Profiler in the REPL

  1. Test a profile in the REPL according to these instructions.

    [Stellar]>>> values := PROFILER_FLUSH(profiler)
    [{period={duration=900000, period=1723089, start=1550780100000, end=1550781000000}, profile=hello-world, groups=[], value=4, entity=192.168.138.158}]
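
    For reference, the steps leading up to the flush follow the standard Profiler REPL workflow and look roughly like this (paste the profile definition into the first SHELL_EDIT and a sample telemetry message into the second; exact inputs will vary):

    [Stellar]>>> conf := SHELL_EDIT()
    [Stellar]>>> profiler := PROFILER_INIT(conf)
    [Stellar]>>> msg := SHELL_EDIT()
    [Stellar]>>> PROFILER_APPLY(msg, profiler)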
    

Streaming Profiler

  1. Deploy that profile to the Streaming Profiler in Storm.

    [Stellar]>>> CONFIG_PUT("PROFILER", conf)
    
  2. Wait for the Streaming Profiler in Storm to flush and retrieve the measurement from HBase.

    For the impatient, you can reset the period duration to 1 minute. Alternatively, you can let the Profiler topology run for a minute or two and then kill it, which forces it to flush a profile measurement to HBase (see the settings sketch after these steps).

    Retrieve the measurement from HBase. Prior to this PR, it was not possible to query HBase from the REPL.

    [Stellar]>>> PROFILE_GET("hello-world","192.168.138.158",PROFILE_FIXED(30,"DAYS"))
    [2979]
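
    As noted above, if you shorten the period duration, the topology and the client must agree on it. A sketch of the relevant settings with illustrative values (property names per the Profiler documentation; confirm against your install):

    # profiler.properties, read by the Storm topology
    profiler.period.duration=1
    profiler.period.duration.units=MINUTES

    # global configuration, read by PROFILE_GET on the client side
    "profiler.client.period.duration": "1",
    "profiler.client.period.duration.units": "MINUTES"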
    

Batch Profiler

  1. Install Spark using Ambari.

    1. Stop Storm, YARN, Elasticsearch, Kibana, and Kafka.

    2. Install Spark2 using Ambari.

    3. Ensure that Spark can talk with HBase.

      cp /etc/hbase/conf/hbase-site.xml /etc/spark2/conf/
      
  2. Use the Batch Profiler to back-fill your profile. To do this, follow the directions provided here (a sketch of the launch command follows these steps).

  3. Retrieve the entire profile, including the back-filled data.

    [Stellar]>>> PROFILE_GET("hello-world","192.168.138.158",PROFILE_FIXED(30,"DAYS"))
    [1203, 2849, 2900, 1944, 1054, 1241, 1721]
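
    As referenced in step 2, launching the back-fill is typically a matter of running the Batch Profiler start script, which submits the Spark job using the batch profiler properties (a sketch; paths and version numbers depend on your install):

    # assumes METRON_HOME points at the Metron install, e.g. /usr/metron/0.7.2
    ${METRON_HOME}/bin/start_batch_profiler.sh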
    

Pull Request Checklist

  • Is there a JIRA ticket associated with this PR? If not, one needs to be created at Metron Jira.
  • Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
  • Has your PR been rebased against the latest commit within the target branch (typically master)?
  • Have you included steps to reproduce the behavior or problem that is being changed or addressed?
  • Have you included steps or a guide to how the change may be verified and tested manually?
  • Have you ensured that the full suite of tests and checks has been executed in the root metron folder?
  • Have you written or updated unit tests and/or integration tests to verify your changes?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

@nickwallen nickwallen changed the base branch from feature/METRON-2088-support-hdp-3.1 to master July 19, 2019 17:49
@nickwallen nickwallen changed the base branch from master to feature/METRON-2088-support-hdp-3.1 July 19, 2019 17:49
@nickwallen (Contributor, Author) left a comment

Some comments to help guide reviewers.

@nickwallen (Contributor, Author) commented Jul 29, 2019

Any other feedback on this @mmiklavc?

@merrimanr (Contributor) commented:

I ran through the testing instructions and hit some problems with the Batch Profiler. The first error I ran into was this:

19/07/29 21:47:31 INFO BatchProfilerCLI: Loading profiles from '/usr/metron/0.7.2/config/zookeeper/profiler.json'
Exception in thread "main" com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot construct instance of `org.apache.metron.common.configuration.profiler.ProfileResult` (although at least one Creator exists): no String-argument constructor/factory method to deserialize from String value ('count')
 at [Source: (String)"{
  "profiles": [
    {
      "profile": "hello-world",
      "foreach": "'global'",
      "init":    { "count": "0" },
      "update":  { "count": "count + 1" },
      "result":  "count"
    }
  ],
  "timestampField": "timestamp"
}
"; line: 8, column: 18] (through reference chain: org.apache.metron.common.configuration.profiler.ProfilerConfig["profiles"]->java.util.ArrayList[0]->org.apache.metron.common.configuration.profiler.ProfileConfig["result"])

The same profiler.json deserializes without issue in the integration tests. I suspect this is caused by a Jackson version problem. When I update profiler.json to:

{
  "profiles": [
    {
      "profile": "hello-world",
      "foreach": "'global'",
      "init":    { "count": "0" },
      "update":  { "count": "count + 1" },
      "result":  { "profile": "count" }
     }
  ],
  "timestampField": "timestamp"
}

I am able to get past that error, but then I run into this one:

19/07/29 21:50:42 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
Exception in thread "main" java.lang.IllegalAccessError: class org.apache.hadoop.hdfs.web.HftpFileSystem cannot access its superinterface org.apache.hadoop.hdfs.web.TokenAspect$TokenManagementDelegator
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
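
For context on the first error: the two accepted shapes for "result" suggest that ProfileResult is meant to expose both a delegating (single-String) creator and a property-based creator, and that the Jackson version on the classpath failed to pick up the delegating one. A minimal sketch of that pattern (a hypothetical stand-in, not Metron's actual class):

import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;

// Hypothetical stand-in for ProfileResult; the real Metron class differs.
public class Result {
  private final String profileExpression;

  // Delegating creator handles: "result": "count"
  @JsonCreator(mode = JsonCreator.Mode.DELEGATING)
  public Result(String expression) {
    this.profileExpression = expression;
  }

  // Property-based creator handles: "result": { "profile": "count" }
  @JsonCreator(mode = JsonCreator.Mode.PROPERTIES)
  public Result(@JsonProperty("profile") String profile) {
    this.profileExpression = profile;
  }

  public String getProfileExpression() {
    return profileExpression;
  }
}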

@mmiklavc (Contributor) left a comment

This upgrade has been a pretty big lift, thanks @nickwallen.

…ion resolution cannot use Builder and tests should work as similarly to production as possible
@nickwallen (Contributor, Author) commented:

The issue that @merrimanr ran into occurred because he was testing with the centos7 environment built against HDP 3.1.

I just added a note to the description about this: this PR will only work when run against HDP 2.6, since the Hadoop version has not yet been updated. Until that happens, the Batch Profiler is not able to read from HDFS.

After the recent round of review edits, I ran through the test instructions again in centos6 + HDP 2.6 and everything worked as planned.

@merrimanr (Contributor) commented:

Your explanation makes sense to me @nickwallen. +1

@mmiklavc (Contributor) commented Aug 8, 2019

I think that's everything - please feel free to check me on that, as GitHub doesn't seem to roll up the requests like a checklist. Thanks @nickwallen! +1 by inspection pending Travis.

@nickwallen (Contributor, Author) commented:

Thanks for the reviews. This has been merged into the feature branch. See 1e1afc3.
