
Conversation

@nickwallen
Contributor

@nickwallen nickwallen commented Feb 21, 2019

When running any function that attempts to access HBase from the REPL, an IllegalAccessError is thrown. This can be replicated with Stellar functions like ENRICHMENT_GET and PROFILE_GET that read from HBase.

To replicate, start the Stellar REPL with HBase and Zookeeper running and accessible.

[root@node1 ~]# source /etc/default/metron
[root@node1 ~]# cd $METRON_HOME
[root@node1 0.7.1]# bin/stellar -z $ZOOKEEPER
[Stellar]>>> ENRICHMENT_GET("example","192.168.1.1","example","E")
2019-01-30 08:51:31 ERROR SimpleHBaseEnrichmentFunctions:251 - Unable to call exists: java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.hbase.zookeeper.MetaTableLocator
org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.hbase.zookeeper.MetaTableLocator
 at org.apache.hadoop.hbase.client.RpcRetryingCaller.translateException(RpcRetryingCaller.java:229)
 at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:140)
 at org.apache.hadoop.hbase.client.HTable.get(HTable.java:879)
 at org.apache.hadoop.hbase.client.HTable.get(HTable.java:845)
 at org.apache.metron.enrichment.lookup.EnrichmentLookup$Handler.get(EnrichmentLookup.java:70)
 at org.apache.metron.enrichment.lookup.EnrichmentLookup$Handler.get(EnrichmentLookup.java:52)
 at org.apache.metron.enrichment.lookup.Lookup.get(Lookup.java:68)
 at org.apache.metron.enrichment.stellar.SimpleHBaseEnrichmentFunctions$EnrichmentGet.apply(SimpleHBaseEnrichmentFunctions.java:245)
 at org.apache.metron.stellar.common.StellarCompiler.lambda$exitTransformationFunc$13(StellarCompiler.java:664)
 at org.apache.metron.stellar.common.StellarCompiler$Expression.apply(StellarCompiler.java:259)
 at org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:151)
 at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.executeStellar(DefaultStellarShellExecutor.java:407)
 at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:257)
 at org.apache.metron.stellar.common.shell.cli.StellarShell.execute(StellarShell.java:359)
 at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.hbase.zookeeper.MetaTableLocator
 at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:596)
 at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:580)
 at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:559)
 at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:61)
 at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateMeta(ConnectionManager.java:1185)
 at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1152)
 at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.relocateRegion(ConnectionManager.java:1126)
 at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1331)
 at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1155)
 at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1139)
 at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1096)
 at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getRegionLocation(ConnectionManager.java:931)
 at org.apache.hadoop.hbase.client.HRegionLocator.getRegionLocation(HRegionLocator.java:83)
 at org.apache.hadoop.hbase.client.RegionServerCallable.prepare(RegionServerCallable.java:79)
 at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:124)
 ... 16 more
{}
[Stellar]>>> PROFILE_GET("hello-world","192.168.1.1",PROFILE_FIXED(30, "DAYS"))
[!] Unable to parse: PROFILE_GET("hello-world","192.168.1.1",PROFILE_FIXED(30, "DAYS")) due to: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.hbase.zookeeper.MetaTableLocator
org.apache.metron.stellar.dsl.ParseException: Unable to parse: PROFILE_GET("hello-world","192.168.1.1",PROFILE_FIXED(30, "DAYS")) due to: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.hbase.zookeeper.MetaTableLocator
 at org.apache.metron.stellar.common.BaseStellarProcessor.createException(BaseStellarProcessor.java:166)
 at org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:154)
 at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.executeStellar(DefaultStellarShellExecutor.java:407)
 at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:257)
 at org.apache.metron.stellar.common.shell.cli.StellarShell.execute(StellarShell.java:359)
 at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.hbase.zookeeper.MetaTableLocator
 at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:596)
 at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:580)
 at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:559)
 at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:61)
 at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateMeta(ConnectionManager.java:1185)
 at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1152)
 at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.relocateRegion(ConnectionManager.java:1126)
 at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1331)
 at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1155)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.findAllLocationsOrFail(AsyncProcess.java:940)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:857)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$100(AsyncProcess.java:575)
 at org.apache.hadoop.hbase.client.AsyncProcess.submitAll(AsyncProcess.java:557)
 at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:923)
 at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:940)
 at org.apache.hadoop.hbase.client.HTable.get(HTable.java:901)
 at org.apache.metron.profiler.client.HBaseProfilerClient.doFetch(HBaseProfilerClient.java:138)
 at org.apache.metron.profiler.client.HBaseProfilerClient.fetch(HBaseProfilerClient.java:120)
 at org.apache.metron.profiler.client.stellar.GetProfile.apply(GetProfile.java:182)
 at org.apache.metron.stellar.common.StellarCompiler.lambda$exitTransformationFunc$13(StellarCompiler.java:664)
 at org.apache.metron.stellar.common.StellarCompiler$Expression.apply(StellarCompiler.java:259)
 at org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:151)
 ... 7 more

Changes

The Stellar REPL was being launched in a manner that pulled in multiple conflicting versions of Guava. Guava 17 was used by metron-management and metron-parsers-storm, but metron-profiler-client was unexpectedly pulling in Guava 12 through the metron-hbase project.

  • Altered the Profiler Client so that it does not depend on Guava.
  • Excluded Guava from the Profiler Client's dependencies (see the sketch after this list).
  • Removed unnecessary Hadoop dependencies from the Profiler family of projects.
  • Ensured that Netty 4.1.13 is explicitly pulled in for the Elasticsearch integration tests.
  • Guava relocations now use a standard prefix that includes the version number: org.apache.metron.guava.${guava_version}
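To make the exclusion item above concrete, here is a minimal sketch of what keeping metron-hbase's transitive Guava off the Profiler Client classpath looks like in Maven. The coordinates and the version property are illustrative, not copied from the actual pom.

    <!-- Sketch: exclude the Guava 12 that metron-hbase would otherwise drag in.
         The version property used here is an assumption for illustration. -->
    <dependency>
        <groupId>org.apache.metron</groupId>
        <artifactId>metron-hbase</artifactId>
        <version>${project.parent.version}</version>
        <exclusions>
            <exclusion>
                <groupId>com.google.guava</groupId>
                <artifactId>guava</artifactId>
            </exclusion>
        </exclusions>
    </dependency>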

What's Lacking?

  • We do not have any automated tests that would have caught this issue. To do that, we would need tests that exercise the REPL after it has been launched using the $METRON_HOME/bin/stellar script along with the deployed, shaded JARs.

Testing

End to End

  1. Ensure that we can continue to parse, enrich, and index telemetry. Launch the development environment and ensure that telemetry is visible within the Alerts UI.

Streaming Enrichment

  1. Create a Streaming Enrichment by following these instructions.

  2. Define the streaming enrichment and save it as a new source of telemetry.

    [Stellar]>>> conf := SHELL_EDIT()
    {
      "parserClassName": "org.apache.metron.parsers.csv.CSVParser",
      "writerClassName": "org.apache.metron.enrichment.writer.SimpleHbaseEnrichmentWriter",
      "sensorTopic": "user",
      "parserConfig": {
        "shew.table": "enrichment",
        "shew.cf": "t",
        "shew.keyColumns": "ip",
        "shew.enrichmentType": "user",
        "columns": {
          "user": 0,
          "ip": 1
        }
      }
    }
    [Stellar]>>>
    [Stellar]>>> CONFIG_PUT("PARSER", conf, "user")
    
  3. Go to the Management UI and start the new parser called 'user'.

  4. Create some test telemetry.

    [Stellar]>>> msgs := ["user1,192.168.1.1", "user2,192.168.1.2", "user3,192.168.1.3"]
    [user1,192.168.1.1, user2,192.168.1.2, user3,192.168.1.3]
    [Stellar]>>> KAFKA_PUT("user", msgs)
    3
    [Stellar]>>> KAFKA_PUT("user", msgs)
    3
    [Stellar]>>> KAFKA_PUT("user", msgs)
    3
    
  5. Ensure that the enrichments are persisted in HBase.

    [Stellar]>>> ENRICHMENT_GET('user', '192.168.1.1', 'enrichment', 't')
    {original_string=user1,192.168.1.1, guid=a6caf3c1-2506-4eb7-b33e-7c05b77cd72c, user=user1, timestamp=1551813589399, source.type=user}
    
    [Stellar]>>> ENRICHMENT_GET('user', '192.168.1.2', 'enrichment', 't')
    {original_string=user2,192.168.1.2, guid=49e4b8fa-c797-44f0-b041-cfb47983d54a, user=user2, timestamp=1551813589399, source.type=user}
    
    [Stellar]>>> ENRICHMENT_GET('user', '192.168.1.3', 'enrichment', 't')
    {original_string=user3,192.168.1.3, guid=324149fd-6c4c-42a3-b579-e218c032ea7f, user=user3, timestamp=1551813589402, source.type=user}
    

Profiler

  1. Test a profile in the REPL according to these instructions.

    [Stellar]>>> values := PROFILER_FLUSH(profiler)
    [{period={duration=900000, period=1723089, start=1550780100000, end=1550781000000}, profile=hello-world, groups=[], value=4, entity=192.168.138.158}]
    
  2. Deploy that profile to the Streaming Profiler.

    [Stellar]>>> CONFIG_PUT("PROFILER", conf)
    
  3. Wait for the Streaming Profiler in Storm to flush and retrieve the measurement from HBase.

    For the impatient, you can reset the period duration to 1 minute. Alternatively, you can let the Profiler topology run for a minute or two and then kill it, which forces it to flush a profile measurement to HBase.

    Retrieve the measurement from HBase. Prior to this PR, it was not possible to query HBase from the REPL.

    [Stellar]>>> PROFILE_GET("hello-world","192.168.138.158",PROFILE_FIXED(30,"DAYS"))
    [2979]
    
  4. Install Spark using Ambari.

    1. Stop Storm, YARN, Elasticsearch, Kibana, and Kafka.

    2. Install Spark2 using Ambari.

    3. Ensure that Spark can talk with HBase.

      cp /etc/hbase/conf/hbase-site.xml /etc/spark2/conf/
      
  5. Use the Batch Profiler to back-fill your profile. To do this, follow the directions provided here.

  6. Retrieve the entire profile, including the back-filled data.

    [Stellar]>>> PROFILE_GET("hello-world","192.168.138.158",PROFILE_FIXED(30,"DAYS"))
    [1203, 2849, 2900, 1944, 1054, 1241, 1721]
    

Pull Request Checklist

  • Is there a JIRA ticket associated with this PR? If not, one needs to be created at Metron Jira.
  • Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
  • Has your PR been rebased against the latest commit within the target branch (typically master)?
  • Have you included steps to reproduce the behavior or problem that is being changed or addressed?
  • Have you included steps or a guide to how the change may be verified and tested manually?
  • Have you ensured that the full suite of tests and checks has been executed in the root metron folder?
  • Have you written or updated unit tests and/or integration tests to verify your changes?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

@JonZeolla
Member

When did this start being an issue, or is it only an issue in full-dev? I am running e4d793a on a physical cluster and I have been using ENRICHMENT_GET in the REPL extensively without issue.

Just looking to see if this impacted any releases.

@nickwallen
Contributor Author

I am not sure when it happened.

@nickwallen
Contributor Author

Looks like there is another odd dependency impact on metron-elasticsearch now. Daft!

…Elasticsearch's expectation. Otherwise, chaos ensues
@nickwallen
Contributor Author

Multiple conflicting versions of Netty were being pulled in by the integration tests. I had to alter the pom to ensure that Netty 4.1.13 is explicitly being pulled in for the Elasticsearch integration tests.

<dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>flux-core</artifactId>
    <version>${global_flux_version}</version>
Contributor Author


Most Storm-based projects get this dependency transitively through metron-common. Rather than relying on that, let's be explicit that we need it.

In the future, we need to get Storm out of metron-common anyway.

    <artifactId>transport-netty4-client</artifactId>
    <version>${global_elasticsearch_version}</version>
    <scope>test</scope>
</dependency>
Contributor Author


While the compile-time dependencies for Elasticsearch are walled off in the elasticsearch-shaded project, the same is not done for the test dependencies, including the use of ElasticsearchComponent. This is why we have this specific dependency on Elasticsearch here.

Other changes in this PR had an unexpected ripple effect that caused multiple, incompatible versions of Netty to get pulled in. I had to explicitly exclude Netty from some dependencies and then declare an explicit dependency on the version of Netty we need.

Order also matters here. If this is placed too low in the dependency list, the integration tests will fail.
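A minimal sketch of that exclude-then-pin pattern follows. Which dependency actually carries the conflicting Netty, and whether the pom pins netty-all or the individual io.netty artifacts, are assumptions here; only the 4.1.13 version comes from this PR.

    <!-- Sketch: drop the transitive Netty from a dependency that conflicts with
         Elasticsearch (the hadoop-common coordinates are illustrative only)... -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${global_hadoop_version}</version>
        <exclusions>
            <exclusion>
                <groupId>io.netty</groupId>
                <artifactId>netty-all</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <!-- ...then pin the version the Elasticsearch integration tests expect.
         Per the note above, keep this high enough in the dependency list. -->
    <dependency>
        <groupId>io.netty</groupId>
        <artifactId>netty-all</artifactId>
        <version>4.1.13.Final</version>
        <scope>test</scope>
    </dependency>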

@nickwallen
Contributor Author

Hey @JonZeolla - Thanks for the reference to the version you are running without seeing the problem. That has helped.

The bug might have been introduced here during the parser reorganization. If that is the case, the issue is not in any Apache release yet.

@nickwallen
Contributor Author

nickwallen commented Mar 5, 2019

I noticed another problem after reverting what I thought was an unnecessary change: an unrelated integration test started failing.

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.metron.dataloads.nonbulk.flatfile.SimpleEnrichmentFlatFileLoaderIntegrationTest
Formatting using clusterid: testClusterID
2019-03-04 22:05:58 FATAL HMaster:1650 - Failed to become active master
java.lang.IllegalAccessError: tried to access method org.apache.metron.guava.base.Stopwatch.<init>()V from class org.apache.hadoop.hbase.zookeeper.MetaTableLocator
	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:596)
	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.waitMetaRegionLocation(MetaTableLocator.java:217)
	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaServerConnection(MetaTableLocator.java:363)
	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.verifyMetaRegionLocation(MetaTableLocator.java:283)
	at org.apache.hadoop.hbase.master.HMaster.assignMeta(HMaster.java:906)
	at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:742)
	at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:182)
	at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1646)
	at java.lang.Thread.run(Thread.java:748)
2019-03-04 22:05:58 FATAL HMaster:2095 - Master server abort: loaded coprocessors are: []
2019-03-04 22:05:58 FATAL HMaster:2098 - Unhandled exception. Starting shutdown.
java.lang.IllegalAccessError: tried to access method org.apache.metron.guava.base.Stopwatch.<init>()V from class org.apache.hadoop.hbase.zookeeper.MetaTableLocator
	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:596)
	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.waitMetaRegionLocation(MetaTableLocator.java:217)
	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaServerConnection(MetaTableLocator.java:363)
	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.verifyMetaRegionLocation(MetaTableLocator.java:283)
	at org.apache.hadoop.hbase.master.HMaster.assignMeta(HMaster.java:906)
	at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:742)
	at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:182)
	at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1646)
	at java.lang.Thread.run(Thread.java:748)
Process Thread Dump: Thread dump because: Master not initialized after 200000ms seconds
...

The change I thought was unnecessary relocated Guava to a unique path for metron-profiler-client. Once I reverted it, things broke.

I think what is happening is that in many places we relocate Guava to the same path, usually something like org.apache.metron.guava. Since different projects pull in different versions of Guava, you never know which version ends up relocated there.

<relocation>
    <pattern>com.google.common</pattern>
-   <shadedPattern>org.apache.metron.guava</shadedPattern>
+   <shadedPattern>org.apache.metron.guava.${guava_version}</shadedPattern>
Contributor Author

@nickwallen nickwallen Mar 5, 2019


In all the places where Guava is relocated, I used this standard prefix, which includes the version number: org.apache.metron.guava.${guava_version}

This ensures that multiple projects pulling in different versions of Guava do not overwrite one another. This also ensures that multiple copies of a specific Guava version will not be included in the jar.

I followed this pattern throughout. The specific Guava version is declared as a property at the top of each pom. I then ensure we have an explicit dependency on Guava and make sure that Guava is relocated to a path that includes the version number. Using a property for the Guava version makes sure the relocation and the dependency line up correctly.
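Roughly, that pattern looks like the following in each affected pom. This is a sketch only: the 17.0 value matches the jar listing below for metron-common, other modules declare their own version, and the surrounding shade-plugin configuration is omitted.

    <!-- Declare the Guava version once as a property... -->
    <properties>
        <guava_version>17.0</guava_version>
    </properties>

    <!-- ...take an explicit dependency on exactly that version... -->
    <dependency>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
        <version>${guava_version}</version>
    </dependency>

    <!-- ...and relocate it to a version-qualified package in the shade plugin, so
         projects shading different Guava versions cannot overwrite one another. -->
    <relocation>
        <pattern>com.google.common</pattern>
        <shadedPattern>org.apache.metron.guava.${guava_version}</shadedPattern>
    </relocation>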

The resulting JARs will look like the following.

$ jar -tvf metron-platform/metron-common/target/metron-common-0.7.1.jar | grep Stopwatch
  1088 Tue Mar 05 13:11:28 EST 2019 org/apache/metron/guava/17/0/base/Stopwatch$1.class
  4002 Tue Mar 05 13:11:28 EST 2019 org/apache/metron/guava/17/0/base/Stopwatch.class

@nickwallen
Contributor Author

Unrelated web test failure.

@nickwallen nickwallen closed this Mar 5, 2019
@nickwallen nickwallen reopened this Mar 5, 2019
@nickwallen nickwallen requested a review from mmiklavc March 5, 2019 21:48
@anandsubbu
Contributor

+1, thanks for the contribution @nickwallen !

Ran up full-dev and validated the following:
a) Able to create a streaming enrichment config with the steps outlined, and I am able to successfully run ENRICHMENT_GET in Stellar on a given IP address, which fetches info from the HBase enrichment table.
b) Created a sample profile with the steps outlined, and I am able to perform a PROFILE_GET on a given IP address, which fetches info from the HBase profiler table.
c) Also verified the end-to-end flow of events to the Alerts UI.
