
A couple of hdfs related fixes#1454

Merged
drcrallen merged 1 commit into apache:master from optimizely:hao/hadoop-fixes
Jun 23, 2015

Conversation

@jasonxh
Contributor

jasonxh commented Jun 20, 2015

  • Class loading issue with hdfs-storage extension
    Even though the extension declares hadoop-hdfs as a dependency, it currently can't load the relevant classes at runtime because Hadoop uses the thread context class loader by default.
  • Exception when using hdfs with non-fully qualified segment path
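To illustrate the class-loading point (a standalone sketch, not the actual patch): Hadoop's `Configuration` resolves implementation classes such as `DistributedFileSystem` through the thread context class loader, so code whose Hadoop classes live in a separate extension loader has to install that loader before calling into Hadoop. A minimal helper, with `withContextClassLoader` being a hypothetical name:

```java
public class ContextLoaderSketch {
    // Runs `action` with `loader` installed as the thread context class
    // loader, restoring the previous loader afterwards. Hadoop's
    // Configuration falls back to the context class loader when resolving
    // fs implementations, so installing the extension's loader here lets
    // Hadoop find classes bundled with the extension.
    static void withContextClassLoader(ClassLoader loader, Runnable action) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            action.run();
        } finally {
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // Stand-in for an extension class loader in this sketch.
        ClassLoader fake = ContextLoaderSketch.class.getClassLoader();
        withContextClassLoader(fake, () ->
            System.out.println(Thread.currentThread().getContextClassLoader() == fake));
    }
}
```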

@fjy can you take a look?

* Class loading issue with hdfs-storage extension
* Exception when using hdfs with non-fully qualified segment path
@drcrallen
Contributor

Hi @jasonxh thanks for the patch!

Do you happen to have more information on when the HdfsStorageDruidModule class-loading change is required? Some example failures would be awesome.

@jasonxh
Contributor Author

jasonxh commented Jun 21, 2015

This is the exception from path merging that happened during a Hadoop indexing task:

Error: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: hdfs://hadoop:9000druid/segments/p13n_results
    at org.apache.hadoop.fs.Path.initialize(Path.java:206)
    at org.apache.hadoop.fs.Path.<init>(Path.java:116)
    at org.apache.hadoop.fs.Path.<init>(Path.java:89)
    at io.druid.indexer.JobHelper.prependFSIfNullScheme(JobHelper.java:521)
    at io.druid.indexer.JobHelper.makeSegmentOutputPath(JobHelper.java:420)
    at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:445)
    at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:272)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: hdfs://hadoop:9000druid/segments/p13n_results
    at java.net.URI.checkPath(URI.java:1804)
    at java.net.URI.<init>(URI.java:752)
    at org.apache.hadoop.fs.Path.initialize(Path.java:203)
    ... 14 more
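The missing path separator is the culprit: the base URI `hdfs://hadoop:9000` has an empty path, so joining a relative segment path onto it produces a URI whose path component doesn't start with `/`, which `java.net.URI` rejects. A standalone sketch (plain JDK, no Druid or Hadoop code) reproduces the same `URISyntaxException` and shows that prepending the missing slash fixes it:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class RelativePathDemo {
    public static void main(String[] args) throws Exception {
        // A non-empty path that does not start with "/" is illegal
        // when a scheme/authority is present:
        try {
            new URI("hdfs", "hadoop:9000", "druid/segments/p13n_results", null, null);
            System.out.println("no exception");
        } catch (URISyntaxException e) {
            System.out.println("URISyntaxException: " + e.getReason());
        }
        // With the leading "/" the URI is a valid absolute URI:
        URI ok = new URI("hdfs", "hadoop:9000", "/druid/segments/p13n_results", null, null);
        System.out.println(ok);
    }
}
```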

@jasonxh
Contributor Author

jasonxh commented Jun 21, 2015

This is the exception where the historical node failed to load a segment from HDFS. I used to work around it by adding the Hadoop libs to the classpath, but wondered why that should be necessary, given that the hdfs-storage extension supposedly includes the dependency already.

io.druid.segment.loading.SegmentLoadingException: Exception loading segment[p13n_results_2015-06-16T00:00:00.000Z_2015-06-17T00:00:00.000Z_2015-06-21T07:43:09.266Z]
        at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:146) ~[druid-services-0.8.0-rc2-SNAPSHOT-selfcontained.jar:0.8.0-rc2-SNAPSHOT]
        at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:171) [druid-services-0.8.0-rc2-SNAPSHOT-selfcontained.jar:0.8.0-rc2-SNAPSHOT]
        ...
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hdfs.DistributedFileSystem not found
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1882) ~[?:?]
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2298) ~[?:?]
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2311) ~[?:?]
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90) ~[?:?]
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2350) ~[?:?]
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2332) ~[?:?]
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369) ~[?:?]
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:360) ~[?:?]
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) ~[?:?]
        at io.druid.storage.hdfs.HdfsDataSegmentPuller.getSegmentFiles(HdfsDataSegmentPuller.java:177) ~[?:?]
        at io.druid.storage.hdfs.HdfsLoadSpec.loadSegment(HdfsLoadSpec.java:59) ~[?:?]
        at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:141) ~[druid-services-0.8.0-rc2-SNAPSHOT-selfcontained.jar:0.8.0-rc2-SNAPSHOT]
        at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:93) ~[druid-services-0.8.0-rc2-SNAPSHOT-selfcontained.jar:0.8.0-rc2-SNAPSHOT]
        at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:151) ~[druid-services-0.8.0-rc2-SNAPSHOT-selfcontained.jar:0.8.0-rc2-SNAPSHOT]
        at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:142) ~[druid-services-0.8.0-rc2-SNAPSHOT-selfcontained.jar:0.8.0-rc2-SNAPSHOT]

@nishantmonu51
Member

Looks like it should fix the case mentioned in #713?

@himanshug
Contributor

We put the Hadoop libs and extension jars explicitly on the classpath. Please make sure this change doesn't break other existing setups like those.

@jasonxh
Contributor Author

jasonxh commented Jun 21, 2015

@himanshug the extension class loader uses the root loader as a fallback, so anything on your root classpath that doesn't exist in the extension's dependency hierarchy (e.g., your YARN configs) will still be discoverable. This only affects the case where the extension's Hadoop dependency version (2.3.0) differs from your Hadoop version: now that the extension's own Hadoop is preferred over yours, a version conflict is possible. However, recompiling Druid to match your Hadoop version is recommended anyway.
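That fallback behavior can be illustrated with a standalone sketch (not Druid's actual loader): a child `URLClassLoader` with no jars of its own still resolves classes by delegating to its parent, which is how root-classpath entries stay visible to the extension.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class FallbackDemo {
    public static void main(String[] args) throws Exception {
        // "Extension" loader with no jars of its own; its parent is the
        // application class loader, mirroring the root-loader fallback.
        ClassLoader extLoader = new URLClassLoader(new URL[0],
                FallbackDemo.class.getClassLoader());
        // A class absent from the (empty) extension jars is still found
        // through parent delegation.
        Class<?> c = Class.forName("java.util.ArrayList", false, extLoader);
        System.out.println(c.getName());
    }
}
```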

@drcrallen
Contributor

@nishantmonu51 : We actually do that currently. That issue might be resolved.

@fjy
Contributor

fjy commented Jun 23, 2015

@jasonxh do you mind filling out the CLA: http://druid.io/community/cla.html ?

@drcrallen drcrallen modified the milestones: 0.8.1, 0.8.0 Jun 23, 2015
@cheddar
Contributor

cheddar commented Jun 23, 2015

👍

@drcrallen
Contributor

Yeah, it turns out this is needed for 0.8.0.

@jasonxh
Contributor Author

jasonxh commented Jun 23, 2015

@fjy I've filled out the CLA.

drcrallen added a commit that referenced this pull request Jun 23, 2015
@drcrallen drcrallen merged commit 8795792 into apache:master Jun 23, 2015
@drcrallen
Contributor

@jasonxh Thanks!
