A couple of hdfs related fixes#1454
Conversation
* Class loading issue with hdfs-storage extension * Exception when using hdfs with non-fully qualified segment path
|
Hi @jasonxh thanks for the patch! Do you happen to have more information on when setting the HdfsStorageDruidModule classloading is required? If you have some examples of failures that would be awesome. |
|
This is the exception from path merging that happened during hadoop indexing task |
|
This is the exception where historical node failed to load segment from hdfs. I used to work around it by adding hadoop libs to class path, but was wondering why it should be necessary, considering that hdfs-storage-extension supposedly includes the dependency already. |
|
looks like it should fix the case mentioned in #713 ? |
|
we put the hadoop libs and extension jars explicitly on the classpath. please make sure that this change does not break other existing setups like those. |
|
@himanshug the extension class loader uses the root loader as a fallback, so anything on your root CLASSPATH that doesn't exist in extension dependency hierarchy (e.g. your yarn configs) will still be discoverable. It impacts only the case where extension's hadoop dependency has a version (2.3.0) different than your hadoop version. Now that extension's own hadoop is being preferred over your hadoop, a version conflict is possible. However, it is recommended to recompile druid to match your hadoop version anyway. |
|
@nishantmonu51 : We actually do that currently. That issue might be resolved. |
|
@jasonxh do you mind filing out the CLA: http://druid.io/community/cla.html ? |
|
👍 |
|
Yeah, turns out this is needed for 0.8.0 |
|
@fjy I've filled out the CLA |
A couple of hdfs related fixes
|
@jasonxh Thanks! |
Even though the extension has hadoop-hdfs as a dependency, currently it's not able to load relevant classes during runtime because Hadoop is using thread context class loader by default.
@fjy can you take a look?