FIX: HadoopFsWrapper not compatible with hadoop 2.7.1 if compile with hadoop 2.3.0 #3787
hamlet-lee wants to merge 1 commit into apache:master
|
@jon-wei can we run this through Hadoop gauntlet? |
|
On the surface, it looks like this patch changes nothing except that one particular method is invoked via reflection instead of a direct call. We do run Hadoop 2.7.2 with this patch and things have worked fine so far. @hamlet-lee did you find out why you were getting IllegalAccessError without reflection? Can you check that you did not have multiple Hadoop versions on the classpath, e.g. one set of Hadoop jars coming via the hdfs-storage module and one set explicitly specified on the classpath? I'm trying to understand why reflection solves this particular issue. |
|
I printed the classloaders in my middle manager: HadoopFsWrapper and FileSystem are loaded by different classloaders. This may have caused the exception. |
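The snippet used to print the classloaders is not preserved in this thread. As a rough sketch of how such a check could look, the following compares the defining classloaders of two classes. The class name ClassLoaderCheck and its helper methods are hypothetical, and JDK classes stand in for HadoopFsWrapper and FileSystem so that the example runs without Hadoop on the classpath.

```java
final class ClassLoaderCheck {
    // Report which classloader defined the given class.
    // getClassLoader() returns null for classes loaded by the bootstrap loader.
    static String describeLoader(Class<?> cls) {
        ClassLoader loader = cls.getClassLoader();
        return cls.getName() + " <- " + (loader == null ? "bootstrap" : loader.toString());
    }

    // True when both classes were defined by the same classloader instance.
    static boolean sameLoader(Class<?> a, Class<?> b) {
        return a.getClassLoader() == b.getClassLoader();
    }

    public static void main(String[] args) {
        // Stand-ins: in the real deployment these would be the wrapper class
        // and the Hadoop FileSystem class discussed in this thread.
        System.out.println(describeLoader(String.class));
        System.out.println(describeLoader(ClassLoaderCheck.class));
        System.out.println("same loader: " + sameLoader(String.class, ClassLoaderCheck.class));
    }
}
```

If the two classes of interest report different loaders, they live in different runtime packages even when their package names match, which is one plausible way for package-private or protected access to fail at runtime.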
|
If that is correct this patch probably makes sure the class |
|
Stupid mobile..... Makes sure the class loader is correct. Can you debug why there is a mismatch in class loaders here? |
|
My guess is that hdfs-storage (and the default Hadoop 2.3.0 jars) is loaded in the extension classloader, while the main classloader has Hadoop 2.7.1 jars on the classpath ... and depending on the position of the stars, unpredictable things happen :) I think this issue would possibly also be solved if hdfs-storage were built with Hadoop 2.7.1 as the dependency in the pom, so that the Hadoop 2.7.1 jars end up in the extension folder. |
|
This has been reported by another user in https://groups.google.com/d/msg/druid-user/mFRYT2dRS7g/XUDWIPjhAwAJ . Even if things work fine in most cluster deployments, it seems worthwhile to merge this PR in order to solve this problem for users. Using reflection wouldn't cause any performance degradation because it's not inside a hot loop. The only downside is that if/when the rename function signature changes with a Hadoop upgrade, the compiler wouldn't inform us about the change and we might end up with a bug. |
|
Given the issue fixed by #5187, we need to be aware that having the correct renaming/overwriting behavior on HDFS is important for correctness. So I would want to make sure this patch throws an exception if the reflection fails, rather than returning |
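A minimal sketch of the "fail loudly" behavior requested here, assuming a stand-in class instead of Hadoop's FileSystem. FakeFs, renameByReflection, and the two-String rename signature are all hypothetical and are not the patch's actual code or Hadoop's API. The point is only that every reflection failure surfaces as an exception rather than being swallowed.

```java
import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

final class ReflectiveRename {
    // Hypothetical stand-in for the file system class; NOT Hadoop's API.
    static class FakeFs {
        protected void rename(String from, String to) throws IOException {
            if (from.equals(to)) {
                throw new IOException("source and destination are the same: " + from);
            }
        }
    }

    // Invoke rename reflectively and never hide a failure from the caller.
    static void renameByReflection(Object fs, String from, String to) throws IOException {
        try {
            // getDeclaredMethod only inspects the object's own class, not its superclasses.
            Method rename = fs.getClass().getDeclaredMethod("rename", String.class, String.class);
            rename.setAccessible(true);
            rename.invoke(fs, from, to);
        } catch (InvocationTargetException e) {
            // The target method itself threw: rethrow its IOException unchanged.
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw new IllegalStateException("rename failed", cause);
        } catch (ReflectiveOperationException e) {
            // Missing method or illegal access: fail instead of reporting a silent "no rename".
            throw new IllegalStateException("could not invoke rename reflectively", e);
        }
    }

    public static void main(String[] args) throws IOException {
        renameByReflection(new FakeFs(), "/tmp/a", "/tmp/b");
        System.out.println("rename invoked");
    }
}
```

Rethrowing the underlying IOException keeps the caller's existing error handling intact, while a lookup or access failure becomes an unchecked exception that cannot be mistaken for a successful or skipped rename.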
|
@hamlet-lee not sure if you're still bothered by this, but do you want to patch it and resolve the conflict? |
|
@himanshug I patched it on an older version and have been using it. Currently I do not have plans to patch it and resolve the conflict. I would be glad if anyone would like to take it over and get it into master. |
|
@hamlet-lee I fixed the conflicts and made a few updates to address @gianm's concern in #3787 (comment) |
|
closing this in favor of #5296 |
I worked around this problem (#3786) by invoking rename by reflection. It's a temporary workaround, and I am not sure whether this approach is suitable to be merged into druid/master.