Skip to content

FIX: HadoopFsWrapper not compatible with hadoop 2.7.1 if compile with hadoop 2.3.0#3787

Closed
hamlet-lee wants to merge 1 commit intoapache:masterfrom
hamlet-lee:tryfix_3786
Closed

FIX: HadoopFsWrapper not compatible with hadoop 2.7.1 if compile with hadoop 2.3.0#3787
hamlet-lee wants to merge 1 commit intoapache:masterfrom
hamlet-lee:tryfix_3786

Conversation

@hamlet-lee
Copy link
Copy Markdown
Contributor

@hamlet-lee hamlet-lee commented Dec 18, 2016

I worked around #3786 this problem by invoking rename by reflection.
Its a temporary way, and I am not sure, if this way suitable to be merged into druid/master.

@hamlet-lee hamlet-lee changed the base branch from 0.9.2 to master December 19, 2016 10:28
@fjy
Copy link
Copy Markdown
Contributor

fjy commented Dec 19, 2016

@jon-wei can we run this through Hadoop gauntlet?

@himanshug
Copy link
Copy Markdown
Contributor

on the surface, it looks like this patch does not change anything except making a particular method invocation using reflection instead of direct call. we do run hadoop 2.7.2 with this patch and things have worked fine so far.

@hamlet-lee did you find out why you were getting IllegalAccessError without reflection usage? Can you check that you did not have multiple hadoop versions on the classpath e.g. one set of hadoop jars coming via hdfs-storage module and one set explicitly specified on the classpath ?

I'm trying to understand why reflection solves this particular issue.

@hamlet-lee
Copy link
Copy Markdown
Contributor Author

hamlet-lee commented Dec 20, 2016

my middle manager is started as

HADOOP_CLASSPATH=$HADOOP_CONF_DIR nohup  java -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:MaxPermSize=256m -XX:PermSize=256m -Xms1024m -Xmx5000m -Duser.timezone=GMT+8 -Dfile.encoding=UTF-8 -classpath ${HADOOP_CONF_DIR}:${DRUID_CONF}/_common:${DRUID_CONF}/middleManager:${DRUID_HOME}/lib/*:`$HADOOP_HOME/bin/hadoop classpath`: io.druid.cli.Main server middleManager

I print classloader by below code:

public static boolean rename(FileSystem fs, Path from, Path to) throws IOException
  {
    try {
      log.info( "HadoopFsWrapper classloader: " +HadoopFsWrapper.class.getClassLoader());
      log.info( "FileSystem classloader: " + FileSystem.class.getClassLoader());

      Method renameMethod = fs.getClass().getMethod("rename", Path.class, Path.class, Options.Rename[].class);
      renameMethod.invoke(fs, from, to, new Options.Rename[]{Options.Rename.NONE});
      return true;
    }
    catch (Exception ex) {
      log.warn(ex, "Failed to rename [%s] to [%s].", from, to);
      return false;
    }
  }
2016-12-20T15:01:30,234 INFO [appenderator_merge_0] org.apache.hadoop.fs.HadoopFsWrapper - HadoopFsWrapper classloader: java.net.URLClassLoader@7cd1ac19
2016-12-20T15:01:30,234 INFO [appenderator_merge_0] org.apache.hadoop.fs.HadoopFsWrapper - FileSystem classloader: sun.misc.Launcher$AppClassLoader@5f184fc6

HadoopFsWrapper and FileSystem are from different classloader. This may caused the exception.

@drcrallen
Copy link
Copy Markdown
Contributor

If that is correct this patch probably makes sure the class

@drcrallen
Copy link
Copy Markdown
Contributor

Stupid mobile.....

Makes sure the class loader is correct. Can you debug why there is a miss match in class loaders here?

@himanshug
Copy link
Copy Markdown
Contributor

my guess is,

hdfs-storage (and default hadoop jars of version 2.3.0) loaded in the extension classloader , main classloader has hadoop 2.7.1 jars on the classpath .... and depending on position of stars unpredictable things happen :)

I think this issue will also possibly get solved if hdfs-storage was built with hadoop 2.7.1 as dependency in the pom so as to have hadoop 2.7.1 jars in the extension folder.

@himanshug himanshug mentioned this pull request Jun 5, 2017
@himanshug
Copy link
Copy Markdown
Contributor

this has been reported by another user in https://groups.google.com/d/msg/druid-user/mFRYT2dRS7g/XUDWIPjhAwAJ .

Even if things work fine in most cluster deployments. It seems worthwhile to merge this PR in order to solve this problem for users. Using reflection wouldn't cause any performance degradation because its not inside a hot loop. Only downside is that if/when rename function signature changes with hadoop upgrade, compiler wouldn't inform us about the change and then we might end up with a bug.

@gianm
Copy link
Copy Markdown
Contributor

gianm commented Jan 9, 2018

Given the issue fixed by #5187, we need to be aware that having the correct renaming/overwriting behavior on HDFS is important for correctness. So I would want to make sure this patch throws an exception if the reflection fails, rather than returning false.

@himanshug
Copy link
Copy Markdown
Contributor

@hamlet-lee not sure if you're still bothered by this but do you want to patch it and resolve conflict ?

@hamlet-lee
Copy link
Copy Markdown
Contributor Author

@himanshug I patched it on an older version and have been using it. Currently I do not have plan to patch and resolve the conflict. I am glad if anyone would like to take it and make it into master.

@himanshug
Copy link
Copy Markdown
Contributor

@hamlet-lee I fixed the conflicts and did a bit of updation to take care of @gianm 's concern in #3787 (comment)

@himanshug
Copy link
Copy Markdown
Contributor

closing this in favor of #5296

@himanshug himanshug closed this Jan 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants