Set context classloader to the one used to load the hdfs extension in HdfsDataSegmentPusher #1716

Closed
drcrallen wants to merge 1 commit into apache:master from metamx:fixHdfsRealtime1714

Conversation

@drcrallen
Contributor

@drcrallen drcrallen added the Bug label Sep 10, 2015
@drcrallen
Contributor Author

@himanshug I'm not sure whether this will cause *.xml files on the execution classpath to no longer be found. Do you know a good way to test that this doesn't clobber hdfs-site.xml and other configs?

@himanshug
Contributor

I don't think this will necessarily fix the issue, because FileSystem caches a scheme-to-FileSystem map at class level. If even a single FileSystem.getFileSystem() call happens before this code is reached, setting the classloader here will have no effect. This patch would make behavior unpredictable, depending on whether FileSystem builds its cache during the call to this code or earlier.
I submitted a patch to HDFS that could potentially fix this issue (the build failure in the JIRA is unrelated, like our transitive failures :) ).
See https://issues.apache.org/jira/browse/HDFS-8750

Regarding clobbering: yes, it might (and unpredictably so), since Configuration caches things at class level the first time it is used.
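
To illustrate the ordering problem described above, here is a minimal sketch in plain Java. It does not use Hadoop: `LazyRegistry` is a hypothetical stand-in for any class that captures the thread context classloader into class-level state on first use, and `CacheOrderDemo` is an illustrative name as well. Under those assumptions, it shows that a later classloader switch only matters if nothing touched the cache beforehand.

```java
// Hypothetical stand-in for a class that caches state at class level on first use.
final class LazyRegistry {
    private static boolean initialized = false;
    private static ClassLoader cachedLoader;

    // Captures the context classloader once; later calls reuse the cached value.
    static synchronized ClassLoader loaderUsedForLookup() {
        if (!initialized) {
            cachedLoader = Thread.currentThread().getContextClassLoader();
            initialized = true;
        }
        return cachedLoader;
    }

    // Only for the demo, so that both orderings can be tried in one JVM.
    static synchronized void resetForDemo() {
        initialized = false;
        cachedLoader = null;
    }
}

final class CacheOrderDemo {
    // Returns true if switching the context classloader affected the registry.
    static boolean switchTakesEffect(boolean touchRegistryFirst) {
        Thread thread = Thread.currentThread();
        ClassLoader original = thread.getContextClassLoader();
        // Stand-in for the classloader that loaded an extension.
        ClassLoader extension = new ClassLoader(original) {};
        LazyRegistry.resetForDemo();
        try {
            if (touchRegistryFirst) {
                // Simulates an unrelated early call that populates the cache.
                LazyRegistry.loaderUsedForLookup();
            }
            thread.setContextClassLoader(extension);
            return LazyRegistry.loaderUsedForLookup() == extension;
        } finally {
            // Always restore the previous context classloader.
            thread.setContextClassLoader(original);
        }
    }

    public static void main(String[] args) {
        System.out.println("early call, switch effective: " + switchTakesEffect(true));
        System.out.println("no early call, switch effective: " + switchTakesEffect(false));
    }
}
```

Running `CacheOrderDemo` prints `false` for the first case and `true` for the second, which is the kind of order dependence the comment warns about.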

@drcrallen
Contributor Author

@himanshug I think your HDFS patch is the better solution, and I agree with your assessment of the unpredictability of class-level caching (noted in #1714 (comment); I should have noted it here as well). I'm closing this PR, since it would only make things less stable in general while solving just one very specific use case.

@himanshug
Contributor

@drcrallen I thought more about this today, and it can work if we ensure that the classloader switch happens before FileSystem's internal initialization, which can be done by forcing it inside HdfsStorageDruidModule.
I created another PR: #1721
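
The general pattern of forcing an initialization under a chosen context classloader might look like the sketch below. This is not the code from #1721: `ContextClassLoaders.callWith` and `ForcedInitDemo` are hypothetical names, and the supplier stands in for whatever call triggers the library's one-time initialization.

```java
import java.util.function.Supplier;

// Hypothetical helper: runs an action with a temporary context classloader.
final class ContextClassLoaders {
    static <T> T callWith(ClassLoader loader, Supplier<T> action) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return action.get();
        } finally {
            // Restore the caller's classloader even if the action throws.
            thread.setContextClassLoader(previous);
        }
    }
}

final class ForcedInitDemo {
    public static void main(String[] args) {
        // Stand-in for the classloader that loaded the extension module.
        ClassLoader extension = new ClassLoader(ForcedInitDemo.class.getClassLoader()) {};
        // The supplier represents the first-use initialization being forced early.
        ClassLoader seen = ContextClassLoaders.callWith(
                extension, () -> Thread.currentThread().getContextClassLoader());
        System.out.println("initialization saw extension loader: " + (seen == extension));
    }
}
```

The try/finally restore keeps the switch scoped to the forced initialization, so code that runs afterwards on the same thread still sees its original context classloader.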

@drcrallen drcrallen deleted the fixHdfsRealtime1714 branch September 11, 2015 23:36

2 participants