-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-14062][Yarn] Fix log4j and upload metrics.properties automatically with distributed cache #11885
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #53763 has finished for PR 11885 at commit
|
|
Test build #53912 has finished for PR 11885 at commit
|
|
Test build #53911 has finished for PR 11885 at commit
|
|
CC @vanzin @tgravescs please help to review, thanks a lot. |
|
Test build #53918 has finished for PR 11885 at commit
|
|
LGTM, FWIW. You're just uploading an additional file here and cleaning up the code. |
| // Also uploading metrics.properties to distributed cache if exists in classpath. | ||
| // If user specify this file using --files then executors will use the one | ||
| // from --files instead. | ||
| for { prop <- Seq("log4j.properties", "metrics.properties") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this break the oldLog4jConf functionality above? I think it will throw an exception if both exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tried yet, I will do a quick test on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @tgravescs , I just did a quick test on this.
If oldLog4jConf points to the same log4j file under <SPARK_HOME>/conf, it will be added to distributed cache once and get a warning for the following one. If oldLog4jConf points to a different log4j file other than the default one under <SPARK_HOME>/conf, so the one under conf took precedence.
I think since SPARK_LOG4J_CONF is deprecated, so there should be no problem, and semantically still keep the consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since its deprecated and I would like to see it removed I don't think its that big of deal, but I disagree with the order if we are keeping it.
If I explicitly specify something in SPARK_LOG4J_CONF it should take precendence over anything in the <SPARK_HOME>/conf dir.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with Tom, but I'd rather just remove support for that env variable now. It's basically one line of code and a warning log in this file...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will remove the support of this env variable.
|
I'd prefer if these files were uploaded inside the config archive generated by Spark, as the code you're deleting does for log4j.properties. That avoids creating more small files in HDFS and speeds things up even if a tiny bit. Is the problem here that the archive is not distributed to executors? If so, then maybe the better solution is to do that instead. |
|
@vanzin , thanks for your review. I know that putting into
So I'm not sure if there's any side-effect if we add this |
|
I don't think there's any harm in using the archive everywhere; it's currently only used in the AM mostly as an optimization, since it wasn't really used in the executors (aside from the oversight of log4j.properties). |
|
My concern is about hadoop related configurations, who will take the precedence if several paths have different configurations. |
|
There's no "several paths". Spark will broadcast the hadoop configs before running tasks and use that in the executors, so Spark won't use whatever is in the executor's classpath anyway. |
|
Thanks a lot for your explanation. I'm not sure if I understand correctly, currently we will add If the two copies, one in cluster's hadoop home and one send from client, has difference, not sure if there's any side-effect. It's just my concern, we haven't yet met such issue. |
|
As I've said above, spark does not use the Hadoop configuration from the classpath in the executors. It uses the hadoop configuration broadcast from the driver. So no matter what you add to the executor's classpath, it will not be used. And in any case, using the configuration present in the submitting node is more correct than using whatever configuration might or might not be available on the cluster nodes, which was the whole point of uploading the configuration archive to the AM in the first place. |
|
OK, thanks a lot for your explanation 😄 . |
|
Test build #54011 has finished for PR 11885 at commit
|
|
Test build #54019 has finished for PR 11885 at commit
|
|
LGTM, just need to fix the env variable thing one way or another. |
|
|
||
| val statCache: Map[URI, FileStatus] = HashMap[URI, FileStatus]() | ||
|
|
||
| val oldLog4jConf = Option(System.getenv("SPARK_LOG4J_CONF")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I removed the support of SPARK_LOG4J_CONF, though I already did it in #11603 , I can handle the merge conflicts.
|
Test build #54293 has finished for PR 11885 at commit
|
|
The Mima failure is not related to this patch. Jenkins, retest this please. |
|
Test build #54294 has finished for PR 11885 at commit
|
|
@vazin, please help to review again, thanks a lot. |
|
CC @tgravescs @vanzin , any further comment about this patch? |
| @@ -545,8 +528,7 @@ private[spark] class Client( | |||
| // Distribute an archive with Hadoop and Spark configuration for the AM. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update comment since now going everywhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I will update the comment.
|
minor comment update, otherwise +1 |
|
Test build #54627 has finished for PR 11885 at commit
|
|
LGTM, merging to master. |
What changes were proposed in this pull request?
How was this patch tested?
Unit test and integration test is done.