
Conversation

@jerryshao
Contributor

What changes were proposed in this pull request?

  1. Currently the log4j file shipped via the distributed cache is only added to the AM's classpath, not the executors'. This was introduced in [SPARK-11105][yarn] Distribute log4j.properties to executors #9118 and breaks the original intent of that PR, so this change adds the log4j file to the classpath of both the AM and the executors.
  2. Automatically upload metrics.properties to the distributed cache so that it can be used implicitly by the remote driver and executors.
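As a rough illustration of the classpath lookup involved in the second change, the sketch below (not the patch itself; the object and method names are invented for this example) shows how optional config files can be located on the classpath before being handed to the distributed cache:

```scala
// Illustrative sketch only: find optional Spark config files on the classpath,
// roughly the way the patch does before uploading them to the YARN distributed
// cache. ConfigFileLookup and findOnClasspath are names made up for this example.
object ConfigFileLookup {
  // Returns the local URLs of the config files that are actually present.
  def findOnClasspath(names: Seq[String]): Seq[java.net.URL] =
    names.flatMap(n => Option(Thread.currentThread().getContextClassLoader.getResource(n)))

  def main(args: Array[String]): Unit = {
    // The patch considers both files; copies passed via --files take precedence.
    findOnClasspath(Seq("log4j.properties", "metrics.properties"))
      .foreach(url => println(s"would upload: $url"))
  }
}
```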

How was this patch tested?

Unit tests and integration tests are done.

@SparkQA

SparkQA commented Mar 22, 2016

Test build #53763 has finished for PR 11885 at commit 6c20d37.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jerryshao jerryshao closed this Mar 22, 2016
@jerryshao jerryshao reopened this Mar 23, 2016
@jerryshao jerryshao changed the title from "[SPARK=14062][Yarn] Upload metrics.properties automatically with distributed cache" to "[SPARK-14062][Yarn] Fix log4j and upload metrics.properties automatically with distributed cache" Mar 23, 2016
@SparkQA

SparkQA commented Mar 23, 2016

Test build #53912 has finished for PR 11885 at commit f9cb06b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 23, 2016

Test build #53911 has finished for PR 11885 at commit 6c20d37.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jerryshao
Contributor Author

CC @vanzin @tgravescs, please help review, thanks a lot.

@SparkQA

SparkQA commented Mar 23, 2016

Test build #53918 has finished for PR 11885 at commit 260ff0e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Mar 23, 2016

LGTM, FWIW. You're just uploading an additional file here and cleaning up the code.

```scala
// Also uploading metrics.properties to distributed cache if exists in classpath.
// If user specify this file using --files then executors will use the one
// from --files instead.
for { prop <- Seq("log4j.properties", "metrics.properties")
```
Contributor

Does this break the oldLog4jConf functionality above? I think it will throw an exception if both exist.

Contributor Author

I haven't tried yet, I will do a quick test on this.

Contributor Author

Hi @tgravescs , I just did a quick test on this.

If oldLog4jConf points to the same log4j file as the one under <SPARK_HOME>/conf, it is added to the distributed cache once and a warning is logged for the duplicate. If oldLog4jConf points to a different log4j file, the one under <SPARK_HOME>/conf takes precedence.

Since SPARK_LOG4J_CONF is deprecated, I think there should be no problem, and the semantics stay consistent.

Contributor

Since it's deprecated and I would like to see it removed, I don't think it's that big a deal, but I disagree with the ordering if we are keeping it.

If I explicitly specify something in SPARK_LOG4J_CONF, it should take precedence over anything in the <SPARK_HOME>/conf dir.

Contributor

Agree with Tom, but I'd rather just remove support for that env variable now. It's basically one line of code and a warning log in this file...

Contributor Author

Sure, I will remove the support of this env variable.

@vanzin
Contributor

vanzin commented Mar 24, 2016

I'd prefer if these files were uploaded inside the config archive generated by Spark, as the code you're deleting does for log4j.properties. That avoids creating more small files in HDFS and speeds things up even if a tiny bit.

Is the problem here that the archive is not distributed to executors? If so, then maybe the better solution is to do that instead.

@jerryshao
Contributor Author

@vanzin, thanks for your review. I know that putting these files into the conf archive is a more elegant way, but currently the conf archive is only added to the AM's classpath. Your patch explains why it is only added to the AM's classpath:

These are only used by the AM, since executors will use the configuration object broadcast by
the driver. The files are zipped and added to the job as an archive, so that YARN will explode
it when distributing to the AM. This directory is then added to the classpath of the AM
process, just to make sure that everybody is using the same default config.

So I'm not sure whether there would be any side effect if we add the conf archive to the executor's classpath.
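For illustration only, putting the exploded conf archive on a container's classpath via the YARN launch environment might look like the sketch below (the constant and helper are stand-ins written for this example, not the actual Client.scala code):

```scala
// Illustrative sketch: add the localized conf dir (the exploded __spark_conf__
// archive) to a YARN container's CLASSPATH environment variable, so executors
// see it as well as the AM. addConfDirToClasspath is a name made up here.
import org.apache.hadoop.yarn.api.ApplicationConstants

val LOCALIZED_CONF_DIR = "__spark_conf__"

def addConfDirToClasspath(env: scala.collection.mutable.Map[String, String]): Unit = {
  // Environment.PWD.$$() expands to the container's working directory,
  // where YARN explodes localized archives.
  val entry = ApplicationConstants.Environment.PWD.$$() + "/" + LOCALIZED_CONF_DIR
  val existing =
    env.get("CLASSPATH").map(_ + ApplicationConstants.CLASS_PATH_SEPARATOR).getOrElse("")
  env("CLASSPATH") = existing + entry
}
```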

@vanzin
Contributor

vanzin commented Mar 24, 2016

I don't think there's any harm in using the archive everywhere; it's currently only used in the AM mostly as an optimization, since it wasn't really used in the executors (aside from the oversight of log4j.properties).

@jerryshao
Contributor Author

My concern is about Hadoop-related configurations: which copy takes precedence if several paths on the classpath have different configurations?

@vanzin
Contributor

vanzin commented Mar 24, 2016

There's no "several paths". Spark will broadcast the hadoop configs before running tasks and use that in the executors, so Spark won't use whatever is in the executor's classpath anyway.

@jerryshao
Contributor Author

Thanks a lot for your explanation.

I'm not sure if I understand correctly. Currently we add <hadoop_home>/etc/hadoop to the classpath by default for the AM and executors. If we now also add __spark_conf__ to the executors' classpath, there will be another copy of the Hadoop conf; in addition, we create a Configuration() at executor start, which picks up specific settings such as s3 and spark.hadoop.xxx sent from the driver.

If the two copies, one in the cluster's Hadoop home and one sent from the client, differ, I'm not sure whether there is any side effect.

It's just a concern; we haven't actually hit such an issue.

@vanzin
Contributor

vanzin commented Mar 24, 2016

As I've said above, Spark does not use the Hadoop configuration from the classpath in the executors. It uses the Hadoop configuration broadcast from the driver.

So no matter what you add to the executor's classpath, it will not be used.

And in any case, using the configuration present in the submitting node is more correct than using whatever configuration might or might not be available on the cluster nodes, which was the whole point of uploading the configuration archive to the AM in the first place.
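A sketch of the broadcast mechanism described above: this mirrors the approach of Spark's internal SerializableConfiguration, but the wrapper class below is a stand-in written for this example, since a plain Hadoop Configuration is not Java-serializable.

```scala
// Illustrative sketch: wrap a Hadoop Configuration so it can be broadcast from
// the driver. Tasks read the broadcast value instead of building a Configuration
// from whatever conf files happen to be on the executor's classpath.
import java.io.{ObjectInputStream, ObjectOutputStream}
import org.apache.hadoop.conf.Configuration

class SerializableConfig(@transient var value: Configuration) extends Serializable {
  // Configuration implements Writable, so delegate to its own (de)serialization.
  private def writeObject(out: ObjectOutputStream): Unit = {
    out.defaultWriteObject()
    value.write(out)
  }
  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    value = new Configuration(false)
    value.readFields(in)
  }
}

// Driver side (sketch): val confBc = sc.broadcast(new SerializableConfig(sc.hadoopConfiguration))
// Task side:            confBc.value.value.get("fs.defaultFS")
```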

@jerryshao
Contributor Author

OK, thanks a lot for your explanation 😄 .

@SparkQA

SparkQA commented Mar 24, 2016

Test build #54011 has finished for PR 11885 at commit 6702927.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 24, 2016

Test build #54019 has finished for PR 11885 at commit b1da8e5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Mar 25, 2016

LGTM, just need to fix the env variable thing one way or another.


```scala
val statCache: Map[URI, FileStatus] = HashMap[URI, FileStatus]()

val oldLog4jConf = Option(System.getenv("SPARK_LOG4J_CONF"))
```
Contributor Author

Here I removed support for SPARK_LOG4J_CONF. I already did this in #11603 as well; I can handle the merge conflicts.

@SparkQA

SparkQA commented Mar 28, 2016

Test build #54293 has finished for PR 11885 at commit ea17176.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jerryshao
Contributor Author

The MiMa failure is not related to this patch. Jenkins, retest this please.

@SparkQA

SparkQA commented Mar 28, 2016

Test build #54294 has finished for PR 11885 at commit ea17176.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jerryshao
Contributor Author

@vanzin, please help review again, thanks a lot.

@jerryshao
Contributor Author

CC @tgravescs @vanzin, any further comments on this patch?

@@ -545,8 +528,7 @@ private[spark] class Client(
// Distribute an archive with Hadoop and Spark configuration for the AM.
Contributor

Update the comment, since the archive now goes everywhere.

Contributor Author

Thanks, I will update the comment.

@tgravescs
Contributor

minor comment update, otherwise +1

@SparkQA

SparkQA commented Mar 31, 2016

Test build #54627 has finished for PR 11885 at commit a619dfd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Mar 31, 2016

LGTM, merging to master.

@asfgit asfgit closed this in 3b3cc76 Mar 31, 2016