[YARN][DOC] Remove non-Yarn specific configurations from running-on-yarn.md #15869
Conversation
srowen
left a comment
CC @vanzin
docs/configuration.md
Outdated
Amount of memory to use per executor process (e.g. <code>2g</code>, <code>8g</code>).
</td>
</tr>
<tr>
I think this is actually YARN-specific right now? At least it is according to SparkSubmit.scala
Thanks for reviewing this PR, @srowen
Actually I am confused by this. From the information below, it seems it is not YARN-specific:
- https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/SparkContextSuite.scala#L403
- In `configuration.md` (see the sketch after this list):
<tr>
<td><code>spark.dynamicAllocation.initialExecutors</code></td>
<td><code>spark.dynamicAllocation.minExecutors</code></td>
<td>
Initial number of executors to run if dynamic allocation is enabled.
<br /><br />
If `--num-executors` (or `spark.executor.instances`) is set and larger than this value, it will
be used as the initial number of executors.
</td>
</tr>
- Currently `spark.executor.instances` is defined in `config/package.scala`. If it were YARN-specific, it should be defined in `yarn/config.scala` instead of `config/package.scala`.
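For illustration only: the rule quoted from `configuration.md` above can be expressed as a tiny, self-contained Scala sketch. The numbers are made up; the point is simply that the initial executor count is derived in core from generic properties rather than anything YARN-specific.

```scala
// Hypothetical values; the rule itself is the one quoted from configuration.md above.
object InitialExecutorsSketch extends App {
  val minExecutors      = 0        // spark.dynamicAllocation.minExecutors
  val initialExecutors  = 2        // spark.dynamicAllocation.initialExecutors
  val executorInstances = Some(5)  // spark.executor.instances / --num-executors

  // If spark.executor.instances is set and larger, it wins as the initial count.
  val effectiveInitial = (Seq(minExecutors, initialExecutors) ++ executorInstances).max
  println(s"initial executors = $effectiveInitial")  // prints 5
}
```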
Use lower-case suffixes, e.g. <code>k</code>, <code>m</code>, <code>g</code>, <code>t</code>, and <code>p</code>, for kibi-, mebi-, gibi-, tebi-, and pebibytes, respectively.
</td>
</tr>
<tr>
Yes, I see that these are duplicated in the YARN docs and the main config docs. So are things like spark.driver.cores, but the YARN docs describe what that one means slightly more specifically for YARN. The spark.driver.memory doc isn't different, but for consistency it may not hurt to leave a note about these key, generic properties in the YARN docs.
Yes, in YARN mode the properties spark.driver.cores, spark.driver.memory, and spark.executor.memory are often used. But if we search for them in the Spark code (see the references below), we will find they are not YARN-specific; a small illustration follows the list.
- https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L448
- https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L477
- https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L469
- https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/rest/StandaloneRestServer.scala#L135
- https://github.com/apache/spark/blob/master/mesos/src/main/scala/org/apache/spark/deploy/rest/mesos/MesosRestServer.scala#L92
- ...
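To make that concrete, here is a minimal, hypothetical snippet (names and values are illustrative) showing the same kind of properties used with local mode, with no YARN involved. Note that spark.driver.memory normally has to be set before the driver JVM starts (e.g. via spark-submit or spark-defaults.conf), so it is omitted here; the keys themselves are cluster-manager-agnostic.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example: generic executor/driver properties set without YARN.
object GenericPropsExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("generic-props-example")
      .setMaster("local[2]")                // local mode, not YARN
      .set("spark.executor.memory", "2g")   // same key works on any cluster manager
      .set("spark.driver.cores", "1")

    val sc = new SparkContext(conf)
    println(sc.getConf.get("spark.executor.memory"))  // "2g"
    sc.stop()
  }
}
```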
docs/running-on-yarn.md
Outdated
</td>
</tr>
<tr>
<td><code>spark.yarn.report.interval</code></td>
I am not sure whether these are meant to be exposed to users -- not secret exactly, but not sure if they're to be advertised. Maybe it's fine to document. I'm neutral.
If it wasn't documented, it was on purpose. If users find it a useful config we can expose it.
Thanks for this information.
docs/running-on-yarn.md
Outdated
<td><code>spark.yarn.report.interval</code></td>
<td>1s</td>
<td>
Interval between reports of the current app status in Yarn cluster mode.
Yarn -> YARN
Done.
docs/running-on-yarn.md
Outdated
</tr>
<tr>
<td><code>spark.yarn.services</code></td>
<td>Nil</td>
As far as the user is concerned, I think the best way to express the default is "(none)". Nil is an implementation detail.
Done.
Hi, @srowen I have replied to your comments and updated the PR. Could you please review it again? Thanks.
Can you just put some of the more trivial pull requests into one, e.g. the two documentation update ones?
docs/running-on-yarn.md
Outdated
</td>
</tr>
<tr>
<td><code>spark.yarn.services</code></td>
I don't think we should expose this. I think this was added with a PR to start adding integration with ATS, which I don't know that we officially support yet; this config is just for testing at this point. @steveloughran to verify.
Thanks for pointing this out.
@rxin Thanks for commenting. The reason I created two PRs is that I did not notice these issues at the same time. This one is still under discussion and I am not sure whether it will be merged, whereas I am sure the other one will be. I did not think too much about this; I just thought that if I find an issue, no matter how small, I should submit a PR directly. But I will pay attention next time. So do you mean I should close PR #15886 and merge it into this PR? I can do that. Thanks.
Hi, @tgravescs What do you think about this?
Actually I noticed this issue when I was working on PR #15563. So if these properties are documented in the current way on purpose, I can just close this PR. I would like your suggestion. Thanks.
…iguration doc, etc
@tgravescs Thanks for the reply. I have updated the PR.
srowen
left a comment
OK, but can you now update the title and description? "Update Yarn doc" is too generic, and the description doesn't match the change now.
The plugin point is more generic than ATS integration; it lets you stick anything in to come up in the driver. Weakness: it's actually YARN-specific; I could imagine uses in standalone too. The update interval flag? I don't remember this or what it does. Sorry.
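For readers unfamiliar with the plugin point being discussed, here is a hypothetical, self-contained sketch of the general idea: classes named in a config such as spark.yarn.services are instantiated reflectively and started alongside the driver, then stopped on shutdown. The trait and class names below are illustrative only and are not the actual Spark YARN API.

```scala
// Hypothetical sketch only -- illustrative names, not the real Spark YARN interface.
trait DriverExtensionService {
  def start(appId: String): Unit
  def stop(): Unit
}

class LoggingExtensionService extends DriverExtensionService {
  override def start(appId: String): Unit = println(s"extension started for $appId")
  override def stop(): Unit = println("extension stopped")
}

object ExtensionServiceDemo extends App {
  // Simulate what a scheduler backend might do with configured class names.
  val serviceClassNames = Seq("LoggingExtensionService")
  val services = serviceClassNames.map { name =>
    Class.forName(name).getConstructor().newInstance().asInstanceOf[DriverExtensionService]
  }
  services.foreach(_.start("application_0000_0001"))
  services.foreach(_.stop())
}
```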
Hi, @srowen I have updated the title and description. Thanks.
Test build #3429 has finished for PR 15869 at commit
Merged to master/2.1 |
…arn.md ## What changes were proposed in this pull request? Remove `spark.driver.memory`, `spark.executor.memory`, `spark.driver.cores`, and `spark.executor.cores` from `running-on-yarn.md` as they are not Yarn-specific, and they are also defined in `configuration.md`. ## How was this patch tested? Build passed & Manually check. Author: Weiqing Yang <yangweiqing001@gmail.com> Closes #15869 from weiqingy/yarnDoc. (cherry picked from commit a3cac7b) Signed-off-by: Sean Owen <sowen@cloudera.com>
What changes were proposed in this pull request?
Remove `spark.driver.memory`, `spark.executor.memory`, `spark.driver.cores`, and `spark.executor.cores` from `running-on-yarn.md` as they are not Yarn-specific, and they are also defined in `configuration.md`.
How was this patch tested?
Build passed & manually checked.