[SPARK-17672] Spark 2.0 history server web Ui takes too long for a single application #15247
ajbozarth
left a comment
Overall I like this addition; just a couple of comments.
  }

  def getApplicationInfo(appId: String): Option[ApplicationInfo] = {
    throw new UnsupportedOperationException()
I think it would be better to return the expected result (the info if the id matches, or None if it doesn't) rather than throw an UnsupportedOperationException, just for consistency.
  case Some(LoadedAppUI(ui, updateState)) =>
-   val completed = ui.getApplicationInfoList.exists(_.attempts.last.completed)
+   val completed = ui.getApplicationInfo(appId)
+     .map(_.attempts.last.completed).getOrElse(false)
Though I get what you're trying to do here, I'm not sure this is actually faster in this case, since the list only has length one. It may be cleaner to leave this as is. What do other reviewers think?
Makes sense. I'll roll it back.
So while reviewing #15248, I think this original change actually might be better, mostly because it removes the dependence on the getApplicationInfoList method that you're working with in that PR, even if it "looks" less clean.
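The two variants discussed in this thread can be compared with a minimal sketch. The `AttemptInfo` and `AppInfo` case classes and the `SingleAppUI` class are simplified, hypothetical stand-ins, not Spark's actual types; only the two `completed` expressions are taken from the diff above.

```scala
// Simplified stand-ins for Spark's types (assumptions for illustration only).
case class AttemptInfo(completed: Boolean)
case class AppInfo(id: String, attempts: Seq[AttemptInfo])

// Hypothetical UI that holds a single application, like the one-item list in this thread.
class SingleAppUI(app: AppInfo) {
  def getApplicationInfoList: Iterator[AppInfo] = Iterator(app)

  def getApplicationInfo(appId: String): Option[AppInfo] =
    getApplicationInfoList.find(_.id == appId)
}

object CompletedCheck {
  // Original form: scans the list and never looks at the app id.
  def viaList(ui: SingleAppUI): Boolean =
    ui.getApplicationInfoList.exists(_.attempts.last.completed)

  // Proposed form: looks the app up by id; an unknown id counts as not completed.
  def viaLookup(ui: SingleAppUI, appId: String): Boolean =
    ui.getApplicationInfo(appId).map(_.attempts.last.completed).getOrElse(false)
}
```

For a one-item list both forms give the same answer when the id matches. The lookup form differs only in that it depends on `getApplicationInfo` instead of `getApplicationInfoList`, and returns false for an id that does not match. Both assume every application has at least one attempt, since `last` throws on an empty sequence.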
If this passes tests, LGTM
ok to test
Test build #66004 has finished for PR 15247 at commit
Test build #66013 has finished for PR 15247 at commit
  }

  def getApplicationInfo(appId: String): Option[ApplicationInfo] = {
    if (appId == this.appId) {
I'd rather this one use the iterator.find approach to avoid duplication. It's a one-item list anyway.
Good point. I've made this change.
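As a rough illustration of the suggestion above, the sketch below contrasts an explicit id comparison with the find-based form. `App` and `OneAppUI` are hypothetical, simplified stand-ins rather than the real `ApplicationInfo` and `SparkUI`, and the fields they carry are assumptions.

```scala
// Simplified stand-in for Spark's ApplicationInfo (assumption for illustration).
case class App(id: String, name: String)

// Hypothetical single-application UI; not the real SparkUI class.
class OneAppUI(val appId: String, val appName: String) {
  def getApplicationInfoList: Iterator[App] = Iterator(App(appId, appName))

  // Explicit comparison: has to build the App value a second time.
  def getApplicationInfoExplicit(id: String): Option[App] =
    if (id == appId) Some(App(appId, appName)) else None

  // find-based: reuses the list, so the App is constructed in one place only.
  def getApplicationInfo(id: String): Option[App] =
    getApplicationInfoList.find(_.id == id)
}
```

Both methods return the same result here. The find-based one avoids duplicating the construction logic, and scanning a one-item iterator costs essentially nothing.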
One minor comment; otherwise looks OK.
Test build #66050 has finished for PR 15247 at commit
Does anyone know why the last test failed in the sql module? My change has nothing to do with it.
Jenkins, retest this please
Test build #66059 has finished for PR 15247 at commit
Another test failed, this time in mllib: org.apache.spark.mllib.classification.NaiveBayesSuite.Naive Bayes Multinomial. @ajbozarth, have you seen this kind of unrelated failure before? Do I have permission to trigger the tests? Thanks!
retest this please
Test build #66110 has finished for PR 15247 at commit
  /**
   * Returns an ApplicationHistoryInfo for the appId.
   *
   * @return ApplicationHistoryInfo of one appId if exists.
This is kind of verbose. I would just say @return the [[ApplicationHistoryInfo]] for the appId if it exists
I'll fix this myself when I merge
Thanks!
    getApplicationList().iterator.map(ApplicationsListResource.appHistoryInfoToPublicAppInfo)
  }

  def getApplicationInfo(appId: String): Option[ApplicationInfo] = {
I believe this is a public API, but it's a reasonable one
private[spark] trait UIRoot {
  def getSparkUI(appKey: String): Option[SparkUI]
  def getApplicationInfoList: Iterator[ApplicationInfo]
  def getApplicationInfo(appId: String): Option[ApplicationInfo]
also this
LGTM, merging into master and 2.0. Thanks.
…ngle application
Added a new API getApplicationInfo(appId: String) in class ApplicationHistoryProvider and class SparkUI to get app info. In this change, FsHistoryProvider can directly fetch one app info in O(1) time complexity compared to O(n) before the change which used an Iterator.find() interface.
Both ApplicationCache and OneApplicationResource classes adopt this new api.
manual tests
Author: Gang Wu <wgtmac@uber.com>
Closes #15247 from wgtmac/SPARK-17671.
(cherry picked from commit cb87b3c)
Signed-off-by: Andrew Or <andrewor14@gmail.com>
What changes were proposed in this pull request?
Added a new API, getApplicationInfo(appId: String), to ApplicationHistoryProvider and SparkUI to fetch the info for a single application. With this change, FsHistoryProvider can fetch one application's info directly in O(1) time, compared with O(n) before the change, when the lookup went through Iterator.find().
Both ApplicationCache and OneApplicationResource adopt this new API.
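The complexity difference described above can be sketched as follows, assuming the provider keeps its applications in a hash map keyed by app id. `HistoryInfo` and `MapBackedProvider` are hypothetical names for illustration and do not reproduce the real FsHistoryProvider.

```scala
import scala.collection.mutable

// Simplified stand-in for ApplicationHistoryInfo (assumption for illustration).
case class HistoryInfo(id: String, name: String)

// Hypothetical provider backed by an insertion-ordered hash map keyed by app id.
class MapBackedProvider(apps: Seq[HistoryInfo]) {
  private val byId = mutable.LinkedHashMap[String, HistoryInfo]()
  apps.foreach(a => byId(a.id) = a)

  // Full listing, as used by the application list page.
  def getListing(): Iterator[HistoryInfo] = byId.valuesIterator

  // Before: O(n), walks the listing until the id matches.
  def findByScan(appId: String): Option[HistoryInfo] =
    getListing().find(_.id == appId)

  // After: expected O(1), a direct hash lookup by id.
  def getApplicationInfo(appId: String): Option[HistoryInfo] =
    byId.get(appId)
}
```

Both lookups return the same `Option`. The direct lookup simply avoids iterating over every known application, which matters when the history server tracks many applications.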
How was this patch tested?
manual tests