-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-17671] Changed implementation of HistoryServer.getApplicationInfoList for lazy evaluation #15303
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Can one of the admins verify this patch? |
|
@srowen @ajbozarth I created this PR without adding any new API. Just rewrote the way getApplicationList constructing the iterator. Can you guys take a look? Thanks! |
|
See my comments on the previous PR. I think this is much more simply handled by returning an Iterator over apps from the beginning, calling application.values.iterator instead of application.values and returning that up the stack. |
|
|
||
| def getApplicationInfoList: Iterator[ApplicationInfo] = { | ||
| getApplicationList().iterator.map(ApplicationsListResource.appHistoryInfoToPublicAppInfo) | ||
| new Iterator[ApplicationInfo] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to do this at all. Just call .iterator.map(...) but that's what it already did.
| attempt.startTime.getTime <= maxDate.timestamp | ||
| val numApps = Option(limit).getOrElse(Integer.MAX_VALUE).asInstanceOf[Int] | ||
|
|
||
| new Iterator[ApplicationInfo] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, this is entirely fine as-is, filtering on an iterator
|
@srowen Thank you so much for spending time on this. Now I think it is OK to leave it unchanged. I'm inclined to close this PR unless you think the minor code change makes sense. |
|
@ajbozarth Yup. This is truly a good learning process to me. Really appreciate the patient help from you and @srowen |
srowen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now this change doesn't really do anything, just moves the conditional logic about the limit around. Hm, it seemed like this was pretty fine and solvable by operating on an iterator over the applications? is it hard and/or not worth it really?
| } | ||
| } | ||
|
|
||
| val numApps = Option(limit).getOrElse(Integer.MAX_VALUE).asInstanceOf[Int] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, this is weird because limit is a java.lang.Integer. I suggest Option(limit).map(_.toInt).getOrElse(Int.MaxValue) is a tiny bit better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup the iterator works fine if I limit the number of applications. I just updated this line in the PR for archive use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just change limit to be an Int to start, it seems that would make this easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, JAX-RS will be OK with that, but I assume that when the parameter is missing it becomes 0 or something and that won't work nicely here.
| // keep the app if *any* attempts fall in the right time window | ||
| val dateOk = app.attempts.exists { attempt => | ||
| attempt.startTime.getTime >= minDate.timestamp && | ||
| attempt.startTime.getTime <= maxDate.timestamp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be off topic but if your original concern was performance, then I'd note that there's no real need to evaluate dateOk by looking at all attempts again, if statusOk is false. Not sure it matters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Compared to the transformation from ApplicationHistoryInfo to ApplicationInfo, this one is negligible.
|
@srowen No problem. |
|
OK, I'd suggest closing this unless anyone feels strongly about implementing this change too, but I think it doesn't have any functional impact. |
|
It's cleaner, but I have no strong opinions |
|
OK I ported a version of all the discussed changes to the new PR. It should be cleaner still, and possibly a tiny bit faster, which seems reasonable if the point of this PR is to optimize this path |
Closes apache#15303 Closes apache#15078 Closes apache#15080 Closes apache#15135 Closes apache#14565 Closes apache#12355 Closes apache#15404
Closes apache#15303 Closes apache#15078 Closes apache#15080 Closes apache#15135 Closes apache#14565 Closes apache#12355 Closes apache#15404 Author: Sean Owen <sowen@cloudera.com> Closes apache#15451 from srowen/CloseStalePRs.
What changes were proposed in this pull request?
Changed implementation of HistoryServer.getApplicationInfoList() to lazily evaluation the time-consuming transformation. In this way, ApplicationListResource can return an iterator without evaluating the whole input iterator.
How was this patch tested?
No API added. Just manual unit tests.