@wgtmac
Member

@wgtmac wgtmac commented Sep 29, 2016

What changes were proposed in this pull request?

Changed the implementation of HistoryServer.getApplicationInfoList() to lazily evaluate the time-consuming transformation. This way, ApplicationListResource can return an iterator without evaluating the whole input iterator.

How was this patch tested?

No API added. Just manual unit tests.

@AmplabJenkins

Can one of the admins verify this patch?

@wgtmac wgtmac changed the title from "changed implementation of HistoryServer.getApplicationInfoList for lazy evaluation" to "[SPARK-17671] Changed implementation of HistoryServer.getApplicationInfoList for lazy evaluation" Sep 29, 2016
@wgtmac
Member Author

wgtmac commented Sep 29, 2016

@srowen @ajbozarth I created this PR without adding any new API. I just rewrote the way getApplicationList constructs the iterator. Could you take a look? Thanks!

@srowen
Member

srowen commented Sep 30, 2016

See my comments on the previous PR. I think this is much more simply handled by returning an Iterator over apps from the beginning, calling application.values.iterator instead of application.values and returning that up the stack.
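
To illustrate the difference between mapping `values` and mapping `values.iterator`, here is a minimal sketch. `HistoryInfo`, `PublicInfo`, `toPublic`, and the `applications` map are hypothetical stand-ins, not the actual Spark classes; the conversion counter exists only to make the evaluation strategy observable.

```scala
import scala.collection.mutable

object LazyValuesSketch {
  // Hypothetical stand-ins for ApplicationHistoryInfo and ApplicationInfo.
  final case class HistoryInfo(id: String)
  final case class PublicInfo(id: String)

  // Counts conversions so the evaluation strategy is observable.
  var conversions = 0

  // Stand-in for the expensive transformation discussed in the PR.
  def toPublic(info: HistoryInfo): PublicInfo = {
    conversions += 1
    PublicInfo(info.id)
  }

  // Hypothetical application store, keyed by application id.
  val applications: mutable.LinkedHashMap[String, HistoryInfo] =
    mutable.LinkedHashMap((1 to 1000).map(i => s"app-$i" -> HistoryInfo(s"app-$i")): _*)

  // Strict: mapping the Iterable converts every application up front.
  def eagerFirst(n: Int): List[PublicInfo] = {
    conversions = 0
    applications.values.map(toPublic).take(n).toList
  }

  // Lazy: mapping the Iterator converts only the elements that are consumed.
  def lazyFirst(n: Int): List[PublicInfo] = {
    conversions = 0
    applications.values.iterator.map(toPublic).take(n).toList
  }
}
```

In this sketch, `eagerFirst(5)` converts all 1000 entries before taking five, while `lazyFirst(5)` converts only the five that are returned.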


def getApplicationInfoList: Iterator[ApplicationInfo] = {
getApplicationList().iterator.map(ApplicationsListResource.appHistoryInfoToPublicAppInfo)
new Iterator[ApplicationInfo] {
Member

@srowen srowen Sep 30, 2016

You don't need to do this at all. Just call .iterator.map(...), which is what the code already did.

attempt.startTime.getTime <= maxDate.timestamp
val numApps = Option(limit).getOrElse(Integer.MAX_VALUE).asInstanceOf[Int]

new Iterator[ApplicationInfo] {
Member

Likewise, this is entirely fine as-is, filtering on an iterator
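
As a rough illustration of why filtering on an iterator is already lazy, the sketch below counts how often the predicate runs when only a few matches are requested. `App`, `matches`, and `firstMatching` are hypothetical names introduced for this example and are not taken from the Spark code.

```scala
object LazyFilterSketch {
  // Hypothetical, simplified application record.
  final case class App(id: Int, completed: Boolean)

  // Number of times the filter predicate has run.
  var checked = 0

  def matches(app: App): Boolean = {
    checked += 1
    app.completed
  }

  // Filter lazily, then stop pulling elements after `numApps` matches.
  def firstMatching(apps: Iterator[App], numApps: Int): List[App] =
    apps.filter(matches).take(numApps).toList
}
```

With 100 input applications and `numApps = 3`, the predicate runs only until the third match has been found; the rest of the input is never inspected.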

@wgtmac
Member Author

wgtmac commented Sep 30, 2016

@srowen Thank you so much for spending time on this. Now I think it is OK to leave it unchanged. I'm inclined to close this PR unless you think the minor code change makes sense.

@ajbozarth
Member

@wgtmac I think this small change is a better implementation of my previous PR, and as long as @srowen agrees this LGTM. (I also learned a lot about how Iterators and Views work in Scala while reviewing this.)

@wgtmac
Member Author

wgtmac commented Sep 30, 2016

@ajbozarth Yup. This has truly been a good learning process for me. I really appreciate the patient help from you and @srowen.

Member

@srowen srowen left a comment

Right now this change doesn't really do anything; it just moves the conditional logic about the limit around. Hm, it seemed like this was pretty fine and solvable by operating on an iterator over the applications? Is it hard, or not really worth it?

}
}

val numApps = Option(limit).getOrElse(Integer.MAX_VALUE).asInstanceOf[Int]
Member

Oh I see, this is weird because limit is a java.lang.Integer. I suggest Option(limit).map(_.toInt).getOrElse(Int.MaxValue) is a tiny bit better.

Member Author

Yup, the iterator works fine if I limit the number of applications. I just updated this line in the PR for archival purposes.

Member

Can't we just change limit to be an Int to start? It seems that would make this easier.

Member

Yeah, JAX-RS will be OK with that, but I assume that when the parameter is missing it becomes 0 or something and that won't work nicely here.
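
The null handling discussed in this thread can be sketched as follows. This is only an illustration: `numApps` is written here as a standalone hypothetical helper, and the assumption is that a missing query parameter arrives as a `null` `java.lang.Integer`.

```scala
object LimitSketch {
  // `limit` models a nullable java.lang.Integer, as mentioned in the review.
  // Option(null) is None, so a missing limit falls back to "no limit".
  def numApps(limit: java.lang.Integer): Int =
    Option(limit).map(_.toInt).getOrElse(Int.MaxValue)
}
```

The concern about switching to a primitive `Int` is visible here as well: if a missing parameter were to default to 0, as assumed in the comment above, `iterator.take(0)` would return no applications at all, whereas `take(Int.MaxValue)` imposes no practical limit.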

// keep the app if *any* attempts fall in the right time window
val dateOk = app.attempts.exists { attempt =>
attempt.startTime.getTime >= minDate.timestamp &&
attempt.startTime.getTime <= maxDate.timestamp
Member

This may be off topic but if your original concern was performance, then I'd note that there's no real need to evaluate dateOk by looking at all attempts again, if statusOk is false. Not sure it matters.

Member Author

I agree. Compared to the transformation from ApplicationHistoryInfo to ApplicationInfo, this one is negligible.
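
The short-circuiting suggested in this thread could look roughly like the sketch below. The `Attempt` and `App` records, the simplified `statusOk` check, and the use of plain `Long` timestamps are assumptions made for illustration; they are not the actual Spark definitions.

```scala
object ShortCircuitSketch {
  // Hypothetical, simplified attempt and application records.
  final case class Attempt(startTime: Long, completed: Boolean)
  final case class App(id: String, attempts: Seq[Attempt])

  // Counts how many attempts the date check inspects.
  var dateChecks = 0

  // Simplified status check: any attempt with the requested completion state.
  def statusOk(app: App, wantCompleted: Boolean): Boolean =
    app.attempts.exists(_.completed == wantCompleted)

  // Keep the app if any attempt falls in the time window.
  def dateOk(app: App, minDate: Long, maxDate: Long): Boolean =
    app.attempts.exists { attempt =>
      dateChecks += 1
      attempt.startTime >= minDate && attempt.startTime <= maxDate
    }

  // `&&` evaluates its right operand only when the left one is true,
  // so the date scan is skipped for applications that fail the status check.
  def keep(app: App, wantCompleted: Boolean, minDate: Long, maxDate: Long): Boolean =
    statusOk(app, wantCompleted) && dateOk(app, minDate, maxDate)
}
```

As noted in the reply above, the saving is small compared to the cost of the ApplicationHistoryInfo to ApplicationInfo transformation.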

@srowen
Member

srowen commented Sep 30, 2016

@vanzin I know you raised this issue. @wgtmac What if I pursue my idea for resolving this in a separate PR, as discussed in #15248? This change doesn't really do anything.

@wgtmac
Member Author

wgtmac commented Sep 30, 2016

@srowen No problem.

@srowen
Member

srowen commented Oct 2, 2016

OK, I'd suggest closing this unless anyone feels strongly about implementing this change too, but I think it doesn't have any functional impact.

@ajbozarth
Member

It's cleaner, but I have no strong opinions

srowen added a commit to srowen/spark that referenced this pull request Oct 2, 2016
@srowen
Member

srowen commented Oct 2, 2016

OK, I ported a version of all the discussed changes to the new PR. It should be cleaner still, and possibly a tiny bit faster, which seems reasonable if the point of this PR is to optimize this path.

srowen added a commit to srowen/spark that referenced this pull request Oct 12, 2016
@asfgit asfgit closed this in eb69335 Oct 12, 2016
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
Closes apache#15303
Closes apache#15078
Closes apache#15080
Closes apache#15135
Closes apache#14565
Closes apache#12355
Closes apache#15404

Author: Sean Owen <sowen@cloudera.com>

Closes apache#15451 from srowen/CloseStalePRs.
