Skip to content

Conversation

@OmarIthawi
Copy link
Contributor

@OmarIthawi OmarIthawi commented Oct 12, 2022

Copy link
Contributor

@shadinaif shadinaif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @OmarIthawi

@johnbaldwin
Copy link
Contributor

@OmarIthawi I'm reviewing this PR. I'm going to do some poking around. I just want to add a note of caution at this point.

Sometimes MySQL has degraded performance when using LIMIT. If you do a web search with "mysql slow limit" and "mysql slow limit 1", you should see some pages about this. My knowledge on this issue is still quite limited. The problem may "just" be with offsets, however, I'm not certain of that.

@johnbaldwin
Copy link
Contributor

johnbaldwin commented Oct 12, 2022

@OmarIthawi What do you think of skipping the initial check if there are records.

This should work. I ran it in the Django shell for dates that would return both records and return no records

val = CourseDailyMetrics.objects.filter(site=site, date_for__lte=date_for).aggregate(
    Sum('num_learners_completed'))['num_learners_completed__sum']
return val if val else 0

You probably also want to make sure that test coverage covers both cases

edit this might not work. I need to look over what's being collected so we're duplicating counts. If the 'num_learners_completed' is just for that given day, we're good. If it is a running total, then my suggestion is not going to work

edit Checked the data. It looks cumulative. Sigh. This shows I prematurely optimized. I should have retained the value just for the given day

@OmarIthawi
Copy link
Contributor Author

OmarIthawi commented Oct 13, 2022

Sometimes MySQL has degraded performance when using LIMIT. If you do a web search with "mysql slow limit" and "mysql slow limit 1", you should see some pages about this. My knowledge on this issue is still quite limited. The problem may "just" be with offsets, however, I'm not certain of that.

Thanks @johnbaldwin. Yes, I think I've had this and sadly it sometimes slows even without OFFSET. In this case we're optimizing python for sure since if qs uses to evaluate all 83k CourseDailyMetric instances, with LIMIT this is no longer the case. For SQL speed, we'll have to see in practice how this fix affects the Figures API performance.

@OmarIthawi OmarIthawi merged commit 7d6ae30 into main Oct 13, 2022
@OmarIthawi OmarIthawi deleted the optimize_cert_count branch October 13, 2022 07:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants