
Conversation

Contributor

@Serj-N Serj-N commented Aug 5, 2025

PR Rationale

The Course Info Blocks API endpoint has been known to respond slowly. Previous investigation showed that the major time sink was the get_course_blocks function, which is called three times in a single request. This PR aims to improve response times by reducing the number of times that function is called.

Solution Summary

The first time get_course_blocks is called, the result (the transformed course blocks) is stored on the current WSGI request object. Later in the same request, before the second get_course_blocks call is triggered, the already transformed course blocks are retrieved from the request object; if they are available, get_course_blocks is not called (if not, it is called as a fallback). Later still, the function is called a third time as before (see Optimization Strategy and Difficulties).
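The reuse pattern described above can be sketched as follows. This is a minimal illustration, not the actual edx-platform code: FakeRequest, get_blocks_first_call, and get_blocks_second_call are hypothetical names, and the real implementation obtains the request via get_current_request.

```python
# Sketch of request-scoped reuse: compute once, stash on the request,
# fall back to recomputing if nothing was stashed.

class FakeRequest:
    """Stand-in for a WSGI/Django request object."""

def get_course_blocks(course_key):
    # Stand-in for the expensive call (collection + transformation).
    return {"course": course_key, "blocks": ["block1", "block2"]}

def get_blocks_first_call(request, course_key):
    blocks = get_course_blocks(course_key)
    # Store the transformed blocks for reuse later in the same request.
    request.reusable_transformed_blocks = blocks
    return blocks

def get_blocks_second_call(request, course_key):
    # Reuse if available; otherwise fall back to the expensive call.
    cached = getattr(request, "reusable_transformed_blocks", None)
    return cached if cached is not None else get_course_blocks(course_key)
```

Because the stash lives on the request object, it naturally disappears when the request ends; the PR later moves this to a RequestCache, which has the same lifetime but a cleaner API.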

Optimization Strategy and Difficulties

The original idea was to fetch and transform the course blocks once and reuse them in all three cases, which would reduce the get_course_blocks call count to 1. However, this turned out not to be viable because of the arguments passed to get_course_blocks. Notably, the allow_start_dates_in_future boolean flag affects the behavior of StartDateTransformer, a filtering transformer that modifies the returned block structure.
For the first two calls allow_start_dates_in_future is False; for the third it is True. Setting it to True in all three cases would mean that some blocks would be incorrectly included in the response.

This left us with one option: optimize the first two calls. The difference between the first two calls lies in the non-filtering transformers; however, the second call applies a subset of the transformers from the first call, so it was safe to apply the superset in both cases. This allowed us to reduce the number of function calls to 2. However, the cached structure may be further mutated by filters downstream, which means we need to cache a copy of the course structure (not the structure itself). The copy method itself is quite heavy (it calls deepcopy three times), making the benefits of this solution much less tangible. Another potential optimization that was considered was to reuse the collected block structure (pre-transformation), but since calling copy on a collected structure proved more time-consuming than calling get_collected, that change was discarded, given that the goal is to improve performance.

UPD: Revised Solution

To achieve a more tangible performance improvement, it was decided to modify the previous strategy as follows:

  • Pass a for_blocks_view parameter to the get_blocks function so that the new caching logic only affects the blocks view
  • Collect and cache the course blocks with future-dated blocks included
  • Include the start key in the requested fields
  • Reuse the cached blocks in the third call, which happens in get_course_assignments
  • Before returning the response, filter out any blocks with a future start date, and remove the start key if it was not among the requested fields
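The post-processing in the last two bullets could look roughly like this. It is a sketch under assumed data shapes (blocks as a flat dict of serialized fields, with ISO-8601 start strings), and filter_future_blocks is a hypothetical name, not the PR's actual function.

```python
from datetime import datetime, timezone

def filter_future_blocks(blocks, requested_fields):
    """Drop blocks whose start date is in the future, fix up parents'
    'children' lists, and strip 'start' if the caller never asked for it.
    Sketch only: assumes every block carries an ISO-8601 'start' string."""
    now = datetime.now(timezone.utc)
    to_remove = set()
    for block_id, block in blocks.items():
        start = block.get("start")
        if start and datetime.fromisoformat(start.replace("Z", "+00:00")) > now:
            to_remove.add(block_id)
    # Remove the future-dated blocks themselves.
    for block_id in to_remove:
        del blocks[block_id]
    # Also remove references to them from parents' 'children' lists.
    for block in blocks.values():
        children = block.get("children")
        if children:
            block["children"] = [cid for cid in children if cid not in to_remove]
    # Strip 'start' if the caller did not request it.
    if "start" not in requested_fields:
        for block in blocks.values():
            block.pop("start", None)
    return blocks
```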

Response Times Measured

Here is a sample of the response times (in seconds).
The measurements were taken on a course with ~3000 blocks.

With the revised solution
Current time: ~12 seconds
New time: ~8 seconds
Improvement: ~4 seconds (~30%)

get_course_blocks_highlighted_big_course


openedx-webhooks commented Aug 5, 2025

Thanks for the pull request, @Serj-N!

This repository is currently maintained by @openedx/wg-maintenance-edx-platform.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.
🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads
🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Aug 5, 2025
@github-project-automation github-project-automation bot moved this to Needs Triage in Contributions Aug 5, 2025
@mphilbrick211 mphilbrick211 moved this from Needs Triage to Waiting on Author in Contributions Aug 5, 2025
@mphilbrick211 mphilbrick211 added the FC Relates to an Axim Funded Contribution project label Aug 5, 2025

# store transformed blocks in the current request to be reused where possible for optimization
if current_request := get_current_request():
    setattr(current_request, "reusable_transformed_blocks", blocks)  # pylint: disable=literal-used-as-attribute
Contributor

Please use a RequestCache here, instead of setting this on the request object.
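The suggested RequestCache is a namespaced, request-scoped key-value store from edx-django-utils (edx_django_utils.cache). A self-contained stand-in showing the same set / get_cached_response call pattern (the classes below are toy versions for illustration, not the real library):

```python
class CachedResponse:
    """Mimics the hit/miss wrapper returned by get_cached_response."""
    def __init__(self, is_found, value=None):
        self.is_found = is_found
        self.value = value

class RequestCache:
    """Toy stand-in for edx_django_utils.cache.RequestCache. The real one
    is cleared between requests; this one just namespaces a process-local
    dict to demonstrate the call pattern."""
    _store = {}

    def __init__(self, namespace):
        self.namespace = namespace

    def set(self, key, value):
        self._store[(self.namespace, key)] = value

    def get_cached_response(self, key):
        try:
            return CachedResponse(True, self._store[(self.namespace, key)])
        except KeyError:
            return CachedResponse(False)

# Usage mirroring the pattern the PR ends up with:
request_cache = RequestCache("course_api")
request_cache.set("reusable_blocks", {"b1": {}})
cached = request_cache.get_cached_response("reusable_blocks")
blocks = cached.value if cached.is_found else None
```

Compared with setattr on the request object, this avoids attribute-name collisions and gives an explicit miss/hit API instead of getattr with a default.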

Contributor

@ormsbee reminded me of a presentation he had given some time ago that's still relevant :)


Contributor Author

@ormsbee @e0d Thanks for the tip and the chart, very useful!
Change implemented.

Contributor

Awesome. @Serj-N I notice there are some failing pylint checks, can you have a look?

Contributor Author

Sure, I fixed the pylint checks. But now I see some failing tests, I'll take a look at those, too.

Contributor Author

@ormsbee Looks like a must-watch! Many thanks!

Contributor Author

@e0d @ormsbee I revised the solution slightly, the tests pass now. A few changes were made in the PR description, as well.

Contributor

I meant to ask if you have checked that the performance numbers look the same.

Contributor Author

Indeed, I have: because of some trade-offs we had to make, the net gain in performance is less than before (compared to the original solution). I added a few details on that in the PR description.

Contributor Author

@e0d Hi! I finally got around to revising the current solution on this one to get a more tangible improvement. I updated the code as well as the description. Please take a look when you have a chance. Thanks!

@Serj-N Serj-N force-pushed the nanai/axm-2610/optimize-api branch from 6e889a8 to ebb32f3 Compare August 18, 2025 14:19
@Serj-N Serj-N force-pushed the nanai/axm-2610/optimize-api branch from ebb32f3 to 9e0d736 Compare October 9, 2025 08:02
@Serj-N Serj-N force-pushed the nanai/axm-2610/optimize-api branch from 8619d8a to 3ec750c Compare October 9, 2025 09:08
@Serj-N Serj-N marked this pull request as ready for review October 14, 2025 14:40
block_types_filter=None,
hide_access_denials=False,
allow_start_dates_in_future=False,
for_blocks_view=False,
Contributor

The name of this argument doesn't really express what it does -- to me at least. I also question the API design:

  1. Why wouldn't caching blocks be the default? Wouldn't all callers benefit?
  2. Could the caching code be abstracted from the body of the function so it's not inline with the business logic?
  3. I realize you are working around old code, but the future start date stuff feels like it could be the callers responsibility and that complexity could be removed from the getter.

Contributor Author

As for the naming, I guess we could come up with something better: cache_with_future_dates seems to reflect the intent more clearly.

As for the other questions:

  1. Making this caching the default behavior would not benefit the other callers, because they either:
  • only call get_blocks() once and never need to reuse the collected course structure; or

  • do call get_blocks() more than once, but they either pass the same arguments (benefiting from @request_cached) or pass a different user (in which case the cached structure has to be recollected from scratch).

  2. Yes, this abstraction seems reasonable, and I think it can be nicely paired with the other refactoring you suggest: creating variables for cache key names.

  3. The logic here was indeed influenced by the existing codebase in a lot of ways. And while some of the previous design choices might seem questionable, this particular decision seems to make sense: we want to get_blocks and we specify a filtering criterion, namely whether to include future start dates or not. Filtering after the fact in each caller would basically mean reproducing what the existing transformers are already designed to do. That doesn't seem like a very clean approach, and the only reason we resort to it here is the constraints (in terms of scope and performance) of this particular API view.
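The @request_cached behavior mentioned in point 1 is argument-keyed memoization scoped to a request. A rough stand-in of that idea (the real decorator lives in edx_django_utils.cache and is flushed at the end of each request; this toy version never flushes):

```python
import functools

def request_cached(func):
    """Toy version of a request-scoped memoizer: cache results by
    positional arguments, so repeat calls with the same arguments
    skip the wrapped function entirely."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper

calls = []

@request_cached
def get_blocks(user, course_key):
    calls.append((user, course_key))  # track how often the body runs
    return {"user": user, "course": course_key}

get_blocks("alice", "course-v1:X")
get_blocks("alice", "course-v1:X")   # same args: served from cache
get_blocks("bob", "course-v1:X")     # different user: recomputed
```

This is exactly why callers that repeat the same arguments already benefit, while callers that vary the user (a different cache key) do not.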

Contributor @cmltaWt0 commented Oct 16, 2025

The name of this argument doesn't really express what it does -- to me at least.

Not only for you - I've also had this question.

Contributor Author

  • Renamed the flag to cache_with_future_dates
  • Moved the logic for getting blocks from cache to utils.py

# unfiltered, course blocks from RequestCache and thus reducing the number of times that
# the get_course_blocks function is called.

request_cache = RequestCache("unfiltered_course_structure")
Contributor

As the names are used multiple times, let's create variables for them at the appropriate scope.

Contributor Author

Done, constants added to utils.py

@Serj-N Serj-N force-pushed the nanai/axm-2610/optimize-api branch 2 times, most recently from 86317fd to 25970a0 Compare October 17, 2025 15:58
@Serj-N Serj-N force-pushed the nanai/axm-2610/optimize-api branch from 25970a0 to f7bc539 Compare October 17, 2025 16:20
Contributor @e0d left a comment

Looks good with the last round of changes.

Contributor

e0d commented Oct 21, 2025

@ormsbee This has been revised since your original input. The current version looks good to me. There are some parts of the pre-existing API that I'm not crazy about, but I don't think that now is the time to fix them.

Contributor @ormsbee left a comment

Some minor requests, but mostly questions to check my understanding of edge case behavior.

# Store a copy of the transformed, but still unfiltered, course blocks in RequestCache to be reused
# wherever possible for optimization. Copying is required to make sure the cached structure is not mutated
# by the filtering below.
request_cache = RequestCache(UNFILTERED_STRUCTURE_CACHE_KEY)
Contributor

Nit: The naming of this is confusing because UNFILTERED_STRUCTURE_CACHE_KEY implies that this is a cache key as part of a key-value pairing, but UNFILTERED_STRUCTURE_CACHE_KEY is really a namespace for this particular RequestCache as a whole. The namespaces just ensure that there's no chance of collision with other apps that need RequestCache functionality, so it would be more common to make the namespace be the module or app name.

Contributor Author

Good point, changed it to COURSE_API_REQUEST_CACHE_NAMESPACE = "course_api"

- also removes references to them from parents' 'children' lists
- removes 'start' key from all blocks if it wasn't requested
"""
from datetime import datetime, timezone
Contributor

There's no need for this to be a function-local import, is there?

Contributor Author

There isn't, indeed; I moved the import to module level.

Comment on lines +382 to +385
for block in course_blocks.values():
children = block.get("children")
if children:
block["children"] = [cid for cid in children if cid not in to_remove]
Contributor

Just to check my understanding: Is it the case that it's okay to do this simple child removal (and not go down into further descendants) because all the inheritance has already been pre-computed, and the start attribute has been denormalized and put on all the nodes?

Contributor Author

Yes, this is correct: by this point, StartDateTransformer has traversed the structure and resolved the start dates, and BlockSerializer has put the start key on all the returned blocks.
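This can be illustrated with a toy flat block map where inheritance has already been denormalized (the block IDs and dates below are made up for the demo):

```python
from datetime import datetime, timezone

# Hypothetical flat block map; every block already carries its own
# resolved "start" (inheritance precomputed by the transformers).
blocks = {
    "chapter1": {"start": "2020-01-01T00:00:00+00:00", "children": ["seq1", "seq2"]},
    "seq1": {"start": "2020-01-01T00:00:00+00:00", "children": []},
    "seq2": {"start": "2999-01-01T00:00:00+00:00", "children": ["unit1"]},
    # unit1 inherited seq2's future start, already written onto it:
    "unit1": {"start": "2999-01-01T00:00:00+00:00", "children": []},
}

now = datetime.now(timezone.utc)
to_remove = {
    block_id for block_id, block in blocks.items()
    if datetime.fromisoformat(block["start"]) > now
}
for block_id in to_remove:
    del blocks[block_id]
# One level of child pruning suffices: unit1 was removed by its own
# start check, not by walking down from seq2.
for block in blocks.values():
    block["children"] = [cid for cid in block["children"] if cid not in to_remove]
```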

Comment on lines +371 to +372
if start and start > now:
to_remove.add(block_id)
Contributor

It seems like this function is ignoring days_early_for_beta for content beta testers. Is that accounted for somewhere else?

Contributor Author

At this point, we are dealing with the start dates that have been computed by StartDateTransformer - which is where the logic concerning days_early_for_beta lives.

# wherever possible for optimization. Copying is required to make sure the cached structure is not mutated
# by the filtering below.
request_cache = RequestCache(UNFILTERED_STRUCTURE_CACHE_KEY)
request_cache.set(REUSABLE_BLOCKS_CACHE_KEY, blocks.copy())
Contributor

We're doing a copy on the way in here, but we also have to worry about people mutating what they get back from get_cached_transformed_blocks(), right? Will that be a problem?

Contributor Author

That is a reasonable concern, but once again, we need to take into account whether this affects performance. And unfortunately it does: BlockStructureBlockData.copy() is a heavy operation, and calling it on cache retrieval will add several seconds to response time, negating the optimization we're looking for.
Luckily, in our current flow the retrieved structure is only used for reading, so there is no actual need to make a copy on the way out of the cache.

Contributor

If it's going to severely impact performance, then it's okay to have those kinds of side-effects, though we should put comments in the docstring of get_cached_transformed_blocks() that makes it clear that people should not mutate what they get back. This whole optimization is a trade-off between performance and long-term maintainability, and we will need to fix this at a more fundamental level.
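A docstring along these lines makes the read-only contract explicit. This is a sketch: only the function name comes from the PR discussion, and the request_cache parameter is an assumption about how it might be wired.

```python
def get_cached_transformed_blocks(request_cache, key):
    """Return the transformed course blocks cached earlier in this request,
    or None on a cache miss.

    WARNING: for performance reasons this returns the cached structure
    itself, NOT a copy (BlockStructureBlockData.copy() is expensive).
    Callers must treat the returned structure as read-only; mutating it
    would corrupt the value seen by later readers in the same request.
    """
    cached = request_cache.get_cached_response(key)
    return cached.value if cached.is_found else None
```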

Contributor Author

Yes, I added a word of caution to the function's docstring.

"""
request_cache = RequestCache(COURSE_API_REQUEST_CACHE_NAMESPACE)
cached_response = request_cache.get_cached_response(REUSABLE_BLOCKS_CACHE_KEY)
reusable_transformed_blocks = cached_response.value.copy() if cached_response.is_found else None
Contributor

If you added the copy() here because of my comment, it's okay to just return what you had before. Please just make sure to note it in the docstring.

Contributor Author @Serj-N commented Oct 30, 2025

Done, .copy() reverted back, docstring updated.

Contributor @ormsbee left a comment

Minor request to add something to the docstring. You don't have to do the extra-defensive copy if it's going to have a negative impact on performance. I can merge this in the morning once those changes are done (or @e0d can if he gets to it before me). Thank you.

Contributor Author

Serj-N commented Oct 30, 2025

@ormsbee I implemented the requested changes (reverted .copy() and updated docstring). Thank you for the review!

@ormsbee ormsbee merged commit 7cd4170 into openedx:master Oct 30, 2025
49 checks passed
@github-project-automation github-project-automation bot moved this from Waiting on Author to Done in Contributions Oct 30, 2025
ormsbee pushed a commit to ormsbee/edx-platform that referenced this pull request Oct 30, 2025
ormsbee pushed a commit that referenced this pull request Oct 30, 2025
haftamuk pushed a commit to haftamuk/edx-platform that referenced this pull request Nov 3, 2025
asadali145 added a commit to mitodl/edx-platform that referenced this pull request Nov 20, 2025
Contributor

pdpinch commented Nov 20, 2025

FYI, we're encountering issues with persistent grades and we've opened #37661 to revert this to buy some time to figure out what's going on.

blarghmatey pushed a commit to mitodl/edx-platform that referenced this pull request Nov 20, 2025
ormsbee pushed a commit that referenced this pull request Nov 20, 2025
ormsbee pushed a commit to ormsbee/edx-platform that referenced this pull request Dec 10, 2025
ormsbee pushed a commit that referenced this pull request Dec 10, 2025
blarghmatey pushed a commit to mitodl/edx-platform that referenced this pull request Dec 18, 2025