status: limit problem ranges request to improve performance #158912
+413
−70
On large clusters, requesting all ranges just to compute the problem range data is wasteful and can take a very long time. This change introduces a new, more narrowly scoped RPC endpoint that the existing problem ranges RPC can use to fetch only what it needs from each node.
The new endpoint, ProblemRangesLocal, retrieves only the problem range counters for the local node, eliminating the need to fetch all ranges and filter them at the gateway.
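For reviewers who want the shape of the approach without reading the diff, here is a minimal, self-contained sketch of the fan-out pattern described above. It is not the code in this PR: `rangeInfo`, `localProblems`, `problemRangesLocal`, and `problemRanges` are hypothetical stand-ins, the two problem categories are illustrative, and a plain map of in-memory nodes replaces the real RPC plumbing.

```go
package main

import "sync"

// rangeInfo is a hypothetical, simplified view of one range on a node.
type rangeInfo struct {
	ID              int64
	Unavailable     bool
	Underreplicated bool
}

// localProblems is a hypothetical per-node response that carries only
// problem ranges, never the node's full range list.
type localProblems struct {
	Unavailable     []int64
	Underreplicated []int64
}

// problemRangesLocal stands in for the node-local endpoint: it filters on
// the node itself, so healthy ranges never leave the node.
func problemRangesLocal(ranges []rangeInfo) localProblems {
	var resp localProblems
	for _, r := range ranges {
		if r.Unavailable {
			resp.Unavailable = append(resp.Unavailable, r.ID)
		}
		if r.Underreplicated {
			resp.Underreplicated = append(resp.Underreplicated, r.ID)
		}
	}
	return resp
}

// problemRanges stands in for the gateway: it calls every node
// concurrently and collects the already-filtered responses by node ID.
func problemRanges(nodes map[int32][]rangeInfo) map[int32]localProblems {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out = make(map[int32]localProblems, len(nodes))
	)
	for id, ranges := range nodes {
		wg.Add(1)
		go func(id int32, ranges []rangeInfo) {
			defer wg.Done()
			resp := problemRangesLocal(ranges) // an RPC in a real cluster
			mu.Lock()
			out[id] = resp
			mu.Unlock()
		}(id, ranges)
	}
	wg.Wait()
	return out
}
```

The point of the sketch is that the gateway's work and the data transferred scale with the number of problem ranges rather than with the total number of ranges in the cluster.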
Epic: None
Resolves: #121706
Release note: None
Note for reviewers:
This is an LLM-generated diff. There's some cruft in the tests that I want to clean up: they have unnecessary comments and could perhaps be combined into a single test with a bunch of cases. For now I'm mostly looking for feedback on whether this approach is a reasonable solution to the performance bottleneck on large clusters. I'll do a closer pass if this is a good idea.