perf(github): Cache accessible repos for accessibleOnly search#112548
Draft
When accessibleOnly=true with a search query, the old path fetched all installation repos (up to 5,000) on every debounced keystroke, then filtered with a Python list comprehension. Replace this with a cached set of accessible repo IDs (5-min Redis TTL) combined with the GitHub Search API, reducing each typed query from O(pages) API calls to a single search call plus a Redis lookup. Refs VDY-68
…h API Switch from Search API + cached ID set to caching the full repo list and filtering locally. This avoids the Search API's shared 30 req/min rate limit and uses sentry.cache.default_cache (Redis-backed) instead of django.core.cache (DummyCache in Sentry). Refs VDY-68
Keep the cached repo list unfiltered so the cache is a faithful snapshot of the GitHub API response. Apply the archived filter in get_repositories alongside the other transforms. Also let the accessible_only path handle both with and without a query. Refs VDY-68
The Search API does not return archived repos, so the archived filter should only apply to the /installation/repositories paths.
Move no-query path first since accessible_only is only useful with a query (repeated keystrokes). Combine archived and query filters into a single pass through to_repo_info.
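As a rough illustration of the single-pass filtering described in this commit, the sketch below applies the archived check and the query check in one loop while converting each entry. The field names (`name`, `full_name`, `default_branch`, `archived`), the output shape of `to_repo_info`, and the `filter_repos` helper are assumptions made for the example; the PR does not show the real signatures.

```python
from typing import Any, Dict, Iterable, List, Optional


def to_repo_info(raw: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical output shape; the real helper's fields are not shown in the PR.
    return {
        "name": raw["name"],
        "identifier": raw["full_name"],
        "default_branch": raw.get("default_branch"),
    }


def filter_repos(
    repos: Iterable[Dict[str, Any]],
    query: Optional[str] = None,
    include_archived: bool = False,
) -> List[Dict[str, Any]]:
    """Apply the archived filter and the query filter in one pass."""
    needle = query.lower() if query else None
    result = []
    for repo in repos:
        # Skip archived repos unless the caller asked for them.
        if not include_archived and repo.get("archived"):
            continue
        # Case-insensitive substring match on the full name (assumed matching rule).
        if needle is not None and needle not in repo["full_name"].lower():
            continue
        result.append(to_repo_info(repo))
    return result


sample = [
    {"name": "sentry", "full_name": "acme/sentry", "default_branch": "master", "archived": False},
    {"name": "old-sentry", "full_name": "acme/old-sentry", "default_branch": "main", "archived": True},
    {"name": "docs", "full_name": "acme/docs", "default_branch": "main", "archived": False},
]
matches = filter_repos(sample, query="Sentry")
```

With the sample data, only `acme/sentry` survives: `acme/old-sentry` matches the query but is archived, and `acme/docs` does not match.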
Strip raw GitHub repo dicts down to the 5 fields used by get_repositories before storing in the cache. Reduces per-integration cache size from ~3KB per repo to ~100 bytes.
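A minimal sketch of trimming a raw repo dict before caching it. The commit says five fields are kept but does not name them, so the `CACHED_FIELDS` tuple and the `compact_repo` helper here are assumptions chosen for illustration.

```python
import json
from typing import Any, Dict

# Assumed field list; the PR only states that 5 fields are kept.
CACHED_FIELDS = ("id", "name", "full_name", "default_branch", "archived")


def compact_repo(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the fields needed downstream; drop everything else."""
    return {field: raw.get(field) for field in CACHED_FIELDS}


# A truncated stand-in for a GitHub API repo object, which carries many more keys.
raw_repo = {
    "id": 1,
    "name": "sentry",
    "full_name": "acme/sentry",
    "default_branch": "master",
    "archived": False,
    "owner": {"login": "acme", "id": 99, "avatar_url": "https://example.invalid/a.png"},
    "description": "Example repository used only for this sketch",
    "html_url": "https://example.invalid/acme/sentry",
    "permissions": {"admin": False, "push": True, "pull": True},
}

compact = compact_repo(raw_repo)
raw_size = len(json.dumps(raw_repo))
compact_size = len(json.dumps(compact))
```

Comparing `raw_size` with `compact_size` gives a quick feel for the saving on a given payload; the ~3KB and ~100 byte figures in the commit message refer to real GitHub responses, not this shortened sample.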
getsentry configures CACHES with memcached in production, so django.core.cache.cache works and matches the pattern used by the rest of the integrations codebase.
Summary
OrganizationIntegrationReposEndpoint (/integrations/{id}/repos/) lets the frontend search for GitHub repos available to a GitHub App installation. When called with accessibleOnly=true and a search query (as the SCM onboarding repo selector does on each debounced keystroke), the previous implementation fetched all installation-accessible repos from the GitHub API (up to 50 pages of 100 = 5,000 repos) on every request, then filtered with a Python list comprehension.
Cache the accessible repo list in sentry.cache.default_cache (Redis) for 5 minutes, and filter locally on subsequent requests, reducing each typed query from O(pages) GitHub API calls to zero on a cache hit.
Test plan
- get_repositories tests pass (6/6)
- test_get_repositories_accessible_only_caches_repos verifies cache hit path skips /installation/repositories calls
- accessibleOnly search returns instantly from cache
Refs VDY-68
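The cache-hit behaviour that the test plan checks can be sketched as a cache-aside lookup with a 5-minute TTL. This is an illustration under stated assumptions, not the PR's code: `InMemoryCache` is a stand-in exposing `get(key)` and `set(key, value, timeout)` in the style of Django's cache API, and the key format, `get_accessible_repos`, and `fetch_all` are hypothetical names.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

CACHE_TTL_SECONDS = 300  # 5 minutes, as stated in the PR description


class InMemoryCache:
    """Stand-in for a cache backend with Django-style get/set (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._store: Dict[str, Tuple[Any, float]] = {}
        self._clock = clock

    def get(self, key: str, default: Any = None) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry has outlived its TTL; treat it as a miss.
            del self._store[key]
            return default
        return value

    def set(self, key: str, value: Any, timeout: float) -> None:
        self._store[key] = (value, self._clock() + timeout)


def get_accessible_repos(
    cache: InMemoryCache,
    integration_id: int,
    fetch_all: Callable[[], List[Dict[str, Any]]],
) -> List[Dict[str, Any]]:
    """Return the cached repo list, calling fetch_all only on a cache miss."""
    key = f"github:accessible-repos:{integration_id}"  # hypothetical key format
    repos = cache.get(key)
    if repos is None:
        repos = fetch_all()  # stands in for paging /installation/repositories
        cache.set(key, repos, CACHE_TTL_SECONDS)
    return repos
```

Each keystroke within the TTL window then reuses the cached list and filters it locally, so only the first request per integration (or the first after expiry) pays for the paginated fetch.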