fix(backend): harden statistics rpc retries #1930

Merged
riderx merged 4 commits into main from codex/statistics-rpc-retries
Apr 22, 2026

Conversation

@riderx riderx commented Apr 21, 2026

Summary (AI generated)

  • retry transient 5xx/PostgREST failures when statistics endpoints load app ownership, metrics, and storage totals
  • return app_not_found instead of falling through to a generic app statistics failure when the app lookup returns no row
  • add focused unit coverage for the statistics retry helpers and missing-app lookup path
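
As a rough illustration of the retry behavior summarized above, here is a minimal sketch of a retry wrapper with exponential backoff. It is not the PR's implementation: the names `withRetry`, `isRetryable`, `RetryOptions`, the `QueryResult` shape, and the default attempt count and delay are all assumptions made for this example.

```typescript
// Hypothetical result shape; the PR's real types may differ.
interface QueryResult<T> {
  data: T | null
  error: { message: string, status?: number } | null
  status?: number
}

// Hypothetical options; the defaults below are illustrative only.
interface RetryOptions {
  maxAttempts: number
  baseDelayMs: number
}

// Treat any failed result with an HTTP 5xx status as transient.
function isRetryable<T>(result: QueryResult<T>): boolean {
  const status = result.status ?? result.error?.status
  return result.error !== null && typeof status === 'number' && status >= 500 && status <= 599
}

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms))

// Runs `query` until it succeeds, fails with a non-retryable error, or attempts run out.
async function withRetry<T>(
  query: () => Promise<QueryResult<T>>,
  options: RetryOptions = { maxAttempts: 3, baseDelayMs: 100 },
): Promise<QueryResult<T>> {
  let result = await query()
  for (let attempt = 1; attempt < options.maxAttempts && isRetryable(result); attempt++) {
    // Exponential backoff: baseDelayMs, then 2x, then 4x, and so on.
    await sleep(options.baseDelayMs * 2 ** (attempt - 1))
    result = await query()
  }
  return result
}
```

With this shape, a 502 on the first attempt is retried after a short wait, while a 400 is returned to the caller immediately, which mirrors the "retries on 502, not on 400" behavior the tests in this PR are described as covering.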

Motivation (AI generated)

The PostHog backend error stream shows live statistics failures (Cannot get organization statistics) and older, matching app statistics failures. Those routes treated transient upstream query failures as hard failures and ignored one app lookup error path entirely.

Business Impact (AI generated)

This reduces noisy backend alerts and prevents users from seeing intermittent statistics failures when the underlying PostgREST layer briefly returns 5xx errors. It also makes missing-app cases fail with a clearer API response instead of a generic server error.

Test Plan (AI generated)

  • bunx vitest run tests/statistics-retries.unit.test.ts
  • bunx eslint supabase/functions/_backend/public/statistics/index.ts tests/statistics-retries.unit.test.ts
  • bun typecheck
  • GitHub CI

Generated with AI

Summary by CodeRabbit

  • Bug Fixes

    • Added automatic retry/backoff with centralized logging for statistics queries to recover from transient failures.
    • Improved app-owner lookup and error handling, distinguishing missing apps (now returned as 404) from server errors.
    • Fixed storage-size handling so null values are treated as zero.
  • Tests

    • Added unit tests covering retry behavior, owner-lookup not-found detection, and missing-app error selection.

coderabbitai Bot commented Apr 21, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 76595e46-3d90-4ba9-97cc-cba13adb5e85

📥 Commits

Reviewing files that changed from the base of the PR and between 7705228 and 165b97c.

📒 Files selected for processing (2)
  • supabase/functions/_backend/public/statistics/index.ts
  • tests/statistics-retries.unit.test.ts
📝 Walkthrough

Adds a retry/backoff wrapper for statistics Supabase queries, unifies app→owner_org resolution with retry and not-found signaling, normalizes null storage sizes to 0, exposes statistics test utilities, and adds unit tests for retry and owner-resolution behaviors.
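
The null-to-zero storage normalization mentioned here can be pictured with a very small sketch. The helper name `normalizeStorageBytes` is hypothetical, and the PR's actual code for this is not shown in the thread; this only illustrates the idea of treating a missing total as 0.

```typescript
// Hypothetical helper: a storage-total query may yield null when there is nothing to sum.
function normalizeStorageBytes(value: number | null | undefined): number {
  // `??` only replaces null and undefined, so a real value (including 0) passes through.
  return value ?? 0
}

const currentStorageBytes = normalizeStorageBytes(null)
```

Using `??` rather than `||` keeps the intent narrow: only absent values are replaced, and no other falsy value is silently rewritten.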

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Statistics core: supabase/functions/_backend/public/statistics/index.ts | Introduced executeStatsQueryWithRetry, getRetryableStatus, isRetryableStatsError, getMissingAppStatsError; added resolveAppOwnerOrg and getStatsAppOwnerOrgOrThrow for app→owner_org lookup with retry and structured not-found; replaced direct RPC/.single() calls with retry-wrapped executions; normalized currentStorageBytes null → 0; exported statisticsTestUtils. |
| Tests: tests/statistics-retries.unit.test.ts | New Vitest tests validating retry behavior (retries on 502, not on 400), resolveAppOwnerOrg not-found handling for { data: null, error: null }, and selection of missing-app error from aggregated errors. |

Sequence Diagram

```mermaid
sequenceDiagram
    autonumber
    actor Client
    participant Handler as "Statistics Handler"
    participant Retry as "Retry Wrapper\n(executeStatsQueryWithRetry)"
    participant Supabase as "Supabase RPC/DB"

    Client->>Handler: GET /statistics/app/:app_id
    Handler->>Retry: request metrics RPC (app/org), storage RPC, owner lookup
    Retry->>Supabase: execute RPC / query
    Supabase-->>Retry: { error: 502 }
    Note right of Retry: detect retryable error\napply backoff & retry
    Retry->>Retry: wait (exponential backoff)
    Retry->>Supabase: retry RPC / query
    Supabase-->>Retry: { data: ... , error: null }
    Retry-->>Handler: consolidated results (metrics, storage normalized, ownerOrg)
    Handler-->>Client: HTTP 200 with statistics
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through logs with careful beats,
Retry and backoff beneath my feets.
If 502s try to make me stop,
I hop again until they drop.
Cheers — a rabbit clap for stable ops! 🥕

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The pull request description includes a comprehensive summary of changes, motivation, business impact, and test plan. However, it does not follow the repository's required template structure with explicit sections for Summary, Test plan, Screenshots, and Checklist. | Restructure the description to match the repository template: add explicit 'Summary', 'Test plan', and 'Checklist' sections with the required format and checkbox items from the template. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The pull request title clearly and concisely summarizes the main change: hardening statistics RPC retry logic with transient failure handling. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


codspeed-hq Bot commented Apr 21, 2026

Merging this PR will not alter performance

✅ 28 untouched benchmarks


Comparing codex/statistics-rpc-retries (165b97c) with main (a221796)

@riderx riderx marked this pull request as ready for review April 21, 2026 17:38
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/statistics-retries.unit.test.ts (1)

8-44: These unit tests can run with it.concurrent().

They are independent and only assert local mocks/results, so keeping them serial just slows the file down.

♻️ Suggested change
```diff
 describe('statistics retry helpers', () => {
-  it('retries transient statistics query failures and returns the recovered result', async () => {
+  it.concurrent('retries transient statistics query failures and returns the recovered result', async () => {
     // ...
   })

-  it('does not retry non-retryable statistics query failures', async () => {
+  it.concurrent('does not retry non-retryable statistics query failures', async () => {
     // ...
   })

-  it('marks missing apps as not found when the lookup returns no rows', async () => {
+  it.concurrent('marks missing apps as not found when the lookup returns no rows', async () => {
     // ...
   })
 })
```

As per coding guidelines tests/**/*.{ts,js}: Use it.concurrent() instead of it() when possible to run tests in parallel within the same file, maximizing parallelism for faster CI/CD.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/statistics-retries.unit.test.ts` around lines 8 - 44, Change the three
serial tests in the "statistics retry helpers" suite to run concurrently by
replacing it(...) with it.concurrent(...); specifically update the tests that
call statisticsTestUtils.executeStatsQueryWithRetry (the two cases using the
mocked query) and the test that calls statisticsTestUtils.resolveAppOwnerOrg
(the supabase maybeSingle mock) so they use it.concurrent to allow parallel
execution without changing test logic or mocks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@supabase/functions/_backend/public/statistics/index.ts`:
- Around line 195-201: The missing-app 404 from resolveAppOwnerOrg is being
swallowed by the aggregator that collapses every stat.error into a generic
cannot_get_user_statistics; update the aggregator that processes stat.error (the
code path that currently converts all errors into cannot_get_user_statistics for
the /user statistics path) to detect and propagate the not-found/discriminated
error produced by resolveAppOwnerOrg (e.g., the error object with error:
'app_not_found' or a status: 404) instead of turning it into a 500; ensure
ownerOrgId resolution code (resolveAppOwnerOrg) can return a discriminated
result and that the aggregator checks stat.error.status === 404 or
stat.error.error === 'app_not_found' and returns that 404-shaped response up the
stack so callers of the /user endpoint receive the correct not-found result
rather than a generic cannot_get_user_statistics.
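
To make the requested aggregator change concrete, here is a minimal sketch of selecting a missing-app error out of several per-app results instead of collapsing everything into a generic failure. The types `StatsError` and `StatResult` and the function `pickAggregateError` are hypothetical; only the `app_not_found` and `cannot_get_user_statistics` error codes come from this thread, and the 500 status for the generic case is an assumption.

```typescript
// Hypothetical shapes for per-app statistics results.
interface StatsError {
  error: string
  status: number
  app_id?: string
}

interface StatResult {
  data: unknown
  error: StatsError | null
}

// Returns null when every stat succeeded, the app_not_found error when one is present,
// and a generic failure otherwise.
function pickAggregateError(stats: StatResult[]): StatsError | null {
  const errors = stats
    .map(stat => stat.error)
    .filter((err): err is StatsError => err !== null)
  if (errors.length === 0)
    return null
  const missing = errors.find(err => err.error === 'app_not_found')
  // Assumed fallback: the generic code from this thread with a 500 status.
  return missing ?? { error: 'cannot_get_user_statistics', status: 500 }
}
```

The design choice sketched here is that a not-found result takes priority over other failures, so the caller can answer with a 404-shaped response whenever a missing app is the cause.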


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c0334c59-3210-4345-8d3e-cdfc3e2d8652

📥 Commits

Reviewing files that changed from the base of the PR and between a221796 and 286a2f0.

📒 Files selected for processing (2)
  • supabase/functions/_backend/public/statistics/index.ts
  • tests/statistics-retries.unit.test.ts

Comment thread supabase/functions/_backend/public/statistics/index.ts
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 77052282e1

Comment thread supabase/functions/_backend/public/statistics/index.ts Outdated
Comment thread supabase/functions/_backend/public/statistics/index.ts Outdated
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@supabase/functions/_backend/public/statistics/index.ts`:
- Around line 100-107: The getMissingAppStatsError detector currently treats any
404-shaped error as an app-missing signal; tighten it to match the synthetic app
error shape by returning true only when appError.error === 'app_not_found' OR
when appError.status === 404 AND typeof (appError as any).app_id === 'string'.
Update the getMissingAppStatsError function accordingly and make the identical
fix at the other occurrence that uses the same 404-based check (the similar
block around lines 828-830) so both places only treat the specific app-shaped
error as "app not found".
- Around line 55-58: The QueryResult<T> type only captures data and error so
HTTP status is lost; add a top-level status?: number | null to the
QueryResult<T> interface and update the retry logic (the function(s) referencing
getRetryableStatus or checking error.status) to inspect the response.status (not
error.status) for 5xx detection; ensure any function that returns or forwards
Supabase/PostgREST responses (the code paths that construct QueryResult from
Supabase responses) include the response.status in the QueryResult so retry
checks can accurately detect 5xx HTTP errors.
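
The two suggestions above can be sketched together as follows. This is only an illustration under stated assumptions: `retryableStatusOf` and `looksLikeMissingApp` are hypothetical stand-ins (not the PR's getRetryableStatus or getMissingAppStatsError), and the `QueryResult` shape is a guess based on the review text.

```typescript
// Assumed shape: `status` is carried at the top level so the HTTP status survives.
interface QueryResult<T> {
  data: T | null
  error: unknown
  status?: number | null
}

// Returns the status when it is a 5xx, otherwise null. Reads the response status,
// not a status nested inside the error object.
function retryableStatusOf(result: QueryResult<unknown>): number | null {
  const status = result.status
  return typeof status === 'number' && status >= 500 && status < 600 ? status : null
}

// Matches only the app-shaped error: an explicit app_not_found code,
// or a 404 that also carries a string app_id.
function looksLikeMissingApp(err: unknown): boolean {
  if (typeof err !== 'object' || err === null)
    return false
  const candidate = err as Record<string, unknown>
  return candidate.error === 'app_not_found'
    || (candidate.status === 404 && typeof candidate.app_id === 'string')
}
```

Under this sketch a bare `{ status: 404 }` from some unrelated lookup is not treated as a missing app, which is the narrowing the review comment asks for.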

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: caea3e01-5976-47b4-b056-c88a9b2e33f5

📥 Commits

Reviewing files that changed from the base of the PR and between 8c3fae9 and 7705228.

📒 Files selected for processing (2)
  • supabase/functions/_backend/public/statistics/index.ts
  • tests/statistics-retries.unit.test.ts
✅ Files skipped from review due to trivial changes (1)
  • tests/statistics-retries.unit.test.ts

Comment thread supabase/functions/_backend/public/statistics/index.ts
Comment thread supabase/functions/_backend/public/statistics/index.ts
@riderx riderx merged commit 4299b0f into main Apr 22, 2026
15 checks passed
@riderx riderx deleted the codex/statistics-rpc-retries branch April 22, 2026 09:28