Skip to content

fix(manifest): check upload-encoded storage keys#2320

Merged
riderx merged 1 commit into
mainfrom
codex/manifest-upload-encoded-storage-keys
May 21, 2026
Merged

fix(manifest): check upload-encoded storage keys#2320
riderx merged 1 commit into
mainfrom
codex/manifest-upload-encoded-storage-keys

Conversation

@riderx
Copy link
Copy Markdown
Member

@riderx riderx commented May 21, 2026

Summary (AI generated)

  • Add a third manifest storage candidate for upload-location encoded keys such as %2540 when the database row contains %40.
  • Use the same candidate list in the manifest size queue path, the Capgo file read path, and the manifest file-size backfill script.
  • Extend encoding tests to cover decoded, raw percent, and upload-location encoded R2 keys.

Motivation (AI generated)

Some legacy manifest rows still report missing metadata after trying both the stored %40 key and the decoded @ key. The failed CSV shows the fallback ran, but both candidates 404. The remaining legacy shape is the upload URL encoded key, where %40 becomes %2540 during resumable upload routing.

Business Impact (AI generated)

This makes delta manifest downloads and file-size repair more reliable for old bundles with encoded asset names. It reduces support load and prevents customers from seeing metadata not found when the object exists under the upload-location encoded R2 key.

Test Plan (AI generated)

  • node --check scripts/backfill_manifest_file_sizes.mjs
  • bun scripts/backfill_manifest_file_sizes.mjs --help
  • bunx vitest run tests/manifest-path-encoding.unit.test.ts tests/upload-path-encoding.unit.test.ts
  • bunx eslint --no-ignore scripts/backfill_manifest_file_sizes.mjs supabase/functions/_backend/utils/manifest_encoding.ts supabase/functions/_backend/utils/s3.ts supabase/functions/_backend/files/util.ts tests/manifest-path-encoding.unit.test.ts tests/upload-path-encoding.unit.test.ts
  • bun lint
  • bun lint:backend

Summary by CodeRabbit

  • Chores
    • Enhanced file size backfilling with improved path resolution strategy to increase robustness when locating stored objects.
    • Refactored internal utilities to better handle file path variations and encoding formats.
    • Updated test suite to verify robust file retrieval across different path encoding scenarios.

Review Change Stack

@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: bbd015d5-2844-4360-b918-15f7a9473b71

📥 Commits

Reviewing files that changed from the base of the PR and between 3957bda and a516d0d.

📒 Files selected for processing (6)
  • scripts/backfill_manifest_file_sizes.mjs
  • supabase/functions/_backend/files/util.ts
  • supabase/functions/_backend/utils/manifest_encoding.ts
  • supabase/functions/_backend/utils/s3.ts
  • tests/manifest-path-encoding.unit.test.ts
  • tests/upload-path-encoding.unit.test.ts

📝 Walkthrough

Walkthrough

The PR broadens S3 manifest file sizing from single-path fallback to a multi-candidate strategy. Encoding utilities generate deduplicated candidate keys; attachment reads and S3 size resolution iterate candidates; backfill operations record attempted paths for visibility.

Changes

Multi-Candidate S3 Key Strategy

Layer / File(s) Summary
Manifest Encoding Utilities & Candidate Generation
supabase/functions/_backend/utils/manifest_encoding.ts, tests/manifest-path-encoding.unit.test.ts
decodeManifestPathSegments is exported; new getManifestStorageCandidateKeys generates deduplicated S3 key candidates from an original path by optionally including decoded and re-encoded variants when percent-encoded octets are detected.
Attachment Read Candidate Refactoring
supabase/functions/_backend/files/util.ts, tests/upload-path-encoding.unit.test.ts
getAttachmentReadCandidateKeys refactored to build candidates in an array and deduplicate via Set. Test expectations updated to include double-encoded "raw percent route" variants and verify correct candidate ordering.
S3 Size Resolution with Candidates
supabase/functions/_backend/utils/s3.ts, tests/upload-path-encoding.unit.test.ts
New getSizeForKey helper encapsulates stat-based HEAD with range-based fallback. s3.getSize iterates manifest candidates, returning the first positive size and logging recovery from non-primary keys. Test validates sequential candidate checking.
Manifest Backfill Script with Candidate Metadata
scripts/backfill_manifest_file_sizes.mjs
Updated to iterate through S3 candidates until success or non-retryable error. CSV headers and error payloads now include attempted_s3_paths array for audit trail and debugging.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Cap-go/capgo#2318: Modifies the same backfill script's manifest file-size flow with complementary changes to parallelization of candidate size checks.
  • Cap-go/capgo#2313: Updates the same backfill script's storage size lookup and candidate/attempt metadata handling.
  • Cap-go/capgo#2247: Overlaps on percent-encoded manifest/attachment candidate-key logic in getAttachmentReadCandidateKeys and related fallbacks.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(manifest): check upload-encoded storage keys' clearly and specifically describes the main change: adding support for checking upload-encoded storage keys in manifest handling.
Description check ✅ Passed The description includes most required sections from the template with detailed information about the changes and a comprehensive test plan, though Screenshots and some Checklist items are missing.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch

Comment @coderabbitai help to get the list of available commands and usage tips.

@codspeed-hq
Copy link
Copy Markdown
Contributor

codspeed-hq Bot commented May 21, 2026

Merging this PR will not alter performance

✅ 43 untouched benchmarks
⏩ 2 skipped benchmarks1


Comparing codex/manifest-upload-encoded-storage-keys (a516d0d) with main (3957bda)

Open in CodSpeed

Footnotes

  1. 2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@sonarqubecloud
Copy link
Copy Markdown

@riderx riderx merged commit 05b1652 into main May 21, 2026
42 checks passed
@riderx riderx deleted the codex/manifest-upload-encoded-storage-keys branch May 21, 2026 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant