Handle large email attachments via R2 storage#1443
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughSplits email attachments by size: small (≤25MB) sent directly to Discord; large (>25MB) uploaded to R2 with 7-day TTL and served via a new /files/{fileKey}/{filename} route. Discord messages/embed content now include R2 links for large attachments. Changes
Sequence Diagram(s)sequenceDiagram
participant Sender as Email Sender
participant Worker as Cloudflare Worker
participant R2 as R2 Storage
participant KV as KV Store
participant Discord as Discord API
Sender->>Worker: Incoming email + attachments
Worker->>Worker: processAttachmentsForDiscord(classify by size)
alt Large attachments (>25MB)
Worker->>R2: uploadLargeAttachment(attachment, emailMessageId)
R2-->>Worker: stored & returns /files/{fileKey}/{filename} URL
Worker->>KV: write metadata with TTL
KV-->>Worker: ack
end
alt Small attachments (≤25MB)
Worker->>Worker: keep for direct multipart upload
end
Worker->>Worker: formatEmailForDiscord(email, r2Links)
Worker->>Discord: create thread message (attachments + r2 links in content/embeds)
Discord-->>Worker: thread created
sequenceDiagram
participant Client as Browser/User
participant Worker as Cloudflare Worker
participant KV as KV Store
participant R2 as R2 Storage
Client->>Worker: GET /files/{fileKey}/{filename}
Worker->>KV: fetch metadata for fileKey
alt Metadata not found
Worker-->>Client: 404 Not Found
else Metadata found
Worker->>KV: check expiry
alt Expired
Worker->>R2: delete object
R2-->>Worker: deleted
Worker->>KV: delete metadata
KV-->>Worker: deleted
Worker-->>Client: 410 Gone
else Valid
Worker->>R2: fetch object
R2-->>Worker: file content
Worker-->>Client: 200 + file (Content-Type, Content-Disposition, Cache-Control, X-Expires-At)
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧹 Recent nitpick comments
📜 Recent review detailsConfiguration used: defaults Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
🧰 Additional context used📓 Path-based instructions (3)**/*.{vue,ts,tsx,js,jsx}📄 CodeRabbit inference engine (AGENTS.md)
Files:
**/*.{ts,tsx,js,jsx}📄 CodeRabbit inference engine (AGENTS.md)
Files:
**/*.{ts,tsx,js,jsx,vue}📄 CodeRabbit inference engine (AGENTS.md)
Files:
🧠 Learnings (2)📚 Learning: 2025-12-23T02:53:12.055ZApplied to files:
📚 Learning: 2025-12-23T02:53:12.055ZApplied to files:
🧬 Code graph analysis (1)cloudflare_workers/email/r2-storage.ts (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
🔇 Additional comments (7)
✏️ Tip: You can disable this entire section by setting Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
capgo/cloudflare_workers/email/wrangler.jsonc
Lines 15 to 18 in 1c1df69
The email worker README still documents wrangler deploy --env preprod (Deploy section), but the updated wrangler.jsonc now only defines env.prod. That means wrangler deploy --env preprod (and any alpha/local deployments) will fail with “environment not found,” breaking staging workflows and scripts that rely on those environments.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| function getBaseUrl(_env: Env): string { | ||
| return 'https://email.capgo.app' | ||
| } |
There was a problem hiding this comment.
Avoid hardcoded prod base URL for R2 links
getBaseUrl() always returns https://email.capgo.app, so attachments uploaded in preprod/dev (or when running wrangler dev) will generate links that point to prod, where the file doesn’t exist. Recipients will see 404s even though the upload succeeded. Derive the base URL from the request host or an env var so non-prod deployments generate usable links.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@cloudflare_workers/email/r2-storage.ts`:
- Around line 154-168: cleanupExpiredFile currently builds the R2 key from the
URL-provided filename which can mismatch stored metadata; change callers (e.g.,
serveR2File) to pass the metadata.originalFilename instead of the URL filename,
and update cleanupExpiredFile to accept and use that originalFilename when
constructing r2Key (`attachments/${fileKey}/${originalFilename}`) for
env.EMAIL_ATTACHMENTS.delete (and still delete `r2:${fileKey}` from
env.EMAIL_THREAD_MAPPING); ensure any other callers of cleanupExpiredFile are
updated to pass the correct originalFilename from the file metadata.
- Around line 126-133: The fetch path may use a URL-decoded filename that
doesn't match the stored R2 key created in uploadLargeAttachment; change the key
construction in the download/serve flow to use the stored metadata
originalFilename (e.g. metadata.originalFilename) or the same source used when
uploading (instead of the incoming filename param) so the lookup via
EMAIL_ATTACHMENTS.get(...) uses the exact key created by uploadLargeAttachment
(r2Key should be built from fileKey and the stored attachment filename/metadata,
not the request filename).
🧹 Nitpick comments (3)
cloudflare_workers/email/r2-storage.ts (3)
67-78: Consider sanitizing the filename before using it in the R2 key.The
attachment.filenameis used directly in the R2 key path without sanitization. While R2 keys are not filesystem paths, filenames from email attachments could contain unexpected characters (e.g., slashes, null bytes) that might cause issues or confusion. Consider sanitizing or validating the filename.♻️ Suggested sanitization
+/** + * Sanitizes filename for safe use in R2 keys + */ +function sanitizeFilename(filename: string): string { + return filename + .replace(/[/\\]/g, '_') // Replace path separators + .replace(/\0/g, '') // Remove null bytes + .substring(0, 255) // Limit length +} + // Upload to R2 with metadata -const r2Key = `attachments/${fileKey}/${attachment.filename}` +const r2Key = `attachments/${fileKey}/${sanitizeFilename(attachment.filename)}`
173-175: Hardcoded base URL may complicate local development.The base URL is hardcoded to
'https://email.capgo.app'. Consider deriving it from the request URL or environment configuration to support local development and testing scenarios.
180-186: DuplicateformatFileSizefunction.This function is also defined in
discord.ts(lines 414-418). Consider extracting it to a shared utility module to avoid duplication.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
cloudflare_workers/email/discord.tscloudflare_workers/email/index.tscloudflare_workers/email/r2-storage.tscloudflare_workers/email/types.tscloudflare_workers/email/wrangler.jsonc
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{vue,ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Run
bun lintto lint Vue, TypeScript, and JavaScript files; usebun lint:fixto auto-fix issues
Files:
cloudflare_workers/email/index.tscloudflare_workers/email/types.tscloudflare_workers/email/r2-storage.tscloudflare_workers/email/discord.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use single quotes and no semicolons per
@antfu/eslint-configrules
Files:
cloudflare_workers/email/index.tscloudflare_workers/email/types.tscloudflare_workers/email/r2-storage.tscloudflare_workers/email/discord.ts
**/*.{ts,tsx,js,jsx,vue}
📄 CodeRabbit inference engine (AGENTS.md)
Run
bun lintor lint/format command before validating any backend or frontend task to ensure consistent formatting
Files:
cloudflare_workers/email/index.tscloudflare_workers/email/types.tscloudflare_workers/email/r2-storage.tscloudflare_workers/email/discord.ts
🧠 Learnings (4)
📚 Learning: 2025-12-23T02:53:12.055Z
Learnt from: CR
Repo: Cap-go/capgo PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-23T02:53:12.055Z
Learning: Applies to cloudflare_workers/files/index.ts : Files Worker (port 8789) handles file upload/download operations
Applied to files:
cloudflare_workers/email/index.tscloudflare_workers/email/r2-storage.ts
📚 Learning: 2025-12-23T02:53:12.055Z
Learnt from: CR
Repo: Cap-go/capgo PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-23T02:53:12.055Z
Learning: Applies to cloudflare_workers/{api,plugin,files}/index.ts : Cloudflare Workers are split across three ports: API Worker (8787), Plugin Worker (8788), Files Worker (8789); see routing in `cloudflare_workers/{api,plugin,files}/index.ts`
Applied to files:
cloudflare_workers/email/index.tscloudflare_workers/email/r2-storage.ts
📚 Learning: 2025-12-23T02:53:12.055Z
Learnt from: CR
Repo: Cap-go/capgo PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-23T02:53:12.055Z
Learning: Applies to cloudflare_workers/api/index.ts : API Worker (port 8787) routes: `/bundle`, `/app`, `/device`, `/channel`, `/private/*`, `/triggers`
Applied to files:
cloudflare_workers/email/index.ts
📚 Learning: 2025-12-23T02:53:12.055Z
Learnt from: CR
Repo: Cap-go/capgo PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-23T02:53:12.055Z
Learning: Applies to cloudflare_workers/plugin/index.ts : Plugin Worker (port 8788) routes: `/updates`, `/channel_self`, `/stats`
Applied to files:
cloudflare_workers/email/index.ts
🧬 Code graph analysis (3)
cloudflare_workers/email/index.ts (1)
cloudflare_workers/email/r2-storage.ts (1)
serveR2File(102-149)
cloudflare_workers/email/r2-storage.ts (1)
cloudflare_workers/email/types.ts (2)
Env(1-24)EmailAttachment(53-58)
cloudflare_workers/email/discord.ts (3)
cloudflare_workers/email/types.ts (3)
EmailAttachment(53-58)Env(1-24)DiscordMessage(70-76)cloudflare_workers/email/r2-storage.ts (1)
uploadLargeAttachment(27-97)cloudflare_workers/email/index.ts (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Analyze (java-kotlin)
🔇 Additional comments (11)
cloudflare_workers/email/types.ts (1)
5-7: LGTM!The R2Bucket type binding is correctly added to the Env interface, properly aligning with the wrangler.jsonc configuration.
cloudflare_workers/email/wrangler.jsonc (1)
42-47: R2 bucket binding defined at both top-level and environment-specific level.The
r2_bucketsconfiguration appears both at the top level (lines 56-61) and underenv.prod(lines 42-47). This is valid for Wrangler configurations where the top-level applies to local development/default, whileenv.prodapplies to production deployments. Ensure this duplication is intentional and that the same bucket should be used across all environments.Also applies to: 56-61
cloudflare_workers/email/r2-storage.ts (2)
1-14: LGTM!Constants and interface are well-defined. The 7-day expiry aligns with the Discord thread archive duration as documented.
16-21: LGTM!Using
crypto.randomUUID()is appropriate for generating unique file keys in Cloudflare Workers.cloudflare_workers/email/index.ts (2)
6-6: LGTM!Import added correctly.
175-181: Route handler looks correct.The file serving route is properly integrated. The
decodeURIComponentis necessary for handling URL-encoded filenames. Note that security around the filename parameter is addressed by the suggested fix inr2-storage.tsto usemetadata.originalFilenameinstead of the URL parameter.cloudflare_workers/email/discord.ts (5)
15-60: LGTM with observation on error handling.The attachment classification and processing logic is correct. Note that if an R2 upload fails (lines 53-55), the large attachment is silently dropped rather than causing a failure. This is acceptable for a non-critical feature, but users won't be notified if a large attachment failed to upload.
72-82: LGTM!Clean refactoring to check for empty attachments array.
145-156: LGTM!The integration of
processAttachmentsForDiscordis well-implemented, with proper logging of attachment counts.
259-276: LGTM!Good implementation for reply messages. The R2 link formatting with the expiry notice is helpful for users.
337-387: LGTM!The updated
formatEmailForDiscordfunction cleanly separates small attachments (uploaded to Discord) from large attachments (R2 links) in the embed. The expiry notice is consistently shown for R2 links.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
- Files ≥25MB uploaded to R2 instead of Discord - Serve files via worker with expiring signed URLs - Links expire after 7 days (matching thread archive) - Small attachments still uploaded directly to Discord - Simplified to prod environment only Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
1c1df69 to
81638c6
Compare
- Use metadata.originalFilename for R2 key lookup (avoid URL encoding mismatch) - Pass originalFilename to cleanupExpiredFile for consistent cleanup - Add EMAIL_WORKER_URL env var for configurable base URL Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
|



Summary (AI generated)
Added R2 storage support for large email attachments (≥25MB). Files that exceed Discord's 25MB limit are now uploaded to R2 with time-limited private URLs instead of being silently discarded. Small attachments continue to be uploaded directly to Discord.
Test plan (AI generated)
Checklist
Summary by CodeRabbit
New Features
Chores
✏️ Tip: You can customize this high-level summary in your review settings.