perf: add sendNotifOrgCached with smart TTL-based caching#1477
Conversation
Reduces database queries on sendNotifOrg by caching "not sendable" results during plugin endpoints (stats, updates, channel_self). Cache TTL is calculated from the cron schedule and last_send_at, expiring exactly when the notification becomes sendable again. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
|
Caution Review failedThe pull request is closed. 📝 WalkthroughWalkthroughThis PR introduces a caching layer for organization notifications. Multiple call sites are updated from Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant sendNotifOrgCached
participant CacheHelper
participant sendNotifOrg
participant backgroundTask
Caller->>sendNotifOrgCached: sendNotifOrgCached(context, event, data, orgId, uniqId, cron)
sendNotifOrgCached->>CacheHelper: getNotifCacheStatus(cacheKey)
alt Cache Hit - Not Sendable
CacheHelper-->>sendNotifOrgCached: cached not-sendable state
sendNotifOrgCached-->>Caller: return false
else Cache Miss or Sendable
sendNotifOrgCached->>sendNotifOrg: sendNotifOrg(context, event, data, orgId, uniqId)
sendNotifOrg-->>sendNotifOrgCached: result (boolean)
alt Result Not Sendable
sendNotifOrgCached->>sendNotifOrgCached: getSecondsUntilNextCronWindow(cron, lastSendTime)
sendNotifOrgCached->>backgroundTask: setNotifCacheStatus(cacheKey, not-sendable, ttl)
backgroundTask-->>sendNotifOrgCached: cache updated
end
sendNotifOrgCached-->>Caller: return result
end
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly Related PRs
Poem
✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 36d10b86e6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Notification exists, check if sendable | ||
| if (!isSendable(c, notif.last_send_at, cron)) { | ||
| cloudlog({ requestId: c.get('requestId'), message: 'notif already sent', event: eventName, orgId }) | ||
| return false | ||
| return { sent: false, lastSendAt: notif.last_send_at } |
There was a problem hiding this comment.
Avoid returning truthy object for throttled send
When a notification is not sendable, sendNotifOrg now returns an object instead of false. Callers that treat the return as a boolean (e.g., sendNotifToOrgMembers uses if (!orgEmailSent) return false) will see this object as truthy and proceed to send additional emails even though the notification was throttled. This effectively disables rate-limiting for org-member emails whenever sendNotifOrg returns the new object. Consider returning a boolean (and exposing lastSendAt via a separate channel) or updating all boolean consumers to explicitly handle the object case.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
This PR adds a caching layer (sendNotifOrgCached) around the existing sendNotifOrg function to reduce redundant database queries in high-traffic plugin endpoints (updates, stats, channel_self). The cache uses smart TTL calculation based on cron schedules, expiring exactly when notifications become sendable again.
Changes:
- Introduces
sendNotifOrgCachedfunction with cache-aware logic and dynamic TTL calculation - Modifies
sendNotifOrgto return{ sent: false, lastSendAt: string }for "not sendable" cases (previously just returnedfalse) - Replaces all calls to
sendNotifOrgwithsendNotifOrgCachedin update.ts, stats.ts, and channel_self.ts
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| supabase/functions/_backend/utils/notifications.ts | Adds caching infrastructure (cache key builders, getters/setters), TTL calculation function, and the main sendNotifOrgCached wrapper function |
| supabase/functions/_backend/utils/update.ts | Replaces 4 sendNotifOrg calls with sendNotifOrgCached for payment and plugin issue notifications |
| supabase/functions/_backend/plugins/stats.ts | Replaces sendNotifOrg with sendNotifOrgCached for missing payment notification |
| supabase/functions/_backend/plugins/channel_self.ts | Replaces 4 sendNotifOrg calls with sendNotifOrgCached across POST, PUT, DELETE, and list operations |
| * Calculate seconds until the next cron window opens based on last send time. | ||
| */ | ||
| function getSecondsUntilNextCronWindow(lastSendAt: string, cron: string): number { | ||
| const interval = parseCronExpression(cron) | ||
| const lastSendDate = new Date(lastSendAt) | ||
| const now = new Date() | ||
| const nextDate = interval.getNextDate(lastSendDate) | ||
| const diffMs = nextDate.getTime() - now.getTime() | ||
| // Return at least 1 second, and cap at reasonable max (1 week) | ||
| return Math.max(1, Math.min(Math.ceil(diffMs / 1000), 604800)) |
There was a problem hiding this comment.
The TTL calculation logic in getSecondsUntilNextCronWindow has an important assumption that should be documented: it calculates the next occurrence from lastSendDate (when the notification was last sent), not from the current time. This means if a notification was sent on Monday and the cron is weekly, the TTL will be calculated to the following Monday.
However, this could be confusing because the isSendable function checks if "now" is after the next date from last_send_at. If there's a clock skew or if diffMs is negative (when now is already past the next cron window), the Math.max(1, ...) will set TTL to 1 second, but this edge case should be explicitly documented or handled more clearly.
| * Calculate seconds until the next cron window opens based on last send time. | |
| */ | |
| function getSecondsUntilNextCronWindow(lastSendAt: string, cron: string): number { | |
| const interval = parseCronExpression(cron) | |
| const lastSendDate = new Date(lastSendAt) | |
| const now = new Date() | |
| const nextDate = interval.getNextDate(lastSendDate) | |
| const diffMs = nextDate.getTime() - now.getTime() | |
| // Return at least 1 second, and cap at reasonable max (1 week) | |
| return Math.max(1, Math.min(Math.ceil(diffMs / 1000), 604800)) | |
| * Calculate seconds until the next cron window opens. | |
| * | |
| * Primary behavior: | |
| * - The "next" cron occurrence is computed from `lastSendAt` (the last time | |
| * the notification was sent), not from the current time. | |
| * | |
| * Edge-case handling: | |
| * - If the next occurrence computed from `lastSendAt` is already in the past | |
| * or exactly equal to `now` (e.g. due to clock skew or delayed processing), | |
| * we fall back to computing the next occurrence from `now`. This avoids | |
| * returning an almost-zero TTL that would cause repeated rapid checks. | |
| */ | |
| function getSecondsUntilNextCronWindow(lastSendAt: string, cron: string): number { | |
| const interval = parseCronExpression(cron) | |
| const lastSendDate = new Date(lastSendAt) | |
| const now = new Date() | |
| let nextDate = interval.getNextDate(lastSendDate) | |
| // If the next cron time based on last send is not strictly in the future, | |
| // recompute from "now" to keep TTL semantics aligned with `isSendable`. | |
| if (nextDate.getTime() <= now.getTime()) { | |
| nextDate = interval.getNextDate(now) | |
| } | |
| const diffMs = nextDate.getTime() - now.getTime() | |
| const ONE_WEEK_SECONDS = 7 * 24 * 60 * 60 | |
| const ttlSeconds = Math.ceil(diffMs / 1000) | |
| // Return at least 1 second, and cap at a reasonable max (1 week). | |
| return Math.max(1, Math.min(ttlSeconds, ONE_WEEK_SECONDS)) |
| function buildNotifCacheRequest(c: Context, orgId: string, eventName: string, uniqId: string) { | ||
| const helper = new CacheHelper(c) | ||
| if (!helper.available) | ||
| return null | ||
| return { | ||
| helper, | ||
| request: helper.buildRequest(NOTIF_CACHE_PATH, { org_id: orgId, event: eventName, uniq_id: uniqId }), | ||
| } | ||
| } |
There was a problem hiding this comment.
Performance consideration: A new CacheHelper instance is created on every call to buildNotifCacheRequest, which triggers cache initialization asynchronously via resolveGlobalCache(). For high-traffic endpoints like stats/updates/channel_self that call this function multiple times, consider reusing a single CacheHelper instance per request context by storing it in the Hono context variables (similar to how requestId is stored).
This would avoid redundant cache initialization attempts and the race condition with the available property check.
| // For other cases (true/false), just return the boolean result | ||
| // No need to cache "sent=true" as next check should query DB anyway | ||
| return result === true |
There was a problem hiding this comment.
The return statement result === true is fragile and unclear. After handling the object case on line 194, result can still be false (from various error cases in sendNotifOrg). Using strict equality === true means that false would correctly return false, but this logic is not immediately obvious and could be error-prone if the return type of sendNotifOrg changes.
Consider making this more explicit, for example: return typeof result === 'boolean' ? result : false or restructuring the conditions to be clearer.
| // For other cases (true/false), just return the boolean result | |
| // No need to cache "sent=true" as next check should query DB anyway | |
| return result === true | |
| // For other cases, normalize the result to a boolean. | |
| // No need to cache "sent=true" as next check should query DB anyway. | |
| return typeof result === 'boolean' ? result : false |
| cloudlog({ requestId: c.get('requestId'), message: 'notif caching not sendable', event: eventName, orgId, ttlSeconds }) | ||
| setNotifCacheStatus(c, orgId, eventName, uniqId, false, ttlSeconds) | ||
| return false | ||
| } | ||
|
|
||
| // For other cases (true/false), just return the boolean result | ||
| // No need to cache "sent=true" as next check should query DB anyway |
There was a problem hiding this comment.
When multiple concurrent requests trigger sendNotifOrg simultaneously (cache miss), the function uses optimistic locking to ensure only one sends the notification. However, the losing requests return plain false (lines 122, 149 in sendNotifOrg), not the object with lastSendAt. This means:
- The winning request returns
true(line 164) - not cached - Losing requests return
false- not cached - Subsequent requests will query the DB again instead of hitting cache
To optimize this, consider having losing requests in sendNotifOrg re-query the notification record to get the updated last_send_at and return the object form, allowing proper caching. Alternatively, after detecting a race loss, the cached version could re-query the DB to get the lastSendAt for proper TTL calculation before returning.
| cloudlog({ requestId: c.get('requestId'), message: 'notif caching not sendable', event: eventName, orgId, ttlSeconds }) | |
| setNotifCacheStatus(c, orgId, eventName, uniqId, false, ttlSeconds) | |
| return false | |
| } | |
| // For other cases (true/false), just return the boolean result | |
| // No need to cache "sent=true" as next check should query DB anyway | |
| cloudlog({ requestId: c.get('requestId'), message: 'notif caching not sendable (with lastSendAt)', event: eventName, orgId, uniqId, ttlSeconds }) | |
| setNotifCacheStatus(c, orgId, eventName, uniqId, false, ttlSeconds) | |
| return false | |
| } | |
| // If we only got a plain boolean false, we still cache it for a short period to | |
| // avoid hammering the database on immediate subsequent checks (e.g. lock losers). | |
| if (result === false) { | |
| const fallbackTtlSeconds = 5 | |
| cloudlog({ | |
| requestId: c.get('requestId'), | |
| message: 'notif caching not sendable (boolean result, fallback TTL)', | |
| event: eventName, | |
| orgId, | |
| uniqId, | |
| ttlSeconds: fallbackTtlSeconds, | |
| }) | |
| setNotifCacheStatus(c, orgId, eventName, uniqId, false, fallbackTtlSeconds) | |
| return false | |
| } | |
| // For "sent = true", just return true; we don't cache this as the next check | |
| // should consult the database to see the updated last_send_at value. |
| function buildNotifCacheRequest(c: Context, orgId: string, eventName: string, uniqId: string) { | ||
| const helper = new CacheHelper(c) | ||
| if (!helper.available) | ||
| return null |
There was a problem hiding this comment.
Race condition: The available property checks if this.cache is non-null, but this property is only set asynchronously in the constructor. When buildNotifCacheRequest is called immediately after creating the CacheHelper, the cache promise may not have resolved yet, causing available to return false even though the cache will be available. This means the cache could be incorrectly considered unavailable on first use.
Consider awaiting the cache initialization or using the async ensureCache() method instead of the synchronous available property.
| if (!isSendable(c, notif.last_send_at, cron)) { | ||
| cloudlog({ requestId: c.get('requestId'), message: 'notif already sent', event: eventName, orgId }) | ||
| return false | ||
| return { sent: false, lastSendAt: notif.last_send_at } |
There was a problem hiding this comment.
Inconsistent return type: The function sendNotifOrg now returns either boolean or { sent: false, lastSendAt: string }, but this creates a type safety issue. The function signature should explicitly reflect this union return type: Promise<boolean | { sent: false, lastSendAt: string }>.
Additionally, the check on line 194 uses typeof result === 'object' which would also be true for null, though unlikely in this context. Consider using a more specific type guard like checking if result has the sent property.
| const result = await sendNotifOrg(c, eventName, eventData, orgId, uniqId, cron) | ||
|
|
||
| // Handle the "not sendable" case with lastSendAt for proper TTL calculation | ||
| if (typeof result === 'object' && result.sent === false && result.lastSendAt) { |
There was a problem hiding this comment.
Type guard could match null: The check typeof result === 'object' will also match null since typeof null is 'object' in JavaScript. While the current logic flow makes null unlikely, this should be more explicit for type safety. Consider checking result && typeof result === 'object' && 'sent' in result or using a proper type guard function.
| if (typeof result === 'object' && result.sent === false && result.lastSendAt) { | |
| if (result && typeof result === 'object' && result.sent === false && result.lastSendAt) { |
| export async function sendNotifOrgCached(c: Context, eventName: string, eventData: EventData, orgId: string, uniqId: string, cron: string): Promise<boolean> { | ||
| // Check cache first - if we recently checked and it wasn't sendable, skip DB query | ||
| const cachedSendable = await getNotifCacheStatus(c, orgId, eventName, uniqId) | ||
| if (cachedSendable === false) { | ||
| cloudlog({ requestId: c.get('requestId'), message: 'notif cache hit - not sendable', event: eventName, orgId, uniqId }) | ||
| return false | ||
| } | ||
|
|
||
| // Cache miss, call the actual function | ||
| const result = await sendNotifOrg(c, eventName, eventData, orgId, uniqId, cron) | ||
|
|
||
| // Handle the "not sendable" case with lastSendAt for proper TTL calculation | ||
| if (typeof result === 'object' && result.sent === false && result.lastSendAt) { | ||
| const ttlSeconds = getSecondsUntilNextCronWindow(result.lastSendAt, cron) | ||
| cloudlog({ requestId: c.get('requestId'), message: 'notif caching not sendable', event: eventName, orgId, ttlSeconds }) | ||
| setNotifCacheStatus(c, orgId, eventName, uniqId, false, ttlSeconds) | ||
| return false | ||
| } | ||
|
|
||
| // For other cases (true/false), just return the boolean result | ||
| // No need to cache "sent=true" as next check should query DB anyway | ||
| return result === true | ||
| } |
There was a problem hiding this comment.
The new caching functionality in sendNotifOrgCached lacks test coverage. Given the critical nature of notification delivery (especially for payment-related notifications like 'org:missing_payment'), tests should verify:
- Cache hit/miss behavior
- Correct TTL calculation based on cron schedules
- Cache expiry exactly when notifications become sendable again
- Behavior when cache is unavailable (fallback to direct DB query)
This is particularly important since the caching logic involves complex TTL calculations and the repository has comprehensive test coverage for other backend functionality.
|


Summary (AI generated)
Adds
sendNotifOrgCached, a caching wrapper forsendNotifOrgthat dramatically reduces database queries in plugin endpoints (stats, updates, channel_self). The cache TTL is dynamically calculated based on the cron schedule and last send time, expiring exactly when the notification becomes sendable again.Motivation (AI generated)
sendNotifOrgis called multiple times in stats, updates, and channel_self endpoints and executes multiple database queries each time. Most of these calls result in "not sendable" because the notification was recently sent. By caching the "not sendable" result for the appropriate duration (until the next cron window), we eliminate redundant DB queries without risk of stale data.Business Impact (AI generated)
Reduces database load during high-traffic periods when the same orgs/events trigger notifications multiple times. This is especially valuable for the weekly
org:missing_paymentnotifications which are checked during every stats/update/channel_self request.Test Plan (AI generated)
'0 0 * * 1'weekly on Monday)Generated with Claude Code
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.