(SP: 1) [Stripe Webhook] Add multi-instance claim lock (claimed_at/by…#153

Merged
ViktorSvertoka merged 1 commit into develop from lso/feat/shop
Jan 17, 2026
@liudmylasovetovs liudmylasovetovs commented Jan 17, 2026

…/expires_at, TTL, atomic claim, retry-after)

Description

This PR hardens the Stripe webhook processing path for multi-instance deployments by introducing a durable claim/lock mechanism on stripe_events. The goal is to prevent duplicate or concurrent processing of the same Stripe event across multiple running instances while preserving safe retry semantics when processing fails.


Related Issue

Issue: #<issue_number>


Changes

  • Added multi-instance claim/lock fields to stripe_events (claimed_at, claim_expires_at, claimed_by) plus an index on claim_expires_at to support efficient TTL-based claiming.
  • Implemented an atomic claim step in the Stripe webhook route: the handler performs business logic only after successfully claiming the event; competing instances receive 503 with Retry-After.
  • Ensured safe retry behavior: if processing fails, the claim is released early (best-effort) by resetting claim_expires_at, so subsequent retries can be claimed immediately instead of waiting for the TTL to elapse; processed_at is set only on successful completion.
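
The claim rules described above can be sketched as a small in-memory model. This is an illustration only, not the code from this PR: the row shape, the `tryClaim` name, and the `CLAIM_TTL_MS` constant are assumptions (the review further down mentions a 10-minute TTL). In the actual route the check-and-set would presumably have to happen in a single conditional UPDATE so that two instances cannot both win.

```typescript
// Hypothetical row shape; field names mirror the columns added in this PR.
interface StripeEventRow {
  processedAt: Date | null;
  claimedAt: Date | null;
  claimExpiresAt: Date | null;
  claimedBy: string | null;
}

type ClaimOutcome = 'claimed' | 'already_processed' | 'busy';

// Assumed TTL of 10 minutes.
const CLAIM_TTL_MS = 10 * 60 * 1000;

// In-memory model of the claim decision. A real implementation needs the
// database to perform this check-and-set atomically.
function tryClaim(
  row: StripeEventRow,
  instanceId: string,
  now: Date,
  ttlMs: number = CLAIM_TTL_MS,
): ClaimOutcome {
  // Finished events are never claimed again.
  if (row.processedAt !== null) return 'already_processed';

  // A claim that has not expired yet blocks every other claimant.
  const claimIsLive =
    row.claimExpiresAt !== null && row.claimExpiresAt.getTime() > now.getTime();
  if (claimIsLive) return 'busy';

  // No claim, or an expired one: take ownership and start a new TTL window.
  row.claimedAt = now;
  row.claimExpiresAt = new Date(now.getTime() + ttlMs);
  row.claimedBy = instanceId;
  return 'claimed';
}
```

An expired claim is treated the same as no claim at all, which is what lets a crashed instance's event be picked up again once the TTL has passed.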

Database Changes (if applicable)

  • Schema migration required
  • Seed data updated
  • Breaking changes to existing queries
  • Transaction-safe migration
  • Migration tested locally on Neon

How Has This Been Tested?

  • Tested locally
  • Verified in development environment
  • Checked responsive layout (if UI-related)
  • Tested accessibility (keyboard / screen reader)

Commands (PowerShell):

  • npx vitest run .\lib\tests\stripe-webhook-contract.test.ts
  • npx vitest run .\lib\tests\stripe-webhook-mismatch.test.ts
  • npx vitest run .\lib\tests\stripe-webhook-paid-status-repair.test.ts
  • npx vitest run .\lib\tests\stripe-webhook-psp-fields.test.ts
  • npx vitest run .\lib\tests\stripe-webhook-refund-full.test.ts

Screenshots (if applicable)

N/A (server-side webhook + DB changes only)


Checklist

Before submitting

  • Code has been self-reviewed
  • No TypeScript or console errors
  • Code follows project conventions
  • Scope is limited to this feature/fix
  • No unrelated refactors included
  • English used in code, commits, and docs
  • New dependencies discussed with team
  • Database migration tested locally (if applicable)
  • GitHub Projects card moved to In Review

Reviewers

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced Stripe webhook event handling with improved duplicate prevention and better state tracking for payment transactions.
    • Strengthened error recovery and automatic retry mechanisms for failed webhook processing.
    • Improved payment processing reliability across distributed environments with better transaction state management.

netlify Bot commented Jan 17, 2026

Deploy Preview for develop-devlovers ready!

Name Link
🔨 Latest commit 5380a3d
🔍 Latest deploy log https://app.netlify.com/projects/develop-devlovers/deploys/696c1db7bc5d160008f76bc6
😎 Deploy Preview https://deploy-preview-153--develop-devlovers.netlify.app

coderabbitai Bot commented Jan 17, 2026

Walkthrough

Implements multi-instance webhook handling for Stripe events using a distributed locking mechanism. Adds claim-based processing to atomically assign webhook events to specific instances, with TTL-based expiration and retry-after logic for busy states.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Stripe Webhook Multi-Instance Locking (frontend/app/api/shop/webhooks/stripe/route.ts) | Added tryClaimStripeEvent() function for atomic event claiming with three outcomes: 'claimed', 'already_processed', or 'busy'. Updated POST handler to claim events before processing; returns 200 for duplicates, 503 with Retry-After for busy claims, or 200 on successful processing. Added early claim release on errors via claimExpiresAt field reset. |
| Database Schema Updates (frontend/db/schema/shop.ts, frontend/drizzle/0003_add_stripe_events_claim_lock.sql) | Added claimedAt, claimExpiresAt, and claimedBy columns to stripe_events table with index on claimExpiresAt. Added CHECK constraint to payment_attempts enforcing provider = 'stripe'. |
| Migration Artifacts (frontend/drizzle/meta/0003_snapshot.json, frontend/drizzle/meta/_journal.json) | Auto-generated schema snapshot and migration journal entry tracking the claim-lock schema changes. |
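
To make the status-code mapping in the first row concrete, here is a minimal sketch of how the two non-claimed outcomes could translate into a reply. The `WebhookReply` shape and both function names are hypothetical, and deriving `Retry-After` from the remaining claim TTL is an assumption; the PR does not show how the header value is computed.

```typescript
type ClaimOutcome = 'claimed' | 'already_processed' | 'busy';

// Hypothetical reply shape used only for this illustration.
interface WebhookReply {
  status: number;
  headers: Record<string, string>;
}

// Seconds until the current claim expires, rounded up, never below 1.
function retryAfterSeconds(claimExpiresAt: Date | null, now: Date): number {
  if (claimExpiresAt === null) return 1;
  const remainingMs = claimExpiresAt.getTime() - now.getTime();
  return Math.max(1, Math.ceil(remainingMs / 1000));
}

// Maps the outcomes that skip business logic to an HTTP reply.
function replyForOutcome(
  outcome: Exclude<ClaimOutcome, 'claimed'>,
  claimExpiresAt: Date | null,
  now: Date,
): WebhookReply {
  if (outcome === 'already_processed') {
    // Duplicate delivery: acknowledge so the sender stops retrying.
    return { status: 200, headers: {} };
  }
  // Another instance holds the claim: ask the sender to come back later.
  return {
    status: 503,
    headers: { 'Retry-After': String(retryAfterSeconds(claimExpiresAt, now)) },
  };
}
```

`Retry-After` is sent here as a whole number of seconds, which is one of the two forms the HTTP specification allows for that header.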

Sequence Diagram(s)

sequenceDiagram
    participant Stripe as Stripe API
    participant Handler as Webhook Handler
    participant DB as Database
    participant Logger as Logger

    Stripe->>Handler: POST webhook event
    Handler->>DB: tryClaimStripeEvent()
    DB-->>DB: Check if already processed
    alt Already processed
        DB-->>Handler: 'already_processed'
        Handler-->>Stripe: 200 OK (skip duplicate)
        Handler->>Logger: Log duplicate
    else Claim owned by another instance
        DB-->>Handler: 'busy'
        Handler-->>Stripe: 503 Service Unavailable<br/>(Retry-After header)
    else Successfully claimed
        DB-->>Handler: 'claimed'
        Handler->>DB: Business logic (order, metadata)<br/>under claim ownership
        alt Processing succeeds
            Handler->>DB: Mark processed<br/>(ack event)
            DB-->>Handler: Success
            Handler-->>Stripe: 200 OK
        else Processing fails
            Handler->>DB: Clear claimExpiresAt<br/>(release claim early)
            Handler-->>Stripe: 503 (retry eligible)
        end
    end
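
The claim lifecycle in the diagram can be sketched as a single function over an injected store. Everything here is hypothetical scaffolding (the `EventStore` interface, `handleEvent`, and the `runBusinessLogic` callback are not names from this PR); the point is only to show where the claim is taken, acknowledged, and released.

```typescript
type ClaimOutcome = 'claimed' | 'already_processed' | 'busy';

// Hypothetical storage interface standing in for the stripe_events queries.
interface EventStore {
  tryClaim(eventId: string): Promise<ClaimOutcome>;
  markProcessed(eventId: string): Promise<void>;
  releaseClaim(eventId: string): Promise<void>;
}

// Returns the HTTP status the webhook route would send back.
async function handleEvent(
  eventId: string,
  store: EventStore,
  runBusinessLogic: () => Promise<void>,
): Promise<number> {
  const outcome = await store.tryClaim(eventId);
  if (outcome === 'already_processed') return 200; // skip duplicate
  if (outcome === 'busy') return 503; // another instance owns the claim

  try {
    await runBusinessLogic();
    // processed_at is written only after the work has succeeded.
    await store.markProcessed(eventId);
    return 200;
  } catch {
    try {
      // Best-effort early release so the next retry can claim immediately.
      await store.releaseClaim(eventId);
    } catch {
      // If the release itself fails, the claim still expires with its TTL.
    }
    return 503;
  }
}
```

Because a failure in `markProcessed` also releases the claim and triggers a retry, this shape assumes the business logic is safe to run more than once for the same event.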

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • AM1007
  • ViktorSvertoka

Poem

🐰 Across the fields of instances wide,
Each webhook claims its rightful stride,
With locks and TTLs set just right,
Distributed harmony, pure delight!
No race conditions in our sight! ✨

🚥 Pre-merge checks | ✅ 3 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding a multi-instance claim lock mechanism for Stripe webhooks with TTL-based claiming semantics. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5380a3d258

Comment on lines 1222 to +1226
logError('Stripe webhook processing failed', error);
// P0.8: release claim early so Stripe retries can be claimed immediately.
try {
  await db
    .update(stripeEvents)
[P2] Release claim on refund-fullness retry path

The new claim-release logic only runs after the generic logError branch, but the isRefundFullnessUndeterminedError path above returns early with a 500. With the new claim lock, that means claim_expires_at stays in the future and every retry for this specific error will hit the busyRetry path until the 10‑minute TTL elapses, even though the code explicitly wants Stripe to retry promptly. Consider releasing the claim (or setting claim_expires_at to epoch) in the refund‑fullness branch as well so retries can actually be processed.
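
One way to address this, sketched under assumptions: release the claim before branching on the error type, so that no early return can skip it. `ClaimRow`, `releaseClaim`, and `statusForFailure` are hypothetical names, the refund-fullness predicate is injected rather than reimplemented, and the 503 for the generic branch is taken from the sequence diagram above rather than from the route code.

```typescript
// Hypothetical minimal view of a stripe_events row.
interface ClaimRow {
  claimExpiresAt: Date | null;
}

// Follows the suggestion above: move claim_expires_at to the epoch so the
// claim counts as expired for the very next retry.
function releaseClaim(row: ClaimRow): void {
  row.claimExpiresAt = new Date(0);
}

// Picks the failure status. The predicate stands in for the route's
// isRefundFullnessUndeterminedError check.
function statusForFailure(
  error: unknown,
  row: ClaimRow,
  isRefundFullnessUndetermined: (e: unknown) => boolean,
): number {
  // Release first, so both return paths below leave the event claimable.
  releaseClaim(row);
  if (isRefundFullnessUndetermined(error)) return 500;
  return 503; // assumed status for the generic failure branch
}
```

With the release hoisted above the branch, a retry after either failure finds an expired claim and can proceed without waiting out the TTL.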

@ViktorSvertoka ViktorSvertoka merged commit c991ccf into develop Jan 17, 2026
9 checks passed
@ViktorSvertoka ViktorSvertoka deleted the lso/feat/shop branch January 17, 2026 23:48
2 participants