
fix(billing): add network error recovery for transaction retrieval#2559

Merged
baktun14 merged 7 commits into main from fix/tx-polling
Jan 25, 2026

Conversation

@baktun14
Contributor

@baktun14 baktun14 commented Jan 22, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Improved transaction signing/broadcasting resilience with a broader recovery path for transient network errors; adds recovery and success logging while preserving prior behavior for non-network errors.
  • New Features

    • Public utility in the HTTP SDK to detect retriable network errors (now includes socket cases).
  • Tests

    • Added unit tests covering transaction recovery for network/socket errors and non-recovery scenarios.
  • Documentation

    • Expanded comments clarifying network-recovery behavior.

@baktun14 baktun14 requested a review from a team as a code owner January 22, 2026 17:41
@coderabbitai
Contributor

coderabbitai bot commented Jan 22, 2026

Walkthrough

Adds a transaction-recovery path to batch-signing-client: on retriable network/socket failures while fetching a broadcast tx, the service retries with exponential backoff (maxDelay 10s) to recover the IndexedTx; non-network errors still propagate. Tests added and isRetriableError exported from the HTTP SDK.
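
The retry behaviour described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `txRecoveryExecutor`: the names `backoffDelay` and `retryWithBackoff`, the 500 ms base delay, and the attempt count of 5 are all assumptions. Only the 10s delay cap comes from the walkthrough.

```typescript
// Hypothetical sketch; the real txRecoveryExecutor is not shown in this conversation.
const MAX_DELAY_MS = 10_000; // the 10s cap mentioned in the walkthrough

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
// The 500 ms base is an assumption made for illustration.
function backoffDelay(attempt: number, baseMs = 500, maxMs = MAX_DELAY_MS): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Calls `fn` until it resolves or `maxAttempts` is exhausted, sleeping between attempts.
// `sleep` is injectable so tests can avoid real timers.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = ms => new Promise(resolve => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // No sleep after the final attempt; the last error is rethrown instead.
      if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt));
    }
  }
  throw lastError;
}
```

With these assumed values the delays grow as 500 ms, 1 s, 2 s, 4 s, 8 s and then stay at 10 s for every later attempt.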

Changes

| Cohort / File(s) | Summary |
|---|---|
| Batch Signing Client (recovery logic): apps/api/src/billing/lib/batch-signing-client/batch-signing-client.service.ts | Replaced simple getTx-based retry with a txRecoveryExecutor (updated backoff, maxDelay 10s). Added isRetriableNetworkError and tryRecoverTransaction; signAndBroadcast now uses recovery path, logs success/failure, and throws SIGN_AND_BROADCAST_TX_NOT_FOUND when recovery fails. (+74/-14) |
| Unit tests (recovery paths): apps/api/src/billing/lib/batch-signing-client/batch-signing-client.service.spec.ts | Added tests that simulate getTx failing with retriable network/socket errors and later succeeding (verifies retries and success), plus a test ensuring non-network errors are propagated without recovery attempts. (+83/-0) |
| HTTP SDK export & helper: packages/http-sdk/src/index.ts, packages/http-sdk/src/utils/createFetchAdapter/createFetchAdapter.ts | Exported isRetriableError from SDK index and added UND_ERR_SOCKET to retriable error checks in createFetchAdapter. This exposes the helper used by the client recovery logic. (+6/-1 across files) |
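
For readers unfamiliar with this kind of check, a retriable-error detector might look roughly like the sketch below. It is not the SDK's `isRetriableError`; the function name `looksRetriable`, the code list other than `UND_ERR_SOCKET`, and the "fetch failed" message match are assumptions chosen for illustration (the message match mirrors the error used in the unit tests discussed later in this conversation).

```typescript
// Hypothetical sketch; the actual isRetriableError in the HTTP SDK may differ.
// UND_ERR_SOCKET comes from the PR summary; the other codes are assumed examples.
const RETRIABLE_CODES = new Set(["UND_ERR_SOCKET", "ECONNRESET", "ECONNREFUSED", "ETIMEDOUT"]);

type ErrorLike = { code?: unknown; cause?: unknown; message?: unknown };

// Reads a string `code` property from any object, if present.
function errorCode(value: unknown): string | undefined {
  if (typeof value !== "object" || value === null) return undefined;
  const code = (value as ErrorLike).code;
  return typeof code === "string" ? code : undefined;
}

// Treats an error as retriable when its own code or its cause's code is in the
// list, or when its message contains "fetch failed".
function looksRetriable(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const { cause, message } = error as ErrorLike;
  const code = errorCode(error) ?? errorCode(cause);
  if (code !== undefined && RETRIABLE_CODES.has(code)) return true;
  return typeof message === "string" && message.includes("fetch failed");
}
```

Checking `cause` as well as the error itself matters because fetch implementations often wrap the low-level socket error inside a generic failure.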

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant BatchSigningClient
  participant Broadcaster
  participant NodeAPI

  Client->>BatchSigningClient: signAndBroadcast(tx)
  BatchSigningClient->>Broadcaster: broadcast(tx)
  Broadcaster-->>BatchSigningClient: txHash
  BatchSigningClient->>NodeAPI: getTx(txHash)
  alt getTx succeeds
    NodeAPI-->>BatchSigningClient: IndexedTx
    BatchSigningClient-->>Client: return IndexedTx
  else getTx fails with retriable network/socket error
    NodeAPI-->>BatchSigningClient: network error
    BatchSigningClient->>BatchSigningClient: log recovery start
    BatchSigningClient->>BatchSigningClient: txRecoveryExecutor (retry/backoff)
    BatchSigningClient->>NodeAPI: getTx(txHash) (retries)
    alt retry succeeds
      NodeAPI-->>BatchSigningClient: IndexedTx
      BatchSigningClient->>BatchSigningClient: log recovery success
      BatchSigningClient-->>Client: return IndexedTx
    else all retries fail
      BatchSigningClient->>BatchSigningClient: log recovery failure
      BatchSigningClient-->>Client: throw SIGN_AND_BROADCAST_TX_NOT_FOUND
    end
  else getTx fails non-network error
    NodeAPI-->>BatchSigningClient: non-network error
    BatchSigningClient-->>Client: throw original error
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I sniffed the net where packets hide away,
I twitched my nose and tried again with care,
When sockets hiccuped, I hopped through delay,
Found the hash at last—tiny carrot to share.
Recovery wins! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: adding network error recovery for transaction retrieval in the billing service. |


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@apps/api/src/billing/lib/batch-signing-client/batch-signing-client.service.ts`:
- Around line 399-406: The tryRecoverTransaction method currently may return
undefined because getTxExecutor.execute() yields IndexedTx | undefined while the
method signature promises IndexedTx | null; change the implementation in
tryRecoverTransaction to await the executor result into a variable (e.g., const
tx = await this.getTxExecutor.execute(() => this.client.getTx(hash))) and return
tx ?? null, and ensure the catch block also returns null so the method never
leaks undefined and always conforms to IndexedTx | null; keep the existing
initial delay and use the same getTxExecutor.execute and client.getTx calls.
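
The normalisation suggested above can be sketched as follows. `IndexedTxLike`, `toNullable`, and `recoverTx` are hypothetical stand-ins (the real method works with the service's executor and client), so this only illustrates the `?? null` pattern and the catch returning null as that comment describes.

```typescript
// Hypothetical stand-in for the IndexedTx type used by the service.
interface IndexedTxLike {
  hash: string;
}

// Collapses `undefined` (and `null`) to `null` so callers see a single "not found" value.
function toNullable<T>(value: T | undefined | null): T | null {
  return value ?? null;
}

// Sketch of a recovery helper that always conforms to `IndexedTxLike | null`.
// `fetchTx` stands in for the executor-wrapped getTx call from the comment above.
async function recoverTx(
  hash: string,
  fetchTx: (hash: string) => Promise<IndexedTxLike | undefined | null>
): Promise<IndexedTxLike | null> {
  try {
    const tx = await fetchTx(hash);
    return toNullable(tx);
  } catch {
    // Mirrors the suggestion above: the catch block also returns null.
    return null;
  }
}
```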

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@apps/api/src/billing/lib/batch-signing-client/batch-signing-client.service.spec.ts`:
- Around line 166-168: The test's assertion doesn't enforce the "only once"
intent—replace the loose check expect(client.getTx).toHaveBeenCalled() with a
precise call count assertion expect(client.getTx).toHaveBeenCalledTimes(1) so
the test verifies client.getTx was invoked exactly once (no recovery attempts);
update the assertion in the batch-signing-client.service.spec.ts test
referencing client.getTx.
🧹 Nitpick comments (3)
apps/api/src/billing/lib/batch-signing-client/batch-signing-client.service.spec.ts (3)

99-124: Consider using try/finally for fake timer cleanup.

If the test throws before reaching jest.useRealTimers(), fake timers will leak to subsequent tests and may cause flaky failures. Wrap the test body or use afterEach for cleanup.

♻️ Suggested pattern
  it("should recover transaction when getTx fails with network error but tx exists on chain", async () => {
    jest.useFakeTimers();
+   try {
      const granter = createAkashAddress();
      // ... test body ...
      expect(client.getTx).toHaveBeenCalledTimes(2);
+   } finally {
      jest.useRealTimers();
+   }
  });

Or alternatively, add cleanup to the describe block:

afterEach(() => {
  jest.useRealTimers();
});

Also applies to: 126-151


107-111: Consider reducing inline comments.

Per coding guidelines, unnecessary comments should be avoided. The mock setup is self-explanatory from method names like mockRejectedValueOnce and mockResolvedValueOnce. Consider removing or consolidating comments.

♻️ Example simplification
-    // Reset getTx mock to simulate network error then recovery
     client.getTx.mockReset();
     client.getTx
-      .mockRejectedValueOnce(new Error("TypeError: fetch failed")) // First call fails with network error
-      .mockResolvedValueOnce(testData.tx); // Recovery call succeeds
+      .mockRejectedValueOnce(new Error("TypeError: fetch failed"))
+      .mockResolvedValueOnce(testData.tx);

Also applies to: 134-138, 159-162


99-99: Test descriptions use "should" prefix.

Per coding guidelines, test descriptions should use present simple, 3rd person singular without prepending "should." However, this is consistent with existing tests in the file.

♻️ Example following guidelines
-  it("should recover transaction when getTx fails with network error but tx exists on chain", async () => {
+  it("recovers transaction when getTx fails with network error but tx exists on chain", async () => {

-  it("should recover transaction when getTx fails with socket error", async () => {
+  it("recovers transaction when getTx fails with socket error", async () => {

-  it("should not attempt recovery for non-network errors", async () => {
+  it("does not attempt recovery for non-network errors", async () => {

Note: Existing tests also use "should" prefix, so this would be a broader refactor affecting the entire file. As per coding guidelines.

Also applies to: 126-126, 153-153

@codecov
codecov bot commented Jan 22, 2026

Codecov Report

❌ Patch coverage is 86.36364% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.78%. Comparing base (fa01b2c) to head (f0d5844).
⚠️ Report is 3 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...tch-signing-client/batch-signing-client.service.ts 86.36% 3 Missing ⚠️

❌ Your project status has failed because the head coverage (79.29%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2559      +/-   ##
==========================================
+ Coverage   50.76%   50.78%   +0.02%     
==========================================
  Files        1069     1069              
  Lines       29728    29745      +17     
  Branches     6583     6595      +12     
==========================================
+ Hits        15091    15107      +16     
+ Misses      14286    14276      -10     
- Partials      351      362      +11     
Flag Coverage Δ
api 79.29% <86.36%> (+0.02%) ⬆️
deploy-web 31.45% <ø> (ø)
log-collector 75.35% <ø> (ø)
notifications 87.94% <ø> (ø)
provider-console 81.48% <ø> (ø)
provider-proxy 84.35% <ø> (ø)
Files with missing lines Coverage Δ
...tch-signing-client/batch-signing-client.service.ts 89.16% <86.36%> (-1.13%) ⬇️

... and 7 files with indirect coverage changes


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@apps/api/src/billing/lib/batch-signing-client/batch-signing-client.service.ts`:
- Around line 376-385: In tryRecoverTransaction, don't swallow all errors — only
convert retriable/network errors into null; rethrow any other failures. Inside
the catch for the this.txRecoveryExecutor.execute call in tryRecoverTransaction,
inspect the caught error (e.g., check error type, error.code/status or use a
helper like isNetworkError) and if it matches a retriable/network condition log
the TX_RECOVERY_FAILED and return null; otherwise rethrow the error so callers
can handle non-network failures. Ensure references remain to
tryRecoverTransaction, txRecoveryExecutor.execute, this.client.getTx and
this.logger.warn in your changes.
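
A minimal sketch of the selective catch described above, assuming a generic helper rather than the service's real method: `tryRecover`, its parameters, and the injected `isRetriable` and `warn` callbacks are hypothetical. Only the TX_RECOVERY_FAILED event name comes from the comment.

```typescript
// Hypothetical sketch of "swallow only retriable errors, rethrow the rest".
async function tryRecover<T>(
  attempt: () => Promise<T | undefined>,
  isRetriable: (error: unknown) => boolean,
  warn: (event: string, error: unknown) => void = () => {}
): Promise<T | null> {
  try {
    // Normalise a missing result to null.
    return (await attempt()) ?? null;
  } catch (error) {
    // Non-network failures propagate so callers can handle them.
    if (!isRetriable(error)) throw error;
    warn("TX_RECOVERY_FAILED", error);
    return null;
  }
}
```

Passing the classifier in as a parameter keeps the sketch independent of any particular error-detection helper; in the service it would presumably be the same check that decides whether recovery starts at all.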

@baktun14 baktun14 merged commit 51b37e4 into main Jan 25, 2026
67 of 68 checks passed
@baktun14 baktun14 deleted the fix/tx-polling branch January 25, 2026 16:55