fix: improve fallback tx error handling by skosito · Pull Request #3770 · zeta-chain/node

skosito · 2025-03-27T12:26:45Z

Description

Extended the Solana examples a bit to enable the e2e test in zeta-chain/protocol-contracts-solana#97.

The idea is to scan the error logs for `Program <PROGRAM_ID> invoke` entries and check whether any program is invoked after the gateway:

- If only the gateway is invoked, we just skip NonceMismatch errors; any other error requires fallback (e.g. a failed token transfer or regular transfer).
- If something else is invoked after the gateway, fallback is needed. The error message from that program is not considered, so it can be anything, including NonceMismatch. The e2e test is extended with this case.
- If the connected program calls back into the gateway, that is reentrancy and it will fail, but a gateway invoke might appear again in the logs, so this should cover that scenario as well.
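The scan described above can be sketched roughly as follows. This is a minimal illustration, not the function added in this PR: the helper name `invokedAfterTarget`, the regular expression, and the placeholder program IDs are all assumptions. It relies only on the Solana runtime logging a `Program <ID> invoke [<depth>]` line for each invocation.

```go
package main

import (
	"fmt"
	"regexp"
)

// invokeRe matches log lines of the form "Program <PROGRAM_ID> invoke [<depth>]"
// and captures the program ID.
var invokeRe = regexp.MustCompile(`Program (\w+) invoke \[\d+\]`)

// invokedAfterTarget reports whether any program invocation is logged after
// the first invocation of target. A repeated invocation of target itself
// (for example a reentrant call back into the gateway) also counts.
// Hypothetical helper for illustration only.
func invokedAfterTarget(errStr, target string) bool {
	seenTarget := false
	for _, m := range invokeRe.FindAllStringSubmatch(errStr, -1) {
		if seenTarget {
			return true
		}
		if m[1] == target {
			seenTarget = true
		}
	}
	return false
}

func main() {
	const gateway = "GatewayProgram1111" // placeholder ID, not a real program
	onlyGateway := "Program GatewayProgram1111 invoke [1]\nProgram log: NonceMismatch"
	withCall := onlyGateway + "\nProgram ConnectedProgram111 invoke [2]"

	fmt.Println(invokedAfterTarget(onlyGateway, gateway)) // false
	fmt.Println(invokedAfterTarget(withCall, gateway))    // true
}
```

Matching on the invoke lines rather than on the error text is what lets a connected program fail with any message, including NonceMismatch, without being mistaken for a gateway nonce error.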

We should probably look into other Solana Go libraries; this one only gives us an error string, so we have to parse it ourselves to figure out what happened.
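One way to reduce string parsing is to read the logs from the structured error object instead of its formatted string. The sketch below is an assumption-heavy illustration: `rpcError` is a local stand-in that mirrors the generic JSON-RPC 2.0 error shape (it is not the client library's type), and it assumes the `data` field carries a `logs` array, as Solana simulation-failure responses do.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcError is a hypothetical stand-in for a JSON-RPC error object
// (code, message, data). It is not the type used by the Solana client library.
type rpcError struct {
	Code    int             `json:"code"`
	Message string          `json:"message"`
	Data    json.RawMessage `json:"data"`
}

// logsFromRPCError returns the "logs" array from the error's data field,
// or nil when the field is missing or has an unexpected shape.
func logsFromRPCError(e rpcError) []string {
	var data struct {
		Logs []string `json:"logs"`
	}
	if len(e.Data) == 0 || json.Unmarshal(e.Data, &data) != nil {
		return nil
	}
	return data.Logs
}

func main() {
	// Sample payload written for this example; the values are made up.
	raw := `{"code":-32002,"message":"Transaction simulation failed",` +
		`"data":{"logs":["Program A invoke [1]","Program A failed: custom program error: 0x1"]}}`

	var e rpcError
	if err := json.Unmarshal([]byte(raw), &e); err != nil {
		panic(err)
	}
	for _, line := range logsFromRPCError(e) {
		fmt.Println(line)
	}
}
```

Working on a slice of log lines makes the invocation order explicit and avoids depending on how the error happens to be formatted.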

How Has This Been Tested?

Summary by CodeRabbit

Documentation
- Updated the changelog to reflect improved fallback transaction error handling.
Bug Fixes
- Enhanced the error handling in transactions to trigger fallback processing under additional conditions for improved reliability.
Tests
- Refined test conditions to validate more specific revert triggers.
- Added tests to verify accurate detection of error conditions based on invocation sequences.

coderabbitai · 2025-03-27T12:26:53Z

Walkthrough

The pull request integrates several changes across different modules. It updates the changelog with a note on fallback transaction error handling, amends a test payload for a specific revert condition, introduces a new function to analyze error log sequences in Solana contracts, and adds corresponding tests. Additionally, the error handling in the Solana signer’s broadcastOutbound method is modified to use the new function for determining when to execute a fallback transaction.

Changes

| File(s) | Change Summary |
| --- | --- |
| `changelog.md` | Added a new entry in the "Fixes" section regarding improved fallback transaction error handling with a reference to PR 3770. |
| `e2e/e2etests/test_solana_withdraw_and_call_revert_with_call.go` | Updated the test payload from `"revert"` to `"revert NonceMismatch"`, clarifying the reason for the revert condition. |
| `pkg/contracts/solana/instruction.go` `pkg/contracts/solana/instruction_test.go` | Introduced the new function `ProgramInvokedAfterTargetInErrStr` using regex to check program invocation order in error logs, and added tests validating its behavior with multiple sub-tests. |
| `zetaclient/chains/solana/signer/signer.go` | Modified the `broadcastOutbound` method to remove the explicit `"NonceMismatch"` check, now leveraging the new function to decide on using a fallback transaction when a program is invoked after the gateway. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant S as Signer
    participant N as Solana Network
    participant C as Contracts (Error Checker)

    S->>N: Broadcast transaction
    N-->>S: Return error response
    S->>C: Invoke ProgramInvokedAfterTargetInErrStr(errMsg, targetProgram)
    C-->>S: Return true/false based on error log scan
    alt Fallback condition met
        S->>N: Broadcast fallback transaction
    else
        S->>S: Handle error without fallback
    end
```
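The "fallback condition" branch in the diagram can be sketched as a small predicate. This is a simplified illustration of the rule stated in the PR description, not the code in `signer.go`: the function name `needsFallback` and its parameters are assumptions, and the result of the log scan is passed in as a boolean to keep the example self-contained.

```go
package main

import (
	"fmt"
	"strings"
)

// needsFallback sketches the decision described in this PR (hypothetical helper).
// invokedAfterGateway is assumed to come from scanning the error logs for a
// program invocation that follows the gateway's own invocation.
func needsFallback(errStr string, invokedAfterGateway bool) bool {
	if invokedAfterGateway {
		// A connected program ran after the gateway; its error text is not
		// considered, even if it happens to contain "NonceMismatch".
		return true
	}
	// Only the gateway ran: NonceMismatch is skipped, because it occurs
	// routinely when several relayers submit the same outbound.
	return !strings.Contains(errStr, "NonceMismatch")
}

func main() {
	fmt.Println(needsFallback("custom program error: NonceMismatch", false)) // false
	fmt.Println(needsFallback("insufficient funds", false))                  // true
	fmt.Println(needsFallback("NonceMismatch", true))                        // true
}
```

The third case is the one the extended e2e test exercises: a connected program reverting with a NonceMismatch-looking message still leads to the fallback transaction.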

Possibly related PRs

fix(zetaclient): ensure fallbackTx is not nil #3632: This PR modifies the broadcastOutbound function to ensure fallback transactions are only used when non-nil, indicating a related enhancement in error handling logic.

Suggested labels

bug, zetaclient, chain:solana

Suggested reviewers

gartnera
lumtis
kingpinXD

codecov · 2025-03-27T12:33:57Z

Codecov Report

Attention: Patch coverage is 80.39216% with 10 lines in your changes missing coverage. Please review.

Project coverage is 64.40%. Comparing base (53143b3) to head (c93d83b).
Report is 1 commit behind head on develop.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| zetaclient/chains/solana/signer/fallback_tx.go | 82.00% | 6 Missing and 3 partials ⚠️ |
| zetaclient/chains/solana/signer/signer.go | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

```diff
@@             Coverage Diff             @@
##           develop    #3770      +/-   ##
===========================================
+ Coverage    64.37%   64.40%   +0.03%
===========================================
  Files          462      463       +1
  Lines        32915    32961      +46
===========================================
+ Hits         21188    21229      +41
- Misses       10755    10757       +2
- Partials       972      975       +3
```

| Files with missing lines | Coverage Δ |
| --- | --- |
| zetaclient/chains/solana/signer/signer.go | `10.86% <0.00%> (+0.08%)` ⬆️ |
| zetaclient/chains/solana/signer/fallback_tx.go | `82.00% <82.00%> (ø)` |

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

pkg/contracts/solana/instruction_test.go (1)
82-197: Good test coverage, but could benefit from some refinements.

The test function Test_ProgramInvokedAfterTargetInErrStr provides comprehensive coverage for the ProgramInvokedAfterTargetInErrStr function, covering key scenarios including:

- No program invoked after the target gateway program
- A different program invoked after the target gateway program
- The gateway program invoked again after the initial invocation

I recommend the following improvements for better maintainability and robustness:
```diff
+// targetGatewayProgramID is the Solana program ID used for testing
+const targetGatewayProgramID = "94U5AHQMKkV5txNJ17QPXWoh474PheGou6cNP2FEuL1d"

 func Test_ProgramInvokedAfterTargetInErrStr(t *testing.T) {
-	t.Run("no program invoked after gateway", func(t *testing.T) {
+	// Define test cases to avoid repetition of test structure
+	testCases := []struct {
+		name          string
+		errorStr      string
+		targetProgram string
+		expected      bool
+	}{
+		{
+			name:          "no program invoked after gateway",
+			targetProgram: targetGatewayProgramID,
+			expected:      false,
+			errorStr: `(*jsonrpc.RPCError)(0x400233b920)({
 		// ...error string content...
+			})`,
+		},
+		{
+			name:          "program invoked after gateway",
+			targetProgram: targetGatewayProgramID,
+			expected:      true,
+			errorStr: `(*jsonrpc.RPCError)(0x40019dc210)({
 		// ...error string content...
+			})`,
+		},
+		{
+			name:          "gateway invoked after gateway",
+			targetProgram: targetGatewayProgramID,
+			expected:      true,
+			errorStr: `(*jsonrpc.RPCError)(0x40019dc210)({
 		// ...error string content...
+			})`,
+		},
+		{
+			name:          "empty error string",
+			targetProgram: targetGatewayProgramID,
+			expected:      false,
+			errorStr:      "",
+		},
+	}
+
+	for _, tc := range testCases {
+		t.Run(tc.name, func(t *testing.T) {
+			invoked := contracts.ProgramInvokedAfterTargetInErrStr(tc.errorStr, tc.targetProgram)
+			if tc.expected {
+				require.True(t, invoked)
+			} else {
+				require.False(t, invoked)
+			}
+		})
+	}
```
Consider adding a comment to explain the purpose of these tests and the structure of the error strings, e.g.:
```diff
+// Test_ProgramInvokedAfterTargetInErrStr verifies the ProgramInvokedAfterTargetInErrStr function
+// which analyzes Solana JSON-RPC error logs to determine if a program was invoked after
+// a target program. This is used for transaction fallback decisions.
 func Test_ProgramInvokedAfterTargetInErrStr(t *testing.T) {
```

changelog.md (1)

21-21: Changelog Entry for PR 3770 – Fallback TX Error Handling:
The new entry is clear and properly formatted, aligning with the other changelog items in the Fixes section. It succinctly indicates the improvement in fallback transaction error handling. Consider whether additional context (such as mentioning that this change alters how errors from a fallback scenario are differentiated based on program invocation) might further aid future readers.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 52a57c6 and 358bb2d.

⛔ Files ignored due to path filters (2)

contrib/localnet/solana/connected.so is excluded by !**/*.so
contrib/localnet/solana/connected_spl.so is excluded by !**/*.so

📒 Files selected for processing (5)

changelog.md (1 hunks)
e2e/e2etests/test_solana_withdraw_and_call_revert_with_call.go (1 hunks)
pkg/contracts/solana/instruction.go (2 hunks)
pkg/contracts/solana/instruction_test.go (1 hunks)
zetaclient/chains/solana/signer/signer.go (1 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

`**/*.go`: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

e2e/e2etests/test_solana_withdraw_and_call_revert_with_call.go
pkg/contracts/solana/instruction.go
zetaclient/chains/solana/signer/signer.go
pkg/contracts/solana/instruction_test.go

🧠 Learnings (1)

zetaclient/chains/solana/signer/signer.go (1)

Learnt from: gartnera
PR: zeta-chain/node#3632
File: zetaclient/chains/solana/signer/signer.go:304-304
Timestamp: 2025-03-27T14:00:41.939Z
Learning: The Solana signer implementation in zetaclient/chains/solana/signer/signer.go has limited test coverage, particularly for the transaction broadcasting logic with fallback scenarios. Adding this coverage has been acknowledged as a potential future improvement outside the scope of immediate fixes.

🧬 Code Definitions (2)

zetaclient/chains/solana/signer/signer.go (1)

pkg/contracts/solana/instruction.go (1)

ProgramInvokedAfterTargetInErrStr (589-612)

pkg/contracts/solana/instruction_test.go (1)

pkg/contracts/solana/instruction.go (1)

ProgramInvokedAfterTargetInErrStr (589-612)

🪛 GitHub Check: codecov/patch

zetaclient/chains/solana/signer/signer.go

[warning] 347-353: zetaclient/chains/solana/signer/signer.go#L347-L353
Added lines #L347 - L353 were not covered by tests

⏰ Context from checks skipped due to timeout of 90000ms (1)

GitHub Check: start-solana-test / e2e

🔇 Additional comments (3)

e2e/e2etests/test_solana_withdraw_and_call_revert_with_call.go (1)

54-54: More specific test case with NonceMismatch error handling.

The test payload has been expanded to include "NonceMismatch", which aligns with the enhanced error handling in the broadcastOutbound function that now differentiates between regular errors and NonceMismatch errors.

pkg/contracts/solana/instruction.go (1)

587-612: Good implementation of program invocation analysis.

This function provides a necessary utility to determine if other programs were invoked after a specific target program, which enables more precise error handling in Solana transactions. The implementation uses a clean regex approach to identify program invocations in error logs.
zetaclient/chains/solana/signer/signer.go (1)

347-353:
❓ Verification inconclusive

Improved fallback transaction handling logic for program invocations.

The conditional logic now handles two distinct scenarios for using fallback transactions:

- When a program is invoked after the gateway, regardless of error type
- When an error occurs that is not a NonceMismatch error

This enhancement allows for more robust handling of Solana transaction failures.

This implementation lacks test coverage. Consider adding unit tests specifically for this error handling logic to ensure it behaves as expected in various scenarios.

🏁 Script executed:

```shell
#!/bin/bash
# Check if there are tests covering the broadcastOutbound function in the Solana signer
rg -A 10 -B 10 "Test.*broadcastOutbound" --type go
```

Length of output: 52

Enhanced Fallback Transaction Handling – Test Coverage Needed

The updated conditional logic in zetaclient/chains/solana/signer/signer.go properly refines the fallback transaction mechanism for Solana. It differentiates between cases where a program is invoked after the gateway—ensuring that the fallback is used regardless of the error—and when NonceMismatch errors should be bypassed to accommodate multiple relay attempts.

However, our investigation indicates that there is currently no unit test covering this logic (as verified by the absence of matching tests for broadcastOutbound). To bolster confidence in these changes, please add dedicated tests that simulate:

- An error message containing "Error processing Instruction" with a valid fallback transaction, including scenarios where a program is invoked after the gateway.
- A case where the error is a "NonceMismatch" and the fallback transaction should not be applied when no post-target program invocation is detected.

Once these tests are in place, we can ensure the robustness of error handling in production.

pkg/contracts/solana/instruction.go

lumtis

Wondering, do we have a case where we produce a NonceMismatch from signers in E2E tests?

Is it something that could be reproduced?

skosito · 2025-03-27T17:07:13Z

> Wondering, do we have a case where we produce a NonceMismatch from signers in E2E tests?
>
> Is it something that could be reproduced?

issue reported is reproduced in e2e test in this repo, connected program is reverting with NonceMismatch error

lumtis · 2025-03-27T17:12:13Z

> > Wondering, do we have a case where we produce a NonceMismatch from signers in E2E tests?
> > Is it something that could be reproduced?
>
> issue reported is reproduced in e2e test in this repo, connected program is reverting with NonceMismatch error

The test checks that the false positive is handled, but it doesn't check the actual NonceMismatch from ZetaClient, which should be retried and not reverted?

skosito · 2025-03-27T17:15:07Z

> > > Wondering, do we have a case where we produce a NonceMismatch from signers in E2E tests?
> > > Is it something that could be reproduced?
> >
> > issue reported is reproduced in e2e test in this repo, connected program is reverting with NonceMismatch error
>
> The test checks that the false positive is handled, but it doesn't check the actual NonceMismatch from ZetaClient, which should be retried and not reverted?

those are happening constantly as we have 2 relayers locally that are submitting txs, so that is implicitly tested out with solana outbounds working

ws4charlie

looks good

skosito added 2 commits March 27, 2025 13:20

improve fallback tx error handling

ca41cec

extend unit test

5007f7e

skosito added the SOLANA_TESTS Run make start-solana-test label Mar 27, 2025

changelog

358bb2d

skosito marked this pull request as ready for review March 27, 2025 15:44

skosito requested a review from a team as a code owner March 27, 2025 15:44

coderabbitai bot reviewed Mar 27, 2025

gartnera reviewed Mar 27, 2025

pkg/contracts/solana/instruction.go (outdated, resolved)

lumtis approved these changes Mar 27, 2025

ws4charlie approved these changes Mar 27, 2025

parse logs from rpc error

f6b6f94

gartnera approved these changes Mar 27, 2025

ws4charlie approved these changes Mar 27, 2025

Merge branch 'develop' into improve-fallback-tx-error-handling

c93d83b

skosito added this pull request to the merge queue Mar 28, 2025

Merged via the queue into develop with commit 1f09edc Mar 28, 2025
46 checks passed

skosito deleted the improve-fallback-tx-error-handling branch March 28, 2025 12:26

coderabbitai bot mentioned this pull request Mar 28, 2025

refactor: use SignBatch keysign for solana outbound tx and fallback tx #3777

Merged

5 tasks

ws4charlie mentioned this pull request Mar 31, 2025

improve Sui cancel tx error handling #3778

Closed
