Skip to content

Conversation

@fbac
Copy link
Collaborator

@fbac fbac commented Mar 10, 2025

Closes #486

  • Simplify upgrade test logic by using testcontainers.
  • Continue on error in upgrade tests, instead make a comment with test results.

Summary by CodeRabbit

  • New Features
    • Introduced a new GitHub Actions workflow for automated upgrade testing.
  • Chores
    • Updated multiple dependencies to boost stability, performance, and security.
  • Tests
    • Enhanced functionality to pull Docker images directly in tests.
    • Streamlined the upgrade test setup, including a dedicated test validating the latest development version.
    • Improved error handling and context management in test functions.
    • Refined container-based tests for improved reliability and clarity.
    • Improved version management in upgrade tests for better structure and clarity.
    • Reordered import statements in test files for better organization.
    • Removed the "Run Upgrade Tests" step from the GitHub Actions workflow.

@fbac fbac requested a review from a team as a code owner March 10, 2025 16:36
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 10, 2025

Walkthrough

This pull request removes the "Run Upgrade Tests" step from the GitHub Actions workflow and introduces a new workflow for upgrade testing. The go.mod file is updated with several dependency version changes, new dependencies added, and one removed. Test files in the pkg/upgrade directory are refactored to enhance container management and context handling in tests. A new function for pulling Docker images is introduced, and the structure of test functions is improved for clarity and error handling.

Changes

File(s) Change Summary
.github/workflows/test.yml, .github/workflows/upgrade-test.yml Removed the "Run Upgrade Tests" step from the test workflow and added a new workflow for upgrade testing triggered by specific events.
go.mod Updated several dependency versions (e.g., grpc-gateway, genproto, protobuf, otel packages), added new dependencies such as testcontainers-go and mergo, and removed the indirect dependency github.com/rogpeppe/go-internal.
pkg/upgrade/docker_utils_test.go Added a new function pullImage to pull Docker images from a registry, enhancing test utility capabilities.
pkg/upgrade/main_test.go Modified the setup function to pull Docker images for specified versions using the new pullImage function during test setup.
pkg/upgrade/upgrade_test.go Transitioned from a map for version management to a slice, added a new variable for the repository, and introduced a new test function TestLatestVersion. Enhanced error handling and context usage in test functions.
pkg/blockchain/blockchainPublisher_test.go Reordered import statements without adding or removing any imports.

Possibly related PRs

  • Disable upgrade tests #619: Directly related to the removal of the "Run Upgrade Tests" step from the workflow, which was commented out in this PR.
  • Post 0.2.1 release chores #584: Related to workflow configuration changes, including branch pattern adjustments that connect with the workflow modifications in this PR.
  • Upgrade tests #483: Introduces upgrade tests, which are relevant to the changes made in the testing process and environment in this PR.

Suggested reviewers

  • mkysel

Warning

There were issues while running some tools. Please review the errors and either fix the tool’s configuration or disable the tool if it’s a critical failure.

🔧 golangci-lint (1.62.2)

Error: can't load config: the Go language version (go1.23) used to build golangci-lint is lower than the targeted Go version (1.24)
Failed executing command with error: can't load config: the Go language version (go1.23) used to build golangci-lint is lower than the targeted Go version (1.24)

Tip

⚡🧪 Multi-step agentic review comment chat (experimental)
  • We're introducing multi-step agentic chat in review comments. This experimental feature enhances review discussions with the CodeRabbit agentic chat by enabling advanced interactions, including the ability to create pull requests directly from comments.
    - To enable this feature, set early_access to true under in the settings.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e90c2dc and a749687.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (1)
  • go.mod (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod
⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: Push Docker Images to GitHub Packages (xmtpd-cli)
  • GitHub Check: Lint-Go
  • GitHub Check: Build pre-baked anvil-xmtpd
  • GitHub Check: Push Docker Images to GitHub Packages (xmtpd)
  • GitHub Check: Test (Node)

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
.github/workflows/test.yml (1)

56-70: Upgrade the GitHub Script Action version.
A static analysis tool (actionlint) flagged that actions/github-script@v6 may be too old. Consider updating to a more recent version to avoid runtime issues.

- uses: actions/github-script@v6
+ uses: actions/github-script@v7
🧰 Tools
🪛 actionlint (1.7.4)

58-58: the runner of "actions/github-script@v6" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

pkg/upgrade/upgrade_test.go (3)

9-16: Consider externalizing version data.
Defining xmtpdVersions inline is acceptable, but consider externalizing it to a config or a shared constant so it can be easily updated without modifying code.


21-46: Add assertions to validate container readiness.
This test starts containers in both old and dev versions, but it only logs progress without validating the container’s actual state or readiness aside from log checks. Consider adding checks or assertions that confirm the container is operational (e.g., a simple API call) to bolster your confidence in the upgrade process.


48-58: Enhance test coverage.
TestLatestVersion starts the dev container but does not run any post-launch assertions. Consider adding logic to confirm that the container is fully functional, similar to the suggestion in TestUpgradeToLatest.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c0c8e57 and 1ade3e5.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (5)
  • .github/workflows/test.yml (1 hunks)
  • go.mod (6 hunks)
  • pkg/upgrade/docker_utils_test.go (2 hunks)
  • pkg/upgrade/main_test.go (0 hunks)
  • pkg/upgrade/upgrade_test.go (1 hunks)
💤 Files with no reviewable changes (1)
  • pkg/upgrade/main_test.go
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/test.yml

58-58: the runner of "actions/github-script@v6" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Test (Node)
  • GitHub Check: Lint-Go
🔇 Additional comments (29)
.github/workflows/test.yml (1)

49-50: Good use of id and continue-on-error.
Defining an explicit id for this step makes it easy to reference its outcome. Setting continue-on-error: true helps gather results even when tests fail.

pkg/upgrade/upgrade_test.go (2)

4-4: No concerns.
The new import for context is necessary to support context-aware operations.


17-18: Repository variable looks good.
Centralizing the repository reference in ghcrRepository improves maintainability.

pkg/upgrade/docker_utils_test.go (3)

16-16: No functional change.
This line is a harmless addition or whitespace change.


17-21: New imports look correct.
Importing testcontainers-go and wait is consistent with the migration to container-based tests.


118-149:

Details

❓ Verification inconclusive

Validate readiness and environment portability.
The 10-second timeout may be too short in some environments, potentially causing sporadic test failures. Also, references to "host.docker.internal" (used elsewhere) can be problematic on Linux Docker without additional flags.

Please confirm this approach on all supported platforms. You can use the script below to locate references to "host.docker.internal" and verify that your environment handles them correctly:


🏁 Script executed:

#!/bin/bash
# Search for host.docker.internal usage in the codebase
rg "host\.docker\.internal" -A 3 -B 3

Length of output: 772


Action: Validate Container Readiness Timeout and Docker Host Settings

  • The 10-second timeout for container startup may be insufficient in slower environments, which could lead to sporadic test failures.
  • Our search confirms that host.docker.internal is actively used in pkg/upgrade/docker_utils_test.go (via the conversion in convertLocalhost) as well as in dev/docker/prometheus.yml. This reinforces the need to ensure that Linux Docker users apply the necessary flags or configurations to correctly resolve this hostname.

Please verify that these settings work reliably across all supported platforms and consider adjustments (such as extending the timeout or handling Linux-specific Docker configurations) if any issues are encountered.

go.mod (23)

8-8: Dependency Version Update: grpc-gateway/v2

The version for github.com/grpc-ecosystem/grpc-gateway/v2 has been updated to v2.26.1. Please ensure that this change is compatible with your existing code and that you have reviewed the dependency’s changelog for any breaking changes.


17-19: Bumped gRPC-Related Dependencies

The versions for:

  • google.golang.org/genproto/googleapis/apiv0.0.0-20250218202821-56aae31c358a
  • google.golang.org/grpcv1.71.0
  • google.golang.org/protobufv1.36.5

have been updated. Make sure these updates do not introduce incompatibilities with your gRPC logic and that any new features or breaking changes are handled appropriately.


27-27: New Dependency Added: testcontainers-go

A new dependency github.com/testcontainers/testcontainers-go at v0.35.0 has been added. This aligns with the PR objective to simplify upgrade tests using testcontainers. Ensure that its usage in your test code follows best practices.


32-34: New/Updated Indirect Dependencies for Utilities

The following indirect dependencies have been added or updated:

  • dario.cat/mergov1.0.0
  • github.com/AdaLogics/go-fuzz-headersv0.0.0-20240806141605-e8a1dd7889d6
  • github.com/Azure/go-ansitermv0.0.0-20250102033503-faa5f7b0171c

Please verify that these versions are required for the new features or tests and that they do not introduce incompatibilities.


54-54: Updated Dependency: cenkalti/backoff/v4

The dependency github.com/cenkalti/backoff/v4 has been updated to v4.3.0. Confirm that its retry/backoff behavior still meets your expectations in test and production environments.


59-59: Updated Dependency: containerd/log

github.com/containerd/log is now at v0.1.0. Verify that logging integration works as expected after this update.


60-60: Updated Dependency: containerd/platforms

The dependency github.com/containerd/platforms has been bumped to v0.2.1. Ensure that this version is fully compatible with your container management workflow.


61-61: Updated Dependency: cpuguy83/dockercfg

github.com/cpuguy83/dockercfg has been updated to v0.3.2. Check that Docker configuration handling remains consistent with this new version.


68-68: Updated Dependency: distribution/reference

The dependency github.com/distribution/reference is now at v0.6.0. Ensure that any image reference parsing in your project behaves correctly with this version.


70-70: Updated Dependency: docker/go-connections

github.com/docker/go-connections has been updated to v0.5.0. Confirm its compatibility with your Docker API interactions.


71-71: Updated Dependency: docker/go-units

The dependency github.com/docker/go-units has also been updated to v0.5.0. Verify that unit conversion and related functionalities perform as expected.


107-111: Updated Moby-Related Dependencies

The following Moby-related libraries have been updated:

  • github.com/moby/docker-image-specv1.3.1
  • github.com/moby/patternmatcherv0.6.0
  • github.com/moby/sys/sequentialv0.5.0
  • github.com/moby/sys/userv0.1.0
  • github.com/moby/sys/usernsv0.1.0

Please verify that these changes do not disrupt any Docker container management functionalities.


115-115: Updated Dependency: morikuni/aec

The dependency github.com/morikuni/aec has been bumped to v1.0.0. Confirm its impact on any related terminal formatting or logging outputs.


118-118: Updated Dependency: opencontainers/go-digest

github.com/opencontainers/go-digest has been updated to v1.0.0. Ensure that digest calculations (if used) are functioning as intended.


119-119: Updated Dependency: opencontainers/image-spec

The dependency github.com/opencontainers/image-spec is now at v1.1.0. Verify that updates to image specification handling are in line with your container practices.


136-136: Updated Dependency: sirupsen/logrus

github.com/sirupsen/logrus has been upgraded to v1.9.3. Confirm that your logging configuration and any custom hooks maintain expected behavior with this version.


143-143: Updated Dependency: OpenTelemetry Auto SDK

go.opentelemetry.io/auto/sdk is now at v1.1.0. Ensure that your observability setup correctly integrates with this update.


150-150: Updated Dependency: OpenTelemetry (otel)

The dependency go.opentelemetry.io/otel has been updated to v1.35.0. Verify that the tracing and metric functionalities in your application remain stable.


151-151: Updated Dependency: OTLP Trace HTTP Exporter

go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp has been bumped to v1.35.0. Please confirm that this exporter interacts correctly with your tracing pipeline.


152-152: Updated Dependency: OpenTelemetry Metric

The dependency go.opentelemetry.io/otel/metric has been updated to v1.35.0. Ensure that metric collection and reporting continue to function as expected.


153-153: Updated Dependency: OpenTelemetry SDK

go.opentelemetry.io/otel/sdk has been updated to v1.35.0. Verify that the SDK initialization and configuration match your current observability requirements.


154-154: Updated Dependency: OpenTelemetry Trace

The dependency go.opentelemetry.io/otel/trace is now at v1.35.0. Confirm that tracing instrumentation remains compatible after this update.


166-166: Updated Dependency: googleapis/rpc

google.golang.org/genproto/googleapis/rpc has been bumped to v0.0.0-20250218202821-56aae31c358a. Please ensure this version is consistent with other related Google API dependencies.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
pkg/upgrade/docker_utils_test.go (2)

118-152: Well-implemented testcontainers integration

The runContainer function has been nicely refactored to use the testcontainers-go library. This is a significant improvement over direct Docker command execution:

  1. Using a proper context with timeout (10 seconds)
  2. Using testcontainers.ContainerRequest for configuration
  3. Implementing a proper waiting condition using wait.ForLog
  4. Clean container termination

There's a TODO comment about ideally waiting for health/liveness probe, which is good to highlight for future improvements.

One minor suggestion: consider adding a comment explaining why we start and immediately terminate the container (I assume it's to verify successful startup for upgrade testing).


132-135: Consider enhancing the container wait condition

The TODO comment indicates that waiting for a health/liveness probe would be ideal. Consider implementing this enhancement in a future update, as health checks provide more reliable container readiness detection than log message scanning.

You could implement this by using wait.ForHealthCheck() instead of wait.ForLog() if the container exposes a health check endpoint:

WaitingFor: wait.ForLog(
  "replication.api\tserving grpc",
), // TODO: Ideally we wait for health/liveness probe

If the container already has a health check defined, you could change to:

WaitingFor: wait.ForHealthCheck(),
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ade3e5 and 4395e03.

📒 Files selected for processing (2)
  • pkg/upgrade/docker_utils_test.go (2 hunks)
  • pkg/upgrade/upgrade_test.go (1 hunks)
🧰 Additional context used
🪛 GitHub Check: Lint-Go
pkg/upgrade/upgrade_test.go

[failure] 55-55:
Error return value is not checked (errcheck)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Test (Node)
  • GitHub Check: Build pre-baked anvil-xmtpd
🔇 Additional comments (4)
pkg/upgrade/upgrade_test.go (3)

4-9: Approval of import modifications

The imports have been appropriately updated to include context and the testify/require package, which aligns with the PR's objective of using testcontainers in upgrade tests.


11-20: Well-structured version management

The change from a map to an ordered slice with a clear comment ("Guaranteed to be in order of increasing version") is a good improvement. Separating the repository URL into a constant variable is also a clean approach that makes future updates easier.


22-50: Context-aware container management implemented correctly

The TestUpgradeToLatest function has been properly modified to use context and the new testcontainers approach. The flow is clear and each step is well-documented with appropriate logging and error checking.

pkg/upgrade/docker_utils_test.go (1)

16-21: Appropriate inclusion of testcontainers dependencies

The imports have been updated to include the necessary testcontainers-go packages, which aligns with the PR objective of simplifying upgrade tests with testcontainers.

@github-actions
Copy link
Contributor

❌ Upgrade tests failed. Please check the logs for more details.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (1)
pkg/upgrade/docker_utils_test.go (1)

121-159: 🛠️ Refactor suggestion

Consider returning the container instance

The current implementation terminates the container immediately after creation, which might not be the desired behavior for all test cases. Consider modifying the function to return the container instance, allowing the caller to decide when to terminate it:

func runContainer(
	t *testing.T,
	ctx context.Context,
	imageName string,
	containerName string,
	envVars map[string]string,
-) (err error) {
+) (container testcontainers.Container, err error) {
	ctxwc, cancel := context.WithTimeout(ctx, 10*time.Second)
	defer cancel()

	req := testcontainers.ContainerRequest{
		// ... existing configuration ...
	}

	container, err = testcontainers.GenericContainer(ctxwc, testcontainers.GenericContainerRequest{
		ContainerRequest: req,
		Started:          true,
		Logger:           testcontainers.TestLogger(t),
	})
	if err != nil {
-		return err
+		return nil, err
	}

-	err = container.Terminate(ctxwc)
-	if err != nil {
-		return err
-	}
-
-	return nil
+	return container, nil
}

This would provide more flexibility in how the container is used in tests, especially for upgrade tests where you might need to interact with the container before terminating it.

🧹 Nitpick comments (5)
pkg/upgrade/docker_utils_test.go (5)

128-129: Consider moving defer call

It's a good practice to place the defer call immediately after the context creation to ensure it's always executed, even if there's a panic between these lines.

-	ctxwc, cancel := context.WithTimeout(ctx, 10*time.Second)
-	defer cancel()
+	ctxwc, cancel := context.WithTimeout(ctx, 10*time.Second)
+	defer cancel()

While the current code works correctly, moving the defer right after the context creation is a defensive programming practice.


131-142: Well-structured container configuration

The container request configuration is well-structured and clearly defines all necessary parameters. The use of wait.ForLog is a good way to ensure the container is ready before proceeding.

You might want to consider implementing the TODO comment to use a health/liveness probe instead of waiting for a log message, as it would make the tests more reliable.


139-141: Consider implementing health check

As noted in your TODO comment, waiting for a log message is less reliable than a proper health check. Consider implementing a custom wait strategy using wait.ForHTTP or wait.ForHealthCheck for more robust testing.


153-156: Consider terminating container only if created successfully

While the current code will never reach this point if container creation fails (due to the return statement on line 150), it might be more explicit to check if the container was created successfully before attempting to terminate it.

-	err = container.Terminate(ctxwc)
-	if err != nil {
-		return err
-	}
+	if container != nil {
+		err = container.Terminate(ctxwc)
+		if err != nil {
+			return err
+		}
+	}

This change would make the code more defensive against potential future modifications.


121-159: Missing log streaming functionality

The previous implementation likely had a way to stream container logs (through the removed streamDockerLogs function mentioned in the AI summary). Consider adding log streaming capability for debugging purposes.

You could use the container.Logs() method to capture container logs:

if logs, err := container.Logs(ctx); err == nil {
    defer logs.Close()
    buf := new(bytes.Buffer)
    _, _ = buf.ReadFrom(logs)
    t.Logf("Container logs: %s", buf.String())
}

This would help with debugging failed tests by providing container output in the test logs.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4d00a2b and 7e420ac.

📒 Files selected for processing (1)
  • pkg/upgrade/docker_utils_test.go (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build pre-baked anvil-xmtpd
🔇 Additional comments (5)
pkg/upgrade/docker_utils_test.go (5)

16-20: Good choice of library to simplify container management!

Using the testcontainers-go library is a significant improvement over manual Docker command execution. This approach provides better abstraction, error handling, and cleaner test code.


23-26: Good practice to define constants

Extracting these values as constants improves maintainability and makes the code more readable.


121-127: Function signature improvement with context

The updated function signature with context.Context parameter follows Go best practices for cancelable operations, which is especially important for container management.


142-142: Required closing brace for WaitingFor configuration

This line is indeed necessary to close the WaitingFor configuration block.


144-151: Good use of testcontainers API

The GenericContainer function with TestLogger integration is a clean way to create and start the container. The error handling is appropriate here.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
.github/workflows/upgrade-test.yml (3)

17-26: Job Definition and Checkout Step

The job definition is clear and uses ubuntu-latest as the runner. The checkout step uses actions/checkout@v4 with recursive sub-module checkout, tag fetching, and full history (fetch-depth: 0). This is beneficial for builds requiring the full repository context, though consider if full history is strictly needed to optimize performance.


32-32: Initiating Docker Environment

The step running dev/docker/up is expected to start the necessary Docker containers for the upgrade tests. Given the PR objective regarding error handling, please ensure that the underlying script (dev/docker/up) is updated to log errors without halting the testing process.


39-39: Local Contracts Deployment

The command contracts/dev/deploy local is used to deploy the local contracts. In line with the new error handling objective, verify that any errors in the deployment process are captured and documented (e.g., as comments) rather than causing an immediate halt to subsequent test steps.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e420ac and b8ad532.

📒 Files selected for processing (2)
  • .github/workflows/test.yml (0 hunks)
  • .github/workflows/upgrade-test.yml (1 hunks)
💤 Files with no reviewable changes (1)
  • .github/workflows/test.yml
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Test (Node)
  • GitHub Check: Upgrade Tests
  • GitHub Check: Build pre-baked anvil-xmtpd
🔇 Additional comments (4)
.github/workflows/upgrade-test.yml (4)

1-16: Workflow Trigger Configuration

The trigger configuration is comprehensive. It correctly watches for pushes to main and branches matching rel/**, as well as pull requests touching relevant files. This setup should ensure that the upgrade tests run when necessary.


27-31: Go Environment Setup with Caching

The use of actions/setup-go@v5 with go-version-file pointing to go.mod and caching dependency files via the "**/*.sum" pattern is a good practice. It helps ensure reproducibility and speeds up dependency installation.


34-38: Foundry Toolchain Installation

Installing the Foundry toolchain using foundry-rs/foundry-toolchain@v1 and specifying version v1.0.0 is implemented correctly. Make sure that this version remains up-to-date and secure for your use cases.


41-41: Local Node Registration

The step that runs dev/register-local-node appropriately sets up the local node required for the upgrade tests. As with the other steps, ensure that any intermittent failures here do not stop the workflow execution and are properly logged.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
pkg/upgrade/docker_utils_test.go (3)

129-130: Consider making the timeout configurable

The hardcoded 10-second timeout might be too short for some environments or slower machines. Consider making this configurable or at least use a constant.

- ctxwc, cancel := context.WithTimeout(ctx, 10*time.Second)
+ const containerStartupTimeout = 30 * time.Second
+ ctxwc, cancel := context.WithTimeout(ctx, containerStartupTimeout)

132-142: Well-structured container configuration

The ContainerRequest provides a clear, structured way to configure the container. The WaitingFor condition is a significant improvement over time-based waiting.

Regarding the TODO on line 141-142: I agree that waiting for health/liveness probe would be more robust than waiting for a specific log message, as log formats can change.


153-156: Consider handling termination errors differently

Currently, if container termination fails, the error is returned. Consider logging this error instead of returning it, especially if this function is called as part of test cleanup. Termination errors shouldn't necessarily cause test failures if the test itself was successful.

- err = container.Terminate(ctxwc)
- if err != nil {
-   return err
- }
+ if err := container.Terminate(ctxwc); err != nil {
+   t.Logf("Warning: Failed to terminate container: %v", err)
+ }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b8ad532 and 7e8c540.

📒 Files selected for processing (1)
  • pkg/upgrade/docker_utils_test.go (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Push Docker Images to GitHub Packages (xmtpd)
  • GitHub Check: Build pre-baked anvil-xmtpd
🔇 Additional comments (6)
pkg/upgrade/docker_utils_test.go (6)

16-21: Good addition of testcontainers library imports

The new imports properly support the transition from manually executing Docker commands to using the testcontainers library, which provides better abstractions for container management in tests.


24-27: Good use of constants

Extracting these values as constants improves code maintainability and readability.


144-151: Good use of testcontainers API

The GenericContainer setup with proper logging and automatic starting is a clean approach.


153-159: Verify container lifetime management

The current implementation starts the container and immediately terminates it. Make sure this is the intended behavior and that any tests that need to interact with the container occur between these operations.

This pattern suggests that either:

  1. This function is part of a setup/teardown flow and actual tests happen elsewhere, or
  2. The container is only needed to perform some initialization that happens during startup

Please confirm that this behavior aligns with the intended use case.


122-128:

Details

✅ Verification successful

Function signature change improves testability

The updated signature now takes a context parameter and has reordered parameters. This change aligns with the goal of using testcontainers and enables proper context management for container lifecycle.

Make sure all callers of this function have been updated to match the new parameter order.


🏁 Script executed:

#!/bin/bash
# Find all calls to runContainer in the codebase to verify they've been updated
rg "runContainer\(" -A 5 --type go

Length of output: 1177


Container Function Signature Updated Correctly

The refactor to include a context parameter and reorder parameters for better testability and container lifecycle management has been verified. All call sites (in both pkg/upgrade/docker_utils_test.go and pkg/upgrade/upgrade_test.go) are now updated to match the new signature:

  • Caller arguments now correctly follow the order: t, ctx, imageName, containerName, envVars.

No further changes are required.


139-142:

Details

❓ Verification inconclusive

Ensure the log message is reliable

The wait condition depends on a specific log message. If this message changes in the application, tests will break.


🏁 Script executed:

#!/bin/bash
# Verify that the log message being waited for is consistently used in the application
rg "replication.api\tserving grpc" --type go

Length of output: 46


Action: Confirm log message consistency & consider using a robust health/liveness check

It appears that searching for the hardcoded log message ("replication.api\tserving grpc") produced no matching results. This absence raises concerns about whether the application consistently emits this log or if the literal string (including the tab) is correct. As the wait condition in tests hinges on this specific message, any discrepancy could lead to flaky tests if the log message ever changes.

  • Verify Logging Output: Please manually confirm that the application actually prints "replication.api serving grpc" (with the intended tab character) during startup, or if an alternative log message should be targeted.
  • Improve Robustness: Consider updating the wait condition to rely on a more robust indicator (e.g., a health or liveness probe), as noted by the existing TODO, to guard against future log changes.

@mkysel
Copy link
Collaborator

mkysel commented Mar 10, 2025

I suspect that it does fix #486, but we still need to fix #560 before the tests turn green again

@fbac fbac force-pushed the 03-10-upgrade_test_testcontainers branch from 7e8c540 to 7092281 Compare March 10, 2025 20:45
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
pkg/upgrade/docker_utils_test.go (3)

126-127: Appropriate timeout context management.

Creating a timeout context with a specific duration is a good practice. Consider making the timeout configurable or at least extracting it to a constant at the top of the file for easier adjustment.


129-139: Well-structured container configuration with proper wait condition.

The ContainerRequest is well configured with proper image, name, environment variables, and wait condition. The host configuration modifier correctly adds the host-gateway setting as previously discussed in reviews.

Note regarding the TODO on line 138: Consider creating an issue to track the improvement of waiting for health/liveness probes instead of log messages.

Would you like me to help draft a more robust health check implementation using testcontainers' wait strategies?


150-155: Container termination is properly handled.

The container is correctly terminated with error propagation. Consider using defer for container termination to ensure it runs even if other operations are added to the function in the future.

func runContainer(
	t *testing.T,
	ctx context.Context,
	imageName string,
	containerName string,
	envVars map[string]string,
) (err error) {
	ctxwc, cancel := context.WithTimeout(ctx, 10*time.Second)
	defer cancel()

	req := testcontainers.ContainerRequest{
		Image: imageName,
		Name:  containerName,
		Env:   envVars,
		HostConfigModifier: func(hc *container.HostConfig) {
			hc.ExtraHosts = append(hc.ExtraHosts, "host.docker.internal:host-gateway")
		},
		WaitingFor: wait.ForLog(
			"replication.api\tserving grpc",
		), // TODO: Ideally we wait for health/liveness probe
	}

	container, err := testcontainers.GenericContainer(ctxwc, testcontainers.GenericContainerRequest{
		ContainerRequest: req,
		Started:          true,
		Logger:           testcontainers.TestLogger(t),
	})
	if err != nil {
		return err
	}

+	defer func() {
+		if tErr := container.Terminate(ctxwc); tErr != nil && err == nil {
+			err = tErr
+		}
+	}()

-	err = container.Terminate(ctxwc)
-	if err != nil {
-		return err
-	}

	return nil
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e8c540 and 7092281.

📒 Files selected for processing (2)
  • pkg/blockchain/blockchainPublisher_test.go (1 hunks)
  • pkg/upgrade/docker_utils_test.go (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • pkg/blockchain/blockchainPublisher_test.go
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Test (Node)
  • GitHub Check: Build pre-baked anvil-xmtpd
  • GitHub Check: Upgrade Tests
🔇 Additional comments (4)
pkg/upgrade/docker_utils_test.go (4)

16-22: Appropriate imports for testcontainers integration.

The new imports support the transition from manual Docker command execution to the testcontainers library, which provides a more structured approach to container testing.


119-125: Good improvement to the function signature.

The updated function signature properly accepts a context parameter and returns an error instead of failing the test directly. This allows for better control over lifecycle and error handling by the caller.


141-149: Good use of testcontainers API with proper error handling.

The GenericContainer request is well-configured with appropriate parameters. Using the TestLogger ensures that container logs are properly captured in test output.


134-135: ExtraHosts configuration correctly implemented.

The ExtraHosts configuration with "host.docker.internal:host-gateway" addresses the issue mentioned in previous reviews and is correctly implemented.

@fbac
Copy link
Collaborator Author

fbac commented Mar 11, 2025

I suspect that it does fix #486, but we still need to fix #560 before the tests turn green again

That's correct, this PR won't fix the contracts divergence. On the other hand, a contract ABI change as big as this one (Nodes V1 to V2) is 100% an upgrade breaking, so it's expected.

It seems this is a point in favor of splitting the contracts and xmtpd repositories, and have different versioning on both. From xmtpd perspective we only care of the contracts ABI at a specific version, from the contract perspective the development should continue without breaking things in xmtpd.
After that a compatibility matrix could be calculated. And in breaking upgrade events handle it with an specific procedure.

@fbac fbac force-pushed the 03-10-upgrade_test_testcontainers branch from 35d9f80 to aa15aa0 Compare March 11, 2025 12:43
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (1)
pkg/upgrade/docker_utils_test.go (1)

149-181: ⚠️ Potential issue

Code duplication detected - conflicting implementations.

There appears to be two implementations of the runContainer function in the file. The second implementation (lines 149-181) handles errors differently by returning them instead of using require.NoError. This causes confusion and potential issues. The file structure needs to be cleaned up to remove this duplication.

The second implementation seems to include the host configuration missing from the first implementation. Please reconcile these implementations to have a single, correct version of the function.

♻️ Duplicate comments (1)
pkg/upgrade/docker_utils_test.go (1)

118-148: ⚠️ Potential issue

Missing host configuration for Docker host access.

The implementation is missing the host configuration modifier that adds "host.docker.internal:host-gateway" to allow containers to access services running on the host. This was noted in a previous review and is necessary for tests running in Linux environments.

Apply this change:

	req := testcontainers.ContainerRequest{
		Image: imageName,
		Name:  containerName,
		Env:   envVars,
+		HostConfigModifier: func(hc *container.HostConfig) {
+			hc.ExtraHosts = append(hc.ExtraHosts, "host.docker.internal:host-gateway")
+		},
		WaitingFor: wait.ForLog(
			"replication.api\tserving grpc",
		), // TODO: Ideally we wait for health/liveness probe
	}
🧹 Nitpick comments (3)
pkg/upgrade/docker_utils_test.go (3)

144-148: Improve error handling in container termination.

The current implementation always terminates the container at the end of the test through a deferred function, but does not properly handle the case where the container might already be terminated or failed to start properly. Consider a more robust error handling approach.

	defer func() {
-		err := container.Terminate(ctx)
-		require.NoError(t, err, "Failed to terminate container")
+		if err := container.Terminate(ctx); err != nil {
+			t.Logf("Warning: Failed to terminate container: %v", err)
+		}
	}()

125-126: Consider extending timeout for container startup.

The current 10-second timeout for container startup might be too short for some environments or when the system is under load. Consider increasing it to at least 30 seconds to accommodate slower environments.

-	ctxwc, cancel := context.WithTimeout(ctx, 10*time.Second)
+	ctxwc, cancel := context.WithTimeout(ctx, 30*time.Second)
	defer cancel()

132-135: Consider implementing health check wait condition.

The TODO comment indicates that waiting for a health/liveness probe would be ideal. Consider implementing this improvement now rather than waiting for a log message, which can be less reliable.

		WaitingFor: wait.ForLog(
			"replication.api\tserving grpc",
-		), // TODO: Ideally we wait for health/liveness probe
+		).WithStartupTimeout(30 * time.Second),

Alternatively, if your container exposes a health check endpoint:

-		WaitingFor: wait.ForLog(
-			"replication.api\tserving grpc",
-		), // TODO: Ideally we wait for health/liveness probe
+		WaitingFor: wait.ForHTTP("/health")
+			.WithStartupTimeout(30 * time.Second)
+			.WithStatusCodeMatcher(func(status int) bool {
+				return status == 200
+			}),
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7092281 and 35d9f80.

📒 Files selected for processing (7)
  • pkg/upgrade/docker_utils_test.go (2 hunks)
  • pkg/upgrade/main_test.go (0 hunks)
  • pkg/upgrade/docker_utils_test.go (2 hunks)
  • pkg/upgrade/docker_utils_test.go (2 hunks)
  • pkg/upgrade/docker_utils_test.go (2 hunks)
  • pkg/upgrade/docker_utils_test.go (3 hunks)
  • pkg/upgrade/main_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • pkg/upgrade/main_test.go
  • pkg/upgrade/main_test.go
  • pkg/upgrade/docker_utils_test.go
  • pkg/upgrade/docker_utils_test.go
  • pkg/upgrade/docker_utils_test.go
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Build pre-baked anvil-xmtpd
  • GitHub Check: Test (Node)
  • GitHub Check: Upgrade Tests
🔇 Additional comments (1)
pkg/upgrade/docker_utils_test.go (1)

16-21: Good addition of testcontainers package for structured testing.

The inclusion of the testcontainers library is a positive change that aligns with the PR objective to simplify upgrade test logic. This library provides a more structured and reliable approach to container management during tests compared to manual Docker command execution.

@fbac fbac requested a review from mkysel March 11, 2025 14:42
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
.github/workflows/upgrade-test.yml (1)

27-42: Verify Environment Setup and Testcontainers Configuration

The workflow steps for setting up the environment (e.g., running dev/docker/up, installing Foundry, deploying contracts, and registering the local node) appear to be in place. However, one of the PR objectives is to “use testcontainers in upgrade tests.” Please ensure that any testcontainers integration or configuration required by your tests is either included here or managed in the testing code.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa15aa0 and 5bb5fbb.

📒 Files selected for processing (1)
  • .github/workflows/upgrade-test.yml (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/upgrade-test.yml

1-1: workflow is empty

(syntax-check)

🔇 Additional comments (1)
.github/workflows/upgrade-test.yml (1)

43-48:

Details

✅ Verification successful

Enhance Error Handling in the Upgrade Tests Step

To meet the PR objectives of not halting the workflow when upgrade tests fail—and instead documenting error outcomes—consider adding an explicit error-handling mechanism such as continue-on-error: true to the “Run Upgrade Tests” step. This will ensure that even if test failures occur, subsequent workflow steps continue to execute.

Here’s a diff example:

-#       - name: Run Upgrade Tests
-#         run: |
-#           export GOPATH="${HOME}/go/"
-#           export PATH="${PATH}:${GOPATH}/bin"
-#           export ENABLE_UPGRADE_TESTS=1
-#           go test github.com/xmtp/xmtpd/pkg/upgrade -v
+      - name: Run Upgrade Tests
+        continue-on-error: true
+        run: |
+          export GOPATH="${HOME}/go/"
+          export PATH="${PATH}:${GOPATH}/bin"
+          export ENABLE_UPGRADE_TESTS=1
+          go test github.com/xmtp/xmtpd/pkg/upgrade -v

Enhance Error Handling in the Upgrade Tests Step

It’s suggested that the upgrade tests step in .github/workflows/upgrade-test.yml be modified so that failures in the upgrade tests do not halt the entire workflow. To achieve this, please update the step to include the continue-on-error: true configuration. This change will ensure that even if the upgrade tests fail, subsequent workflow steps will still execute.

Below is the revised diff for clarity:

-#       - name: Run Upgrade Tests
-#         run: |
-#           export GOPATH="${HOME}/go/"
-#           export PATH="${PATH}:${GOPATH}/bin"
-#           export ENABLE_UPGRADE_TESTS=1
-#           go test github.com/xmtp/xmtpd/pkg/upgrade -v
+      - name: Run Upgrade Tests
+        continue-on-error: true
+        run: |
+          export GOPATH="${HOME}/go/"
+          export PATH="${PATH}:${GOPATH}/bin"
+          export ENABLE_UPGRADE_TESTS=1
+          go test github.com/xmtp/xmtpd/pkg/upgrade -v

Please ensure that this change aligns with the PR objectives and maintains the intended workflow behavior.

@fbac fbac merged commit ddf96ca into main Mar 14, 2025
7 checks passed
@fbac fbac deleted the 03-10-upgrade_test_testcontainers branch March 14, 2025 15:21
@coderabbitai coderabbitai bot mentioned this pull request Mar 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve upgrade tests by using testcontainers-go

3 participants