@mkysel mkysel commented Apr 22, 2025

Add blockchain performance monitoring metrics to measure transaction confirmation, payload publishing, and log processing times

Implements new Prometheus metrics across multiple components to monitor blockchain and indexer performance.

📍Where to Start

Start with the new blockchain metrics implementation in metrics/blockchain.go which defines the core metrics and helper functions used throughout the changes.


Macroscope summarized 40eef93.

Summary by CodeRabbit

  • New Features
    • Added new metrics for monitoring blockchain transaction wait times, payload publishing durations, log processing times, and sync connection statuses.
    • Enhanced observability with additional debug logs for blockchain publishing and node processing.
  • Documentation
    • Updated the metrics catalog to include new metric definitions and descriptions.
  • Style
    • Minor formatting cleanup in logging output.

@mkysel mkysel requested review from a team as code owners April 22, 2025 21:25

coderabbitai bot commented Apr 22, 2025

Walkthrough

This set of changes introduces new Prometheus metrics and corresponding instrumentation across several components to improve observability of blockchain-related operations. New metrics are defined for tracking the time spent waiting for blockchain transaction receipts, the duration of publishing payloads to the blockchain, and the time taken to process blockchain logs. The codebase is updated to emit these metrics at relevant points, and the metrics catalog documentation is extended to describe the new metrics. Additionally, the metrics documentation generator is updated to recognize more metric constructor functions.

Changes

| File(s) | Change Summary |
|---|---|
| doc/metrics_catalog.md | Extended documentation with new metrics for blockchain, payer, sync, and indexer packages. |
| pkg/metrics/blockchain.go, pkg/metrics/indexer.go | Introduced new Prometheus histogram metrics for blockchain transaction wait time, payload publish duration, and indexer log processing time, along with functions to emit these metrics. Renamed existing indexer metrics prefix from xmtpd_ to xmtp_. |
| pkg/metrics/metrics.go | Registered new metrics collectors for the added blockchain and indexer metrics. |
| pkg/api/payer/service.go | Added instrumentation to measure and emit metrics for publishing group messages and identity updates to the blockchain, along with debug logs. |
| pkg/blockchain/client.go | Added timing instrumentation to measure and emit the duration of waiting for blockchain transaction receipts. |
| pkg/indexer/indexer.go | Added timing measurement and metric emission for log processing duration in the indexer. |
| pkg/metrics/docs/generator.go | Updated metricTypes map to recognize additional metric constructor functions for documentation generation. |
| pkg/blockchain/rpcLogStreamer.go | Removed an extraneous blank line for formatting cleanup. |

Sequence Diagram(s)

sequenceDiagram
    participant PayerService
    participant Metrics
    participant Blockchain

    PayerService->>Metrics: MeasurePublishToBlockchainMethod("group_message", fn)
    activate Metrics
    Metrics->>Blockchain: PublishGroupMessage()
    Blockchain-->>Metrics: result, err
    Metrics-->>PayerService: result, err (after emitting metric)
sequenceDiagram
    participant BlockchainClient
    participant Metrics

    BlockchainClient->>BlockchainClient: WaitForTransaction()
    Note right of BlockchainClient: Record start time
    BlockchainClient->>BlockchainClient: ...wait for receipt...
    BlockchainClient->>Metrics: EmitBlockchainWaitForTransaction(duration)
sequenceDiagram
    participant Indexer
    participant Metrics

    loop For each event in eventChannel
        Indexer->>Indexer: Record start time
        Indexer->>Indexer: Store log, update block tracker
        Indexer->>Metrics: EmitIndexerLogProcessingTime(duration)
    end

Possibly related PRs

  • Add payer metrics #622: Adds payer package metrics and instrumentation for blockchain publishing operations, closely related to the payer metrics and instrumentation in this PR.
  • Indexer metrics + indexer refactors #639: Refactors log streaming and adds indexer metrics related to block processing and retryable errors, related to indexer metrics changes in this PR.
  • Generate metrics catalog #719: Introduces tooling for automated generation of the metrics catalog documentation, directly related to the documentation updates in this PR.

Suggested reviewers

  • fbac
  • neekolas

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (1.64.8)

Error: you are using a configuration file for golangci-lint v2 with golangci-lint v1: please use golangci-lint v2
Failed executing command with error: you are using a configuration file for golangci-lint v2 with golangci-lint v1: please use golangci-lint v2


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d1d773 and 29b87f5.

📒 Files selected for processing (3)
  • doc/metrics_catalog.md (1 hunks)
  • pkg/metrics/blockchain.go (1 hunks)
  • pkg/metrics/indexer.go (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/metrics/blockchain.go
  • doc/metrics_catalog.md
  • pkg/metrics/indexer.go
⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: Push Docker Images to GitHub Packages (xmtpd-cli)
  • GitHub Check: Upgrade Tests
  • GitHub Check: Test (Node)
  • GitHub Check: Push Docker Images to GitHub Packages (xmtpd)
  • GitHub Check: Code Review


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
pkg/metrics/blockchain.go (1)

20-26: Consider defining custom histogram buckets.

The blockchainPublishPayload histogram doesn't specify custom buckets, so it will use Prometheus defaults. If you have expectations about the range of publish durations, consider defining custom buckets similar to how you did for blockchainWaitForTransaction.

 	prometheus.HistogramOpts{
 		Name: "xmtpd_blockchain_publish_payload_seconds",
 		Help: "Time to publish a payload to the blockchain",
+		Buckets: []float64{0.1, 0.5, 1, 2.5, 5, 10, 30},
 	},
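To illustrate why the bucket layout matters, here is a minimal standard-library sketch that counts observations the way Prometheus histogram buckets do (cumulatively, by upper bound). The `defBuckets` values mirror `prometheus.DefBuckets` from client_golang; `cumulativeCounts` and the sample durations are hypothetical and chosen only for illustration, not measurements from this PR.

```go
package main

import "fmt"

// defBuckets mirrors prometheus.DefBuckets (upper bounds in seconds).
var defBuckets = []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10}

// suggested is the bucket layout proposed in the review comment above.
var suggested = []float64{0.1, 0.5, 1, 2.5, 5, 10, 30}

// cumulativeCounts returns, for each upper bound, how many observations are
// less than or equal to it. The final element is the implicit +Inf bucket,
// which always contains every observation.
func cumulativeCounts(bounds, obs []float64) []int {
	counts := make([]int, len(bounds)+1)
	for _, o := range obs {
		for i, b := range bounds {
			if o <= b {
				counts[i]++
			}
		}
		counts[len(bounds)]++
	}
	return counts
}

func main() {
	// Hypothetical publish durations in seconds.
	obs := []float64{0.8, 3, 12, 20, 25}
	fmt.Println("default:  ", cumulativeCounts(defBuckets, obs))
	fmt.Println("suggested:", cumulativeCounts(suggested, obs))
}
```

With these sample values, every duration above 10 seconds is visible only in the +Inf bucket under the defaults, while the suggested layout still separates them at the 30 second bound. Whether that range matches real publish times is something only production data can answer.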
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6022791 and 40eef93.

📒 Files selected for processing (9)
  • doc/metrics_catalog.md (1 hunks)
  • pkg/api/payer/service.go (4 hunks)
  • pkg/blockchain/client.go (2 hunks)
  • pkg/blockchain/rpcLogStreamer.go (0 hunks)
  • pkg/indexer/indexer.go (2 hunks)
  • pkg/metrics/blockchain.go (1 hunks)
  • pkg/metrics/docs/generator.go (1 hunks)
  • pkg/metrics/indexer.go (2 hunks)
  • pkg/metrics/metrics.go (2 hunks)
💤 Files with no reviewable changes (1)
  • pkg/blockchain/rpcLogStreamer.go
🧰 Additional context used
🧬 Code Graph Analysis (3)
pkg/blockchain/client.go (1)
pkg/metrics/blockchain.go (1)
  • EmitBlockchainWaitForTransaction (16-18)
pkg/indexer/indexer.go (1)
pkg/metrics/indexer.go (1)
  • EmitIndexerLogProcessingTime (115-117)
pkg/api/payer/service.go (3)
pkg/metrics/blockchain.go (1)
  • MeasurePublishToBlockchainMethod (33-39)
pkg/abi/groupmessagebroadcaster/GroupMessageBroadcaster.go (1)
  • GroupMessageBroadcasterMessageSent (1093-1098)
pkg/abi/identityupdatebroadcaster/IdentityUpdateBroadcaster.go (1)
  • IdentityUpdateBroadcasterIdentityUpdateCreated (824-829)
🪛 GitHub Check: Lint-Go
pkg/blockchain/client.go

[failure] 6-6:
could not import github.com/xmtp/xmtpd/pkg/metrics (-: # github.com/xmtp/xmtpd/pkg/metrics

pkg/metrics/metrics.go

[failure] 92-92:
undefined: blockchainPublish) (typecheck)


[failure] 92-92:
undefined: blockchainPublish (typecheck)

🪛 GitHub Check: Test (Node)
pkg/metrics/metrics.go

[failure] 92-92:
undefined: blockchainPublish

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Push Docker Images to GitHub Packages (xmtpd-cli)
  • GitHub Check: Push Docker Images to GitHub Packages (xmtpd)
  • GitHub Check: Upgrade Tests
🔇 Additional comments (14)
pkg/metrics/indexer.go (2)

67-72: Well-designed histogram for tracking log processing time.

The new histogram metric is clearly defined with an appropriate name following Prometheus naming conventions, including the unit suffix (_seconds). This will provide valuable insights into blockchain log processing performance.


115-117: LGTM! Correctly converts duration to seconds.

The function properly converts the duration to seconds before recording the observation, which is the standard practice for time-based metrics in Prometheus.

pkg/metrics/blockchain.go (1)

33-39: Good implementation of a generic measurement wrapper.

The generic function elegantly handles timing measurements while preserving the return values and error handling of the wrapped function. The use of defer ensures the metric is always emitted regardless of how the function exits.

pkg/blockchain/client.go (1)

87-91: Well-placed instrumentation for transaction wait time.

✅ Verification successful

The timing measurement is correctly implemented with defer to ensure it's always executed. This will provide valuable insights into blockchain transaction wait times, which is critical for performance monitoring.


🏁 Script executed:

#!/bin/bash
# Check if the metric has been registered in pkg/metrics/metrics.go
rg -A 1 -B 1 "blockchainWaitForTransaction" ./pkg/metrics/metrics.go

Length of output: 153


Instrumentation Verified: blockchainWaitForTransaction Metric Registered

The defer-based timing in pkg/blockchain/client.go (lines 87–91) is correctly implemented, and the blockchainWaitForTransaction metric is confirmed in pkg/metrics/metrics.go. No further changes needed.

pkg/indexer/indexer.go (1)

254-254: Good placement of timing instrumentation for log processing.

The timing code correctly captures the full duration of processing a blockchain log, including the storage operation and block tracker update. The measurement starts before any processing begins and ends after all operations are completed, providing an accurate view of the end-to-end processing time.

Also applies to: 356-356

pkg/metrics/docs/generator.go (1)

34-37: LGTM! Great extension of metric type recognition

These additions enable the documentation generator to recognize non-Vec metric constructor functions, ensuring all metrics get properly documented regardless of how they're created.
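As a rough illustration of how such a lookup might work, the sketch below parses Go source with the standard `go/parser` package and matches constructor names against a map. The real generator's logic and the exact contents of its `metricTypes` map are not shown on this page, so the map entries and `findMetricKinds` are assumptions.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// metricTypes maps constructor names to metric kinds. These entries are
// assumed for illustration; the real map may differ.
var metricTypes = map[string]string{
	"NewCounterVec":   "Counter",
	"NewGaugeVec":     "Gauge",
	"NewHistogramVec": "Histogram",
	// Non-Vec constructors, the kind of entries this change is said to add.
	"NewCounter":   "Counter",
	"NewGauge":     "Gauge",
	"NewHistogram": "Histogram",
}

// findMetricKinds parses Go source and returns the kind of every recognized
// constructor call of the form pkg.NewXxx(...), in source order.
func findMetricKinds(src string) ([]string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), "metrics.go", src, 0)
	if err != nil {
		return nil, err
	}
	var kinds []string
	ast.Inspect(f, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		if sel, ok := call.Fun.(*ast.SelectorExpr); ok {
			if kind, known := metricTypes[sel.Sel.Name]; known {
				kinds = append(kinds, kind)
			}
		}
		return true
	})
	return kinds, nil
}

func main() {
	src := `package metrics
var a = prometheus.NewHistogram(prometheus.HistogramOpts{Name: "x_seconds"})
var b = prometheus.NewCounterVec(prometheus.CounterOpts{Name: "y_total"}, []string{"l"})
`
	kinds, err := findMetricKinds(src)
	fmt.Println(kinds, err)
}
```

Without the non-Vec entries, a metric created with `NewHistogram` would simply not match and would be missing from the generated catalog.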

pkg/api/payer/service.go (4)

284-288: Excellent instrumentation of group message publishing

The use of MeasurePublishToBlockchainMethod provides consistent timing metrics for blockchain operations, which aligns well with the PR objectives of improving observability.


310-314: Excellent instrumentation of identity update publishing

The instrumentation approach is consistent with the group message publishing, maintaining a uniform pattern for measuring blockchain operations.


346-347: Good debug logging for performance tracking

These debug logs complement the metrics by providing developer visibility into blockchain publishing performance during troubleshooting.


370-371: Useful debug log for tracking node processing

This log helps track when messages are being processed by nodes, improving the observability of the end-to-end message flow.

doc/metrics_catalog.md (4)

7-8: LGTM! Well-documented blockchain transaction wait time metric

The documentation clearly describes the purpose of this metric for tracking time spent waiting for transaction receipts.


9-9: LGTM! Clear documentation for payer nonce metric

Good description that clarifies this tracks the least recently used nonce and includes the caveat that it's not guaranteed to be the highest nonce.


13-14: LGTM! Useful sync connection metrics

These metrics will help track failed outgoing sync connections, providing visibility into sync issues.


18-20: LGTM! Excellent documentation for new blockchain and indexer metrics

The newly added metrics for outgoing sync connections, blockchain publishing, and log processing are clearly documented with descriptive names and purposes.


macroscopeapp bot commented Apr 22, 2025

Add instrumentation to measure blockchain and indexer performance metrics across XMTP services

Introduces new metrics instrumentation across multiple components:

  • Adds blockchain performance tracking in client.go for transaction wait times and service.go for payload publishing
  • Implements indexer performance monitoring in indexer.go for log processing times
  • Creates new metrics collection infrastructure in blockchain.go with histograms for transaction and publishing metrics
  • Updates metrics_catalog.md with documentation for six new metrics
  • Extends metrics generator in generator.go to support non-vector Prometheus types

📍Where to Start

Start with the new metrics infrastructure in blockchain.go which defines the core measurement functions EmitBlockchainWaitForTransaction and EmitBlockchainPublish that are used throughout the changes.

Changes since #736 opened

  • Renamed metrics across blockchain and indexer components from xmtpd_ prefix to xmtp_ prefix [29b87f5]

Macroscope summarized 29b87f5.

@mkysel mkysel merged commit f519ad6 into main Apr 23, 2025
8 checks passed
@mkysel mkysel deleted the mkysel/even-more-metrics branch April 23, 2025 15:34