
Conversation

@fbac fbac (Collaborator) commented Oct 21, 2025

Standardize logging across services and components by adopting pkg/utils/log.go field helpers and named loggers, lowercasing messages, and renaming public APIs that carry logger parameters and fields

Unifies logging by introducing shared logger names and field helpers, lowercasing messages, and refactoring logger parameters and fields from log to logger across services, blockchain admins, indexers, workers, and CLIs; renames select public methods and constants to CamelCase and updates callers and tests accordingly; adds minor error-wrapping fixes and small utility features.

  • Introduces pkg/utils/log.go with standardized logger names and zap.Field helpers; updates components to use named loggers and consistent snake_case keys; lowers message casing throughout log.go
  • Refactors config and structs to Logger/logger and propagates named loggers across API servers, gateway, sync, indexer, migrator, workers, and blockchain packages; adjusts startup/shutdown logs and messages
  • Renames exported constants and method names to CamelCase (e.g., SetHTTPAddress, parameter keys, originator IDs) and updates tests and call sites
  • Standardizes interceptor and RPC/DB connection logs, field keys, and error messages; adds conditional debug logging and structured fields in hot paths
  • Applies error wrapping with %w in several constructors and helpers; adds minor utilities (currency.PicoDollar.String, ToDollars) and package comments

📍Where to Start

Start with the logging utilities in pkg/utils/log.go to see the new logger names and field helpers, then review how a representative service adopts them in pkg/api/server.go (https://github.com/xmtp/xmtpd/pull/1262/files#diff-734f9c34b34af099d24da13054e167f7df72706dc733edf21c61c0cc1548224a) and a blockchain component in pkg/blockchain/node_registry_admin.go (https://github.com/xmtp/xmtpd/pull/1262/files#diff-a06914b44c66a242a54ae2cfff18376479af3d12f90918be03417260f4674ea9).

Changes since #1262 opened

  • Removed unused constant declaration [70c26d5]
  • Consolidated test address variables [70c26d5]
  • Standardized parameter naming in mock contract methods [d236294]
  • Disabled test execution in misbehavior package [df27450]
  • Corrected log message assertion in prune test [df27450]
  • Renamed zap logger field key in mintHandler success log [e19e281]

📊 Macroscope summarized e19e281. 53 files reviewed, 101 issues evaluated, 96 issues filtered, 0 comments posted

🗂️ Filtered Issues

cmd/prune/main.go — 0 comments posted, 2 evaluated, 2 filtered
  • line 64: main opens a database connection via db.ConnectToDB and never closes it. There is no defer dbInstance.Close() or any explicit close on success, error, or early termination paths. This causes a resource leak of the DB connection pool both on normal completion and on error paths (e.g., fatal(...) exits). Even though this is a short-lived command, the leak is still a bug per the single-paired-cleanup invariant. Add defer dbInstance.Close() immediately after a successful connection, ensuring it runs on all exit paths. [ Invalidated by documentation search ]
  • line 74: main calls prune.NewPruneExecutor(...) after acquiring the DB connection. NewPruneExecutor panics when config.BatchSize <= 0. If that panic occurs, the DB connection remains open because main does not defer dbInstance.Close(). This violates the single paired cleanup invariant on a panic path. Guard against panic by validating BatchSize before calling, or at minimum defer dbInstance.Close() immediately after successful connect to ensure cleanup in all paths. [ Invalidated by documentation search ]
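
Both cmd/prune items above come down to the same missing pattern: register the close immediately after the connect succeeds, so every later return (or panic) path releases the pool. A minimal sketch of that shape, assuming a ConnectToDB-style helper that hands back a *sql.DB; the wrapper name and signature here are illustrative, not the actual cmd/prune code:

```go
package sketch

import (
	"context"
	"database/sql"
	"fmt"
)

// runPrune shows the paired-cleanup shape suggested above: defer the Close
// right after a successful connect so success, error, and panic paths all
// release the connection pool.
func runPrune(ctx context.Context, connect func(context.Context) (*sql.DB, error)) error {
	dbInstance, err := connect(ctx)
	if err != nil {
		return fmt.Errorf("connect to db: %w", err)
	}
	defer func() {
		_ = dbInstance.Close() // runs on normal completion, error returns, and panics alike
	}()

	// ... validate config (e.g., BatchSize > 0) before constructing the executor ...
	return nil
}
```
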
cmd/replication/main.go — 0 comments posted, 2 evaluated, 1 filtered
  • line 61: If Version cannot be parsed as semver, the code logs the error but continues, passing a potentially nil version into server.WithServerVersion(version). Downstream components (e.g., registrant.NewRegistrant) may assume a non-nil version and could error or behave inconsistently. At minimum, this causes an externally visible contract ambiguity (version required earlier but not validated for semver correctness) and may lead to runtime errors depending on downstream expectations. Consider failing fast when semver parsing fails or providing a safe non-nil fallback. [ Low confidence ]
cmd/xmtpd-cli/commands/admin_setup.go — 0 comments posted, 1 evaluated, 1 filtered
  • line 344: setupFundsAdmin creates two RPC clients (chainClientSettlement and chainClientApp) but does not close them on any error path after they are created. If any subsequent step fails (e.g., signer construction or blockchain.NewFundsAdmin), the function returns an error with both clients left open. This is a resource leak (open network connections) and violates the single paired cleanup requirement after introducing external effects. Add Close() calls for the clients on every error path after their creation (or use a scoped cleanup via defer that is canceled when returning success). [ Low confidence ]
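
One way to address the leak above is the scoped cleanup that is cancelled on success, as the issue suggests. A hedged sketch, with dialer and builder callbacks standing in for the real RPC client and FundsAdmin constructors (all names here are hypothetical):

```go
package sketch

// closer matches the Close method on the RPC clients mentioned above.
type closer interface{ Close() }

// setupClients closes both clients on any failure after they are created; on
// success the cleanup is skipped and ownership passes to the caller.
func setupClients(
	dialSettlement, dialApp func() (closer, error),
	build func(settlement, app closer) error,
) (err error) {
	settlement, err := dialSettlement()
	if err != nil {
		return err
	}
	app, err := dialApp()
	if err != nil {
		settlement.Close()
		return err
	}
	defer func() {
		if err != nil { // any later failure closes both clients
			settlement.Close()
			app.Close()
		}
	}()
	return build(settlement, app)
}
```
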
pkg/api/message/publish_worker.go — 0 comments posted, 5 evaluated, 5 filtered
  • line 39: startPublishWorker assumes logger and store are non-nil. It immediately calls logger.Named(utils.PublishWorkerName) and queries.New(store). If logger or store are nil, this will cause a runtime panic (nil pointer dereference) in the worker initialization goroutine. The caller does not guard these inputs before calling this function.

Concrete failure paths:

  • logger == nil → line 39 logger.Named(...) → panic.
  • store == nil → line 42 queries.New(store) returns a Queries with nil db; later the subscription query will call QueryContext on nil db → panic.
    Additionally, feeCalculator and registrant are stored into the worker without validation, creating latent panic/logic issues when used later. [ Low confidence ]
  • line 121: originatorID := int32(p.registrant.NodeID()) narrows a uint32 to int32 without bounds checking. If a node ID exceeds math.MaxInt32 (2,147,483,647), the conversion will wrap to a negative or otherwise incorrect value. This corrupted originatorID is then used in database parameters (InsertGatewayEnvelopeParams.OriginatorNodeID and IncrementUnsettledUsageParams.OriginatorID) and logs, leading to incorrect persisted data and potential downstream logic errors. [ Low confidence ]
  • line 128: The worker’s failure handling creates an infinite retry loop for permanent errors, preventing progress on subsequent envelopes and violating the requirement that every input reaches a terminal state. Examples include: invalid payer envelope bytes (NewPayerEnvelopeFromBytes error), invalid topic (ParseTopic error), and invalid signature (RecoverSigner error). In these cases, publishStagedEnvelope returns false without any remediation or dead-letter path, and start() loops forever on the same staged envelope. [ Low confidence ]
  • line 151: Unbounded retentionDays from the payer envelope is forwarded to Registrant.SignStagedEnvelope without validation. In SignStagedEnvelope, time.Hour * 24 * time.Duration(retentionDays) can overflow time.Duration (an int64 nanoseconds count) when retentionDays exceeds approximately 106,751 days, yielding incorrect expiry computation. This produces invalid ExpiryUnixtime and thus malformed Expiry stored in the DB. The source of retentionDays (env.RetentionDays()) is external input; no guard is present here. [ Low confidence ]
  • line 208: SpendPicodollars is computed as int64(baseFee) + int64(congestionFee) without guarding against negative values or overflow. If fees.IFeeCalculator returns negative fees or values large enough to overflow int64 on addition, this will produce an incorrect spend (potentially negative), which can corrupt unsettled usage accounting (e.g., decrement usage instead of increment). [ Low confidence ]
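
The narrowing and overflow items above (the originator ID cast and the SpendPicodollars sum) both admit small guards. An illustrative pair of helpers, assuming int64 picodollar fees as described; these are sketches, not the worker's actual code:

```go
package sketch

import (
	"fmt"
	"math"
)

// toInt32 guards the uint32→int32 narrowing flagged above.
func toInt32(nodeID uint32) (int32, error) {
	if nodeID > math.MaxInt32 {
		return 0, fmt.Errorf("node id %d exceeds int32 range", nodeID)
	}
	return int32(nodeID), nil
}

// addFees rejects negative fees and sums that would overflow int64.
func addFees(baseFee, congestionFee int64) (int64, error) {
	if baseFee < 0 || congestionFee < 0 {
		return 0, fmt.Errorf("negative fee: base=%d congestion=%d", baseFee, congestionFee)
	}
	if baseFee > math.MaxInt64-congestionFee {
		return 0, fmt.Errorf("fee sum overflows int64")
	}
	return baseFee + congestionFee, nil
}
```
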
pkg/api/message/service.go — 0 comments posted, 9 evaluated, 9 filtered
  • line 81: NewReplicationAPIService performs a nil check only for validationService and then immediately invokes startPublishWorker and startSubscribeWorker with other dependencies that are assumed non-nil (logger, store, registrant, feeCalculator, updater). If any of these are nil, downstream code will panic at runtime. Examples: startPublishWorker calls logger.Named(...) which dereferences logger; queries.New(store) relies on store being non-nil, and the resulting query function will call QueryContext on a nil DB handle. The service construction path has no guard, so a misconfigured startup will crash.

Concrete failure paths:

  • logger == nil → startPublishWorker line 39 calls logger.Named(...) → nil pointer deref panic.
  • store == nil → queries.New(store) creates a Queries with a nil db; later the subscription query will call q.db.QueryContext(...) → nil pointer deref panic.
  • feeCalculator == nil and registrant == nil may not crash immediately here but will be stored and used later, creating latent panic paths; since they are essential to the worker, these should be validated up-front.
  • updater == nil may lead to later uses failing (cursor updates) without clear error. The constructor should enforce required invariants before spawning goroutines. [ Low confidence ]
  • line 391: PublishPayerEnvelopes silently ignores all but the first PayerEnvelope in the request and only returns a single OriginatorEnvelope in the response, creating a contract-parity mismatch and silent data loss for batch inputs. The request type PublishPayerEnvelopesRequest accepts multiple PayerEnvelopes, and the response type PublishPayerEnvelopesResponse returns a list of OriginatorEnvelope. However, the implementation validates and processes only req.GetPayerEnvelopes()[0] and constructs OriginatorEnvelopes: []*envelopesProto.OriginatorEnvelope{originatorEnv}, discarding any additional envelopes without error or notice. This is a runtime logic bug: batch requests are accepted but not fully processed, and the response does not correspond one-to-one with the input elements. [ Low confidence ]
  • line 443: Non-blocking notification in publishWorker.notifyStagedPublish() can be dropped silently, potentially causing waitForGatewayPublish to hit its 30s timeout unnecessarily if the publisher relies on notifications to wake up. The call site in PublishPayerEnvelopes (s.publishWorker.notifyStagedPublish()) does not check whether the notification was delivered. If the notifier channel is full or has no receiver, the notification is discarded, which can delay processing and lead to timeouts or increased latency. [ Low confidence ]
  • line 463: waitForGatewayPublish times out or returns early on ctx.Done() but PublishPayerEnvelopes does not propagate any failure or status and still returns a successful response with the OriginatorEnvelope. This can cause clients to receive a success response even though the publish may not have completed, violating expected publish semantics. Specifically, waitForGatewayPublish only logs (at debug) on timeout or cancellation and returns; the caller then unconditionally builds and returns the response. If the API contract requires the publish to have completed before responding, this is a runtime behavior bug. At minimum, an explicit status or error should be returned to avoid silent partial completion. [ Low confidence ]
  • line 503: GetInboxIds performs no input validation on the number of requests. It accepts an unbounded req.Requests, builds an equally large addresses slice, and sends it to the database via queries.GetAddressLogs. This can create extremely large SQL array parameters and response allocations, causing memory spikes or heavy database load. Constants like maxQueriesPerRequest exist but are not enforced here. [ Out of scope ]
  • line 510: Inconsistent error mapping between methods: GetInboxIds returns the raw DB error (return nil, err), which gRPC will map to an Unknown code, while GetNewestEnvelope wraps DB errors using status.Errorf(codes.Internal, ...). This inconsistency changes externally visible error codes for similar server-side failures, breaking contract parity across related endpoints. [ Out of scope ]
  • line 546: GetNewestEnvelope performs no input validation on the number of topics or the length of each topic. It accepts an unbounded req.Topics and passes it to the DB, potentially creating extremely large array parameters and allocations. Constants such as maxQueriesPerRequest and maxTopicLength are defined in this file but not enforced here, exposing the service to resource exhaustion or oversized inputs. [ Out of scope ]
  • line 550: GetNewestEnvelope mishandles duplicate topics in the request. The map originalSort[string(topic)] = idx overwrites earlier indices when the same topic appears multiple times, and later only a single results[idx] is populated. Earlier duplicate positions remain nil (or unpopulated), violating the intended 1:1 correspondence "The newest envelope for the given topic OR null" for each occurrence. This yields missing responses for earlier duplicates. [ Out of scope ]
  • line 565: GetNewestEnvelope constructs results := make([]*message_api.GetNewestEnvelopeResponse_Response, len(topics)) and intentionally leaves entries as nil when there is no envelope for a given topic. Returning a repeated message field containing nil elements is invalid for protobuf v2 and will trigger a panic during gRPC/protobuf marshaling (historically: "proto: repeated field contains nil element"). The correct representation of "no envelope" should be a non-nil Response element with its OriginatorEnvelope oneof unset (zero-value), not a nil entry in the slice. [ Out of scope ]
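
For the constructor-guard concern at the top of this list, the usual fix is to validate required dependencies before spawning any worker goroutines. A minimal, hypothetical guard; the parameter set mirrors the dependencies named above but is not the real constructor signature:

```go
package sketch

import (
	"database/sql"
	"errors"

	"go.uber.org/zap"
)

// checkServiceDeps fails fast on misconfiguration instead of letting the
// workers panic later on a nil logger or store.
func checkServiceDeps(logger *zap.Logger, store *sql.DB) error {
	if logger == nil {
		return errors.New("logger is required")
	}
	if store == nil {
		return errors.New("store is required")
	}
	// registrant, feeCalculator, and updater would get the same nil checks here.
	return nil
}
```
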
pkg/api/message/subscribe_worker.go — 0 comments posted, 2 evaluated, 1 filtered
  • line 157: startSubscribeWorker assumes logger and store are non-nil. It immediately calls logger.Named(utils.SubscribeWorkerLoggerName) and queries.New(store). If logger or store are nil, this results in a runtime panic during worker initialization. The function lacks input validation and returns no error for this configuration issue, making the service crash instead of failing fast with a clear error. [ Low confidence ]
pkg/api/payer/service.go — 0 comments posted, 2 evaluated, 2 filtered
  • line 145: PublishClientEnvelopes assumes the per-node response slice (originatorEnvelopes) has the exact same length and ordering as the request slice (payloadsWithIndex) and directly indexes into it with originatorEnvelopes[idx] while iterating payloadsWithIndex. If the downstream node returns fewer envelopes or a different ordering, this can cause an out-of-bounds panic or incorrectly assign envelopes to the wrong positions in out. There is no guard to verify length equality before indexing. The problematic lines are the loop assigning out[payloadsWithIndex[idx].originalIndex] = originatorEnvelope without validating len(originatorEnvelopes) == len(payloadsWithIndex). [ Low confidence ]
  • line 436: Misuse of logging field helper utils.OriginatorIDField to log a node ID in publishToBlockchain and publishToNodeWithRetry. For example, s.logger.Debug("waiting for message to be processed by node", utils.OriginatorIDField(targetNodeID)) and error logs in retry use utils.OriginatorIDField(nodeID). This labels the log field as originator_id while the value is a node ID, which can mislead observability and monitoring systems and cause incorrect attribution in metrics and logs. [ Low confidence ]
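
The indexing issue above can be closed with a length check before the assignment loop. A trivial sketch of the guard (names are illustrative):

```go
package sketch

import "fmt"

// checkResponseCount is a pre-flight for the loop above: verify the node
// returned exactly one originator envelope per submitted payload before
// indexing originatorEnvelopes[idx].
func checkResponseCount(payloads, envelopes int) error {
	if payloads != envelopes {
		return fmt.Errorf("node returned %d envelopes for %d payloads", envelopes, payloads)
	}
	return nil
}
```
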
pkg/blockchain/app_chain_admin.go — 0 comments posted, 2 evaluated, 2 filtered
  • line 59: NewAppChainAdmin does not check for client == nil, signer == nil, or parameterAdmin == nil. If any of these are nil, subsequent uses of the admin (e.g., reading or writing parameters, interacting with contracts) will dereference nil pointers and panic. The constructor should return an error when required dependencies are nil. [ Low confidence ]
  • line 65: NewAppChainAdmin does not validate the configured contract addresses. Passing an empty or malformed address string in contractsOptions.AppChain.IdentityUpdateBroadcasterAddress, GroupMessageBroadcasterAddress, or GatewayAddress will result in common.HexToAddress returning the zero address, and the contract bindings will be created against 0x0000000000000000000000000000000000000000. This will silently succeed and defer failures to later calls, causing hard-to-diagnose runtime errors. The constructor should explicitly validate that each address string is non-empty and a valid hex address, and reject invalid inputs with an error. [ Low confidence ]
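
The zero-address concerns here (and in the other blockchain constructors below) share one fix: validate before binding. A hedged helper sketch using go-ethereum's common package; the helper itself is hypothetical, not existing xmtpd code:

```go
package sketch

import (
	"fmt"

	"github.com/ethereum/go-ethereum/common"
)

// parseContractAddress rejects empty, malformed, and zero addresses before a
// contract binding is created against them.
func parseContractAddress(name, raw string) (common.Address, error) {
	if raw == "" {
		return common.Address{}, fmt.Errorf("%s address is empty", name)
	}
	if !common.IsHexAddress(raw) {
		return common.Address{}, fmt.Errorf("%s address %q is not a valid hex address", name, raw)
	}
	addr := common.HexToAddress(raw)
	if addr == (common.Address{}) {
		return common.Address{}, fmt.Errorf("%s address resolves to the zero address", name)
	}
	return addr, nil
}
```
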
pkg/blockchain/blockchain_publisher.go — 0 comments posted, 5 evaluated, 4 filtered
  • line 53: NewBlockchainPublisher does not validate the configured contract addresses. If contractOptions.AppChain.GroupMessageBroadcasterAddress or IdentityUpdateBroadcasterAddress are empty or invalid, common.HexToAddress will produce the zero address and the bindings will be created against it, leading to subtle runtime failures later. The constructor should validate addresses and return errors for invalid inputs. [ Low confidence ]
  • line 68: NewBlockchainPublisher does not validate that signer and nonceManager are non-nil. It calls signer.FromAddress() (line 68) and methods on nonceManager (lines 75 and 110). If either dependency is nil, the function will panic at runtime. Add explicit nil checks and return errors early to ensure safe operation. [ Low confidence ]
  • line 99: NewBlockchainPublisher creates a time.NewTicker(10 * time.Second) inside the replenishment goroutine but never stops it. On context cancellation (select case <-innerCtx.Done()), the goroutine returns without calling ticker.Stop(), leaking the ticker's underlying timer resource. Add defer ticker.Stop() immediately after ticker creation to ensure cleanup on all exit paths. [ Invalidated by documentation search ]
  • line 429: The deferred Cancel on any error after a successful create introduces a correctness bug: once a transaction has been created/submitted using the reserved nonce, a later failure in wait (e.g., timeout, context cancellation, transient RPC error) will cause nonceContext.Cancel() to run via the defer. Per the NonceContext contract, Cancel "releases the nonce reservation, making it available for reuse". However, the transaction may still be pending or even mined; releasing the nonce risks assigning the same nonce to a new transaction, causing a collision and subsequent nonce too low or replacement conflicts.

This violates at-most-once semantics for nonce assignment and breaks invariants after an external effect (the transaction submission). Instead, the code should ensure the nonce is not reintroduced to the pool after the transaction is submitted—either consume immediately upon submission (with appropriate error handling) or mark as pending and ensure it’s not reused on wait failures. [ Invalidated by documentation search ]
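
For the ticker leak flagged in the third item of this group, the conventional shape is a deferred Stop inside the goroutine. A minimal sketch, with the replenish callback standing in for the real nonce-replenishment work:

```go
package sketch

import (
	"context"
	"time"
)

// replenishLoop stops the ticker on every exit path, including context
// cancellation, via defer.
func replenishLoop(ctx context.Context, replenish func()) {
	ticker := time.NewTicker(10 * time.Second)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			replenish()
		}
	}
}
```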

pkg/blockchain/funds_admin.go — 0 comments posted, 5 evaluated, 5 filtered
  • line 71: NewFundsAdmin does not validate required contract addresses (FeeToken, UnderlyingFeeToken, settlement GatewayAddress, and app GatewayAddress). If any are empty or malformed, common.HexToAddress will return the zero address without error, causing all subsequent contract bindings to point to 0x000.... Interacting with these will fail at runtime (reverts or no contract). Add explicit validation that each address string is non-empty and a valid hex address before binding. [ Low confidence ]
  • line 87: NewFundsAdmin unconditionally binds a MockUnderlyingFeeToken contract at the address provided by UnderlyingFeeToken. If the deployed contract at that address is a standard ERC-20 and not the mock, subsequent calls to mockUnderlyingFeeToken methods will fail at runtime due to ABI mismatch. This should be guarded by an environment/config flag or validation to ensure the address actually points to a mock contract when mock-only functionality is used. [ Low confidence ]
  • line 108: All three constructors NewFundsAdmin, NewNodeRegistryCaller, and NewContractRatesFetcher call logger.Named(...) without checking if logger is non-nil. If a nil logger is passed, these calls cause a nil pointer dereference panic. Add a nil check and either default to a no-op logger or return an error. [ Low confidence ]
  • line 120: NewFundsAdmin sets spender to common.HexToAddress(opts.ContractOptions.SettlementChain.GatewayAddress) without validating the address. If this is empty or invalid, spender becomes the zero address, which will cause incorrect approval flows (approving the zero address) and subsequent operations to fail or have unintended effects. Validate GatewayAddress and fail fast if invalid. [ Invalidated by documentation search ]
  • line 292: Withdraw checks and logs the application-chain balance for from := f.settlement.Signer.FromAddress() but the transaction is executed with f.app.Signer via ExecuteTransaction. This means the preflight balance check and the "current balance" log are performed on the settlement signer’s address rather than the app signer’s address that will actually pay opts.Value and gas. As a result, Withdraw may proceed even when the app signer lacks sufficient funds, causing the transaction to fail later at submission/mining, and the balance logs are misleading (they show the recipient’s or settlement signer’s app-chain balance, not the sender/app signer’s balance). The balance check should use f.app.Signer.FromAddress() for sender-side sufficiency, while still passing the intended recipient to Withdraw. [ Out of scope ]
pkg/blockchain/node_registry_admin.go — 0 comments posted, 3 evaluated, 3 filtered
  • line 52: NewNodeRegistryAdmin does not validate contractsOptions.SettlementChain.NodeRegistryAddress before binding. If the address string is empty or malformed, common.HexToAddress will produce the zero address, and the admin will be constructed against 0x000.... Subsequent on-chain calls would target the wrong contract silently. Add explicit validation (non-empty, correct hex format, non-zero address) and return a clear error if invalid. [ Low confidence ]
  • line 78: crypto.FromECDSAPub(signingKeyPub) is called without a nil check on signingKeyPub. If signingKeyPub is nil, this call can panic at runtime. There is no guard in AddNode to prevent a nil public key from being used. The upstream migrator path validates and returns a non-nil key, but AddNode is a public method and can be called with nil from other paths. Add an explicit nil check and return a clear error rather than panicking. [ Low confidence ]
  • line 116: AddNode does not verify that a NodeAdded event was actually parsed and instead returns whatever value is left in nodeID. If no matching event is found (e.g., the transaction succeeds but the parser rejects all logs), nodeID remains 0 and is returned to the caller. This silently reports success with an invalid/ambiguous node ID (0), diverging from the stricter behavior used in SubmitPayerReport, which errors when no expected event is found. The method should track whether the event was found and return an error if not, to preserve contract parity and avoid propagating an invalid node ID. [ Invalidated by documentation search ]
pkg/blockchain/node_registry_caller.go — 0 comments posted, 1 evaluated, 1 filtered
  • line 36: NewNodeRegistryCaller does not validate contractsOptions.SettlementChain.NodeRegistryAddress. If it’s empty or invalid, common.HexToAddress yields the zero address and binds the contract to 0x000..., causing runtime failures when calling methods. Add explicit address validation before binding. [ Low confidence ]
pkg/blockchain/parameter_registry_admin.go — 0 comments posted, 3 evaluated, 3 filtered
  • line 97: NewParameterAdminWithRegistry does not check for nil dependencies (client, signer, or reg). If any are nil, later methods on ParameterAdmin that use these fields will panic. The constructor should validate inputs and return an error when required dependencies are nil. [ Low confidence ]
  • line 122: NewSettlementParameterAdmin does not validate contractsOptions.SettlementChain.ParameterRegistryAddress before using it to construct the settlement registry adapter. If this address is empty or invalid, common.HexToAddress will resolve it to the zero address and settleReg.NewSettlementChainParameterRegistry may succeed, silently producing an admin bound to a zero-address contract. This misconfiguration will lead to later runtime failures when attempting reads/writes, but the constructor returns successfully, hiding the problem. The function should explicitly validate that ParameterRegistryAddress is a non-empty, valid hex address and fail fast with a clear error if not. [ Low confidence ]
  • line 146: NewAppChainParameterAdmin does not validate that contractsOptions.AppChain.ParameterRegistryAddress is a non-empty valid address string. If empty or invalid, the adapter will bind to the zero address via HexToAddress, leading to subtle runtime failures. It should reject invalid addresses with an error ahead of binding. [ Low confidence ]
pkg/blockchain/rate_registry_admin.go — 0 comments posted, 1 evaluated, 1 filtered
  • line 40: NewRatesAdmin does not validate contractsOptions.SettlementChain.RateRegistryAddress before binding. An empty or malformed input will result in binding to the zero address, leading to silent misdirected on-chain interactions. Validate the address and return an error if it is zero or invalid. [ Low confidence ]
pkg/blockchain/settlement_chain_admin.go — 0 comments posted, 1 evaluated, 1 filtered
  • line 78: NewSettlementChainAdmin binds multiple contracts (SettlementChainGateway, PayerRegistry, DistributionManager, PayerReportManager, NodeRegistry, RateRegistry) using addresses from contractsOptions.SettlementChain.*Address without validating any of them. If any address is empty or malformed, the binding will silently target the zero address. This can cause subtle misbehavior across many admin operations. Explicitly validate each address (non-empty, valid hex, non-zero) and surface a clear error. [ Low confidence ]
pkg/fees/contract_rates.go — 0 comments posted, 4 evaluated, 4 filtered
  • line 63: NewContractRatesFetcher does not validate options.SettlementChain.RateRegistryAddress. If the address is empty or invalid, common.HexToAddress returns the zero address, resulting in a contract bound to 0x000... and runtime call failures. Validate the address string and fail fast if invalid. [ Low confidence ]
  • line 120: currentIndex is not validated to be non-negative before use. If c.currentIndex is negative, the loop condition will hold and the code will call GetRates with a negative fromIndex. This likely causes contract errors and the loop may not converge depending on pageSize. The code should enforce currentIndex >= 0 and, ideally, initialize it to zero. [ Low confidence ]
  • line 120: c.currentIndex is dereferenced without any prior nil check (c.currentIndex.Cmp(availableRatesCount)), which will panic if c.currentIndex was not initialized. There is no visible guard or constructor guarantee in the reviewed code ensuring c.currentIndex is non-nil before refreshData runs. [ Low confidence ]
  • line 121: pageSize is not validated to be positive. If c.pageSize == 0, then toFetch becomes 0 and c.currentIndex = c.currentIndex.Add(c.currentIndex, toFetch) does not advance the index, causing an infinite loop that repeatedly calls GetRates with a count of 0. If c.pageSize < 0, toFetch is negative; the loop will decrement currentIndex, and the condition c.currentIndex.Cmp(availableRatesCount) < 0 will likely remain true, resulting in a non-terminating loop and potentially invalid contract calls with a negative count. [ Low confidence ]
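
The three loop-termination items above reduce to two preconditions that could be checked once, up front. An illustrative guard; the parameter names are assumptions, not the fetcher's actual fields:

```go
package sketch

import (
	"errors"
	"math/big"
)

// validatePaging ensures the refresh loop can terminate: the index must be
// initialized and non-negative, and the page size positive.
func validatePaging(currentIndex *big.Int, pageSize int64) error {
	if currentIndex == nil || currentIndex.Sign() < 0 {
		return errors.New("currentIndex must be initialized and non-negative")
	}
	if pageSize <= 0 {
		return errors.New("pageSize must be positive")
	}
	return nil
}
```
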
pkg/indexer/app_chain/app_chain.go — 0 comments posted, 1 evaluated, 1 filtered
  • line 70: On the error path when blockchain.NewRPCClient fails, the code calls rpcClient.Close() even though rpcClient will be nil when an error is returned from ethclient.DialContext. This results in a guaranteed nil pointer dereference. Specifically, in NewAppChain, after checking if err != nil, the code executes cancel() and then rpcClient.Close() before returning. The Close() call should be guarded with a nil check or omitted entirely when the client couldn't be constructed.

Affected code:

  • rpcClient, err := blockchain.NewRPCClient(ctxwc, cfg.RPCURL)
  • if err != nil { cancel(); rpcClient.Close(); return nil, fmt.Errorf("%v: %w", ErrInitializingAppChain, err) } [ Out of scope ]
pkg/indexer/app_chain/contracts/group_message_storer.go — 0 comments posted, 2 evaluated, 2 filtered
  • line 63: StoreLog does not validate that the client envelope’s target topic identifier matches the on-chain GroupId from the MessageSent event. The only check performed is clientEnvelope.TopicMatchesPayload(), which verifies the topic kind against the payload type, but never confirms the identifier equality between clientEnvelope.TargetTopic() and topic.NewTopic(topic.TopicKindGroupMessagesV1, msgSent.GroupId[:]). This can lead to storing an envelope under the wrong group topic when the payload kind is correct but the identifier is different, violating the contract’s intent and causing mis-association in the database. The issue is present at the construction of topicStruct and the subsequent check that only considers kind, not identifier. [ Out of scope ]
  • line 107: Potential overflow/semantic corruption when converting msgSent.SequenceId to int64 for OriginatorSequenceID. If msgSent.SequenceId is a uint64 (typical for Solidity uint64/uint256-backed values), casting to int64 with int64(msgSent.SequenceId) will produce a negative value or wrap for values greater than math.MaxInt64. This can lead to storing corrupted sequence IDs in the database and breaking ordering/invariants. There is no guard ensuring msgSent.SequenceId <= math.MaxInt64 before casting. [ Out of scope ]
pkg/indexer/app_chain/contracts/identity_update_storer.go — 0 comments posted, 4 evaluated, 4 filtered
  • line 114: Unsafe signed-to-unsigned cast in StoreLog when comparing latestSequenceID to msgSent.SequenceId. The code does if uint64(latestSequenceID) >= msgSent.SequenceId { ... } on line 114. If latestSequenceID is negative (e.g., due to an empty table defaulting to -1 or any DB anomaly), converting it to uint64 produces a very large positive value, causing the condition to be true and the function to skip insertion erroneously. This yields incorrect behavior: identity updates may be skipped even though they are not present. The code must guard latestSequenceID >= 0 or otherwise handle absence explicitly before converting to uint64. [ Low confidence ]
  • line 119: Potential overflow when casting msgSent.SequenceId (type uint64) to int64 in several places: logging via utils.SequenceIDField(int64(msgSent.SequenceId)) (lines 119 and 171) and inserting via InsertGatewayEnvelope (OriginatorSequenceID: int64(msgSent.SequenceId) on line 234-235). If msgSent.SequenceId exceeds math.MaxInt64, the cast wraps to a negative int64, corrupting stored sequence values and misleading logs. This can break ordering guarantees and duplicate detection. [ Invalidated by documentation search ]
  • line 149: Possible nil pointer dereference when accessing associationState.StateDiff.NewMembers and RemovedMembers without checking that associationState.StateDiff is non-nil. If GetAssociationStateFromEnvelopes returns a result with a nil StateDiff, the loops on lines 149 and 177 will panic. Add a nil check on associationState and associationState.StateDiff before iterating. [ Low confidence ]
  • line 154: Possible nil pointer dereference when accessing newMember.Kind and removedMember.Kind without verifying that newMember / removedMember are non-nil. Elements of the NewMembers/RemovedMembers slices could be nil. The type assertion address, ok := newMember.Kind.(*associations.MemberIdentifier_EthereumAddress) will panic if newMember is nil because it dereferences newMember.Kind. Add checks if newMember == nil { continue } and similarly for removedMember. Lines 154 and 182 are affected. [ Invalidated by documentation search ]
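
For the signed-to-unsigned comparison flagged at line 114, a small helper that treats a negative latest sequence ID as "nothing stored yet" avoids the wraparound. A hedged sketch:

```go
package sketch

// alreadyStored guards the comparison above: a negative latestSequenceID
// (e.g., no rows yet) must not be reinterpreted as a huge uint64, which would
// wrongly skip the insert.
func alreadyStored(latestSequenceID int64, eventSequenceID uint64) bool {
	if latestSequenceID < 0 {
		return false // nothing stored yet, never skip
	}
	return uint64(latestSequenceID) >= eventSequenceID
}
```
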
pkg/indexer/indexer.go — 0 comments posted, 1 evaluated, 1 filtered
  • line 110: NewIndexer initializes appChain and then proceeds to initialize settlementChain. If settlementchain.NewSettlementChain fails (or any subsequent step after appChain creation), the function calls cancel() on the indexer context and returns the error, but it does not perform any explicit teardown/cleanup of the already-created appChain instance. Depending on AppChain’s resource allocations (RPC/WebSocket clients, goroutines) and whether they respond to the parent context cancel, this can lead to resource leaks or partially initialized components lingering. To ensure correct cleanup and at-most-once semantics, add explicit teardown on failure (e.g., a Close method on AppChain invoked on error) or guarantee that context cancellation deterministically releases all resources acquired during NewAppChain construction. [ Invalidated by documentation search ]
pkg/indexer/rpc_streamer/rpc_log_streamer.go — 0 comments posted, 3 evaluated, 1 filtered
  • line 202: Subscription resources are never explicitly released, leading to potential leaks. When sub.Err() returns a non-nil error, the code rebuilds the subscription and assigns the new sub without calling sub.Unsubscribe() on the old subscription (lines 202-206, lines 295-299). On exit paths (lines 193-196, lines 304-308) the function returns without unsubscribing the active subscription. Ethereum Subscription typically requires Unsubscribe() to stop the underlying RPC notifications and free resources. Not calling Unsubscribe() when replacing or exiting can leak network resources and goroutines. [ Low confidence ]
pkg/indexer/settlement_chain/contracts/payer_registry_storer.go — 0 comments posted, 3 evaluated, 3 filtered
  • line 109: Potential integer overflow and silent truncation when converting parsedEvent.Amount (*big.Int) to int64 via parsedEvent.Amount.Int64() and then multiplying by 1e6 in currency.FromMicrodollars. On-chain event amounts are uint256 and can exceed the range of int64. big.Int.Int64() will truncate on overflow, producing incorrect negative or wrapped values. Subsequently, microdollars * 1e6 can overflow int64 again, yielding an incorrect PicoDollar. This can cause incorrect ledger entries (wrong deposit amounts) without error or visibility. Consider validating the range before conversion, using SetString/big.Int arithmetic, or using a wider integer type to avoid loss. [ Out of scope ]
  • line 151: Potential integer overflow and silent truncation when converting parsedEvent.Amount (*big.Int) to int64 via parsedEvent.Amount.Int64() and then multiplying by 1e6 in currency.FromMicrodollars. For withdrawal requests, large uint256 amounts can exceed int64, causing truncation in Int64() and further overflow in the multiplication. This results in incorrect ledger withdrawal entries (wrong amounts applied) without any error. Use range checks and big.Int arithmetic or a wider type. [ Out of scope ]
  • line 193: Potential integer overflow and silent truncation when converting parsedEvent.Amount (*big.Int) to int64 via parsedEvent.Amount.Int64() and then multiplying by 1e6 in currency.FromMicrodollars. For usage settlements, large uint256 values can exceed int64, causing truncation upon Int64() and overflow in the * 1e6 multiplication, leading to incorrect settlement amounts recorded in the ledger. Validate ranges and use big.Int arithmetic or a wider integer representation. [ Out of scope ]
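
All three items above hinge on big.Int.Int64() truncating silently. One illustrative guard, relying on big.Int's IsInt64 check (the helper name is made up):

```go
package sketch

import (
	"fmt"
	"math/big"
)

// amountToInt64 verifies the on-chain amount fits in int64 before converting,
// instead of letting Int64() wrap silently.
func amountToInt64(amount *big.Int) (int64, error) {
	if amount == nil {
		return 0, fmt.Errorf("amount is nil")
	}
	if !amount.IsInt64() {
		return 0, fmt.Errorf("amount %s does not fit in int64", amount.String())
	}
	return amount.Int64(), nil
}
```
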
pkg/indexer/settlement_chain/contracts/payer_report_manager_storer.go — 0 comments posted, 3 evaluated, 3 filtered
  • line 106: Converting parsedEvent.StartSequenceId and parsedEvent.EndSequenceId from uint64 to int64 for logging via utils.StartSequenceIDField(int64(parsedEvent.StartSequenceId)) and utils.LastSequenceIDField(int64(parsedEvent.EndSequenceId)) can overflow when the values exceed math.MaxInt64. In Go, converting a uint64 larger than MaxInt64 to int64 yields a negative number due to wraparound, resulting in incorrect (potentially negative) sequence IDs in logs. This is an externally visible artifact being malformed/incorrect. Use zap.Uint64 or introduce utils helpers that accept uint64 to preserve correctness. [ Low confidence ]
  • line 115: Error classification relies on strings.Contains(err.Error(), ErrReportAlreadyExists) to detect the duplicate-report case. This is a brittle control-flow dependency on the error string. If upstream code changes the message or wraps the error, this branch may fail to recognize the condition and incorrectly return a retryable or non-retryable error, altering external behavior. This should use a sentinel error, type assertion, or errors.Is/errors.As to robustly detect the condition. [ Low confidence ]
  • line 118: Repeated conversion of parsedEvent.StartSequenceId and parsedEvent.EndSequenceId from uint64 to int64 occurs in the "already exists" debug path as well, e.g., utils.StartSequenceIDField(int64(parsedEvent.StartSequenceId)) and utils.LastSequenceIDField(int64(parsedEvent.EndSequenceId)). If these values exceed math.MaxInt64, the logged sequence IDs will be negative due to wraparound, misleading diagnostics in the duplicate-report path. [ Invalidated by documentation search ]
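
For the string-matching concern in the second item above, errors.Is with a sentinel is the robust alternative. A sketch under the assumption that the duplicate-report condition can be surfaced as a sentinel error; the sentinel below is hypothetical:

```go
package sketch

import "errors"

// errReportAlreadyExists is a hypothetical sentinel; the production code
// currently matches on the error string, which the issue above flags as brittle.
var errReportAlreadyExists = errors.New("payer report already exists")

// isDuplicateReport keeps working even when the error is wrapped with %w,
// unlike strings.Contains on err.Error().
func isDuplicateReport(err error) bool {
	return errors.Is(err, errReportAlreadyExists)
}
```
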
pkg/integration/builders/xmtp_gateway_container_builder.go — 0 comments posted, 2 evaluated, 2 filtered
  • line 103: Potential panic when writing to b.envVars if it is nil. In both conditional blocks, the code writes to the map (b.envVars["XMTPD_SETTLEMENT_CHAIN_WSS_URL"] = b.wsURL, b.envVars["XMTPD_APP_CHAIN_WSS_URL"] = b.wsURL, b.envVars["XMTPD_SETTLEMENT_CHAIN_RPC_URL"] = b.rpcURL, b.envVars["XMTPD_APP_CHAIN_RPC_URL"] = b.rpcURL) without ensuring b.envVars is non-nil. If a caller constructs XmtpdGatewayContainerBuilder without initializing envVars and provides a non-empty wsURL or rpcURL, this will cause a runtime panic due to nil map assignment. [ Low confidence ]
  • line 127: Potential invalid network configuration by always setting req.Networks to []string{b.networkName} without validating b.networkName. When b.networkAlias == "" and b.networkName == "", Networks will contain an empty string, which is likely invalid for testcontainers/Docker and can cause container creation to fail at runtime. The only guard (require.NotEmpty(t, b.networkName)) applies only when b.networkAlias is non-empty. [ Low confidence ]
pkg/integration/builders/xmtpd_container_builder.go — 0 comments posted, 3 evaluated, 3 filtered
  • line 93: Possible nil map write: b.envVars is a map[string]string but there is no guard or initialization in Build. On lines where the code sets environment variables (b.envVars["XMTPD_SETTLEMENT_CHAIN_WSS_URL"] = b.wsURL, b.envVars["XMTPD_APP_CHAIN_WSS_URL"] = b.wsURL, b.envVars["XMTPD_SETTLEMENT_CHAIN_RPC_URL"] = b.rpcURL, b.envVars["XMTPD_APP_CHAIN_RPC_URL"] = b.rpcURL), a zero-valued XmtpdContainerBuilder or one constructed without initializing envVars will cause a runtime panic (assignment to entry in nil map). You should either initialize envVars during construction or guard against nil before writing. [ Low confidence ]
  • line 117: Empty network name passed to Networks: The code always sets Networks: []string{b.networkName} even when b.networkName is empty (no guard). This can produce a malformed container request with a single empty-name network, likely causing container start failures. When b.networkAlias is set, there is a require.NotEmpty(t, b.networkName) guard, but when networkAlias is empty the guard does not run, leaving the empty network case unhandled. The fix is to conditionally set Networks only when b.networkName is non-empty, or pass nil/omit the field otherwise. [ Low confidence ]
  • line 131: Potential panic when registering cleanup with a nil Container: testcontainers.GenericContainer can return a non-nil err and a nil xmtpdContainer. The code calls testcontainers.CleanupContainer(t, xmtpdContainer) unconditionally. If CleanupContainer uses the container value to register termination (e.g., by invoking methods on it during t.Cleanup), passing nil may cause a panic later when the test ends. To be safe, guard the cleanup registration with if err == nil { testcontainers.CleanupContainer(t, xmtpdContainer) }. [ Invalidated by documentation search ]
pkg/interceptors/server/auth.go — 0 comments posted, 1 evaluated, 1 filtered
  • line 70: AuthInterceptor.logIncomingAddressIfAvailable performs a reverse DNS lookup (net.LookupAddr) on every incoming request when debug logging is enabled, without any timeout or context. This synchronous lookup can block the request path if DNS is slow or misconfigured, causing latency spikes or temporary stalls. It also ignores the request’s context for cancellation. A non-blocking or context-aware approach (e.g., lookup in a separate goroutine with timeout, caching, or disabled by default) is advisable. [ Low confidence ]
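
A context-bounded lookup addresses the blocking concern above while keeping the debug log best-effort. A hedged sketch using the standard library resolver:

```go
package sketch

import (
	"context"
	"net"
	"time"
)

// reverseLookup bounds the reverse DNS lookup with a short timeout derived
// from the request context, so a slow resolver cannot stall the interceptor.
func reverseLookup(ctx context.Context, ip string) []string {
	ctx, cancel := context.WithTimeout(ctx, 200*time.Millisecond)
	defer cancel()

	names, err := net.DefaultResolver.LookupAddr(ctx, ip)
	if err != nil {
		return nil // best-effort: debug logging should never block the request path
	}
	return names
}
```
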
pkg/metrics/metrics.go — 0 comments posted, 1 evaluated, 1 filtered
  • line 43: In NewMetricsServer, s.http is set to &proxyproto.Listener{Listener: httpListener} before checking err from net.Listen. If net.Listen fails, httpListener is nil and s.http wraps a nil listener. While the function immediately returns the error and the wrapped nil listener is not used, this is fragile ordering that risks future dereferences of a partially-initialized Server if the code changes. s.http should be assigned only after confirming err == nil to avoid temporarily storing an invalid listener. [ Code style ]
pkg/migrator/migrator.go — 0 comments posted, 3 evaluated, 3 filtered
  • line 275: Potential deadlock when batchSize is zero. maxInflight := int(m.batchSize) * 4 yields 0, sem := make(chan struct{}, maxInflight) creates a zero-capacity channel with no tokens prefilled, and later the reader tries to acquire a token via select { case <-ctx.Done(): ...; case <-sem: }. With ctx not cancelled, this select blocks indefinitely because no goroutine ever sends to sem (tokens are only returned by cleanupInflight after processing records, but none can be processed before acquiring a token). This stalls the pipeline. The code should validate batchSize > 0 at startup or handle the zero case by skipping the semaphore or seeding at least one token. [ Invalidated by documentation search ]
  • line 336: migrationWorker uses logger.Fatal on failure to read migration progress, which will terminate the entire process rather than just failing and retrying the reader loop. This is not in an initialization path and can be triggered by transient database errors, causing a full service crash and bypassing defers/cleanup in the goroutine. Specifically, after emitting a reader error metric, it calls logger.Fatal("failed to get migration progress", zap.Error(err)), which by default calls os.Exit(1). This violates the runtime termination guidance and resilience expectations for a long-running worker. The code should log an error and retry or back off (similar to the fetch error handling paths) instead of calling Fatal. [ Low confidence ]
  • line 420: Mismatch of IDs used for inflight tracking and cleanup can cause missing latency metrics and a memory leak in the inflight map. The reader stores inflight[id] = startE2ELatency keyed by record.GetID() (source record ID). Later, in the writer’s defer, it looks up and deletes using int64(env.OriginatorSequenceID()). There is no explicit guarantee that env.OriginatorSequenceID() equals record.GetID() for all tables. If they differ, then:
  • E2E latency is not emitted because exists is false (startTime, exists := inflight[int64(env.OriginatorSequenceID())]).
  • The delete(inflight, id) in cleanupInflight does not remove the original entry (since it uses the sequence ID), leaving stale entries and growing the map.
  • The semaphore release sends a token regardless, which prevents hard deadlock but can silently drop tokens via the non-blocking default when over-releasing.
    To fix, consistently use the same key for inflight tracking and cleanup (e.g., always use the source record ID, carry it through the envelope, or include it alongside the envelope when passing to writer). [ Low confidence ]
pkg/migrator/writer.go — 0 comments posted, 4 evaluated, 4 filtered
  • line 32: insertOriginatorEnvelopeDatabase calls env.UnsignedOriginatorEnvelope.PayerEnvelope.RecoverSigner() (line 32). RecoverSigner() accesses p.proto (p.proto.PayerSignature) without a nil check. If PayerEnvelope.proto is nil, this will panic. There is no visible guard in this function ensuring that the PayerEnvelope’s proto is non-nil, making a runtime nil pointer dereference possible. [ Low confidence ]
  • line 73: Silent data loss in IncrementUnsettledUsage call: queries.IncrementUnsettledUsageParams includes a MessageCount int32 field, but insertOriginatorEnvelopeDatabase does not set it (lines 73–80). This results in MessageCount being sent as the zero value (0), which can corrupt usage accounting and violates the data-conversion boundary rule (all fields must be preserved or explicitly ignored with justification). [ Low confidence ]
  • line 77: Potential nil pointer dereference when computing spend in insertOriginatorEnvelopeDatabase: env.UnsignedOriginatorEnvelope.BaseFee() and .CongestionFee() (lines 77–78) dereference UnsignedOriginatorEnvelope.proto without a nil check (the methods explicitly skip nil checks). If UnsignedOriginatorEnvelope.proto is nil, this will panic at runtime. There is no visible guard in this function ensuring this invariant. [ Invalidated by documentation search ]
  • line 113: insertOriginatorEnvelopeBlockchain dereferences env without any nil guard. At function entry, env.TargetTopic().Identifier() (line 113) and env.OriginatorSequenceID() (line 114) will panic if env is nil. Given the upstream writer pipeline forwards the envelope returned by the transformer without a concrete nil check, an (nil, nil) return from the transformer would flow into this function and cause a runtime nil pointer dereference. [ Low confidence ]
pkg/payerreport/utils.go — 0 comments posted, 1 evaluated, 1 filtered
  • line 43: AddReportLogFields casts uint64 fields StartSequenceID and EndSequenceID to int64 for logging: utils.StartSequenceIDField(int64(report.StartSequenceID)) and utils.LastSequenceIDField(int64(report.EndSequenceID)). If these values exceed math.MaxInt64, the cast will overflow and produce incorrect negative values in logs. This leads to misleading telemetry and can hinder debugging or monitoring. Use unsigned fields or string formatting to avoid overflow. [ Invalidated by documentation search ]
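
If the field helpers were widened to take uint64, the cast (and the overflow) would disappear. An illustrative pair of unsigned helpers; the real helpers in pkg/utils/log.go currently take int64, so this is a sketch of the suggested change, not the existing API:

```go
package sketch

import "go.uber.org/zap"

// Unsigned variants of the sequence-ID field helpers, keeping the PR's
// snake_case key convention.
func startSequenceIDField(id uint64) zap.Field {
	return zap.Uint64("start_sequence_id", id)
}

func lastSequenceIDField(id uint64) zap.Field {
	return zap.Uint64("last_sequence_id", id)
}
```
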
pkg/payerreport/workers/generator.go — 0 comments posted, 2 evaluated, 2 filtered
  • line 147: Concurrent race in maybeGenerateReport can produce duplicate reports. The method performs a non-atomic check-then-act sequence: it first determines the last submitted report via getLastSubmittedReport, then separately fetches existing pending/approved reports starting at that end sequence ID, and finally generates a new report if none are found. Between these steps, another component (e.g., a concurrent submitter) can transition a report or create a new one, causing this worker to miss it and generate a duplicate. This is explicitly noted in the TODO comments but not mitigated. There is no locking or transactional guarantee across these reads and the subsequent write (generateReport). This violates the no-double-application invariant and can lead to duplicated or overlapping reports. [ Out of scope ]
  • line 157: Potential integer overflow and misleading logs when casting uint64 sequence IDs to int64 for logging. In maybeGenerateReport, existingReportEndSequenceID is a uint64, but is cast to int64 in utils.LastSequenceIDField(int64(existingReportEndSequenceID)). If the sequence ID exceeds math.MaxInt64, the value will wrap and become negative, corrupting the log output and potentially misleading operators and downstream log parsers expecting non-negative sequence IDs. [ Invalidated by documentation search ]
pkg/payerreport/workers/submitter.go — 0 comments posted, 2 evaluated, 2 filtered
  • line 133: In the already-submitted error path, the code calls SetReportSubmitted(ctx, report.ID, reportIndex) using reportIndex that was returned alongside a non-nil error from submitReport. There is no guarantee that submitReport returns a valid/meaningful index when it returns an error (e.g., on IsErrPayerReportAlreadySubmitted). This can cause the system to persist an incorrect SubmittedReportIndex (likely zero or stale) for the report. The code should derive or fetch the canonical index for the already-submitted report (e.g., by querying the chain or DB) or avoid writing a potentially invalid index when the submission failed. [ Low confidence ]
  • line 202: Inconsistent context usage: submitReport is invoked with w.ctx instead of the operation-scoped ctx argument passed to SubmitReports. This can cause the blockchain submission to ignore per-iteration cancellations/timeouts, potentially delaying shutdown or causing work to continue after the caller canceled the operation. For consistency and proper cancellation semantics, submitReport (and thus reportsAdmin.SubmitPayerReport) should use the ctx parameter of SubmitReports. [ Low confidence ]
pkg/registry/node_registry_contract.go — 0 comments posted, 3 evaluated, 3 filtered
  • line 193: SmartContractRegistry.processNewNodes calls s.newNodesNotifier.trigger(nodes) before updating the internal s.nodes map. This can violate ordering invariants for observers: subscribers may be notified about new nodes while s.nodes still lacks those entries, leading to inconsistent reads or racey behavior in consumers that assume the registry reflects notifications immediately. The fix is to update s.nodes under lock first, and only then trigger notifications, or document and enforce that notifications occur before the registry change with an explicit contract. [ Low confidence ]
  • line 193: SmartContractRegistry.processNewNodes does not check whether s.newNodesNotifier is nil before calling trigger. Since s.newNodesNotifier is a pointer field with no visible guard or constructor guarantee here, a nil pointer dereference is reachable and would panic. Add a nil check before calling trigger or ensure it is always initialized before use. [ Low confidence ]
  • line 193: SingleNotificationNotifier.trigger sends on subscriber channels while holding a read lock (RLock). If any send blocks (e.g., an unbuffered channel without a ready receiver or a buffered channel with a full buffer), processNewNodes will block on trigger and hold the read lock, potentially starving writers and causing deadlocks if other paths attempt to modify channels under a write lock. The fix is to copy the subscriber set under lock, then release the lock before performing blocking sends, or use non-blocking send/select with a drop policy. [ Low confidence ]
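
The third item above (blocking sends under RLock) is commonly solved by snapshotting the subscriber set under the lock and notifying outside it, optionally with a non-blocking send. A generic sketch of that shape; this is not the registry's actual notifier type:

```go
package sketch

import "sync"

// notifier copies subscribers under the read lock, then sends without holding
// any lock, so a slow subscriber cannot stall writers.
type notifier[T any] struct {
	mu          sync.RWMutex
	subscribers []chan T
}

func (n *notifier[T]) trigger(payload T) {
	n.mu.RLock()
	subs := make([]chan T, len(n.subscribers))
	copy(subs, n.subscribers)
	n.mu.RUnlock()

	for _, ch := range subs {
		select {
		case ch <- payload:
		default: // drop rather than block; matches the non-blocking policy suggested above
		}
	}
}
```
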
pkg/server/server.go — 0 comments posted, 1 evaluated, 1 filtered
  • line 370: startAPIServer starts a background goroutine inside metadata.NewCursorUpdater before any of the subsequent service registrations, and if any later step returns an error (e.g., constructing replicationService, metadataService, or creating the JWT verifier), the function exits without stopping the cursor updater. This leaks the goroutine and its resources because s.cursorUpdater is not torn down on error paths. Specifically, s.cursorUpdater = metadata.NewCursorUpdater(s.ctx, cfg.Logger, cfg.DB) starts the updater, and then errors at lines following (e.g., returning err at service construction or verifier creation) lead to an early return without calling RemoveSubscriber/Stop() or canceling its context. All exit paths after introducing this effect must provide a single paired cleanup; this code does not. [ Invalidated by documentation search ]
pkg/sync/envelope_sink.go — 0 comments posted, 2 evaluated, 2 filtered
  • line 136: Concurrent queries on the same transaction can lead to runtime errors, deadlocks, or undefined behavior. In InsertGatewayEnvelopeAndIncrementUnsettledUsage, two goroutines are spawned to call txQueries.IncrementUnsettledUsage and txQueries.IncrementOriginatorCongestion concurrently within a single transaction context. database/sql transactions (*sql.Tx) are not safe for concurrent use; many drivers require operations on a transaction/connection to be strictly sequential. Running these queries in parallel on the same txQueries (which likely wraps the *sql.Tx) risks runtime failures such as database/sql: driver does not support concurrent transactions or driver-specific errors, and can violate transactional invariants. Fix by executing these operations sequentially within the transaction, or by performing them in separate transactions if parallelism is desired (but that changes atomicity). [ Low confidence ]
  • line 141: Unsafe downcasts from unsigned to signed integers can corrupt sequence IDs and node IDs. In storeEnvelope, env.OriginatorSequenceID() returns a uint64, but it is cast to int64 for InsertGatewayEnvelopeParams.OriginatorSequenceID and for logging via utils.SequenceIDField. If the sequence ID exceeds math.MaxInt64, the cast will overflow and wrap to a negative int64, resulting in incorrect database values and misleading logs. Similarly, env.OriginatorNodeID() returns a uint32 but is cast to int32 (originatorID := int32(env.OriginatorNodeID())), which can overflow if the node ID exceeds math.MaxInt32. These casts appear at lines 141-142 and 163-164 (sequence ID) and line 128 (originator ID). The same downcasts occur in storeReservedEnvelope when calling the payer report store functions that insert envelopes with OriginatorSequenceID as int64 and OriginatorNodeID as int32. Fix by preserving the unsigned width end-to-end (use uint64/uint32 where appropriate), or add explicit bounds checks and return an error when values are out of range to avoid silent corruption. [ Invalidated by documentation search ]
pkg/sync/originator_stream.go — 0 comments posted, 5 evaluated, 5 filtered
  • line 58: originatorStream.listen launches a goroutine that continually calls s.stream.Recv() and then sends results on recvChan. There is no mechanism to stop this goroutine when the main listen loop exits (e.g., on context cancellation or error). Since recvChan is unbuffered and never closed, if listen returns, the goroutine can block forever attempting to send on recvChan, causing a goroutine leak and potential deadlock. Fix by: (a) selecting on s.ctx.Done() in the goroutine and returning; (b) closing recvChan on exit; or (c) making recvChan buffered and ensuring the goroutine terminates when the stream is closed/canceled. [ Low confidence ]
  • line 99: originatorStream.listen writes to s.writeQueue using a blocking send (s.writeQueue <- parsedEnv) without any context or select. If the consumer of writeQueue is slow or absent (e.g., downstream failure), the listener will block indefinitely, preventing stream processing and potential cleanup on cancellation. Prefer a select with s.ctx.Done() and a bounded buffer or explicit error/timeout handling. [ Invalidated by documentation search ]
  • line 159: In originatorStream.validateEnvelope, metrics are emitted before full validation succeeds. EmitSyncLastSeenOriginatorSequenceID and EmitSyncOriginatorReceivedMessagesCount are called before checking for out-of-order envelopes and before recovering the payer signature. If validation later fails (e.g., bad signature), the code already recorded the envelope as received and updated the last-seen sequence metric, leading to inconsistent metrics and misleading monitoring. Move metrics emission after all validations succeed, or record separate metrics for rejected envelopes without incrementing received counts. [ Low confidence ]
  • line 174: In originatorStream.validateEnvelope, the sequence ID is logged via utils.SequenceIDField(int64(env.OriginatorSequenceID())). Casting a uint64 to int64 can overflow for values greater than math.MaxInt64, producing an incorrect negative number and misleading logs. Use zap.Uint64 or change utils.SequenceIDField to accept unsigned values, or validate the range before casting. [ Low confidence ]
  • line 189: originatorStream.validateEnvelope calls env.UnsignedOriginatorEnvelope.PayerEnvelope.RecoverSigner() without verifying that PayerEnvelope.proto is non-nil. The RecoverSigner method dereferences p.proto.PayerSignature; if p.proto is nil (e.g., due to a malformed UnsignedOriginatorEnvelope), this will panic. Add a nil check on PayerEnvelope.proto before calling RecoverSigner, or ensure NewUnsignedOriginatorEnvelopeFromBytes guarantees PayerEnvelope.proto is non-nil for successfully parsed envelopes. [ Low confidence ]
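
The goroutine-leak concern in the first item of this group usually calls for the receive loop to own and close its channel and to watch the stream context. A hedged, generic sketch of that shape:

```go
package sketch

import "context"

// recvLoop owns and closes the output channel, and stops when either the
// stream errors or the context is done, so the main loop's exit cannot strand
// a blocked sender.
func recvLoop[T any](ctx context.Context, recv func() (T, error)) <-chan T {
	out := make(chan T)
	go func() {
		defer close(out)
		for {
			msg, err := recv()
			if err != nil {
				return
			}
			select {
			case out <- msg:
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}
```
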
pkg/sync/sync_worker.go — 0 comments posted, 5 evaluated, 5 filtered
  • line 106: subscribeToRegistry reads from the channel returned by s.nodeRegistry.OnNewNodes() without checking for a nil channel. If an implementation of NodeRegistry returns a nil channel, case newNodes, ok := <-newNodesCh will permanently block (except for the ctx.Done() case) and the goroutine will hang, preventing the worker from reacting to new nodes and causing a hidden liveness issue. [ Low confidence ]
  • line 137: s.subscriptions is written to without any visible initialization. At s.subscriptions[nodeid] = struct{}{} a nil map will cause a runtime panic: assignment to entry in nil map. There is no guard in this function ensuring s.subscriptions is non-nil before use. [ Low confidence ]
  • line 167: subscribeToNode contains a select with a default branch that immediately creates a new notifier context and calls subscribeToNodeRegistration in a tight loop with no blocking or backoff. This results in a busy-spin that can consume CPU, repeatedly registering cancel functions via changeListener.RegisterCancelFunction(notifierCancel) and repeatedly initiating subscriptions. It also rapidly creates many context.WithCancel contexts that may not be cancelled promptly, causing resource leaks. This violates at-most-once callback and re-entrancy safeguards (multiple cancel functions registered), and can lead to multiple overlapping registrations/subscriptions and flood the downstream sink with work. The problematic code is the default branch executing continuously when ctx is not done. [ Low confidence ]
  • line 221: In subscribeToNodeRegistration’s operation cleanup, stream.stream.CloseSend() is called in a defer even though SubscribeEnvelopes already calls CloseSend() immediately after sending the request. This results in a second CloseSend() invocation on the same client stream. While the error return is ignored, double-closing the send side is a double-application of an effect and can lead to unexpected errors or warnings from the transport. The cleanup should avoid re-closing the send side when it has already been closed, or guard the call to only run when it’s still open. [ Invalidated by documentation search ]
  • line 371: setupStream builds a single cursor (c) from the last matching row in result, even when originatorNodeIDs contains multiple IDs (e.g., during migration). The loop assigns c repeatedly for each matching originator, resulting in the final value overwriting previous ones. This collapses per-originator state into a single cursor, which can cause incorrect downstream validation/ordering for envelopes from different originator IDs because only the last originator’s sequence/timestamp is retained. The correct behavior would be to maintain per-originator cursor state (e.g., a map keyed by originator ID) or otherwise ensure that validation/ordering is performed with the right per-originator reference. [ Low confidence ]
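
For the cursor-collapse issue in the last item, keeping one cursor per originator is the straightforward fix. An illustrative sketch with assumed row fields; the struct below is not the real query result type:

```go
package sketch

// cursorRow stands in for one row of the setupStream query result.
type cursorRow struct {
	OriginatorNodeID int32
	MaxSequenceID    int64
}

// buildCursors keeps one cursor per originator instead of letting the last
// matching row win.
func buildCursors(rows []cursorRow) map[int32]cursorRow {
	cursors := make(map[int32]cursorRow, len(rows))
	for _, row := range rows {
		cursors[row.OriginatorNodeID] = row
	}
	return cursors
}
```
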
pkg/utils/log.go — 0 comments posted, 1 evaluated, 1 filtered
  • line 53: PayerReportContractLoggerName (line 53) and PayerReportMainLoggerName (line 82) are both set to the same string value "payer-report". If these constants are used to derive child loggers (e.g., via logger.Named(...)) for conceptually distinct components, the identical category names will collide. This can cause logs from different subsystems to be interleaved under the same logger, break per-component filtering/routing, skew metrics by aggregating unrelated events, and potentially override per-logger fields/handlers depending on the logging framework. Use distinct values (e.g., "payer-report-contract" and "payer-report" or "payer-report-main") to preserve uniqueness and avoid unintended cross-component coupling. [ Low confidence ]

@fbac fbac requested a review from a team as a code owner October 21, 2025 11:16

graphite-app bot commented Oct 21, 2025

How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • Queue - adds this PR to the back of the merge queue
  • Hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

logger.Info("successfully minted mock underlying fee token",
zap.String("to", to.Hex()),
zap.String("amountRaw", amount.String()),
zap.String("amount_raw", amount.String()),
Collaborator

I guess neither of these names is great. The user passes in --amount so we could use that. Or, if we wanted to remind them it's the raw amount, maybe (raw)?

Collaborator Author

Or raw_amount? I'm not sure, to be honest. It's the CLI, so I'm happy as long as it's readable.

Collaborator Author

Ended up with amount for the sake of simplicity.

return logger, &cfg, nil
}

/* Fields */
Collaborator

I like this

Borja Aranda added 2 commits October 21, 2025 13:57
@fbac fbac requested a review from mkysel October 21, 2025 12:01
@mkysel mkysel (Collaborator) left a comment

QOL goes to the moon

@fbac fbac merged commit db4878a into main Oct 21, 2025
11 checks passed
@fbac fbac deleted the 10-20-logs branch October 21, 2025 17:57