Fix payer report logger chain #1269
Merged
Standardize payer report logging and route workers and store to `utils.PayerReportMainLoggerName` and `utils.PayerReportStoreLoggerName` in server.go and store.go

This PR standardizes logging keys and message casing across payer report and blockchain admin components, and wires payer report workers and store to named sub-loggers.

- Route payer report workers to `utils.PayerReportMainLoggerName` and the store to `utils.PayerReportStoreLoggerName` in `server.NewReplicationServer` and `payerreport.NewStore` (server.go, store.go)
- Unify admin log field keys and "no update needed" messaging in settlement/app chain admins (settlement_chain_admin.go, app_chain_admin.go)
- Convert multiple logs to lowercase phrasing and standardized field names across indexer, gateway examples, stress tools, and watchers
- Add a structured zap error for semver parsing in replication main (main.go) and a deferred debug duration log in `blockchain.WaitForTransaction` (client.go)

📍 Where to Start
Start with logger wiring in `server.NewReplicationServer` in server.go, then review the store logger change in `payerreport.NewStore` in store.go.

📊 Macroscope summarized 0948726. 20 files reviewed, 48 issues evaluated, 44 issues filtered, 0 comments posted
🗂️ Filtered Issues
cmd/xmtpd-cli/commands/node_registry.go — 0 comments posted, 2 evaluated, 2 filtered
- `context.Background()` for all downstream blockchain operations (`setupNodeRegistryAdmin`, `setupNodeRegistryCaller`, and the subsequent calls). In a CLI command, this prevents cancellation (e.g., on SIGINT) and can lead to indefinite hangs if the RPC calls block or the network is slow/unresponsive. The Cobra command provides a context via `RunE` that should be used to allow cancellation propagation. Consider threading a cancellable context from the command (`cmd.Context()` or a context with a timeout) into the handler and subsequent calls. [ Low confidence ]
- `"set new max canonical size"` unconditionally after a successful call to `admin.SetMaxCanonical`, but `SetMaxCanonical` may return `nil` in the "no update needed" case (it internally checks `err.IsNoChange()` and returns `nil`). In that case, no change is applied, yet the handler will still log that a new value was set, producing misleading output. Consider either having `SetMaxCanonical` return an explicit changed/no-change indicator, or suppressing the "set new" log unless the update is actually applied (e.g., by relying on an event or return value). [ Low confidence ]

pkg/api/message/subscribe_worker.go — 0 comments posted, 3 evaluated, 3 filtered
- `query.Topics` and `query.OriginatorNodeIds` lengths. If `query.Topics` is non-empty but all entries are invalid (and originators empty), `l.isGlobal` is left as `false`, preventing the listener from being treated as global even though it has no valid keys. This mismatches expected semantics and contributes to the unregistered listener condition. [ Out of scope ]
- `newListener` can return a listener that is neither global nor keyed, resulting in a listener that is never registered anywhere and whose channel is never closed. This happens when `query.Topics` contains one or more entries but all are invalid per `topic.ParseTopic`, and `query.OriginatorNodeIds` is empty. In this case `l.isGlobal` remains `false` (because it is decided based on raw input lengths before validation), `l.topics` stays empty (all invalid topics were skipped), and `l.originators` stays empty. The caller (`listen`) then returns the created channel without registering the listener, resulting in a leaked, never-closed channel and no delivery. The root cause is the global determination at the top of `newListener` based on unvalidated inputs and lack of fallback when validation yields no keys. [ Out of scope ]
- `newListener` allows both `l.topics` and `l.originators` to be populated when the query supplies both filters. However, downstream `listen` only registers the listener on one dimension, preferring topics over originators (`else if len(l.topics) > 0 { ... } else if len(l.originators) > 0 { ... }`). This silently ignores the originators filter when topics are present, creating a contract asymmetry between the input query and the actual registration and potentially unexpected routing semantics. [ Out of scope ]

pkg/authn/claims.go — 0 comments posted, 1 evaluated, 1 filtered
- `logger` in `NewClaimValidator`. The function calls `logger.Debug(...)` (lines 29-32) without checking whether `logger` is nil. Since `NewRegistryVerifier` passes a `*zap.Logger` provided by callers and there is no guard ensuring non-nil, a nil `logger` would cause a runtime panic at the `Debug` call. Add a nil-check or ensure non-nil logger is enforced by callers before invoking `NewClaimValidator`. [ Out of scope ]

pkg/blockchain/app_chain_admin.go — 0 comments posted, 16 evaluated, 16 filtered
- `a.client` to `ExecuteTransaction` can lead to a nil pointer dereference inside `ExecuteTransaction` when it calls `client.BalanceAt`. This method does not validate that `a.client` is non-nil before passing it. [ Low confidence ]
- `a.groupMessageBroadcaster` when invoking `UpdateMaxPayloadSize` and `ParseMaxPayloadSizeUpdated`. The method does not validate that `a.groupMessageBroadcaster` is non-nil before use. If `appChainAdmin` is constructed with a nil `groupMessageBroadcaster`, calling `a.groupMessageBroadcaster.UpdateMaxPayloadSize(opts)` or `a.groupMessageBroadcaster.ParseMaxPayloadSizeUpdated(*log)` will panic. [ Low confidence ]
- `log` in the event parser closures: `ParseMaxPayloadSizeUpdated(*log)` / `ParseMinPayloadSizeUpdated(*log)` dereference the `*types.Log` pointer without nil check. If any entry in `receipt.Logs` is nil, this will panic. While go-ethereum typically provides non-nil log pointers, the code does not defensively guard against nil. [ Code style ]
- `a.logger` when calling `a.logger.Error` or `a.logger.Info`. The method does not validate `a.logger` before use; a nil logger will cause a panic. [ Low confidence ]
- `a.client` to `ExecuteTransaction` can cause a nil pointer dereference inside `ExecuteTransaction`. No local validation is present. [ Low confidence ]
- `a.groupMessageBroadcaster` in `UpdateGroupMessageMinPayloadSize` when invoking `UpdateMinPayloadSize` and `ParseMinPayloadSizeUpdated`. No nil check is performed before use. [ Low confidence ]
- `log` in the event parser closure: `ParseMinPayloadSizeUpdated(*log)` without nil check. If `receipt.Logs` contains a nil entry, this will panic. While go-ethereum typically provides non-nil log pointers, the code does not defensively guard against nil. [ Code style ]
- `a.logger` in `UpdateGroupMessageMinPayloadSize` when calling `a.logger.Error` or `a.logger.Info`. No guard exists. [ Low confidence ]
- `a.client` to `ExecuteTransaction` can cause a nil pointer dereference inside `ExecuteTransaction`. No guard exists here. [ Low confidence ]
- `a.identityUpdateBroadcaster` when invoking `UpdateMaxPayloadSize` and `ParseMaxPayloadSizeUpdated`. No nil check is performed. [ Low confidence ]
- `log` in the event parser closure: `ParseMaxPayloadSizeUpdated(*log)` without nil check in `UpdateIdentityUpdateMaxPayloadSize`. If a nil entry exists in `receipt.Logs`, this will panic. [ Code style ]
- `a.logger` when logging in `UpdateIdentityUpdateMaxPayloadSize`. No nil guard exists. [ Low confidence ]
- `a.client` to `ExecuteTransaction` can cause a nil pointer dereference inside `ExecuteTransaction`. No guard in this method. [ Low confidence ]
- `a.identityUpdateBroadcaster` when invoking `UpdateMinPayloadSize` and `ParseMinPayloadSizeUpdated`. No nil guard is present. [ Low confidence ]
- `log` in the event parser closure: `ParseMinPayloadSizeUpdated(*log)` without nil check in `UpdateIdentityUpdateMinPayloadSize`. If a nil entry exists in `receipt.Logs`, this will panic. [ Code style ]
- `a.logger` when logging within `UpdateIdentityUpdateMinPayloadSize`. No nil guard exists. [ Low confidence ]

pkg/blockchain/client.go — 0 comments posted, 1 evaluated, 1 filtered
- `WaitForTransaction`, any error returned by `client.TransactionReceipt` that is not `ethereum.NotFound` is collapsed into `ErrTxFailed` (`return nil, ErrTxFailed`). This misclassifies transport, provider, or context-related errors (e.g., RPC connectivity issues, rate limits, or `context.DeadlineExceeded`) as transaction failures. Upstream callers (e.g., `ExecuteTransaction`) treat `ErrTxFailed` specially by attempting to trace revert reasons, which can be incorrect and wasteful when the transaction did not actually fail but the call itself did. It also discards the original error detail, hampering diagnosis. [ Low confidence ]

pkg/blockchain/settlement_chain_admin.go — 0 comments posted, 7 evaluated, 7 filtered
- `s.client` and `s.logger` are non-nil before passing them to `ExecuteTransaction`. Inside `ExecuteTransaction`, `client.BalanceAt` and `logger.Debug`/`Info` are invoked unconditionally. If `s.client` or `s.logger` are nil, this will cause a panic. The risk exists at each call site where `s.client` and `s.logger` are passed: `UpdateDistributionManagerProtocolFeesRecipient` (line 291), `UpdatePayerRegistryMinimumDeposit` (line 330), `UpdatePayerRegistryWithdrawLockPeriod` (line 368), `UpdatePayerReportManagerProtocolFeeRate` (line 406), and `UpdateNodeRegistryAdmin` (line 442). [ Out of scope ]
- `UpdateDistributionManagerProtocolFeesRecipient` does not validate that `s.distributionManager` is non-nil before dereferencing it in both the transaction function (`s.distributionManager.UpdateProtocolFeesRecipient`) and the event parser (`s.distributionManager.ParseProtocolFeesRecipientUpdated`). If `s.distributionManager` is nil, these calls will panic at runtime. [ Out of scope ]
- `UpdateDistributionManagerProtocolFeesRecipient`, the handling of `ErrNoChange` is inconsistent with the other update methods. When `ExecuteTransaction` returns a `BlockchainError` that `IsNoChange()`, the method logs the "no update needed" message but still returns the error. This causes callers (e.g., CLI handlers) to treat a no-op as a failure, unlike the other methods which return `nil` on no-change. This inconsistency can produce meaningfully incorrect behavior and user messaging parity issues across admin operations. [ Out of scope ]
- `UpdatePayerRegistryMinimumDeposit` does not validate that `s.payerRegistry` is non-nil before dereferencing it in `s.payerRegistry.UpdateMinimumDeposit` and `s.payerRegistry.ParseMinimumDepositUpdated`. If `s.payerRegistry` is nil, these calls will panic. [ Out of scope ]
- `UpdatePayerRegistryWithdrawLockPeriod` does not validate that `s.payerRegistry` is non-nil before dereferencing it in `s.payerRegistry.UpdateWithdrawLockPeriod` and `s.payerRegistry.ParseWithdrawLockPeriodUpdated`. If `s.payerRegistry` is nil, these calls will panic. [ Out of scope ]
- `UpdatePayerReportManagerProtocolFeeRate` does not validate that `s.payerReportManager` is non-nil before dereferencing it in `s.payerReportManager.UpdateProtocolFeeRate` and `s.payerReportManager.ParseProtocolFeeRateUpdated`. If `s.payerReportManager` is nil, these calls will panic. [ Out of scope ]
- `UpdateNodeRegistryAdmin` does not validate that `s.nodeRegistry` is non-nil before dereferencing it in `s.nodeRegistry.UpdateAdmin` and `s.nodeRegistry.ParseAdminUpdated`. If `s.nodeRegistry` is nil, these calls will panic. [ Out of scope ]

pkg/gateway/examples/jwt/main.go — 0 comments posted, 1 evaluated, 1 filtered
- `jwtIdentityFn` is invoked in `main` with `publicKey` defined as `[]byte`, and the identity function’s `Keyfunc` returns that `[]byte` to the JWT parser while requiring an ECDSA signing method. For ECDSA, `github.com/golang-jwt/jwt/v5` expects a key type compatible with `*ecdsa.PublicKey`, not a raw `[]byte`. Returning a `[]byte` will cause `jwt.ParseWithClaims` to fail verification for all tokens, making authentication always fail at runtime. This manifests from the use at `WithIdentityFn(jwtIdentityFn(publicKey))` with a `[]byte` public key. [ Out of scope ]

pkg/indexer/common/log_handler.go — 0 comments posted, 1 evaluated, 1 filtered
- `IndexLogs`, when `event.Removed` is true (indicating a reorged/removed log), the code invokes `contract.HandleLog(ctx, event)` but then continues to the normal storage path and may attempt to store the removed log via `contract.StoreLog`. This conflates reorg handling and normal storage, risking incorrect persistence of removed logs or double-application of effects. A `continue` after handling removed events is likely required to avoid storing reorged logs. [ Low confidence ]

pkg/indexer/settlement_chain/contracts/payer_registry_storer.go — 0 comments posted, 3 evaluated, 3 filtered
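For the `IndexLogs` finding in pkg/indexer/common/log_handler.go, the suggested control flow can be sketched as follows. The `event` type and `indexLogs` function are hypothetical simplifications; the two `append` calls stand in for `contract.HandleLog` and `contract.StoreLog`, and the finding itself is marked low confidence, so this shows the proposed shape rather than a confirmed fix.

```go
package main

import "fmt"

// event is a hypothetical, trimmed-down stand-in for a chain log entry.
type event struct {
	ID      int
	Removed bool
}

// indexLogs routes removed (reorged) events to the reorg handler only, and
// every other event to the storage path only.
func indexLogs(events []event) (reorged, stored []int) {
	for _, ev := range events {
		if ev.Removed {
			reorged = append(reorged, ev.ID) // stand-in for contract.HandleLog
			continue                         // do not fall through to storage
		}
		stored = append(stored, ev.ID) // stand-in for contract.StoreLog
	}
	return reorged, stored
}

func main() {
	reorged, stored := indexLogs([]event{{ID: 1}, {ID: 2, Removed: true}, {ID: 3}})
	fmt.Println("reorged:", reorged, "stored:", stored)
}
```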
- `parsedEvent.Amount.Int64()` silently truncates/overflows when the on-chain amount exceeds the range of `int64`, and then `currency.FromMicrodollars` multiplies by `1e6`, which can further overflow `int64`. This leads to incorrect or negative `amount` values being recorded in the ledger or classified as invalid. On-chain event fields like `Amount` are typically `uint256` and can be greater than `math.MaxInt64`, making this path reachable. The conversion at `amount := currency.FromMicrodollars(currency.MicroDollar(parsedEvent.Amount.Int64()))` must validate range and convert using big integers to avoid data loss and overflow. [ Out of scope ]
- `parsedEvent.Amount.Int64()` in `handleWithdrawalRequested` can silently truncate/overflow for on-chain `uint256` amounts larger than `int64`, and the subsequent multiplication by `1e6` in `FromMicrodollars` can overflow `int64`. This results in incorrect ledger withdrawal amounts or misclassification as invalid events. Use safe big integer to picodollar conversion with range checks before narrowing to `int64`. [ Out of scope ]
- `parsedEvent.Amount.Int64()` in `handleUsageSettled` risks silent truncation/overflow when the on-chain `Amount` exceeds `int64`. The subsequent `FromMicrodollars` multiplication by `1e6` can also overflow `int64`. This can corrupt settlement amounts in the ledger or incorrectly flag events as invalid. Implement safe conversion from `*big.Int` microdollars to picodollars with bounds checking, and reject or cap values exceeding `int64` capacity. [ Out of scope ]

pkg/indexer/settlement_chain/contracts/payer_report_manager_storer.go — 0 comments posted, 1 evaluated, 1 filtered
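For the `Amount.Int64()` findings in payer_registry_storer.go, a bounds-checked conversion might look like the sketch below. `microToPico`, `mustBig`, and `errAmountOutOfRange` are hypothetical names; the factor of 1e6 comes from the findings, while rejecting nil and negative inputs is an assumption made for the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// errAmountOutOfRange is a hypothetical sentinel for amounts that do not fit.
var errAmountOutOfRange = errors.New("amount out of int64 picodollar range")

// picoPerMicro is the microdollar-to-picodollar factor named in the findings.
const picoPerMicro = 1_000_000

// microToPico multiplies in big.Int space and narrows to int64 only after
// checking that the product fits, so no silent truncation can occur.
func microToPico(micro *big.Int) (int64, error) {
	if micro == nil || micro.Sign() < 0 {
		return 0, errAmountOutOfRange
	}
	pico := new(big.Int).Mul(micro, big.NewInt(picoPerMicro))
	if !pico.IsInt64() {
		return 0, errAmountOutOfRange
	}
	return pico.Int64(), nil
}

// mustBig parses a base-10 literal; it exists only to build demo inputs.
func mustBig(s string) *big.Int {
	v, ok := new(big.Int).SetString(s, 10)
	if !ok {
		panic("invalid literal: " + s)
	}
	return v
}

func main() {
	small, err := microToPico(mustBig("5"))
	fmt.Println(small, err)

	// A uint256-sized amount is rejected instead of being truncated.
	_, err = microToPico(mustBig("340282366920938463463374607431768211456"))
	fmt.Println(err)
}
```

Whether to reject or cap out-of-range values is left open in the finding; this sketch rejects.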
- `StoreLog`, when `setReportSubmitted` returns an error, the code checks `if strings.Contains(err.Error(), ErrReportAlreadyExists)` to decide whether to suppress the error and continue. Using a substring on the formatted error message is unsafe: (1) an unrelated error message that happens to contain `"report already present in database"` will be incorrectly treated as non-fatal and suppressed; (2) if the underlying `NonRecoverableError` changes its formatting/wrapping, `.Error()` might not contain the exact substring and the already-exists condition would incorrectly bubble up as an error. This can cause incorrect behavior (skipping real errors or surfacing expected idempotent conditions as failures). Prefer a typed/sentinel error with `errors.Is` or a structured discriminator on the `RetryableError` (e.g., a method or error code), or match against a well-known exported error variable rather than string contents. [ Low confidence ]

pkg/metrics/docs/generator.go — 0 comments posted, 2 evaluated, 2 filtered
- `dumpToMarkdown` writes Markdown table rows directly using `fmt.Sprintf("|%s|%s| %s |%s|\n", m.Name, m.Type, desc, m.File)` at line 87 without escaping Markdown-sensitive characters in `desc` (and potentially `m.Name`/`m.File`). If `m.Description` contains the pipe character `|`, backticks, or newlines, the generated table becomes malformed or breaks rendering. The code should escape or sanitize table cell content (e.g., replace `|` with `\|`, backticks with escaped equivalents, and normalize newlines) to ensure valid Markdown output for all metric descriptions. [ Low confidence ]
- `dumpToMarkdown` writes to `MARKDOWN_OUTPUT` ("doc/metrics_catalog.md") without ensuring that the parent directory (`doc/`) exists. On a fresh checkout or in environments where the `doc` directory is absent, `os.WriteFile` at line 91 will fail with a "no such file or directory" error, causing `log.Fatalf` to terminate the program. The code should create the directory tree (e.g., using `os.MkdirAll(filepath.Dir(MARKDOWN_OUTPUT), 0o755)`) before attempting to write the file. [ Low confidence ]

pkg/payerreport/verifier.go — 0 comments posted, 1 evaluated, 1 filtered
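For the two `dumpToMarkdown` findings in pkg/metrics/docs/generator.go, a minimal sketch of both suggestions. `escapeCell` and `writeFileEnsuringDir` are hypothetical helpers; the escaping covers only pipes and newlines, leaving backticks untouched, which is a deliberate simplification.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// escapeCell makes s safe to place in a Markdown table cell by flattening
// newlines to spaces and escaping the column separator.
func escapeCell(s string) string {
	s = strings.ReplaceAll(s, "\r\n", " ")
	s = strings.ReplaceAll(s, "\n", " ")
	return strings.ReplaceAll(s, "|", `\|`)
}

// writeFileEnsuringDir creates the parent directory tree before writing, so
// the write does not fail on a checkout where the directory is absent.
func writeFileEnsuringDir(path string, data []byte) error {
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o644)
}

func main() {
	row := fmt.Sprintf("|%s|%s| %s |%s|\n",
		escapeCell("example_metric"), "counter",
		escapeCell("reads | writes\nper second"), "example.go")

	// Write under a throwaway directory; the real target path is not used here.
	dir, err := os.MkdirTemp("", "metrics-doc")
	if err != nil {
		fmt.Println("mkdtemp:", err)
		return
	}
	defer os.RemoveAll(dir)

	out := filepath.Join(dir, "doc", "metrics_catalog.md")
	if err := writeFileEnsuringDir(out, []byte(row)); err != nil {
		fmt.Println("write:", err)
		return
	}
	fmt.Print(row)
}
```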
- `IsValidReport` returns `true` for empty reports (where `StartSequenceID == EndSequenceID`) without validating that `PayersMerkleRoot` equals the canonical hash for an empty set. The comment explicitly states the merkle root "must always be the hash of an empty set", but the implementation skips any check and accepts potentially invalid merkle roots. This can lead to incorrect acceptance/attestation of malformed empty reports contrary to the stated contract. [ Out of scope ]

pkg/server/server.go — 0 comments posted, 1 evaluated, 1 filtered
- `metrics.NewMetricsServer` (lines 175-184) and indexer is started via `s.indx.StartIndexer()` (lines 233-239). If a later section (e.g., starting API server at lines 265-276 or building payer report workers at lines 339-356) fails, the function returns early without stopping the already started metrics server (`s.metrics`), indexer (`s.indx`), or migrator (`s.migratorServer`). This creates partial initialization with no cleanup in error paths, violating required invariants like single paired cleanup and no leaks. [ Previously rejected ]

pkg/stress/stress.go — 0 comments posted, 4 evaluated, 3 filtered
- `logger` will cause a runtime panic when calling methods like `logger.Info`/`logger.Error`. `StressIdentityUpdates` accepts `*zap.Logger` without a guard and uses it at multiple points (e.g., first at `logger.Info("starting transaction", ...)`). If a caller passes `nil`, the method call will dereference a nil pointer and panic. Add a non-nil check at function entry or default to `zap.NewNop()`. [ Low confidence ]
- `ctx` leads to a panic in `CastSendCommand.Run` when it does `context.WithTimeout(ctx, 30*time.Second)`. `StressIdentityUpdates` passes its `ctx` directly to `cs.Run(ctx)` without guarding against `nil`. If the caller provides a nil `context.Context`, this causes a panic in the callee. Add a non-nil context guard (e.g., default to `context.Background()` if `ctx == nil`) or document and enforce non-nil at entry. [ Low confidence ]
- `n == 0` causes a runtime panic and invalid metrics. Specifically, `avgDuration := totalDuration / time.Duration(n)` will panic with integer divide-by-zero, and `success_rate := float64(successCount)/float64(n)` will produce NaN (or panic depending on context) when `n` is zero. The function accepts any `int` for `n` with no guards, and the loop will not run when `n == 0`, making this path reachable. Add an explicit guard for `n <= 0` to return early with a defined outcome or compute metrics based on successes only. [ Previously rejected ]

pkg/utils/log.go — 0 comments posted, 1 evaluated, 1 filtered
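For the `n == 0` finding in pkg/stress/stress.go, the early-return guard can be sketched as below. `summary` and `summarize` are hypothetical names, and returning a zero summary with `ok == false` is just one possible "defined outcome". In Go the `time.Duration` division is integer division and panics on a zero divisor, while the floating-point `0/0` yields NaN rather than panicking.

```go
package main

import (
	"fmt"
	"time"
)

// summary is a hypothetical container for the two metrics named in the finding.
type summary struct {
	AvgDuration time.Duration
	SuccessRate float64
}

// summarize computes the average duration and success rate over n runs.
// For n <= 0 it returns a zero summary and ok == false instead of dividing.
func summarize(total time.Duration, successCount, n int) (s summary, ok bool) {
	if n <= 0 {
		return summary{}, false
	}
	return summary{
		AvgDuration: total / time.Duration(n),
		SuccessRate: float64(successCount) / float64(n),
	}, true
}

func main() {
	if s, ok := summarize(10*time.Second, 3, 4); ok {
		fmt.Println(s.AvgDuration, s.SuccessRate)
	}
	if _, ok := summarize(0, 0, 0); !ok {
		fmt.Println("no runs requested, metrics skipped")
	}
}
```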
- `EncoderConfig.NameKey` is set to `"caller"` (line 116), while `EncodeCaller` is set (line 117) but `CallerKey` is not configured. This causes the logger "name" (e.g., `xmtpd.api.publish-worker`) to be emitted under the field key `caller`, and actual caller information will not be included at all because `CallerKey` is empty. This mismatches intended semantics, can confuse log parsing/analytics that expect `caller` to be a file:line, and contradicts the documented guidance for readable name chains. Correct configuration should use distinct keys, e.g., set `NameKey` to `"logger"` (or similar) and set `CallerKey` to `"caller"` if caller info is desired. [ Out of scope ]