Skip to content

gomod: fix btcd mempool leak fix and GetBlock heap escape#8401

Merged
guggero merged 4 commits into
lightningnetwork:masterfrom
yyforyongyu:update-btcd-ver
Jan 23, 2024
Merged

gomod: fix btcd mempool leak fix and GetBlock heap escape#8401
guggero merged 4 commits into
lightningnetwork:masterfrom
yyforyongyu:update-btcd-ver

Conversation

@yyforyongyu
Copy link
Copy Markdown
Member

@yyforyongyu yyforyongyu commented Jan 19, 2024

This PR updates the btcd version to,

We can see the effects from the following profiles,

  • Current profile without the fixes.
  • Profile after the mempool fix, notice the SendCmd is gone.
  • Profile after the GetBlock heap escape fix, notice the BtcDecode is gone.

Fix #8356

Summary by CodeRabbit

  • Documentation
    • Updated release notes for version 0.17.3 with corrections to hyperlinks and text adjustments.
    • Published release notes for version 0.17.4 detailing bug fixes, new features, enhancements, and various improvements. Acknowledged Yong Yu as a contributor.

@yyforyongyu yyforyongyu self-assigned this Jan 19, 2024
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jan 19, 2024

Important

Auto Review Skipped

Auto reviews are limited to the following labels: llm-review. Please add one of these labels to enable auto reviews.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository.

To trigger a single review, invoke the @coderabbitai review command.

Walkthrough

The updates in the release notes reflect a critical fix for a memory leak issue when running lnd with bitcoind in polling mode. The changes also include various improvements and new features to enhance performance and code health, following the BOLT specifications and other technical updates. Contributor Yong Yu is acknowledged for their involvement in this release.

Changes

File Path Change Summary
.../release-notes-0.17.3.md Adjusted hyperlink and text related to btcwallet memory optimization.
.../release-notes-0.17.4.md Documented fixes for memory leaks, new features, enhancements, and updates including BOLT spec compliance and contribution acknowledgement.

Assessment against linked issues

Objective Addressed Explanation
Identify and resolve memory leak with lnd and bitcoind.rpcpolling (#8356)
Investigate and address connection cleanup failure (#8356)
Ensure stable memory consumption with bitcoind.rpcpolling (#8356)
Consider impact of issue mentioned by @alexbosworth related to memory leak (#8356)

Poem

To the codebase vast and wide, 🌐
A memory leak we did espy. 🕵️‍♂️
With fixes in and bugs now shy,
Our rabbit's code hops on, spry! 🐇💨

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit-tests for this file.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit tests for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository from git and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit tests.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

@yyforyongyu yyforyongyu added this to the v0.17.4 milestone Jan 19, 2024
@yyforyongyu
Copy link
Copy Markdown
Member Author

@coderabbitai review

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 3

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between ba4021c and 3f27750691a81b02abc8968778f7087c8c26da85.
Files ignored due to path filters (2)
  • go.mod is excluded by: !**/*.mod
  • go.sum is excluded by: !**/*.sum
Files selected for processing (2)
  • docs/release-notes/release-notes-0.17.3.md (1 hunks)
  • docs/release-notes/release-notes-0.17.4.md (1 hunks)
Files skipped from review due to trivial changes (1)
  • docs/release-notes/release-notes-0.17.3.md
Additional comments: 1
docs/release-notes/release-notes-0.17.4.md (1)
  • 37-38: Ensure that Yong Yu's contributions are accurately reflected in the release notes.

Comment thread docs/release-notes/release-notes-0.17.4.md Outdated
Comment thread docs/release-notes/release-notes-0.17.4.md Outdated
Comment thread docs/release-notes/release-notes-0.17.4.md Outdated
@yyforyongyu yyforyongyu changed the title gomod: update btcd version to include the mempool leak fix gomod: fix btcd mempool leak fix and GetBlock heap escape Jan 21, 2024
@yyforyongyu yyforyongyu force-pushed the update-btcd-ver branch 2 times, most recently from 3c258be to 54bdd4b Compare January 21, 2024 12:18
Comment thread blockcache/blockcache.go Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple of questions (just for my understanding):

  • are you saying that the pointer to the getBlockImpl gets stored on the heap? if so - im not sure i understand why that would be since we use it once here to produce a result & ie, we are not returning a pointer to the function right?
  • for making a copy of the block - I guess to me it only makes sense to make a copy of something before returning if the original object has references to some other info that is not needed once we return & so we want to get rid of those references. But here - are we removing any references? from the commit body it seems like you are saying that the block would have a reference to getBlockImpl but im not sure if that is true?
  • also - we would want the actual block to be on the heap right?
  • finally: did you notice this escape by eye or is there a tool you used to be able to identify this? asking cause my eye is definitely not familiar with this sort of gotcha

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To answer,

  • nope not the getBlockImpl but the data it uses, hence the block.
  • def, we can copy it at difference places - either in btcd/rpcclient/chain.go's GetBlock, or here. As that being said, I see a lot more issues with heap escaping in btcd lib, kinda don't wanna touch it atm.
  • with the current design the block has to be on the heap because we store a pointer reference to that block in our cache. I have another local PR which removes the pointer references - don't think it helps tho, still digging as as you've said above, the block has to be cached.
  • this is a pure result from looking at the profile above, after fixing SendCmd, I was curious about why there was a heap allocation on GetBlock, then found this blockcache. Then got even more curious, as we only allow 20MB of cache size, but it used more. Then I took a closer look at the stacks,
(pprof) list GetBlock
Total: 115.90MB
ROUTINE ======================== github.com/btcsuite/btcd/rpcclient.(*Client).GetBlock in /Users/yong/Projects/lightning_labs/btcsuite/btcd/rpcclient/chain.go
         0    39.14MB (flat, cum) 33.77% of Total
         .          .    159:func (c *Client) GetBlock(blockHash *chainhash.Hash) (*wire.MsgBlock, error) {
         .    39.14MB    160:	return c.GetBlockAsync(blockHash).Receive()
         .          .    161:}
         .          .    162:
         .          .    163:// FutureGetBlockVerboseResult is a future promise to deliver the result of a
         .          .    164:// GetBlockVerboseAsync RPC invocation (or an applicable error).
         .          .    165:type FutureGetBlockVerboseResult struct {
ROUTINE ======================== github.com/btcsuite/btcd/rpcclient.FutureGetBlockResult.Receive in /Users/yong/Projects/lightning_labs/btcsuite/btcd/rpcclient/chain.go
         0    39.14MB (flat, cum) 33.77% of Total
         .          .    108:func (r FutureGetBlockResult) Receive() (*wire.MsgBlock, error) {
         .          .    109:	res, err := r.client.waitForGetBlockRes(r.Response, r.hash, false, false)
         .          .    110:	if err != nil {
         .          .    111:		return nil, err
         .          .    112:	}
         .          .    113:
         .          .    114:	// Unmarshal result as a string.
         .          .    115:	var blockHex string
         .          .    116:	err = json.Unmarshal(res, &blockHex)
         .          .    117:	if err != nil {
         .          .    118:		return nil, err
         .          .    119:	}
         .          .    120:
         .          .    121:	// Decode the serialized block hex to raw bytes.
         .          .    122:	serializedBlock, err := hex.DecodeString(blockHex)
         .          .    123:	if err != nil {
         .          .    124:		return nil, err
         .          .    125:	}
         .          .    126:
         .          .    127:	// Deserialize the block and return it.
         .          .    128:	var msgBlock wire.MsgBlock
         .    39.14MB    129:	err = msgBlock.Deserialize(bytes.NewReader(serializedBlock))
         .          .    130:	if err != nil {
         .          .    131:		return nil, err
         .          .    132:	}
         .          .    133:	return &msgBlock, nil
         .          .    134:}
ROUTINE ======================== github.com/btcsuite/btcwallet/chain.(*BitcoindClient).GetBlock in /Users/yong/Projects/lightning_labs/btcsuite/btcwallet/chain/bitcoind_client.go
         0    39.14MB (flat, cum) 33.77% of Total
         .          .    143:func (c *BitcoindClient) GetBlock(hash *chainhash.Hash) (*wire.MsgBlock, error) {
         .    39.14MB    144:	return c.chainConn.GetBlock(hash)
         .          .    145:}
         .          .    146:
         .          .    147:// GetBlockVerbose returns a verbose block from the hash.
         .          .    148:func (c *BitcoindClient) GetBlockVerbose(
         .          .    149:	hash *chainhash.Hash) (*btcjson.GetBlockVerboseResult, error) {
ROUTINE ======================== github.com/btcsuite/btcwallet/chain.(*BitcoindConn).GetBlock in /Users/yong/Projects/lightning_labs/btcsuite/btcwallet/chain/bitcoind_conn.go
         0    39.14MB (flat, cum) 33.77% of Total
         .          .    458:func (c *BitcoindConn) GetBlock(hash *chainhash.Hash) (*wire.MsgBlock, error) {
         .    39.14MB    459:	block, err := c.client.GetBlock(hash)
         .          .    460:	// Got the block from the backend successfully, return it.
         .          .    461:	if err == nil {
         .          .    462:		return block, nil
         .          .    463:	}
         .          .    464:
ROUTINE ======================== github.com/lightningnetwork/lnd/blockcache.(*BlockCache).GetBlock in /Users/yong/Projects/lightning_labs/lnd/blockcache/blockcache.go
         0    39.14MB (flat, cum) 33.77% of Total
         .          .     34:func (bc *BlockCache) GetBlock(hash *chainhash.Hash,
         .          .     35:	getBlockImpl func(hash *chainhash.Hash) (*wire.MsgBlock,
         .          .     36:		error)) (*wire.MsgBlock, error) {
         .          .     37:
         .          .     38:	bc.HashMutex.Lock(lntypes.Hash(*hash))
         .          .     39:	defer bc.HashMutex.Unlock(lntypes.Hash(*hash))
         .          .     40:
         .          .     41:	// Create an inv vector for getting the block.
         .          .     42:	inv := wire.NewInvVect(wire.InvTypeWitnessBlock, hash)
         .          .     43:
         .          .     44:	// Check if the block corresponding to the given hash is already
         .          .     45:	// stored in the blockCache and return it if it is.
         .          .     46:	cacheBlock, err := bc.Cache.Get(*inv)
         .          .     47:	if err != nil && err != cache.ErrElementNotFound {
         .          .     48:		return nil, err
         .          .     49:	}
         .          .     50:	if cacheBlock != nil {
         .          .     51:		return cacheBlock.MsgBlock(), nil
         .          .     52:	}
         .          .     53:
         .          .     54:	// Fetch the block from the chain backends.
         .    39.14MB     55:	block, err := getBlockImpl(hash)
         .          .     56:	if err != nil {
         .          .     57:		return nil, err
         .          .     58:	}
         .          .     59:
         .          .     60:	// Add the new block to blockCache. If the Cache is at its maximum
ROUTINE ======================== github.com/lightningnetwork/lnd/chainntnfs/bitcoindnotify.(*BitcoindNotifier).GetBlock in /Users/yong/Projects/lightning_labs/lnd/chainntnfs/bitcoindnotify/bitcoind.go
         0    23.10MB (flat, cum) 19.93% of Total
         .          .   1042:func (b *BitcoindNotifier) GetBlock(hash *chainhash.Hash) (*wire.MsgBlock,
         .          .   1043:	error) {
         .          .   1044:
         .    23.10MB   1045:	return b.blockCache.GetBlock(hash, b.chainConn.GetBlock)
         .          .   1046:}
         .          .   1047:
         .          .   1048:// SubscribeMempoolSpent allows the caller to register a subscription to watch
         .          .   1049:// for a spend of an outpoint in the mempool.The event will be dispatched once
         .          .   1050:// the outpoint is spent in the mempool.
ROUTINE ======================== github.com/lightningnetwork/lnd/lnwallet/btcwallet.(*BtcWallet).GetBlock in /Users/yong/Projects/lightning_labs/lnd/lnwallet/btcwallet/blockchain.go
         0    16.04MB (flat, cum) 13.84% of Total
         .          .    136:func (b *BtcWallet) GetBlock(blockHash *chainhash.Hash) (*wire.MsgBlock, error) {
         .          .    137:	_, ok := b.chain.(*chain.NeutrinoClient)
         .          .    138:	if !ok {
         .    16.04MB    139:		return b.blockCache.GetBlock(blockHash, b.chain.GetBlock)
         .          .    140:	}
         .          .    141:
         .          .    142:	// For the neutrino implementation of lnwallet.BlockChainIO the neutrino
         .          .    143:	// GetBlock function can be called directly since it uses the same block
         .          .    144:	// cache. However, it does not lock the block cache mutex for the given

And also with the command go build -gcflags='-m=3' . |& grep escapes to figure out which line escapes. That being said, I don't think we need to concern about this when writing code, optimization should always come later, as it can be an endless, but it's very difficult for me to resist the fun🤓

That being said, more analysis shows the Copy doesn't actually save us more memory in this case, it does give a better pattern to recognize the memory allocation, as now it's,

(pprof) list GetBlock
Total: 110.41MB
ROUTINE ======================== github.com/lightningnetwork/lnd/blockcache.(*BlockCache).GetBlock in /Users/yong/Projects/lightning_labs/lnd/blockcache/blockcache.go
         0    33.68MB (flat, cum) 30.51% of Total
         .          .     34:func (bc *BlockCache) GetBlock(hash *chainhash.Hash,
         .          .     35:	getBlockImpl func(hash *chainhash.Hash) (*wire.MsgBlock,
         .          .     36:		error)) (*wire.MsgBlock, error) {
         .          .     37:
         .          .     38:	bc.HashMutex.Lock(lntypes.Hash(*hash))
         .          .     39:	defer bc.HashMutex.Unlock(lntypes.Hash(*hash))
         .          .     40:
         .          .     41:	// Create an inv vector for getting the block.
         .          .     42:	inv := wire.NewInvVect(wire.InvTypeWitnessBlock, hash)
         .          .     43:
         .          .     44:	// Check if the block corresponding to the given hash is already
         .          .     45:	// stored in the blockCache and return it if it is.
         .          .     46:	cacheBlock, err := bc.Cache.Get(*inv)
         .          .     47:	if err != nil && err != cache.ErrElementNotFound {
         .          .     48:		return nil, err
         .          .     49:	}
         .          .     50:	if cacheBlock != nil {
         .          .     51:		return cacheBlock.MsgBlock(), nil
         .          .     52:	}
         .          .     53:
         .          .     54:	// Fetch the block from the chain backends.
         .          .     55:	msgBlock, err := getBlockImpl(hash)
         .          .     56:	if err != nil {
         .          .     57:		return nil, err
         .          .     58:	}
         .          .     59:
         .          .     60:	// Make a copy of the block so it won't escape to the heap.
         .    33.68MB     61:	msgBlockCopy := msgBlock.Copy()
         .          .     62:	block := btcutil.NewBlock(msgBlockCopy)
         .          .     63:
         .          .     64:	// Add the new block to blockCache. If the Cache is at its maximum
         .          .     65:	// capacity then the LFU item will be evicted in favour of this new
         .          .     66:	// block.
ROUTINE ======================== github.com/lightningnetwork/lnd/chainntnfs/bitcoindnotify.(*BitcoindNotifier).GetBlock in /Users/yong/Projects/lightning_labs/lnd/chainntnfs/bitcoindnotify/bitcoind.go
         0    33.68MB (flat, cum) 30.51% of Total
         .          .   1042:func (b *BitcoindNotifier) GetBlock(hash *chainhash.Hash) (*wire.MsgBlock,
         .          .   1043:	error) {
         .          .   1044:
         .    33.68MB   1045:	return b.blockCache.GetBlock(hash, b.chainConn.GetBlock)
         .          .   1046:}
         .          .   1047:
         .          .   1048:// SubscribeMempoolSpent allows the caller to register a subscription to watch
         .          .   1049:// for a spend of an outpoint in the mempool.The event will be dispatched once
         .          .   1050:// the outpoint is spent in the mempool.

Copy link
Copy Markdown
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ⛷️

Just needs a rebase now for the release notes.

Comment thread go.mod Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove the replace now!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

This commit fixes a heap escape found in `GetBlock`. This happens
because the `msgBlock` is a pointer returned from `getBlockImpl`, and
the whole `getBlockImpl` escapes to the heap because it's referenced in
two places,
- in the `Cache`, it's used as a reference type to create the new block.
- this pointer is also returned and now needs to stay on the heap.

The fix is simple, we now make a copy of the block and use the copy
instead, freeing `getBlockImpl` from the heap.
@yyforyongyu
Copy link
Copy Markdown
Member Author

Since the latest btcd merge also includes the updated chain.Interface, we need to satisfy it in NoChainBackend, updated in the last commit.

Copy link
Copy Markdown
Collaborator

@ellemouton ellemouton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@guggero guggero merged commit e31d159 into lightningnetwork:master Jan 23, 2024
@yyforyongyu yyforyongyu deleted the update-btcd-ver branch January 23, 2024 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[bug]: Memory leak when running bitcoind backend in polling mode

4 participants