
Mempool improvements#3337

Merged
jcnelson merged 33 commits into develop from feat/mempool-improvements
Oct 19, 2022

Conversation

@obycode
Contributor

@obycode obycode commented Oct 12, 2022

This is the cleaned up version of simple-iterator discussed in #3326 and #3313.

To-do before checking in:

  • verify that a snapshot chainstate creates a DB with the right query plan
    • can do this as part of bringing up the mock miner from a snapshot, then check the query plan

These changes take lessons learned from experiments by @gregorycoppola
and myself, as well as feedback on various related PRs. They use the
following techniques to improve the speed and scalability of the mempool
walk:
* Uses rusqlite's `Rows` iterator to read one row at a time
* Caches the nonces in memory to avoid repeated lookups
* Restarts search from the highest fee-rate transactions after every
  executed transaction
  * Caches potential transactions in memory to retry on next pass

With this implementation, miners can reliably fill a block in <30s,
regardless of how large the mempool gets.
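A minimal sketch of the bounded nonce cache described above (the `NonceCache` name matches the PR's concept, but this simplified shape and its `get`/`bump` API are illustrative assumptions, not the PR's actual types):

```rust
use std::collections::HashMap;

/// Illustrative bounded nonce cache: once full, new addresses are simply
/// not cached, so their nonces are looked up on every visit (the simple
/// size cap this PR uses, rather than an eviction policy).
struct NonceCache {
    cache: HashMap<String, u64>,
    max_size: usize,
}

impl NonceCache {
    fn new(max_size: usize) -> Self {
        NonceCache { cache: HashMap::new(), max_size }
    }

    /// Return the cached nonce, or fall back to `load` (standing in for
    /// a MARF read) and cache the result if there is room.
    fn get<F: Fn(&str) -> u64>(&mut self, addr: &str, load: F) -> u64 {
        if let Some(&nonce) = self.cache.get(addr) {
            return nonce;
        }
        let nonce = load(addr);
        if self.cache.len() < self.max_size {
            self.cache.insert(addr.to_string(), nonce);
        }
        nonce
    }

    /// Advance the expected nonce after a transaction is included.
    fn bump(&mut self, addr: &str) {
        if let Some(nonce) = self.cache.get_mut(addr) {
            *nonce += 1;
        }
    }
}
```

A cached hit avoids the repeated per-candidate lookup the description above refers to; only addresses encountered after the cache fills pay the full lookup cost each time.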
@codecov

codecov bot commented Oct 12, 2022

Codecov Report

Merging #3337 (36690d0) into develop (5123caa) will decrease coverage by 0.07%.
The diff coverage is 21.53%.

@@             Coverage Diff             @@
##           develop    #3337      +/-   ##
===========================================
- Coverage    32.02%   31.95%   -0.08%     
===========================================
  Files          261      261              
  Lines       208687   209709    +1022     
===========================================
+ Hits         66830    67008     +178     
- Misses      141857   142701     +844     
Impacted Files Coverage Δ
src/core/tests/mod.rs 0.00% <0.00%> (ø)
src/chainstate/stacks/miner.rs 13.00% <1.21%> (-0.51%) ⬇️
testnet/stacks-node/src/config.rs 48.64% <40.00%> (-0.08%) ⬇️
src/core/mempool.rs 71.39% <88.64%> (+0.10%) ⬆️
src/chainstate/stacks/db/accounts.rs 28.58% <100.00%> (+0.23%) ⬆️
stacks-common/src/deps_common/bitcoin/util/hash.rs 36.86% <0.00%> (-2.19%) ⬇️
src/net/dns.rs 16.90% <0.00%> (-1.72%) ⬇️
src/burnchains/bitcoin/mod.rs 36.61% <0.00%> (-1.41%) ⬇️
...-node/src/burnchains/bitcoin_regtest_controller.rs 86.22% <0.00%> (-0.20%) ⬇️
... and 27 more


@obycode
Contributor Author

obycode commented Oct 12, 2022

Re-running my benchmarks today, this PR now compares favorably to #3326.

I was previously seeing better numbers for the other implementation, so it might be good for someone else to try benchmarking these and see if it is reproducible.

The better numbers from before were when the caching was applied on top of the changes in #3326.

| Version | Commit | Num Tx | %-Full | Fees Collected (uSTX) | Time Spent (ms) |
| --- | --- | --- | --- | --- | --- |
| master | 378fc1b | 105 | 4.97% | 25175605 | 33983.1 |
| #3326 | 6052447 | 1034 | 99.97% | 144993355 | 16822.8 |
| #3337 | 9a62096 | 952 | 99.96% | 155684975 | 14121.9 |

This was a typo introduced when refactoring, and it caused worse ordering
for the transactions.
@obycode
Contributor Author

obycode commented Oct 12, 2022

I made a mistake when refactoring this code and the retry list was being processed in the wrong order. The updated benchmarking numbers are here:

| Version | Commit | Num Tx | %-Full | Fees Collected (uSTX) | Time Spent (ms) |
| --- | --- | --- | --- | --- | --- |
| master | 378fc1b | 105 | 4.97% | 25175605 | 33983.1 |
| #3326 | 6052447 | 1034 | 99.97% | 144993355 | 16822.8 |
| #3337 (with bug) | 9a62096 | 952 | 99.96% | 155684975 | 14121.9 |
| #3337 | 9190093 | 972 | 99.95% | 131784486 | 15563.8 |

It's surprising that the fees collected get worse when this bug is fixed. That must indicate that the cost estimate (and thus the fee rate) is off.

@obycode
Contributor Author

obycode commented Oct 12, 2022

Ah, I found another bug in this refactoring. Will fix and rerun the experiment.

@obycode
Contributor Author

obycode commented Oct 12, 2022

Ok, after the bug fixes, the benchmark results look better:

| Version | Commit | Num Tx | %-Full | Fees Collected (uSTX) | Time Spent (ms) |
| --- | --- | --- | --- | --- | --- |
| master | 378fc1b | 105 | 4.97% | 25175605 | 33983.1 |
| #3326 | 6052447 | 1034 | 99.97% | 144993355 | 16822.8 |
| #3337 (with bugs) | 9a62096 | 952 | 99.96% | 155684975 | 14121.9 |
| #3337 | 1586e86 | 1097 | 100% | 189810290 | 15624.7 |

Now we get a 100% full block (read count is 15,000), and we get 44.82 STX more fees than the alternate implementation.

@obycode
Contributor Author

obycode commented Oct 12, 2022

The failing unit test core::tests::mempool_walk_over_fork relies on behavior internal to the old design: the "last known nonces" in the mempool table. This design does not use those, so the test fails. I will look into whether it is worth rewriting the test, or if it should just be skipped.

@obycode
Contributor Author

obycode commented Oct 12, 2022

Updated table with new numbers from the latest version from #3326:

| Version | Commit | Num Tx | %-Full | Fees Collected (uSTX) | Time Spent (ms) |
| --- | --- | --- | --- | --- | --- |
| master | 378fc1b | 105 | 4.97% | 25175605 | 33983.1 |
| #3326 | faffe42 | 1034 | 99.97% | 144993355 | 16604.9 |
| #3337 | 1586e86 | 1097 | 100% | 189810290 | 15624.7 |

// Simple size cap to the cache -- once it's full, all nonces
// will be looked up every time. This is bad for performance
// but is unlikely to occur due to the typical number of
// transactions processed before filling a block.
Member

How often is this cache cleared? Is it once per block?

Also, it is knowable how many addresses can be loaded per block -- we could, in theory, calculate the maximum number of transactions that could be mined in a block, and use that to derive a maximum number of addresses that block could touch.

Member

Given how this cache gets used, it might make sense to just use a simple LRU strategy for now. This cache is meant to help minimize the number of times we have to read the MARF to load up a nonce, so eliminating the most-common cases would be a good first attempt.

One day, subsequent refinement might consider the number of nodes that must be visited in the MARF to load the nonce. If the address's nonce was recently changed, then there are fewer tries to visit. It would make sense then to cache a nonce with probability proportional to how long ago it was last changed. It doesn't have to be in this PR, but I'm flagging it here for consideration.
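A minimal std-only sketch of the LRU strategy suggested here, assuming a simple address-to-nonce map (a production version would more likely use a purpose-built structure such as the `lru` crate):

```rust
use std::collections::{HashMap, VecDeque};

/// Sketch of an LRU nonce cache: when full, the least-recently-used
/// address is evicted to make room, which keeps the hottest senders
/// cached and bounds memory use.
struct LruNonceCache {
    map: HashMap<String, u64>,
    order: VecDeque<String>, // front = most recently used
    capacity: usize,
}

impl LruNonceCache {
    fn new(capacity: usize) -> Self {
        Self { map: HashMap::new(), order: VecDeque::new(), capacity }
    }

    /// Move `addr` to the most-recently-used position.
    fn touch(&mut self, addr: &str) {
        if let Some(pos) = self.order.iter().position(|a| a == addr) {
            let a = self.order.remove(pos).unwrap();
            self.order.push_front(a);
        }
    }

    fn insert(&mut self, addr: String, nonce: u64) {
        if self.map.contains_key(&addr) {
            self.map.insert(addr.clone(), nonce);
            self.touch(&addr);
            return;
        }
        if self.map.len() == self.capacity {
            // Evict the least recently used address.
            if let Some(old) = self.order.pop_back() {
                self.map.remove(&old);
            }
        }
        self.order.push_front(addr.clone());
        self.map.insert(addr, nonce);
    }

    fn get(&mut self, addr: &str) -> Option<u64> {
        let hit = self.map.get(addr).copied();
        if hit.is_some() {
            self.touch(addr);
        }
        hit
    }
}
```

The linear scan in `touch` is fine for a sketch; a real cache would pair the map with an intrusive list or use an off-the-shelf LRU to make every operation O(1).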

Contributor

Should we separate out the cache changes from the non-caching part of this PR?

Member

@gregorycoppola I believe that because this cache is key to mempool iteration performance, we should get it checked in with this PR.

Member

@jcnelson jcnelson Oct 15, 2022

@obycode On cache miss, could you instead write the last-known nonce to the mempool DB so we don't hit the MARF more than once per candidate on a call to iterate_candidates()? Then, on a subsequent cache miss, you could first check the mempool DB for the nonce, and then check the MARF if it's not there. You'd probably want to clear all last_known_nonces at the beginning of iterate_candidates().
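The proposed two-tier lookup might look roughly like this, with in-memory maps standing in for the mempool DB table and the MARF read (all names here are illustrative, not the codebase's actual API):

```rust
use std::collections::HashMap;

/// Two-tier nonce lookup: in-memory cache first, then the mempool DB,
/// then the MARF. MARF results are written through to the DB so each
/// address costs at most one MARF read per iteration pass.
struct TieredNonces {
    cache: HashMap<String, u64>, // hot in-memory cache
    db: HashMap<String, u64>,    // stand-in for the mempool DB table
    marf_reads: usize,           // counts simulated MARF reads
}

impl TieredNonces {
    fn new() -> Self {
        Self { cache: HashMap::new(), db: HashMap::new(), marf_reads: 0 }
    }

    /// Drop only the in-memory tier (e.g. on size-cap pressure); the DB
    /// tier still spares us a repeat MARF read.
    fn clear_cache(&mut self) {
        self.cache.clear();
    }

    fn get<F: Fn(&str) -> u64>(&mut self, addr: &str, marf_read: F) -> u64 {
        if let Some(&n) = self.cache.get(addr) {
            return n;
        }
        if let Some(&n) = self.db.get(addr) {
            self.cache.insert(addr.to_string(), n);
            return n;
        }
        self.marf_reads += 1;
        let n = marf_read(addr);
        self.db.insert(addr.to_string(), n); // write-through to the DB tier
        self.cache.insert(addr.to_string(), n);
        n
    }
}
```

Per the suggestion above, the persisted tier would be cleared at the start of each call to iterate_candidates() so stale nonces never survive across passes.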

Contributor Author

On cache miss, could you instead write the last-known nonce to the mempool DB so we don't hit the MARF more than once per candidate on a call to iterate_candidates()?

Yes, we can do this. I'll benchmark again to see what kind of effect this has on performance in the normal case.

Contributor

LRU or no, I think you can bound transactions at around 256k using:

pub const MEMPOOL_MAX_TRANSACTION_AGE: u64 = 256;

And a calculation that around 1000 transactions fill a block.

So, the largest number of transactions you would "expect" to crawl through before hitting a full block would be 256 * 1000 = 256k.

Contributor Author

That accounts for transactions that have already been mined, but not pending transactions that can't yet be mined because of their nonces. There's no way to cap that number.

Contributor

I was assuming that, on average, 1/256 transactions would have the right nonce. I think that in practice the "proportion ready to mine" would actually go up and down a bit.

Contributor

Also, the height that's used for eviction is independent of chaining; MemPoolDB::tx_submit just goes by 256 blocks since the height the tx was submitted, as far as I understand.

either way, this is just a model.

Member

@jcnelson jcnelson left a comment

Thanks for this PR @obycode!

One thing that I think we'll want to see before merging is some test coverage that verifies the following:

  • All transactions are visited at least once on iteration, assuming no time-outs and no out-of-space events occur.

  • We need to see what happens when there are more mempool transactions than the caches have space for (maybe the cache sizes could be configurable?). In particular, the test should verify that the caches reduce the number of I/O operations predictably, given their size. The cache could track this information in some internal accounting state.

  • We'll want to know what a good default cache size is for when the chain is under load. I think this could be obtained with live-testing with the mock miner, but it would be ideal if there was a unit test that could show us how to deduce this (or possibly a way for mempool to figure out for us what a good size would be).

  • We'll need to verify in a unit test that the RAM usage does not increase beyond a configured constant. I don't think this is currently happening in the code -- I think you have at least one instance of unbounded memory usage. The mempool can have an unbounded number of unmineable transactions, so we'll want to make sure the cache doesn't accidentally eat up all the RAM in the process of iteration.

Member

@jcnelson jcnelson left a comment

So, one thing about this PR that still gives me pause is that once the caches are full, there's no eviction strategy. This means that it's possible for the miner to exhibit a pathological behavior where the first NonceCache::MAX_SIZE transactions are considered but are unmineable. Then, once we find the first mineable transaction, we'll always be encountering a cache miss on NonceCache::get(), which incurs a MARF read or (if you agree with my comment above) a database read.

As an easy-to-implement stop-gap to avoid thrashing, could you make the cache sizes configurable, and then plumb through that configuration from the node's config file? Then, at least miners could set higher MAX_SIZE values if they had enough RAM for it.
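Plumbing a configurable cache size through the node's configuration could look roughly like this sketch, where the `nonce_cache_size` key, its default value, and the minimal `key = value` parser are all assumptions for illustration, not the node's actual config format:

```rust
/// Sketch of a miner config section carrying a tunable cache size.
#[derive(Debug, Clone)]
struct MinerConfig {
    nonce_cache_size: usize,
}

impl Default for MinerConfig {
    fn default() -> Self {
        // Illustrative default; operators with more RAM could raise it
        // to avoid the thrashing scenario described above.
        MinerConfig { nonce_cache_size: 1024 * 1024 }
    }
}

impl MinerConfig {
    /// Parse a minimal `key = value` config fragment, keeping the
    /// default when the key is absent or malformed.
    fn from_str(s: &str) -> Self {
        let mut cfg = MinerConfig::default();
        for line in s.lines() {
            if let Some((key, value)) = line.split_once('=') {
                if key.trim() == "nonce_cache_size" {
                    if let Ok(n) = value.trim().parse() {
                        cfg.nonce_cache_size = n;
                    }
                }
            }
        }
        cfg
    }
}
```

The point is only that the limit becomes an operator decision rather than a hard-coded constant; the actual wiring would go through the node's existing config structs.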

@gregorycoppola
Contributor

I think this PR needs more tests.

I can take it over and write the tests that people want.

I opened a discussion #3345 that depicts the mempool walk as a pipeline of transformations as follows. These are some levels at which we can add tests, in whichever style.
[Image: dag-representation diagram]

Clarified the "best effort", based on Greg's feedback.
@jcnelson
Member

@obycode Should the nonces get bumped when the result is Skipped or Problematic?

I don't think so. If a transaction from origin address A at nonce N can't be mined, then neither can any transaction from A with nonce N+k.

@obycode
Contributor Author

obycode commented Oct 19, 2022

I don't think so. If a transaction from origin address A at nonce N can't be mined, then neither can any transaction from A with nonce N+k.

Thanks, that was my thought as well. What about ProcessingError? Does that indicate that the transaction was included in the block with an error or something else? The comment says "It may succeed later depending on the error" which makes me unsure without digging into the code.

@obycode
Contributor Author

obycode commented Oct 19, 2022

Looks like we should only bump the nonces on Success.
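The bump-only-on-success rule can be sketched as follows (the `TxEvent` enum here is a simplified stand-in for the node's actual `TransactionEvent` type):

```rust
/// Simplified outcome of attempting a transaction during the mempool walk.
enum TxEvent {
    Success,
    Skipped,
    Problematic,
    ProcessingError,
}

/// Only a successfully included transaction advances the expected nonce;
/// any other outcome leaves it unchanged, so higher-nonce transactions
/// from the same origin are not considered prematurely.
fn next_expected_nonce(current: u64, event: &TxEvent) -> u64 {
    match event {
        TxEvent::Success => current + 1,
        _ => current,
    }
}
```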

obycode and others added 2 commits October 19, 2022 10:27
On `TransactionEvent`s other than `Success`, the nonces should not be
bumped because they indicate that the transaction is not included in the
block.

#[cfg(test)]
{
assert!(self.cache.len() <= self.max_cache_size + 1);
Contributor

Does this really need to be +1? I might have just put that in when hacking.

Contributor Author

No, that should not need to be +1. I'll remove it in my next commit.

let deadline = get_epoch_time_ms() + (self.settings.max_miner_time_ms as u128);
let mut block_limit_hit = BlockLimitFunction::NO_LIMIT_HIT;

mem_pool.reset_last_known_nonces()?;
Member

We're not using any of the methods that manipulate the last_nonces columns, right? If so, then can you delete those as well, and add a comment to the schema that added these columns that they are no longer used?

Contributor Author

Sure. Just to be clear, you want to remove those columns from the DB in the latest schema in addition to deleting all of the related code?

Member

If it's not too much trouble -- i.e. if it can be done with a DROP COLUMN. Sqlite has a bunch of constraints on when you can and cannot do this, however, so don't worry about it if sqlite is preventing you.

Contributor Author

I'm seeing:

thread 'main' panicked at 'Failed to open mempool db: SqliteError(SqliteFailure(Error { code: Unknown, extended_code: 1 }, Some("near \"DROP\": syntax error")))', src/main.rs:739:14

But strangely, I can run the same query from the command line and it works.

ALTER TABLE mempool DROP COLUMN last_known_origin_nonce;

Contributor Author

Maybe sqlite allows it but rusqlite does not?

Member

@jcnelson jcnelson left a comment

Overall this LGTM! Thank you for seeing this through @obycode @gregorycoppola!

Do you have new benchmark numbers? Can you instantiate this on a mock miner and see how it does?

Contributor

@gregorycoppola gregorycoppola left a comment

Assuming the mock miner and benchmarks are as expected, I approve.

thanks everyone!

@obycode
Contributor Author

obycode commented Oct 19, 2022

Do you have new benchmark numbers?

I've run them informally, but will collect new numbers now.

Can you instantiate this on a mock miner and see how it does?

Yes, it is running now. Last block:

INFO [1666196520.482998] [src/chainstate/stacks/miner.rs:2413] [relayer] Miner: mined anchored block, block_hash: bdb4cae9809c7021c4bf1b6b2b7ce498543a2018e0724dea5bb24c4e9c14caf3, height: 80211, tx_count: 72, parent_stacks_block_hash: f7d4afe1744f540899486d1c8811a3e42f22aace754dab19b59664e29722444e, parent_stacks_microblock: 051d171d080744f083318cde80ab998774756632ee68da6d91cc86604374d3ca, parent_stacks_microblock_seq: 0, block_size: 21224, execution_consumed: {"runtime": 64230390, "write_len": 32388, "write_cnt": 626, "read_len": 12103538, "read_cnt": 4363}, %-full: 29, assembly_time_ms: 2175, tx_fees_microstacks: 1054503

@obycode
Copy link
Copy Markdown
Contributor Author

obycode commented Oct 19, 2022

Latest benchmarking numbers:

| Version | Commit | Num Tx | %-Full | Fees Collected (uSTX) | Time Spent (ms) |
| --- | --- | --- | --- | --- | --- |
| #3337 | 35b38cb | 1097 | 100% | 189810290 | 10736.7 |

@obycode
Contributor Author

obycode commented Oct 19, 2022

@jcnelson those failing unit tests depended on the old iterate_candidates incrementing the nonces on a Skipped result. I will update the tests, but I want to make sure that the new behavior is correct. It seems correct to me.

These tests depended on the old implementation's incrementing of nonces
when a transaction was skipped. The correct behavior is to only
increment the nonces when the transaction is successfully included.
Therefore these tests need to simulate a success event in order to
exercise the expected behavior.
@obycode
Contributor Author

obycode commented Oct 19, 2022

Tests updated in 1a65f0f.

@obycode
Contributor Author

obycode commented Oct 19, 2022

I am adding some unit tests to test the behavior with a skipped or problematic transaction.

The old implementation incremented nonces when there was an error,
problematic, or skipped transaction, which would cause it to incorrectly
consider later nonces from the same addresses. Three new unit tests are
added to check for these cases.
@obycode
Contributor Author

obycode commented Oct 19, 2022

New unit tests added in 36690d0. master fails these unit tests, which I believe would have caused poor behavior: repeatedly selecting bad transactions as candidates.


#[test]
/// This test verifies that when a transaction is skipped, other transactions
/// from the same address with higher nonces are not included in a block.
Member

Specifically, we want it to be the case that these higher-nonce transactions aren't even considered.

Member

No action needed on this, btw. I'm happy to take it once I merge this to #3335

Contributor Author

That is what the test is checking, but you're right, the comment should be more clear. Thanks for handling that!

@blockstack-devops
Contributor

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@stacks-network stacks-network locked as resolved and limited conversation to collaborators Nov 16, 2024
@wileyj wileyj deleted the feat/mempool-improvements branch March 11, 2025 21:30

Labels

2.05.0.5.0 · L1 Working Group (Issue or PR related to improving L1) · locked


4 participants