Add GetTxPriorityHint and mempool backpressure via priority drops #301
Add hooks to allow injection of transaction prioritization logic and handle ABCI requests to get transaction priority via the hook. Relates to:
* sei-protocol/sei-tendermint#301
Expand the ABCI interface to expose a side-effect-free get-transaction-priority call.
Rename the priority interfaces to "hint", to make it clear that it is a hint. Integrate it with the mempool to reject transactions below the threshold when the mempool is over-utilised.
Codecov Report: ❌ Patch status has failed because the patch coverage (63.95%) is below the target coverage (70.00%). You can increase the patch coverage or adjust the target coverage.

```
@@            Coverage Diff             @@
##             main     #301      +/-  ##
==========================================
+ Coverage   57.10%   57.28%   +0.17%
==========================================
  Files         258      255       -3
  Lines       34487    33914     -573
==========================================
- Hits        19695    19427     -268
+ Misses      13214    12958     -256
+ Partials     1578     1529      -49
```
```
# Conflicts:
#	abci/types/types.pb.go
#	proto/tendermint/abci/types.proto
```
Keep a sample of the transaction priorities handled by the mempool to determine the drop cut-off when the mempool is over-utilised.
```go
}
if j := s.rng.Int64N(s.seen); int(j) < s.size {
	s.samples[j] = item
}
```
afaiu, this means that the reservoir is never pruned, i.e. it approximates the whole history of transactions since the node was started. Is this intentional? What if the traffic pattern changes over time?
> Is this intentional? What if the traffic pattern changes over time?
It is. The rationale is to discover a more and more representative distribution of the global space of priorities a node handles. That offers more predictable behaviour for node operators, who can then reason about what proportion of transactions they are willing to give up if the mempool is struggling.
The transactions end up being retried anyway, so I think this is a relatively low-risk approach?
Taking an alternative route where the reservoir is a "moving window" of priorities is also valid; other than my limited intuition, I have no data to decide which is better for Sei node operators.
It has a fixed size though, right? What would happen if the total number of unique priorities exceeds the size limit — do we evict the old samples?
> It has a fixed size though, right?
Correct.
> What would happen if the total number of unique priorities exceeds the size limit, do we evict the old samples?
"uniqueness" does not make any difference. Once the number of samples (unique or not) exceeds the fixed size, we may pick an item at random to replace with the new sample. See algorithm R:
The way we select which item to replace is to pick a random value between 0 and the total number of samples seen so far, if the random number is within the range of sample sizes then we replace. Otherwise, we ignore the given sample.
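The replacement rule described above is classic Algorithm R. The following is a minimal, self-contained sketch; the names (`reservoir`, `newReservoir`) are illustrative and do not match the PR's types, and it uses the stdlib `math/rand` (the PR's snippet uses `math/rand/v2`).

```go
package main

import (
	"fmt"
	"math/rand"
)

// reservoir keeps a fixed-size uniform sample of every priority seen so far
// (Algorithm R). Illustrative only; not the PR's actual implementation.
type reservoir struct {
	samples []int64
	size    int
	seen    int64
	rng     *rand.Rand
}

func newReservoir(size int, seed int64) *reservoir {
	return &reservoir{
		samples: make([]int64, 0, size),
		size:    size,
		rng:     rand.New(rand.NewSource(seed)),
	}
}

// add records one priority. Until the reservoir is full every item is kept;
// after that, the n-th item replaces a uniformly chosen slot with probability
// size/n, which keeps the sample uniform over everything seen so far.
func (s *reservoir) add(priority int64) {
	s.seen++
	if len(s.samples) < s.size {
		s.samples = append(s.samples, priority)
		return
	}
	if j := s.rng.Int63n(s.seen); int(j) < s.size {
		s.samples[j] = priority
	}
}

func main() {
	r := newReservoir(4, 1)
	for p := int64(0); p < 100; p++ {
		r.add(p)
	}
	// The sample stays capped at the configured size no matter how many
	// priorities have been observed.
	fmt.Println(len(r.samples), r.seen) // prints: 4 100
}
```

Note the memory bound: the reservoir never grows past `size`, which is what makes the whole-history approximation cheap enough to keep in the mempool.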
Cache the last-requested percentile for at least 5 seconds, and avoid redundant recomputation when the sample set has not changed. This cache design rests on the rationale that the percentile value requested in Sei is fixed in configuration for the lifetime of the application; therefore, the only value that changes in the reservoir is the samples.
```go
// lastPercentileCacheTTL is the duration for which a cached percentile value is
// considered valid if no new percentile p is asked for.
const lastPercentileCacheTTL = 5 * time.Second
```
I am happy to parameterise this if the reviewers prefer it from the get-go.
To remove a couple of edge cases and a performance footgun, make the reservoir less generic by fixing it to a single percentile.
Only consider the percentile cache dirty if an add expands the samples or replaces an item with a different value.
|
Big thanks to @yzang2019 for help in testing this under load 🍻 ❤️
Extract the logic of transaction prioritisation from various Ante handlers for both EVM and cosmos transactions into a side-effect-free, lightweight API exposed via the ABCI interface. Relates to:
* sei-protocol/sei-tendermint#301
* ~~sei-protocol/sei-cosmos#598~~ (repo archived; changes are ported to this PR)
Extends the ABCI interface with a side-effect-free `GetTxPriorityHint` method and connects transaction priority hints to the mempool. The change gives the network a straightforward way to apply backpressure when the mempool is congested, without blocking liveness-critical transactions. By making the dropping logic priority-aware, nodes can favor important traffic under load.

Three new configuration options are introduced:
* `DropUtilisationThreshold` – the utilisation level (0.0–1.0) at which dropping starts. For example, `0.8` means the mempool must be at least 80% full before the policy takes effect.
* `DropPriorityThreshold` – the fraction of lowest-priority transactions to drop once the utilisation threshold is exceeded. The default `0.1` drops the bottom 10%.
* `DropPriorityReservoirSize` – the number of samples used to estimate the distribution of transaction priorities. Defaults to 10,240 entries (~80KB). Larger values improve accuracy at the cost of memory.

The reservoir approach is designed to avoid tracking every transaction’s priority directly, which would be inefficient. Instead, the mempool keeps a statistically representative sample (the reservoir) of observed priorities. This sample makes it possible to estimate percentile cutoffs with good accuracy, without storing all values. Operators can tune the reservoir size depending on whether memory savings or precision is more important for their setup.
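As a rough illustration only, a node operator's config might carry the three options like this. The option names come from the PR description, but the `[mempool]` section placement, kebab-case key spelling, and file layout are assumptions, not the PR's actual config schema.

```toml
# Hypothetical config.toml fragment — key names and section are illustrative.
[mempool]
drop-utilisation-threshold = 0.8      # start dropping at >= 80% mempool utilisation
drop-priority-threshold = 0.1         # shed the bottom 10% of priorities
drop-priority-reservoir-size = 10240  # ~80KB of sampled priorities
```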
The combination of the utilisation threshold, priority cutoff, and reservoir sampling introduces a low-risk form of backpressure. Only the least important transactions are dropped when space is tight, ensuring new, high-priority transactions can still enter. This improves resilience under load while keeping the policy simple and predictable.
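The decision described above can be condensed into a single predicate. A hedged sketch with hypothetical names (`shouldDrop` and its parameters are not the PR's actual API): dropping engages only past the utilisation threshold, and even then only for transactions below the reservoir's percentile cutoff.

```go
package main

import "fmt"

// shouldDrop sketches the combined backpressure policy: below the utilisation
// threshold everything is accepted; above it, only transactions whose priority
// hint falls below the reservoir's cutoff are shed. Hypothetical helper.
func shouldDrop(used, capacity int64, utilThreshold float64, txPriority, priorityCutoff int64) bool {
	if capacity == 0 {
		return false
	}
	utilisation := float64(used) / float64(capacity)
	if utilisation < utilThreshold {
		return false // plenty of room: accept regardless of priority
	}
	// Over-utilised: drop only the lowest-priority traffic.
	return txPriority < priorityCutoff
}

func main() {
	// Mempool 90% full, bottom-10% cutoff priority of 10.
	fmt.Println(shouldDrop(90, 100, 0.8, 5, 10))  // low priority under load: true
	fmt.Println(shouldDrop(50, 100, 0.8, 5, 10))  // under-utilised: false
	fmt.Println(shouldDrop(90, 100, 0.8, 25, 10)) // high priority under load: false
}
```

The key property is the second case: while utilisation stays below the threshold, priority never causes a rejection, so the policy is strictly opt-in under load.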