perf: improve eth_getLogs performance with early rejection and backpressure #2591
Conversation
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
Codecov Report ❌ Patch coverage and impacted files:

@@ Coverage Diff @@
## main #2591 +/- ##
==========================================
- Coverage 45.85% 43.23% -2.62%
==========================================
Files 1182 1851 +669
Lines 101511 152773 +51262
==========================================
+ Hits 46545 66056 +19511
- Misses 50894 80837 +29943
- Partials 4072 5880 +1808
Flags with carried forward coverage won't be shown.
…eth_getLogs

- add comprehensive worker pool metrics (Prometheus + optional stdout debug)
- add backpressure mechanism to reject requests when the system is overloaded
- add early rejection for pruned blocks to avoid wasting resources
- add DB semaphore tracking for I/O monitoring
- align DB semaphore size with worker pool size
- add EVM_DEBUG_METRICS env var to enable debug output to stdout
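The backpressure and DB-semaphore ideas in the list above can be sketched with a buffered channel used as a semaphore. This is a minimal illustration, not the PR's actual code: names like `dbSemaphore`, `workerPoolSize`, and the error text are assumptions. The key property is that a full semaphore causes an immediate rejection rather than queueing, which is what sheds load when the node is overloaded:

```go
package main

import (
	"errors"
	"fmt"
)

// workerPoolSize is a hypothetical pool size; the PR aligns the DB
// semaphore capacity with the worker pool size.
const workerPoolSize = 4

// dbSemaphore limits concurrent DB reads to workerPoolSize slots.
var dbSemaphore = make(chan struct{}, workerPoolSize)

var errOverloaded = errors.New("server overloaded, try again later")

// acquireDB applies backpressure: instead of blocking until a slot
// frees up, it rejects immediately when all slots are taken.
func acquireDB() error {
	select {
	case dbSemaphore <- struct{}{}:
		return nil
	default:
		return errOverloaded
	}
}

// releaseDB frees a slot taken by acquireDB.
func releaseDB() { <-dbSemaphore }

func main() {
	// Fill every slot, then show that the next acquire is rejected.
	for i := 0; i < workerPoolSize; i++ {
		if err := acquireDB(); err != nil {
			fmt.Println("unexpected:", err)
			return
		}
	}
	fmt.Println(acquireDB()) // rejected: semaphore is full

	// Drain the semaphore; acquires succeed again.
	for i := 0; i < workerPoolSize; i++ {
		releaseDB()
	}
	fmt.Println(acquireDB()) // succeeds: prints <nil>
	releaseDB()
}
```

The non-blocking `select` with a `default` arm is the whole trick: a blocking send would queue requests and grow latency under load, while the rejection path returns an error to the RPC caller right away.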
force-pushed from f51122d to 230ad6f
CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
    // WorkerPoolMetrics tracks worker pool performance metrics
    type WorkerPoolMetrics struct {
Nonblocker: Sad that this was not implemented using otel. Is it too late to do that instead of adding to the well out of date telemetry and go-metrics stuff considering this repo is already set up with otel?
evmrpc/server.go (Outdated)

    // Start metrics printer (every 5 seconds)
    // Prometheus metrics are always exported; stdout printing requires EVM_DEBUG_METRICS=true
    StartMetricsPrinter(5 * time.Second)
Can we extract all of the hardcoded intervals into one constant?
    // StopMetricsPrinter stops the metrics printer
    func StopMetricsPrinter() {
As far as I can see this is not called from anywhere, which means there is nothing to stop the goroutine started by StartMetricsPrinter.
This one is a blocker.
    // recordBlockRangeBucket records the block range into the appropriate bucket
    func (m *WorkerPoolMetrics) recordBlockRangeBucket(blockRange int64) {
Hm, why not use histogram?
ditto for the other occurrences.
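For context, hand-rolled bucketing like recordBlockRangeBucket typically looks something like the sketch below (the boundaries here are hypothetical, not the PR's). The reviewer's point is that a Prometheus histogram with exponential buckets would subsume this logic and also export sum and count for free:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// bucketBounds are hypothetical upper bounds for block-range buckets;
// a Prometheus histogram (e.g. prometheus.ExponentialBuckets) would
// replace this entire mechanism.
var bucketBounds = []int64{100, 1_000, 10_000}

// bucketCounts[i] counts ranges <= bucketBounds[i]; the final slot
// plays the role of the +Inf bucket.
var bucketCounts [4]atomic.Int64

// recordBlockRangeBucket increments the first bucket whose bound the
// range does not exceed, falling through to the overflow bucket.
func recordBlockRangeBucket(blockRange int64) {
	for i, bound := range bucketBounds {
		if blockRange <= bound {
			bucketCounts[i].Add(1)
			return
		}
	}
	bucketCounts[len(bucketBounds)].Add(1)
}

func main() {
	// One sample per bucket, including the overflow bucket.
	for _, r := range []int64{50, 500, 5_000, 50_000} {
		recordBlockRangeBucket(r)
	}
	for i := range bucketCounts {
		fmt.Println(bucketCounts[i].Load())
	}
}
```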
/backport
…essure (#2591)

## Describe your changes and provide context

- add early rejection for pruned blocks to avoid wasting resources
- align DB semaphore size with worker pool size
- add backpressure mechanism to reject requests when the system is overloaded
- add comprehensive worker pool metrics (Prometheus + optional stdout debug)
- add DB semaphore tracking for I/O monitoring
- add EVM_DEBUG_METRICS env var to enable debug output to stdout (nothing printed by default)

## Testing performed to validate your change

- significant perf improvement on the eth_getLogs RPC
- average latency dropped by 90%+, as early rejection avoids resource waste
- the nodes are much more responsive, with very few situations where they're lagging

---------

Co-authored-by: Yiming Zang <50607998+yzang2019@users.noreply.github.com>
Co-authored-by: yzang2019 <zymfrank@gmail.com>

(cherry picked from commit 3bfc150)
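The early-rejection idea for pruned blocks can be illustrated in a few lines (a sketch; `earliestHeight`, `checkRange`, and the error wording are assumptions, not the PR's identifiers): validate the requested range against the earliest unpruned height before doing any DB work, so fully pruned queries fail fast instead of scanning and failing late.

```go
package main

import "fmt"

// earliestHeight stands in for the node's earliest unpruned block;
// on a pruning node, everything below it is unavailable.
const earliestHeight int64 = 1_000_000

// checkRange rejects a log query up front when the whole requested
// range is pruned, instead of wasting worker-pool and DB resources.
func checkRange(from, to int64) error {
	if to < earliestHeight {
		return fmt.Errorf("blocks %d-%d are pruned (earliest available: %d)",
			from, to, earliestHeight)
	}
	return nil
}

func main() {
	fmt.Println(checkRange(1, 100))             // fully pruned: rejected
	fmt.Println(checkRange(999_999, 1_000_050)) // overlaps available data: <nil>
}
```

Because the check runs before any worker is dispatched, a rejected request costs one comparison rather than a full range scan, which is consistent with the large latency drop reported above.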
Successfully created backport PR for