Description
During the merge of steps 0-2048 on Ethereum mainnet, we observed chain-tip impact. Collecting the reasons here:
- During merge, new bloom filters are opened while the old ones are not yet closed. This caused about +2 GB of RAM use. It is by design (we open new files before closing old ones, and close the old ones only after no readers are left), but .kv bloom filters add some constraints here:
  - We agreed to prioritize work on "prohibit large merges on the user's side", but work on this issue has not started because we had no free hands: handling 2x domain folder free space when merge #15343 (comment)
- commitment.kv: we need to disable key compression (or replace it with page-level compression), just to speed up the merge.
- The commitment.kv merge traverses files sequentially, but it reads the acc.kv/storage.kv files ("resolve short keys") in random order. I think this is the main reason for the chain-tip impact:
1 @ 0x47a539 0x4bca3b 0x122b862 0x122bf2d 0x123605e 0x1755531 0x1756ff2 0x14bd4ec 0x1756e6a 0x17842d3 0x1729d5e 0x91a350 0x4bb461
# 0x122b861 github.com/erigontech/erigon/db/seg.(*Getter).nextPos+0x101 github.com/erigontech/erigon/db/seg/decompress.go:620
# 0x122bf2c github.com/erigontech/erigon/db/seg.(*Getter).Next+0x4c github.com/erigontech/erigon/db/seg/decompress.go:726
# 0x123605d github.com/erigontech/erigon/db/seg.(*Reader).Next+0x3d github.com/erigontech/erigon/db/seg/seg_auto_rw.go:65
# 0x1755530 github.com/erigontech/erigon/db/state.(*DomainRoTx).lookupByShortenedKey+0x170 github.com/erigontech/erigon/db/state/domain_committed.go:281
# 0x1756ff1 github.com/erigontech/erigon/db/state.(*DomainRoTx).commitmentValTransformDomain.func1.1+0x111 github.com/erigontech/erigon/db/state/domain_committed.go:370
# 0x14bd4eb github.com/erigontech/erigon/execution/commitment.BranchData.ReplacePlainKeys+0xdab github.com/erigontech/erigon/execution/commitment/commitment.go:478
# 0x1756e69 github.com/erigontech/erigon/db/state.(*DomainRoTx).commitmentValTransformDomain.func1+0x6c9 github.com/erigontech/erigon/db/state/domain_committed.go:424
# 0x17842d2 github.com/erigontech/erigon/db/state.(*DomainRoTx).mergeFiles+0xd12 github.com/erigontech/erigon/db/state/merge.go:502
# 0x1729d5d github.com/erigontech/erigon/db/state.(*AggregatorRoTx).mergeFiles.func2+0x31d github.com/erigontech/erigon/db/state/aggregator.go:1462
# 0x91a34f golang.org/x/sync/errgroup.(*Group).Go.func1+0x4f golang.org/x/sync@v0.18.0/errgroup/errgroup.go:93
- We don't have an IO rate limiter (to make the chain-tip impact predictable).
- If Erigon is restarted while files are being indexed, there will be downtime (indexing is currently a blocking operation at startup). Example: building of
v2.0-commitment.0-2048.kvi
7 days of monitoring data (just to keep more evidence in one place):
Prune also got slower during that time (the number of steps in the DB keeps growing and prune can't keep up). This means other "prune issues" may be related to the merge's impact on the node:

Raw Ideas:
- We need to experiment with embedding state into commitment.kv to avoid the "resolve keys" random reads of acc.kv/storage.kv
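A toy model of why this could help, under stated assumptions: it treats a "short key" as a position in an acc.kv/storage.kv-like file, so merging referenced values needs one positional lookup per reference, while values that carry their data inline can be copied sequentially. All names here (`kvFile`, `mergeReferenced`, `mergeEmbedded`) are hypothetical and heavily simplified, and "embedding" is read here as "storing the referenced key inline", which is only one possible interpretation of the idea.

```go
package main

import "fmt"

// kvFile is a toy stand-in for acc.kv/storage.kv: keys addressable by position.
type kvFile struct {
	keys  []string
	reads int // number of positional lookups, i.e. potential random reads
}

func (f *kvFile) keyAt(pos int) string {
	f.reads++
	return f.keys[pos]
}

// mergeReferenced rewrites position-based references for a merged file.
// Each reference costs one lookup in the old file to recover the plain key.
func mergeReferenced(branches [][]int, old *kvFile, posInMerged map[string]int) [][]int {
	out := make([][]int, len(branches))
	for i, refs := range branches {
		out[i] = make([]int, len(refs))
		for j, pos := range refs {
			plain := old.keyAt(pos) // access order follows the branch data, not the file
			out[i][j] = posInMerged[plain]
		}
	}
	return out
}

// mergeEmbedded copies branches that already carry plain keys inline.
// No lookups into other files are needed.
func mergeEmbedded(branches [][]string) [][]string {
	out := make([][]string, len(branches))
	for i, keys := range branches {
		out[i] = append([]string(nil), keys...)
	}
	return out
}

func main() {
	old := &kvFile{keys: []string{"a", "b", "c"}}
	merged := map[string]int{"a": 10, "b": 11, "c": 12}
	refs := mergeReferenced([][]int{{2, 0}, {1}}, old, merged)
	fmt.Println(refs, "lookups:", old.reads)
	fmt.Println(mergeEmbedded([][]string{{"c", "a"}, {"b"}}))
}
```

The trade-off the experiment would have to measure is larger commitment.kv files versus fewer random reads during merge.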