
flatkv cache #3027

Merged
cody-littley merged 102 commits into main from cjl/flatkv-cache
Apr 1, 2026

Conversation

@cody-littley (Contributor) commented Mar 5, 2026

Describe your changes and provide context

Add a caching layer to FlatKV, more than doubling performance in cryptosim benchmarks.

Testing performed to validate your change

Unit tests; ran the benchmark over several days.

@github-actions bot commented Mar 5, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

Build      Format     Lint       Breaking   Updated (UTC)
✅ passed  ✅ passed  ✅ passed  ✅ passed  Apr 1, 2026, 6:27 PM

@codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 75.52301% with 117 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.61%. Comparing base (3ed3bf2) to head (872b0a6).
⚠️ Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
sei-db/state_db/sc/flatkv/config.go 52.70% 19 Missing and 16 partials ⚠️
sei-db/db_engine/pebbledb/db.go 50.00% 18 Missing and 5 partials ⚠️
sei-db/state_db/sc/flatkv/store_write.go 83.68% 12 Missing and 11 partials ⚠️
sei-db/state_db/sc/flatkv/store.go 75.38% 8 Missing and 8 partials ⚠️
sei-db/common/threading/fixed_pool.go 77.77% 2 Missing and 2 partials ⚠️
sei-db/db_engine/dbcache/cache_config.go 63.63% 2 Missing and 2 partials ⚠️
sei-db/db_engine/pebbledb/pebbledb_config.go 60.00% 2 Missing and 2 partials ⚠️
sei-db/common/threading/adhoc_pool.go 83.33% 1 Missing and 1 partial ⚠️
sei-db/common/threading/elastic_pool.go 90.00% 1 Missing and 1 partial ⚠️
sei-db/db_engine/pebbledb/batch.go 60.00% 1 Missing and 1 partial ⚠️
... and 1 more
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3027      +/-   ##
==========================================
- Coverage   58.75%   58.61%   -0.15%     
==========================================
  Files        2095     2100       +5     
  Lines      173551   175039    +1488     
==========================================
+ Hits       101965   102594     +629     
- Misses      62465    63273     +808     
- Partials     9121     9172      +51     
Flag Coverage Δ
sei-chain-pr 73.07% <75.52%> (?)
sei-db 70.41% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
sei-db/common/metrics/phase_timer.go 85.10% <100.00%> (+0.32%) ⬆️
sei-db/config/sc_config.go 100.00% <100.00%> (ø)
sei-db/db_engine/dbcache/cache.go 77.77% <100.00%> (+66.66%) ⬆️
sei-db/db_engine/dbcache/cache_impl.go 95.45% <100.00%> (-0.20%) ⬇️
sei-db/db_engine/dbcache/shard.go 91.81% <100.00%> (+1.63%) ⬆️
sei-db/db_engine/pebbledb/pebble_metrics.go 69.09% <100.00%> (ø)
sei-db/db_engine/pebbledb/pebbledb_test_config.go 100.00% <100.00%> (ø)
sei-db/db_engine/types/types.go 100.00% <ø> (ø)
sei-db/state_db/sc/flatkv/flatkv_test_config.go 100.00% <100.00%> (ø)
sei-db/state_db/sc/flatkv/snapshot.go 65.98% <100.00%> (+0.19%) ⬆️
... and 12 more

... and 41 files with indirect coverage changes


return fmt.Errorf("metadata db config is invalid: %w", c.MetadataDBConfig.Validate())
}

if c.ReaderThreadsPerCore < 0 {
Contributor

If ReaderThreadsPerCore == 0 and ReaderConstantThreadCount == 0, it will create a pool with 0 workers, causing a deadlock?

Contributor Author

Good point. I've updated the pool constructor to more elegantly handle this case:

	if workers <= 0 {
		workers = 1
	}

if c.DataDir == "" {
return fmt.Errorf("data dir is required")
}
if c.CacheSize > 0 && (c.CacheShardCount&(c.CacheShardCount-1)) != 0 {
Contributor

Shall we also validate and make sure CacheShardCount > 0?

Contributor Author

Good point. Validation added.
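A sketch of what the combined check might look like (field names and error wording are illustrative, not the PR's exact code). The power-of-two requirement exists because it lets the cache map a key hash to a shard with a cheap bitmask, `hash & (shardCount-1)`, instead of a modulo:

```go
package main

import "fmt"

// validateShards checks that shardCount is a positive power of two
// whenever the cache is enabled (cacheSize > 0).
func validateShards(cacheSize, shardCount int) error {
	if cacheSize <= 0 {
		return nil // cache disabled; shard count is irrelevant
	}
	if shardCount <= 0 {
		return fmt.Errorf("cache shard count must be positive, got %d", shardCount)
	}
	if shardCount&(shardCount-1) != 0 {
		return fmt.Errorf("cache shard count must be a power of two, got %d", shardCount)
	}
	return nil
}

func main() {
	fmt.Println(validateShards(1024, 16)) // valid
	fmt.Println(validateShards(1024, 0))  // rejected: not positive
	fmt.Println(validateShards(1024, 12)) // rejected: not a power of two
}
```

Note that without the `> 0` check, a shard count of 0 would pass the bitmask test (`0 & -1 == 0`), which is exactly the gap the reviewer flagged.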

@yzang2019
Contributor

One blocker comment: I think we should decouple the cache from db_engine as much as possible, which would allow FlatKV to switch to a different db_engine easily without reimplementing the cache for each engine in the future.

defer wg.Done()
errs[idx] = b.Commit(syncOpt)
}(i, p.batch)
err := s.miscPool.Submit(s.ctx, func() {
Contributor

This could be a deadlock issue under load.

In commitBatches(), tasks are submitted to miscPool. Each task calls cachedBatch.Commit(), which calls cache.BatchSet(), which also submits to miscPool. If the pool queue is saturated (all workers executing batch commits), the inner BatchSet submissions will block waiting for a free slot — but slots can't free up until workers finish, and workers can't finish until BatchSet completes.

Contributor Author

This is a good observation, but I've already taken precautions against this exact scenario. ;)

There are several pool implementations:

  • fixed: N worker threads, work only runs on these threads
  • elastic: N worker threads, if worker thread is not immediately available then spin up new goroutine
  • adhoc: each job gets its own goroutine (for unit tests, mostly)

The misc pool is using an elastic pool. This means that it's safe to send tasks with blocking dependencies, without fear of deadlock.

miscPool := threading.NewElasticPool(ctx, "flatkv-misc", miscPoolSize)

In the elastic pool's implementation, the work queue is a channel of size 0, meaning that if there isn't a worker currently sitting idle, we will immediately fall through to the default case in the select statement.

func (ep *elasticPool) Submit(ctx context.Context, task func()) error {
	if task == nil {
		return fmt.Errorf("elastic pool: nil task")
	}
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-ep.ctx.Done():
		return fmt.Errorf("elastic pool is shut down")
	case ep.workQueue <- task:
		return nil
	default:
		// All warm workers are busy; spawn a temporary goroutine.
		go task()
		return nil
	}
}

}(i, db)
err := s.miscPool.Submit(s.ctx, func() {
errs[i] = db.Flush()
wg.Done()
Contributor

We should add defer for wg.Done(); otherwise a panic in the task skips Done() and wg.Wait() hangs forever.

Contributor Author
@cody-littley Mar 31, 2026

Change made.

return fmt.Errorf("invalid address length %d for key kind %d", len(keyBytes), kind)
}
addrStr := string(addr[:])
addrKey := string(AccountKey(addr))
Contributor

If anyone ever changes AccountKey to add a prefix or any other transformation (a natural evolution for a "DB key builder" function), batchReadOldValues would silently fail to find pending writes in s.accountWrites.

Suggest picking one canonical key representation and using it everywhere: either always string(addr[:]) or always string(AccountKey(addr)).

Contributor Author

Fixed.
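The invariant being suggested can be sketched like this (the `Address` type and map usage are illustrative; per the later comment, AccountKey today returns `addr[:]` unchanged):

```go
package main

import "fmt"

// Address stands in for the store's 20-byte account address type.
type Address [20]byte

// AccountKey builds the canonical DB key for an account address. Deriving
// every map key through this one function means a future change (e.g.
// adding a prefix byte) propagates to all lookups at once, instead of
// silently diverging from call sites that used string(addr[:]) directly.
func AccountKey(addr Address) []byte {
	return addr[:]
}

func main() {
	var addr Address
	addr[0] = 0xab

	pending := map[string][]byte{}
	key := string(AccountKey(addr)) // single canonical derivation
	pending[key] = []byte{1, 2, 3}

	// Any later lookup re-derives the key the same way, so writes and
	// reads can never disagree about the representation.
	v, ok := pending[string(AccountKey(addr))]
	fmt.Println(ok, v)
}
```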

if !ok {
continue
}
k := string(AccountKey(addr))
Contributor

Consider using string(addr[:]) for the s.accountWrites lookup (matching ApplyChangeSets and store_read.go), and a separate addrKey := string(AccountKey(addr)) for the accountOld/accountBatch maps. Today AccountKey returns addr[:], so the result is the same, but using two different derivations for the same map key is fragile if AccountKey ever changes.

Contributor Author

Change made.

return fmt.Errorf("metadata db config is invalid: %w", err)
}

if c.ReaderThreadsPerCore < 0 {
Contributor

nit: <= 0 to match the error text

Contributor Author

Fixed.

defer wg.Done()
storageErr = s.storageDB.BatchGet(storageBatch)
})
if err != nil {
Contributor

If a later Submit fails and we return early, previously submitted tasks may still be running and writing to their batch maps. How about calling wg.Wait() before returning on a submit error to avoid a race?

Contributor Author

Good point, although I think we can fix this with a slightly simpler solution.

Submit can only fail when contexts get cancelled, i.e. we should only expect to encounter this sort of error during system teardown workflows. So I don't think it's important to optimize performance here, just to make sure it's functionally correct.

The problem is that when we return, we're returning garbage map data. And even worse, we're returning garbage data that isn't threadsafe.

I don't think we need to block until all goroutines are finished. I think the important part is to just always return nil values if we are returning an error. It's ok if the goroutine doesn't immediately stop, as long as the caller isn't receiving unsafe/invalid data.

I've converted error return cases to use the following form:

		if err != nil {
			return nil, nil, nil, nil, fmt.Errorf("failed to submit batch get: %w", err)
		}

Contributor

Agreed on the batchReadOldValues fix.

In commitBatches() there is a similar issue, and I'd still lean toward switching back to plain goroutines, since there's no returned map state to null out there. The main issue in that path is just the partial-submit/unwind interaction, and direct goroutines seem like the simplest way to avoid it.

Contributor Author

As we discussed offline, I've changed the function signature so that Submit() never returns an error.

return fmt.Errorf("LogDir is empty, refusing to proceed")
}

if cfg.DeleteDataDirOnStartup {
Contributor

In cmd/cryptosim, DeleteDataDirOnStartup deletes DataDir, while in cmd/configure-logger it deletes LogDir, so enabling this flag now has inconsistent behavior between the two commands.

Contributor Author

Blame @masih for this one, configuring the logger at init time makes this sort of workflow wonky. 😜

I've added separate configurations for deleting log dirs, so now it should be easier to grok how the settings control the workflow.

}

// Validate checks that the configuration is sane and returns an error if it is not.
func (c *CacheConfig) Validate() error {
Contributor

Is this function being used now?

Contributor Author

Good catch, they weren't being called! I've added calls to these methods inside flatKV config.

	if err := c.AccountCacheConfig.Validate(); err != nil {
		return fmt.Errorf("account cache config is invalid: %w", err)
	}
	if err := c.CodeCacheConfig.Validate(); err != nil {
		return fmt.Errorf("code cache config is invalid: %w", err)
	}
	if err := c.StorageCacheConfig.Validate(); err != nil {
		return fmt.Errorf("storage cache config is invalid: %w", err)
	}
	if err := c.LegacyCacheConfig.Validate(); err != nil {
		return fmt.Errorf("legacy cache config is invalid: %w", err)
	}
	if err := c.MetadataCacheConfig.Validate(); err != nil {
		return fmt.Errorf("metadata cache config is invalid: %w", err)
	}

@cody-littley cody-littley added this pull request to the merge queue Apr 1, 2026
Merged via the queue into main with commit 232fee5 Apr 1, 2026
39 checks passed
@cody-littley cody-littley deleted the cjl/flatkv-cache branch April 1, 2026 18:58
yzang2019 added a commit that referenced this pull request Apr 1, 2026
* main:
  plt-228 fixed static check on app and evmrpc package (#3154)
  flatkv cache (#3027)
  Make cryptosim state store backend configurable + No Op Wrapper + Read Disable Config (#3145)
  Add warning message for IAVL deprecation (#3159)
  Change default min valid per window to zero (#3157)
  support for starting autobahn from non-zero global block (#3136)
  Fix upgrade list comparison to respect semver (#3153)