[SeiDB] Fix concurrent map access by yzang2019 · Pull Request #411 · sei-protocol/sei-cosmos

yzang2019 · 2024-01-30T00:11:15Z

Describe your changes and provide context

This should fix the concurrent map access on the storeV2 root multistore's map (rs.ckvStores). The map is currently read without holding a read lock, so when another goroutine modifies it at the same time, the access panics.
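A minimal sketch of the pattern described above, not the actual diff from this PR. The `store` type, its `mtx` field, and the `get`/`set` helpers are hypothetical; only the `ckvStores` name comes from the description. Unsynchronized concurrent reads and writes to a Go map are a data race that the runtime can detect and abort on, so readers take `RLock` and writers take `Lock` on a shared `sync.RWMutex`.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for the root multistore.
// Only the ckvStores field name is taken from the PR description.
type store struct {
	mtx       sync.RWMutex
	ckvStores map[string]string
}

func newStore() *store {
	return &store{ckvStores: make(map[string]string)}
}

// get holds the read lock for the duration of the map lookup, so a
// concurrent writer cannot mutate the map while it is being read.
func (s *store) get(key string) (string, bool) {
	s.mtx.RLock()
	defer s.mtx.RUnlock()
	v, ok := s.ckvStores[key]
	return v, ok
}

// set holds the write lock, which excludes all readers and other writers.
func (s *store) set(key, value string) {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	s.ckvStores[key] = value
}

func main() {
	s := newStore()
	var wg sync.WaitGroup
	// Run readers and writers concurrently against the same map.
	for i := 0; i < 8; i++ {
		wg.Add(2)
		go func(i int) {
			defer wg.Done()
			s.set(fmt.Sprintf("store-%d", i), "ok")
		}(i)
		go func(i int) {
			defer wg.Done()
			s.get(fmt.Sprintf("store-%d", i))
		}(i)
	}
	wg.Wait()
	v, ok := s.get("store-0")
	fmt.Println(v, ok) // prints "ok true"
}
```

Running the sketch with `go run -race` is one way to check that the guarded accesses no longer race; removing the `RLock`/`RUnlock` pair in `get` should make the race detector report the problem.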

Testing performed to validate your change

storev2/rootmulti/store.go

codecov · 2024-01-30T00:43:03Z

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (5a1afb8) 54.81% compared to head (7934603) 54.81%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #411      +/-   ##
==========================================
- Coverage   54.81%   54.81%   -0.01%     
==========================================
  Files         622      622              
  Lines       52203    52205       +2     
==========================================
  Hits        28615    28615              
- Misses      21504    21506       +2     
  Partials     2084     2084

| Files | Coverage Δ |
|---|---|
| storev2/rootmulti/store.go | `3.08% <0.00%> (-0.02%)` ⬇️ |

* main:
  - [SeiDB] Fix concurrent map access (#411)
  - No longer disable dynamic dep generation during ACL dependency generation (#404)
  - fix(baseapp): Ensure Panic Recovery in Prepare & Process Handlers (#401)
  - Revert removing events for cachekv (#396)
  - Add migration handler for disabling seqno (#394)


[SeiDB] Part2:Add io.closer interface for commit multistore (#374)

Problem: With SeiDB, we need to support closing the database when self remediation is triggered, otherwise the chain will halt due to not able to reopen the database.

Changes:
- Add an interface for CMS (commit multistore)
- Add implementation for existing cms
- Add close logic for self remediation to close the commit store and state store

Tested with local docker chain

[SeiDB] Part3: Adjust Snapshot and Pruning logic for storeV2 (#376)

Changes:
- Adjust and fix pruning and snapshot logic which could break seidb
- Add some logging for snapshot manager

Covered by unit test and local docker cluster

SeiDB part 4

SeiDB part 5

[SeiDB] Fix concurrent map access (#411)

This should fix the concurrent map access for storeV2 root multistore over (rs.ckvStores). The problem is that it is currently not protected by a read lock so when other goroutine modify the map, it will throw panic

[SeiDB] Fix various issues from ottersec audit (#418)

Fix a bunch of minor issues from audit:
- LoadVersionAndUpgrade should not ignore version
- Simplify CacheWrap to avoid wrapping twice
- SS store query is not setting height correctly in the query response
- Add panic logic for bumpversion if it is false in commit
- Use the correct commit info in query

[SeiDB] Fix SS apply changeset version off by 1 (#424)

Problem: When apply changeset to SS store, currently we are using the lastCommitInfo version, which is the previous block height. This means the version is off by 1. The correct way is to use the current block height for the changeset version

Will add integration test for this test case

SeiDB fixes

Add metrics for storeV2 (#456)

Add two new metrics for StoreV2:
- Add metric to monitor SC commit latency
- Add metric to monitor SS store last commit version

Tested locally and see metrics appear

Fix Panic: Check For Error Before Defer (#482)

- Checks for err before `defer`
- Without this, if there is a err, the node will panic on the `defer` because `scStore` is `nil`
- Verified on node

Add validation to prevent data corruption due to SS store misoperation (#503)

Problem: Currently if a node did state sync with SS disabled, and then we turn on SS, it will behave wrong due to SS not having initial version for most keys and cause data corruption. And if that node tries to create a snapshot, it will produce a bad snapshot which could cause other nodes to app hash.

Solution: In this PR, we simply add a validation check to make sure if SS is suddenly enabled, and SS doesn't have any data yet, SC must also not have any data, otherwise, we will panic during startup.

Tested on arctic-1: <img width="887" alt="image" src="https://github.com/sei-protocol/sei-cosmos/assets/50607998/9bc1a67b-7b11-4254-96a2-9e155239bb85"> Enabled SS first, hit panic, and then disable SS again, node was running fine.

Bump seidb version

SeiDB fixes

Remove noVersioning and add packages

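The "Check For Error Before Defer (#482)" entry above describes a common Go pitfall, which can be sketched as follows. This is an illustration under stated assumptions, not the code from that commit: the `handle` type and the `open` and `use` functions are hypothetical stand-ins for a store handle such as `scStore` and its constructor. The point is ordering: the error check has to come before the `defer`, because on failure the handle is `nil` and the deferred call would dereference it.

```go
package main

import (
	"errors"
	"fmt"
)

// handle is a hypothetical stand-in for a store handle such as scStore.
type handle struct{ closed bool }

// Close writes to the receiver, so calling it on a nil *handle panics.
func (h *handle) Close() error {
	h.closed = true
	return nil
}

// open is a hypothetical constructor that returns a nil handle on failure.
func open(fail bool) (*handle, error) {
	if fail {
		return nil, errors.New("open failed")
	}
	return &handle{}, nil
}

// use checks err before registering the deferred Close. If the defer came
// first, a failed open would leave h nil and Close would panic on return.
func use(fail bool) error {
	h, err := open(fail)
	if err != nil {
		return err
	}
	defer h.Close()
	return nil
}

func main() {
	fmt.Println(use(true))  // prints "open failed"
	fmt.Println(use(false)) // prints "<nil>"
}
```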

Fix concurrent map access

7934603

yzang2019 requested review from Kbhat1 and alexanderbez January 30, 2024 00:11

Kbhat1 approved these changes Jan 30, 2024

alexanderbez reviewed Jan 30, 2024

storev2/rootmulti/store.go (resolved)

alexanderbez approved these changes Jan 30, 2024

yzang2019 merged commit 219175b into main Jan 30, 2024

yzang2019 deleted the yzang/SEI-6529 branch January 30, 2024 01:01

yzang2019 merged 1 commit into main from yzang/SEI-6529
