mmr proof generation failures on rococo #12864
Description
We've been seeing mmr_generateProof call failures on rococo over the last few weeks.
1. InconsistentStore
The first source of these errors is mmr_lib::Error::InconsistentStore.
> curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[3, "0x0910d049debc60f0f2ee10b892b1bd5044efff9dbb3f6479c63baacb83f92f26"]}' $NODE_SYNCED_NORMALLY
{"jsonrpc":"2.0","error":{"code":8010,"message":"Unexpected MMR error","data":"Error::Verify"},"id":1}
---
node logs:
2022-12-07 10:08:25 [<wasm:stripped>] MMR error: InconsistentStore
This issue is due to the offchain worker not being triggered on every block.
Solution
This can be mitigated by firing notifications on the initial block sync too, as suggested by @serban300 and done here:
Lederstrumpf@a571c51
However, this should already be solved by #12753 - will test once runtime 9330 is active on rococo.
2. CodecError
The other issue is codec::Error. These errors arise because the API of the onchain runtime version may not match the API the client was built with. As such, the failures we've seen changed depending on which runtime version was current (unless historic chain state is queried).
For example, on a v0.9.31 or v0.9.32 release node, mmr_generateProof works for blocks with onchain runtime version 9321 or 9310, that is from block 2744973 onwards, but fails with the codec error for earlier blocks. In this case, we changed the API for runtime 9310 to use block numbers in lieu of leaf indices with PR #12345, so moving from u64 → u32 leads to the codec error thrown at
substrate/frame/merkle-mountain-range/rpc/src/lib.rs
Lines 188 to 195 in b0e994c
let (leaf, proof) = api
    .generate_proof_with_context(
        &BlockId::hash(block_hash),
        sp_core::ExecutionContext::OffchainCall(None),
        block_number,
    )
    .map_err(runtime_error_into_rpc_error)?
    .map_err(mmr_error_into_rpc_error)?;
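To illustrate why a u64 → u32 change in an argument type can surface as a codec error, here is a minimal std-only sketch. It assumes fixed-width little-endian integer encoding (as SCALE uses for non-compact u32 and u64), with a newer client encoding a 4-byte block number while an older runtime tries to decode an 8-byte leaf index. The names `DecodeError`, `encode_block_number` and `decode_leaf_index` are hypothetical and are not part of Substrate or parity-scale-codec. The actual failure may sit elsewhere in the call, for instance when decoding the return value.

```rust
/// Hypothetical stand-in for a codec error; not the real `codec::Error`.
#[derive(Debug, PartialEq)]
pub enum DecodeError {
    NotEnoughData { needed: usize, got: usize },
}

/// What a newer client (assumed) sends: a block number as 4 little-endian bytes.
pub fn encode_block_number(block_number: u32) -> Vec<u8> {
    block_number.to_le_bytes().to_vec()
}

/// What an older runtime (assumed) expects: a leaf index as 8 little-endian bytes.
pub fn decode_leaf_index(input: &[u8]) -> Result<u64, DecodeError> {
    if input.len() < 8 {
        // The input ran out before the expected width was read.
        return Err(DecodeError::NotEnoughData { needed: 8, got: input.len() });
    }
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&input[..8]);
    Ok(u64::from_le_bytes(bytes))
}
```

Feeding `encode_block_number(3)` into `decode_leaf_index` fails because only 4 of the 8 expected bytes are present, whereas an 8-byte encoding of the same value decodes fine.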
Confirmed that this is the root of the codec error since changing back to leaf inputs in the proof generation API atop v0.9.31 resolves this:
Lederstrumpf@72cdcad
> curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[3, "0x0910d049debc60f0f2ee10b892b1bd5044efff9dbb3f6479c63baacb83f92f26"]}' $NODE_RUNNING_PATCHED_v0.9.3.1
{"jsonrpc":"2.0","result":{"blockHash":"0x0910...2f26","leaf":"0xc5010...0000","proof":"0x0300...f299"},"id":1}
Solution
Keeping the mmr API interface stable would avoid this.
If we do change the API interface, the issue will disappear once the runtime that matches the API server's changes becomes active. Historic queries will still fail, though, and I see the following options for handling them:
- version the API, and then either fail gracefully on a mismatch or retain older versions
- check against runtime version and fail gracefully if known API break
- ignore this since once the new runtime becomes active, it will only affect historic state queries
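As a rough sketch of the first two options, the RPC handler could compare the API version reported by the runtime at the queried block against the version the client was built for, and return a descriptive error instead of a codec failure. Everything here is hypothetical: `ProofRequestError`, `ensure_compatible` and the version numbers are illustrative and do not reflect the existing mmr RPC code.

```rust
/// Hypothetical error returned to the RPC caller instead of a raw codec error.
#[derive(Debug, PartialEq)]
pub enum ProofRequestError {
    /// The runtime at the queried block does not expose the API at all.
    ApiNotAvailable,
    /// The runtime exposes a different API version than the client supports.
    VersionMismatch { on_chain: u32, client: u32 },
}

/// Check the on-chain API version (assumed to be looked up for the queried
/// block, `None` if the API is absent) against the client's version.
pub fn ensure_compatible(
    on_chain_version: Option<u32>,
    client_version: u32,
) -> Result<(), ProofRequestError> {
    match on_chain_version {
        None => Err(ProofRequestError::ApiNotAvailable),
        Some(v) if v == client_version => Ok(()),
        Some(v) => Err(ProofRequestError::VersionMismatch {
            on_chain: v,
            client: client_version,
        }),
    }
}
```

Retaining older versions would instead mean dispatching to a legacy call path in the mismatch arm rather than returning an error.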