fix: verify combine_hash for non-empty trees without subquery in proofs #633
Conversation
Adds a new proof node type KVValueHashFeatureTypeWithChildHash that carries the child Merk root hash when a non-empty tree appears in the result set without a lower-layer subquery. The merk verifier checks combine_hash(H(value_bytes), child_hash) == value_hash, preventing metadata tampering (e.g. swapping a CountTree's count) that was previously undetectable in this scenario. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
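The invariant this PR adds can be sketched in isolation. Below is a minimal, dependency-free illustration, with `DefaultHasher` standing in for merk's real 32-byte blake3 hashes and `verify_child_hash` as a hypothetical helper name, not the actual API:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in digest type: the real merk hashes are 32-byte blake3 outputs;
// a u64 from DefaultHasher keeps this sketch dependency-free.
type Digest = u64;

// H(value_bytes): hash of the serialized element.
fn value_hash(bytes: &[u8]) -> Digest {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

// combine_hash(a, b): hash over two digests.
fn combine_hash(a: Digest, b: Digest) -> Digest {
    let mut h = DefaultHasher::new();
    a.hash(&mut h);
    b.hash(&mut h);
    h.finish()
}

// The invariant the verifier enforces for a
// KVValueHashFeatureTypeWithChildHash node.
fn verify_child_hash(value_bytes: &[u8], child_hash: Digest, node_value_hash: Digest) -> bool {
    combine_hash(value_hash(value_bytes), child_hash) == node_value_hash
}

fn main() {
    let value = b"count-tree element, count = 3";
    let child = value_hash(b"child merk root");
    let stored = combine_hash(value_hash(value), child);
    assert!(verify_child_hash(value, child, stored));
    // Tampering with the serialized value (e.g. count 3 -> 999) breaks the check.
    assert!(!verify_child_hash(b"count-tree element, count = 999", child, stored));
    println!("ok");
}
```

Because the stored `value_hash` commits to both the element bytes and the child root, swapping the element's metadata without also forging the child root no longer verifies.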
📝 Walkthrough

Adds a new Node variant KVValueHashFeatureTypeWithChildHash and propagates it through proof generation, serialization/deserialization, Merkle/tree hashing, verification, debugger mapping, mapbuilder/branch logic, and extensive tests, so proofs carry and validate an additional child_hash alongside value hashes.
Sequence Diagram(s)

sequenceDiagram
participant Generator as ProofGenerator
participant Encoder as Encoder (encoding.rs)
participant Transport as SerializedProof
participant Decoder as Decoder (encoding.rs)
participant Verifier as Verifier (merk/.../verify.rs)
participant Storage as Tree/Child
Generator->>Generator: create Node::KVValueHashFeatureTypeWithChildHash(key,value,value_hash,feature_type,child_hash)
Generator->>Encoder: serialize node (choose 0x1c/0x1d/0x2e/0x2f by length)
Encoder->>Transport: emit bytes (header + payload)
Transport->>Decoder: deliver serialized proof bytes
Decoder->>Verifier: decode node (key,value,value_hash,feature_type,child_hash)
Verifier->>Verifier: element_value_hash = value_hash(value)
Verifier->>Verifier: computed_value_hash = combine_hash(element_value_hash, child_hash)
Verifier->>Storage: compare computed_value_hash with node_value_hash -> accept/reject
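The "choose 0x1c/0x1d/0x2e/0x2f by length" step in the diagram reduces to a small table. A sketch, where `opcode_for` is an illustrative helper rather than the encoder's actual API, and the 65536-byte threshold matches the large-value boundary discussed in the review comments:

```rust
// Opcode choice for a KVValueHashFeatureTypeWithChildHash node:
// small values (< 65536 bytes) use a u16 length with opcodes 0x1c/0x1d,
// large values use a u32 length with opcodes 0x2e/0x2f.
fn opcode_for(value_len: usize, inverted: bool) -> u8 {
    match (value_len < 65536, inverted) {
        (true, false) => 0x1c,  // Op::Push, small value
        (true, true) => 0x1d,   // Op::PushInverted, small value
        (false, false) => 0x2e, // Op::Push, large value
        (false, true) => 0x2f,  // Op::PushInverted, large value
    }
}

fn main() {
    assert_eq!(opcode_for(100, false), 0x1c);
    assert_eq!(opcode_for(100, true), 0x1d);
    assert_eq!(opcode_for(65536, false), 0x2e);
    assert_eq!(opcode_for(65536, true), 0x2f);
    println!("ok");
}
```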
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 3 checks passed
Actionable comments posted: 6
🧹 Nitpick comments (2)
merk/src/proofs/branch/mod.rs (1)
111-125: Consider centralizing keyed-node extraction.

This exact match table now exists in both branch-query helpers and in `merk/benches/branch_queries.rs`. The new variant already had to be threaded through all of them, while `merk/src/proofs/tree.rs` still has other coverage drift. A shared `Node::key()`-style helper would make future enum additions much harder to miss.

Also applies to: 374-388
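A minimal sketch of what such a shared accessor could look like. The variant shapes here are hypothetical and simplified (e.g. `feature_type` reduced to a `u8` tag); the real `Node` enum lives in `grovedb-query/src/proofs/mod.rs` and has more variants:

```rust
type Hash = [u8; 32];

// Simplified, illustrative Node shapes.
enum Node {
    Hash(Hash),
    KV(Vec<u8>, Vec<u8>),
    KVValueHashFeatureTypeWithChildHash(Vec<u8>, Vec<u8>, Hash, u8, Hash),
}

impl Node {
    // One place to extract the key: a new variant then needs exactly one
    // update here instead of touching every duplicated match table.
    fn key(&self) -> Option<&[u8]> {
        match self {
            Node::Hash(_) => None,
            Node::KV(key, _)
            | Node::KVValueHashFeatureTypeWithChildHash(key, _, _, _, _) => {
                Some(key.as_slice())
            }
        }
    }
}

fn main() {
    assert_eq!(Node::KV(vec![1, 2], vec![3]).key(), Some(&[1u8, 2][..]));
    assert_eq!(
        Node::KVValueHashFeatureTypeWithChildHash(vec![7], vec![], [0; 32], 0, [1; 32]).key(),
        Some(&[7u8][..])
    );
    assert_eq!(Node::Hash([0; 32]).key(), None);
    println!("ok");
}
```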
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@merk/src/proofs/branch/mod.rs` around lines 111 - 125, The duplicated match logic for extracting a key from Node (currently implemented as get_key_from_node) should be centralized by adding a single helper (e.g., impl Node::key(&self) -> Option<Vec<u8>> or a free function Node_key) and replacing all local match tables (including get_key_from_node, branch-query helpers, merk/benches/branch_queries.rs and merk/src/proofs/tree.rs) to call that helper; update every call site to use the new Node::key() (or Node_key) accessor so new Node variants only need one update.

grovedb-query/src/proofs/encoding.rs (1)
153-177: Add direct round-trip tests for the new opcodes.

This file now owns four new wire encodings, but the local unit tests still stop at `KVValueHashFeatureType`. A small+large encode/decode round-trip for `0x1c`/`0x1d`/`0x2e`/`0x2f` would catch opcode or `encoding_length()` drift much earlier than the higher-level proof tests.

Also applies to: 312-337, 700-749, 965-1020
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@grovedb-query/src/proofs/encoding.rs` around lines 153 - 177, Add direct round-trip unit tests for the new opcodes produced by the Op::Push Node::KVValueHashFeatureTypeWithChildHash encoding (opcodes 0x1c/0x1d/0x2e/0x2f): create small and large test vectors (value lengths below and above the 65536 threshold) that encode with the existing encoding function, check encoding_length() matches the produced buffer length, then decode and assert the decoded Op/Node equals the original; add separate cases to cover both the 16-bit-length and 32-bit-length branches and include assertions that the correct leading opcode byte (0x1c/0x1d for small, 0x2e/0x2f for large) appears in the encoded buffer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@grovedb/src/operations/proof/generate.rs`:
- Around line 508-518: The fallback arm currently computes only
value_hash(&value_owned) but the verifier expects the combined hash
combine_hash(H(value), child_root_hash); add combine_hash to the imports and in
the fallback (_) arm of the match that sets (vh, ft) compute let vh =
value_hash(&value_owned).unwrap_add_cost(&mut cost); let vh = combine_hash(vh,
child_root_hash).unwrap_add_cost(&mut cost); (keeping ft as
TreeFeatureType::BasicMerkNode); make the identical change in the V1 fallback
match (lines noted) so Node::KV and Node::KVCount cases produce the combined
hash the verifier expects.
In `@grovedb/src/operations/proof/verify.rs`:
- Around line 2276-2282: The match arm that treats
Node::KVValueHashFeatureTypeWithChildHash as safe to extract must be hardened:
in verify_trunk_chunk_proof_v0 and verify_branch_chunk_proof, before calling
execute()/Tree::hash() or deserializing the element, explicitly validate that
combine_hash(H(value), child_hash) == value_hash for nodes of type
KVValueHashFeatureTypeWithChildHash (i.e., check the provided child_hash against
the stored value_hash), and reject the proof if the check fails; update the code
paths that currently rely solely on execute()/Tree::hash() to perform this
additional integrity check so deserialization of tampered values is prevented.
In `@grovedb/src/tests/proof_coverage_tests.rs`:
- Around line 6420-6482: The tamper_kvvaluehash_feature_type_with_child_hash
helper is brittle: it assumes a single small-tag form (tag = 0x1c) and reads
merk_proof[i + 1] without checking bounds. Update
tamper_kvvaluehash_feature_type_with_child_hash to first guard buffer bounds
before any index access (check i+1, vl_start+2, etc.), and broaden parsing to
handle all supported encodings of a KVValueHashFeatureTypeWithChildHash node
instead of only the small 0x1c variant—either by using the existing
merk-proof/node decoder to iterate decoded nodes and mutate the target node, or
by implementing parsing branches for both small and large tag encodings and
their respective key/length formats; preserve value_hash and child_hash when
replacing the value and update the stored value length correctly for the chosen
encoding.
In `@grovedb/src/tests/provable_count_sum_tree_tests.rs`:
- Around line 57-60: The test added handling for
Node::KVValueHashFeatureTypeWithChildHash when extracting the key but
get_node_count still ignores that variant, so proof nodes with that shape are
skipped; update the get_node_count function to include a match arm for
Node::KVValueHashFeatureTypeWithChildHash (mirroring the behavior used for
KVValueHashFeatureType and KVCount) returning the appropriate count/Some(_)
value so collect_tree_node_counts() will process these nodes and the count
assertions exercise the new proof shape; locate get_node_count and add the
corresponding pattern to its match on Node.
In `@merk/src/proofs/query/verify.rs`:
- Around line 416-421: Replace the current map_err that converts
ElementType::from_serialized_value failures into Error::InvalidProofError with
one that wraps the parse error as Error::CorruptedData; specifically update the
closure on ElementType::from_serialized_value in the
KVValueHashFeatureTypeWithChildHash decoding path to use .map_err(|e|
Error::CorruptedData(format!("cannot determine element type in
KVValueHashFeatureTypeWithChildHash node: {}", e))) so the original parse error
is preserved with the corrupted-data context.
In `@merk/src/proofs/tree.rs`:
- Around line 388-389: The new Node variant KVValueHashFeatureTypeWithChildHash
must be handled everywhere the KVDigest/other KV value variants are matched:
update Tree::execute() so Op::Push and Op::PushInverted ordering checks include
Node::KVValueHashFeatureTypeWithChildHash alongside Node::KVDigest and the other
KV cases; update the minimal-only helpers Child::as_link() and
Tree::aggregate_data() to recognize KVValueHashFeatureTypeWithChildHash and
return the same link/aggregate metadata behavior as the existing KV variants so
monotonic-key validation and aggregate metadata are not skipped during
minimal-mode reconstruction; ensure matches and any fallthrough branches
treating KV nodes are extended to include KVValueHashFeatureTypeWithChildHash.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 65d26ea8-e396-468a-88df-58bbde35f1e8
📒 Files selected for processing (12)
- grovedb-query/src/proofs/encoding.rs
- grovedb-query/src/proofs/mod.rs
- grovedb/src/operations/proof/generate.rs
- grovedb/src/operations/proof/mod.rs
- grovedb/src/operations/proof/verify.rs
- grovedb/src/tests/proof_coverage_tests.rs
- grovedb/src/tests/provable_count_sum_tree_tests.rs
- merk/benches/branch_queries.rs
- merk/src/merk/chunks.rs
- merk/src/proofs/branch/mod.rs
- merk/src/proofs/query/verify.rs
- merk/src/proofs/tree.rs
- Fix CI: add new variant to debugger.rs merk_proof_node_to_grovedbg - Add to execute() key ordering checks (Op::Push/PushInverted) - Add to Child::as_link() for aggregate data extraction - Add to Tree::aggregate_data() for feature type extraction - Add to get_node_count in provable_count_sum_tree_tests - Add to count extraction in verify.rs leaf key collection Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #633 +/- ##
===========================================
+ Coverage 90.70% 90.71% +0.01%
===========================================
Files 182 182
Lines 51139 51813 +674
===========================================
+ Hits 46387 47004 +617
- Misses 4752 4809 +57
🧹 Nitpick comments (1)
grovedb/src/debugger.rs (1)
522-551: Consider extracting the duplicated feature_type mapping into a helper function.

The `feature_type` to `grovedbg_types::TreeFeatureType` conversion logic at lines 524-543 is nearly identical to lines 460-479. Extracting this into a helper function would reduce duplication and improve maintainability.

♻️ Proposed refactor to extract helper function
```diff
+fn tree_feature_type_to_grovedbg(feature_type: TreeFeatureType) -> grovedbg_types::TreeFeatureType {
+    match feature_type {
+        TreeFeatureType::BasicMerkNode => grovedbg_types::TreeFeatureType::BasicMerkNode,
+        TreeFeatureType::SummedMerkNode(sum) => grovedbg_types::TreeFeatureType::SummedMerkNode(sum),
+        TreeFeatureType::BigSummedMerkNode(sum) => grovedbg_types::TreeFeatureType::BigSummedMerkNode(sum),
+        TreeFeatureType::CountedMerkNode(count) => grovedbg_types::TreeFeatureType::CountedMerkNode(count),
+        TreeFeatureType::CountedSummedMerkNode(count, sum) => grovedbg_types::TreeFeatureType::CountedSummedMerkNode(count, sum),
+        TreeFeatureType::ProvableCountedMerkNode(count) => grovedbg_types::TreeFeatureType::ProvableCountedMerkNode(count),
+        TreeFeatureType::ProvableCountedSummedMerkNode(count, sum) => grovedbg_types::TreeFeatureType::ProvableCountedSummedMerkNode(count, sum),
+    }
+}

 fn merk_proof_node_to_grovedbg(node: Node) -> Result<MerkProofNode, crate::Error> {
     Ok(match node {
         // ... other cases ...
         Node::KVValueHashFeatureType(key, value, hash, feature_type) => {
             let element = crate::Element::deserialize(&value, GroveVersion::latest())?;
-            let node_feature_type = match feature_type {
-                // ... 7 arms ...
-            };
+            let node_feature_type = tree_feature_type_to_grovedbg(feature_type);
             MerkProofNode::KVValueHashFeatureType(key, element_to_grovedbg(element), hash, node_feature_type)
         }
         Node::KVValueHashFeatureTypeWithChildHash(key, value, hash, feature_type, _child_hash) => {
             let element = crate::Element::deserialize(&value, GroveVersion::latest())?;
-            let node_feature_type = match feature_type {
-                // ... 7 arms ...
-            };
+            let node_feature_type = tree_feature_type_to_grovedbg(feature_type);
             MerkProofNode::KVValueHashFeatureType(key, element_to_grovedbg(element), hash, node_feature_type)
         }
     })
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@grovedb/src/debugger.rs` around lines 522 - 551, Extract the duplicated match that converts TreeFeatureType -> grovedbg_types::TreeFeatureType into a small helper function (e.g., fn convert_tree_feature_type(ft: TreeFeatureType) -> grovedbg_types::TreeFeatureType) and use it in both places (the match inside the Node::KVValueHashFeatureTypeWithChildHash arm and the earlier identical match). Replace the inline matches with a call to convert_tree_feature_type(...) so the MerkProofNode::KVValueHashFeatureType construction still passes the converted node_feature_type to element_to_grovedbg(element) and preserves existing variable names (feature_type, node_feature_type, TreeFeatureType, grovedbg_types::TreeFeatureType).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 448386a3-271b-4025-8349-6f5136745ea2
📒 Files selected for processing (4)
- grovedb/src/debugger.rs
- grovedb/src/operations/proof/verify.rs
- grovedb/src/tests/provable_count_sum_tree_tests.rs
- merk/src/proofs/tree.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- grovedb/src/tests/provable_count_sum_tree_tests.rs
… (audit) Addresses findings from security and quality audits: - MapBuilder::insert in map.rs (deprecated API, but correctness matters) - get_node_count in grove_trunk_query_result.rs - get_node_count in grove_branch_query_result.rs - get_node_count in benches/branch_chunk_queries.rs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@grovedb/benches/branch_chunk_queries.rs`:
- Around line 180-186: The benchmark helper is not extracting counts for nodes
that use the new KVValueHashFeatureTypeWithChildHash path and
get_tree_root_count still only checks the old variants; update the pattern
matches in the helper (the match arms handling Node::KVValueHashFeatureType and
Node::KVValueHashFeatureTypeWithChildHash) to also return counts for
TreeFeatureType::CountedMerkNode and TreeFeatureType::CountedSummedMerkNode (and
their provable variants if present), and modify get_tree_root_count() to mirror
the same variant/feature-type handling so counted-tree nodes produce correct
root_count and ancestor selection diagnostics.
In `@grovedb/src/query/grove_branch_query_result.rs`:
- Around line 123-129: The match over Node::KVValueHashFeatureType /
Node::KVValueHashFeatureTypeWithChildHash is currently extracting only
ProvableCountedMerkNode and ProvableCountedSummedMerkNode manually; change it to
call TreeFeatureType::count() on feature_type and return that Option<u64> (i.e.,
if feature_type.count() is Some(count) return Some(count) else None) so
get_ancestor() correctly recognizes all counted feature variants. Locate the
match in the function handling these Node variants (the block matching
Node::KVValueHashFeatureType and Node::KVValueHashFeatureTypeWithChildHash) and
replace the hard-coded variant checks with a call to feature_type.count().
In `@grovedb/src/query/grove_trunk_query_result.rs`:
- Around line 137-143: Replace the hard-coded match on TreeFeatureType that only
handles ProvableCountedMerkNode and ProvableCountedSummedMerkNode with a call to
TreeFeatureType::count() so all counted node variants are handled; specifically,
in the match arm for Node::KVValueHashFeatureType(...) and
Node::KVValueHashFeatureTypeWithChildHash(...), return feature_type.count() (or
feature_type.count().copied()/map(|c| *c) as appropriate) instead of manually
matching only the Provable* variants—this will fix undercounting in
get_ancestor() and related privacy-selection logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 553dae32-cd4b-4371-a00e-ea1ef764c03d
📒 Files selected for processing (4)
- grovedb/benches/branch_chunk_queries.rs
- grovedb/src/query/grove_branch_query_result.rs
- grovedb/src/query/grove_trunk_query_result.rs
- merk/src/proofs/query/map.rs
…queries Adds `child_hash_verified: bool` to `ProvedKeyOptionalValue` to propagate whether the merk verifier confirmed combine_hash(H(value), child_hash) == value_hash via a KVValueHashFeatureTypeWithChildHash node. At the GroveDB level, non-empty Merk trees without subqueries (no lower layer proof) must have child_hash_verified == true, preventing a MITM downgrade attack where an attacker replaces the WithChildHash opcode with a plain KVValueHashFeatureType to bypass child hash verification. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
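The downgrade check described in this commit can be sketched as follows. The struct and field names follow the commit message; the surrounding types and the `check_no_downgrade` helper are simplified stand-ins, not the actual GroveDB API:

```rust
// Simplified stand-in for the struct named in the commit message.
#[allow(dead_code)]
struct ProvedKeyOptionalValue {
    key: Vec<u8>,
    value: Option<Vec<u8>>,
    child_hash_verified: bool,
}

// GroveDB-level rule: a non-empty Merk tree with no lower-layer proof must
// have had its child hash verified at the merk layer; otherwise the proof
// node was downgraded (e.g. WithChildHash swapped for a plain node) and
// the proof must be rejected.
fn check_no_downgrade(
    proved: &ProvedKeyOptionalValue,
    is_nonempty_tree: bool,
    has_lower_layer_proof: bool,
) -> Result<(), String> {
    if is_nonempty_tree && !has_lower_layer_proof && !proved.child_hash_verified {
        return Err(format!(
            "child hash not verified for non-empty tree key {:?}",
            proved.key
        ));
    }
    Ok(())
}

fn main() {
    let verified = ProvedKeyOptionalValue {
        key: vec![1],
        value: Some(vec![2]),
        child_hash_verified: true,
    };
    assert!(check_no_downgrade(&verified, true, false).is_ok());

    // A MITM that strips the WithChildHash node leaves the flag false: rejected.
    let downgraded = ProvedKeyOptionalValue {
        child_hash_verified: false,
        ..verified
    };
    assert!(check_no_downgrade(&downgraded, true, false).is_err());
    println!("ok");
}
```

The key design point is that the merk verifier only records whether it performed the combine_hash check; the layer above decides when that check was mandatory.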
- Encoding/decoding round-trip tests for all four opcodes (0x1c, 0x1d, 0x2e, 0x2f) covering Push/PushInverted × small/large values - Integration tests for non-empty Merk trees without subqueries: Tree, SumTree, CountTree, mixed types, and V0 path - Display impl test for the new Node variant - MapBuilder insertion tests with the new node type - Branch proof key extraction tests (trunk + branch) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
grovedb/src/operations/proof/util.rs (1)
254-269: ⚠️ Potential issue | 🟠 Major

Propagate `child_hash_verified` through the path-wrapped result too.

These tests now populate `child_hash_verified` on `ProvedKeyOptionalValue`, but the converted `ProvedPathKeyOptionalValue` still has no place to store it, so `from_proved_key_value` drops the verification state entirely. That undercuts the downgrade protection this PR is adding once callers switch to the path-wrapped form. Please add `child_hash_verified` to `ProvedPathKeyOptionalValue`, copy it in the conversion helpers, and assert it here so this path cannot silently regress again.

Also applies to: 276-344
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@grovedb/src/operations/proof/util.rs` around lines 254 - 269, ProvedPathKeyOptionalValue currently discards the child_hash_verified flag from ProvedKeyOptionalValue; add a child_hash_verified: bool field to the ProvedPathKeyOptionalValue struct, update the conversion helper ProvedPathKeyOptionalValue::from_proved_key_value (and the other similar from_proved_* helpers referenced in the same region) to copy proved.child_hash_verified into the new field, and update tests (the assert constructing ProvedPathKeyOptionalValue and any other assertions in lines ~276-344) to include and check child_hash_verified so the verification state is preserved through the path-wrapped type.
♻️ Duplicate comments (1)
merk/src/proofs/query/verify.rs (1)
419-424: ⚠️ Potential issue | 🟡 Minor

Preserve corrupted-data context for malformed node values.

This decode path still flattens `ElementType::from_serialized_value` failures into `InvalidProofError`, so malformed proof bytes lose the more specific corruption context.

Suggested fix

```diff
-        let element_type = ElementType::from_serialized_value(value).map_err(|e| {
-            Error::InvalidProofError(format!(
-                "cannot determine element type in \
-                 KVValueHashFeatureTypeWithChildHash node: {e}"
-            ))
-        })?;
+        let element_type = ElementType::from_serialized_value(value).map_err(|e| {
+            Error::CorruptedData(format!(
+                "cannot determine element type in \
+                 KVValueHashFeatureTypeWithChildHash node: {e}"
+            ))
+        })?;
```

As per coding guidelines: wrap errors with context using the `.map_err(|e| Error::CorruptedData(format!("context: {}", e)))` pattern in Rust source files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@merk/src/proofs/query/verify.rs` around lines 419 - 424, The current map_err on ElementType::from_serialized_value in the KVValueHashFeatureTypeWithChildHash branch converts all failures into Error::InvalidProofError and loses corruption context; change the map_err to wrap the original error into Error::CorruptedData with a descriptive message (e.g., format!("cannot determine element type in KVValueHashFeatureTypeWithChildHash node: {}", e)) so the underlying parse error is preserved instead of being flattened to InvalidProofError; update the error variant returned from the closure accordingly so callers receive Error::CorruptedData for malformed node values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@merk/src/proofs/query/verify.rs`:
- Around line 436-438: The verifier drops OperationCost by unwrapping results
from value_hash(value) and combine_hash(...) inside
KVValueHashFeatureTypeWithChildHash handling; replace the unwraps with
cost_return_on_error! so that both value_hash and combine_hash calls use
cost_return_on_error!(...) to propagate errors while accumulating their
OperationCost into the verifier's cost tracking (i.e., call value_hash and
combine_hash via cost_return_on_error! and bind the successful results to
element_value_hash and computed_value_hash respectively inside the verify.rs
verification path).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d3a167da-98b5-4c20-9679-826cecc5ec13
📒 Files selected for processing (4)
- grovedb-element/src/element/helpers.rs
- grovedb/src/operations/proof/util.rs
- grovedb/src/operations/proof/verify.rs
- merk/src/proofs/query/verify.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- grovedb/src/operations/proof/verify.rs
♻️ Duplicate comments (1)
grovedb/src/tests/proof_coverage_tests.rs (1)
6424-6482: ⚠️ Potential issue | 🟡 Minor

Make the tamper helper robust to all supported encodings.

Line 6437 still reads `merk_proof[i + 1]` before any bounds check, and the helper only looks for the small `0x1c` form. That makes this regression test brittle: a truncated buffer can panic, and valid proofs using the inverted/large encodings will be missed entirely.
Verify each finding against the current code and only fix it if needed. In `@grovedb/src/tests/proof_coverage_tests.rs` around lines 6424 - 6482, The helper tamper_kvvaluehash_feature_type_with_child_hash reads merk_proof[i+1] without bounds checking and only handles the small 0x1c encoding, making it panic on truncated buffers and miss inverted/large encodings; fix it by first ensuring i+1 < merk_proof.len() before any indexing, then implement detection and parsing for all supported encodings for KVValueHashFeatureTypeWithChildHash (small, inverted, large) by reading the correct key length fields only after validating sufficient bytes remain, perform all subsequent reads (value_len u16, value bytes, value_hash, feature_type, child_hash) with bounds checks before access, and on any malformed/insufficient-length case advance i safely (e.g., i += 1) and continue; update references in tamper_kvvaluehash_feature_type_with_child_hash to use these guarded reads and to handle the alternate encodings when matching target_key and replacing the value.
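The bounds-guarded scanning style the prompt asks for can be sketched like this. It is a toy scanner over a hypothetical small-tag layout `[tag, key_len, key…]`; the real proof encoding has more fields (value length, value, value_hash, feature_type, child_hash), each of which would need the same guard-before-read treatment:

```rust
// Find candidate node offsets for a given small-form tag without ever
// indexing past the buffer, advancing safely on malformed input.
fn find_small_tag_nodes(buf: &[u8], tag: u8) -> Vec<usize> {
    let mut found = Vec::new();
    let mut i = 0;
    while i < buf.len() {
        if buf[i] == tag {
            // Guard before reading the key length at i + 1.
            let Some(&key_len) = buf.get(i + 1) else { break };
            let end = i + 2 + key_len as usize;
            if end > buf.len() {
                i += 1; // malformed or truncated: advance safely, keep scanning
                continue;
            }
            found.push(i);
            i = end;
        } else {
            i += 1;
        }
    }
    found
}

fn main() {
    let buf = [0x1c, 2, 7, 8, 0x1c, 1, 9];
    assert_eq!(find_small_tag_nodes(&buf, 0x1c), vec![0, 4]);
    // Truncated buffer: no panic, just no match past the guard.
    assert!(find_small_tag_nodes(&[0x1c], 0x1c).is_empty());
    println!("ok");
}
```

Handling the inverted and large encodings would add parallel branches keyed on the other tags, each validating its wider length fields before any read.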
🧹 Nitpick comments (2)
grovedb/src/tests/proof_coverage_tests.rs (1)
6606-6921: Assert the proof actually contains `KVValueHashFeatureTypeWithChildHash`.

These happy-path tests currently only prove that generation+verification succeeds. A downgrade back to plain `KVValueHashFeatureType` could still pass here, which is exactly the behavior this PR is trying to lock down. Decoding the inner merk proof and asserting the new node variant is present in each tree-type case would make these regressions much harder to miss.

Also applies to: 6924-7031
grovedb-query/src/proofs/encoding.rs (1)
2092-2145: Consider adding test coverage for large value variants (0x2e/0x2f).

The roundtrip tests thoroughly cover the small value paths (opcodes 0x1c/0x1d), but the large value variants (value.len() >= 65536) using opcodes 0x2e/0x2f are not exercised. Adding a test with a large value would ensure the u32 length encoding and `MAX_VALUE_LEN` validation paths work correctly.

💡 Suggested test addition

```rust
#[test]
fn encode_decode_roundtrip_kvvalue_hash_feature_type_with_child_hash_large_value() {
    // Test large value variant (>= 65536 bytes) using opcodes 0x2e/0x2f
    let large_value = vec![0u8; 65536];
    let op = Op::Push(Node::KVValueHashFeatureTypeWithChildHash(
        vec![1, 2, 3],
        large_value.clone(),
        [0; 32],
        BasicMerkNode,
        [42; 32],
    ));
    let mut bytes = vec![];
    op.encode_into(&mut bytes).unwrap();
    assert_eq!(bytes[0], 0x2e); // Verify large variant opcode
    let decoded = Op::decode(&bytes[..]).expect("decode failed");
    assert_eq!(decoded, op);

    let op = Op::PushInverted(Node::KVValueHashFeatureTypeWithChildHash(
        vec![1, 2, 3],
        large_value,
        [0; 32],
        BasicMerkNode,
        [42; 32],
    ));
    let mut bytes = vec![];
    op.encode_into(&mut bytes).unwrap();
    assert_eq!(bytes[0], 0x2f); // Verify large variant opcode
    let decoded = Op::decode(&bytes[..]).expect("decode failed");
    assert_eq!(decoded, op);
}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@grovedb-query/src/proofs/encoding.rs` around lines 2092 - 2145, Add a new roundtrip test that constructs KVValueHashFeatureTypeWithChildHash nodes with a large value (>= 65536 bytes) to exercise the 0x2e/0x2f encoding paths: create large_value = vec![0u8; 65536], build Op::Push and Op::PushInverted instances using Node::KVValueHashFeatureTypeWithChildHash and BasicMerkNode (and optionally SummedMerkNode), call .encode_into(&mut bytes).unwrap(), assert the first byte equals 0x2e for Push and 0x2f for PushInverted, then decode with Op::decode(&bytes[..]) and assert equality with the original op; this verifies u32 length encoding and MAX_VALUE_LEN handling in encode_into/decode for large values.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 23e57e9a-5a7a-4b0b-a535-0fc0cfb49662
📒 Files selected for processing (5)
- grovedb-query/src/proofs/encoding.rs
- grovedb-query/src/proofs/mod.rs
- grovedb/src/tests/proof_coverage_tests.rs
- merk/src/proofs/branch/tests.rs
- merk/src/proofs/query/map.rs
- generate.rs: use combine_hash in fallback case so the value_hash stored in KVValueHashFeatureTypeWithChildHash is the combined hash (defensive — this fallback is unreachable for tree elements in practice since they always produce KVValueHash or KVValueHashFeatureType nodes) - verify.rs: add combine_hash verification for KVValueHashFeatureTypeWithChildHash nodes in trunk/branch chunk proof extraction to prevent value tampering during state sync - benches: add KVValueHashFeatureTypeWithChildHash to get_tree_root_count match arm Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add 4 large-value encoding/decoding tests for KVValueHashFeatureTypeWithChildHash (opcodes 0x2e/0x2f) and 3 error path tests that exercise proof tampering scenarios: downgraded proof node rejection, combine_hash mismatch detection, and item element rejection when upgraded to KVValueHashFeatureTypeWithChildHash. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- Adds a new proof node type `KVValueHashFeatureTypeWithChildHash` that carries the child Merk root hash when a non-empty tree element (CountTree, SumTree, etc.) appears in the proof result set without a lower-layer subquery
- The verifier checks `combine_hash(H(value_bytes), child_hash) == value_hash`, preventing metadata tampering (e.g. swapping a CountTree's count from 3 to 999)
- Previously these nodes used `KVValueHashFeatureType`, which only verified the combined value_hash against the Merk tree — an attacker could craft a fake element with modified metadata (like count) and a matching child_hash that would produce the same combined hash

How it works

- Proof generation upgrades `KVValueHashFeatureType` to `KVValueHashFeatureTypeWithChildHash(key, value, value_hash, feature_type, child_root_hash)` for qualifying nodes
- The verifier computes `combine_hash(H(value_bytes), child_hash)` and verifies it equals the node's `value_hash`. Rejects the proof if they don't match.
- The node hashes identically to `KVValueHashFeatureType` — the child_hash is verification metadata, not part of the tree hash

Files changed

- `grovedb-query/src/proofs/mod.rs`: new `KVValueHashFeatureTypeWithChildHash` Node variant
- `grovedb-query/src/proofs/encoding.rs`
- `merk/src/proofs/tree.rs`
- `merk/src/proofs/query/verify.rs`
- `grovedb/src/operations/proof/generate.rs`
- `grovedb/src/operations/proof/verify.rs`
- `grovedb/src/operations/proof/mod.rs`

Test plan

- `non_empty_count_tree_count_swap_is_detected`: creates a CountTree with 3 items, proves without subquery, verifies the valid proof works, then tampers the count to 999 and asserts the proof is rejected with "value/child hash mismatch"
- `cargo clippy -- -D warnings` passes clean

🤖 Generated with Claude Code