Skip to content

fix: verify combine_hash for non-empty trees without subquery in proofs#633

Merged
QuantumExplorer merged 7 commits into
developfrom
fix/C1-non-empty-tree-child-hash-verification
Mar 9, 2026
Merged

fix: verify combine_hash for non-empty trees without subquery in proofs#633
QuantumExplorer merged 7 commits into
developfrom
fix/C1-non-empty-tree-child-hash-verification

Conversation

@QuantumExplorer
Copy link
Copy Markdown
Member

@QuantumExplorer QuantumExplorer commented Mar 9, 2026

Summary

  • Adds new proof node type KVValueHashFeatureTypeWithChildHash that carries the child Merk root hash when a non-empty tree element (CountTree, SumTree, etc.) appears in the proof result set without a lower-layer subquery
  • The prover opens the child Merk to read its root hash and includes it in the proof node
  • The merk verifier checks combine_hash(H(value_bytes), child_hash) == value_hash, preventing metadata tampering (e.g. swapping a CountTree's count from 3 to 999)
  • Previously, non-empty trees without subqueries used KVValueHashFeatureType which only verified the combined value_hash against the Merk tree — an attacker could craft a fake element with modified metadata (like count) and a matching child_hash that would produce the same combined hash

How it works

  1. Prover side (generate.rs): When encountering a non-empty tree element without a subquery, opens the child Merk, reads its root hash, and transforms the proof node from KVValueHashFeatureType to KVValueHashFeatureTypeWithChildHash(key, value, value_hash, feature_type, child_root_hash)
  2. Merk verifier (verify.rs): On encountering the new node type, deserializes the value to check it's a tree element (not an item), then computes combine_hash(H(value_bytes), child_hash) and verifies it equals the node's value_hash. Rejects the proof if they don't match.
  3. Hash computation (tree.rs): The new variant computes its Merk tree hash identically to KVValueHashFeatureType — the child_hash is verification metadata, not part of the tree hash

Files changed

File Change
grovedb-query/src/proofs/mod.rs New KVValueHashFeatureTypeWithChildHash Node variant
grovedb-query/src/proofs/encoding.rs Encode/decode with tags 0x1c/0x1d/0x2e/0x2f
merk/src/proofs/tree.rs Hash computation (same as KVValueHashFeatureType)
merk/src/proofs/query/verify.rs combine_hash verification logic
grovedb/src/operations/proof/generate.rs Prover: inject child root hash (V0 + V1)
grovedb/src/operations/proof/verify.rs Add to key/value extraction match
grovedb/src/operations/proof/mod.rs Display formatting
Other files Exhaustive pattern match updates

Test plan

  • New test non_empty_count_tree_count_swap_is_detected: creates CountTree with 3 items, proves without subquery, verifies valid proof works, then tampers count to 999 and asserts proof is rejected with "value/child hash mismatch"
  • All 2,524 existing tests pass (including 237+ proof tests)
  • cargo clippy -- -D warnings passes clean

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added a proof node variant that includes a child hash with end-to-end support (generation, serialization, verification, display, key/hash handling).
    • Added an Element helper to detect non-empty Merk-backed trees.
  • Tests

    • Expanded tests for encoding/decoding, display, roundtrip, verification and targeted tampering across tree/proof variants to ensure mismatches are detected.

Adds a new proof node type KVValueHashFeatureTypeWithChildHash that
carries the child Merk root hash when a non-empty tree appears in the
result set without a lower-layer subquery. The merk verifier checks
combine_hash(H(value_bytes), child_hash) == value_hash, preventing
metadata tampering (e.g. swapping a CountTree's count) that was
previously undetectable in this scenario.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 9, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a new Node variant KVValueHashFeatureTypeWithChildHash and propagates it through proof generation, serialization/deserialization, Merkle/tree hashing, verification, debugger mapping, mapbuilder/branch logic, and extensive tests so proofs carry and validate an additional child_hash alongside value hashes.

Changes

Cohort / File(s) Summary
Proof Data Model & Display
grovedb-query/src/proofs/mod.rs, grovedb/src/operations/proof/mod.rs
Add Node::KVValueHashFeatureTypeWithChildHash(...) and update Display formatting to include child_hash.
Encoding & Decoding
grovedb-query/src/proofs/encoding.rs
Encode/decode new variant with small/large headers (0x1c/0x1d/0x2e/0x2f), update length calculations and Variant mapping, include tests for small/large branches and roundtrips.
Proof Generation
grovedb/src/operations/proof/generate.rs
Inject child_hash for non-empty/non-subquery Merk trees in v0/v1 flows: read child root, compute combined value hash, construct new Node variant, and adjust limits/flags; distinguish empty vs non-empty tree arms.
Verification (grovedb)
grovedb/src/operations/proof/verify.rs, grovedb/src/query/grove_*_query_result.rs
Treat new variant like other KVValueHash types for count/key extraction and enforce child_hash presence/verification for non-empty Merk trees.
Merk Proof Verification & Tree Logic
merk/src/proofs/query/verify.rs, merk/src/proofs/tree.rs
Verify by computing element_value_hash(value) and combining with child_hash to compare node_value_hash; propagate child_hash_verified flag; include variant in Tree hashing/key accessors (child_hash excluded from merk hash).
Branching, MapBuilder & Key Extraction
merk/src/proofs/branch/mod.rs, merk/benches/branch_queries.rs, merk/src/proofs/query/map.rs
Recognize new variant in key extraction, terminal-key collection, MapBuilder insertion ordering/value handling; added related tests.
Tests & Tampering Helpers
grovedb/src/tests/proof_coverage_tests.rs, grovedb/src/tests/provable_count_sum_tree_tests.rs, grovedb/src/operations/proof/util.rs
Add tamper helper targeting KVValueHashFeatureTypeWithChildHash, tests detecting tampered CountTree/value swaps, and test initializations for child_hash_verified. Extensive new tests across V0/V1 and tree types.
Debugger, Chunks & Benches
grovedb/src/debugger.rs, merk/src/merk/chunks.rs, grovedb/benches/branch_chunk_queries.rs
Debugger mapping for variant (ignores child_hash in grovedbg mapping); test counters and bench node counting include the new variant.
Element Helper
grovedb-element/src/element/helpers.rs
Add pub fn is_non_empty_merk_tree(&self) -> bool to detect non-empty Merk-backed trees.
Miscellaneous small updates
grovedb-query/src/proofs/encoding.rs, grovedb/src/operations/proof/util.rs
Adjust length calculations for new encoded forms and update test data for new field.

Sequence Diagram(s)

sequenceDiagram
  participant Generator as ProofGenerator
  participant Encoder as Encoder (encoding.rs)
  participant Transport as SerializedProof
  participant Decoder as Decoder (encoding.rs)
  participant Verifier as Verifier (merk/.../verify.rs)
  participant Storage as Tree/Child

  Generator->>Generator: create Node::KVValueHashFeatureTypeWithChildHash(key,value,value_hash,feature_type,child_hash)
  Generator->>Encoder: serialize node (choose 0x1c/0x1d/0x2e/0x2f by length)
  Encoder->>Transport: emit bytes (header + payload)
  Transport->>Decoder: deliver serialized proof bytes
  Decoder->>Verifier: decode node (key,value,value_hash,feature_type,child_hash)
  Verifier->>Verifier: element_value_hash = value_hash(value)
  Verifier->>Verifier: computed_value_hash = combine_hash(element_value_hash, child_hash)
  Verifier->>Storage: compare computed_value_hash with node_value_hash -> accept/reject
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐰 I found a child hash tucked beneath the leaf,
I stitched it to the value, kept proof checks brief.
I hop, I encode, then verify with care,
If hashes pair together, the trail is fair. 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: verify combine_hash for non-empty trees without subquery in proofs' accurately and clearly describes the main change: adding verification of combine_hash (child_hash + value_hash) for non-empty tree nodes without subqueries.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/C1-non-empty-tree-child-hash-verification

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (2)
merk/src/proofs/branch/mod.rs (1)

111-125: Consider centralizing keyed-node extraction.

This exact match table now exists in both branch-query helpers and in merk/benches/branch_queries.rs. The new variant already had to be threaded through all of them, while merk/src/proofs/tree.rs still has other coverage drift. A shared Node::key()-style helper would make future enum additions much harder to miss.

Also applies to: 374-388

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@merk/src/proofs/branch/mod.rs` around lines 111 - 125, The duplicated match
logic for extracting a key from Node (currently implemented as
get_key_from_node) should be centralized by adding a single helper (e.g., impl
Node::key(&self) -> Option<Vec<u8>> or a free function Node_key) and replacing
all local match tables (including get_key_from_node, branch-query helpers,
merk/benches/branch_queries.rs and merk/src/proofs/tree.rs) to call that helper;
update every call site to use the new Node::key() (or Node_key) accessor so new
Node variants only need one update.
grovedb-query/src/proofs/encoding.rs (1)

153-177: Add direct round-trip tests for the new opcodes.

This file now owns four new wire encodings, but the local unit tests still stop at KVValueHashFeatureType. A small+large encode/decode round-trip for 0x1c/0x1d/0x2e/0x2f would catch opcode or encoding_length() drift much earlier than the higher-level proof tests.

Also applies to: 312-337, 700-749, 965-1020

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@grovedb-query/src/proofs/encoding.rs` around lines 153 - 177, Add direct
round-trip unit tests for the new opcodes produced by the Op::Push
Node::KVValueHashFeatureTypeWithChildHash encoding (opcodes
0x1c/0x1d/0x2e/0x2f): create small and large test vectors (value lengths below
and above the 65536 threshold) that encode with the existing encoding function,
check encoding_length() matches the produced buffer length, then decode and
assert the decoded Op/Node equals the original; add separate cases to cover both
the 16-bit-length and 32-bit-length branches and include assertions that the
correct leading opcode byte (0x1c/0x1d for small, 0x2e/0x2f for large) appears
in the encoded buffer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@grovedb/src/operations/proof/generate.rs`:
- Around line 508-518: The fallback arm currently computes only
value_hash(&value_owned) but the verifier expects the combined hash
combine_hash(H(value), child_root_hash); add combine_hash to the imports and in
the fallback (_) arm of the match that sets (vh, ft) compute let vh =
value_hash(&value_owned).unwrap_add_cost(&mut cost); let vh = combine_hash(vh,
child_root_hash).unwrap_add_cost(&mut cost); (keeping ft as
TreeFeatureType::BasicMerkNode); make the identical change in the V1 fallback
match (lines noted) so Node::KV and Node::KVCount cases produce the combined
hash the verifier expects.

In `@grovedb/src/operations/proof/verify.rs`:
- Around line 2276-2282: The match arm that treats
Node::KVValueHashFeatureTypeWithChildHash as safe to extract must be hardened:
in verify_trunk_chunk_proof_v0 and verify_branch_chunk_proof, before calling
execute()/Tree::hash() or deserializing the element, explicitly validate that
combine_hash(H(value), child_hash) == value_hash for nodes of type
KVValueHashFeatureTypeWithChildHash (i.e., check the provided child_hash against
the stored value_hash), and reject the proof if the check fails; update the code
paths that currently rely solely on execute()/Tree::hash() to perform this
additional integrity check so deserialization of tampered values is prevented.

In `@grovedb/src/tests/proof_coverage_tests.rs`:
- Around line 6420-6482: The tamper_kvvaluehash_feature_type_with_child_hash
helper is brittle: it assumes a single small-tag form (tag = 0x1c) and reads
merk_proof[i + 1] without checking bounds. Update
tamper_kvvaluehash_feature_type_with_child_hash to first guard buffer bounds
before any index access (check i+1, vl_start+2, etc.), and broaden parsing to
handle all supported encodings of a KVValueHashFeatureTypeWithChildHash node
instead of only the small 0x1c variant—either by using the existing
merk-proof/node decoder to iterate decoded nodes and mutate the target node, or
by implementing parsing branches for both small and large tag encodings and
their respective key/length formats; preserve value_hash and child_hash when
replacing the value and update the stored value length correctly for the chosen
encoding.

In `@grovedb/src/tests/provable_count_sum_tree_tests.rs`:
- Around line 57-60: The test added handling for
Node::KVValueHashFeatureTypeWithChildHash when extracting the key but
get_node_count still ignores that variant, so proof nodes with that shape are
skipped; update the get_node_count function to include a match arm for
Node::KVValueHashFeatureTypeWithChildHash (mirroring the behavior used for
KVValueHashFeatureType and KVCount) returning the appropriate count/Some(_)
value so collect_tree_node_counts() will process these nodes and the count
assertions exercise the new proof shape; locate get_node_count and add the
corresponding pattern to its match on Node.

In `@merk/src/proofs/query/verify.rs`:
- Around line 416-421: Replace the current map_err that converts
ElementType::from_serialized_value failures into Error::InvalidProofError with
one that wraps the parse error as Error::CorruptedData; specifically update the
closure on ElementType::from_serialized_value in the
KVValueHashFeatureTypeWithChildHash decoding path to use .map_err(|e|
Error::CorruptedData(format!("cannot determine element type in
KVValueHashFeatureTypeWithChildHash node: {}", e))) so the original parse error
is preserved with the corrupted-data context.

In `@merk/src/proofs/tree.rs`:
- Around line 388-389: The new Node variant KVValueHashFeatureTypeWithChildHash
must be handled everywhere the KVDigest/other KV value variants are matched:
update Tree::execute() so Op::Push and Op::PushInverted ordering checks include
Node::KVValueHashFeatureTypeWithChildHash alongside Node::KVDigest and the other
KV cases; update the minimal-only helpers Child::as_link() and
Tree::aggregate_data() to recognize KVValueHashFeatureTypeWithChildHash and
return the same link/aggregate metadata behavior as the existing KV variants so
monotonic-key validation and aggregate metadata are not skipped during
minimal-mode reconstruction; ensure matches and any fallthrough branches
treating KV nodes are extended to include KVValueHashFeatureTypeWithChildHash.

---

Nitpick comments:
In `@grovedb-query/src/proofs/encoding.rs`:
- Around line 153-177: Add direct round-trip unit tests for the new opcodes
produced by the Op::Push Node::KVValueHashFeatureTypeWithChildHash encoding
(opcodes 0x1c/0x1d/0x2e/0x2f): create small and large test vectors (value
lengths below and above the 65536 threshold) that encode with the existing
encoding function, check encoding_length() matches the produced buffer length,
then decode and assert the decoded Op/Node equals the original; add separate
cases to cover both the 16-bit-length and 32-bit-length branches and include
assertions that the correct leading opcode byte (0x1c/0x1d for small, 0x2e/0x2f
for large) appears in the encoded buffer.

In `@merk/src/proofs/branch/mod.rs`:
- Around line 111-125: The duplicated match logic for extracting a key from Node
(currently implemented as get_key_from_node) should be centralized by adding a
single helper (e.g., impl Node::key(&self) -> Option<Vec<u8>> or a free function
Node_key) and replacing all local match tables (including get_key_from_node,
branch-query helpers, merk/benches/branch_queries.rs and
merk/src/proofs/tree.rs) to call that helper; update every call site to use the
new Node::key() (or Node_key) accessor so new Node variants only need one
update.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 65d26ea8-e396-468a-88df-58bbde35f1e8

📥 Commits

Reviewing files that changed from the base of the PR and between e2fa568 and 259b0cc.

📒 Files selected for processing (12)
  • grovedb-query/src/proofs/encoding.rs
  • grovedb-query/src/proofs/mod.rs
  • grovedb/src/operations/proof/generate.rs
  • grovedb/src/operations/proof/mod.rs
  • grovedb/src/operations/proof/verify.rs
  • grovedb/src/tests/proof_coverage_tests.rs
  • grovedb/src/tests/provable_count_sum_tree_tests.rs
  • merk/benches/branch_queries.rs
  • merk/src/merk/chunks.rs
  • merk/src/proofs/branch/mod.rs
  • merk/src/proofs/query/verify.rs
  • merk/src/proofs/tree.rs

Comment thread grovedb/src/operations/proof/generate.rs
Comment thread grovedb/src/operations/proof/verify.rs
Comment thread grovedb/src/tests/proof_coverage_tests.rs
Comment thread grovedb/src/tests/provable_count_sum_tree_tests.rs
Comment thread merk/src/proofs/query/verify.rs
Comment thread merk/src/proofs/tree.rs
- Fix CI: add new variant to debugger.rs merk_proof_node_to_grovedbg
- Add to execute() key ordering checks (Op::Push/PushInverted)
- Add to Child::as_link() for aggregate data extraction
- Add to Tree::aggregate_data() for feature type extraction
- Add to get_node_count in provable_count_sum_tree_tests
- Add to count extraction in verify.rs leaf key collection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 9, 2026

Codecov Report

❌ Patch coverage is 90.53085% with 66 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.71%. Comparing base (e2fa568) to head (f73fa23).
⚠️ Report is 1 commits behind head on develop.

Files with missing lines Patch % Lines
grovedb/src/operations/proof/verify.rs 34.14% 27 Missing ⚠️
grovedb/src/operations/proof/generate.rs 83.33% 12 Missing ⚠️
merk/src/proofs/query/verify.rs 78.00% 11 Missing ⚠️
grovedb/src/query/grove_branch_query_result.rs 0.00% 6 Missing ⚠️
merk/src/proofs/tree.rs 60.00% 4 Missing ⚠️
grovedb-query/src/proofs/encoding.rs 99.44% 2 Missing ⚠️
grovedb/src/query/grove_trunk_query_result.rs 66.66% 2 Missing ⚠️
merk/src/merk/chunks.rs 0.00% 1 Missing ⚠️
merk/src/proofs/query/map.rs 98.64% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #633      +/-   ##
===========================================
+ Coverage    90.70%   90.71%   +0.01%     
===========================================
  Files          182      182              
  Lines        51139    51813     +674     
===========================================
+ Hits         46387    47004     +617     
- Misses        4752     4809      +57     
Components Coverage Δ
grovedb-core 88.76% <66.42%> (-0.11%) ⬇️
merk 91.94% <87.59%> (-0.04%) ⬇️
storage 86.36% <ø> (ø)
commitment-tree 96.41% <ø> (ø)
mmr 96.72% <ø> (ø)
bulk-append-tree 90.85% <ø> (ø)
element 97.56% <100.00%> (+<0.01%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
grovedb/src/debugger.rs (1)

522-551: Consider extracting the duplicated feature_type mapping into a helper function.

The feature_type to grovedbg_types::TreeFeatureType conversion logic at lines 524-543 is nearly identical to lines 460-479. Extracting this into a helper function would reduce duplication and improve maintainability.

♻️ Proposed refactor to extract helper function
+fn tree_feature_type_to_grovedbg(feature_type: TreeFeatureType) -> grovedbg_types::TreeFeatureType {
+    match feature_type {
+        TreeFeatureType::BasicMerkNode => grovedbg_types::TreeFeatureType::BasicMerkNode,
+        TreeFeatureType::SummedMerkNode(sum) => grovedbg_types::TreeFeatureType::SummedMerkNode(sum),
+        TreeFeatureType::BigSummedMerkNode(sum) => grovedbg_types::TreeFeatureType::BigSummedMerkNode(sum),
+        TreeFeatureType::CountedMerkNode(count) => grovedbg_types::TreeFeatureType::CountedMerkNode(count),
+        TreeFeatureType::CountedSummedMerkNode(count, sum) => grovedbg_types::TreeFeatureType::CountedSummedMerkNode(count, sum),
+        TreeFeatureType::ProvableCountedMerkNode(count) => grovedbg_types::TreeFeatureType::ProvableCountedMerkNode(count),
+        TreeFeatureType::ProvableCountedSummedMerkNode(count, sum) => grovedbg_types::TreeFeatureType::ProvableCountedSummedMerkNode(count, sum),
+    }
+}

 fn merk_proof_node_to_grovedbg(node: Node) -> Result<MerkProofNode, crate::Error> {
     Ok(match node {
         // ... other cases ...
         Node::KVValueHashFeatureType(key, value, hash, feature_type) => {
             let element = crate::Element::deserialize(&value, GroveVersion::latest())?;
-            let node_feature_type = match feature_type {
-                // ... 7 arms ...
-            };
+            let node_feature_type = tree_feature_type_to_grovedbg(feature_type);
             MerkProofNode::KVValueHashFeatureType(key, element_to_grovedbg(element), hash, node_feature_type)
         }
         Node::KVValueHashFeatureTypeWithChildHash(key, value, hash, feature_type, _child_hash) => {
             let element = crate::Element::deserialize(&value, GroveVersion::latest())?;
-            let node_feature_type = match feature_type {
-                // ... 7 arms ...
-            };
+            let node_feature_type = tree_feature_type_to_grovedbg(feature_type);
             MerkProofNode::KVValueHashFeatureType(key, element_to_grovedbg(element), hash, node_feature_type)
         }
     })
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@grovedb/src/debugger.rs` around lines 522 - 551, Extract the duplicated match
that converts TreeFeatureType -> grovedbg_types::TreeFeatureType into a small
helper function (e.g., fn convert_tree_feature_type(ft: TreeFeatureType) ->
grovedbg_types::TreeFeatureType) and use it in both places (the match inside the
Node::KVValueHashFeatureTypeWithChildHash arm and the earlier identical match).
Replace the inline matches with a call to convert_tree_feature_type(...) so the
MerkProofNode::KVValueHashFeatureType construction still passes the converted
node_feature_type to element_to_grovedbg(element) and preserves existing
variable names (feature_type, node_feature_type, TreeFeatureType,
grovedbg_types::TreeFeatureType).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@grovedb/src/debugger.rs`:
- Around line 522-551: Extract the duplicated match that converts
TreeFeatureType -> grovedbg_types::TreeFeatureType into a small helper function
(e.g., fn convert_tree_feature_type(ft: TreeFeatureType) ->
grovedbg_types::TreeFeatureType) and use it in both places (the match inside the
Node::KVValueHashFeatureTypeWithChildHash arm and the earlier identical match).
Replace the inline matches with a call to convert_tree_feature_type(...) so the
MerkProofNode::KVValueHashFeatureType construction still passes the converted
node_feature_type to element_to_grovedbg(element) and preserves existing
variable names (feature_type, node_feature_type, TreeFeatureType,
grovedbg_types::TreeFeatureType).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 448386a3-271b-4025-8349-6f5136745ea2

📥 Commits

Reviewing files that changed from the base of the PR and between 259b0cc and 226d282.

📒 Files selected for processing (4)
  • grovedb/src/debugger.rs
  • grovedb/src/operations/proof/verify.rs
  • grovedb/src/tests/provable_count_sum_tree_tests.rs
  • merk/src/proofs/tree.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • grovedb/src/tests/provable_count_sum_tree_tests.rs

Copy link
Copy Markdown
Member Author

@QuantumExplorer QuantumExplorer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed

… (audit)

Addresses findings from security and quality audits:
- MapBuilder::insert in map.rs (deprecated API, but correctness matters)
- get_node_count in grove_trunk_query_result.rs
- get_node_count in grove_branch_query_result.rs
- get_node_count in benches/branch_chunk_queries.rs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@grovedb/benches/branch_chunk_queries.rs`:
- Around line 180-186: The benchmark helper is not extracting counts for nodes
that use the new KVValueHashFeatureTypeWithChildHash path and
get_tree_root_count still only checks the old variants; update the pattern
matches in the helper (the match arms handling Node::KVValueHashFeatureType and
Node::KVValueHashFeatureTypeWithChildHash) to also return counts for
TreeFeatureType::CountedMerkNode and TreeFeatureType::CountedSummedMerkNode (and
their provable variants if present), and modify get_tree_root_count() to mirror
the same variant/feature-type handling so counted-tree nodes produce correct
root_count and ancestor selection diagnostics.

In `@grovedb/src/query/grove_branch_query_result.rs`:
- Around line 123-129: The match over Node::KVValueHashFeatureType /
Node::KVValueHashFeatureTypeWithChildHash is currently extracting only
ProvableCountedMerkNode and ProvableCountedSummedMerkNode manually; change it to
call TreeFeatureType::count() on feature_type and return that Option<u64> (i.e.,
if feature_type.count() is Some(count) return Some(count) else None) so
get_ancestor() correctly recognizes all counted feature variants. Locate the
match in the function handling these Node variants (the block matching
Node::KVValueHashFeatureType and Node::KVValueHashFeatureTypeWithChildHash) and
replace the hard-coded variant checks with a call to feature_type.count().

In `@grovedb/src/query/grove_trunk_query_result.rs`:
- Around line 137-143: Replace the hard-coded match on TreeFeatureType that only
handles ProvableCountedMerkNode and ProvableCountedSummedMerkNode with a call to
TreeFeatureType::count() so all counted node variants are handled; specifically,
in the match arm for Node::KVValueHashFeatureType(...) and
Node::KVValueHashFeatureTypeWithChildHash(...), return feature_type.count() (or
feature_type.count().copied()/map(|c| *c) as appropriate) instead of manually
matching only the Provable* variants—this will fix undercounting in
get_ancestor() and related privacy-selection logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 553dae32-cd4b-4371-a00e-ea1ef764c03d

📥 Commits

Reviewing files that changed from the base of the PR and between 226d282 and d4cd2a7.

📒 Files selected for processing (4)
  • grovedb/benches/branch_chunk_queries.rs
  • grovedb/src/query/grove_branch_query_result.rs
  • grovedb/src/query/grove_trunk_query_result.rs
  • merk/src/proofs/query/map.rs

Comment thread grovedb/benches/branch_chunk_queries.rs
Comment thread grovedb/src/query/grove_branch_query_result.rs
Comment thread grovedb/src/query/grove_trunk_query_result.rs
QuantumExplorer and others added 2 commits March 9, 2026 22:08
…queries

Adds `child_hash_verified: bool` to `ProvedKeyOptionalValue` to propagate
whether the merk verifier confirmed combine_hash(H(value), child_hash)
== value_hash via a KVValueHashFeatureTypeWithChildHash node.

At the GroveDB level, non-empty Merk trees without subqueries (no lower
layer proof) must have child_hash_verified == true, preventing a MITM
downgrade attack where an attacker replaces the WithChildHash opcode
with a plain KVValueHashFeatureType to bypass child hash verification.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Encoding/decoding round-trip tests for all four opcodes (0x1c, 0x1d,
  0x2e, 0x2f) covering Push/PushInverted × small/large values
- Integration tests for non-empty Merk trees without subqueries:
  Tree, SumTree, CountTree, mixed types, and V0 path
- Display impl test for the new Node variant
- MapBuilder insertion tests with the new node type
- Branch proof key extraction tests (trunk + branch)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
grovedb/src/operations/proof/util.rs (1)

254-269: ⚠️ Potential issue | 🟠 Major

Propagate child_hash_verified through the path-wrapped result too.

These tests now populate child_hash_verified on ProvedKeyOptionalValue, but the converted ProvedPathKeyOptionalValue still has no place to store it, so from_proved_key_value drops the verification state entirely. That undercuts the downgrade protection this PR is adding once callers switch to the path-wrapped form. Please add child_hash_verified to ProvedPathKeyOptionalValue, copy it in the conversion helpers, and assert it here so this path cannot silently regress again.

Also applies to: 276-344

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@grovedb/src/operations/proof/util.rs` around lines 254 - 269,
ProvedPathKeyOptionalValue currently discards the child_hash_verified flag from
ProvedKeyOptionalValue; add a child_hash_verified: bool field to the
ProvedPathKeyOptionalValue struct, update the conversion helper
ProvedPathKeyOptionalValue::from_proved_key_value (and the other similar
from_proved_* helpers referenced in the same region) to copy
proved.child_hash_verified into the new field, and update tests (the assert
constructing ProvedPathKeyOptionalValue and any other assertions in lines
~276-344) to include and check child_hash_verified so the verification state is
preserved through the path-wrapped type.
♻️ Duplicate comments (1)
merk/src/proofs/query/verify.rs (1)

419-424: ⚠️ Potential issue | 🟡 Minor

Preserve corrupted-data context for malformed node values.

This decode path still flattens ElementType::from_serialized_value failures into InvalidProofError, so malformed proof bytes lose the more specific corruption context.

Suggested fix
-                    let element_type = ElementType::from_serialized_value(value).map_err(|e| {
-                        Error::InvalidProofError(format!(
-                            "cannot determine element type in \
-                                 KVValueHashFeatureTypeWithChildHash node: {e}"
-                        ))
-                    })?;
+                    let element_type = ElementType::from_serialized_value(value).map_err(|e| {
+                        Error::CorruptedData(format!(
+                            "cannot determine element type in \
+                                 KVValueHashFeatureTypeWithChildHash node: {e}"
+                        ))
+                    })?;

As per coding guidelines, Wrap errors with context using .map_err(|e| Error::CorruptedData(format!("context: {}", e))) pattern in Rust source files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@merk/src/proofs/query/verify.rs` around lines 419 - 424, The current map_err
on ElementType::from_serialized_value in the KVValueHashFeatureTypeWithChildHash
branch converts all failures into Error::InvalidProofError and loses corruption
context; change the map_err to wrap the original error into Error::CorruptedData
with a descriptive message (e.g., format!("cannot determine element type in
KVValueHashFeatureTypeWithChildHash node: {}", e)) so the underlying parse error
is preserved instead of being flattened to InvalidProofError; update the error
variant returned from the closure accordingly so callers receive
Error::CorruptedData for malformed node values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@merk/src/proofs/query/verify.rs`:
- Around line 436-438: The verifier drops OperationCost by unwrapping results
from value_hash(value) and combine_hash(...) inside
KVValueHashFeatureTypeWithChildHash handling; replace the unwraps with
cost_return_on_error! so that both value_hash and combine_hash calls use
cost_return_on_error!(...) to propagate errors while accumulating their
OperationCost into the verifier's cost tracking (i.e., call value_hash and
combine_hash via cost_return_on_error! and bind the successful results to
element_value_hash and computed_value_hash respectively inside the verify.rs
verification path).

---

Outside diff comments:
In `@grovedb/src/operations/proof/util.rs`:
- Around line 254-269: ProvedPathKeyOptionalValue currently discards the
child_hash_verified flag from ProvedKeyOptionalValue; add a child_hash_verified:
bool field to the ProvedPathKeyOptionalValue struct, update the conversion
helper ProvedPathKeyOptionalValue::from_proved_key_value (and the other similar
from_proved_* helpers referenced in the same region) to copy
proved.child_hash_verified into the new field, and update tests (the assert
constructing ProvedPathKeyOptionalValue and any other assertions in lines
~276-344) to include and check child_hash_verified so the verification state is
preserved through the path-wrapped type.

---

Duplicate comments:
In `@merk/src/proofs/query/verify.rs`:
- Around line 419-424: The current map_err on ElementType::from_serialized_value
in the KVValueHashFeatureTypeWithChildHash branch converts all failures into
Error::InvalidProofError and loses corruption context; change the map_err to
wrap the original error into Error::CorruptedData with a descriptive message
(e.g., format!("cannot determine element type in
KVValueHashFeatureTypeWithChildHash node: {}", e)) so the underlying parse error
is preserved instead of being flattened to InvalidProofError; update the error
variant returned from the closure accordingly so callers receive
Error::CorruptedData for malformed node values.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d3a167da-98b5-4c20-9679-826cecc5ec13

📥 Commits

Reviewing files that changed from the base of the PR and between d4cd2a7 and ea88fe8.

📒 Files selected for processing (4)
  • grovedb-element/src/element/helpers.rs
  • grovedb/src/operations/proof/util.rs
  • grovedb/src/operations/proof/verify.rs
  • merk/src/proofs/query/verify.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • grovedb/src/operations/proof/verify.rs

Comment thread merk/src/proofs/query/verify.rs
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
grovedb/src/tests/proof_coverage_tests.rs (1)

6424-6482: ⚠️ Potential issue | 🟡 Minor

Make the tamper helper robust to all supported encodings.

Line 6437 still reads merk_proof[i + 1] before any bounds check, and the helper only looks for the small 0x1c form. That makes this regression test brittle: a truncated buffer can panic, and valid proofs using the inverted/large encodings will be missed entirely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@grovedb/src/tests/proof_coverage_tests.rs` around lines 6424 - 6482, The
helper tamper_kvvaluehash_feature_type_with_child_hash reads merk_proof[i+1]
without bounds checking and only handles the small 0x1c encoding, making it
panic on truncated buffers and miss inverted/large encodings; fix it by first
ensuring i+1 < merk_proof.len() before any indexing, then implement detection
and parsing for all supported encodings for KVValueHashFeatureTypeWithChildHash
(small, inverted, large) by reading the correct key length fields only after
validating sufficient bytes remain, perform all subsequent reads (value_len u16,
value bytes, value_hash, feature_type, child_hash) with bounds checks before
access, and on any malformed/insufficient-length case advance i safely (e.g., i
+= 1) and continue; update references in
tamper_kvvaluehash_feature_type_with_child_hash to use these guarded reads and
to handle the alternate encodings when matching target_key and replacing the
value.
🧹 Nitpick comments (2)
grovedb/src/tests/proof_coverage_tests.rs (1)

6606-6921: Assert the proof actually contains KVValueHashFeatureTypeWithChildHash.

These happy-path tests currently only prove that generation+verification succeeds. A downgrade back to plain KVValueHashFeatureType could still pass here, which is exactly the behavior this PR is trying to lock down. Decoding the inner merk proof and asserting the new node variant is present in each tree-type case would make these regressions much harder to miss.

Also applies to: 6924-7031

grovedb-query/src/proofs/encoding.rs (1)

2092-2145: Consider adding test coverage for large value variants (0x2e/0x2f).

The roundtrip tests thoroughly cover the small value paths (opcodes 0x1c/0x1d), but the large value variants (value.len() >= 65536) using opcodes 0x2e/0x2f are not exercised. Adding a test with a large value would ensure the u32 length encoding and MAX_VALUE_LEN validation paths work correctly.

💡 Suggested test addition
#[test]
fn encode_decode_roundtrip_kvvalue_hash_feature_type_with_child_hash_large_value() {
    // Test large value variant (>= 65536 bytes) using opcodes 0x2e/0x2f
    let large_value = vec![0u8; 65536];
    
    let op = Op::Push(Node::KVValueHashFeatureTypeWithChildHash(
        vec![1, 2, 3],
        large_value.clone(),
        [0; 32],
        BasicMerkNode,
        [42; 32],
    ));
    let mut bytes = vec![];
    op.encode_into(&mut bytes).unwrap();
    assert_eq!(bytes[0], 0x2e); // Verify large variant opcode
    let decoded = Op::decode(&bytes[..]).expect("decode failed");
    assert_eq!(decoded, op);

    let op = Op::PushInverted(Node::KVValueHashFeatureTypeWithChildHash(
        vec![1, 2, 3],
        large_value,
        [0; 32],
        BasicMerkNode,
        [42; 32],
    ));
    let mut bytes = vec![];
    op.encode_into(&mut bytes).unwrap();
    assert_eq!(bytes[0], 0x2f); // Verify large variant opcode
    let decoded = Op::decode(&bytes[..]).expect("decode failed");
    assert_eq!(decoded, op);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@grovedb-query/src/proofs/encoding.rs` around lines 2092 - 2145, Add a new
roundtrip test that constructs KVValueHashFeatureTypeWithChildHash nodes with a
large value (>= 65536 bytes) to exercise the 0x2e/0x2f encoding paths: create
large_value = vec![0u8; 65536], build Op::Push and Op::PushInverted instances
using Node::KVValueHashFeatureTypeWithChildHash and BasicMerkNode (and
optionally SummedMerkNode), call .encode_into(&mut bytes).unwrap(), assert the
first byte equals 0x2e for Push and 0x2f for PushInverted, then decode with
Op::decode(&bytes[..]) and assert equality with the original op; this verifies
u32 length encoding and MAX_VALUE_LEN handling in encode_into/decode for large
values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@grovedb/src/tests/proof_coverage_tests.rs`:
- Around line 6424-6482: The helper
tamper_kvvaluehash_feature_type_with_child_hash reads merk_proof[i+1] without
bounds checking and only handles the small 0x1c encoding, making it panic on
truncated buffers and miss inverted/large encodings; fix it by first ensuring
i+1 < merk_proof.len() before any indexing, then implement detection and parsing
for all supported encodings for KVValueHashFeatureTypeWithChildHash (small,
inverted, large) by reading the correct key length fields only after validating
sufficient bytes remain, perform all subsequent reads (value_len u16, value
bytes, value_hash, feature_type, child_hash) with bounds checks before access,
and on any malformed/insufficient-length case advance i safely (e.g., i += 1)
and continue; update references in
tamper_kvvaluehash_feature_type_with_child_hash to use these guarded reads and
to handle the alternate encodings when matching target_key and replacing the
value.

---

Nitpick comments:
In `@grovedb-query/src/proofs/encoding.rs`:
- Around line 2092-2145: Add a new roundtrip test that constructs
KVValueHashFeatureTypeWithChildHash nodes with a large value (>= 65536 bytes) to
exercise the 0x2e/0x2f encoding paths: create large_value = vec![0u8; 65536],
build Op::Push and Op::PushInverted instances using
Node::KVValueHashFeatureTypeWithChildHash and BasicMerkNode (and optionally
SummedMerkNode), call .encode_into(&mut bytes).unwrap(), assert the first byte
equals 0x2e for Push and 0x2f for PushInverted, then decode with
Op::decode(&bytes[..]) and assert equality with the original op; this verifies
u32 length encoding and MAX_VALUE_LEN handling in encode_into/decode for large
values.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 23e57e9a-5a7a-4b0b-a535-0fc0cfb49662

📥 Commits

Reviewing files that changed from the base of the PR and between ea88fe8 and 6cb49ad.

📒 Files selected for processing (5)
  • grovedb-query/src/proofs/encoding.rs
  • grovedb-query/src/proofs/mod.rs
  • grovedb/src/tests/proof_coverage_tests.rs
  • merk/src/proofs/branch/tests.rs
  • merk/src/proofs/query/map.rs

QuantumExplorer and others added 2 commits March 9, 2026 22:28
- generate.rs: use combine_hash in fallback case so the value_hash
  stored in KVValueHashFeatureTypeWithChildHash is the combined hash
  (defensive — this fallback is unreachable for tree elements in
  practice since they always produce KVValueHash or
  KVValueHashFeatureType nodes)
- verify.rs: add combine_hash verification for
  KVValueHashFeatureTypeWithChildHash nodes in trunk/branch chunk
  proof extraction to prevent value tampering during state sync
- benches: add KVValueHashFeatureTypeWithChildHash to
  get_tree_root_count match arm

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add 4 large-value encoding/decoding tests for KVValueHashFeatureTypeWithChildHash
(opcodes 0x2e/0x2f) and 3 error path tests that exercise proof tampering scenarios:
downgraded proof node rejection, combine_hash mismatch detection, and item element
rejection when upgraded to KVValueHashFeatureTypeWithChildHash.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@QuantumExplorer
Copy link
Copy Markdown
Member Author

Reviewed

@QuantumExplorer QuantumExplorer merged commit e0db549 into develop Mar 9, 2026
10 checks passed
@QuantumExplorer QuantumExplorer deleted the fix/C1-non-empty-tree-child-hash-verification branch March 9, 2026 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant