diff --git a/docs/book/src/SUMMARY.md b/docs/book/src/SUMMARY.md index 6b091779d..f47ee9b48 100644 --- a/docs/book/src/SUMMARY.md +++ b/docs/book/src/SUMMARY.md @@ -12,6 +12,7 @@ - [The Query System](query-system.md) - [Aggregate Sum Queries](aggregate-sum-queries.md) - [Aggregate Count Queries](aggregate-count-queries.md) +- [Aggregate Sum on Range Queries](aggregate-sum-on-range-queries.md) - [Batch Operations](batch-operations.md) - [Cost Tracking](cost-tracking.md) - [The MMR Tree](mmr-tree.md) diff --git a/docs/book/src/aggregate-sum-on-range-queries.md b/docs/book/src/aggregate-sum-on-range-queries.md new file mode 100644 index 000000000..9eb3bfa7b --- /dev/null +++ b/docs/book/src/aggregate-sum-on-range-queries.md @@ -0,0 +1,255 @@ +# Aggregate Sum on Range Queries + +## Overview + +An **Aggregate Sum on Range Query** lets a caller ask: + +> "What is the total sum of children whose keys fall in this range, in this +> `ProvableSumTree`?" + +The answer is a signed `i64`, and on a `ProvableSumTree` it comes back with a +cryptographic proof. A verifier holding the tree's root hash can compute the +total from the proof in `O(log n + |boundary|)` work — without ever +materializing the `SumItem` values themselves. + +This is the parallel to [Aggregate Count on Range](aggregate-count-queries.md) +for sum trees. The two query types are orthogonal: an aggregate-sum query +returns a sum, an aggregate-count query returns a count, and a single +`PathQuery` may not contain both. + +> **Not to be confused with [Aggregate Sum Queries](aggregate-sum-queries.md).** +> That existing API is a sum-budget iterator — it walks a SumTree returning +> `(key, sum_value)` pairs until a running total is reached. `AggregateSumOnRange` +> is a different feature: it answers "what is the verified total for keys in +> this range?" without returning any values, and only against the +> `ProvableSumTree` element type. 
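Before the API details, a toy in-memory model may help pin down the semantics. This is plain `std`, not GroveDB code, and every name in it is illustrative: it only fixes the arithmetic the query is expected to answer, totaling the per-key sum values whose keys fall in a byte range, accumulating in `i128` and narrowing to `i64` at the end.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Toy model of the question AggregateSumOnRange answers: the total of
/// the sum values whose keys fall in [start, end). The real feature
/// answers this from a proof without materializing the values.
fn range_sum(map: &BTreeMap<Vec<u8>, i64>, start: &[u8], end: &[u8]) -> Option<i64> {
    let total: i128 = map
        .range::<[u8], _>((Bound::Included(start), Bound::Excluded(end)))
        .map(|(_, v)| *v as i128)
        .sum();
    // Narrow at the outermost boundary, mirroring the verify path.
    i64::try_from(total).ok()
}

fn main() {
    let mut m = BTreeMap::new();
    m.insert(b"a".to_vec(), 50);
    m.insert(b"b".to_vec(), -100);
    m.insert(b"c".to_vec(), 30);
    m.insert(b"d".to_vec(), -50);
    assert_eq!(range_sum(&m, b"a", b"e"), Some(-70)); // mixed signs
    assert_eq!(range_sum(&m, b"b", b"d"), Some(-70)); // -100 + 30
}
```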
The feature is implemented as a `QueryItem` variant:

```rust
pub enum QueryItem {
    Key(Vec<u8>),
    Range(Range<Vec<u8>>),
    // ... existing variants ...
    AggregateCountOnRange(Box<QueryItem>),

    /// Sum the per-node sum contributions of children matched by the inner
    /// range, without returning them. Only valid on ProvableSumTree (and its
    /// `NonCounted` / `NotSummed` wrapper variants).
    AggregateSumOnRange(Box<QueryItem>),
}
```

The wrapped `QueryItem` is the **range to sum over**. As with
`AggregateCountOnRange`, it must be one of the true range variants:
`Range`, `RangeInclusive`, `RangeFrom`, `RangeTo`, `RangeToInclusive`,
`RangeAfter`, `RangeAfterTo`, `RangeAfterToInclusive`. The single-key
(`Key`), full-range (`RangeFull`), and self-nested (`AggregateSumOnRange`)
variants are rejected — and `AggregateSumOnRange` may not wrap an
`AggregateCountOnRange` either.

> **Why are `Key` and `RangeFull` rejected?**
>
> - **`Key(k)`** would return either `0` or the single child's sum
>   contribution — degenerate cases the existing `get_raw` /
>   `verify_query_with_options` paths already handle more cheaply.
> - **`RangeFull`** has its answer already exposed by the parent's
>   `Element::ProvableSumTree(_, sum, _)` bytes, which are hash-verified by
>   the parent Merk's proof. Going through `AggregateSumOnRange(RangeFull)`
>   would always produce a strictly heavier proof for an answer the caller
>   can read directly.

## Why this works only on ProvableSumTree

GroveDB has several tree types that track a sum:

| Tree type                   | Sum tracked? | Sum in node hash? | AggregateSumOnRange allowed? |
|-----------------------------|:------------:|:-----------------:|:----------------------------:|
| `SumTree`                   | yes          | no                | **no**  |
| `BigSumTree`                | yes (i128)   | no                | **no**  |
| `CountSumTree`              | yes          | no                | **no**  |
| `ProvableCountSumTree`      | yes          | no (count only)   | **no**  |
| `ProvableSumTree`           | yes          | **yes**           | **yes** |
| `NonCountedProvableSumTree` | yes (inner)  | yes (inner)       | **yes** |
| `NotSummedProvableSumTree`  | yes (inner)  | yes (inner)       | **yes** |

Only `ProvableSumTree` bakes the per-node sum into the node hash via
`node_hash_with_sum(kv_hash, left, right, sum)`. Because every node's sum
participates in the Merkle root, a verifier holding only the root hash can
reconstruct enough of the tree from a proof to **trust** the sums embedded
in it.

`SumTree`, `BigSumTree`, `CountSumTree`, and `ProvableCountSumTree` all
track sums in storage, but those sums are not committed in the node hash
chain. (For `ProvableCountSumTree`, the count is in the hash but the sum
is not.) A "proof" of those sums would be unverifiable, so we reject
`AggregateSumOnRange` against them at query-construction time.

The wrapper variants are accepted because the wrapper only changes how the
**parent** aggregates this element — the inner is still a fully-fledged
`ProvableSumTree`.

> **Why not `BigSumTree`?** `BigSumTree` uses `i128` sums and would need a
> separate hash dispatch (`node_hash_with_big_sum`) plus a different verify
> path. It is a documented follow-up, not part of this PR.

## Query-Level Constraints

`AggregateSumOnRange` is a **terminal** query item. Its presence reduces
the enclosing `Query` to a single, well-defined operation: "sum, then
return."

If any `QueryItem::AggregateSumOnRange(_)` appears in `Query::items`, the
query is well-formed only when:

1. `items.len() == 1` — no other items, no other sums, no mixing with
   `AggregateCountOnRange`.
2. The inner `QueryItem` is **not** `Key`, `RangeFull`, or another
   `AggregateSumOnRange` / `AggregateCountOnRange`.
3. `default_subquery_branch.subquery.is_none()` and
   `subquery_path.is_none()`.
4. `conditional_subquery_branches.is_none()` (or empty).
5. The targeted subtree's `TreeType` is `ProvableSumTree`.
6. The enclosing `SizedQuery` does not set `limit` or `offset`. Summing
   is aggregate over the matched range — pagination would silently change
   the answer and is rejected.
7. `left_to_right` is **ignored** (summing is direction-agnostic).

Violations return `Error::InvalidQuery(...)` before any I/O.

## API Surface

```rust
// Prove side — unchanged from regular queries:
GroveDb::prove_query(&path_query, prove_options, grove_version)
    -> CostResult<Vec<u8>, Error>

// Verify side — dedicated, returns (root_hash, sum):
GroveDb::verify_aggregate_sum_query(proof, &path_query, grove_version)
    -> Result<(CryptoHash, i64), Error>
```

A bare tuple is used rather than a wrapper struct: the sum is already an
`i64` and the `path_query` echoes the inner range.

> **Note on `NonCounted` and `NotSummed` children.** An
> `Element::NotSummed(child)` wrapper tells the parent sum tree to skip the
> wrapped element when aggregating its own sum. `AggregateSumOnRange`
> honors this: every node in a `ProvableSumTree` carries an own-sum equal
> to its own `SumItem` value or `0` if `NotSummed`-wrapped. The verifier
> credits only the **own-sum** to the in-range total when the boundary key
> falls in range. `NonCounted` is orthogonal to sums — it suppresses count
> aggregation, not sum aggregation — so a `NonCounted` `SumItem` still
> contributes its sum value normally.

## Proof Node Vocabulary

For `ProvableSumTree`, every node hash commits to its subtree's aggregate
sum via `node_hash_with_sum(kv_hash, left, right, sum)`.
The proof-node +vocabulary is parallel to the count family, with new variants carrying an +`i64` sum field in place of the `u64` count: + +| Role in proof | Proof node type | What it carries | +|----------------------------|------------------------------------------------------------------------------|----------------------------------------------------------------| +| **On-path / boundary** | `KVDigestSum(key, value_hash, sum)` | key + value digest + subtree sum | +| **Fully-inside / outside** | `HashWithSum(kv_hash, left_hash, right_hash, sum)` | the four fields needed to recompute `node_hash_with_sum` | +| **Queried boundary item** | `KVSum(key, value, sum)` | leaf value at a boundary key, with subtree sum | +| **Empty side** | (the empty-tree sentinel, no `Push` needed) | — | + +Wire format tag bytes (V1 only): `0x30..=0x3D` for the push and +push-inverted variants. The on-the-wire sum field is `varint i64` (not +fixed-width) for compactness; the **hash input** to `node_hash_with_sum` +uses fixed 8-byte big-endian — wire and hash are deliberately decoupled. + +> **Why `HashWithSum` is self-verifying.** The `sum` value carried by a +> `HashWithSum` op is *bound* to the parent merk's hash chain, not +> trusted on faith. The verifier recomputes +> `node_hash_with_sum(kv_hash, left, right, sum)` from the four fields +> and uses the result as the subtree's committed `node_hash` for the +> parent's hash recomputation. If the prover lies about `sum`, the +> recomputed `node_hash` diverges from what the parent committed, and the +> parent's Merkle-root check fails. + +The walk-by-example diagrams from +[Aggregate Count on Range Queries](aggregate-count-queries.md) apply +unchanged — substitute `KVDigestCount` → `KVDigestSum` and +`HashWithCount` → `HashWithSum`. 
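The binding argument behind `HashWithSum` can be sketched with a stand-in hash. Here std's `DefaultHasher` replaces blake3 purely for illustration and the function is a local mock, not the real `node_hash_with_sum`; only the field layout (kv, left, right, fixed big-endian sum) follows the description above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

type Digest = u64; // stand-in for the 32-byte blake3 CryptoHash

/// Mock of node_hash_with_sum: any collision-resistant hash over
/// (kv, left, right, sum) suffices to illustrate the binding argument.
fn mock_node_hash_with_sum(kv: Digest, left: Digest, right: Digest, sum: i64) -> Digest {
    let mut h = DefaultHasher::new();
    h.write_u64(kv);
    h.write_u64(left);
    h.write_u64(right);
    h.write(&sum.to_be_bytes()); // fixed 8-byte BE, as in the real hash input
    h.finish()
}

fn main() {
    // The parent committed to this node hash when the tree was built.
    let (kv, left, right, sum) = (1, 2, 3, -45i64);
    let committed = mock_node_hash_with_sum(kv, left, right, sum);

    // Honest prover: the recomputed hash matches the commitment.
    assert_eq!(mock_node_hash_with_sum(kv, left, right, sum), committed);

    // Lying about the sum changes the recomputed node hash, so the
    // parent's Merkle-root check fails downstream.
    assert_ne!(mock_node_hash_with_sum(kv, left, right, sum + 1), committed);
}
```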
+ +## Signed-Sum Arithmetic + +Two correctness points differ from the count machinery: + +### Negative sums + +A `ProvableSumTree` can hold negative `SumItem` values, and a range can +sum to a negative or zero total. Two consequences: + +- **No `if sum == 0` short-circuit.** The count generator can skip an + empty subtree (count = 0 means "no elements"), but `sum == 0` does + **not** mean "no elements" — it can mean "+5 and -5 cancelled". The + sum prover descends regardless. +- **No `own_sum = aggregate − left_struct − right_struct` overflow + check.** Count uses `checked_sub` to catch "children claim more than + parent" as corruption. Signed sums can naturally have children's + structural sums in any combination (`+200 + -150 = +50`), so the + subtraction is allowed to wrap. The hash chain still binds every + node, so arithmetic corruption changes the reconstructed root hash + and the caller's root check catches it. + +### i64 overflow at extremes + +A sum of two `i64::MAX` children does **not** fit in `i64`. The verify +path accumulates in `i128` end-to-end: + +- The prover's internal recursion (`emit_sum_proof`) returns + `CostResult`. +- The verifier's `verify_sum_shape` accumulates into an `i128`. +- Both narrow to `i64` at the **outermost entry point** via + `i64::try_from(sum_i128)`, returning `Error::InvalidProofError` if + the i128 result doesn't fit. + +Tests cover the two interesting overflow shapes: + +- `i64::MAX + i64::MAX` → overflows i64, verify rejects with + `InvalidProofError`. +- `i64::MAX + i64::MIN` → `-1`, fits i64, verify succeeds. The + intermediate i128 carries the difference safely. + +## Tests and Examples + +See: + +- `grovedb/src/tests/aggregate_sum_query_tests.rs` — 21 end-to-end + GroveDB tests. +- `merk/src/proofs/query/aggregate_sum.rs` — 14 Merk-level tests + (classification, prover internals, single-`Hash` rejection, + disjoint-with-children rejection, overflow at i64::MAX). 
+- `grovedb/src/operations/proof/aggregate_sum.rs` — V0/V1 envelope walker + with layer-chain validation. + +The marquee scenarios: + +| Scenario | Result | +|-------------------------------------------------------|-------------------------------------| +| Full range over `[1..=15]` | sum = 120 | +| Subrange `[5..=10]` | sum = 45 | +| Mixed `+50, -100, +30, -50` | sum = -70 | +| All-negative subrange | sum = -10 | +| `+5, -5` (non-zero children, zero sum) | sum = 0 (no short-circuit) | +| `i64::MAX + i64::MAX` | `Error::InvalidProofError` | +| `i64::MAX + i64::MIN` | sum = -1 | +| Tampered `HashWithSum::sum` | rejected (root-hash divergence) | +| `NotSummed(SumItem)` in range | excluded (matches tree's aggregate) | +| Query with subquery / pagination / mixed aggregates | rejected at validation | + +## See Also + +- [Element System](element-system.md) — the `ProvableSumTree` element + variant and `ProvableSummedMerkNode` feature type. +- [Aggregate Count on Range Queries](aggregate-count-queries.md) — the + symmetric count-only feature; most of the proof-shape walk diagrams + apply unchanged. +- [Aggregate Sum Queries](aggregate-sum-queries.md) — the existing + sum-budget iterator (a different feature with a similar name). +- [Hashing](hashing.md) — `node_hash_with_sum` and the broader + hash-binding scheme. diff --git a/docs/book/src/aggregate-sum-queries.md b/docs/book/src/aggregate-sum-queries.md index d4c32c449..2af6d7d50 100644 --- a/docs/book/src/aggregate-sum-queries.md +++ b/docs/book/src/aggregate-sum-queries.md @@ -1,5 +1,13 @@ # Aggregate Sum Queries +> **Heads up — two different features.** This page covers the +> sum-budget iterator: walk a `SumTree` returning `(key, sum_value)` pairs +> until a running total is reached. If you instead want a **cryptographically +> verifiable total** for a key range against a `ProvableSumTree`, see +> [Aggregate Sum on Range Queries](aggregate-sum-on-range-queries.md). 
+> The two features are independent — the iterator does not produce a +> proof of the running total, only the elements that contributed to it. + ## Overview Aggregate Sum Queries are a specialized query type designed for **SumTrees** in GroveDB. diff --git a/docs/book/src/appendix-a.md b/docs/book/src/appendix-a.md index eb29f8b04..11fb2a7e0 100644 --- a/docs/book/src/appendix-a.md +++ b/docs/book/src/appendix-a.md @@ -1,22 +1,33 @@ # Appendix A: Complete Element Type Reference -| Discriminant | Variant | TreeType | Fields | Cost Size | Purpose | +> **Reading the table.** "Element disc" is the bincode discriminant of +> the `Element` enum (one byte; persisted at the start of every +> serialized element). "TreeType disc" is the discriminant of the +> *separate* `TreeType` enum in `merk/src/tree_type/mod.rs` — it is NOT +> the same numbering. Most rows list both the TreeType disc and its +> variant name (e.g. `0 (NormalTree)`) to keep the distinction obvious; +> `N/A` means the Element variant is not a tree. 
| Element disc | Variant | TreeType disc | Fields | Cost Size | Purpose |
|---|---|---|---|---|---|
| 0 | `Item` | N/A | `(value, flags)` | varies | Basic key-value storage |
| 1 | `Reference` | N/A | `(path, max_hop, flags)` | varies | Link between elements |
| 2 | `Tree` | 0 (NormalTree) | `(root_key, flags)` | TREE_COST_SIZE | Container for subtrees |
| 3 | `SumItem` | N/A | `(value, flags)` | varies | Contributes to parent sum |
| 4 | `SumTree` | 1 (SumTree) | `(root_key, sum, flags)` | SUM_TREE_COST_SIZE | Maintains sum of descendants |
-| 5 | `BigSumTree` | 4 (BigSumTree) | `(root_key, sum128, flags)` | BIG_SUM_TREE_COST_SIZE | 128-bit sum tree |
-| 6 | `CountTree` | 2 (CountTree) | `(root_key, count, flags)` | COUNT_TREE_COST_SIZE | Element counting tree |
-| 7 | `CountSumTree` | 3 (CountSumTree) | `(root_key, count, sum, flags)` | COUNT_SUM_TREE_COST_SIZE | Combined count + sum |
-| 8 | `ItemWithSumItem` | N/A | `(value, sum, flags)` | varies | Item with sum contribution |
-| 9 | `ProvableCountTree` | 5 | `(root_key, count, flags)` | COUNT_TREE_COST_SIZE | Provable count tree |
-| 10 | `ProvableCountSumTree` | 6 | `(root_key, count, sum, flags)` | COUNT_SUM_TREE_COST_SIZE | Provable count + sum |
-| 11 | `CommitmentTree` | 7 | `(total_count: u64, chunk_power: u8, flags)` | 12 | ZK-friendly Sinsemilla + BulkAppendTree |
-| 12 | `MmrTree` | 8 | `(mmr_size: u64, flags)` | 11 | Append-only MMR log |
-| 13 | `BulkAppendTree` | 9 | `(total_count: u64, chunk_power: u8, flags)` | 12 | High-throughput append-only log |
-| 14 | `DenseAppendOnlyFixedSizeTree` | 10 | `(count: u16, height: u8, flags)` | 6 | Dense fixed-capacity Merkle storage |
+| 5 | `BigSumTree` | 2 (BigSumTree) | `(root_key, sum128, flags)` | BIG_SUM_TREE_COST_SIZE | 128-bit sum tree |
+| 6 | `CountTree` | 3 (CountTree) | `(root_key, count, flags)` | COUNT_TREE_COST_SIZE | Element counting tree |
+| 7 | `CountSumTree` | 4 (CountSumTree) | `(root_key, count, sum, flags)` | COUNT_SUM_TREE_COST_SIZE | Combined count + sum |
+| 8 | `ProvableCountTree` | 5 (ProvableCountTree) | `(root_key, count, flags)` | COUNT_TREE_COST_SIZE | Provable count tree |
+| 9 | `ItemWithSumItem` | N/A | `(value, sum, flags)` | varies | Item with sum contribution |
+| 10 | `ProvableCountSumTree` | 6 (ProvableCountSumTree) | `(root_key, count, sum, flags)` | COUNT_SUM_TREE_COST_SIZE | Provable count + sum (only count in hash) |
+| 11 | `CommitmentTree` | 7 (CommitmentTree) | `(total_count: u64, chunk_power: u8, flags)` | 12 | ZK-friendly Sinsemilla + BulkAppendTree |
+| 12 | `MmrTree` | 8 (MmrTree) | `(mmr_size: u64, flags)` | 11 | Append-only MMR log |
+| 13 | `BulkAppendTree` | 9 (BulkAppendTree) | `(total_count: u64, chunk_power: u8, flags)` | 12 | High-throughput append-only log |
+| 14 | `DenseAppendOnlyFixedSizeTree` | 10 (DenseAppendOnlyFixedSizeTree) | `(count: u16, height: u8, flags)` | 6 | Dense fixed-capacity Merkle storage |
+| 15 | `NonCounted` | wrapper | `Box<Element>` | inner + 1 byte | Opts inner out of parent count aggregation |
+| 16 | `NotSummed` | wrapper | `Box<Element>` | inner + 1 byte | Opts inner out of parent sum aggregation |
+| 17 | `ProvableSumTree` | 11 (ProvableSumTree) | `(root_key, sum: i64, flags)` | SUM_TREE_COST_SIZE | Sum baked into hash (see [Aggregate Sum on Range Queries](aggregate-sum-on-range-queries.md)) |

**Notes:**
- Discriminants 11–14 are **non-Merk trees**: data lives outside a child Merk subtree
diff --git a/docs/book/src/element-system.md b/docs/book/src/element-system.md
index 84efc4e33..09c2e0b76 100644
--- a/docs/book/src/element-system.md
+++ b/docs/book/src/element-system.md
@@ -25,6 +25,9 @@ pub enum Element {
     MmrTree(u64, Option<ElementFlags>),                          // [12]
     BulkAppendTree(u64, u8, Option<ElementFlags>),               // [13]
     DenseAppendOnlyFixedSizeTree(u16, u8, Option<ElementFlags>), // [14]
+    NonCounted(Box<Element>),                                    // [15] wrapper byte
+    NotSummed(Box<Element>),                                     // [16] wrapper byte
+    ProvableSumTree(Option<Vec<u8>>, SumValue, Option<ElementFlags>), // [17]
 }
 ```
@@ -155,11 +158,31 @@ Additional aggregate tree types:
 | `BigSumTree` | `BigSummedMerkNode(i128)` | 128-bit sum for large values |
 | `ProvableCountTree` | `ProvableCountedMerkNode(u64)` | Count baked into hash |
 | `ProvableCountSumTree` | `ProvableCountedSummedMerkNode(u64, i64)` | Count in hash + sum |
+| `ProvableSumTree` | `ProvableSummedMerkNode(i64)` | Sum baked into hash |

**ProvableCountTree** is special: its count is included in the `node_hash`
computation (via `node_hash_with_count`), so a proof can verify the count
without revealing any values.

**ProvableSumTree** is the parallel for sums: each node's aggregate sum is
included in the `node_hash` via `node_hash_with_sum(kv_hash, left, right, sum)`,
so a proof can return the verified total of any key range without revealing
the underlying `SumItem` values. Use this when the sum is part of the
protocol invariant — stake weights, fee priorities, vote tallies — and a
peer needs to verify totals from the root hash alone. Use plain `SumTree`
when the sum is bookkeeping metadata that doesn't need cryptographic
binding. The per-node hashing cost is a small fixed addition over plain
`SumTree`. See [Aggregate Sum on Range Queries](aggregate-sum-on-range-queries.md)
for the verifiable range-sum query that this element enables.

Like its count counterpart, `ProvableSumTree` accepts the `NotSummed`
wrapper so a sum-bearing child can opt out of contributing to its parent's
running sum. The `NotSummed` ElementType twin lives at slot 177 in the
`0xB0..=0xBF` family range, assigned explicitly rather than via the
`prefix | base` formula used elsewhere (the formula would collide:
ProvableSumTree's base discriminant is `17`, `0xB0 | 17 = 0xB1`, and
`0xB1` would mask back to `Reference` under the legacy `& 0x0F` inverse).
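The collision arithmetic described above can be checked numerically. A minimal sketch, taking the discriminant values (`Reference` = 1, `ProvableSumTree` = 17, family prefix `0xB0`, legacy inverse mask `0x0F`) as stated in this passage:

```rust
fn main() {
    const REFERENCE_DISC: u8 = 1; // Element::Reference discriminant
    const PROVABLE_SUM_TREE_DISC: u8 = 17; // Element::ProvableSumTree discriminant

    // The naive `prefix | base` formula for the 0xB0 family:
    let formula_slot = 0xB0u8 | PROVABLE_SUM_TREE_DISC;
    assert_eq!(formula_slot, 0xB1); // 0xB0 | 0x11 = 0xB1 = 177

    // The legacy inverse masks with 0x0F, folding 17 (0x11) down to 1,
    // which collides with Reference. Hence the explicit slot assignment.
    assert_eq!(formula_slot & 0x0F, REFERENCE_DISC);
}
```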
## Element Serialization

Elements are serialized using **bincode** with big-endian byte order:
@@ -198,6 +221,7 @@ pub enum TreeFeatureType {
     CountedSummedMerkNode(u64, i64),         // Count + sum
     ProvableCountedMerkNode(u64),            // Count in hash
     ProvableCountedSummedMerkNode(u64, i64), // Count in hash + sum
+    ProvableSummedMerkNode(i64),             // Sum in hash
 }
 ```
diff --git a/docs/book/src/hashing.md b/docs/book/src/hashing.md
index 01fba9033..eb768db18 100644
--- a/docs/book/src/hashing.md
+++ b/docs/book/src/hashing.md
@@ -231,4 +231,37 @@ pub fn node_hash_with_count(
 This means a proof of count doesn't require revealing the actual data —
 the count is baked into the cryptographic commitment.

+## Aggregate Hashing for ProvableSumTree
+
+`ProvableSumTree` is the sum parallel — each node's aggregate sum is bound
+into the node hash:
+
+```rust
+pub fn node_hash_with_sum(
+    kv: &CryptoHash,
+    left: &CryptoHash,
+    right: &CryptoHash,
+    sum: i64,
+) -> CostContext<CryptoHash> {
+    let mut hasher = blake3::Hasher::new();
+    hasher.update(kv);                 // 32 bytes
+    hasher.update(left);               // 32 bytes
+    hasher.update(right);              // 32 bytes
+    hasher.update(&sum.to_be_bytes()); // 8 bytes (signed i64 BE)
+    // Same 2 hash ops as node_hash_with_count
+}
+```
+
+Hashing uses fixed 8-byte big-endian `i64::to_be_bytes()` (signed),
+**not** the varint encoding used for wire-format compactness in proof
+ops. The two are deliberately decoupled: wire wants compact, the hash
+input must be canonical and length-fixed so the verifier reconstructs the
+exact pre-image. Negative sums hash correctly because two's-complement
+big-endian is a deterministic content-binding encoding (no order
+preservation needed).
+
+A proof against a `ProvableSumTree` can return the verified total of any
+key range without revealing the underlying `SumItem` values — see
+[Aggregate Sum on Range Queries](aggregate-sum-on-range-queries.md).
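The fixed-width half of that decoupling is plain standard-library behavior and can be checked directly: `i64::to_be_bytes` maps every sum, negative included, to exactly one canonical 8-byte pre-image, and round-trips losslessly. (The varint wire side is not sketched here.)

```rust
fn main() {
    // Fixed 8-byte big-endian two's complement: one canonical pre-image
    // per i64 value, regardless of sign.
    assert_eq!(5i64.to_be_bytes(), [0, 0, 0, 0, 0, 0, 0, 5]);
    assert_eq!((-1i64).to_be_bytes(), [0xFF; 8]);
    assert_eq!(i64::MIN.to_be_bytes(), [0x80, 0, 0, 0, 0, 0, 0, 0]);

    // Lossless round-trip, so the verifier reconstructs the exact
    // pre-image from the sum carried in the proof op.
    let sum = -1_234_567_890_123i64;
    assert_eq!(i64::from_be_bytes(sum.to_be_bytes()), sum);
}
```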
+ --- diff --git a/grovedb-bulk-append-tree/src/proof/mod.rs b/grovedb-bulk-append-tree/src/proof/mod.rs index c523a69fc..d711b532c 100644 --- a/grovedb-bulk-append-tree/src/proof/mod.rs +++ b/grovedb-bulk-append-tree/src/proof/mod.rs @@ -142,6 +142,13 @@ fn query_to_ranges(query: &Query, total_count: u64) -> Result, B .into(), )); } + QueryItem::AggregateSumOnRange(_) => { + return Err(BulkAppendError::InvalidInput( + "AggregateSumOnRange is only supported on provable sum trees, \ + not on BulkAppendTree" + .into(), + )); + } }; ranges.push((start, end)); } diff --git a/grovedb-bulk-append-tree/src/proof/tests.rs b/grovedb-bulk-append-tree/src/proof/tests.rs index f9ef67e1f..6b328ad85 100644 --- a/grovedb-bulk-append-tree/src/proof/tests.rs +++ b/grovedb-bulk-append-tree/src/proof/tests.rs @@ -291,6 +291,47 @@ mod proof_tests { assert!(matches!(err, BulkAppendError::InvalidInput(_))); } + #[test] + fn test_query_to_ranges_rejects_aggregate_count_on_range() { + // BulkAppendTree has no count commitment in its node hash, so the + // index-resolution helper must reject AggregateCountOnRange + // outright rather than silently fall through. + let mut query = Query::default(); + query.items.push(QueryItem::AggregateCountOnRange(Box::new( + QueryItem::Range(pos_bytes(0)..pos_bytes(5)), + ))); + let err = super::super::query_to_ranges(&query, 10) + .expect_err("AggregateCountOnRange must be rejected"); + match err { + BulkAppendError::InvalidInput(msg) => assert!( + msg.contains("BulkAppendTree") || msg.contains("provable count"), + "unexpected message: {msg}" + ), + other => panic!("expected InvalidInput, got {:?}", other), + } + } + + #[test] + fn test_query_to_ranges_rejects_aggregate_sum_on_range() { + // Same rationale as count: BulkAppendTree has no sum commitment + // either, so AggregateSumOnRange is rejected at index resolution. 
+ let mut query = Query::default(); + query + .items + .push(QueryItem::AggregateSumOnRange(Box::new(QueryItem::Range( + pos_bytes(0)..pos_bytes(5), + )))); + let err = super::super::query_to_ranges(&query, 10) + .expect_err("AggregateSumOnRange must be rejected"); + match err { + BulkAppendError::InvalidInput(msg) => assert!( + msg.contains("BulkAppendTree") || msg.contains("provable sum"), + "unexpected message: {msg}" + ), + other => panic!("expected InvalidInput, got {:?}", other), + } + } + #[test] fn test_query_to_ranges_merges_clamps_and_filters() { let mut query = Query::default(); diff --git a/grovedb-dense-fixed-sized-merkle-tree/src/proof/mod.rs b/grovedb-dense-fixed-sized-merkle-tree/src/proof/mod.rs index 8178f48be..65f255a9a 100644 --- a/grovedb-dense-fixed-sized-merkle-tree/src/proof/mod.rs +++ b/grovedb-dense-fixed-sized-merkle-tree/src/proof/mod.rs @@ -123,6 +123,13 @@ pub(crate) fn query_to_positions(query: &Query, count: u16) -> Result, .into(), )); } + QueryItem::AggregateSumOnRange(_) => { + return Err(DenseMerkleError::InvalidProof( + "AggregateSumOnRange is only supported on provable sum trees, \ + not on dense fixed-size merkle trees" + .into(), + )); + } } } diff --git a/grovedb-dense-fixed-sized-merkle-tree/src/proof/tests.rs b/grovedb-dense-fixed-sized-merkle-tree/src/proof/tests.rs index 987612bc3..bea852aff 100644 --- a/grovedb-dense-fixed-sized-merkle-tree/src/proof/tests.rs +++ b/grovedb-dense-fixed-sized-merkle-tree/src/proof/tests.rs @@ -339,6 +339,47 @@ mod proof_tests { assert!(positions.is_empty()); } + #[test] + fn test_query_to_positions_rejects_aggregate_count_on_range() { + // Dense fixed-size trees don't bind count into the node hash, so + // AggregateCountOnRange is invalid input at the index-resolution + // helper. Exercise that rejection arm directly. 
+ use grovedb_query::QueryItem; + let mut query = Query::new(); + query.items.push(QueryItem::AggregateCountOnRange(Box::new( + QueryItem::Range(vec![0]..vec![3]), + ))); + let err = + query_to_positions(&query, 7).expect_err("AggregateCountOnRange must be rejected"); + match err { + crate::DenseMerkleError::InvalidProof(msg) => assert!( + msg.contains("dense fixed-size") || msg.contains("provable count"), + "unexpected message: {msg}" + ), + other => panic!("expected InvalidProof, got {:?}", other), + } + } + + #[test] + fn test_query_to_positions_rejects_aggregate_sum_on_range() { + // Same rationale for sum. + use grovedb_query::QueryItem; + let mut query = Query::new(); + query + .items + .push(QueryItem::AggregateSumOnRange(Box::new(QueryItem::Range( + vec![0]..vec![3], + )))); + let err = query_to_positions(&query, 7).expect_err("AggregateSumOnRange must be rejected"); + match err { + crate::DenseMerkleError::InvalidProof(msg) => assert!( + msg.contains("dense fixed-size") || msg.contains("provable sum"), + "unexpected message: {msg}" + ), + other => panic!("expected InvalidProof, got {:?}", other), + } + } + // ======================================================================= // Byte encoding: 1-byte, 2-byte, invalid // ======================================================================= diff --git a/grovedb-element/src/element/constructor.rs b/grovedb-element/src/element/constructor.rs index d18bb3d4f..db9e6a094 100644 --- a/grovedb-element/src/element/constructor.rs +++ b/grovedb-element/src/element/constructor.rs @@ -329,6 +329,42 @@ impl Element { Element::ProvableCountSumTree(maybe_root_key, count_value, sum_value, flags) } + /// Set element to default empty provable sum tree without flags. + /// + /// `ProvableSumTree` is the sum analogue of `ProvableCountTree`: it + /// bakes the per-node sum into the node hash so that aggregate-sum + /// range queries can be cryptographically verified. 
+    pub fn empty_provable_sum_tree() -> Self {
+        Element::new_provable_sum_tree(Default::default())
+    }
+
+    /// Set element to default empty provable sum tree with flags.
+    pub fn empty_provable_sum_tree_with_flags(flags: Option<ElementFlags>) -> Self {
+        Element::new_provable_sum_tree_with_flags(Default::default(), flags)
+    }
+
+    /// Set element to a provable sum tree without flags.
+    pub fn new_provable_sum_tree(maybe_root_key: Option<Vec<u8>>) -> Self {
+        Element::ProvableSumTree(maybe_root_key, 0, None)
+    }
+
+    /// Set element to a provable sum tree with flags.
+    pub fn new_provable_sum_tree_with_flags(
+        maybe_root_key: Option<Vec<u8>>,
+        flags: Option<ElementFlags>,
+    ) -> Self {
+        Element::ProvableSumTree(maybe_root_key, 0, flags)
+    }
+
+    /// Set element to a provable sum tree with flags and sum value.
+    pub fn new_provable_sum_tree_with_flags_and_sum_value(
+        maybe_root_key: Option<Vec<u8>>,
+        sum_value: SumValue,
+        flags: Option<ElementFlags>,
+    ) -> Self {
+        Element::ProvableSumTree(maybe_root_key, sum_value, flags)
+    }
+
     /// Set element to an empty commitment tree.
     ///
     /// Returns `InvalidInput` if `chunk_power > 31`.
@@ -475,20 +511,22 @@
     /// parent sum tree's running sum when inserted. Counts (if any) still
     /// propagate.
     ///
-    /// Only the four sum-tree variants are accepted: `SumTree`, `BigSumTree`,
-    /// `CountSumTree`, `ProvableCountSumTree`. Any other element — including
-    /// items, sum items, references, non-sum trees, and any wrapper
-    /// (`NonCounted`, `NotSummed`, `NotCountedOrSummed`) — is rejected with
+    /// Only the five sum-bearing tree variants are accepted: `SumTree`,
+    /// `BigSumTree`, `CountSumTree`, `ProvableCountSumTree`,
+    /// `ProvableSumTree`. Any other element — including items, sum items,
+    /// references, non-sum trees, and any wrapper (`NonCounted`,
+    /// `NotSummed`, `NotCountedOrSummed`) — is rejected with
     /// `InvalidInput`.
     pub fn new_not_summed(inner: Element) -> Result<Element, ElementError> {
         match inner {
             Element::SumTree(..)
             | Element::BigSumTree(..)
             | Element::CountSumTree(..)
- | Element::ProvableCountSumTree(..) => Ok(Element::NotSummed(Box::new(inner))), + | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) => Ok(Element::NotSummed(Box::new(inner))), _ => Err(ElementError::InvalidInput( "NotSummed inner element must be a sum-tree variant (SumTree, BigSumTree, \ - CountSumTree, or ProvableCountSumTree)", + CountSumTree, ProvableCountSumTree, or ProvableSumTree)", )), } } @@ -521,10 +559,11 @@ impl Element { /// to BOTH its parent's running sum AND its parent's count when /// inserted. /// - /// Only the four sum-tree variants are accepted: `SumTree`, `BigSumTree`, - /// `CountSumTree`, `ProvableCountSumTree`. Any other element — including - /// items, sum items, references, non-sum trees, and any wrapper - /// (`NonCounted`, `NotSummed`, `NotCountedOrSummed`) — is rejected with + /// Only the five sum-bearing tree variants are accepted: `SumTree`, + /// `BigSumTree`, `CountSumTree`, `ProvableCountSumTree`, + /// `ProvableSumTree`. Any other element — including items, sum items, + /// references, non-sum trees, and any wrapper (`NonCounted`, + /// `NotSummed`, `NotCountedOrSummed`) — is rejected with /// `InvalidInput`. /// /// Note: at insert time the parent must be `CountSumTree` or @@ -534,10 +573,11 @@ impl Element { Element::SumTree(..) | Element::BigSumTree(..) | Element::CountSumTree(..) - | Element::ProvableCountSumTree(..) => Ok(Element::NotCountedOrSummed(Box::new(inner))), + | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) 
=> Ok(Element::NotCountedOrSummed(Box::new(inner))), _ => Err(ElementError::InvalidInput( - "NotCountedOrSummed inner element must be a sum-tree variant (SumTree, \ - BigSumTree, CountSumTree, or ProvableCountSumTree)", + "NotCountedOrSummed inner element must be a sum-bearing tree variant (SumTree, \ + BigSumTree, CountSumTree, ProvableCountSumTree, or ProvableSumTree)", )), } } diff --git a/grovedb-element/src/element/helpers.rs b/grovedb-element/src/element/helpers.rs index 8dd7a1a4c..ac57d6c6d 100644 --- a/grovedb-element/src/element/helpers.rs +++ b/grovedb-element/src/element/helpers.rs @@ -100,6 +100,7 @@ impl Element { | Element::SumTree(_, sum_value, _) | Element::CountSumTree(_, _, sum_value, _) | Element::ProvableCountSumTree(_, _, sum_value, _) + | Element::ProvableSumTree(_, sum_value, _) | Element::ReferenceWithSumItem(_, _, sum_value, _) => *sum_value, _ => 0, } @@ -141,6 +142,7 @@ impl Element { Element::SumItem(sum_value, _) | Element::ItemWithSumItem(_, sum_value, _) | Element::SumTree(_, sum_value, _) + | Element::ProvableSumTree(_, sum_value, _) | Element::ReferenceWithSumItem(_, _, sum_value, _) => (1, *sum_value), Element::CountTree(_, count_value, _) => (*count_value, 0), Element::CountSumTree(_, count_value, sum_value, _) @@ -165,6 +167,7 @@ impl Element { | Element::SumTree(_, sum_value, _) | Element::CountSumTree(_, _, sum_value, _) | Element::ProvableCountSumTree(_, _, sum_value, _) + | Element::ProvableSumTree(_, sum_value, _) | Element::ReferenceWithSumItem(_, _, sum_value, _) => *sum_value as i128, Element::BigSumTree(_, sum_value, _) => *sum_value, _ => 0, @@ -254,6 +257,31 @@ impl Element { matches!(self.underlying(), Element::BigSumTree(..)) } + /// Check if the element is a provable sum tree. Looks through wrappers. + pub fn is_provable_sum_tree(&self) -> bool { + matches!(self.underlying(), Element::ProvableSumTree(..)) + } + + /// Decoded sum value from a `ProvableSumTree`. Looks through wrappers. 
+    pub fn as_provable_sum_tree_value(&self) -> Result<SumValue, ElementError> {
+        match self.underlying() {
+            Element::ProvableSumTree(_, value, _) => Ok(*value),
+            _ => Err(ElementError::WrongElementType(
+                "expected a provable sum tree",
+            )),
+        }
+    }
+
+    /// Owned variant of [`as_provable_sum_tree_value`].
+    pub fn into_provable_sum_tree_value(self) -> Result<SumValue, ElementError> {
+        match self.into_underlying() {
+            Element::ProvableSumTree(_, value, _) => Ok(value),
+            _ => Err(ElementError::WrongElementType(
+                "expected a provable sum tree",
+            )),
+        }
+    }
+
     /// Check if the element is a tree but not a sum tree. Looks through
     /// `NonCounted`.
     pub fn is_basic_tree(&self) -> bool {
@@ -271,6 +299,7 @@ impl Element {
             | Element::CountSumTree(..)
             | Element::ProvableCountTree(..)
             | Element::ProvableCountSumTree(..)
+            | Element::ProvableSumTree(..)
             | Element::CommitmentTree(..)
             | Element::MmrTree(..)
             | Element::BulkAppendTree(..)
@@ -349,6 +378,7 @@ impl Element {
             | Element::CountSumTree(Some(_), ..)
             | Element::ProvableCountTree(Some(_), ..)
             | Element::ProvableCountSumTree(Some(_), ..)
+            | Element::ProvableSumTree(Some(_), ..)
             | Element::CommitmentTree(..)
             | Element::MmrTree(..)
             | Element::BulkAppendTree(..)
@@ -373,6 +403,7 @@ impl Element {
             | Element::CountSumTree(Some(_), ..)
             | Element::ProvableCountTree(Some(_), ..)
             | Element::ProvableCountSumTree(Some(_), ..)
+            | Element::ProvableSumTree(Some(_), ..)
        )
    }
@@ -446,6 +477,7 @@ impl Element {
            | Element::CountSumTree(.., flags)
            | Element::ProvableCountTree(.., flags)
            | Element::ProvableCountSumTree(.., flags)
+            | Element::ProvableSumTree(.., flags)
            | Element::ItemWithSumItem(.., flags)
            | Element::CommitmentTree(.., flags)
            | Element::MmrTree(.., flags)
@@ -472,6 +504,7 @@ impl Element {
            | Element::CountSumTree(.., flags)
            | Element::ProvableCountTree(.., flags)
            | Element::ProvableCountSumTree(.., flags)
+            | Element::ProvableSumTree(.., flags)
            | Element::ItemWithSumItem(.., flags)
            | Element::CommitmentTree(.., flags)
            | Element::MmrTree(.., flags)
@@ -498,6 +531,7 @@ impl Element {
            | Element::CountSumTree(.., flags)
            | Element::ProvableCountTree(.., flags)
            | Element::ProvableCountSumTree(.., flags)
+            | Element::ProvableSumTree(.., flags)
            | Element::ItemWithSumItem(.., flags)
            | Element::CommitmentTree(.., flags)
            | Element::MmrTree(.., flags)
@@ -524,6 +558,7 @@ impl Element {
            | Element::CountSumTree(.., flags)
            | Element::ProvableCountTree(.., flags)
            | Element::ProvableCountSumTree(.., flags)
+            | Element::ProvableSumTree(.., flags)
            | Element::ItemWithSumItem(.., flags)
            | Element::CommitmentTree(.., flags)
            | Element::MmrTree(.., flags)
@@ -941,6 +976,143 @@ mod not_summed_tests {
    }
 }
 
+#[cfg(test)]
+mod flag_accessor_tests {
+    // Targets the per-variant match arms in `get_flags`, `get_flags_owned`,
+    // `get_flags_mut`, and `set_flags`. Each aggregate-bearing variant
+    // (CountSumTree, ProvableCountTree, ProvableCountSumTree, ProvableSumTree)
+    // gets a direct round-trip exercise so the per-variant pattern lines
+    // register as covered.
+    use crate::element::Element;
+
+    fn flags() -> Option<Vec<u8>> {
+        Some(vec![0xCA, 0xFE])
+    }
+
+    fn flags_b() -> Option<Vec<u8>> {
+        Some(vec![0xBE, 0xEF])
+    }
+
+    fn check_accessors_round_trip(initial: Element) {
+        // get_flags_owned (covers the matched arm on the consuming path).
+ let got_owned = initial.clone().get_flags_owned(); + assert_eq!(got_owned, flags()); + + // get_flags_mut (covers the &mut arm). + let mut mutable = initial.clone(); + let slot = mutable.get_flags_mut(); + assert_eq!(slot, &flags()); + *slot = flags_b(); + assert_eq!(mutable.get_flags(), &flags_b()); + + // set_flags (covers the &mut arm of set_flags). + let mut e = initial; + e.set_flags(flags_b()); + assert_eq!(e.get_flags(), &flags_b()); + } + + #[test] + fn count_sum_tree_flag_accessors() { + let e = Element::CountSumTree(None, 0, 0, flags()); + check_accessors_round_trip(e); + } + + #[test] + fn provable_count_tree_flag_accessors() { + let e = Element::ProvableCountTree(None, 0, flags()); + check_accessors_round_trip(e); + } + + #[test] + fn provable_count_sum_tree_flag_accessors() { + let e = Element::ProvableCountSumTree(None, 0, 0, flags()); + check_accessors_round_trip(e); + } + + #[test] + fn provable_sum_tree_flag_accessors() { + let e = Element::ProvableSumTree(None, 0, flags()); + check_accessors_round_trip(e); + } + + #[test] + fn item_with_sum_item_flag_accessors() { + let e = Element::ItemWithSumItem(b"x".to_vec(), 0, flags()); + check_accessors_round_trip(e); + } + + #[test] + fn commitment_tree_flag_accessors() { + let e = Element::CommitmentTree(0, 0, flags()); + check_accessors_round_trip(e); + } + + #[test] + fn mmr_tree_flag_accessors() { + let e = Element::MmrTree(0, flags()); + check_accessors_round_trip(e); + } + + #[test] + fn bulk_append_tree_flag_accessors() { + let e = Element::BulkAppendTree(0, 0, flags()); + check_accessors_round_trip(e); + } + + #[test] + fn dense_append_only_tree_flag_accessors() { + let e = Element::DenseAppendOnlyFixedSizeTree(0, 0, flags()); + check_accessors_round_trip(e); + } + + #[test] + fn big_sum_tree_flag_accessors() { + let e = Element::BigSumTree(None, 0, flags()); + check_accessors_round_trip(e); + } + + #[test] + fn count_tree_flag_accessors() { + let e = Element::CountTree(None, 0, flags()); + 
check_accessors_round_trip(e); + } + + #[test] + fn flag_accessors_delegate_through_not_summed() { + // Drives the `Element::NotSummed(inner) => inner.set_flags(...)` arm + // (and matching arms in the other two accessors). + let inner = Element::new_sum_tree_with_flags(None, flags()); + let mut wrapper = Element::new_not_summed(inner).expect("wrap ok"); + + assert_eq!(wrapper.clone().get_flags_owned(), flags()); + let slot = wrapper.get_flags_mut(); + assert_eq!(slot, &flags()); + *slot = flags_b(); + assert_eq!(wrapper.get_flags(), &flags_b()); + + wrapper.set_flags(None); + assert_eq!(wrapper.get_flags(), &None); + } + + #[test] + fn flag_accessors_delegate_through_non_counted() { + // Drives the `Element::NonCounted(inner) => ...` arm in the same + // three accessors. (Was uncovered when no test invoked set_flags via + // a NonCounted wrapper.) + let inner = Element::new_item_with_flags(b"x".to_vec(), flags()); + let mut wrapper = Element::new_non_counted(inner).expect("wrap ok"); + + assert_eq!(wrapper.clone().get_flags_owned(), flags()); + let slot = wrapper.get_flags_mut(); + assert_eq!(slot, &flags()); + *slot = flags_b(); + assert_eq!(wrapper.get_flags(), &flags_b()); + + wrapper.set_flags(None); + assert_eq!(wrapper.get_flags(), &None); + } +} + #[cfg(test)] mod not_counted_or_summed_tests { use grovedb_version::version::GroveVersion; @@ -949,12 +1121,15 @@ mod not_counted_or_summed_tests { #[test] fn new_not_counted_or_summed_wraps_sum_tree_variants() { - // All four sum-tree variants are accepted. + // All five sum-bearing tree variants are accepted, including the + // newer `ProvableSumTree` (whose own twin slot 0xC1 was hand-assigned + // because base 19 doesn't fit the `0xC0 | base` formula). 
for inner in [ Element::new_sum_tree(None), Element::new_big_sum_tree(None), Element::new_count_sum_tree(None), Element::new_provable_count_sum_tree(None), + Element::new_provable_sum_tree(None), ] { let wrapped = Element::new_not_counted_or_summed(inner.clone()).expect("wrap ok"); assert!(wrapped.is_not_counted_or_summed()); @@ -1183,8 +1358,9 @@ mod not_counted_or_summed_tests { #[test] fn deserialize_rejects_long_nested_wrapper_chain_without_recursion() { let grove_version = GroveVersion::latest(); - // 1024 NotCountedOrSummed wrapper bytes followed by a valid SumTree. - // Pre-check stops on byte 1. + // 1024 NotCountedOrSummed wrapper bytes (17 = NCOS wrapper + // discriminant) followed by a valid SumTree. Pre-check stops on + // byte 1. let mut bytes = vec![17u8; 1024]; bytes.extend_from_slice(&[4, 0, 0, 0]); let err = Element::deserialize(&bytes, grove_version) diff --git a/grovedb-element/src/element/mod.rs b/grovedb-element/src/element/mod.rs index 4d6ead51a..70d09f164 100644 --- a/grovedb-element/src/element/mod.rs +++ b/grovedb-element/src/element/mod.rs @@ -142,39 +142,42 @@ pub enum Element { /// at construction and at deserialization. NonCounted(Box), /// Not-summed wrapper: contains a sum-bearing tree variant (`SumTree`, - /// `BigSumTree`, `CountSumTree`, `ProvableCountSumTree`) and behaves - /// identically to it for storage, hashing, and its own internal sum - /// aggregate, but contributes 0 to its parent sum tree's running sum - /// when inserted. Counts still propagate. + /// `BigSumTree`, `CountSumTree`, `ProvableCountSumTree`, + /// `ProvableSumTree`) and behaves identically to it for storage, + /// hashing, and its own internal sum aggregate, but contributes 0 to + /// its parent sum tree's running sum when inserted. Counts still + /// propagate. /// /// May only be inserted into sum-bearing trees (`SumTree`, `BigSumTree`, - /// `CountSumTree`, `ProvableCountSumTree`). + /// `CountSumTree`, `ProvableCountSumTree`, `ProvableSumTree`). 
///
    /// Invariants (enforced at construction, serialization, and
    /// deserialization):
-    /// - The inner element MUST be one of the four sum-tree variants above.
+    /// - The inner element MUST be one of the five sum-bearing tree variants
+    ///   above.
    /// - A `NotSummed` may not wrap another `NotSummed`, a `NonCounted`,
-    ///   `NotCountedOrSummed`, or any non-tree element.
+    ///   a `NotCountedOrSummed`, or any non-tree element.
    NotSummed(Box<Element>),
    /// Not-counted-or-summed wrapper: contains a sum-bearing tree variant
-    /// (`SumTree`, `BigSumTree`, `CountSumTree`, `ProvableCountSumTree`) and
-    /// behaves identically to it for storage, hashing, and its own internal
-    /// aggregates, but contributes 0 to BOTH its parent's running sum AND
-    /// the parent's count when inserted.
+    /// (`SumTree`, `BigSumTree`, `CountSumTree`, `ProvableCountSumTree`,
+    /// `ProvableSumTree`) and behaves identically to it for storage,
+    /// hashing, and its own internal aggregates, but contributes 0 to
+    /// BOTH its parent's running sum AND the parent's count when inserted.
    ///
    /// May only be inserted into count-AND-sum-bearing trees (`CountSumTree`,
    /// `ProvableCountSumTree`) — the only tree types where suppressing both
    /// axes is meaningful. Any other parent rejects the wrapper at insert.
    ///
-    /// For `SumTree` / `BigSumTree` inners, this wrapper suppresses the
-    /// implicit count-of-one a subtree would contribute to a parent count
-    /// tree (it stores a sum internally, but counts as a single element).
-    /// For `CountSumTree` / `ProvableCountSumTree` inners, it suppresses
-    /// the explicit count value as well as the sum.
+    /// For `SumTree` / `BigSumTree` / `ProvableSumTree` inners, this wrapper
+    /// suppresses the implicit count-of-one a subtree would contribute to a
+    /// parent count tree (it stores a sum internally, but counts as a single
+    /// element). For `CountSumTree` / `ProvableCountSumTree` inners, it
+    /// suppresses the explicit count value as well as the sum.
///
    /// Invariants (enforced at construction, serialization, and
    /// deserialization):
-    /// - The inner element MUST be one of the four sum-tree variants above.
+    /// - The inner element MUST be one of the five sum-bearing tree variants
+    ///   above.
    /// - A `NotCountedOrSummed` may not wrap any other wrapper or any
    ///   non-tree element.
    NotCountedOrSummed(Box<Element>),
@@ -207,6 +210,20 @@ pub enum Element {
        SumValue,
        Option<ElementFlags>,
    ),
+    /// Same as Element::SumTree but includes the per-node sum in the
+    /// cryptographic state. This mirrors `ProvableCountTree` but for sums,
+    /// allowing aggregate-sum range queries to be cryptographically verified
+    /// by including the sum in each node hash.
+    ///
+    /// Discriminant 19 — appended after the wrapper bytes and the
+    /// `ReferenceWithSumItem` base variant so that the on-disk encoding of
+    /// every variant that landed on `develop` before this PR (including
+    /// `NotCountedOrSummed = 17` and `ReferenceWithSumItem = 18`) is
+    /// preserved exactly. `NotSummedProvableSumTree` and
+    /// `NotCountedOrSummedProvableSumTree` keep their hand-assigned twin
+    /// slots (0xB1 and 0xC1) because base 19 still doesn't fit the
+    /// `prefix | base` formula either.
+    ProvableSumTree(Option<Vec<u8>>, SumValue, Option<ElementFlags>),
 }
 
 pub fn hex_to_ascii(hex_value: &[u8]) -> String {
@@ -398,6 +415,17 @@ impl fmt::Display for Element {
            Element::NotSummed(inner) => {
                write!(f, "NotSummed({})", inner)
            }
+            Element::ProvableSumTree(root_key, sum_value, flags) => {
+                write!(
+                    f,
+                    "ProvableSumTree({}, {}{})",
+                    root_key.as_ref().map_or("None".to_string(), hex::encode),
+                    sum_value,
+                    flags
+                        .as_ref()
+                        .map_or(String::new(), |f| format!(", flags: {:?}", f))
+                )
+            }
            Element::NotCountedOrSummed(inner) => {
                write!(f, "NotCountedOrSummed({})", inner)
            }
@@ -441,6 +469,7 @@ impl Element {
            Element::MmrTree(..) => ElementType::MmrTree,
            Element::BulkAppendTree(..) => ElementType::BulkAppendTree,
            Element::DenseAppendOnlyFixedSizeTree(..)
=> ElementType::DenseAppendOnlyFixedSizeTree, + Element::ProvableSumTree(..) => ElementType::ProvableSumTree, Element::ReferenceWithSumItem(..) => ElementType::ReferenceWithSumItem, Element::NonCounted(inner) => match inner.element_type() { ElementType::Item => ElementType::NonCountedItem, @@ -460,6 +489,7 @@ impl Element { ElementType::DenseAppendOnlyFixedSizeTree => { ElementType::NonCountedDenseAppendOnlyFixedSizeTree } + ElementType::ProvableSumTree => ElementType::NonCountedProvableSumTree, ElementType::ReferenceWithSumItem => ElementType::NonCountedReferenceWithSumItem, // Inner is always a base type — nested wrappers are // forbidden at construction and (de)serialization. @@ -470,7 +500,8 @@ impl Element { ElementType::BigSumTree => ElementType::NotSummedBigSumTree, ElementType::CountSumTree => ElementType::NotSummedCountSumTree, ElementType::ProvableCountSumTree => ElementType::NotSummedProvableCountSumTree, - // Inner is always one of the 4 sum-tree variants above — + ElementType::ProvableSumTree => ElementType::NotSummedProvableSumTree, + // Inner is always one of the five sum-tree variants above — // construction and (de)serialization forbid anything else. // Returning the inner type is the safest fallback for the // unreachable case. @@ -523,11 +554,12 @@ impl Element { Element::SumTree(..) | Element::BigSumTree(..) | Element::CountSumTree(..) - | Element::ProvableCountSumTree(..) => {} + | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) => {} _ => { return Err(crate::error::ElementError::InvalidInput( "NotSummed inner element must be a sum-tree variant (SumTree, \ - BigSumTree, CountSumTree, or ProvableCountSumTree)", + BigSumTree, CountSumTree, ProvableCountSumTree, or ProvableSumTree)", )); } }, @@ -535,11 +567,13 @@ impl Element { Element::SumTree(..) | Element::BigSumTree(..) | Element::CountSumTree(..) - | Element::ProvableCountSumTree(..) => {} + | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) 
=> {}
            _ => {
                return Err(crate::error::ElementError::InvalidInput(
-                    "NotCountedOrSummed inner element must be a sum-tree variant \
-                     (SumTree, BigSumTree, CountSumTree, or ProvableCountSumTree)",
+                    "NotCountedOrSummed inner element must be a sum-bearing tree variant \
+                     (SumTree, BigSumTree, CountSumTree, ProvableCountSumTree, or \
+                     ProvableSumTree)",
                ));
            }
        },
@@ -612,6 +646,7 @@ mod serde_impl {
        DenseAppendOnlyFixedSizeTree(u16, u8, Option<ElementFlags>),
        NonCounted(Box<ElementShadow>),
        NotSummed(Box<ElementShadow>),
+        ProvableSumTree(Option<Vec<u8>>, SumValue, Option<ElementFlags>),
        NotCountedOrSummed(Box<ElementShadow>),
        ReferenceWithSumItem(
            ReferencePathType,
            SumValue,
            Option<ElementFlags>,
@@ -649,6 +684,7 @@
            ElementShadow::NotSummed(inner) => {
                Element::NotSummed(Box::new(Element::from(*inner)))
            }
+            ElementShadow::ProvableSumTree(k, s, f) => Element::ProvableSumTree(k, s, f),
            ElementShadow::NotCountedOrSummed(inner) => {
                Element::NotCountedOrSummed(Box::new(Element::from(*inner)))
            }
@@ -865,8 +901,8 @@ mod tests {
 
    #[test]
    fn element_type_resolves_not_summed_twins() {
-        // The four sum-tree variants each map to their NotSummed twin.
-        let cases: [(Element, ElementType); 4] = [
+        // The five sum-tree variants each map to their NotSummed twin.
+        let cases: [(Element, ElementType); 5] = [
            (
                Element::NotSummed(Box::new(Element::SumTree(None, 0, None))),
                ElementType::NotSummedSumTree,
            ),
@@ -883,6 +919,10 @@
            (
                Element::NotSummed(Box::new(Element::ProvableCountSumTree(None, 0, 0, None))),
                ElementType::NotSummedProvableCountSumTree,
            ),
+            (
+                Element::NotSummed(Box::new(Element::ProvableSumTree(None, 0, None))),
+                ElementType::NotSummedProvableSumTree,
+            ),
        ];
        for (element, expected) in cases {
            assert_eq!(element.element_type(), expected);
diff --git a/grovedb-element/src/element/serialize.rs b/grovedb-element/src/element/serialize.rs
index 6621378e9..21d3e195d 100644
--- a/grovedb-element/src/element/serialize.rs
+++ b/grovedb-element/src/element/serialize.rs
@@ -46,7 +46,8 @@ impl Element {
            Element::SumTree(..)
            | Element::BigSumTree(..)
| Element::CountSumTree(..) - | Element::ProvableCountSumTree(..) => {} + | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) => {} _ => { return Err(ElementError::CorruptedData( "NotSummed inner must be a sum-tree variant".to_string(), @@ -59,10 +60,11 @@ impl Element { Element::SumTree(..) | Element::BigSumTree(..) | Element::CountSumTree(..) - | Element::ProvableCountSumTree(..) => {} + | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) => {} _ => { return Err(ElementError::CorruptedData( - "NotCountedOrSummed inner must be a sum-tree variant".to_string(), + "NotCountedOrSummed inner must be a sum-bearing tree variant".to_string(), )); } } @@ -144,7 +146,8 @@ impl Element { Element::SumTree(..) | Element::BigSumTree(..) | Element::CountSumTree(..) - | Element::ProvableCountSumTree(..) => {} + | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) => {} _ => { return Err(ElementError::CorruptedData( "deserialized NotSummed with non-sum-tree inner".to_string(), @@ -157,10 +160,12 @@ impl Element { Element::SumTree(..) | Element::BigSumTree(..) | Element::CountSumTree(..) - | Element::ProvableCountSumTree(..) => {} + | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) 
=> {} _ => { return Err(ElementError::CorruptedData( - "deserialized NotCountedOrSummed with non-sum-tree inner".to_string(), + "deserialized NotCountedOrSummed with non-sum-bearing-tree inner" + .to_string(), )); } } diff --git a/grovedb-element/src/element/visualize.rs b/grovedb-element/src/element/visualize.rs index 8d519fc05..1e3348357 100644 --- a/grovedb-element/src/element/visualize.rs +++ b/grovedb-element/src/element/visualize.rs @@ -186,6 +186,17 @@ impl Visualize for Element { drawer = inner.visualize(drawer)?; drawer.write(b")")?; } + Element::ProvableSumTree(root_key, value, flags) => { + drawer.write(b"provable_sum_tree: ")?; + drawer = root_key.as_deref().visualize(drawer)?; + drawer.write(format!(" {value}").as_bytes())?; + + if let Some(f) = flags + && !f.is_empty() + { + drawer = f.visualize(drawer)?; + } + } Element::NotCountedOrSummed(inner) => { drawer.write(b"not_counted_or_summed(")?; drawer = inner.visualize(drawer)?; diff --git a/grovedb-element/src/element_type.rs b/grovedb-element/src/element_type.rs index 96f5fca51..58701b876 100644 --- a/grovedb-element/src/element_type.rs +++ b/grovedb-element/src/element_type.rs @@ -30,32 +30,33 @@ pub const NON_COUNTED_BASE_MASK: u8 = 0x7F; /// sum-tree base discriminants are legal as the inner byte. pub const NOT_SUMMED_WRAPPER_DISCRIMINANT: u8 = 16; -/// Twin-discriminant prefix for `NotSummedXxx` types: every twin is encoded -/// as `NOT_SUMMED_TWIN_PREFIX | base`. The prefix has the high bit set -/// (so all wrappers cluster in `0x80..` range) plus bits 4 and 5, which -/// distinguishes it from `NON_COUNTED_FLAG`'s lower range. Detection is a -/// range check: `0x80..=0xaf` is NonCounted, `0xb0..=0xbf` is NotSummed. +/// Twin-discriminant family marker for `NotSummedXxx` types. Every twin +/// lives in the range `0xB0..=0xBF` and is matched by an upper-nibble +/// compare: `disc & 0xf0 == 0xb0`. 
This keeps the NotSummed family
+/// disjoint from NonCounted (`0x80..=0xAF`) and NotCountedOrSummed
+/// (`0xC0..=0xCF`).
 ///
-/// This places a soft cap on base discriminants: NonCounted bases can use
-/// `0..=0x2F` (the low 6 bits of the twin), with bases `0..=0x0F` keeping
-/// upper-nibble 0x80 and bases `0x10..=0x1F` falling under 0x90, `0x20..=0x2F`
-/// under 0xA0. NotSummed bases must stay in `0..=0x0F` because the prefix
-/// only leaves the low nibble. Today only four NotSummed bases are allowed
-/// (4, 5, 7, 10) so this is not a constraint.
+/// **No bitwise OR formula is used to compute the twin discriminant from
+/// the base.** Each twin is assigned a specific value out of the 16 slots
+/// in the family range, and resolution in both directions
+/// (`from_serialized_value`'s `inner_byte → twin` and `base()`'s
+/// `twin → base`) is done by an explicit per-variant match. This avoids
+/// the constraint that the previous "`prefix | base`" formula imposed
+/// (base discriminants had to fit in the low nibble), so a new
+/// sum-tree base at e.g. discriminant 19 can have an arbitrary twin slot
+/// like `0xB1 = 177` without colliding with the formula's collapsed
+/// `0xb0 | 19 → 0xb3 → base 3 (SumItem)` interpretation.
 pub const NOT_SUMMED_TWIN_PREFIX: u8 = 0xb0;
 
-/// Mask to recover the base type discriminant from a `NotSummedXxx`
-/// discriminant. Base discriminants are `0..=14` (4 bits) so masking the
-/// low nibble is sufficient.
-pub const NOT_SUMMED_BASE_MASK: u8 = 0x0F;
-
 /// Bincode discriminant byte for `Element::NotCountedOrSummed`. Tied to the
-/// declaration order of the `Element` enum (0-indexed, 18th variant).
+/// declaration order of the `Element` enum (variant index 17, 0-indexed).
+/// Matches develop's positioning (this branch preserves that ordering and
+/// places its own `ProvableSumTree` addition at slot 19 instead).
 ///
 /// Like the other two wrapper discriminants, this byte has no direct
 /// `ElementType` variant.
`from_serialized_value` reads the next byte and -/// resolves to one of the four `NotCountedOrSummedXxx` synthetic twins. Only -/// the four sum-tree base discriminants are legal as the inner byte. +/// resolves to one of the synthetic `NotCountedOrSummedXxx` twins. Only +/// the sum-bearing tree base discriminants are legal as the inner byte. pub const NOT_COUNTED_OR_SUMMED_WRAPPER_DISCRIMINANT: u8 = 17; /// Twin-discriminant prefix for `NotCountedOrSummedXxx` types: every twin is @@ -65,12 +66,6 @@ pub const NOT_COUNTED_OR_SUMMED_WRAPPER_DISCRIMINANT: u8 = 17; /// `NOT_SUMMED_TWIN_PREFIX`'s `0xb0` upper-nibble. Detection is therefore /// an upper-nibble compare: `disc & 0xf0 == 0xc0`. pub const NOT_COUNTED_OR_SUMMED_TWIN_PREFIX: u8 = 0xc0; - -/// Mask to recover the base type discriminant from a -/// `NotCountedOrSummedXxx` discriminant. Base discriminants are `0..=14` -/// (4 bits) so masking the low nibble is sufficient. -pub const NOT_COUNTED_OR_SUMMED_BASE_MASK: u8 = 0x0F; - /// Indicates which type of proof node should be used when generating proofs. /// /// This determines whether the verifier will recompute the value hash (secure) @@ -145,33 +140,80 @@ pub enum ProofNodeType { /// /// Used for: Reference (inside ProvableCountTree) KvRefValueHashCount, + + /// Use `Node::KVSum` - sum analogue of `KvCount`. The verifier + /// recomputes `value_hash = H(value)` and includes the i64 sum in the + /// node hash via `node_hash_with_sum`. + /// + /// Used for: Item, SumItem, ItemWithSumItem (inside ProvableSumTree) + KvSum, + + /// Use `Node::KVRefValueHashSum` - sum analogue of `KvRefValueHashCount`. + /// At the merk layer, this generates `KVValueHashFeatureType` (since + /// merk doesn't know about references). GroveDB post-processes these + /// nodes to `Node::KVRefValueHashSum` with the dereferenced value. + /// + /// Used for: Reference (inside ProvableSumTree) + KvRefValueHashSum, } /// Element type discriminants. 
/// -/// Base types (0..=14) match the bincode serialization order of the `Element` -/// enum. Non-counted twins (128..=142) are synthetic — they encode "this is a -/// `NonCounted` wrapper around an inner element of base type -/// `disc & 0x7F`". The on-disk representation of `Element::NonCounted` still -/// uses the wrapper byte `NON_COUNTED_WRAPPER_DISCRIMINANT` (15) followed by -/// the inner element's bytes; `from_serialized_value` synthesizes the +/// Base types (0..=14, 18, 19) match the bincode serialization order of +/// the `Element` enum. The `Element` enum has indices 15, 16, and 17 +/// reserved for the `NonCounted`, `NotSummed`, and `NotCountedOrSummed` +/// wrapper variants respectively (none has a direct `ElementType` +/// variant — each synthesizes twin discriminants by reading the inner +/// element's byte). Slot 18 is `ReferenceWithSumItem` and slot 19 is +/// `ProvableSumTree`. +/// +/// Non-counted twins are synthetic — they encode "this is a `NonCounted` +/// wrapper around an inner element of base type `disc & ...`". Twins for +/// base discriminants 0..=14 live in 128..=142 (`0x80 | base`). The twin +/// for `ReferenceWithSumItem` (base 18) is placed at 146 (`0x80 | 18 = +/// 0x92`), and the twin for `ProvableSumTree` (base 19) is placed at 147 +/// (`0x80 | 19 = 0x93`). All NonCounted twins satisfy the range check +/// `0x80 <= disc < 0xB0` — the lower range distinguishes them from the +/// NotSummed family that starts at `0xB0`. The on-disk representation of +/// `Element::NonCounted` still uses the wrapper byte +/// `NON_COUNTED_WRAPPER_DISCRIMINANT` (15) followed by the inner +/// element's bytes; `from_serialized_value` synthesizes the /// `NonCountedXxx` variant by peeking at the second byte. /// -/// Not-summed twins follow the same scheme but use the prefix `0xb0` and only -/// cover the four sum-tree base discriminants (4, 5, 7, 10), placing them at -/// `180, 181, 183, 186`. 
The wrapper byte is `NOT_SUMMED_WRAPPER_DISCRIMINANT` -/// (16). Not-counted-or-summed twins use the prefix `0xc0` over the same four -/// sum-tree base discriminants, placing them at `196, 197, 199, 202`. The -/// wrapper byte is `NOT_COUNTED_OR_SUMMED_WRAPPER_DISCRIMINANT` (17). All -/// three wrapper twin ranges have bit 7 set, so wrappers cluster in `0x80..`, -/// and the upper nibble distinguishes them: `0x80` for `NonCounted`, `0xb0` -/// for `NotSummed`, `0xc0` for `NotCountedOrSummed`. The three wrappers are -/// mutually exclusive — the constructors and (de)serializers reject any -/// nesting in either direction. +/// Not-summed twins use the prefix `0xb0` over the five sum-bearing tree +/// bases (4, 5, 7, 10, 19), placing them at `180, 181, 183, 186, 177`. +/// The wrapper byte is `NOT_SUMMED_WRAPPER_DISCRIMINANT` (16). +/// Not-counted-or-summed twins use the prefix `0xc0` over the same five +/// sum-bearing tree bases, placing them at `196, 197, 199, 202, 193`. +/// The wrapper byte is `NOT_COUNTED_OR_SUMMED_WRAPPER_DISCRIMINANT` +/// (17). +/// +/// All three wrapper twin ranges have bit 7 set, so wrappers cluster in +/// `0x80..`, and the upper nibble distinguishes them: `0x80..=0xAF` is +/// `NonCounted`, `0xB0..=0xBF` is `NotSummed`, `0xC0..=0xCF` is +/// `NotCountedOrSummed`. The three wrappers are mutually exclusive — +/// the constructors and (de)serializers reject any nesting in either +/// direction. /// -/// IMPORTANT: Base values (0..=14) must match the order of variants in the -/// `Element` enum. The `test_element_serialization_discriminants_match_element_type` -/// test catches drift. +/// Not-summed twins cluster in the `0xB0..=0xBF` family range (matched +/// via `disc & 0xf0 == 0xb0`). Unlike NonCounted, twin slots are +/// **assigned explicitly per variant** rather than computed via a +/// `prefix | base` formula. 
The five legal sum-bearing tree inner types +/// are mapped 1-to-1: +/// SumTree (base 4) -> 180 +/// BigSumTree (base 5) -> 181 +/// CountSumTree (base 7) -> 183 +/// ProvableCountSumTree (base 10)-> 186 +/// ProvableSumTree (base 19) -> 177 (hand-assigned; base overflows +/// the low nibble of the formula) +/// The wrapper byte on disk is `NOT_SUMMED_WRAPPER_DISCRIMINANT` (16), +/// and `from_serialized_value` resolves `[16, inner_byte]` to the matching +/// twin via an explicit `inner_byte → twin` match. +/// +/// IMPORTANT: Base values (0..=14, 18, 19) must match the order of +/// variants in the `Element` enum. The +/// `test_element_serialization_discriminants_match_element_type` test +/// catches drift. #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)] #[repr(u8)] pub enum ElementType { @@ -209,11 +251,18 @@ pub enum ElementType { // direct ElementType variant. // 16 is reserved as the on-disk wrapper byte (NotSummed) and has no // direct ElementType variant. + // 17 is reserved as the on-disk wrapper byte (NotCountedOrSummed) and + // has no direct ElementType variant. /// Reference that also carries an explicit `SumValue` - discriminant 18. /// Resolves like `Reference` on `get()` and propagates `sum_value` into /// sum-bearing parents like `SumItem` / `ItemWithSumItem`. See /// `Element::ReferenceWithSumItem` for full semantics. ReferenceWithSumItem = 18, + /// Provable sum tree - discriminant 19 (sums baked into node hashes). + /// Placed at the end of the base-variant range so this branch preserves + /// develop's `NotCountedOrSummed = 17` / `ReferenceWithSumItem = 18` + /// wire encoding exactly. 
+ ProvableSumTree = 19, /// Non-counted wrapper around `Item` - discriminant 128 NonCountedItem = 128, /// Non-counted wrapper around `Reference` - discriminant 129 @@ -246,14 +295,20 @@ pub enum ElementType { NonCountedDenseAppendOnlyFixedSizeTree = 142, /// Non-counted wrapper around `ReferenceWithSumItem` - discriminant 146 (`0x80 | 18`) NonCountedReferenceWithSumItem = 146, - /// Not-summed wrapper around `SumTree` - discriminant 180 (`0xb0 | 4`) + /// Non-counted wrapper around `ProvableSumTree` - discriminant 147 (`0x80 | 19`) + NonCountedProvableSumTree = 147, + /// Not-summed wrapper around `SumTree` - discriminant 180 (`0xB4`) NotSummedSumTree = 180, - /// Not-summed wrapper around `BigSumTree` - discriminant 181 (`0xb0 | 5`) + /// Not-summed wrapper around `BigSumTree` - discriminant 181 (`0xB5`) NotSummedBigSumTree = 181, - /// Not-summed wrapper around `CountSumTree` - discriminant 183 (`0xb0 | 7`) + /// Not-summed wrapper around `CountSumTree` - discriminant 183 (`0xB7`) NotSummedCountSumTree = 183, - /// Not-summed wrapper around `ProvableCountSumTree` - discriminant 186 (`0xb0 | 10`) + /// Not-summed wrapper around `ProvableCountSumTree` - discriminant 186 (`0xBA`) NotSummedProvableCountSumTree = 186, + /// Not-summed wrapper around `ProvableSumTree` - discriminant 177 (`0xB1`), + /// assigned explicitly out of the `0xB0..=0xBF` family range. Not derived + /// from any formula — see the doc comment on `NOT_SUMMED_TWIN_PREFIX`. 
+ NotSummedProvableSumTree = 177, /// Not-counted-or-summed wrapper around `SumTree` - discriminant 196 (`0xc0 | 4`) NotCountedOrSummedSumTree = 196, /// Not-counted-or-summed wrapper around `BigSumTree` - discriminant 197 (`0xc0 | 5`) @@ -262,6 +317,11 @@ pub enum ElementType { NotCountedOrSummedCountSumTree = 199, /// Not-counted-or-summed wrapper around `ProvableCountSumTree` - discriminant 202 (`0xc0 | 10`) NotCountedOrSummedProvableCountSumTree = 202, + /// Not-counted-or-summed wrapper around `ProvableSumTree` - discriminant + /// 193 (`0xC1`), assigned explicitly out of the `0xC0..=0xCF` family + /// range. Like `NotSummedProvableSumTree` it can't use the + /// `prefix | base` formula because base 19 overflows the low nibble. + NotCountedOrSummedProvableSumTree = 193, } impl ElementType { @@ -290,22 +350,23 @@ impl ElementType { "NonCounted wrapper has no inner element discriminant byte".to_string(), ) })?; - // The inner discriminant must be a legal base type. Today those - // are `0..=14` plus `18` (ReferenceWithSumItem). Bytes 15, 16, - // and 17 are the wrapper bytes themselves (nested wrappers - // forbidden); 19..=127 are unallocated; 128..=142 + 146 are the - // synthetic NonCountedXxx twins which never appear on disk. + // The inner discriminant must be a legal base type. Today + // those are `0..=14`, `18` (ReferenceWithSumItem), and `19` + // (ProvableSumTree). Bytes 15, 16, and 17 are the wrapper bytes + // themselves (nested wrappers forbidden in either direction); + // 20..=127 are unallocated; 128..=147 are the synthetic + // NonCountedXxx twins which never appear on disk. // Without this check, the bitwise OR below would collapse // `0x80 | inner_byte` into `inner_byte` and a payload like // `[15, 128, ...]` would silently parse as `NonCountedItem`. 
// - // Use an explicit allowlist so the next base-variant addition is - // a one-line edit here and the check stays robust to new + // Use an explicit allowlist so the next base-variant addition + // is a one-line edit here and the check stays robust to new // variants landing without updating the guard. - if !matches!(inner_byte, 0..=14 | 18) { + if !matches!(inner_byte, 0..=14 | 18 | 19) { return Err(ElementError::CorruptedData(format!( "NonCounted inner discriminant must be a base type \ - (0..=14 or 18), got {}", + (0..=14, 18, or 19), got {}", inner_byte ))); } @@ -316,16 +377,22 @@ impl ElementType { "NotSummed wrapper has no inner element discriminant byte".to_string(), ) })?; - // Only the four sum-tree base discriminants are legal here. - // Anything else — including the wrapper bytes 15/16/17, the - // synthetic twin ranges, and the unrelated base types — is - // rejected so that round-tripping `from_serialized_value` always - // yields a valid `NotSummedXxx` twin. + // Only the five sum-bearing tree base discriminants are legal + // here. Each is mapped explicitly to its assigned NotSummed + // twin — no `prefix | inner_byte` formula is used, because + // some twin slots are hand-assigned (see NOT_SUMMED_TWIN_PREFIX + // doc). Anything else is rejected so that round-tripping + // `from_serialized_value` always yields a valid `NotSummedXxx`. 
match inner_byte { - 4 | 5 | 7 | 10 => Self::try_from(NOT_SUMMED_TWIN_PREFIX | inner_byte), + 4 => Ok(ElementType::NotSummedSumTree), + 5 => Ok(ElementType::NotSummedBigSumTree), + 7 => Ok(ElementType::NotSummedCountSumTree), + 10 => Ok(ElementType::NotSummedProvableCountSumTree), + 19 => Ok(ElementType::NotSummedProvableSumTree), _ => Err(ElementError::CorruptedData(format!( - "NotSummed inner discriminant must be a sum-tree base type \ - (4=SumTree, 5=BigSumTree, 7=CountSumTree, 10=ProvableCountSumTree), got {}", + "NotSummed inner discriminant must be a sum-bearing tree base type \ + (4=SumTree, 5=BigSumTree, 7=CountSumTree, 10=ProvableCountSumTree, \ + 19=ProvableSumTree), got {}", inner_byte ))), } @@ -335,16 +402,23 @@ impl ElementType { "NotCountedOrSummed wrapper has no inner element discriminant byte".to_string(), ) })?; - // Same whitelist as NotSummed: only the four sum-bearing tree + // Same whitelist as NotSummed: only the five sum-bearing tree // base discriminants are legal here. The new wrapper additionally // suppresses the parent count contribution (the implicit +1 for - // SumTree/BigSumTree, or the explicit count for CountSumTree / - // ProvableCountSumTree). + // SumTree/BigSumTree/ProvableSumTree, or the explicit count for + // CountSumTree / ProvableCountSumTree). `ProvableSumTree` (base + // 19) is mapped explicitly because its twin slot 0xC1 doesn't + // fall out of the bitwise `prefix | base` formula. 
match inner_byte { - 4 | 5 | 7 | 10 => Self::try_from(NOT_COUNTED_OR_SUMMED_TWIN_PREFIX | inner_byte), + 4 => Ok(ElementType::NotCountedOrSummedSumTree), + 5 => Ok(ElementType::NotCountedOrSummedBigSumTree), + 7 => Ok(ElementType::NotCountedOrSummedCountSumTree), + 10 => Ok(ElementType::NotCountedOrSummedProvableCountSumTree), + 19 => Ok(ElementType::NotCountedOrSummedProvableSumTree), _ => Err(ElementError::CorruptedData(format!( - "NotCountedOrSummed inner discriminant must be a sum-tree base type \ - (4=SumTree, 5=BigSumTree, 7=CountSumTree, 10=ProvableCountSumTree), got {}", + "NotCountedOrSummed inner discriminant must be a sum-bearing tree base \ + type (4=SumTree, 5=BigSumTree, 7=CountSumTree, \ + 10=ProvableCountSumTree, 19=ProvableSumTree), got {}", inner_byte ))), } @@ -355,15 +429,23 @@ impl ElementType { /// Returns true if this is a `NonCountedXxx` discriminant. Range check: /// `NonCountedXxx` lives in `0x80..=0xaf`, `NotSummedXxx` in - /// `0xb0..=0xbf`. Both ranges have bit 7 set but only NonCounted's upper - /// nibble stays below `0xb`. + /// `0xb0..=0xbf`, `NotCountedOrSummedXxx` in `0xc0..=0xcf`. All three + /// ranges have bit 7 set but only NonCounted's upper nibble stays + /// below `0xb`. The widened upper bound (was `0x9F` in earlier + /// revisions) makes room for synthetic twins of higher-numbered base + /// discriminants like `NonCountedProvableSumTree = 0x93` (twin of + /// base 19) and `NonCountedReferenceWithSumItem = 0x92` (twin of + /// base 18). #[inline] pub const fn is_non_counted(self) -> bool { let disc = self as u8; disc >= NON_COUNTED_FLAG && disc < NOT_SUMMED_TWIN_PREFIX } - /// Returns true if this is a `NotSummedXxx` discriminant. + /// Returns true if this is a `NotSummedXxx` discriminant. The + /// NotSummed family spans `0xB0..=0xBF` (16 explicit slots, no + /// formula); matched via `disc & 0xf0 == 0xb0`. NonCounted lives at + /// `0x80..=0xAF` so the two families stay disjoint. 
#[inline] pub const fn is_not_summed(self) -> bool { (self as u8) & 0xf0 == NOT_SUMMED_TWIN_PREFIX @@ -378,32 +460,44 @@ impl ElementType { /// Returns the underlying base ElementType, stripping any wrapper flag /// bits. For base types, returns `self` unchanged. /// - /// The three wrapper twin ranges share bit 7 but are distinguished by - /// the upper nibble (`0x80` for `NonCounted`, `0xb0` for `NotSummed`, - /// `0xc0` for `NotCountedOrSummed`). Constructors and (de)serializers - /// reject any wrapper nesting, so only one wrapper status is ever set - /// on any valid `ElementType` instance. + /// The three wrapper families occupy disjoint ranges: `NonCounted` at + /// `0x80..=0x9F` (`& 0xe0 == 0x80`), `NotSummed` at `0xB0..=0xBF` + /// (`& 0xf0 == 0xb0`), and `NotCountedOrSummed` at `0xC0..=0xCF` + /// (`& 0xf0 == 0xc0`). Constructors and (de)serializers reject any + /// wrapper nesting, so only one wrapper status is ever set on any + /// valid `ElementType` instance. + /// + /// `NonCounted` uses the bitwise formula `base | 0x80` (all base + /// discriminants fit in the low 5 bits), so its inverse mask works + /// uniformly. `NotSummed` and `NotCountedOrSummed` use **explicit + /// per-variant mapping** because their twin slots are hand-assigned + /// rather than computed — `NotSummedProvableSumTree = 0xB1` and + /// `NotCountedOrSummedProvableSumTree = 0xC1` would collide with the + /// `disc & 0x0F → base 1 (Reference)` interpretation if a bitwise + /// inverse were used. #[inline] pub fn base(self) -> ElementType { - let disc = self as u8; if self.is_non_counted() { // Safe: every NonCountedXxx is constructed from a valid base - // discriminant 0..=14, so masking the high bit yields a valid - // base discriminant. - ElementType::try_from(disc & NON_COUNTED_BASE_MASK) + // discriminant whose low 5 bits fit cleanly under 0x80. 
+ ElementType::try_from((self as u8) & NON_COUNTED_BASE_MASK) .expect("NonCounted twin always has a valid base") - } else if self.is_not_summed() { - // Safe: every NotSummedXxx is constructed from one of the four - // sum-tree base discriminants {4, 5, 7, 10}. - ElementType::try_from(disc & NOT_SUMMED_BASE_MASK) - .expect("NotSummed twin always has a valid base") - } else if self.is_not_counted_or_summed() { - // Safe: every NotCountedOrSummedXxx is constructed from one of - // the four sum-tree base discriminants {4, 5, 7, 10}. - ElementType::try_from(disc & NOT_COUNTED_OR_SUMMED_BASE_MASK) - .expect("NotCountedOrSummed twin always has a valid base") } else { - self + match self { + ElementType::NotSummedSumTree => ElementType::SumTree, + ElementType::NotSummedBigSumTree => ElementType::BigSumTree, + ElementType::NotSummedCountSumTree => ElementType::CountSumTree, + ElementType::NotSummedProvableCountSumTree => ElementType::ProvableCountSumTree, + ElementType::NotSummedProvableSumTree => ElementType::ProvableSumTree, + ElementType::NotCountedOrSummedSumTree => ElementType::SumTree, + ElementType::NotCountedOrSummedBigSumTree => ElementType::BigSumTree, + ElementType::NotCountedOrSummedCountSumTree => ElementType::CountSumTree, + ElementType::NotCountedOrSummedProvableCountSumTree => { + ElementType::ProvableCountSumTree + } + ElementType::NotCountedOrSummedProvableSumTree => ElementType::ProvableSumTree, + other => other, + } } } @@ -452,32 +546,54 @@ impl ElementType { #[inline] pub fn proof_node_type(&self, parent_tree_type: Option) -> ProofNodeType { let parent_base = parent_tree_type.map(|t| t.base()); + // "Provable aggregate parents" are those that bake the per-node + // aggregate into the node hash. The count family + // (`ProvableCountTree`, `ProvableCountSumTree`) hashes the count; + // the sum family (`ProvableSumTree`) hashes the sum. + // + // The dispatch distinguishes the two families. 
Item / Reference proof + // variants diverge (KvSum / KvRefValueHashSum vs KvCount / + // KvRefValueHashCount). Subtrees inside either family still use + // `KvValueHashFeatureType` — the feature_type field on that variant + // carries both the count and sum in their respective tagged + // TreeFeatureType variants, so a single proof-node variant suffices + // for the subtree case. let is_provable_count_tree = matches!( parent_base, Some(ElementType::ProvableCountTree) | Some(ElementType::ProvableCountSumTree) ); + let is_provable_sum_tree = matches!(parent_base, Some(ElementType::ProvableSumTree)); + let is_provable_aggregate_tree = is_provable_count_tree || is_provable_sum_tree; let base = self.base(); if base.has_simple_value_hash() { // Items (Item, SumItem, ItemWithSumItem) if is_provable_count_tree { ProofNodeType::KvCount + } else if is_provable_sum_tree { + ProofNodeType::KvSum } else { ProofNodeType::Kv } } else if base.is_reference() { // References need combined hash (for reference resolution). - // In ProvableCountTree, they also need the count in node_hash. - // GroveDB post-processes these to KVRefValueHash/KVRefValueHashCount. + // In ProvableCountTree they additionally need the count in + // node_hash; in ProvableSumTree they need the sum. + // GroveDB post-processes these to KVRefValueHash / + // KVRefValueHashCount / KVRefValueHashSum. if is_provable_count_tree { ProofNodeType::KvRefValueHashCount + } else if is_provable_sum_tree { + ProofNodeType::KvRefValueHashSum } else { ProofNodeType::KvRefValueHash } } else { // Subtrees (Tree, SumTree, BigSumTree, CountTree, CountSumTree, - // ProvableCountTree) - if is_provable_count_tree { + // ProvableCountTree, ProvableSumTree). KvValueHashFeatureType + // works for both Count and Sum families because the embedded + // `TreeFeatureType` carries the aggregate. 
+ if is_provable_aggregate_tree { ProofNodeType::KvValueHashFeatureType } else { ProofNodeType::KvValueHash @@ -526,6 +642,7 @@ impl ElementType { | ElementType::CountSumTree | ElementType::ProvableCountTree | ElementType::ProvableCountSumTree + | ElementType::ProvableSumTree | ElementType::CommitmentTree | ElementType::MmrTree | ElementType::BulkAppendTree @@ -574,6 +691,7 @@ impl ElementType { ElementType::BulkAppendTree => "bulk_append_tree", ElementType::DenseAppendOnlyFixedSizeTree => "dense_tree", ElementType::ReferenceWithSumItem => "reference with sum item", + ElementType::ProvableSumTree => "provable sum tree", ElementType::NonCountedItem => "non_counted item", ElementType::NonCountedReference => "non_counted reference", ElementType::NonCountedTree => "non_counted tree", @@ -590,16 +708,21 @@ impl ElementType { ElementType::NonCountedBulkAppendTree => "non_counted bulk_append_tree", ElementType::NonCountedDenseAppendOnlyFixedSizeTree => "non_counted dense_tree", ElementType::NonCountedReferenceWithSumItem => "non_counted reference with sum item", + ElementType::NonCountedProvableSumTree => "non_counted provable sum tree", ElementType::NotSummedSumTree => "not_summed sum tree", ElementType::NotSummedBigSumTree => "not_summed big sum tree", ElementType::NotSummedCountSumTree => "not_summed count sum tree", ElementType::NotSummedProvableCountSumTree => "not_summed provable count sum tree", + ElementType::NotSummedProvableSumTree => "not_summed provable sum tree", ElementType::NotCountedOrSummedSumTree => "not_counted_or_summed sum tree", ElementType::NotCountedOrSummedBigSumTree => "not_counted_or_summed big sum tree", ElementType::NotCountedOrSummedCountSumTree => "not_counted_or_summed count sum tree", ElementType::NotCountedOrSummedProvableCountSumTree => { "not_counted_or_summed provable count sum tree" } + ElementType::NotCountedOrSummedProvableSumTree => { + "not_counted_or_summed provable sum tree" + } } } } @@ -634,6 +757,7 @@ impl TryFrom for 
ElementType { // 16 is the raw NotSummed wrapper byte; same treatment. // 17 is the raw NotCountedOrSummed wrapper byte; same treatment. 18 => Ok(ElementType::ReferenceWithSumItem), + 19 => Ok(ElementType::ProvableSumTree), 128 => Ok(ElementType::NonCountedItem), 129 => Ok(ElementType::NonCountedReference), 130 => Ok(ElementType::NonCountedTree), @@ -650,10 +774,20 @@ impl TryFrom for ElementType { 141 => Ok(ElementType::NonCountedBulkAppendTree), 142 => Ok(ElementType::NonCountedDenseAppendOnlyFixedSizeTree), 146 => Ok(ElementType::NonCountedReferenceWithSumItem), + 147 => Ok(ElementType::NonCountedProvableSumTree), + // NotSummed twins occupy the 0xB0..=0xBF family range; slots + // are assigned explicitly per variant. + 177 => Ok(ElementType::NotSummedProvableSumTree), 180 => Ok(ElementType::NotSummedSumTree), 181 => Ok(ElementType::NotSummedBigSumTree), 183 => Ok(ElementType::NotSummedCountSumTree), 186 => Ok(ElementType::NotSummedProvableCountSumTree), + // NotCountedOrSummed twins occupy the 0xC0..=0xCF family range. + // Bases 4/5/7/10 use the bitwise `0xC0 | base` formula; base 19 + // (ProvableSumTree) is hand-assigned to 0xC1 (193) because its + // base overflows the low nibble and would collide with + // `disc & 0x0F → SumItem` under a uniform mask. + 193 => Ok(ElementType::NotCountedOrSummedProvableSumTree), 196 => Ok(ElementType::NotCountedOrSummedSumTree), 197 => Ok(ElementType::NotCountedOrSummedBigSumTree), 199 => Ok(ElementType::NotCountedOrSummedCountSumTree), @@ -724,11 +858,18 @@ mod tests { ElementType::try_from(18).unwrap(), ElementType::ReferenceWithSumItem ); - // 19..=127 are unallocated and invalid. - assert!(ElementType::try_from(19).is_err()); + // Base discriminant 19 is ProvableSumTree. + assert_eq!( + ElementType::try_from(19).unwrap(), + ElementType::ProvableSumTree + ); + // 20..=127 are unallocated and invalid. 
+ assert!(ElementType::try_from(20).is_err()); assert!(ElementType::try_from(100).is_err()); - // NonCounted twins (0x80 | base): 128..142 plus 146 (twin of base 18). + // NonCounted twins (0x80 | base): 128..142, plus 146 (= 0x80|18 = + // ReferenceWithSumItem twin) and 147 (= 0x80|19 = ProvableSumTree + // twin). assert_eq!( ElementType::try_from(128).unwrap(), ElementType::NonCountedItem @@ -745,6 +886,10 @@ mod tests { ElementType::try_from(146).unwrap(), ElementType::NonCountedReferenceWithSumItem ); + assert_eq!( + ElementType::try_from(147).unwrap(), + ElementType::NonCountedProvableSumTree + ); // Bytes between the base and NonCounted-twin ranges are invalid. assert!(ElementType::try_from(127).is_err()); // 143 (= 0x80|15), 144 (= 0x80|16), 145 (= 0x80|17): wrapper bytes @@ -752,13 +897,18 @@ mod tests { assert!(ElementType::try_from(143).is_err()); assert!(ElementType::try_from(144).is_err()); assert!(ElementType::try_from(145).is_err()); - // 147..=179 (between NonCounted-twin and NotSummed-twin ranges) are - // also invalid. - assert!(ElementType::try_from(147).is_err()); - assert!(ElementType::try_from(179).is_err()); - - // NotSummed twins (0xb0 | base): only the four sum-tree bases - // {4, 5, 7, 10} are legal → discriminants {180, 181, 183, 186}. + // 148..=176 (the gap between the NonCounted-twin range and the first + // populated NotSummed-twin slot, 177 = NotSummedProvableSumTree) are + // invalid. + assert!(ElementType::try_from(148).is_err()); + assert!(ElementType::try_from(176).is_err()); + + // NotSummed twins live in 0xB0..=0xBF with explicit per-variant + // slot assignments — not a formula.
Five slots are populated: + // SumTree -> 180 (0xB4) + // BigSumTree -> 181 (0xB5) + // CountSumTree -> 183 (0xB7) + // ProvableCountSumTree -> 186 (0xBA) + // ProvableSumTree -> 177 (0xB1) assert_eq!( ElementType::try_from(180).unwrap(), ElementType::NotSummedSumTree @@ -775,19 +925,14 @@ mod tests { ElementType::try_from(186).unwrap(), ElementType::NotSummedProvableCountSumTree ); - // Other bytes in 0xb0..=0xbe (non-sum-tree bases) are invalid. + assert_eq!( + ElementType::try_from(177).unwrap(), + ElementType::NotSummedProvableSumTree + ); + // All unallocated slots in 0xB0..=0xBF are invalid. for bad in [ - 0xb0u8, // base 0 (Item) — not a sum-tree variant - 0xb1, // base 1 (Reference) - 0xb2, // base 2 (Tree) - 0xb3, // base 3 (SumItem) — leaf, not a tree - 0xb6, // base 6 (CountTree) - 0xb8, // base 8 (ProvableCountTree) - 0xb9, // base 9 (ItemWithSumItem) - 0xbb, // base 11 (CommitmentTree) - 0xbc, // base 12 (MmrTree) - 0xbd, // base 13 (BulkAppendTree) - 0xbe, // base 14 (DenseAppendOnlyFixedSizeTree) + 0xb0u8, // family prefix byte (would-be base 0 = Item slot) — never a twin + 0xb2, 0xb3, 0xb6, 0xb8, 0xb9, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf, ] { assert!( ElementType::try_from(bad).is_err(), @@ -799,8 +944,11 @@ mod tests { assert!(ElementType::try_from(187).is_err()); assert!(ElementType::try_from(195).is_err()); - // NotCountedOrSummed twins (0xc0 | base): only the four sum-tree - // bases {4, 5, 7, 10} are legal → discriminants {196, 197, 199, 202}. + // NotCountedOrSummed twins: the five sum-bearing tree bases + // {4, 5, 7, 10, 19} are legal → discriminants {196, 197, 199, + // 202, 193}. `0xc0 | base` only works for bases 4/5/7/10; + // ProvableSumTree (base 19) gets an explicit hand-assigned slot + // at 0xC1 (193), same shape as NotSummedProvableSumTree at 0xB1.
assert_eq!( ElementType::try_from(196).unwrap(), ElementType::NotCountedOrSummedSumTree @@ -817,9 +965,13 @@ mod tests { ElementType::try_from(202).unwrap(), ElementType::NotCountedOrSummedProvableCountSumTree ); - // Other bytes in 0xc0..=0xce (non-sum-tree bases) are invalid. + assert_eq!( + ElementType::try_from(193).unwrap(), + ElementType::NotCountedOrSummedProvableSumTree + ); + // Other bytes in 0xc0..=0xcf (non-sum-bearing-tree bases) are invalid. for bad in [ - 0xc0u8, 0xc1, 0xc2, 0xc3, 0xc6, 0xc8, 0xc9, 0xcb, 0xcc, 0xcd, 0xce, + 0xc0u8, 0xc2, 0xc3, 0xc6, 0xc8, 0xc9, 0xcb, 0xcc, 0xcd, 0xce, 0xcf, ] { assert!( ElementType::try_from(bad).is_err(), @@ -834,12 +986,21 @@ mod tests { #[test] fn test_non_counted_helpers() { - // is_non_counted: upper-nibble compare against 0x80. + // is_non_counted: range check against 0x80..=0xAF. assert!(!ElementType::Item.is_non_counted()); assert!(!ElementType::Tree.is_non_counted()); + assert!(!ElementType::ProvableSumTree.is_non_counted()); + assert!(!ElementType::ReferenceWithSumItem.is_non_counted()); assert!(ElementType::NonCountedItem.is_non_counted()); assert!(ElementType::NonCountedTree.is_non_counted()); assert!(ElementType::NonCountedDenseAppendOnlyFixedSizeTree.is_non_counted()); + // The NonCounted twin of ProvableSumTree (base 19) lives at 147 + // (0x93). The widened range check (0x80..=0xAF) must still + // classify it correctly. + assert!(ElementType::NonCountedProvableSumTree.is_non_counted()); + // Same for the NonCounted twin of ReferenceWithSumItem (base 18) + // at 146 (0x92). + assert!(ElementType::NonCountedReferenceWithSumItem.is_non_counted()); // All three wrapper twin ranges share bit 7, but only NonCounted has // upper-nibble 0x80. NotSummed (upper-nibble 0xb0) and @@ -847,8 +1008,10 @@ mod tests { // NonCounted. 
assert!(!ElementType::NotSummedSumTree.is_non_counted()); assert!(!ElementType::NotSummedProvableCountSumTree.is_non_counted()); + assert!(!ElementType::NotSummedProvableSumTree.is_non_counted()); assert!(!ElementType::NotCountedOrSummedSumTree.is_non_counted()); assert!(!ElementType::NotCountedOrSummedProvableCountSumTree.is_non_counted()); + assert!(!ElementType::NotCountedOrSummedProvableSumTree.is_non_counted()); // base() strips the wrapper and returns the underlying type. assert_eq!(ElementType::Item.base(), ElementType::Item); @@ -858,6 +1021,10 @@ mod tests { ElementType::NonCountedProvableCountTree.base(), ElementType::ProvableCountTree ); + assert_eq!( + ElementType::NonCountedProvableSumTree.base(), + ElementType::ProvableSumTree + ); // The discriminant relationship: twin = base | 0x80 assert_eq!( @@ -868,11 +1035,16 @@ mod tests { ElementType::NonCountedDenseAppendOnlyFixedSizeTree as u8, ElementType::DenseAppendOnlyFixedSizeTree as u8 | NON_COUNTED_FLAG ); + assert_eq!( + ElementType::NonCountedProvableSumTree as u8, + ElementType::ProvableSumTree as u8 | NON_COUNTED_FLAG + ); } #[test] fn test_not_summed_helpers() { - // is_not_summed: upper-nibble compare against 0xb0. + // is_not_summed: upper-nibble compare against 0xb0 + // (range 0xB0..=0xBF). assert!(!ElementType::Item.is_not_summed()); assert!(!ElementType::SumTree.is_not_summed()); assert!(!ElementType::NonCountedSumTree.is_not_summed()); @@ -881,6 +1053,13 @@ mod tests { assert!(ElementType::NotSummedBigSumTree.is_not_summed()); assert!(ElementType::NotSummedCountSumTree.is_not_summed()); assert!(ElementType::NotSummedProvableCountSumTree.is_not_summed()); + // The new ProvableSumTree NotSummed twin lives at 177 (0xB1) + // — an explicit slot in the 0xB0..=0xBF window. The mask + // (& 0xf0 == 0xb0) classifies it correctly. + assert!(ElementType::NotSummedProvableSumTree.is_not_summed()); + + // NonCounted twins (0x80..=0x9F) must NOT match. 
+ assert!(!ElementType::NonCountedProvableSumTree.is_not_summed()); // base() strips the wrapper and returns the underlying type. assert_eq!(ElementType::NotSummedSumTree.base(), ElementType::SumTree); @@ -896,16 +1075,36 @@ mod tests { ElementType::NotSummedProvableCountSumTree.base(), ElementType::ProvableCountSumTree ); - - // The discriminant relationship: twin = base | 0xb0. - assert_eq!( - ElementType::NotSummedSumTree as u8, - ElementType::SumTree as u8 | NOT_SUMMED_TWIN_PREFIX - ); assert_eq!( - ElementType::NotSummedProvableCountSumTree as u8, - ElementType::ProvableCountSumTree as u8 | NOT_SUMMED_TWIN_PREFIX + ElementType::NotSummedProvableSumTree.base(), + ElementType::ProvableSumTree ); + + // Twin slots are explicit, not formula-derived. Pin each one. + // The first four are historical (formula `base | 0xb0` happens to + // match), the last is a hand-assigned slot. + assert_eq!(ElementType::NotSummedSumTree as u8, 180); + assert_eq!(ElementType::NotSummedBigSumTree as u8, 181); + assert_eq!(ElementType::NotSummedCountSumTree as u8, 183); + assert_eq!(ElementType::NotSummedProvableCountSumTree as u8, 186); + assert_eq!(ElementType::NotSummedProvableSumTree as u8, 177); + + // The whole family fits inside the 0xB0..=0xBF window. + for t in [ + ElementType::NotSummedSumTree, + ElementType::NotSummedBigSumTree, + ElementType::NotSummedCountSumTree, + ElementType::NotSummedProvableCountSumTree, + ElementType::NotSummedProvableSumTree, + ] { + let d = t as u8; + assert!( + d & 0xf0 == NOT_SUMMED_TWIN_PREFIX, + "{:?} = {:#x} outside NotSummed family", + t, + d + ); + } } #[test] @@ -970,6 +1169,7 @@ mod tests { assert!(ElementType::CountTree.has_combined_value_hash()); assert!(ElementType::CountSumTree.has_combined_value_hash()); assert!(ElementType::ProvableCountTree.has_combined_value_hash()); + assert!(ElementType::ProvableSumTree.has_combined_value_hash()); // The wrapper is transparent: NonCountedItem still hashes simply. 
assert!(ElementType::NonCountedItem.has_simple_value_hash()); @@ -979,6 +1179,28 @@ mod tests { assert!(ElementType::NonCountedReferenceWithSumItem.has_combined_value_hash()); } + #[test] + fn test_as_str_for_provable_sum_tree_variants() { + // Cover the as_str / Display path for the ProvableSumTree variant and + // its synthetic NonCountedProvableSumTree / NotSummed twins. The + // Display impl delegates to `as_str`, so we go through it to make the + // test resilient. + assert_eq!(ElementType::ProvableSumTree.as_str(), "provable sum tree"); + assert_eq!( + ElementType::NonCountedProvableSumTree.as_str(), + "non_counted provable sum tree" + ); + assert_eq!( + ElementType::NotSummedProvableSumTree.as_str(), + "not_summed provable sum tree" + ); + // Display delegation. + assert_eq!( + format!("{}", ElementType::NonCountedProvableSumTree), + "non_counted provable sum tree" + ); + } + #[test] fn test_proof_node_type_regular_tree() { use super::ProofNodeType; @@ -1122,6 +1344,53 @@ mod tests { ); } + #[test] + fn test_proof_node_type_provable_sum_tree() { + // Inside a ProvableSumTree parent, items map to KvSum and references + // map to KvRefValueHashSum. Subtrees still use KvValueHashFeatureType + // (the embedded TreeFeatureType carries the aggregate). This + // exercises the `is_provable_sum_tree` branches in `proof_node_type`. + use super::ProofNodeType; + + let pst = Some(ElementType::ProvableSumTree); + + assert_eq!(ElementType::Item.proof_node_type(pst), ProofNodeType::KvSum); + assert_eq!( + ElementType::SumItem.proof_node_type(pst), + ProofNodeType::KvSum + ); + assert_eq!( + ElementType::ItemWithSumItem.proof_node_type(pst), + ProofNodeType::KvSum + ); + + assert_eq!( + ElementType::Reference.proof_node_type(pst), + ProofNodeType::KvRefValueHashSum + ); + + // Subtrees still go through KvValueHashFeatureType. 
+ assert_eq!( + ElementType::Tree.proof_node_type(pst), + ProofNodeType::KvValueHashFeatureType + ); + assert_eq!( + ElementType::SumTree.proof_node_type(pst), + ProofNodeType::KvValueHashFeatureType + ); + assert_eq!( + ElementType::ProvableSumTree.proof_node_type(pst), + ProofNodeType::KvValueHashFeatureType + ); + + // NotSummed-wrapped ProvableSumTree parents normalize to the same + // base — the wrapper is transparent here. + assert_eq!( + ElementType::Item.proof_node_type(Some(ElementType::NotSummedProvableSumTree)), + ProofNodeType::KvSum + ); + } + #[test] fn test_proof_node_type_through_non_counted_wrapper() { use super::ProofNodeType; @@ -1193,11 +1462,11 @@ mod tests { // would silently parse as `NonCountedItem`. assert!(ElementType::from_serialized_value(&[15, 128]).is_err()); assert!(ElementType::from_serialized_value(&[15, 142]).is_err()); - // Wrapper with an unallocated mid-range inner byte (16, 17, 19..=127) - // is also rejected, even though it has no high bit set. + // Wrapper with an unallocated mid-range inner byte (16, 17, + // 20..=127) is also rejected, even though it has no high bit set. assert!(ElementType::from_serialized_value(&[15, 16]).is_err()); assert!(ElementType::from_serialized_value(&[15, 17]).is_err()); - assert!(ElementType::from_serialized_value(&[15, 19]).is_err()); + assert!(ElementType::from_serialized_value(&[15, 20]).is_err()); assert!(ElementType::from_serialized_value(&[15, 100]).is_err()); // Inner byte 18 (ReferenceWithSumItem) IS a legal base; resolves to @@ -1206,6 +1475,12 @@ mod tests { ElementType::from_serialized_value(&[15, 18]).unwrap(), ElementType::NonCountedReferenceWithSumItem ); + // Inner byte 19 (ProvableSumTree) IS also a legal base; resolves to + // NonCountedProvableSumTree (twin slot 147 = 0x80|19). 
+ assert_eq!( + ElementType::from_serialized_value(&[15, 19]).unwrap(), + ElementType::NonCountedProvableSumTree + ); } #[test] @@ -1225,6 +1500,8 @@ mod tests { assert!(ElementType::MmrTree.is_tree()); assert!(ElementType::BulkAppendTree.is_tree()); assert!(ElementType::DenseAppendOnlyFixedSizeTree.is_tree()); + assert!(ElementType::ProvableSumTree.is_tree()); + assert!(ElementType::NonCountedProvableSumTree.is_tree()); // ReferenceWithSumItem is a reference, not a tree and not an item. assert!(!ElementType::ReferenceWithSumItem.is_tree()); assert!(ElementType::ReferenceWithSumItem.is_reference()); @@ -1355,16 +1632,22 @@ mod tests { ElementType::ReferenceWithSumItem, "ReferenceWithSumItem", ), + // discriminant 19 + ( + Element::ProvableSumTree(None, 0, None), + ElementType::ProvableSumTree, + "ProvableSumTree", + ), ]; - // Verify we're testing all 16 base discriminants: 0..=14 and 18. + // Verify we're testing all 17 base discriminants: 0..=14, 18, 19. // (15 = NonCounted wrapper byte, 16 = NotSummed wrapper byte, // 17 = NotCountedOrSummed wrapper byte — none has a base // ElementType variant.) assert_eq!( test_cases.len(), - 16, - "Expected 16 base Element variants in test, got {}", + 17, + "Expected 17 base Element variants in test, got {}", test_cases.len() ); @@ -1499,7 +1782,7 @@ mod tests { } /// Pins the bincode discriminant for `Element::NotSummed` to - /// `NOT_SUMMED_WRAPPER_DISCRIMINANT` and the four allowed inner + /// `NOT_SUMMED_WRAPPER_DISCRIMINANT` and the five allowed inner /// discriminants. Mirrors `test_non_counted_wrapper_discriminant_pinned`. 
#[test] fn test_not_summed_wrapper_discriminant_pinned() { @@ -1509,34 +1792,47 @@ mod tests { let grove_version = GroveVersion::latest(); - let cases: Vec<(Element, ElementType, u8, &str)> = vec![ + // Tuple: (Element, expected twin, expected_inner_disc_on_wire, + // expected_twin_disc_assignment, name) + let cases: Vec<(Element, ElementType, u8, u8, &str)> = vec![ ( Element::NotSummed(Box::new(Element::SumTree(None, 0, None))), ElementType::NotSummedSumTree, 4, + 180, "NotSummed(SumTree)", ), ( Element::NotSummed(Box::new(Element::BigSumTree(None, 0, None))), ElementType::NotSummedBigSumTree, 5, + 181, "NotSummed(BigSumTree)", ), ( Element::NotSummed(Box::new(Element::CountSumTree(None, 0, 0, None))), ElementType::NotSummedCountSumTree, 7, + 183, "NotSummed(CountSumTree)", ), ( Element::NotSummed(Box::new(Element::ProvableCountSumTree(None, 0, 0, None))), ElementType::NotSummedProvableCountSumTree, 10, + 186, "NotSummed(ProvableCountSumTree)", ), + ( + Element::NotSummed(Box::new(Element::ProvableSumTree(None, 0, None))), + ElementType::NotSummedProvableSumTree, + 19, + 177, + "NotSummed(ProvableSumTree)", + ), ]; - for (element, expected_type, expected_inner_disc, name) in cases { + for (element, expected_type, expected_inner_disc, expected_twin_disc, name) in cases { let serialized = element .serialize(grove_version) .unwrap_or_else(|e| panic!("Failed to serialize {}: {:?}", name, e)); @@ -1564,23 +1860,62 @@ mod tests { "{}: from_serialized_value returned {:?}, expected {:?}", name, parsed, expected_type ); - // The synthetic discriminant follows the 0xb0|base rule. + // Twin discriminants are explicit slot assignments — pin each. assert_eq!( - parsed as u8, - expected_inner_disc | NOT_SUMMED_TWIN_PREFIX, - "{}: NotSummedXxx = inner_disc | 0xb0", + parsed as u8, expected_twin_disc, + "{}: NotSummedXxx slot mismatch (no formula — see \ + NOT_SUMMED_TWIN_PREFIX doc)", name ); } } + /// Round-trip the `ProvableSumTree` discriminant. 
The base byte is + /// 19 (this branch places `ProvableSumTree` at the end of the + /// `Element` enum to preserve develop's `NotCountedOrSummed = 17` + /// and `ReferenceWithSumItem = 18` wire encoding), and the + /// NonCounted twin lives at 147 (= 0x80 | 19). The + /// `NonCounted(ProvableSumTree)` shape is allowed. + #[test] + fn test_provable_sum_tree_discriminant_round_trip() { + use grovedb_version::version::GroveVersion; + + use crate::element::Element; + + let grove_version = GroveVersion::latest(); + + // Base form serializes with leading byte 19. + let element = Element::ProvableSumTree(None, 0, None); + let serialized = element + .serialize(grove_version) + .expect("serialize ProvableSumTree"); + assert_eq!(serialized[0], 19); + assert_eq!( + ElementType::from_serialized_value(&serialized).unwrap(), + ElementType::ProvableSumTree + ); + + // NonCounted(ProvableSumTree) serializes with the wrapper byte 15 + // followed by the inner discriminant 19, and resolves to the + // NonCountedProvableSumTree synthetic twin (= 147). + let nc = Element::NonCounted(Box::new(Element::ProvableSumTree(None, 0, None))); + let nc_serialized = nc.serialize(grove_version).expect("serialize NC PST"); + assert_eq!(nc_serialized[0], NON_COUNTED_WRAPPER_DISCRIMINANT); + assert_eq!(nc_serialized[1], 19); + assert_eq!( + ElementType::from_serialized_value(&nc_serialized).unwrap(), + ElementType::NonCountedProvableSumTree + ); + assert_eq!(ElementType::NonCountedProvableSumTree as u8, 147); + } + /// Validate the new resolver paths around byte 16 (NotSummed wrapper). #[test] fn test_from_serialized_value_not_summed_paths() { // Truncated wrapper (no inner byte) is rejected. assert!(ElementType::from_serialized_value(&[16]).is_err()); - // Each of the four legal inner discriminants resolves to the right + // Each of the five legal inner discriminants resolves to the right // synthetic twin. 
assert_eq!( ElementType::from_serialized_value(&[16, 4]).unwrap(), @@ -1598,14 +1933,22 @@ mod tests { ElementType::from_serialized_value(&[16, 10]).unwrap(), ElementType::NotSummedProvableCountSumTree ); + assert_eq!( + ElementType::from_serialized_value(&[16, 19]).unwrap(), + ElementType::NotSummedProvableSumTree + ); - // All other inner bytes are rejected: non-sum-tree base types, - // wrapper bytes, synthetic NonCounted twins (128..142), synthetic - // NotSummed twins (180..186), NotCountedOrSummed twins (196..202), - // and unallocated ranges. + // All other inner bytes are rejected: non-sum-bearing base types, + // wrapper bytes (15/16/17), the new ReferenceWithSumItem base + // (18 — a reference, not a sum-tree), synthetic NonCounted + // twins (128..142, 146, 147), synthetic NotSummed twins + // (177, 180..186), synthetic NotCountedOrSummed twins (193, + // 196..202), and unallocated ranges. Note: `19` + // (= ProvableSumTree base) IS accepted as inner — see the + // explicit-success assertion above. for bad in [ - 0u8, 1, 2, 3, 6, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 100, 128, 142, 146, 180, 186, - 196, 202, 255, + 0u8, 1, 2, 3, 6, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 100, 128, 142, 146, 147, 180, + 186, 193, 196, 202, 255, ] { assert!( ElementType::from_serialized_value(&[16, bad]).is_err(), @@ -1622,7 +1965,7 @@ mod tests { // Truncated wrapper (no inner byte) is rejected. assert!(ElementType::from_serialized_value(&[17]).is_err()); - // Each of the four legal inner discriminants resolves to the right + // Each of the five legal inner discriminants resolves to the right // synthetic twin. 
assert_eq!( ElementType::from_serialized_value(&[17, 4]).unwrap(), @@ -1640,11 +1983,18 @@ mod tests { ElementType::from_serialized_value(&[17, 10]).unwrap(), ElementType::NotCountedOrSummedProvableCountSumTree ); + assert_eq!( + ElementType::from_serialized_value(&[17, 19]).unwrap(), + ElementType::NotCountedOrSummedProvableSumTree + ); - // All other inner bytes are rejected. + // All other inner bytes are rejected: non-sum-bearing base types, + // wrapper bytes (15/16/17), the ReferenceWithSumItem base (18 — a + // reference, not a sum-tree), synthetic NonCounted/NotSummed/ + // NotCountedOrSummed twins, and unallocated ranges. for bad in [ - 0u8, 1, 2, 3, 6, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 100, 128, 142, 146, 180, 186, - 196, 202, 255, + 0u8, 1, 2, 3, 6, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 100, 128, 142, 146, 147, 177, + 180, 186, 193, 196, 202, 255, ] { assert!( ElementType::from_serialized_value(&[17, bad]).is_err(), diff --git a/grovedb-element/tests/element_constructors_helpers.rs b/grovedb-element/tests/element_constructors_helpers.rs index 949544c40..87e5a4369 100644 --- a/grovedb-element/tests/element_constructors_helpers.rs +++ b/grovedb-element/tests/element_constructors_helpers.rs @@ -537,6 +537,82 @@ fn convert_if_reference_to_absolute_reference_converts_and_preserves_other_types )); } +/// Coverage for every `ProvableSumTree` constructor and helper. Mirrors +/// `constructors_create_expected_provable_tree_variants` plus the relevant +/// sections of `value_helpers_and_conversion_errors_work`. 
+#[test] +fn provable_sum_tree_constructors_and_helpers() { + // --- Constructors --- + assert_eq!( + Element::empty_provable_sum_tree(), + Element::ProvableSumTree(None, 0, None) + ); + assert_eq!( + Element::empty_provable_sum_tree_with_flags(sample_flags()), + Element::ProvableSumTree(None, 0, sample_flags()) + ); + assert_eq!( + Element::new_provable_sum_tree(Some(vec![20])), + Element::ProvableSumTree(Some(vec![20]), 0, None) + ); + assert_eq!( + Element::new_provable_sum_tree_with_flags(Some(vec![20]), sample_flags()), + Element::ProvableSumTree(Some(vec![20]), 0, sample_flags()) + ); + let with_sum = Element::new_provable_sum_tree_with_flags_and_sum_value( + Some(vec![20]), + -123, + sample_flags(), + ); + assert_eq!( + with_sum, + Element::ProvableSumTree(Some(vec![20]), -123, sample_flags()) + ); + + // --- Type predicates / classification --- + assert!(with_sum.is_provable_sum_tree()); + assert!(with_sum.is_any_tree()); + assert!(!with_sum.is_sum_tree()); + assert!(!with_sum.is_basic_tree()); + assert!(!with_sum.is_commitment_tree()); + assert!(!with_sum.is_mmr_tree()); + assert!(!with_sum.is_bulk_append_tree()); + assert!(!with_sum.is_dense_tree()); + assert!(!with_sum.uses_non_merk_data_storage()); + assert_eq!(with_sum.non_merk_entry_count(), None); + + // --- Value accessors (borrowed) --- + assert_eq!(with_sum.as_provable_sum_tree_value().unwrap(), -123); + assert_eq!(with_sum.sum_value_or_default(), -123); + assert_eq!(with_sum.big_sum_value_or_default(), -123); + + // Wrong-element error paths for as_provable_sum_tree_value / into_provable_sum_tree_value. 
+ let item = Element::new_item(vec![1, 2, 3]); + assert!(matches!( + item.as_provable_sum_tree_value(), + Err(ElementError::WrongElementType( + "expected a provable sum tree" + )) + )); + assert!(matches!( + item.clone().into_provable_sum_tree_value(), + Err(ElementError::WrongElementType( + "expected a provable sum tree" + )) + )); + + // --- Value accessor (owned) --- + assert_eq!( + with_sum.clone().into_provable_sum_tree_value().unwrap(), + -123 + ); + + // is_provable_sum_tree returns false for non-ProvableSumTree variants. + assert!(!Element::empty_tree().is_provable_sum_tree()); + assert!(!Element::empty_sum_tree().is_provable_sum_tree()); + assert!(!Element::empty_provable_count_tree().is_provable_sum_tree()); +} + #[test] fn constructors_create_expected_reference_with_sum_item_variants() { let ref_path = ReferencePathType::AbsolutePathReference(vec![b"a".to_vec(), b"b".to_vec()]); diff --git a/grovedb-element/tests/element_display_and_serialization.rs b/grovedb-element/tests/element_display_and_serialization.rs index 5d988ae83..a1714f51a 100644 --- a/grovedb-element/tests/element_display_and_serialization.rs +++ b/grovedb-element/tests/element_display_and_serialization.rs @@ -265,6 +265,10 @@ fn element_display_without_flags_covers_none_branches() { ), "ReferenceWithSumItem(SiblingReference(6b), max_hop: None, sum: 7)", ), + ( + Element::ProvableSumTree(None, 19, None), + "ProvableSumTree(None, 19)", + ), ]; for (element, expected_display) in values { diff --git a/grovedb-query/src/aggregate_count.rs b/grovedb-query/src/aggregate_count.rs index 986978a70..3538bd9c8 100644 --- a/grovedb-query/src/aggregate_count.rs +++ b/grovedb-query/src/aggregate_count.rs @@ -247,6 +247,12 @@ impl Query { AggregateCountOnRange item — use the leaf shape instead", )); } + QueryItem::AggregateSumOnRange(_) => { + return Err(Error::InvalidOperation( + "carrier AggregateCountOnRange query may not own an \ + AggregateSumOnRange item — the two aggregate variants are orthogonal", 
+ )); + } } } let subquery = match self.default_subquery_branch.subquery.as_deref() { diff --git a/grovedb-query/src/proofs/encoding.rs b/grovedb-query/src/proofs/encoding.rs index 22c20b1d2..645f222d1 100644 --- a/grovedb-query/src/proofs/encoding.rs +++ b/grovedb-query/src/proofs/encoding.rs @@ -355,6 +355,132 @@ impl Encode for Op { } } + // ProvableSumTree proof variants. Tag bytes 0x30..=0x3D + // (0x3E and 0x3F intentionally reserved). Layout mirrors the + // corresponding Count variants verbatim; only the encoded + // aggregate type changes (i64 sum via varint instead of u64 + // count). The sum field uses varint for wire compactness — the + // hash recomputation in `node_hash_with_sum` uses the fixed + // big-endian byte form, which is independent of the wire + // encoding. + + // Push: ProvableSumTree variants + Op::Push(Node::KVSum(key, value, sum)) => { + debug_assert!(key.len() < 256); + if value.len() < 65536 { + dest.write_all(&[0x30, key.len() as u8])?; + dest.write_all(key)?; + (value.len() as u16).encode_into(dest)?; + dest.write_all(value)?; + sum.encode_into(dest)?; + } else { + dest.write_all(&[0x31, key.len() as u8])?; + dest.write_all(key)?; + (value.len() as u32).encode_into(dest)?; + dest.write_all(value)?; + sum.encode_into(dest)?; + } + } + Op::Push(Node::KVHashSum(kv_hash, sum)) => { + dest.write_all(&[0x32])?; + dest.write_all(kv_hash)?; + sum.encode_into(dest)?; + } + Op::Push(Node::KVRefValueHashSum(key, value, value_hash, sum)) => { + debug_assert!(key.len() < 256); + if value.len() < 65536 { + dest.write_all(&[0x33, key.len() as u8])?; + dest.write_all(key)?; + (value.len() as u16).encode_into(dest)?; + dest.write_all(value)?; + dest.write_all(value_hash)?; + sum.encode_into(dest)?; + } else { + dest.write_all(&[0x34, key.len() as u8])?; + dest.write_all(key)?; + (value.len() as u32).encode_into(dest)?; + dest.write_all(value)?; + dest.write_all(value_hash)?; + sum.encode_into(dest)?; + } + } + Op::Push(Node::KVDigestSum(key, 
value_hash, sum)) => { + debug_assert!(key.len() < 256); + + dest.write_all(&[0x35, key.len() as u8])?; + dest.write_all(key)?; + dest.write_all(value_hash)?; + sum.encode_into(dest)?; + } + Op::Push(Node::HashWithSum(kv_hash, left_child_hash, right_child_hash, sum)) => { + dest.write_all(&[0x36])?; + dest.write_all(kv_hash)?; + dest.write_all(left_child_hash)?; + dest.write_all(right_child_hash)?; + sum.encode_into(dest)?; + } + + // PushInverted: ProvableSumTree variants + Op::PushInverted(Node::KVSum(key, value, sum)) => { + debug_assert!(key.len() < 256); + if value.len() < 65536 { + dest.write_all(&[0x37, key.len() as u8])?; + dest.write_all(key)?; + (value.len() as u16).encode_into(dest)?; + dest.write_all(value)?; + sum.encode_into(dest)?; + } else { + dest.write_all(&[0x38, key.len() as u8])?; + dest.write_all(key)?; + (value.len() as u32).encode_into(dest)?; + dest.write_all(value)?; + sum.encode_into(dest)?; + } + } + Op::PushInverted(Node::KVHashSum(kv_hash, sum)) => { + dest.write_all(&[0x39])?; + dest.write_all(kv_hash)?; + sum.encode_into(dest)?; + } + Op::PushInverted(Node::KVRefValueHashSum(key, value, value_hash, sum)) => { + debug_assert!(key.len() < 256); + if value.len() < 65536 { + dest.write_all(&[0x3a, key.len() as u8])?; + dest.write_all(key)?; + (value.len() as u16).encode_into(dest)?; + dest.write_all(value)?; + dest.write_all(value_hash)?; + sum.encode_into(dest)?; + } else { + dest.write_all(&[0x3b, key.len() as u8])?; + dest.write_all(key)?; + (value.len() as u32).encode_into(dest)?; + dest.write_all(value)?; + dest.write_all(value_hash)?; + sum.encode_into(dest)?; + } + } + Op::PushInverted(Node::KVDigestSum(key, value_hash, sum)) => { + debug_assert!(key.len() < 256); + + dest.write_all(&[0x3c, key.len() as u8])?; + dest.write_all(key)?; + dest.write_all(value_hash)?; + sum.encode_into(dest)?; + } + Op::PushInverted(Node::HashWithSum( + kv_hash, + left_child_hash, + right_child_hash, + sum, + )) => { + dest.write_all(&[0x3d])?; + 
dest.write_all(kv_hash)?; + dest.write_all(left_child_hash)?; + dest.write_all(right_child_hash)?; + sum.encode_into(dest)?; + } + Op::Parent => dest.write_all(&[0x10])?, Op::Child => dest.write_all(&[0x11])?, Op::ParentInverted => dest.write_all(&[0x12])?, @@ -459,6 +585,38 @@ impl Encode for Op { + feature_type.encoding_length()? + HASH_LENGTH } + // ProvableSumTree variants — Push (sum is i64 varint) + Op::Push(Node::KVSum(key, value, sum)) => { + let header = if value.len() < 65536 { 4 } else { 6 }; + header + key.len() + value.len() + sum.encoding_length()? + } + Op::Push(Node::KVHashSum(_, sum)) => 1 + HASH_LENGTH + sum.encoding_length()?, + Op::Push(Node::KVRefValueHashSum(key, value, _, sum)) => { + let header = if value.len() < 65536 { 4 } else { 6 }; + header + key.len() + value.len() + HASH_LENGTH + sum.encoding_length()? + } + Op::Push(Node::KVDigestSum(key, _, sum)) => { + 2 + key.len() + HASH_LENGTH + sum.encoding_length()? + } + Op::Push(Node::HashWithSum(_, _, _, sum)) => { + 1 + 3 * HASH_LENGTH + sum.encoding_length()? + } + // ProvableSumTree variants — PushInverted + Op::PushInverted(Node::KVSum(key, value, sum)) => { + let header = if value.len() < 65536 { 4 } else { 6 }; + header + key.len() + value.len() + sum.encoding_length()? + } + Op::PushInverted(Node::KVHashSum(_, sum)) => 1 + HASH_LENGTH + sum.encoding_length()?, + Op::PushInverted(Node::KVRefValueHashSum(key, value, _, sum)) => { + let header = if value.len() < 65536 { 4 } else { 6 }; + header + key.len() + value.len() + HASH_LENGTH + sum.encoding_length()? + } + Op::PushInverted(Node::KVDigestSum(key, _, sum)) => { + 2 + key.len() + HASH_LENGTH + sum.encoding_length()? + } + Op::PushInverted(Node::HashWithSum(_, _, _, sum)) => { + 1 + 3 * HASH_LENGTH + sum.encoding_length()? + } Op::Parent => 1, Op::Child => 1, Op::ParentInverted => 1, @@ -1076,6 +1234,202 @@ impl Decode for Op { )) } + // ProvableSumTree decoder arms. 
Mirror the Count family layout + // exactly; only the aggregate type differs (i64 sum via varint + // instead of u64 count). + 0x30 => { + let key_len: u8 = Decode::decode(&mut input)?; + let mut key = vec![0; key_len as usize]; + input.read_exact(key.as_mut_slice())?; + + let value_len: u16 = Decode::decode(&mut input)?; + let mut value = vec![0; value_len as usize]; + input.read_exact(value.as_mut_slice())?; + + let sum: i64 = Decode::decode(&mut input)?; + + Self::Push(Node::KVSum(key, value, sum)) + } + 0x31 => { + let key_len: u8 = Decode::decode(&mut input)?; + let mut key = vec![0; key_len as usize]; + input.read_exact(key.as_mut_slice())?; + + let value_len: u32 = Decode::decode(&mut input)?; + if value_len > MAX_VALUE_LEN { + return Err(ed::Error::UnexpectedByte(0x31)); + } + let mut value = vec![0; value_len as usize]; + input.read_exact(value.as_mut_slice())?; + + let sum: i64 = Decode::decode(&mut input)?; + + Self::Push(Node::KVSum(key, value, sum)) + } + 0x32 => { + let mut kv_hash = [0; HASH_LENGTH]; + input.read_exact(&mut kv_hash)?; + let sum: i64 = Decode::decode(&mut input)?; + + Self::Push(Node::KVHashSum(kv_hash, sum)) + } + 0x33 => { + let key_len: u8 = Decode::decode(&mut input)?; + let mut key = vec![0; key_len as usize]; + input.read_exact(key.as_mut_slice())?; + + let value_len: u16 = Decode::decode(&mut input)?; + let mut value = vec![0; value_len as usize]; + input.read_exact(value.as_mut_slice())?; + + let mut value_hash = [0; HASH_LENGTH]; + input.read_exact(&mut value_hash)?; + + let sum: i64 = Decode::decode(&mut input)?; + Self::Push(Node::KVRefValueHashSum(key, value, value_hash, sum)) + } + 0x34 => { + let key_len: u8 = Decode::decode(&mut input)?; + let mut key = vec![0; key_len as usize]; + input.read_exact(key.as_mut_slice())?; + + let value_len: u32 = Decode::decode(&mut input)?; + if value_len > MAX_VALUE_LEN { + return Err(ed::Error::UnexpectedByte(0x34)); + } + let mut value = vec![0; value_len as usize]; + 
input.read_exact(value.as_mut_slice())?; + + let mut value_hash = [0; HASH_LENGTH]; + input.read_exact(&mut value_hash)?; + + let sum: i64 = Decode::decode(&mut input)?; + Self::Push(Node::KVRefValueHashSum(key, value, value_hash, sum)) + } + 0x35 => { + let key_len: u8 = Decode::decode(&mut input)?; + let mut key = vec![0; key_len as usize]; + input.read_exact(key.as_mut_slice())?; + + let mut value_hash = [0; HASH_LENGTH]; + input.read_exact(&mut value_hash)?; + + let sum: i64 = Decode::decode(&mut input)?; + Self::Push(Node::KVDigestSum(key, value_hash, sum)) + } + 0x36 => { + let mut kv_hash = [0; HASH_LENGTH]; + input.read_exact(&mut kv_hash)?; + let mut left_child_hash = [0; HASH_LENGTH]; + input.read_exact(&mut left_child_hash)?; + let mut right_child_hash = [0; HASH_LENGTH]; + input.read_exact(&mut right_child_hash)?; + let sum: i64 = Decode::decode(&mut input)?; + + Self::Push(Node::HashWithSum( + kv_hash, + left_child_hash, + right_child_hash, + sum, + )) + } + 0x37 => { + let key_len: u8 = Decode::decode(&mut input)?; + let mut key = vec![0; key_len as usize]; + input.read_exact(key.as_mut_slice())?; + + let value_len: u16 = Decode::decode(&mut input)?; + let mut value = vec![0; value_len as usize]; + input.read_exact(value.as_mut_slice())?; + + let sum: i64 = Decode::decode(&mut input)?; + + Self::PushInverted(Node::KVSum(key, value, sum)) + } + 0x38 => { + let key_len: u8 = Decode::decode(&mut input)?; + let mut key = vec![0; key_len as usize]; + input.read_exact(key.as_mut_slice())?; + + let value_len: u32 = Decode::decode(&mut input)?; + if value_len > MAX_VALUE_LEN { + return Err(ed::Error::UnexpectedByte(0x38)); + } + let mut value = vec![0; value_len as usize]; + input.read_exact(value.as_mut_slice())?; + + let sum: i64 = Decode::decode(&mut input)?; + + Self::PushInverted(Node::KVSum(key, value, sum)) + } + 0x39 => { + let mut kv_hash = [0; HASH_LENGTH]; + input.read_exact(&mut kv_hash)?; + let sum: i64 = Decode::decode(&mut input)?; + + 
Self::PushInverted(Node::KVHashSum(kv_hash, sum)) + } + 0x3a => { + let key_len: u8 = Decode::decode(&mut input)?; + let mut key = vec![0; key_len as usize]; + input.read_exact(key.as_mut_slice())?; + + let value_len: u16 = Decode::decode(&mut input)?; + let mut value = vec![0; value_len as usize]; + input.read_exact(value.as_mut_slice())?; + + let mut value_hash = [0; HASH_LENGTH]; + input.read_exact(&mut value_hash)?; + + let sum: i64 = Decode::decode(&mut input)?; + Self::PushInverted(Node::KVRefValueHashSum(key, value, value_hash, sum)) + } + 0x3b => { + let key_len: u8 = Decode::decode(&mut input)?; + let mut key = vec![0; key_len as usize]; + input.read_exact(key.as_mut_slice())?; + + let value_len: u32 = Decode::decode(&mut input)?; + if value_len > MAX_VALUE_LEN { + return Err(ed::Error::UnexpectedByte(0x3b)); + } + let mut value = vec![0; value_len as usize]; + input.read_exact(value.as_mut_slice())?; + + let mut value_hash = [0; HASH_LENGTH]; + input.read_exact(&mut value_hash)?; + + let sum: i64 = Decode::decode(&mut input)?; + Self::PushInverted(Node::KVRefValueHashSum(key, value, value_hash, sum)) + } + 0x3c => { + let key_len: u8 = Decode::decode(&mut input)?; + let mut key = vec![0; key_len as usize]; + input.read_exact(key.as_mut_slice())?; + + let mut value_hash = [0; HASH_LENGTH]; + input.read_exact(&mut value_hash)?; + + let sum: i64 = Decode::decode(&mut input)?; + Self::PushInverted(Node::KVDigestSum(key, value_hash, sum)) + } + 0x3d => { + let mut kv_hash = [0; HASH_LENGTH]; + input.read_exact(&mut kv_hash)?; + let mut left_child_hash = [0; HASH_LENGTH]; + input.read_exact(&mut left_child_hash)?; + let mut right_child_hash = [0; HASH_LENGTH]; + input.read_exact(&mut right_child_hash)?; + let sum: i64 = Decode::decode(&mut input)?; + + Self::PushInverted(Node::HashWithSum( + kv_hash, + left_child_hash, + right_child_hash, + sum, + )) + } + 0x10 => Self::Parent, 0x11 => Self::Child, 0x12 => Self::ParentInverted, @@ -2368,4 +2722,205 @@ mod test 
{
         let decoded_ops: Result<Vec<Op>, _> = decoder.collect();
         assert_eq!(decoded_ops.unwrap(), ops);
     }
+
+    // ProvableSumTree proof-node round-trip tests. Each variant must
+    // round-trip through both `Op::Push` and `Op::PushInverted`, and through
+    // the full numeric range of i64 sums (incl. negatives and boundaries).
+    fn round_trip_op(op: Op) {
+        let mut encoded = vec![];
+        op.encode_into(&mut encoded).unwrap();
+        // encoding_length must match the actual encoded byte length.
+        assert_eq!(encoded.len(), op.encoding_length().unwrap());
+        let mut decoder = Decoder::new(&encoded);
+        let decoded = decoder.next().unwrap().unwrap();
+        assert_eq!(decoded, op);
+        assert_eq!(decoder.remaining_bytes(), 0);
+    }
+
+    fn round_trip_sum_variants_with(sum: i64) {
+        // Push family
+        round_trip_op(Op::Push(Node::KVSum(vec![1, 2, 3], vec![4, 5], sum)));
+        round_trip_op(Op::Push(Node::KVHashSum([0xAB; HASH_LENGTH], sum)));
+        round_trip_op(Op::Push(Node::KVRefValueHashSum(
+            vec![9, 8],
+            vec![7, 6, 5],
+            [0xCD; HASH_LENGTH],
+            sum,
+        )));
+        round_trip_op(Op::Push(Node::KVDigestSum(
+            vec![10, 11],
+            [0xEF; HASH_LENGTH],
+            sum,
+        )));
+        round_trip_op(Op::Push(Node::HashWithSum(
+            [1; HASH_LENGTH],
+            [2; HASH_LENGTH],
+            [3; HASH_LENGTH],
+            sum,
+        )));
+        // PushInverted family
+        round_trip_op(Op::PushInverted(Node::KVSum(
+            vec![1, 2, 3],
+            vec![4, 5],
+            sum,
+        )));
+        round_trip_op(Op::PushInverted(Node::KVHashSum([0xAB; HASH_LENGTH], sum)));
+        round_trip_op(Op::PushInverted(Node::KVRefValueHashSum(
+            vec![9, 8],
+            vec![7, 6, 5],
+            [0xCD; HASH_LENGTH],
+            sum,
+        )));
+        round_trip_op(Op::PushInverted(Node::KVDigestSum(
+            vec![10, 11],
+            [0xEF; HASH_LENGTH],
+            sum,
+        )));
+        round_trip_op(Op::PushInverted(Node::HashWithSum(
+            [1; HASH_LENGTH],
+            [2; HASH_LENGTH],
+            [3; HASH_LENGTH],
+            sum,
+        )));
+    }
+
+    #[test]
+    fn sum_node_variants_round_trip_at_zero() {
+        round_trip_sum_variants_with(0);
+    }
+
+    #[test]
+    fn sum_node_variants_round_trip_at_positive() {
+        round_trip_sum_variants_with(1);
+        round_trip_sum_variants_with(42);
+        round_trip_sum_variants_with(i64::MAX);
+    }
+
+    #[test]
+    fn sum_node_variants_round_trip_at_negative() {
+        round_trip_sum_variants_with(-1);
+        round_trip_sum_variants_with(-42);
+        round_trip_sum_variants_with(i64::MIN);
+    }
+
+    #[test]
+    fn sum_node_variants_use_new_tag_bytes() {
+        // Sanity check: each new variant writes its expected tag byte as the
+        // first byte of the encoded form. This guards against tag drift if
+        // someone refactors the encoder.
+        let cases: &[(Op, u8)] = &[
+            (Op::Push(Node::KVSum(vec![1], vec![2], 5)), 0x30),
+            (Op::Push(Node::KVHashSum([0; HASH_LENGTH], 5)), 0x32),
+            (
+                Op::Push(Node::KVRefValueHashSum(
+                    vec![1],
+                    vec![2],
+                    [0; HASH_LENGTH],
+                    5,
+                )),
+                0x33,
+            ),
+            (
+                Op::Push(Node::KVDigestSum(vec![1], [0; HASH_LENGTH], 5)),
+                0x35,
+            ),
+            (
+                Op::Push(Node::HashWithSum(
+                    [0; HASH_LENGTH],
+                    [0; HASH_LENGTH],
+                    [0; HASH_LENGTH],
+                    5,
+                )),
+                0x36,
+            ),
+            (Op::PushInverted(Node::KVSum(vec![1], vec![2], 5)), 0x37),
+            (Op::PushInverted(Node::KVHashSum([0; HASH_LENGTH], 5)), 0x39),
+            (
+                Op::PushInverted(Node::KVRefValueHashSum(
+                    vec![1],
+                    vec![2],
+                    [0; HASH_LENGTH],
+                    5,
+                )),
+                0x3a,
+            ),
+            (
+                Op::PushInverted(Node::KVDigestSum(vec![1], [0; HASH_LENGTH], 5)),
+                0x3c,
+            ),
+            (
+                Op::PushInverted(Node::HashWithSum(
+                    [0; HASH_LENGTH],
+                    [0; HASH_LENGTH],
+                    [0; HASH_LENGTH],
+                    5,
+                )),
+                0x3d,
+            ),
+        ];
+        for (op, expected_tag) in cases {
+            let mut bytes = vec![];
+            op.encode_into(&mut bytes).unwrap();
+            assert_eq!(bytes[0], *expected_tag, "wrong tag byte for {:?}", op);
+        }
+    }
+
+    // Large-value (>= 65536 bytes) round-trip tests for ProvableSumTree
+    // proof-node variants. Each KV-style variant has a "small value" (u16 length)
+    // and a "large value" (u32 length) encoding path. The small-value path is
+    // exercised by `sum_node_variants_round_trip_at_*` above; here we cover
+    // the large-value path for the four cases that carry a value field
+    // (`KVSum` and `KVRefValueHashSum`, in both Push and PushInverted
+    // directions).
+
+    /// Helper: encode → decode → assert byte-for-byte and structural equality.
+    fn large_value_round_trip(op: Op, expected_tag: u8) {
+        let mut bytes = vec![];
+        op.encode_into(&mut bytes).unwrap();
+        assert_eq!(bytes[0], expected_tag, "wrong tag byte for {:?}", op);
+        assert_eq!(bytes.len(), op.encoding_length().unwrap());
+        let decoded = Op::decode(&bytes[..]).expect("decode failed");
+        assert_eq!(decoded, op);
+    }
+
+    #[test]
+    fn kvsum_push_large_value_round_trip() {
+        // 0x31 = Push KVSum with u32 value length (value.len() >= 65536).
+        let large_value = vec![0xAB; 70_000];
+        let op = Op::Push(Node::KVSum(vec![1, 2, 3], large_value, 42));
+        large_value_round_trip(op, 0x31);
+    }
+
+    #[test]
+    fn kvsum_pushinverted_large_value_round_trip() {
+        // 0x38 = PushInverted KVSum with u32 value length.
+        let large_value = vec![0xCD; 70_000];
+        let op = Op::PushInverted(Node::KVSum(vec![9, 8, 7], large_value, -99));
+        large_value_round_trip(op, 0x38);
+    }
+
+    #[test]
+    fn kvrefvaluehashsum_push_large_value_round_trip() {
+        // 0x34 = Push KVRefValueHashSum with u32 value length.
+        let large_value = vec![0xEF; 70_000];
+        let op = Op::Push(Node::KVRefValueHashSum(
+            vec![1, 2, 3],
+            large_value,
+            [0x55; HASH_LENGTH],
+            i64::MAX,
+        ));
+        large_value_round_trip(op, 0x34);
+    }
+
+    #[test]
+    fn kvrefvaluehashsum_pushinverted_large_value_round_trip() {
+        // 0x3b = PushInverted KVRefValueHashSum with u32 value length.
+        let large_value = vec![0x12; 70_000];
+        let op = Op::PushInverted(Node::KVRefValueHashSum(
+            vec![4, 5, 6],
+            large_value,
+            [0x77; HASH_LENGTH],
+            i64::MIN,
+        ));
+        large_value_round_trip(op, 0x3b);
+    }
 }
diff --git a/grovedb-query/src/proofs/mod.rs b/grovedb-query/src/proofs/mod.rs
index d49eb2e4a..2e5d05ae7 100644
--- a/grovedb-query/src/proofs/mod.rs
+++ b/grovedb-query/src/proofs/mod.rs
@@ -151,6 +151,50 @@ pub enum Node {
     ///
     /// Contains: `(kv_hash, left_child_hash, right_child_hash, count)`
     HashWithCount(CryptoHash, CryptoHash, CryptoHash, u64),
+
+    /// Key, value, and sum. For queried Items in ProvableSumTree.
+    ///
+    /// Sum analogue of `KVCount`: the verifier recomputes
+    /// `node_hash = node_hash_with_sum(kv_hash, left, right, sum)` so a
+    /// forged sum produces a hash divergence at the parent boundary.
+    ///
+    /// Contains: `(key, value, sum)`
+    KVSum(Vec<u8>, Vec<u8>, i64),
+
+    /// KV hash and sum. For non-queried nodes in ProvableSumTree.
+    ///
+    /// Sum analogue of `KVHashCount`.
+    ///
+    /// Contains: `(kv_hash, sum)`
+    KVHashSum(CryptoHash, i64),
+
+    /// Key, referenced value, reference element hash, and sum.
+    /// For queried References in ProvableSumTree.
+    ///
+    /// Sum analogue of `KVRefValueHashCount`.
+    ///
+    /// Contains: `(key, referenced_value, reference_element_hash, sum)`
+    KVRefValueHashSum(Vec<u8>, Vec<u8>, CryptoHash, i64),
+
+    /// Key, value_hash, and sum. For proving absence in ProvableSumTree.
+    ///
+    /// Sum analogue of `KVDigestCount`.
+    ///
+    /// Contains: `(key, value_hash, sum)`
+    KVDigestSum(Vec<u8>, CryptoHash, i64),
+
+    /// A self-verifying compressed subtree for `AggregateSumOnRange` proofs
+    /// against a `ProvableSumTree`.
+    ///
+    /// Sum analogue of `HashWithCount` — encodes the subtree's *root* node as
+    /// `(kv_hash, left_child_hash, right_child_hash, sum)`. The verifier
+    /// reconstructs the subtree's root `node_hash` as
+    /// `node_hash_with_sum(kv_hash, left_child_hash, right_child_hash, sum)`
+    /// and uses that hash exactly as `Hash(...)` would. The sum is
+    /// cryptographically committed by the parent's hash chain.
+    ///
+    /// Contains: `(kv_hash, left_child_hash, right_child_hash, sum)`
+    HashWithSum(CryptoHash, CryptoHash, CryptoHash, i64),
 }
 
 use std::fmt;
@@ -230,6 +274,35 @@ impl fmt::Display for Node {
                 feature_type,
                 hex::encode(child_hash)
             ),
+            Node::KVSum(key, value, sum) => format!(
+                "KVSum({}, {}, {})",
+                hex_to_ascii(key),
+                hex_to_ascii(value),
+                sum
+            ),
+            Node::KVHashSum(kv_hash, sum) => {
+                format!("KVHashSum(HASH[{}], {})", hex::encode(kv_hash), sum)
+            }
+            Node::KVRefValueHashSum(key, value, value_hash, sum) => format!(
+                "KVRefValueHashSum({}, {}, HASH[{}], {})",
+                hex_to_ascii(key),
+                hex_to_ascii(value),
+                hex::encode(value_hash),
+                sum
+            ),
+            Node::KVDigestSum(key, value_hash, sum) => format!(
+                "KVDigestSum({}, HASH[{}], {})",
+                hex_to_ascii(key),
+                hex::encode(value_hash),
+                sum
+            ),
+            Node::HashWithSum(kv_hash, left_child_hash, right_child_hash, sum) => format!(
+                "HashWithSum(kv_hash=HASH[{}], left=HASH[{}], right=HASH[{}], sum={})",
+                hex::encode(kv_hash),
+                hex::encode(left_child_hash),
+                hex::encode(right_child_hash),
+                sum
+            ),
         };
         write!(f, "{}", node_string)
     }
@@ -306,4 +379,91 @@ mod tests {
             display
         );
     }
+
+    // Display tests for the ProvableSumTree proof-node variants. Each
+    // variant has its own match arm in the Display impl, so we exercise them
+    // individually to ensure they don't accidentally fall through to a
+    // wildcard that would mask future drift.
+ + #[test] + fn display_kv_sum() { + let node = Node::KVSum(b"k".to_vec(), b"v".to_vec(), -7); + let display = node.to_string(); + assert!(display.starts_with("KVSum("), "got: {}", display); + assert!( + display.contains("-7"), + "expected sum in output: {}", + display + ); + } + + #[test] + fn display_kv_hash_sum() { + let node = Node::KVHashSum([0xAB; HASH_LENGTH], 123); + let display = node.to_string(); + assert!(display.starts_with("KVHashSum("), "got: {}", display); + assert!(display.contains("123"), "expected sum: {}", display); + assert!( + display.contains(&hex::encode([0xAB; HASH_LENGTH])), + "expected kv_hash hex: {}", + display + ); + } + + #[test] + fn display_kv_ref_value_hash_sum() { + let node = + Node::KVRefValueHashSum(b"k".to_vec(), b"v".to_vec(), [0xCD; HASH_LENGTH], i64::MIN); + let display = node.to_string(); + assert!( + display.starts_with("KVRefValueHashSum("), + "got: {}", + display + ); + assert!( + display.contains(&i64::MIN.to_string()), + "expected i64::MIN: {}", + display + ); + } + + #[test] + fn display_kv_digest_sum() { + let node = Node::KVDigestSum(b"k".to_vec(), [0xEF; HASH_LENGTH], i64::MAX); + let display = node.to_string(); + assert!(display.starts_with("KVDigestSum("), "got: {}", display); + assert!( + display.contains(&i64::MAX.to_string()), + "expected i64::MAX: {}", + display + ); + } + + #[test] + fn display_hash_with_sum() { + let node = Node::HashWithSum( + [0x11; HASH_LENGTH], + [0x22; HASH_LENGTH], + [0x33; HASH_LENGTH], + -42, + ); + let display = node.to_string(); + assert!(display.starts_with("HashWithSum("), "got: {}", display); + assert!(display.contains("sum=-42"), "expected sum=-42: {}", display); + assert!( + display.contains(&hex::encode([0x11; HASH_LENGTH])), + "expected kv_hash hex: {}", + display + ); + assert!( + display.contains(&hex::encode([0x22; HASH_LENGTH])), + "expected left hex: {}", + display + ); + assert!( + display.contains(&hex::encode([0x33; HASH_LENGTH])), + "expected right hex: {}", + 
display + ); + } } diff --git a/grovedb-query/src/proofs/tree_feature_type.rs b/grovedb-query/src/proofs/tree_feature_type.rs index b5fa81cb0..50c7643b5 100644 --- a/grovedb-query/src/proofs/tree_feature_type.rs +++ b/grovedb-query/src/proofs/tree_feature_type.rs @@ -10,7 +10,7 @@ use integer_encoding::{VarInt, VarIntReader, VarIntWriter}; use self::TreeFeatureType::{ BasicMerkNode, BigSummedMerkNode, CountedMerkNode, CountedSummedMerkNode, - ProvableCountedMerkNode, SummedMerkNode, + ProvableCountedMerkNode, ProvableSummedMerkNode, SummedMerkNode, }; use crate::proofs::TreeFeatureType::ProvableCountedSummedMerkNode; @@ -32,6 +32,11 @@ pub enum NodeType { ProvableCountNode, /// Provable count + sum node (count included in hash) ProvableCountSumNode, + /// Provable sum node (sum included in hash). Mirrors `SumNode`'s + /// encoding layout (i64 varint, 9-byte feature length), but the hash + /// computation includes the sum so the sum participates in the node + /// hash (unlike `SumNode`, which only tracks the sum alongside). + ProvableSumNode, } impl NodeType { @@ -45,6 +50,7 @@ impl NodeType { NodeType::CountSumNode => 17, NodeType::ProvableCountNode => 9, NodeType::ProvableCountSumNode => 17, + NodeType::ProvableSumNode => 9, } } @@ -58,6 +64,7 @@ impl NodeType { NodeType::CountSumNode => 16, NodeType::ProvableCountNode => 8, NodeType::ProvableCountSumNode => 16, + NodeType::ProvableSumNode => 8, } } } @@ -79,6 +86,10 @@ pub enum TreeFeatureType { ProvableCountedMerkNode(u64), /// Provable Counted and Summed Merk Tree Node (count in hash, sum tracked) ProvableCountedSummedMerkNode(u64, i64), + /// Provable Summed Merk Tree Node (sum included in hash). + /// Mirrors `SummedMerkNode` for encoding/cost purposes, but the hash + /// computation includes the sum so the sum participates in the node hash. 
+    ProvableSummedMerkNode(i64),
 }

 impl TreeFeatureType {
@@ -92,7 +103,10 @@ impl TreeFeatureType {
             | ProvableCountedMerkNode(count)
             | CountedSummedMerkNode(count, _)
             | ProvableCountedSummedMerkNode(count, _) => Some(*count),
-            BasicMerkNode | SummedMerkNode(_) | BigSummedMerkNode(_) => None,
+            BasicMerkNode
+            | SummedMerkNode(_)
+            | BigSummedMerkNode(_)
+            | ProvableSummedMerkNode(_) => None,
         }
     }
@@ -107,7 +121,26 @@ impl TreeFeatureType {
         match self {
             CountedMerkNode(count) | ProvableCountedMerkNode(count) => *count = 0,
             CountedSummedMerkNode(count, _) | ProvableCountedSummedMerkNode(count, _) => *count = 0,
-            BasicMerkNode | SummedMerkNode(_) | BigSummedMerkNode(_) => {}
+            BasicMerkNode
+            | SummedMerkNode(_)
+            | BigSummedMerkNode(_)
+            | ProvableSummedMerkNode(_) => {}
+        }
+    }
+
+    /// Force the sum component of this feature type to 0, leaving count
+    /// components untouched. No-op for variants that don't carry a sum.
+    ///
+    /// Mirrors `zero_count` for the `Element::NotSummed` wrapper: when
+    /// computing the parent sum tree's feature type for a not-summed child,
+    /// we use the inner element's feature type but zero out its sum so the
+    /// parent's aggregate excludes it.
+    pub fn zero_sum(&mut self) {
+        match self {
+            SummedMerkNode(sum) | ProvableSummedMerkNode(sum) => *sum = 0,
+            BigSummedMerkNode(sum) => *sum = 0,
+            CountedSummedMerkNode(_, sum) | ProvableCountedSummedMerkNode(_, sum) => *sum = 0,
+            BasicMerkNode | CountedMerkNode(_) | ProvableCountedMerkNode(_) => {}
         }
     }
@@ -121,6 +154,7 @@ impl TreeFeatureType {
             CountedSummedMerkNode(..) => NodeType::CountSumNode,
             ProvableCountedMerkNode(_) => NodeType::ProvableCountNode,
             ProvableCountedSummedMerkNode(..) => NodeType::ProvableCountSumNode,
+            ProvableSummedMerkNode(_) => NodeType::ProvableSumNode,
         }
     }
@@ -135,6 +169,7 @@ impl TreeFeatureType {
             CountedSummedMerkNode(..) => 17,
             ProvableCountedMerkNode(_) => 9,
             ProvableCountedSummedMerkNode(..) => 17,
+            ProvableSummedMerkNode(_) => 9,
         }
     }
 }
@@ -168,6 +203,10 @@ impl TreeFeatureType {
                 TreeCostType::TreeFeatureUsesTwoVarIntsCostAs16Bytes,
                 count.encode_var_vec().len() as u32 + sum.encode_var_vec().len() as u32,
             )),
+            ProvableSummedMerkNode(m) => Some((
+                TreeCostType::TreeFeatureUsesVarIntCostAs8Bytes,
+                m.encode_var_vec().len() as u32,
+            )),
         }
     }
 }
@@ -212,6 +251,11 @@ impl Encode for TreeFeatureType {
                 dest.write_varint(*sum)?;
                 Ok(())
             }
+            ProvableSummedMerkNode(sum) => {
+                dest.write_all(&[7])?;
+                dest.write_varint(*sum)?;
+                Ok(())
+            }
         }
     }
@@ -240,6 +284,10 @@ impl Encode for TreeFeatureType {
                 let encoded_lengths = count.encode_var_vec().len() + sum.encode_var_vec().len();
                 Ok(1 + encoded_lengths)
             }
+            ProvableSummedMerkNode(sum) => {
+                let encoded_sum = sum.encode_var_vec();
+                Ok(1 + encoded_sum.len())
+            }
         }
     }
 }
@@ -279,6 +327,10 @@ impl Decode for TreeFeatureType {
                 let encoded_sum: i64 = input.read_varint()?;
                 Ok(ProvableCountedSummedMerkNode(encoded_count, encoded_sum))
             }
+            [7] => {
+                let encoded_sum: i64 = input.read_varint()?;
+                Ok(ProvableSummedMerkNode(encoded_sum))
+            }
             [b] => Err(ed::Error::UnexpectedByte(b)),
         }
     }
@@ -328,5 +380,83 @@ mod tests {
         assert_eq!(CountedSummedMerkNode(7, 42).count(), Some(7));
         assert_eq!(ProvableCountedMerkNode(7).count(), Some(7));
         assert_eq!(ProvableCountedSummedMerkNode(7, 42).count(), Some(7));
+        // ProvableSummedMerkNode carries a sum, not a count.
+        assert_eq!(ProvableSummedMerkNode(42).count(), None);
+    }
+
+    /// `zero_sum` mirrors `zero_count`: it zeroes the sum component on
+    /// every sum-bearing variant and is a no-op elsewhere.
+    #[test]
+    fn zero_sum_only_zeros_sum() {
+        let mut basic = BasicMerkNode;
+        basic.zero_sum();
+        assert_eq!(basic, BasicMerkNode);
+
+        let mut summed = SummedMerkNode(42);
+        summed.zero_sum();
+        assert_eq!(summed, SummedMerkNode(0));
+
+        let mut big_summed = BigSummedMerkNode(42);
+        big_summed.zero_sum();
+        assert_eq!(big_summed, BigSummedMerkNode(0));
+
+        let mut counted = CountedMerkNode(7);
+        counted.zero_sum();
+        assert_eq!(counted, CountedMerkNode(7));
+
+        let mut count_sum = CountedSummedMerkNode(7, 42);
+        count_sum.zero_sum();
+        assert_eq!(count_sum, CountedSummedMerkNode(7, 0));
+
+        let mut prov_counted = ProvableCountedMerkNode(7);
+        prov_counted.zero_sum();
+        assert_eq!(prov_counted, ProvableCountedMerkNode(7));
+
+        let mut prov_count_sum = ProvableCountedSummedMerkNode(7, 42);
+        prov_count_sum.zero_sum();
+        assert_eq!(prov_count_sum, ProvableCountedSummedMerkNode(7, 0));
+
+        let mut prov_summed = ProvableSummedMerkNode(42);
+        prov_summed.zero_sum();
+        assert_eq!(prov_summed, ProvableSummedMerkNode(0));
+    }
+
+    /// `ProvableSummedMerkNode` round-trips through `Encode`/`Decode` with
+    /// tag byte 7 followed by a varint i64.
+    #[test]
+    fn provable_summed_round_trip() {
+        for &sum in &[0i64, 1, -1, 42, -42, i64::MAX, i64::MIN] {
+            let original = ProvableSummedMerkNode(sum);
+            let mut buf = Vec::new();
+            original.encode_into(&mut buf).expect("encode");
+            // First byte must be the tag.
+            assert_eq!(buf[0], 7);
+            // Encoded length matches the variable-length `encoding_length`
+            // (1 tag byte + varint i64). `encoding_cost` is the storage
+            // worst-case (9 = 1 + 8) and does NOT have to match the
+            // serialized length on the wire — it intentionally over-counts
+            // so cost accounting is consistent across all sum values.
+            assert_eq!(
+                buf.len(),
+                original.encoding_length().expect("encoding_length")
+            );
+            let back = TreeFeatureType::decode(&buf[..]).expect("decode");
+            assert_eq!(back, original);
+        }
+    }
+
+    /// The new `ProvableSumNode` carries the same feature length / cost as
+    /// `SumNode` so cost accounting remains consistent.
+    #[test]
+    fn provable_sum_node_matches_sum_node_layout() {
+        assert_eq!(
+            NodeType::ProvableSumNode.feature_len(),
+            NodeType::SumNode.feature_len()
+        );
+        assert_eq!(NodeType::ProvableSumNode.cost(), NodeType::SumNode.cost());
+        assert_eq!(
+            ProvableSummedMerkNode(0).node_type(),
+            NodeType::ProvableSumNode
+        );
+    }
 }
diff --git a/grovedb-query/src/query.rs b/grovedb-query/src/query.rs
index 2ab979994..992048d8d 100644
--- a/grovedb-query/src/query.rs
+++ b/grovedb-query/src/query.rs
@@ -303,6 +303,144 @@ impl Query {
         }
     }

+    /// Creates an aggregate-sum-on-range query that sums the children matched
+    /// by `range`. Mirrors `Query::new_aggregate_count_on_range` for
+    /// `ProvableSumTree` instead of `ProvableCountTree`.
+    ///
+    /// `range` must be a true range variant; passing `Key`, `RangeFull`,
+    /// another `AggregateSumOnRange`, or an `AggregateCountOnRange` is
+    /// allowed at construction time but will be rejected by
+    /// [`Self::validate_aggregate_sum_on_range`].
+    pub fn new_aggregate_sum_on_range(range: QueryItem) -> Self {
+        Self {
+            items: vec![QueryItem::AggregateSumOnRange(Box::new(range))],
+            left_to_right: true,
+            ..Self::default()
+        }
+    }
+
+    /// Returns `Some(...)` for any query containing an
+    /// `AggregateSumOnRange` item, regardless of well-formedness.
+    pub fn aggregate_sum_on_range(&self) -> Option<&QueryItem> {
+        self.items
+            .iter()
+            .find(|item| item.is_aggregate_sum_on_range())
+    }
+
+    /// Mirror of `Query::has_aggregate_count_on_range_anywhere` for
+    /// `AggregateSumOnRange`. Used by the prover/verifier to validate at
+    /// entry — if any `AggregateSumOnRange` (ASOR) is present anywhere, the
+    /// query must satisfy [`Self::validate_aggregate_sum_on_range`].
+    pub fn has_aggregate_sum_on_range_anywhere(&self) -> bool {
+        if self.aggregate_sum_on_range().is_some() {
+            return true;
+        }
+        if let Some(sub) = self.default_subquery_branch.subquery.as_deref()
+            && sub.has_aggregate_sum_on_range_anywhere()
+        {
+            return true;
+        }
+        if let Some(branches) = &self.conditional_subquery_branches {
+            for (selector, branch) in branches {
+                // The selector is itself a `QueryItem` and could carry an
+                // `AggregateSumOnRange` tag (the type permits it even
+                // though it would not be a meaningful conditional
+                // matcher). Reject defensively so a hidden ASOR in a
+                // selector cannot slip past the aggregate-shape check.
+                if selector.is_aggregate_sum_on_range() {
+                    return true;
+                }
+                if let Some(sub) = branch.subquery.as_deref()
+                    && sub.has_aggregate_sum_on_range_anywhere()
+                {
+                    return true;
+                }
+            }
+        }
+        false
+    }
+
+    /// Validates the Query-level constraints that apply when an
+    /// `AggregateSumOnRange` is present. Mirror of
+    /// `Query::validate_aggregate_count_on_range` (now in the
+    /// `grovedb-query::aggregate_count` module) for `ProvableSumTree`.
+    ///
+    /// Rules enforced:
+    ///
+    /// 1. The query must contain exactly one item.
+    /// 2. That item must be `AggregateSumOnRange(_)`.
+    /// 3. The inner item must not be `Key` (use `has_raw` / `get_raw` for
+    ///    existence tests).
+    /// 4. The inner item must not be `RangeFull` (read the parent
+    ///    `Element::ProvableSumTree` bytes directly for the unconditional
+    ///    total).
+    /// 5. The inner item must not itself be `AggregateSumOnRange`.
+    /// 6. The inner item must not be `AggregateCountOnRange` (the two
+    ///    aggregate variants are orthogonal).
+    /// 7. `default_subquery_branch.subquery` and
+    ///    `default_subquery_branch.subquery_path` must both be `None`.
+    /// 8. `conditional_subquery_branches` must be `None` or empty.
+    ///
+    /// `SizedQuery::limit` / `SizedQuery::offset` checks live at the
+    /// `PathQuery` / `SizedQuery` layer.
+    pub fn validate_aggregate_sum_on_range(&self) -> Result<&QueryItem, Error> {
+        if self.items.len() != 1 {
+            return Err(Error::InvalidOperation(
+                "AggregateSumOnRange must be the only item in the query",
+            ));
+        }
+        let inner = match &self.items[0] {
+            QueryItem::AggregateSumOnRange(inner) => inner.as_ref(),
+            _ => {
+                return Err(Error::InvalidOperation(
+                    "validate_aggregate_sum_on_range called on a query without an \
+                     AggregateSumOnRange item",
+                ));
+            }
+        };
+        match inner {
+            QueryItem::Key(_) => {
+                return Err(Error::InvalidOperation(
+                    "AggregateSumOnRange may not wrap Key — use has_raw / get_raw for \
+                     existence tests",
+                ));
+            }
+            QueryItem::RangeFull(_) => {
+                return Err(Error::InvalidOperation(
+                    "AggregateSumOnRange may not wrap RangeFull — read the parent \
+                     ProvableSumTree element for the unconditional total",
+                ));
+            }
+            QueryItem::AggregateSumOnRange(_) => {
+                return Err(Error::InvalidOperation(
+                    "AggregateSumOnRange may not wrap another AggregateSumOnRange",
+                ));
+            }
+            QueryItem::AggregateCountOnRange(_) => {
+                return Err(Error::InvalidOperation(
+                    "AggregateSumOnRange may not wrap AggregateCountOnRange — the two are \
+                     orthogonal aggregate queries",
+                ));
+            }
+            _ => {}
+        }
+        if self.default_subquery_branch.subquery.is_some()
+            || self.default_subquery_branch.subquery_path.is_some()
+        {
+            return Err(Error::InvalidOperation(
+                "AggregateSumOnRange queries may not carry a default subquery branch",
+            ));
+        }
+        if let Some(branches) = &self.conditional_subquery_branches
+            && !branches.is_empty()
+        {
+            return Err(Error::InvalidOperation(
+                "AggregateSumOnRange queries may not carry conditional subquery branches",
+            ));
+        }
+        Ok(inner)
+    }
+
     /// Returns `true` if the given key would trigger a subquery (either via
     /// the default subquery branch or a matching conditional branch).
     pub fn has_subquery_on_key(&self, key: &[u8], in_path: bool) -> bool {
@@ -907,4 +1045,62 @@ mod tests {
             "innermost query should have no further subquery"
         );
     }
+
+    /// Sum-side mirror of `has_aggregate_count_on_range_anywhere_walks_subqueries`,
+    /// with one extra case: an `AggregateSumOnRange` tag appearing as the
+    /// *selector* (map key) of a conditional subquery branch. The selector
+    /// is itself a `QueryItem` and the type permits ASOR there even though
+    /// it would never be a meaningful matcher; the walker must surface it
+    /// so the prove_query entry-point gate can reject the malformed shape.
+    #[test]
+    fn has_aggregate_sum_on_range_anywhere_walks_subqueries_and_selectors() {
+        // No ASOR anywhere → false.
+        let plain = Query::new_single_query_item(QueryItem::Range(b"a".to_vec()..b"z".to_vec()));
+        assert!(!plain.has_aggregate_sum_on_range_anywhere());
+
+        // Top-level ASOR → true.
+        let top = Query::new_aggregate_sum_on_range(QueryItem::Range(b"a".to_vec()..b"z".to_vec()));
+        assert!(top.has_aggregate_sum_on_range_anywhere());
+
+        // ASOR hidden inside default_subquery_branch.subquery.
+        let inner =
+            Query::new_aggregate_sum_on_range(QueryItem::Range(b"a".to_vec()..b"z".to_vec()));
+        let mut hidden =
+            Query::new_single_query_item(QueryItem::Range(b"a".to_vec()..b"z".to_vec()));
+        hidden.set_subquery(inner);
+        assert!(hidden.aggregate_sum_on_range().is_none());
+        assert!(
+            hidden.has_aggregate_sum_on_range_anywhere(),
+            "ASOR hidden in default subquery branch must be detected"
+        );
+
+        // ASOR hidden inside a conditional subquery branch's subquery.
+        let inner2 =
+            Query::new_aggregate_sum_on_range(QueryItem::Range(b"a".to_vec()..b"z".to_vec()));
+        let mut conditional =
+            Query::new_single_query_item(QueryItem::Range(b"a".to_vec()..b"z".to_vec()));
+        conditional.add_conditional_subquery(QueryItem::Key(b"k".to_vec()), None, Some(inner2));
+        assert!(
+            conditional.has_aggregate_sum_on_range_anywhere(),
+            "ASOR hidden in conditional subquery branch must be detected"
+        );
+
+        // ASOR appearing as the SELECTOR of a conditional branch. The
+        // selector itself is a `QueryItem` and could carry an ASOR tag —
+        // pre-fix this slipped past the walker because the iteration
+        // looked only at `branch.subquery` and ignored the map key.
+        let mut selector =
+            Query::new_single_query_item(QueryItem::Range(b"a".to_vec()..b"z".to_vec()));
+        selector.add_conditional_subquery(
+            QueryItem::AggregateSumOnRange(Box::new(QueryItem::Range(
+                b"a".to_vec()..b"z".to_vec(),
+            ))),
+            None,
+            None,
+        );
+        assert!(
+            selector.has_aggregate_sum_on_range_anywhere(),
+            "ASOR appearing as a conditional-branch selector must be detected"
+        );
+    }
 }
diff --git a/grovedb-query/src/query_item/intersect.rs b/grovedb-query/src/query_item/intersect.rs
index 22d414390..9996cbd2d 100644
--- a/grovedb-query/src/query_item/intersect.rs
+++ b/grovedb-query/src/query_item/intersect.rs
@@ -613,6 +613,7 @@ impl QueryItem {
                 end: RangeSetItem::Inclusive(range.end().clone()),
             },
             QueryItem::AggregateCountOnRange(inner) => inner.to_range_set(),
+            QueryItem::AggregateSumOnRange(inner) => inner.to_range_set(),
         }
     }
@@ -662,6 +663,7 @@ impl QueryItem {
                 end: RangeSetSimpleItemBorrowed::Inclusive(range.end()),
             }),
             QueryItem::AggregateCountOnRange(inner) => inner.to_range_set_borrowed(),
+            QueryItem::AggregateSumOnRange(inner) => inner.to_range_set_borrowed(),
         }
     }
diff --git a/grovedb-query/src/query_item/mod.rs b/grovedb-query/src/query_item/mod.rs
index b42b9a939..2a75f0f24 100644
--- a/grovedb-query/src/query_item/mod.rs
+++ b/grovedb-query/src/query_item/mod.rs
@@ -91,6 +91,26 @@ pub enum QueryItem {
     /// no pagination, no other range items). The inner `QueryItem` may not be
     /// `Key`, `RangeFull`, or another `AggregateCountOnRange`.
    AggregateCountOnRange(Box<QueryItem>),
+
+    /// A sum-only meta-query that wraps another `QueryItem` describing the
+    /// range to sum.
+    ///
+    /// When this variant appears in a `Query`, the query is interpreted as
+    /// "return the **total signed sum** of children with keys in the inner
+    /// range" instead of returning the elements themselves. The proof is
+    /// shaped accordingly: boundary nodes are emitted as `KVDigestSum`,
+    /// fully-inside subtree roots as `HashWithSum`, and fully-outside subtree
+    /// roots also as `HashWithSum` so the parent's `own_sum` derivation is
+    /// hash-bound (same self-verifying compression pattern as count).
+    ///
+    /// This variant is only valid against `ProvableSumTree` (and its
+    /// `NotSummed` wrapper variant), and it must be the **only** item in the
+    /// surrounding `Query` (no subqueries, no pagination, no other range
+    /// items). The inner `QueryItem` may not be `Key`, `RangeFull`, another
+    /// `AggregateCountOnRange`, or another `AggregateSumOnRange`. Sum values
+    /// are signed `i64`; the verifier uses an `i128` accumulator and narrows
+    /// to `i64` at the end to detect overflow on adversarial inputs.
+    AggregateSumOnRange(Box<QueryItem>),
 }

 #[cfg(feature = "serde")]
@@ -99,47 +119,62 @@ impl Serialize for QueryItem {
     where
         S: Serializer,
     {
+        // Variant tags are emitted in snake_case so that textual formats
+        // (JSON, YAML, TOML) round-trip through the matching Deserialize
+        // impl, which uses `#[serde(field_identifier, rename_all =
+        // "snake_case")]` on its `Field` enum. Binary formats that
+        // identify variants by index (bincode, postcard) ignore the
+        // string tag, so this change is transparent to them.
         match self {
-            QueryItem::Key(key) => serializer.serialize_newtype_variant("QueryItem", 0, "Key", key),
+            QueryItem::Key(key) => serializer.serialize_newtype_variant("QueryItem", 0, "key", key),
             QueryItem::Range(range) => {
-                serializer.serialize_newtype_variant("QueryItem", 1, "Range", &range)
+                serializer.serialize_newtype_variant("QueryItem", 1, "range", &range)
             }
             QueryItem::RangeInclusive(range) => {
-                serializer.serialize_newtype_variant("QueryItem", 2, "RangeInclusive", range)
+                serializer.serialize_newtype_variant("QueryItem", 2, "range_inclusive", range)
             }
             QueryItem::RangeFull(_) => {
-                serializer.serialize_unit_variant("QueryItem", 3, "RangeFull")
+                serializer.serialize_unit_variant("QueryItem", 3, "range_full")
             }
             QueryItem::RangeFrom(range_from) => {
-                serializer.serialize_newtype_variant("QueryItem", 4, "RangeFrom", range_from)
+                serializer.serialize_newtype_variant("QueryItem", 4, "range_from", range_from)
             }
             QueryItem::RangeTo(range_to) => {
-                serializer.serialize_newtype_variant("QueryItem", 5, "RangeTo", range_to)
+                serializer.serialize_newtype_variant("QueryItem", 5, "range_to", range_to)
             }
             QueryItem::RangeToInclusive(range_to_inclusive) => serializer
                 .serialize_newtype_variant(
                     "QueryItem",
                     6,
-                    "RangeToInclusive",
+                    "range_to_inclusive",
                     &range_to_inclusive.end,
                 ),
             QueryItem::RangeAfter(range_after) => {
-                serializer.serialize_newtype_variant("QueryItem", 7, "RangeAfter", range_after)
-            }
-            QueryItem::RangeAfterTo(range_after_to) => {
-                serializer.serialize_newtype_variant("QueryItem", 8, "RangeAfterTo", range_after_to)
+                serializer.serialize_newtype_variant("QueryItem", 7, "range_after", range_after)
             }
+            QueryItem::RangeAfterTo(range_after_to) => serializer.serialize_newtype_variant(
+                "QueryItem",
+                8,
+                "range_after_to",
+                range_after_to,
+            ),
             QueryItem::RangeAfterToInclusive(range_after_to_inclusive) => serializer
                 .serialize_newtype_variant(
                     "QueryItem",
                     9,
-                    "RangeAfterToInclusive",
+                    "range_after_to_inclusive",
                     range_after_to_inclusive,
                 ),
             QueryItem::AggregateCountOnRange(inner) => serializer.serialize_newtype_variant(
                 "QueryItem",
                 10,
-                "AggregateCountOnRange",
+                "aggregate_count_on_range",
                 inner,
             ),
+            QueryItem::AggregateSumOnRange(inner) => serializer.serialize_newtype_variant(
+                "QueryItem",
+                11,
+                "aggregate_sum_on_range",
+                inner,
+            ),
         }
     }
@@ -166,6 +201,7 @@ impl<'de> Deserialize<'de> for QueryItem {
             RangeAfterTo,
             RangeAfterToInclusive,
             AggregateCountOnRange,
+            AggregateSumOnRange,
         }

         struct QueryItemVisitor;
@@ -235,6 +271,18 @@ impl<'de> Deserialize<'de> for QueryItem {
                         let NonAggregateInner(inner) = variant_access.newtype_variant()?;
                         Ok(QueryItem::AggregateCountOnRange(Box::new(inner)))
                     }
+                    Field::AggregateSumOnRange => {
+                        // Same defense-in-depth as AggregateCountOnRange: the
+                        // inner is deserialized through `NonAggregateInner`
+                        // whose field set excludes both aggregate-variant
+                        // tags, so any nested aggregate payload is rejected
+                        // immediately by serde without recursing through
+                        // `QueryItem::deserialize`. Keeps `AggregateSumOnRange`
+                        // and `AggregateCountOnRange` orthogonal — one cannot
+                        // wrap the other.
+                        let NonAggregateInner(inner) = variant_access.newtype_variant()?;
+                        Ok(QueryItem::AggregateSumOnRange(Box::new(inner)))
+                    }
                 }
             }
         }
@@ -251,6 +299,7 @@ impl<'de> Deserialize<'de> for QueryItem {
             "RangeAfterTo",
             "RangeAfterToInclusive",
             "AggregateCountOnRange",
+            "AggregateSumOnRange",
         ];

         deserializer.deserialize_enum("QueryItem", VARIANTS, QueryItemVisitor)
@@ -258,14 +307,18 @@ impl<'de> Deserialize<'de> for QueryItem {
 }

 /// Newtype wrapper used internally by the serde `Deserialize` impl when
-/// deserializing the *inner* item of an `AggregateCountOnRange`. The wrapper's
-/// `Deserialize` impl mirrors `QueryItem::deserialize` but rejects the
-/// `AggregateCountOnRange` field tag immediately — without recursing — so
-/// nested aggregate payloads cannot exhaust the stack via repeated variant-10
-/// recursion through `QueryItem::deserialize`.
+/// deserializing the *inner* item of an `AggregateCountOnRange` or
+/// `AggregateSumOnRange`. The wrapper's `Deserialize` impl mirrors
+/// `QueryItem::deserialize` but rejects both aggregate field tags immediately
+/// — without recursing — so nested aggregate payloads cannot exhaust the
+/// stack via repeated variant-10/11 recursion through
+/// `QueryItem::deserialize`. Reused for both aggregate wrappers so
+/// `AggregateCountOnRange` and `AggregateSumOnRange` stay orthogonal:
+/// neither can wrap the other (or itself).
 ///
-/// Defense-in-depth: nested `AggregateCountOnRange` is also rejected by
-/// `Query::validate_aggregate_count_on_range`, but enforcing it at decode time
+/// Defense-in-depth: nested aggregate variants are also rejected by
+/// `Query::validate_aggregate_count_on_range` /
+/// `Query::validate_aggregate_sum_on_range`, but enforcing it at decode time
 /// matches the bincode side and prevents the DoS class on its own.
 #[cfg(feature = "serde")]
 struct NonAggregateInner(QueryItem);
@@ -276,9 +329,9 @@ impl<'de> Deserialize<'de> for NonAggregateInner {
     where
         D: Deserializer<'de>,
     {
-        // Field set excludes "AggregateCountOnRange"; encountering that tag
-        // produces a serde "unknown variant" error before any inner
-        // recursion can happen.
+        // Field set excludes both `AggregateCountOnRange` and
+        // `AggregateSumOnRange`; encountering either tag produces a serde
+        // "unknown variant" error before any inner recursion can happen.
         #[derive(Deserialize)]
         #[serde(field_identifier, rename_all = "snake_case")]
         enum Field {
@@ -405,6 +458,10 @@ impl Encode for QueryItem {
                 encoder.writer().write(&[10])?;
                 inner.as_ref().encode(encoder)
             }
+            QueryItem::AggregateSumOnRange(inner) => {
+                encoder.writer().write(&[11])?;
+                inner.as_ref().encode(encoder)
+            }
         }
     }
 }
@@ -490,17 +547,38 @@ impl QueryItem {
                 // Defense-in-depth: nested AggregateCountOnRange is invalid
                 // by validation rules, so we also reject it at decode time.
                 // The depth guard above remains the primary stack-overflow
-                // mitigation for malicious deeper nesting.
-                if matches!(inner, QueryItem::AggregateCountOnRange(_)) {
+                // mitigation for malicious deeper nesting. Also reject
+                // `AggregateSumOnRange` to keep the two aggregate variants
+                // orthogonal.
+                if matches!(
+                    inner,
+                    QueryItem::AggregateCountOnRange(_) | QueryItem::AggregateSumOnRange(_)
+                ) {
                     return Err(DecodeError::Other(
-                        "AggregateCountOnRange must not wrap another AggregateCountOnRange",
+                        "AggregateCountOnRange must not wrap another aggregate variant",
                     ));
                 }
                 Ok(QueryItem::AggregateCountOnRange(Box::new(inner)))
             }
+            11 => {
+                let inner = QueryItem::decode_with_depth(decoder, depth + 1)?;
+                // Same defense-in-depth as variant 10. `AggregateSumOnRange`
+                // may not wrap another aggregate variant (whether sum or
+                // count) — keeps the two orthogonal and the depth guard
+                // primary mitigation against stack-exhaustion.
+                if matches!(
+                    inner,
+                    QueryItem::AggregateSumOnRange(_) | QueryItem::AggregateCountOnRange(_)
+                ) {
+                    return Err(DecodeError::Other(
+                        "AggregateSumOnRange must not wrap another aggregate variant",
+                    ));
+                }
+                Ok(QueryItem::AggregateSumOnRange(Box::new(inner)))
+            }
             _ => Err(DecodeError::UnexpectedVariant {
                 type_name: "QueryItem",
-                allowed: &bincode::error::AllowedEnumVariants::Range { min: 0, max: 10 },
+                allowed: &bincode::error::AllowedEnumVariants::Range { min: 0, max: 11 },
                 found: variant_id as u32,
             }),
         }
@@ -575,16 +653,31 @@ impl QueryItem {
             10 => {
                 let inner = QueryItem::borrow_decode_with_depth(decoder, depth + 1)?;
-                if matches!(inner, QueryItem::AggregateCountOnRange(_)) {
+                if matches!(
+                    inner,
+                    QueryItem::AggregateCountOnRange(_) | QueryItem::AggregateSumOnRange(_)
+                ) {
                     return Err(DecodeError::Other(
-                        "AggregateCountOnRange must not wrap another AggregateCountOnRange",
+                        "AggregateCountOnRange must not wrap another aggregate variant",
                     ));
                 }
                 Ok(QueryItem::AggregateCountOnRange(Box::new(inner)))
             }
+            11 => {
+                let inner = QueryItem::borrow_decode_with_depth(decoder, depth + 1)?;
+                if matches!(
+                    inner,
+                    QueryItem::AggregateSumOnRange(_) | QueryItem::AggregateCountOnRange(_)
+                ) {
+                    return Err(DecodeError::Other(
+                        "AggregateSumOnRange must not wrap another aggregate variant",
+                    ));
+                }
+                Ok(QueryItem::AggregateSumOnRange(Box::new(inner)))
+            }
             _ => Err(DecodeError::UnexpectedVariant {
                 type_name: "QueryItem",
-                allowed: &bincode::error::AllowedEnumVariants::Range { min: 0, max: 10 },
+                allowed: &bincode::error::AllowedEnumVariants::Range { min: 0, max: 11 },
                 found: variant_id as u32,
             }),
         }
@@ -633,6 +726,9 @@ impl fmt::Display for QueryItem {
             QueryItem::AggregateCountOnRange(inner) => {
                 write!(f, "AggregateCountOnRange({})", inner)
             }
+            QueryItem::AggregateSumOnRange(inner) => {
+                write!(f, "AggregateSumOnRange({})", inner)
+            }
         }
     }
 }
@@ -644,6 +740,7 @@ impl QueryItem {
             QueryItem::Key(key) => key.len() as u32,
             QueryItem::RangeFull(_) => 0u32,
             QueryItem::AggregateCountOnRange(inner) => inner.processing_footprint(),
+            QueryItem::AggregateSumOnRange(inner) => inner.processing_footprint(),
             _ => {
                 self.lower_bound().0.map_or(0u32, |x| x.len() as u32)
                     + self.upper_bound().0.map_or(0u32, |x| x.len() as u32)
@@ -666,6 +763,7 @@ impl QueryItem {
             QueryItem::RangeAfterTo(range) => (Some(range.start.as_ref()), true),
             QueryItem::RangeAfterToInclusive(range) => (Some(range.start().as_ref()), true),
             QueryItem::AggregateCountOnRange(inner) => inner.lower_bound(),
+            QueryItem::AggregateSumOnRange(inner) => inner.lower_bound(),
         }
     }
@@ -683,6 +781,7 @@ impl QueryItem {
             QueryItem::RangeAfterTo(_) => false,
             QueryItem::RangeAfterToInclusive(_) => false,
             QueryItem::AggregateCountOnRange(inner) => inner.lower_unbounded(),
+            QueryItem::AggregateSumOnRange(inner) => inner.lower_unbounded(),
         }
     }
@@ -701,6 +800,7 @@ impl QueryItem {
             QueryItem::RangeAfterTo(range) => (Some(range.end.as_ref()), false),
             QueryItem::RangeAfterToInclusive(range) => (Some(range.end().as_ref()), true),
             QueryItem::AggregateCountOnRange(inner) => inner.upper_bound(),
+            QueryItem::AggregateSumOnRange(inner) => inner.upper_bound(),
         }
     }
@@ -718,6 +818,7 @@ impl QueryItem {
             QueryItem::RangeAfterTo(_) => false,
             QueryItem::RangeAfterToInclusive(_) => false,
             QueryItem::AggregateCountOnRange(inner) => inner.upper_unbounded(),
+            QueryItem::AggregateSumOnRange(inner) => inner.upper_unbounded(),
         }
     }
@@ -747,6 +848,7 @@ impl QueryItem {
             QueryItem::RangeAfterTo(_) => 8,
             QueryItem::RangeAfterToInclusive(_) => 9,
             QueryItem::AggregateCountOnRange(_) => 10,
+            QueryItem::AggregateSumOnRange(_) => 11,
         }
     }
@@ -756,8 +858,8 @@ impl QueryItem {
     }

     /// Returns `true` if this query item is any kind of range (not a single
-    /// key). `AggregateCountOnRange` counts as a range — it describes a range
-    /// to count over.
+    /// key). `AggregateCountOnRange` and `AggregateSumOnRange` count as
+    /// ranges — they describe a range to aggregate over.
     pub const fn is_range(&self) -> bool {
         matches!(
             self,
@@ -771,6 +873,7 @@ impl QueryItem {
             | QueryItem::RangeAfterTo(_)
             | QueryItem::RangeAfterToInclusive(_)
             | QueryItem::AggregateCountOnRange(_)
+            | QueryItem::AggregateSumOnRange(_)
         )
     }
@@ -781,10 +884,12 @@ impl QueryItem {
     /// Returns `true` if this query item is a range with at least one unbounded
     /// end (e.g., `RangeFull`, `RangeFrom`, `RangeTo`, etc.). For
-    /// `AggregateCountOnRange`, delegates to the inner item.
+    /// `AggregateCountOnRange` and `AggregateSumOnRange`, delegates to the
+    /// inner item.
     pub fn is_unbounded_range(&self) -> bool {
         match self {
             QueryItem::AggregateCountOnRange(inner) => inner.is_unbounded_range(),
+            QueryItem::AggregateSumOnRange(inner) => inner.is_unbounded_range(),
             _ => !matches!(
                 self,
                 QueryItem::Key(_) | QueryItem::Range(_) | QueryItem::RangeInclusive(_)
@@ -797,6 +902,11 @@ impl QueryItem {
         matches!(self, QueryItem::AggregateCountOnRange(_))
     }
+
+    /// Returns `true` if this query item is the sum-only meta-variant.
+    pub const fn is_aggregate_sum_on_range(&self) -> bool {
+        matches!(self, QueryItem::AggregateSumOnRange(_))
+    }
+
     /// If this is `AggregateCountOnRange`, returns a reference to the inner
     /// `QueryItem` describing the range to count. Otherwise returns `None`.
     pub fn aggregate_count_inner(&self) -> Option<&QueryItem> {
@@ -806,6 +916,15 @@ impl QueryItem {
         }
     }

+    /// If this is `AggregateSumOnRange`, returns a reference to the inner
+    /// `QueryItem` describing the range to sum. Otherwise returns `None`.
+    pub fn aggregate_sum_inner(&self) -> Option<&QueryItem> {
+        match self {
+            QueryItem::AggregateSumOnRange(inner) => Some(inner.as_ref()),
+            _ => None,
+        }
+    }
+
     /// Enumerates all distinct keys in this query item. Only works for `Key`,
     /// `Range`, and `RangeInclusive` with single-byte boundaries; returns an
     /// error for unbounded ranges.
@@ -1008,6 +1127,7 @@ impl QueryItem {
                 }
             }
             QueryItem::AggregateCountOnRange(inner) => inner.seek_for_iter(iter, left_to_right),
+            QueryItem::AggregateSumOnRange(inner) => inner.seek_for_iter(iter, left_to_right),
         }
     }
@@ -1103,6 +1223,9 @@ impl QueryItem {
             QueryItem::AggregateCountOnRange(inner) => {
                 return inner.iter_is_valid_for_type(iter, limit, aggregate_limit, left_to_right);
             }
+            QueryItem::AggregateSumOnRange(inner) => {
+                return inner.iter_is_valid_for_type(iter, limit, aggregate_limit, left_to_right);
+            }
         };

         is_valid.wrap_with_cost(cost)
@@ -1310,13 +1433,11 @@ mod test {
     // rejected immediately by serde without recursion through
     // `QueryItem::deserialize`.
     //
-    // We use `serde_test`'s token-level driver here rather than a textual
-    // format because the existing `Serialize` impl emits variant tags in
-    // PascalCase (`"AggregateCountOnRange"`) while the existing `Field` enum
-    // uses `rename_all = "snake_case"` — a pre-existing mismatch unrelated
-    // to this PR that breaks JSON round-trip but is invisible to formats
-    // that don't carry variant names textually. Using token streams sidesteps
-    // that issue and lets us validate the rejection contract directly.
+    // We use `serde_test`'s token-level driver here for symmetry with the
+    // rejection contract — the inner field set's snake_case tag (e.g.
+    // `aggregate_count_on_range`) matches both the Serialize impl (which
+    // emits snake_case variant tags) and the Deserialize impl
+    // (`#[serde(field_identifier, rename_all = "snake_case")]`).

     #[cfg(feature = "serde")]
     #[test]
@@ -1387,4 +1508,438 @@ mod test {
             ],
         );
     }
+
+    // ---------- AggregateSumOnRange: mirrors of the AggregateCountOnRange tests ----------
+    //
+    // These exist to drive coverage of the variant-11 dispatch in encode,
+    // decode_with_depth, borrow_decode_with_depth, Display, and the helper
+    // accessors. Each test targets a specific arm previously not exercised.
+
+    #[test]
+    fn decode_rejects_nested_aggregate_sum_on_range() {
+        // AggregateSumOnRange(AggregateSumOnRange(Range)): the inner nested
+        // aggregate must be rejected by the variant-11 dispatch's matches!
+        // guard (or by the depth guard).
+        let nested = QueryItem::AggregateSumOnRange(Box::new(QueryItem::AggregateSumOnRange(
+            Box::new(QueryItem::Range(b"a".to_vec()..b"z".to_vec())),
+        )));
+        let bytes = bincode::encode_to_vec(&nested, bincode_config()).expect("encode succeeds");
+        let result: Result<(QueryItem, _), _> =
+            bincode::decode_from_slice(&bytes, bincode_config());
+        let err = result.expect_err("nested AggregateSumOnRange must be rejected at decode");
+        let msg = format!("{:?}", err);
+        assert!(
+            msg.contains("AggregateSumOnRange") || msg.contains("nesting depth"),
+            "expected nested-rejection message, got: {msg}"
+        );
+    }
+
+    #[test]
+    fn decode_rejects_aggregate_sum_wrapping_aggregate_count() {
+        // Orthogonality check: AggregateSumOnRange cannot wrap
+        // AggregateCountOnRange (and vice versa) — this is the explicit
+        // matches! guard in variant 11.
+ let mixed = QueryItem::AggregateSumOnRange(Box::new(QueryItem::AggregateCountOnRange( + Box::new(QueryItem::Range(b"a".to_vec()..b"z".to_vec())), + ))); + let bytes = bincode::encode_to_vec(&mixed, bincode_config()).expect("encode succeeds"); + let result: Result<(QueryItem, _), _> = + bincode::decode_from_slice(&bytes, bincode_config()); + let err = result + .expect_err("AggregateSumOnRange wrapping AggregateCountOnRange must be rejected"); + let msg = format!("{:?}", err); + assert!( + msg.contains("AggregateSumOnRange") || msg.contains("aggregate"), + "expected nested-rejection message, got: {msg}" + ); + } + + #[test] + fn decode_rejects_aggregate_count_wrapping_aggregate_sum() { + // The other direction: AggregateCountOnRange cannot wrap + // AggregateSumOnRange — variant 10's matches! guard. + let mixed = QueryItem::AggregateCountOnRange(Box::new(QueryItem::AggregateSumOnRange( + Box::new(QueryItem::Range(b"a".to_vec()..b"z".to_vec())), + ))); + let bytes = bincode::encode_to_vec(&mixed, bincode_config()).expect("encode succeeds"); + let result: Result<(QueryItem, _), _> = + bincode::decode_from_slice(&bytes, bincode_config()); + let err = result + .expect_err("AggregateCountOnRange wrapping AggregateSumOnRange must be rejected"); + let msg = format!("{:?}", err); + assert!( + msg.contains("AggregateCountOnRange") || msg.contains("aggregate"), + "expected nested-rejection message, got: {msg}" + ); + } + + #[test] + fn decode_accepts_valid_one_level_aggregate_sum_on_range() { + let q = QueryItem::AggregateSumOnRange(Box::new(QueryItem::Range( + b"a".to_vec()..b"z".to_vec(), + ))); + let bytes = bincode::encode_to_vec(&q, bincode_config()).unwrap(); + let (decoded, _): (QueryItem, _) = bincode::decode_from_slice(&bytes, bincode_config()) + .expect("single-level wrap must decode"); + assert_eq!(q, decoded); + } + + #[test] + fn decode_caps_depth_for_malicious_sum_payload() { + // Mirrors the count payload depth test but with the variant-11 + // 
(AggregateSumOnRange) tag. Hits the depth guard inside
+        // decode_with_depth on the sum branch.
+        let depth_to_try = MAX_QUERY_ITEM_DECODE_DEPTH + 2;
+        let mut payload: Vec<u8> = Vec::new();
+        for _ in 0..depth_to_try {
+            payload.push(11u8); // AggregateSumOnRange variant tag
+        }
+        // Innermost: Range. Variant tag 1, then start/end Vec<u8> bytes.
+        payload.push(1u8);
+        let inner = QueryItem::Range(b"a".to_vec()..b"z".to_vec());
+        let inner_bytes = bincode::encode_to_vec(&inner, bincode_config()).unwrap();
+        payload.extend_from_slice(&inner_bytes[1..]);
+
+        let result: Result<(QueryItem, _), _> =
+            bincode::decode_from_slice(&payload, bincode_config());
+        let err = result.expect_err("payload exceeding max depth must be rejected");
+        let msg = format!("{:?}", err);
+        assert!(
+            msg.contains("nesting depth") || msg.contains("AggregateSumOnRange"),
+            "expected depth-rejection message, got: {msg}"
+        );
+    }
+
+    #[test]
+    fn decode_unknown_variant_rejected() {
+        // Variant byte 12 is unknown (max = 11). Verifies the trailing
+        // UnexpectedVariant arm in decode_with_depth.
+        let payload = vec![12u8];
+        let result: Result<(QueryItem, _), _> =
+            bincode::decode_from_slice(&payload, bincode_config());
+        let err = result.expect_err("unknown variant must be rejected");
+        let msg = format!("{:?}", err);
+        assert!(
+            msg.contains("UnexpectedVariant") || msg.contains("QueryItem"),
+            "expected unknown-variant error, got: {msg}"
+        );
+    }
+
+    #[test]
+    fn borrow_decode_round_trips_aggregate_sum_on_range() {
+        // BorrowDecode path: exercises borrow_decode_with_depth's variant-11
+        // branch.
+ let q = QueryItem::AggregateSumOnRange(Box::new(QueryItem::Range( + b"a".to_vec()..b"z".to_vec(), + ))); + let bytes = bincode::encode_to_vec(&q, bincode_config()).unwrap(); + let (decoded, _): (QueryItem, _) = + bincode::borrow_decode_from_slice(&bytes, bincode_config()).expect("borrow decode"); + assert_eq!(q, decoded); + } + + #[test] + fn borrow_decode_rejects_nested_aggregate_sum_on_range() { + let nested = QueryItem::AggregateSumOnRange(Box::new(QueryItem::AggregateSumOnRange( + Box::new(QueryItem::Range(b"a".to_vec()..b"z".to_vec())), + ))); + let bytes = bincode::encode_to_vec(&nested, bincode_config()).expect("encode"); + let result: Result<(QueryItem, _), _> = + bincode::borrow_decode_from_slice(&bytes, bincode_config()); + let err = result.expect_err("must reject nested aggregate sum via borrow decode"); + let msg = format!("{:?}", err); + assert!( + msg.contains("AggregateSumOnRange") || msg.contains("nesting depth"), + "got: {msg}" + ); + } + + #[test] + fn borrow_decode_rejects_count_wrapping_sum() { + let mixed = QueryItem::AggregateCountOnRange(Box::new(QueryItem::AggregateSumOnRange( + Box::new(QueryItem::Range(b"a".to_vec()..b"z".to_vec())), + ))); + let bytes = bincode::encode_to_vec(&mixed, bincode_config()).expect("encode"); + let result: Result<(QueryItem, _), _> = + bincode::borrow_decode_from_slice(&bytes, bincode_config()); + let err = result.expect_err("must reject count wrapping sum via borrow decode"); + let msg = format!("{:?}", err); + assert!( + msg.contains("AggregateCountOnRange") || msg.contains("aggregate"), + "got: {msg}" + ); + } + + #[test] + fn borrow_decode_rejects_sum_wrapping_count() { + let mixed = QueryItem::AggregateSumOnRange(Box::new(QueryItem::AggregateCountOnRange( + Box::new(QueryItem::Range(b"a".to_vec()..b"z".to_vec())), + ))); + let bytes = bincode::encode_to_vec(&mixed, bincode_config()).expect("encode"); + let result: Result<(QueryItem, _), _> = + bincode::borrow_decode_from_slice(&bytes, bincode_config()); + 
let err = result.expect_err("must reject sum wrapping count via borrow decode"); + let msg = format!("{:?}", err); + assert!( + msg.contains("AggregateSumOnRange") || msg.contains("aggregate"), + "got: {msg}" + ); + } + + #[test] + fn borrow_decode_unknown_variant_rejected() { + let payload = vec![12u8]; + let result: Result<(QueryItem, _), _> = + bincode::borrow_decode_from_slice(&payload, bincode_config()); + let err = result.expect_err("unknown variant must be rejected"); + let msg = format!("{:?}", err); + assert!( + msg.contains("UnexpectedVariant") || msg.contains("QueryItem"), + "got: {msg}" + ); + } + + #[test] + fn display_aggregate_sum_on_range_formats() { + // Drives the Display arm at line ~717. + let q = QueryItem::AggregateSumOnRange(Box::new(QueryItem::Range( + b"aa".to_vec()..b"zz".to_vec(), + ))); + let s = format!("{}", q); + assert!(s.starts_with("AggregateSumOnRange("), "got: {s}"); + assert!(s.contains("Range("), "got: {s}"); + } + + #[test] + fn display_aggregate_count_on_range_formats() { + // Drives the Display arm at line ~714 (also currently uncovered). + let q = QueryItem::AggregateCountOnRange(Box::new(QueryItem::Range( + b"aa".to_vec()..b"zz".to_vec(), + ))); + let s = format!("{}", q); + assert!(s.starts_with("AggregateCountOnRange("), "got: {s}"); + } + + #[test] + fn aggregate_sum_helpers_and_bounds() { + // Hits processing_footprint, lower_bound, lower_unbounded, + // upper_bound, upper_unbounded, enum_value, is_range, is_single, + // is_unbounded_range, is_aggregate_*, aggregate_*_inner for the + // sum variant. 
+ let inner = QueryItem::Range(b"a".to_vec()..b"z".to_vec()); + let q = QueryItem::AggregateSumOnRange(Box::new(inner.clone())); + + assert_eq!(q.processing_footprint(), inner.processing_footprint()); + assert_eq!(q.lower_bound(), inner.lower_bound()); + assert_eq!(q.upper_bound(), inner.upper_bound()); + assert_eq!(q.lower_unbounded(), inner.lower_unbounded()); + assert_eq!(q.upper_unbounded(), inner.upper_unbounded()); + assert_eq!(q.enum_value(), 11); + assert!(q.is_range()); + assert!(!q.is_single()); + assert!(!q.is_key()); + assert!(!q.is_aggregate_count_on_range()); + assert!(q.is_aggregate_sum_on_range()); + assert!(q.aggregate_count_inner().is_none()); + assert_eq!(q.aggregate_sum_inner(), Some(&inner)); + // unbounded delegation: inner is bounded -> false + assert!(!q.is_unbounded_range()); + + // Now wrap an unbounded inner and verify delegation flips. + let q_unbound = + QueryItem::AggregateSumOnRange(Box::new(QueryItem::RangeFrom(b"a".to_vec()..))); + assert!(q_unbound.is_unbounded_range()); + } + + #[test] + fn aggregate_count_helpers_and_bounds() { + // Mirror the helpers for the count variant — covers count arms + // for the same accessors (some of which were missed by the + // existing tests). 
+        let inner = QueryItem::Range(b"a".to_vec()..b"z".to_vec());
+        let q = QueryItem::AggregateCountOnRange(Box::new(inner.clone()));
+
+        assert_eq!(q.processing_footprint(), inner.processing_footprint());
+        assert_eq!(q.lower_bound(), inner.lower_bound());
+        assert_eq!(q.upper_bound(), inner.upper_bound());
+        assert_eq!(q.lower_unbounded(), inner.lower_unbounded());
+        assert_eq!(q.upper_unbounded(), inner.upper_unbounded());
+        assert_eq!(q.enum_value(), 10);
+        assert!(q.is_range());
+        assert!(q.is_aggregate_count_on_range());
+        assert!(!q.is_aggregate_sum_on_range());
+        assert_eq!(q.aggregate_count_inner(), Some(&inner));
+        assert!(q.aggregate_sum_inner().is_none());
+        assert!(!q.is_unbounded_range());
+    }
+
+    #[cfg(feature = "serde")]
+    #[test]
+    fn serde_decode_rejects_nested_aggregate_sum_on_range() {
+        // Mirrors the serde nested-rejection test for the sum variant
+        // (covers the serde dispatcher's variant-11 path).
+        use serde_test::{assert_de_tokens_error, Token};
+        assert_de_tokens_error::<QueryItem>(
+            &[
+                Token::NewtypeVariant {
+                    name: "QueryItem",
+                    variant: "aggregate_sum_on_range",
+                },
+                Token::NewtypeVariant {
+                    name: "QueryItem",
+                    variant: "aggregate_sum_on_range",
+                },
+            ],
+            "unknown field `aggregate_sum_on_range`, expected one of \
+             `key`, `range`, `range_inclusive`, `range_full`, `range_from`, \
+             `range_to`, `range_to_inclusive`, `range_after`, `range_after_to`, \
+             `range_after_to_inclusive`",
+        );
+    }
+
+    #[cfg(feature = "serde")]
+    #[test]
+    fn serde_decode_accepts_valid_one_level_aggregate_sum_on_range() {
+        // Covers the serde Field::AggregateSumOnRange dispatch arm and
+        // the NonAggregateInner inner deserialization.
use serde_test::{assert_de_tokens, Token};
+        let expected = QueryItem::AggregateSumOnRange(Box::new(QueryItem::Range(
+            b"a".to_vec()..b"z".to_vec(),
+        )));
+        assert_de_tokens(
+            &expected,
+            &[
+                Token::NewtypeVariant {
+                    name: "QueryItem",
+                    variant: "aggregate_sum_on_range",
+                },
+                Token::NewtypeVariant {
+                    name: "QueryItem",
+                    variant: "range",
+                },
+                Token::Struct {
+                    name: "Range",
+                    len: 2,
+                },
+                Token::Str("start"),
+                Token::Seq { len: Some(1) },
+                Token::U8(b'a'),
+                Token::SeqEnd,
+                Token::Str("end"),
+                Token::Seq { len: Some(1) },
+                Token::U8(b'z'),
+                Token::SeqEnd,
+                Token::StructEnd,
+            ],
+        );
+    }
+
+    #[cfg(feature = "serde")]
+    #[test]
+    fn serde_decode_rejects_aggregate_sum_wrapping_count() {
+        // The NonAggregateInner field set excludes both variants — verify
+        // that explicitly for the sum-wrapping-count combination.
+        use serde_test::{assert_de_tokens_error, Token};
+        assert_de_tokens_error::<QueryItem>(
+            &[
+                Token::NewtypeVariant {
+                    name: "QueryItem",
+                    variant: "aggregate_sum_on_range",
+                },
+                Token::NewtypeVariant {
+                    name: "QueryItem",
+                    variant: "aggregate_count_on_range",
+                },
+            ],
+            "unknown field `aggregate_count_on_range`, expected one of \
+             `key`, `range`, `range_inclusive`, `range_full`, `range_from`, \
+             `range_to`, `range_to_inclusive`, `range_after`, `range_after_to`, \
+             `range_after_to_inclusive`",
+        );
+    }
+
+    /// Regression: the Serialize impl emits variant tags in snake_case
+    /// so the round-trip with the snake_case `Field` enum on the
+    /// Deserialize side works for textual formats (JSON, YAML).
+    ///
+    /// Before this was fixed, the Serialize impl emitted PascalCase
+    /// (`"AggregateSumOnRange"`) while the Deserialize side expected
+    /// snake_case (`"aggregate_sum_on_range"`), so JSON round-trip
+    /// failed silently for every QueryItem variant. Bincode round-trip
+    /// was unaffected because it identifies variants by index, not by
+    /// the textual tag.
+ #[cfg(feature = "serde")] + #[test] + fn serde_round_trip_aggregate_sum_on_range_uses_snake_case_tag() { + use serde_test::{assert_tokens, Token}; + + let qi = QueryItem::AggregateSumOnRange(Box::new(QueryItem::Range( + b"a".to_vec()..b"z".to_vec(), + ))); + assert_tokens( + &qi, + &[ + Token::NewtypeVariant { + name: "QueryItem", + variant: "aggregate_sum_on_range", + }, + Token::NewtypeVariant { + name: "QueryItem", + variant: "range", + }, + Token::Struct { + name: "Range", + len: 2, + }, + Token::Str("start"), + Token::Seq { len: Some(1) }, + Token::U8(b'a'), + Token::SeqEnd, + Token::Str("end"), + Token::Seq { len: Some(1) }, + Token::U8(b'z'), + Token::SeqEnd, + Token::StructEnd, + ], + ); + } + + /// Mirror of the sum test: count side round-trips through + /// snake_case too. Pins the contract so both aggregate variants + /// stay in lockstep on the Serialize side. + #[cfg(feature = "serde")] + #[test] + fn serde_round_trip_aggregate_count_on_range_uses_snake_case_tag() { + use serde_test::{assert_tokens, Token}; + + let qi = QueryItem::AggregateCountOnRange(Box::new(QueryItem::Range( + b"a".to_vec()..b"z".to_vec(), + ))); + assert_tokens( + &qi, + &[ + Token::NewtypeVariant { + name: "QueryItem", + variant: "aggregate_count_on_range", + }, + Token::NewtypeVariant { + name: "QueryItem", + variant: "range", + }, + Token::Struct { + name: "Range", + len: 2, + }, + Token::Str("start"), + Token::Seq { len: Some(1) }, + Token::U8(b'a'), + Token::SeqEnd, + Token::Str("end"), + Token::Seq { len: Some(1) }, + Token::U8(b'z'), + Token::SeqEnd, + Token::StructEnd, + ], + ); + } } diff --git a/grovedb-version/src/version/grovedb_versions.rs b/grovedb-version/src/version/grovedb_versions.rs index fa25a6497..88f2bf963 100644 --- a/grovedb-version/src/version/grovedb_versions.rs +++ b/grovedb-version/src/version/grovedb_versions.rs @@ -117,6 +117,7 @@ pub struct GroveDBOperationsQueryVersions { pub query_item_value_or_sum: FeatureVersion, pub query_aggregate_sums: 
FeatureVersion, pub query_aggregate_count_on_range: FeatureVersion, + pub query_aggregate_sum_on_range: FeatureVersion, pub query_sums: FeatureVersion, pub query_raw: FeatureVersion, pub query_keys_optional: FeatureVersion, diff --git a/grovedb-version/src/version/v1.rs b/grovedb-version/src/version/v1.rs index 3e9a919b8..576db0066 100644 --- a/grovedb-version/src/version/v1.rs +++ b/grovedb-version/src/version/v1.rs @@ -135,6 +135,7 @@ pub const GROVE_V1: GroveVersion = GroveVersion { query_item_value_or_sum: 0, query_aggregate_sums: 0, query_aggregate_count_on_range: 0, + query_aggregate_sum_on_range: 0, query_sums: 0, query_raw: 0, query_keys_optional: 0, diff --git a/grovedb-version/src/version/v2.rs b/grovedb-version/src/version/v2.rs index 47a20b900..42701d3c3 100644 --- a/grovedb-version/src/version/v2.rs +++ b/grovedb-version/src/version/v2.rs @@ -135,6 +135,7 @@ pub const GROVE_V2: GroveVersion = GroveVersion { query_item_value_or_sum: 0, query_aggregate_sums: 0, query_aggregate_count_on_range: 0, + query_aggregate_sum_on_range: 0, query_sums: 0, query_raw: 0, query_keys_optional: 0, diff --git a/grovedb-version/src/version/v3.rs b/grovedb-version/src/version/v3.rs index 137379fbd..cac055d67 100644 --- a/grovedb-version/src/version/v3.rs +++ b/grovedb-version/src/version/v3.rs @@ -135,6 +135,7 @@ pub const GROVE_V3: GroveVersion = GroveVersion { query_item_value_or_sum: 0, query_aggregate_sums: 0, query_aggregate_count_on_range: 0, + query_aggregate_sum_on_range: 0, query_sums: 0, query_raw: 0, query_keys_optional: 0, diff --git a/grovedb/src/batch/mod.rs b/grovedb/src/batch/mod.rs index 4834da99a..3c4870a5c 100644 --- a/grovedb/src/batch/mod.rs +++ b/grovedb/src/batch/mod.rs @@ -1682,6 +1682,7 @@ where | Element::CountSumTree(..) | Element::ProvableCountTree(..) | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) | Element::CommitmentTree(..) | Element::MmrTree(..) | Element::BulkAppendTree(..) 
@@ -1837,6 +1838,7 @@ where | Element::CountSumTree(..) | Element::ProvableCountTree(..) | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) | Element::CommitmentTree(..) | Element::MmrTree(..) | Element::BulkAppendTree(..) @@ -1886,6 +1888,7 @@ where | Element::CountSumTree(..) | Element::ProvableCountTree(..) | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) | Element::CommitmentTree(..) | Element::MmrTree(..) | Element::BulkAppendTree(..) @@ -2229,6 +2232,7 @@ where | Element::CountSumTree(..) | Element::ProvableCountTree(..) | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) | Element::MmrTree(..) | Element::BulkAppendTree(..) | Element::DenseAppendOnlyFixedSizeTree(..) => { @@ -2688,6 +2692,11 @@ where AggregateData::ProvableCountAndSum(count_value, sum_value) => { Element::ProvableCountSumTree(root_key, count_value, sum_value, flags) } + AggregateData::ProvableSum(sum_value) => { + Element::new_provable_sum_tree_with_flags_and_sum_value( + root_key, sum_value, flags, + ) + } }; // Re-wrap if the original element was wrapped, so the // on-disk bytes preserve the wrapper and the parent's @@ -2888,6 +2897,7 @@ where | Element::CountSumTree(..) | Element::ProvableCountTree(..) | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) | Element::CommitmentTree(..) | Element::MmrTree(..) | Element::BulkAppendTree(..) 
@@ -3205,6 +3215,21 @@ impl GroveDb { .., flags, ) = element + { + *mutable_occupied_entry = + GroveOp::InsertTreeWithRootHash { + hash: root_hash, + root_key: calculated_root_key, + flags: flags.clone(), + aggregate_data, + non_counted, + not_summed, + not_counted_or_summed, + } + } else if let Element::ProvableSumTree( + .., + flags, + ) = element { *mutable_occupied_entry = GroveOp::InsertTreeWithRootHash { diff --git a/grovedb/src/debugger.rs b/grovedb/src/debugger.rs index 38bfc0e0d..c0b5f3165 100644 --- a/grovedb/src/debugger.rs +++ b/grovedb/src/debugger.rs @@ -478,6 +478,9 @@ fn merk_proof_node_to_grovedbg(node: Node) -> Result { grovedbg_types::TreeFeatureType::ProvableCountedSummedMerkNode(count, sum) } + TreeFeatureType::ProvableSummedMerkNode(sum) => { + grovedbg_types::TreeFeatureType::ProvableSummedMerkNode(sum) + } }; MerkProofNode::KVValueHashFeatureType( key, @@ -542,6 +545,9 @@ fn merk_proof_node_to_grovedbg(node: Node) -> Result { grovedbg_types::TreeFeatureType::ProvableCountedSummedMerkNode(count, sum) } + TreeFeatureType::ProvableSummedMerkNode(sum) => { + grovedbg_types::TreeFeatureType::ProvableSummedMerkNode(sum) + } }; MerkProofNode::KVValueHashFeatureType( key, @@ -570,6 +576,60 @@ fn merk_proof_node_to_grovedbg(node: Node) -> Result { + let element = crate::Element::deserialize(&value, GroveVersion::latest())?; + let val_hash = value_hash(&value).unwrap(); + MerkProofNode::KVValueHashFeatureType( + key, + element_to_grovedbg(element), + val_hash, + grovedbg_types::TreeFeatureType::ProvableSummedMerkNode(sum), + ) + } + Node::KVHashSum(hash, sum) => MerkProofNode::KVValueHashFeatureType( + vec![], + grovedbg_types::Element::Item { + value: vec![], + element_flags: None, + }, + hash, + grovedbg_types::TreeFeatureType::ProvableSummedMerkNode(sum), + ), + Node::KVRefValueHashSum(key, value, hash, sum) => { + let element = crate::Element::deserialize(&value, GroveVersion::latest())?; + MerkProofNode::KVValueHashFeatureType( + key, + 
element_to_grovedbg(element), + hash, + grovedbg_types::TreeFeatureType::ProvableSummedMerkNode(sum), + ) + } + Node::KVDigestSum(key, hash, sum) => MerkProofNode::KVValueHashFeatureType( + key, + grovedbg_types::Element::Item { + value: vec![], + element_flags: None, + }, + hash, + grovedbg_types::TreeFeatureType::ProvableSummedMerkNode(sum), + ), + Node::HashWithSum(kv_hash, left_child_hash, right_child_hash, sum) => { + use grovedb_merk::tree::node_hash_with_sum; + let computed_node_hash = + node_hash_with_sum(&kv_hash, &left_child_hash, &right_child_hash, sum).unwrap(); + MerkProofNode::KVValueHashFeatureType( + vec![], + grovedbg_types::Element::Item { + value: vec![], + element_flags: None, + }, + computed_node_hash, + grovedbg_types::TreeFeatureType::ProvableSummedMerkNode(sum), + ) + } }) } @@ -796,6 +856,13 @@ fn element_to_grovedbg(element: crate::Element) -> grovedbg_types::Element { element_flags, } } + crate::Element::ProvableSumTree(root_key, sum, element_flags) => { + grovedbg_types::Element::ProvableSumTree { + root_key, + sum, + element_flags, + } + } crate::Element::CommitmentTree(_, _, element_flags) => grovedbg_types::Element::Subtree { root_key: None, element_flags, @@ -868,6 +935,9 @@ fn node_to_update( TreeFeatureType::ProvableCountedSummedMerkNode(count, sum) => { grovedbg_types::TreeFeatureType::ProvableCountedSummedMerkNode(count, sum) } + TreeFeatureType::ProvableSummedMerkNode(sum) => { + grovedbg_types::TreeFeatureType::ProvableSummedMerkNode(sum) + } }, value_hash, kv_digest_hash, diff --git a/grovedb/src/lib.rs b/grovedb/src/lib.rs index cfc8a8936..37aef88ba 100644 --- a/grovedb/src/lib.rs +++ b/grovedb/src/lib.rs @@ -968,6 +968,7 @@ impl GroveDb { | Element::CountSumTree(..) | Element::ProvableCountTree(..) | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) | Element::CommitmentTree(..) | Element::MmrTree(..) | Element::BulkAppendTree(..) 
@@ -1019,6 +1020,67 @@ impl GroveDb { ); } + // Software-consistency check: the aggregate fields + // stored in the parent's tree element (e.g. + // `sum_value` in `ProvableSumTree(_, sum_value, _)`) + // must agree with the inner Merk's actual + // `aggregate_data()`. This is distinct from the + // cryptographic check above: for the provable + // variants, both the recorded aggregate field AND + // the actual inner aggregate are bound into + // element_value_hash, but they are independently + // representable on disk and could disagree if a + // propagation bug (or storage corruption) drifts + // them out of sync. For non-provable variants, the + // aggregate field is stored alongside but not bound + // into the hash; an out-of-sync field is therefore + // a pure software bug, and the cryptographic check + // would not catch it. + // + // Non-Merk data trees (CommitmentTree, MmrTree, + // BulkAppendTree, DenseTree) keep an empty inner + // Merk by design, so their `aggregate_data()` is + // always `NoAggregateData`. Skip them here; the + // recursion below is already skipped for them via + // `uses_non_merk_data_storage`. + // + // For aggregate-mismatch logging we reuse the + // existing `VerificationIssues` shape + // (HashMap) by packing the recorded vs. actual + // aggregate values into a deterministic placeholder + // hash via blake3. This avoids breaking the type + // signature and all callers (including + // `visualize_verify_grovedb`), at the cost of the + // hex output being a placeholder rather than a + // real Merk hash. The recorded-aggregate hash is + // placed in the "expected" slot and the + // actual-aggregate hash in the "actual" slot; the + // "root" slot reuses the inner-Merk root hash for + // path-locality. 
+ if !element.uses_non_merk_data_storage() { + let actual_aggregate = inner_merk.aggregate_data().map_err(MerkError)?; + if let Some((recorded_label, actual_label)) = + aggregate_consistency_labels(&element, &actual_aggregate) + { + let expected_placeholder: CryptoHash = + blake3::hash(recorded_label.as_bytes()).into(); + let actual_placeholder: CryptoHash = + blake3::hash(actual_label.as_bytes()).into(); + // Use `.entry().or_insert(...)` so we don't + // clobber an earlier cryptographic + // (`combined_value_hash != element_value_hash`) + // entry inserted above for this same path — + // the real Merk-hash chain mismatch is more + // diagnostic than the aggregate placeholder. + issues.entry(new_path.to_vec()).or_insert(( + root_hash, + expected_placeholder, + actual_placeholder, + )); + } + } + // Non-Merk data trees (CommitmentTree, MmrTree, // BulkAppendTree, DenseTree) store data in the data // namespace as non-Element entries. Recursing into @@ -1206,6 +1268,419 @@ impl GroveDb { } } +/// Inspect a tree-bearing Element together with the actual aggregate data of +/// its inner Merk. Returns `Some((recorded_label, actual_label))` if the +/// aggregate field(s) stored in the element disagree with `actual`, or +/// `None` if they match (or if `element` is not a tree variant that carries +/// an aggregate field reflecting the inner Merk's `aggregate_data()`). +/// +/// The string labels are intended to be hashed into deterministic placeholder +/// `CryptoHash` values for inclusion in `VerificationIssues`. +/// +/// Coverage: +/// - `SumTree(_, n, _)` vs. `AggregateData::Sum(m)`. +/// - `ProvableSumTree(_, n, _)` vs. `AggregateData::ProvableSum(m)`. +/// - `BigSumTree(_, n, _)` vs. `AggregateData::BigSum(m)`. +/// - `CountTree(_, n, _)` vs. `AggregateData::Count(m)`. +/// - `CountSumTree(_, c, s, _)` vs. `AggregateData::CountAndSum(cm, sm)`. +/// - `ProvableCountTree(_, n, _)` vs. `AggregateData::ProvableCount(m)`. +/// - `ProvableCountSumTree(_, c, s, _)` vs. 
+/// `AggregateData::ProvableCountAndSum(cm, sm)`. +/// +/// A plain `Element::Tree(..)` has no aggregate field; the inner Merk's +/// `aggregate_data` is `NoAggregateData` by construction, and any other +/// value would be a separate corruption (caught by the type/feature checks +/// elsewhere). We return `None` for it here. +/// +/// A variant/aggregate-shape mismatch (e.g. `ProvableSumTree` whose inner +/// Merk reports `AggregateData::Count(_)` instead of `ProvableSum(_)`) is +/// also reported, because the inner Merk's tree-type has drifted from what +/// the parent element claims. +#[cfg(feature = "minimal")] +fn aggregate_consistency_labels( + element: &Element, + actual: &AggregateData, +) -> Option<(String, String)> { + match (element, actual) { + // --- Plain Tree: no aggregate, never reports a mismatch. + (Element::Tree(..), AggregateData::NoAggregateData) => None, + + // --- SumTree variants --- + (Element::SumTree(_, recorded, _), AggregateData::Sum(actual_sum)) => { + if recorded == actual_sum { + None + } else { + Some(( + format!("SumTree recorded sum {}", recorded), + format!("inner aggregate Sum {}", actual_sum), + )) + } + } + (Element::ProvableSumTree(_, recorded, _), AggregateData::ProvableSum(actual_sum)) => { + if recorded == actual_sum { + None + } else { + Some(( + format!("ProvableSumTree recorded sum {}", recorded), + format!("inner aggregate ProvableSum {}", actual_sum), + )) + } + } + (Element::BigSumTree(_, recorded, _), AggregateData::BigSum(actual_sum)) => { + if recorded == actual_sum { + None + } else { + Some(( + format!("BigSumTree recorded sum {}", recorded), + format!("inner aggregate BigSum {}", actual_sum), + )) + } + } + (Element::CountTree(_, recorded, _), AggregateData::Count(actual_count)) => { + if recorded == actual_count { + None + } else { + Some(( + format!("CountTree recorded count {}", recorded), + format!("inner aggregate Count {}", actual_count), + )) + } + } + ( + Element::CountSumTree(_, recorded_count, 
recorded_sum, _), + AggregateData::CountAndSum(actual_count, actual_sum), + ) => { + if recorded_count == actual_count && recorded_sum == actual_sum { + None + } else { + Some(( + format!( + "CountSumTree recorded count {} sum {}", + recorded_count, recorded_sum + ), + format!( + "inner aggregate CountAndSum count {} sum {}", + actual_count, actual_sum + ), + )) + } + } + ( + Element::ProvableCountTree(_, recorded, _), + AggregateData::ProvableCount(actual_count), + ) => { + if recorded == actual_count { + None + } else { + Some(( + format!("ProvableCountTree recorded count {}", recorded), + format!("inner aggregate ProvableCount {}", actual_count), + )) + } + } + ( + Element::ProvableCountSumTree(_, recorded_count, recorded_sum, _), + AggregateData::ProvableCountAndSum(actual_count, actual_sum), + ) => { + if recorded_count == actual_count && recorded_sum == actual_sum { + None + } else { + Some(( + format!( + "ProvableCountSumTree recorded count {} sum {}", + recorded_count, recorded_sum + ), + format!( + "inner aggregate ProvableCountAndSum count {} sum {}", + actual_count, actual_sum + ), + )) + } + } + + // --- Empty-merk edge case: an empty Merk returns NoAggregateData + // for any tree type. This is the correct initial state for a + // freshly-inserted tree element. Treat as not-mismatching as long + // as the recorded aggregate is the identity for that variant + // (zero / zero counts). Anything else is a real mismatch. 
--- + (Element::SumTree(_, recorded, _), AggregateData::NoAggregateData) if *recorded == 0 => { + None + } + (Element::ProvableSumTree(_, recorded, _), AggregateData::NoAggregateData) + if *recorded == 0 => + { + None + } + (Element::BigSumTree(_, recorded, _), AggregateData::NoAggregateData) if *recorded == 0 => { + None + } + (Element::CountTree(_, recorded, _), AggregateData::NoAggregateData) if *recorded == 0 => { + None + } + ( + Element::CountSumTree(_, recorded_count, recorded_sum, _), + AggregateData::NoAggregateData, + ) if *recorded_count == 0 && *recorded_sum == 0 => None, + (Element::ProvableCountTree(_, recorded, _), AggregateData::NoAggregateData) + if *recorded == 0 => + { + None + } + ( + Element::ProvableCountSumTree(_, recorded_count, recorded_sum, _), + AggregateData::NoAggregateData, + ) if *recorded_count == 0 && *recorded_sum == 0 => None, + + // --- Non-Merk data trees: caller skips us via + // `uses_non_merk_data_storage`; if we end up here anyway, do not + // report. --- + (Element::CommitmentTree(..), _) + | (Element::MmrTree(..), _) + | (Element::BulkAppendTree(..), _) + | (Element::DenseAppendOnlyFixedSizeTree(..), _) => None, + + // --- Anything else is a variant/aggregate-shape mismatch (e.g. + // the inner Merk's tree-type has drifted from what the parent + // claims). Report with descriptive labels. --- + (element, actual) => Some(( + format!("element variant {}", element.type_str()), + format!("inner aggregate variant {:?}", actual), + )), + } +} + +#[cfg(all(test, feature = "minimal"))] +mod aggregate_consistency_labels_tests { + //! Unit tests for the `aggregate_consistency_labels` helper. Each + //! aggregate-bearing tree variant has a match arm; covering each arm + //! requires one matching pair (matches, returns None) plus one + //! mismatching pair (disagrees, returns Some). Plus the + //! NoAggregateData identity arms, the non-Merk-data-tree arms, and + //! the catch-all variant/shape mismatch. 
+ + use grovedb_merk::tree::AggregateData; + + use super::{aggregate_consistency_labels, Element}; + + // --- SumTree ----------------------------------------------------------- + #[test] + fn sum_tree_match_returns_none() { + let e = Element::SumTree(None, 42, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::Sum(42)).is_none()); + } + + #[test] + fn sum_tree_mismatch_returns_labels() { + let e = Element::SumTree(None, 42, None); + let labels = aggregate_consistency_labels(&e, &AggregateData::Sum(1)).expect("labels"); + assert!(labels.0.contains("SumTree recorded sum 42")); + assert!(labels.1.contains("Sum 1")); + } + + // --- ProvableSumTree --------------------------------------------------- + #[test] + fn provable_sum_tree_match_returns_none() { + let e = Element::ProvableSumTree(None, 7, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::ProvableSum(7)).is_none()); + } + + #[test] + fn provable_sum_tree_mismatch_returns_labels() { + let e = Element::ProvableSumTree(None, 7, None); + let labels = + aggregate_consistency_labels(&e, &AggregateData::ProvableSum(0)).expect("labels"); + assert!(labels.0.contains("ProvableSumTree recorded sum 7")); + assert!(labels.1.contains("ProvableSum 0")); + } + + // --- BigSumTree -------------------------------------------------------- + #[test] + fn big_sum_tree_match_returns_none() { + let e = Element::BigSumTree(None, 100, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::BigSum(100)).is_none()); + } + + #[test] + fn big_sum_tree_mismatch_returns_labels() { + let e = Element::BigSumTree(None, 100, None); + let labels = aggregate_consistency_labels(&e, &AggregateData::BigSum(0)).expect("labels"); + assert!(labels.0.contains("BigSumTree recorded sum 100")); + assert!(labels.1.contains("BigSum 0")); + } + + // --- CountTree --------------------------------------------------------- + #[test] + fn count_tree_match_returns_none() { + let e = Element::CountTree(None, 9, None); + 
assert!(aggregate_consistency_labels(&e, &AggregateData::Count(9)).is_none()); + } + + #[test] + fn count_tree_mismatch_returns_labels() { + let e = Element::CountTree(None, 9, None); + let labels = aggregate_consistency_labels(&e, &AggregateData::Count(0)).expect("labels"); + assert!(labels.0.contains("CountTree recorded count 9")); + assert!(labels.1.contains("Count 0")); + } + + // --- CountSumTree ------------------------------------------------------- + #[test] + fn count_sum_tree_match_returns_none() { + let e = Element::CountSumTree(None, 3, 14, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::CountAndSum(3, 14)).is_none()); + } + + #[test] + fn count_sum_tree_mismatch_returns_labels() { + let e = Element::CountSumTree(None, 3, 14, None); + let labels = + aggregate_consistency_labels(&e, &AggregateData::CountAndSum(3, 0)).expect("labels"); + assert!(labels.0.contains("recorded count 3 sum 14")); + assert!(labels.1.contains("count 3 sum 0")); + } + + // --- ProvableCountTree ------------------------------------------------- + #[test] + fn provable_count_tree_match_returns_none() { + let e = Element::ProvableCountTree(None, 5, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::ProvableCount(5)).is_none()); + } + + #[test] + fn provable_count_tree_mismatch_returns_labels() { + let e = Element::ProvableCountTree(None, 5, None); + let labels = + aggregate_consistency_labels(&e, &AggregateData::ProvableCount(0)).expect("labels"); + assert!(labels.0.contains("ProvableCountTree recorded count 5")); + assert!(labels.1.contains("ProvableCount 0")); + } + + // --- ProvableCountSumTree ---------------------------------------------- + #[test] + fn provable_count_sum_tree_match_returns_none() { + let e = Element::ProvableCountSumTree(None, 4, 8, None); + assert!( + aggregate_consistency_labels(&e, &AggregateData::ProvableCountAndSum(4, 8)).is_none() + ); + } + + #[test] + fn provable_count_sum_tree_mismatch_returns_labels() { + let e = 
Element::ProvableCountSumTree(None, 4, 8, None); + let labels = aggregate_consistency_labels(&e, &AggregateData::ProvableCountAndSum(4, 0)) + .expect("labels"); + assert!(labels.0.contains("recorded count 4 sum 8")); + assert!(labels.1.contains("count 4 sum 0")); + } + + // --- Plain Tree / NoAggregateData -------------------------------------- + #[test] + fn plain_tree_no_aggregate_returns_none() { + let e = Element::Tree(None, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).is_none()); + } + + // --- NoAggregateData with empty-merk identity arms --------------------- + #[test] + fn sum_tree_zero_recorded_with_no_aggregate_is_ok() { + let e = Element::SumTree(None, 0, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).is_none()); + } + + #[test] + fn sum_tree_nonzero_recorded_with_no_aggregate_is_mismatch() { + // Should fall through to the catch-all variant/shape mismatch arm. + let e = Element::SumTree(None, 7, None); + let labels = + aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).expect("labels"); + assert!(labels.0.contains("element variant")); + assert!(labels.1.contains("NoAggregateData")); + } + + #[test] + fn provable_sum_tree_zero_recorded_with_no_aggregate_is_ok() { + let e = Element::ProvableSumTree(None, 0, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).is_none()); + } + + #[test] + fn big_sum_tree_zero_recorded_with_no_aggregate_is_ok() { + let e = Element::BigSumTree(None, 0, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).is_none()); + } + + #[test] + fn count_tree_zero_recorded_with_no_aggregate_is_ok() { + let e = Element::CountTree(None, 0, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).is_none()); + } + + #[test] + fn count_sum_tree_zero_zero_with_no_aggregate_is_ok() { + let e = Element::CountSumTree(None, 0, 0, None); + 
assert!(aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).is_none()); + } + + #[test] + fn provable_count_tree_zero_recorded_with_no_aggregate_is_ok() { + let e = Element::ProvableCountTree(None, 0, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).is_none()); + } + + #[test] + fn provable_count_sum_tree_zero_zero_with_no_aggregate_is_ok() { + let e = Element::ProvableCountSumTree(None, 0, 0, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).is_none()); + } + + // --- Non-Merk data trees: always None --------------------------------- + #[test] + fn commitment_tree_always_returns_none() { + let e = Element::CommitmentTree(0, 0, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).is_none()); + // Even a non-NoAggregateData paired with these returns None per the + // explicit catch arm. + assert!(aggregate_consistency_labels(&e, &AggregateData::Sum(5)).is_none()); + } + + #[test] + fn mmr_tree_always_returns_none() { + let e = Element::MmrTree(0, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).is_none()); + } + + #[test] + fn bulk_append_tree_always_returns_none() { + let e = Element::BulkAppendTree(0, 0, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).is_none()); + } + + #[test] + fn dense_append_only_tree_always_returns_none() { + let e = Element::DenseAppendOnlyFixedSizeTree(0, 0, None); + assert!(aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).is_none()); + } + + // --- Catch-all variant/shape mismatch --------------------------------- + #[test] + fn provable_sum_tree_paired_with_wrong_aggregate_kind_is_mismatch() { + // ProvableSumTree vs Count → catch-all variant-mismatch arm. 
+ let e = Element::ProvableSumTree(None, 7, None); + let labels = aggregate_consistency_labels(&e, &AggregateData::Count(0)).expect("labels"); + assert!(labels.0.contains("element variant")); + assert!(labels.1.contains("inner aggregate variant")); + } + + #[test] + fn item_element_paired_with_no_aggregate_is_mismatch() { + // Item isn't a tree at all → catch-all (no specific arm matches). + let e = Element::Item(b"x".to_vec(), None); + let labels = + aggregate_consistency_labels(&e, &AggregateData::NoAggregateData).expect("labels"); + assert!(labels.0.contains("element variant")); + } +} + /// Test-only helpers for verifying internal storage state. #[cfg(all(test, feature = "minimal"))] impl GroveDb { diff --git a/grovedb/src/operations/get/mod.rs b/grovedb/src/operations/get/mod.rs index 764050d1e..f60893e9d 100644 --- a/grovedb/src/operations/get/mod.rs +++ b/grovedb/src/operations/get/mod.rs @@ -409,6 +409,7 @@ impl GroveDb { | Ok(Element::CountSumTree(..)) | Ok(Element::ProvableCountTree(..)) | Ok(Element::ProvableCountSumTree(..)) + | Ok(Element::ProvableSumTree(..)) | Ok(Element::CommitmentTree(..)) | Ok(Element::MmrTree(..)) | Ok(Element::BulkAppendTree(..)) diff --git a/grovedb/src/operations/get/query.rs b/grovedb/src/operations/get/query.rs index dd496e9b4..75899f5c6 100644 --- a/grovedb/src/operations/get/query.rs +++ b/grovedb/src/operations/get/query.rs @@ -272,7 +272,8 @@ where { | Element::CountTree(..) | Element::CountSumTree(..) | Element::ProvableCountTree(..) - | Element::ProvableCountSumTree(..) => Ok(element), + | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) => Ok(element), Element::Tree(..) | Element::CommitmentTree(..) | Element::MmrTree(..) @@ -415,6 +416,7 @@ where { | Element::CountSumTree(..) | Element::ProvableCountTree(..) | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) | Element::CommitmentTree(..) | Element::MmrTree(..) | Element::BulkAppendTree(..) 
@@ -540,6 +542,9 @@ where { count_value, sum_value, )), + Element::ProvableSumTree(_, sum_value, _) => { + Ok(QueryItemOrSumReturnType::SumValue(sum_value)) + } _ => Err(Error::InvalidQuery( "the reference must result in an item", )), @@ -575,6 +580,9 @@ where { Element::ProvableCountSumTree(_, count_value, sum_value, _) => Ok( QueryItemOrSumReturnType::CountSumValue(count_value, sum_value), ), + Element::ProvableSumTree(_, sum_value, _) => { + Ok(QueryItemOrSumReturnType::SumValue(sum_value)) + } Element::Tree(..) | Element::CommitmentTree(..) | Element::MmrTree(..) @@ -600,6 +608,90 @@ where { Ok((results, skipped)).wrap_with_cost(cost) } + /// Execute an `AggregateSumOnRange` path query without producing a + /// proof, returning the in-range signed sum directly. + /// + /// This is the no-proof counterpart of + /// [`Self::prove_query`] + + /// [`Self::verify_aggregate_sum_query`](GroveDb::verify_aggregate_sum_query) + /// for `AggregateSumOnRange` queries: it performs the same merk-level + /// boundary walk the prover does (using each internal node's stored + /// aggregate sum to short-circuit Contained / Disjoint subtrees) but + /// skips proof generation, serialization, and verification entirely. + /// + /// `path_query` must satisfy + /// [`PathQuery::validate_aggregate_sum_on_range`] — a single + /// `AggregateSumOnRange(_)` item, no subqueries, no pagination, a + /// non-empty path, and an inner range that isn't `Key`, `RangeFull`, + /// or another aggregate variant. Any other shape is rejected up front + /// with `Error::InvalidQuery` before any merk reads happen. + /// + /// The subtree at `path_query.path` must be a `ProvableSumTree` — the + /// merk-level walk rejects any other tree type. If the subtree is + /// missing (path does not resolve), this returns the same + /// `PathNotFound` / `PathParentLayerNotFound` errors as other + /// path-based reads. + /// + /// Mirrors PR #662's `query_aggregate_count` for the signed-sum side. 
+ /// + /// The returned sum is **not** independently verifiable — callers are + /// trusting their own merk read path. For a verifiable sum, use + /// [`Self::prove_query`] + + /// [`Self::verify_aggregate_sum_query`](GroveDb::verify_aggregate_sum_query). + pub fn query_aggregate_sum( + &self, + path_query: &PathQuery, + transaction: TransactionArg, + grove_version: &GroveVersion, + ) -> CostResult { + check_grovedb_v0_with_cost!( + "query_aggregate_sum", + grove_version + .grovedb_versions + .operations + .query + .query_aggregate_sum_on_range + ); + + let mut cost = OperationCost::default(); + + // Up-front shape validation: same gate the prover and verifier use. + // Catches malformed ASOR queries (illegal inner range, ASOR-hidden-in- + // subquery, pagination, empty path, etc.) before any storage reads. + let inner_range = cost_return_on_error_no_add!( + cost, + path_query.validate_aggregate_sum_on_range().cloned() + ); + + let tx = TxRef::new(&self.db, transaction); + + // Open the leaf merk and ask it for the sum. The merk-level entry + // point enforces `tree_type == ProvableSumTree` and handles the + // empty-merk case (returns 0). + let path_slices: Vec<&[u8]> = path_query.path.iter().map(|p| p.as_slice()).collect(); + let subtree = cost_return_on_error!( + &mut cost, + self.open_transactional_merk_at_path( + SubtreePath::from(path_slices.as_slice()), + tx.as_ref(), + None, + grove_version, + ) + ); + + let sum = cost_return_on_error!( + &mut cost, + subtree + .sum_aggregate_on_range(&inner_range, grove_version) + .map_err(|e| Error::CorruptedData(format!( + "query_aggregate_sum at path {:?}: {}", + path_slices, e + ))) + ); + + Ok(sum).wrap_with_cost(cost) + } + /// Retrieves SumItem values using an [`AggregateSumPathQuery`] with /// budget-limited scanning (max elements scanned is capped by /// [`GroveDBQueryLimits::max_aggregate_sum_query_elements_scanned`]). 
@@ -757,7 +849,10 @@ where { &mut cost, subtree .count_aggregate_on_range(&inner_range, grove_version) - .map_err(Error::MerkError) + .map_err(|e| Error::CorruptedData(format!( + "query_aggregate_count at path {:?}: {}", + path_slices, e + ))) ); Ok(count).wrap_with_cost(cost) @@ -1024,6 +1119,7 @@ where { | Element::CountSumTree(..) | Element::ProvableCountTree(..) | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) | Element::CommitmentTree(..) | Element::MmrTree(..) | Element::BulkAppendTree(..) diff --git a/grovedb/src/operations/insert/mod.rs b/grovedb/src/operations/insert/mod.rs index 5818ab040..8815a2a61 100644 --- a/grovedb/src/operations/insert/mod.rs +++ b/grovedb/src/operations/insert/mod.rs @@ -291,7 +291,8 @@ impl GroveDb { | Element::CountTree(value, ..) | Element::CountSumTree(value, ..) | Element::ProvableCountTree(value, ..) - | Element::ProvableCountSumTree(value, ..) => { + | Element::ProvableCountSumTree(value, ..) + | Element::ProvableSumTree(value, ..) => { if value.is_some() { return Err(Error::InvalidCodeExecution( "a tree should be empty at the moment of insertion when not using batches", diff --git a/grovedb/src/operations/proof/aggregate_count/helpers.rs b/grovedb/src/operations/proof/aggregate_count/helpers.rs index 53040015d..728231170 100644 --- a/grovedb/src/operations/proof/aggregate_count/helpers.rs +++ b/grovedb/src/operations/proof/aggregate_count/helpers.rs @@ -1,7 +1,11 @@ //! Shared helpers used by both the leaf-chain walker and the per-key //! carrier walker. //! -//! - [`decode_grovedb_proof`] — parse the bincode envelope. +//! Envelope decoding lives one level up in +//! [`crate::operations::proof::decode_grovedb_proof_canonical`] so the +//! canonical-decode contract has exactly one definition shared with +//! the aggregate-sum side. +//! //! - [`verify_count_leaf`] — delegate to the merk-level count verifier. //! - [`expect_merk_bytes`] — unwrap a `ProofBytes::Merk(_)` or reject. //! 
- [`verify_single_key_layer_proof_v0`] — verify a non-leaf merk @@ -25,35 +29,7 @@ use grovedb_merk::{ use grovedb_query::QueryItem; use grovedb_version::version::GroveVersion; -use crate::{ - operations::proof::{GroveDBProof, ProofBytes}, - Element, Error, PathQuery, -}; - -/// Decode a serialized `GroveDBProof` envelope using the same bincode -/// configuration the prover writes out. -/// -/// Decoding is canonical: trailing bytes beyond the encoded envelope -/// are rejected. Without this check the same `(RootHash, count)` could -/// be reconstructed from many different proof byte-strings (a proof and -/// the same proof with arbitrary suffix bytes), which is harmless for -/// the chain-bound correctness guarantee but breaks any -/// equality-by-bytes assumption a caller might rely on (caching, -/// deduplication, hashing the proof itself). -pub(super) fn decode_grovedb_proof(proof: &[u8]) -> Result { - let config = bincode::config::standard() - .with_big_endian() - .with_limit::<{ 256 * 1024 * 1024 }>(); - let (decoded, consumed) = bincode::decode_from_slice(proof, config) - .map_err(|e| Error::CorruptedData(format!("unable to decode proof: {}", e)))?; - if consumed != proof.len() { - return Err(Error::CorruptedData(format!( - "aggregate-count proof has {} trailing bytes after the encoded envelope", - proof.len() - consumed - ))); - } - Ok(decoded) -} +use crate::{operations::proof::ProofBytes, Element, Error, PathQuery}; /// Verify the leaf layer: bytes are the encoded count-proof Op stream; /// the inner range is the same one the prover counted over. @@ -240,16 +216,22 @@ pub(super) fn execute_carrier_layer_proof( /// /// Intermediate path elements may be any tree type — the GroveDB grove can /// route through Normal/Sum/Count/etc. trees on the way down to the -/// provable-count leaf. 
The leaf-level tree-type check is enforced by the -/// merk prover (`Merk::prove_aggregate_count_on_range`); here we only -/// require that each non-leaf element on the path *is* some non-empty tree, -/// since only trees have a lower layer to chain into. +/// provable-count leaf. At the terminal layer (passed `is_terminal = +/// true` when the next descent goes into the actual leaf merk), the +/// element MUST deserialize to `ProvableCountTree` or +/// `ProvableCountSumTree`. Without this, an empty Merk-backed tree of any +/// other type at the leaf accepts a forged empty leaf proof — every +/// empty Merk-backed tree has `inner_root = NULL_HASH` and so its stored +/// `value_hash = combine_hash(H(bytes), NULL_HASH)` matches uniformly, +/// and the verifier would silently return `count = 0` for a +/// non-ProvableCount* leaf (type-confusion soundness gap). pub(super) fn enforce_lower_chain( path_query: &PathQuery, target_key: &[u8], proven_value_bytes: &[u8], lower_hash: &CryptoHash, parent_proof_hash: &CryptoHash, + is_terminal: bool, grove_version: &GroveVersion, ) -> Result<(), Error> { let element = Element::deserialize(proven_value_bytes, grove_version) @@ -264,14 +246,30 @@ pub(super) fn enforce_lower_chain( ) })? .into_underlying(); - if !element.is_any_tree() { + if is_terminal { + if !matches!( + element, + Element::ProvableCountTree(..) | Element::ProvableCountSumTree(..) 
+ ) { + return Err(Error::InvalidProof( + path_query.clone(), + format!( + "aggregate-count proof's terminal path element at key {} must be a \ + ProvableCountTree or ProvableCountSumTree (got {}); a count aggregate \ + is only meaningful against a tree that binds its count into the node hash", + hex::encode(target_key), + element.type_str() + ), + )); + } + } else if !element.is_any_tree() { return Err(Error::InvalidProof( path_query.clone(), format!( - "aggregate-count proof's path element at key {} is not a tree element \ - (got {:?}); count queries can only descend through tree elements", + "aggregate-count proof's intermediate path element at key {} is not a tree \ + element (got {}); count queries can only descend through tree elements", hex::encode(target_key), - std::mem::discriminant(&element) + element.type_str() ), )); } diff --git a/grovedb/src/operations/proof/aggregate_count/leaf_chain.rs b/grovedb/src/operations/proof/aggregate_count/leaf_chain.rs index 7057142ca..7bb988067 100644 --- a/grovedb/src/operations/proof/aggregate_count/leaf_chain.rs +++ b/grovedb/src/operations/proof/aggregate_count/leaf_chain.rs @@ -64,12 +64,14 @@ pub(super) fn verify_v1_leaf_chain( grove_version, )?; + let is_terminal = depth + 1 == path_keys.len(); enforce_lower_chain( path_query, &next_key, &proven_value_bytes, &lower_hash, &parent_proof_hash, + is_terminal, grove_version, )?; diff --git a/grovedb/src/operations/proof/aggregate_count/mod.rs b/grovedb/src/operations/proof/aggregate_count/mod.rs index c163e4e7b..39f498b59 100644 --- a/grovedb/src/operations/proof/aggregate_count/mod.rs +++ b/grovedb/src/operations/proof/aggregate_count/mod.rs @@ -118,7 +118,7 @@ impl GroveDb { // alongside the inner-Query shape rules. 
let inner_range = path_query.validate_leaf_aggregate_count_on_range()?.clone(); - let grovedb_proof = helpers::decode_grovedb_proof(proof)?; + let grovedb_proof = super::decode_grovedb_proof_canonical(proof)?; let path_keys: Vec<&[u8]> = path_query.path.iter().map(|p| p.as_slice()).collect(); let root_layer = require_v1_envelope(&grovedb_proof, path_query)?; @@ -191,7 +191,7 @@ impl GroveDb { // descent below is skipped (carrier_outer_items is None). let classification = classification::classify_aggregate_count_path_query(path_query)?; - let grovedb_proof = helpers::decode_grovedb_proof(proof)?; + let grovedb_proof = super::decode_grovedb_proof_canonical(proof)?; let path_keys: Vec<&[u8]> = path_query.path.iter().map(|p| p.as_slice()).collect(); let root_layer = require_v1_envelope(&grovedb_proof, path_query)?; diff --git a/grovedb/src/operations/proof/aggregate_count/per_key.rs b/grovedb/src/operations/proof/aggregate_count/per_key.rs index 37df0f779..b825ce16b 100644 --- a/grovedb/src/operations/proof/aggregate_count/per_key.rs +++ b/grovedb/src/operations/proof/aggregate_count/per_key.rs @@ -82,12 +82,21 @@ fn verify_v1_per_key( classification, grove_version, )?; + // Terminal here only in the LEAF shape — where the path's final + // element is itself the ProvableCount(Sum)Tree. In carrier shape + // the final path element is the carrier (which could be any + // tree containing ProvableCountTrees as children); the terminal + // type check is enforced deeper, on each outer-key match in + // `verify_v1_carrier_layer` / `verify_v1_subquery_path`. 
+ let is_terminal = + depth + 1 == path_keys.len() && classification.carrier_outer_items.is_none(); enforce_lower_chain( path_query, &next_key, &proven_value_bytes, &lower_hash, &parent_proof_hash, + is_terminal, grove_version, )?; return Ok((parent_root_hash, results)); @@ -179,12 +188,18 @@ fn verify_v1_carrier_layer( grove_version, )?; + // Terminal here when the subquery_path is empty — the outer + // match goes directly to the leaf merk. When subquery_path has + // depth > 0, the leaf-terminal check fires inside + // `verify_v1_subquery_path` instead. + let is_terminal = subquery_path.is_empty(); enforce_lower_chain( path_query, &outer_key, &value_bytes, &lower_root, &commitment_hash, + is_terminal, grove_version, )?; results.push((outer_key, count)); @@ -228,12 +243,14 @@ fn verify_v1_subquery_path( inner_range, grove_version, )?; + let is_terminal = depth + 1 == subquery_path.len(); enforce_lower_chain( path_query, &next_key, &proven_value_bytes, &lower_hash, &parent_proof_hash, + is_terminal, grove_version, )?; Ok((parent_root_hash, count)) diff --git a/grovedb/src/operations/proof/aggregate_sum/helpers.rs b/grovedb/src/operations/proof/aggregate_sum/helpers.rs new file mode 100644 index 000000000..6d0fc6555 --- /dev/null +++ b/grovedb/src/operations/proof/aggregate_sum/helpers.rs @@ -0,0 +1,204 @@ +//! Shared helpers used by the aggregate-sum leaf-chain walker. +//! +//! Envelope decoding lives one level up in +//! [`crate::operations::proof::decode_grovedb_proof_canonical`] so the +//! canonical-decode contract has exactly one definition shared with +//! the aggregate-count side. +//! +//! - [`verify_sum_leaf`] — delegate to the merk-level sum verifier. +//! - [`expect_merk_bytes`] — unwrap a `ProofBytes::Merk(_)` or reject. +//! - [`verify_single_key_layer_proof_v0`] — verify a non-leaf merk +//! proof for one expected key and recover its value bytes + chain +//! commitment hash. +//! 
- [`enforce_lower_chain`] — `combine_hash(H(value), lower_root) == +//! parent_value_hash`, the binding that ties each layer's sum to the +//! GroveDB root hash, plus the terminal-type gate that requires the +//! leaf-target element to be a `ProvableSumTree`. + +use grovedb_merk::{ + proofs::{ + query::{aggregate_sum::verify_aggregate_sum_on_range_proof, QueryProofVerify}, + Query as MerkQuery, + }, + tree::{combine_hash, value_hash}, + CryptoHash, +}; +use grovedb_query::QueryItem; +use grovedb_version::version::GroveVersion; + +use crate::{operations::proof::ProofBytes, Element, Error, PathQuery}; + +/// Verify the leaf layer: bytes are the encoded sum-proof Op stream; +/// the inner range is the same one the prover summed over. +pub(super) fn verify_sum_leaf( + leaf_bytes: &[u8], + inner_range: &QueryItem, + path_query: &PathQuery, +) -> Result<(CryptoHash, i64), Error> { + let (root_hash, sum) = verify_aggregate_sum_on_range_proof(leaf_bytes, inner_range) + .unwrap() + .map_err(|e| { + Error::InvalidProof( + path_query.clone(), + format!("aggregate-sum leaf proof failed to verify: {}", e), + ) + })?; + Ok((root_hash, sum)) +} + +/// Unwrap a `ProofBytes::Merk(_)` or reject the proof — aggregate-sum +/// envelopes are always merk-flavored at every layer. +pub(super) fn expect_merk_bytes<'a>( + proof_bytes: &'a ProofBytes, + path_query: &PathQuery, +) -> Result<&'a [u8], Error> { + match proof_bytes { + ProofBytes::Merk(b) => Ok(b.as_slice()), + other => Err(Error::InvalidProof( + path_query.clone(), + format!( + "aggregate-sum proof has unexpected non-merk layer bytes: {:?}", + std::mem::discriminant(other) + ), + )), + } +} + +/// Verify a non-leaf layer that should contain a single-key proof for +/// `target_key`. Returns `(proven_value_bytes, this_layer_root_hash, +/// proof_hash_recorded_for_target)`. 
+/// +/// The "proof_hash" is the value_hash committed by the merk proof for the +/// target key — this is the hash the verifier will compare against +/// `combine_hash(H(child_tree_value), lower_layer_root_hash)` to enforce +/// the chain. +pub(super) fn verify_single_key_layer_proof_v0( + merk_bytes: &[u8], + target_key: &[u8], + path_query: &PathQuery, +) -> Result<(Vec, CryptoHash, CryptoHash), Error> { + let level_query = MerkQuery { + items: vec![grovedb_merk::proofs::query::QueryItem::Key( + target_key.to_vec(), + )], + left_to_right: true, + ..Default::default() + }; + + let (root_hash, merk_result) = level_query + .execute_proof(merk_bytes, None, true, 0) + .unwrap() + .map_err(|e| { + Error::InvalidProof( + path_query.clone(), + format!( + "non-leaf single-key proof for {} failed to verify: {}", + hex::encode(target_key), + e + ), + ) + })?; + + let proved = merk_result + .result_set + .iter() + .find(|p| p.key == target_key) + .ok_or_else(|| { + Error::InvalidProof( + path_query.clone(), + format!( + "non-leaf proof did not contain the expected key {}", + hex::encode(target_key) + ), + ) + })?; + + let value_bytes = proved.value.clone().ok_or_else(|| { + Error::InvalidProof( + path_query.clone(), + format!( + "non-leaf proof for key {} returned no value bytes", + hex::encode(target_key) + ), + ) + })?; + + Ok((value_bytes, root_hash, proved.proof)) +} + +/// Enforce the layer-chain hash equality plus, at the terminal layer, +/// the leaf-tree-type invariant. +/// +/// At intermediate depths the only requirement is that the element be +/// *some* tree (we have to descend further). At the terminal depth — the +/// last path element, whose inner Merk is the actual aggregate target — +/// the element MUST deserialize to `Element::ProvableSumTree` (after +/// wrapper unwrapping). 
Without this check, an empty Merk-backed tree of +/// any other type at the leaf accepts a forged empty leaf proof, because +/// every empty Merk-backed tree has `inner_root = NULL_HASH` and so its +/// stored `value_hash = combine_hash(H(bytes), NULL_HASH)` — the chain +/// check passes uniformly. The honest prover-side gate in +/// `Merk::prove_aggregate_sum_on_range` already rejects non-ProvableSumTree +/// inputs; this is the matching verifier-side gate. +pub(super) fn enforce_lower_chain( + path_query: &PathQuery, + target_key: &[u8], + proven_value_bytes: &[u8], + lower_hash: &CryptoHash, + parent_proof_hash: &CryptoHash, + is_terminal: bool, + grove_version: &GroveVersion, +) -> Result<(), Error> { + let element = Element::deserialize(proven_value_bytes, grove_version) + .map_err(|e| { + Error::InvalidProof( + path_query.clone(), + format!( + "non-leaf proof's element at key {} failed to deserialize: {}", + hex::encode(target_key), + e + ), + ) + })? + .into_underlying(); + if is_terminal { + if !matches!(element, Element::ProvableSumTree(..)) { + return Err(Error::InvalidProof( + path_query.clone(), + format!( + "aggregate-sum proof's terminal path element at key {} must be a \ + ProvableSumTree (got {}); a sum aggregate is only meaningful against \ + a tree that binds its sum into the node hash", + hex::encode(target_key), + element.type_str() + ), + )); + } + } else if !element.is_any_tree() { + return Err(Error::InvalidProof( + path_query.clone(), + format!( + "aggregate-sum proof's intermediate path element at key {} is not a tree \ + element (got {}); sum queries can only descend through tree elements", + hex::encode(target_key), + element.type_str() + ), + )); + } + + let value_h = value_hash(proven_value_bytes).value().to_owned(); + let combined = combine_hash(&value_h, lower_hash).value().to_owned(); + if combined != *parent_proof_hash { + return Err(Error::InvalidProof( + path_query.clone(), + format!( + "aggregate-sum proof chain mismatch at key 
{}: parent recorded value_hash \ + {} but combine_hash(H(value), lower_root) is {}", + hex::encode(target_key), + hex::encode(parent_proof_hash), + hex::encode(combined) + ), + )); + } + Ok(()) +} diff --git a/grovedb/src/operations/proof/aggregate_sum/leaf_chain.rs b/grovedb/src/operations/proof/aggregate_sum/leaf_chain.rs new file mode 100644 index 000000000..06fd6cbca --- /dev/null +++ b/grovedb/src/operations/proof/aggregate_sum/leaf_chain.rs @@ -0,0 +1,120 @@ +//! Leaf-chain walker: descends `path_query.path` via single-key existence +//! proofs and delegates to the merk-level sum verifier at the leaf +//! merk. Drives the single-`i64` entry point +//! [`crate::GroveDb::verify_aggregate_sum_query`]. +//! +//! Mirror of [`super::super::aggregate_count::leaf_chain`] for the +//! `ProvableSumTree` flavor. +//! +//! V0 (`MerkOnlyLayerProof`) envelopes are rejected at the entry-point +//! gate in [`super::mod`] before they reach this walker — V0 predates +//! the aggregate-sum feature and cannot legitimately carry one. + +use grovedb_merk::CryptoHash; +use grovedb_query::QueryItem; +use grovedb_version::version::GroveVersion; + +use crate::{ + operations::proof::{ + aggregate_sum::helpers::{ + enforce_lower_chain, expect_merk_bytes, verify_single_key_layer_proof_v0, + verify_sum_leaf, + }, + LayerProof, + }, + Error, PathQuery, +}; + +/// Walk `path_query.path` layer by layer through `layer.lower_layers`, +/// verifying a single-key existence proof at each non-leaf depth and +/// delegating to [`verify_sum_leaf`] at the leaf. At each non-leaf step, +/// the chain check `combine_hash(H(value), lower_root) == +/// parent_value_hash` ties the layer's sum to the GroveDB root hash. 
+pub(super) fn verify_v1_leaf_chain(
+    layer: &LayerProof,
+    path_query: &PathQuery,
+    path_keys: &[&[u8]],
+    depth: usize,
+    inner_range: &QueryItem,
+    grove_version: &GroveVersion,
+) -> Result<(CryptoHash, i64), Error> {
+    let merk_bytes = expect_merk_bytes(&layer.merk_proof, path_query)?;
+
+    if depth == path_keys.len() {
+        // Strict-shape gate: a leaf-shape aggregate-sum proof terminates
+        // in the merk that holds the actual sum proof; that merk is a
+        // *leaf* of the GroveDB-proof envelope and must carry no further
+        // `lower_layers`. Without this check, an attacker can attach
+        // arbitrary unverified `LayerProof`s under the leaf and produce
+        // byte-distinct envelopes that all verify to the same `(root,
+        // sum)`, harming determinism (caching, deduplication) and
+        // enlarging the attack surface for downstream consumers that
+        // syntactically scan proof structure.
+        if !layer.lower_layers.is_empty() {
+            return Err(Error::InvalidProof(
+                path_query.clone(),
+                "aggregate-sum proof contains unexpected lower layers below the leaf merk"
+                    .to_string(),
+            ));
+        }
+        return verify_sum_leaf(merk_bytes, inner_range, path_query);
+    }
+
+    let next_key = path_keys[depth].to_vec();
+    // Strict-shape gate (size): at each non-leaf depth the honest
+    // prover emits exactly one `lower_layers` entry — the descent
+    // into the next path key. Reject any other count (extra siblings,
+    // empty map, etc.) so the verified path-prefix is unambiguous
+    // and proofs are uniquely byte-shaped.
+ if layer.lower_layers.len() != 1 { + return Err(Error::InvalidProof( + path_query.clone(), + format!( + "aggregate-sum proof has {} lower-layer entries at depth {} (expected exactly \ + one entry for path key {})", + layer.lower_layers.len(), + depth, + hex::encode(&next_key) + ), + )); + } + let (proven_value_bytes, parent_root_hash, parent_proof_hash) = + verify_single_key_layer_proof_v0(merk_bytes, &next_key, path_query)?; + + // Strict-shape gate (key): the sole entry must be under the + // expected descent key. A `lower_layers` map with one entry under + // an unexpected key indicates either tampering or a proof for a + // different path — both must be rejected. + let lower_layer = layer.lower_layers.get(&next_key).ok_or_else(|| { + Error::InvalidProof( + path_query.clone(), + format!( + "aggregate-sum proof's sole lower-layer entry at depth {} is not keyed by \ + the expected path key {}", + depth, + hex::encode(&next_key) + ), + ) + })?; + let (lower_hash, sum) = verify_v1_leaf_chain( + lower_layer, + path_query, + path_keys, + depth + 1, + inner_range, + grove_version, + )?; + + let is_terminal = depth + 1 == path_keys.len(); + enforce_lower_chain( + path_query, + &next_key, + &proven_value_bytes, + &lower_hash, + &parent_proof_hash, + is_terminal, + grove_version, + )?; + + Ok((parent_root_hash, sum)) +} diff --git a/grovedb/src/operations/proof/aggregate_sum/mod.rs b/grovedb/src/operations/proof/aggregate_sum/mod.rs new file mode 100644 index 000000000..3ca2ca9c0 --- /dev/null +++ b/grovedb/src/operations/proof/aggregate_sum/mod.rs @@ -0,0 +1,128 @@ +//! GroveDB-side prove/verify glue for `AggregateSumOnRange` queries. +//! +//! Mirror of [`super::aggregate_count`] for the `ProvableSumTree` flavor. +//! The merk-level pieces live in `grovedb_merk::proofs::query::aggregate_sum` +//! (proof generation in `Merk::prove_aggregate_sum_on_range`, proof +//! verification in `verify_aggregate_sum_on_range_proof`). This module adds +//! 
the GroveDB-level *envelope* handling: a verifier that walks the +//! multi-layer `GroveDBProof` chain (parent merk → ... → leaf merk), +//! verifies the path-element existence proofs at each non-leaf layer, and +//! delegates to the merk-level sum verifier at the leaf. +//! +//! The proof generator side is wired directly into +//! [`GroveDb::prove_subqueries`] / [`GroveDb::prove_subqueries_v1`] — see +//! the "Aggregate-sum short-circuit" branches there. +//! +//! ## Module layout +//! +//! - [`leaf_chain`] — the recursive walker that descends `path_query.path` +//! layer by layer and delegates to the merk-level sum verifier at the +//! leaf. +//! - [`helpers`] — shared utilities (envelope decode, single-key layer +//! verification, chain enforcement, leaf sum verification). +//! +//! Unlike [`super::aggregate_count`], `AggregateSumOnRange` only supports +//! the leaf shape (a single `AggregateSumOnRange(_)` item at the top level +//! of the inner `Query`). The carrier shape (outer `Key`/`Range*` items +//! routing to an aggregate-sum subquery) is not yet wired in the merk-level +//! prover, so there is no per-key entry point or classification module. + +mod helpers; +mod leaf_chain; + +use grovedb_merk::CryptoHash; +use grovedb_version::{check_grovedb_v0, version::GroveVersion}; + +use crate::{ + operations::proof::{GroveDBProof, GroveDBProofV1, LayerProof}, + Error, GroveDb, PathQuery, +}; + +impl GroveDb { + /// Verify a serialized `prove_query` proof against an + /// `AggregateSumOnRange` `PathQuery`, returning the GroveDB root hash + /// and the verified signed sum. + /// + /// `path_query` must satisfy + /// [`PathQuery::validate_aggregate_sum_on_range`] — a single + /// `AggregateSumOnRange(_)` item, no subqueries, no pagination, and an + /// inner range that isn't `Key`, `RangeFull`, another + /// `AggregateSumOnRange`, or an `AggregateCountOnRange`. Any other + /// shape is rejected up front with `Error::InvalidQuery` before any + /// bytes are decoded. 
+ /// + /// `AggregateSumOnRange` requires **V1 proof envelopes** + /// (`GroveDBProofV1`). V0 (`GroveDBProofV0` / `MerkOnlyLayerProof`) + /// envelopes predate the aggregate-sum feature and are only produced by + /// grove versions older than the one used by Dash Platform v12; this + /// entry point rejects them with `Error::InvalidProof`. + /// + /// Returns: + /// - `root_hash` — the reconstructed GroveDB root hash. The caller is + /// responsible for comparing this against their trusted root hash. + /// - `sum` — the signed `i64` sum of children with keys in the inner + /// range that were committed by the proof. + /// + /// Cryptographic guarantees: + /// - At each non-leaf layer, a regular single-key merk proof + /// demonstrates that the next path element exists with the recorded + /// value bytes; the verifier checks the chain + /// `combine_hash(H(value), lower_hash) == parent_proof_hash` so a + /// forged path is impossible without a root-hash mismatch. + /// - At the leaf layer, the sum is committed by `HashWithSum`'s + /// `node_hash_with_sum(kv_hash, left, right, sum)` recomputation — + /// tampering with the sum produces a different reconstructed merk + /// root, and the chain check above then fails. + /// - The leaf-level verifier uses an `i128` accumulator and rejects + /// any result that doesn't fit in `i64`, so adversarial extremes + /// like two `i64::MAX` children cannot silently wrap. 
+ pub fn verify_aggregate_sum_query( + proof: &[u8], + path_query: &PathQuery, + grove_version: &GroveVersion, + ) -> Result<(CryptoHash, i64), Error> { + check_grovedb_v0!( + "verify_aggregate_sum_query", + grove_version + .grovedb_versions + .operations + .proof + .verify_query_with_options + ); + + let inner_range = path_query.validate_aggregate_sum_on_range()?.clone(); + + let grovedb_proof = super::decode_grovedb_proof_canonical(proof)?; + let path_keys: Vec<&[u8]> = path_query.path.iter().map(|p| p.as_slice()).collect(); + + let root_layer = require_v1_envelope(&grovedb_proof, path_query)?; + leaf_chain::verify_v1_leaf_chain( + root_layer, + path_query, + &path_keys, + 0, + &inner_range, + grove_version, + ) + } +} + +/// Extract the V1 root layer from a `GroveDBProof` envelope, or refuse +/// the proof. `AggregateSumOnRange` requires V1 envelopes — the V0 +/// (`MerkOnlyLayerProof`) envelope predates the aggregate-sum feature and +/// is only emitted by grove versions older than the one used by Dash +/// Platform v12, so it cannot legitimately contain an aggregate-sum proof. +fn require_v1_envelope<'a>( + proof: &'a GroveDBProof, + path_query: &PathQuery, +) -> Result<&'a LayerProof, Error> { + match proof { + GroveDBProof::V1(GroveDBProofV1 { root_layer }) => Ok(root_layer), + GroveDBProof::V0(_) => Err(Error::InvalidProof( + path_query.clone(), + "AggregateSumOnRange proofs require V1 proof envelopes; V0 envelopes predate \ + this feature and cannot legitimately carry an aggregate-sum proof" + .to_string(), + )), + } +} diff --git a/grovedb/src/operations/proof/generate.rs b/grovedb/src/operations/proof/generate.rs index b87a9d004..6f385d6bc 100644 --- a/grovedb/src/operations/proof/generate.rs +++ b/grovedb/src/operations/proof/generate.rs @@ -123,6 +123,14 @@ impl GroveDb { if is_acor_query && let Err(e) = path_query.validate_aggregate_count_on_range() { return Err(e).wrap_with_cost(OperationCost::default()); } + // Mirror of the count gate for sum. 
Same defense-in-depth: catch + // malformed `AggregateSumOnRange` shapes up front so the prover + // never silently returns a regular proof for a path that doesn't + // exist. + let is_asor_query = path_query.query.query.has_aggregate_sum_on_range_anywhere(); + if is_asor_query && let Err(e) = path_query.validate_aggregate_sum_on_range() { + return Err(e).wrap_with_cost(OperationCost::default()); + } let prove_version = grove_version .grovedb_versions @@ -145,6 +153,20 @@ impl GroveDb { .wrap_with_cost(OperationCost::default()); } + // Mirror of the count V0 gate for sum. `AggregateSumOnRange` + // postdates V0 envelopes for the same reason as count, so a V0 + // aggregate-sum proof can never be honestly produced; refuse + // the combination here so callers see a clear `NotSupported` + // instead of a downstream verifier rejection. + if is_asor_query && prove_version == 0 { + return Err(Error::NotSupported( + "AggregateSumOnRange proofs require V1 proof envelopes; upgrade the grove \ + version producing the proof" + .to_string(), + )) + .wrap_with_cost(OperationCost::default()); + } + match prove_version { 0 => self.prove_query_non_serialized_v0(path_query, prove_options, grove_version), 1 => self.prove_query_non_serialized_v1(path_query, prove_options, grove_version), @@ -331,6 +353,33 @@ impl GroveDb { .wrap_with_cost(cost); } + // Aggregate-sum short-circuit (mirror of count). Same contract: any + // `AggregateSumOnRange` at this level requires the whole `PathQuery` + // to be well-formed; the validate call surfaces the precise error + // otherwise. 
+ if query.items.iter().any(QueryItem::is_aggregate_sum_on_range) { + let inner_range = cost_return_on_error_no_add!( + cost, + path_query.validate_aggregate_sum_on_range().cloned() + ); + let (sum_ops, _sum) = cost_return_on_error!( + &mut cost, + subtree + .prove_aggregate_sum_on_range(&inner_range, grove_version) + .map_err(|e| Error::CorruptedData(format!( + "prove_aggregate_sum_on_range failed: {}", + e + ))) + ); + let mut serialized = Vec::with_capacity(128); + encode_into(sum_ops.iter(), &mut serialized); + return Ok(MerkOnlyLayerProof { + merk_proof: serialized, + lower_layers: BTreeMap::new(), + }) + .wrap_with_cost(cost); + } + let mut merk_proof = cost_return_on_error!( &mut cost, self.generate_merk_proof( @@ -368,14 +417,18 @@ impl GroveDb { done_with_results |= overall_limit == &Some(0); // Check if node should preserve its special type before destructuring // We need this flag to avoid converting it to Node::KV later - // - KVValueHashFeatureType: used by ProvableCountTree for trees/references + // - KVValueHashFeatureType: used by ProvableCountTree / ProvableSumTree for + // trees/references // - KVCount: used by ProvableCountTree for Items (tamper-resistant with count) + // - KVSum: used by ProvableSumTree for Items (tamper-resistant with sum) let should_preserve_node_type = matches!( op, Op::Push(Node::KVValueHashFeatureType(..)) | Op::PushInverted(Node::KVValueHashFeatureType(..)) | Op::Push(Node::KVCount(..)) | Op::PushInverted(Node::KVCount(..)) + | Op::Push(Node::KVSum(..)) + | Op::PushInverted(Node::KVSum(..)) ); // Extract count if present for ProvableCountTree references let count_for_ref = match op { @@ -386,11 +439,25 @@ impl GroveDb { }, _ => None, }; + // Extract sum if present for ProvableSumTree references. 
Mirrors + // count_for_ref — the merk layer emits `KVValueHashFeatureType` + // with a `ProvableSummedMerkNode(sum)` feature for references; + // the GroveDB layer rewrites that to `KVRefValueHashSum` with + // the dereferenced value. + let sum_for_ref = match op { + Op::Push(Node::KVValueHashFeatureType(_, _, _, ft)) + | Op::PushInverted(Node::KVValueHashFeatureType(_, _, _, ft)) => match ft { + TreeFeatureType::ProvableSummedMerkNode(sum) => Some(*sum), + _ => None, + }, + _ => None, + }; match op { Op::Push(node) | Op::PushInverted(node) => match node { Node::KV(key, value) | Node::KVValueHash(key, value, ..) | Node::KVCount(key, value, _) + | Node::KVSum(key, value, _) | Node::KVValueHashFeatureType(key, value, ..) if !done_with_results => { @@ -439,9 +506,23 @@ impl GroveDb { .wrap_with_cost(cost); } - // Use KVRefValueHashCount if in ProvableCountTree, - // otherwise use KVRefValueHash - *node = if let Some(count) = count_for_ref { + // Dispatch priority: + // ProvableSumTree references -> KVRefValueHashSum + // ProvableCountTree references -> KVRefValueHashCount + // regular references -> KVRefValueHash + // The two ref-aggregate flags are mutually + // exclusive (a ref child sees one parent + // tree type), but Sum takes priority if both + // are erroneously set — Sum-in-hash is the + // stricter invariant. 
+ *node = if let Some(sum) = sum_for_ref { + Node::KVRefValueHashSum( + key.to_owned(), + serialized_referenced_elem.expect("confirmed ok above"), + value_hash(value).unwrap_add_cost(&mut cost), + sum, + ) + } else if let Some(count) = count_for_ref { Node::KVRefValueHashCount( key.to_owned(), serialized_referenced_elem.expect("confirmed ok above"), @@ -472,6 +553,7 @@ impl GroveDb { // Only convert to Node::KV if not already a special node type // - KVValueHashFeatureType: preserves feature_type for trees/refs // - KVCount: preserves count for Items in ProvableCountTree + // - KVSum: preserves sum for Items in ProvableSumTree if !should_preserve_node_type { *node = Node::KV(key.to_owned(), value.to_owned()); } @@ -487,6 +569,7 @@ impl GroveDb { | Ok(Element::CountSumTree(Some(_), ..)) | Ok(Element::ProvableCountTree(Some(_), ..)) | Ok(Element::ProvableCountSumTree(Some(_), ..)) + | Ok(Element::ProvableSumTree(Some(_), ..)) | Ok(Element::CommitmentTree(..)) if !done_with_results && query.has_subquery_or_matching_in_path_on_key(key) => @@ -556,6 +639,7 @@ impl GroveDb { | Ok(Element::ProvableCountTree(..)) | Ok(Element::CountSumTree(..)) | Ok(Element::ProvableCountSumTree(..)) + | Ok(Element::ProvableSumTree(..)) | Ok(Element::CommitmentTree(..)) | Ok(Element::MmrTree(..)) | Ok(Element::BulkAppendTree(..)) @@ -595,6 +679,7 @@ impl GroveDb { | Ok(Element::CountSumTree(..)) | Ok(Element::ProvableCountTree(..)) | Ok(Element::ProvableCountSumTree(..)) + | Ok(Element::ProvableSumTree(..)) | Ok(Element::CommitmentTree(..)) | Ok(Element::MmrTree(..)) | Ok(Element::BulkAppendTree(..)) @@ -1116,6 +1201,31 @@ impl GroveDb { .wrap_with_cost(cost); } + // Aggregate-sum short-circuit (v1 path). Mirror of the count v1 + // branch. 
+ if query.items.iter().any(QueryItem::is_aggregate_sum_on_range) { + let inner_range = cost_return_on_error_no_add!( + cost, + path_query.validate_aggregate_sum_on_range().cloned() + ); + let (sum_ops, _sum) = cost_return_on_error!( + &mut cost, + subtree + .prove_aggregate_sum_on_range(&inner_range, grove_version) + .map_err(|e| Error::CorruptedData(format!( + "prove_aggregate_sum_on_range failed: {}", + e + ))) + ); + let mut serialized = Vec::with_capacity(128); + encode_into(sum_ops.iter(), &mut serialized); + return Ok(LayerProof { + merk_proof: ProofBytes::Merk(serialized), + lower_layers: BTreeMap::new(), + }) + .wrap_with_cost(cost); + } + // Whether the surrounding query is an aggregate-count carrier: // empty trees that match a `subquery_path` step still need a // lower-layer descent so the aggregate-count short-circuit can @@ -1144,12 +1254,16 @@ impl GroveDb { for op in merk_proof.proof.iter_mut() { done_with_results |= overall_limit == &Some(0); + // Mirror generate.rs's first ref-rewriting loop — preserve + // ProvableSumTree special nodes too. let should_preserve_node_type = matches!( op, Op::Push(Node::KVValueHashFeatureType(..)) | Op::PushInverted(Node::KVValueHashFeatureType(..)) | Op::Push(Node::KVCount(..)) | Op::PushInverted(Node::KVCount(..)) + | Op::Push(Node::KVSum(..)) + | Op::PushInverted(Node::KVSum(..)) ); let count_for_ref = match op { Op::Push(Node::KVValueHashFeatureType(_, _, _, ft)) @@ -1159,12 +1273,21 @@ impl GroveDb { }, _ => None, }; + let sum_for_ref = match op { + Op::Push(Node::KVValueHashFeatureType(_, _, _, ft)) + | Op::PushInverted(Node::KVValueHashFeatureType(_, _, _, ft)) => match ft { + TreeFeatureType::ProvableSummedMerkNode(sum) => Some(*sum), + _ => None, + }, + _ => None, + }; match op { Op::Push(node) | Op::PushInverted(node) => match node { Node::KV(key, value) | Node::KVValueHash(key, value, ..) | Node::KVCount(key, value, _) + | Node::KVSum(key, value, _) | Node::KVValueHashFeatureType(key, value, ..) 
if !done_with_results => { @@ -1209,7 +1332,14 @@ impl GroveDb { .wrap_with_cost(cost); } - *node = if let Some(count) = count_for_ref { + *node = if let Some(sum) = sum_for_ref { + Node::KVRefValueHashSum( + key.to_owned(), + serialized_referenced_elem.expect("confirmed ok above"), + value_hash(value).unwrap_add_cost(&mut cost), + sum, + ) + } else if let Some(count) = count_for_ref { Node::KVRefValueHashCount( key.to_owned(), serialized_referenced_elem.expect("confirmed ok above"), @@ -1357,6 +1487,7 @@ impl GroveDb { | Ok(Element::CountSumTree(Some(_), ..)) | Ok(Element::ProvableCountTree(Some(_), ..)) | Ok(Element::ProvableCountSumTree(Some(_), ..)) + | Ok(Element::ProvableSumTree(Some(_), ..)) if !done_with_results && query.has_subquery_or_matching_in_path_on_key(key) => { @@ -1403,6 +1534,7 @@ impl GroveDb { | Ok(Element::ProvableCountTree(Some(_), ..)) | Ok(Element::CountSumTree(Some(_), ..)) | Ok(Element::ProvableCountSumTree(Some(_), ..)) + | Ok(Element::ProvableSumTree(Some(_), ..)) if !done_with_results => { // Non-empty tree without subquery: inject child @@ -1490,6 +1622,7 @@ impl GroveDb { | Ok(Element::ProvableCountTree(None, ..)) | Ok(Element::CountSumTree(None, ..)) | Ok(Element::ProvableCountSumTree(None, ..)) + | Ok(Element::ProvableSumTree(None, ..)) | Ok(Element::CommitmentTree(..)) if !done_with_results => { @@ -1512,6 +1645,7 @@ impl GroveDb { | Ok(Element::CountSumTree(..)) | Ok(Element::ProvableCountTree(..)) | Ok(Element::ProvableCountSumTree(..)) + | Ok(Element::ProvableSumTree(..)) | Ok(Element::CommitmentTree(..)) | Ok(Element::MmrTree(..)) | Ok(Element::BulkAppendTree(..)) @@ -2025,6 +2159,12 @@ impl GroveDb { not on dense fixed-size merkle trees", )); } + QueryItem::AggregateSumOnRange(_) => { + return Err(Error::InvalidInput( + "AggregateSumOnRange is only supported on provable sum trees, \ + not on dense fixed-size merkle trees", + )); + } } } @@ -2149,6 +2289,12 @@ impl GroveDb { not on MMR trees", )); } + 
QueryItem::AggregateSumOnRange(_) => { + return Err(Error::InvalidInput( + "AggregateSumOnRange is only supported on provable sum trees, \ + not on MMR trees", + )); + } } } @@ -2223,6 +2369,12 @@ impl GroveDb { not on BulkAppendTree", )); } + QueryItem::AggregateSumOnRange(_) => { + return Err(Error::InvalidInput( + "AggregateSumOnRange is only supported on provable sum trees, \ + not on BulkAppendTree", + )); + } } } @@ -2455,4 +2607,57 @@ mod tests { other => panic!("expected InvalidInput, got {:?}", other), } } + + // ----------------------------------------------------------------------- + // AggregateSumOnRange rejection on non-provable-sum tree types. + // + // Same rationale as the count side: `AggregateSumOnRange` is only valid + // against `ProvableSumTree` (binds sum into the node hash via + // `node_hash_with_sum`). Dense / MMR / BulkAppend trees must reject. + // ----------------------------------------------------------------------- + + #[test] + fn dense_tree_rejects_aggregate_sum_on_range() { + let inner = QueryItem::RangeInclusive(be_u16(0)..=be_u16(5)); + let items = vec![QueryItem::AggregateSumOnRange(Box::new(inner))]; + let err = GroveDb::query_items_to_positions(&items, 100) + .expect_err("dense tree must reject AggregateSumOnRange"); + match err { + Error::InvalidInput(msg) => assert!( + msg.contains("dense fixed-size") || msg.contains("provable sum"), + "unexpected message: {msg}" + ), + other => panic!("expected InvalidInput, got {:?}", other), + } + } + + #[test] + fn mmr_tree_rejects_aggregate_sum_on_range() { + let inner = QueryItem::RangeInclusive(be_u64(0)..=be_u64(5)); + let items = vec![QueryItem::AggregateSumOnRange(Box::new(inner))]; + let err = GroveDb::query_items_to_leaf_indices(&items, 7) + .expect_err("MMR must reject AggregateSumOnRange"); + match err { + Error::InvalidInput(msg) => assert!( + msg.contains("MMR") || msg.contains("provable sum"), + "unexpected message: {msg}" + ), + other => panic!("expected InvalidInput, got 
{:?}", other), + } + } + + #[test] + fn bulk_append_tree_rejects_aggregate_sum_on_range() { + let inner = QueryItem::RangeInclusive(be_u64(0)..=be_u64(5)); + let items = vec![QueryItem::AggregateSumOnRange(Box::new(inner))]; + let err = GroveDb::query_items_to_range(&items, 100) + .expect_err("BulkAppendTree must reject AggregateSumOnRange"); + match err { + Error::InvalidInput(msg) => assert!( + msg.contains("BulkAppendTree") || msg.contains("provable sum"), + "unexpected message: {msg}" + ), + other => panic!("expected InvalidInput, got {:?}", other), + } + } } diff --git a/grovedb/src/operations/proof/mod.rs b/grovedb/src/operations/proof/mod.rs index 74784f1c9..6ef04217f 100644 --- a/grovedb/src/operations/proof/mod.rs +++ b/grovedb/src/operations/proof/mod.rs @@ -2,6 +2,8 @@ #[cfg(any(feature = "minimal", feature = "verify"))] mod aggregate_count; +#[cfg(any(feature = "minimal", feature = "verify"))] +mod aggregate_sum; #[cfg(feature = "minimal")] mod generate; /// Utility functions for proof display and conversion. @@ -18,6 +20,34 @@ use std::{collections::BTreeMap, fmt}; /// fitting comfortably within typical stack sizes. pub const MAX_PROOF_DEPTH: usize = 128; +/// Decode a serialized [`GroveDBProof`] envelope using the same bincode +/// configuration the prover writes out. +/// +/// Decoding is canonical: trailing bytes beyond the encoded envelope are +/// rejected. Without this check the same `(RootHash, payload)` could be +/// reconstructed from many different proof byte-strings (a proof and the +/// same proof with arbitrary suffix bytes), which is harmless for the +/// chain-bound correctness guarantee but breaks any equality-by-bytes +/// assumption a caller might rely on (caching, deduplication, hashing +/// the proof itself). +/// +/// Shared by the aggregate-count and aggregate-sum verifier entry +/// points so the canonical-decode contract has exactly one definition. 
+pub(super) fn decode_grovedb_proof_canonical(proof: &[u8]) -> Result<GroveDBProof, Error> {
+    let config = bincode::config::standard()
+        .with_big_endian()
+        .with_limit::<{ 256 * 1024 * 1024 }>();
+    let (decoded, consumed) = bincode::decode_from_slice(proof, config)
+        .map_err(|e| Error::CorruptedData(format!("unable to decode proof: {}", e)))?;
+    if consumed != proof.len() {
+        return Err(Error::CorruptedData(format!(
+            "proof has {} trailing bytes after the encoded envelope",
+            proof.len() - consumed
+        )));
+    }
+    Ok(decoded)
+}
+
 use bincode::{
     de::{BorrowDecoder, Decoder as BincodeDecoder},
     error::DecodeError,
@@ -747,6 +777,36 @@ fn node_to_string(node: &Node) -> Result<String, Error> {
             hex::encode(right_child_hash),
             count
         ),
+        // ProvableSumTree proof variants.
+        Node::KVSum(key, value, sum) => format!(
+            "KVSum({}, {}, {})",
+            hex_to_ascii(key),
+            element_hex_to_ascii(value)?,
+            sum
+        ),
+        Node::KVHashSum(kv_hash, sum) => {
+            format!("KVHashSum(HASH[{}], {})", hex::encode(kv_hash), sum)
+        }
+        Node::KVRefValueHashSum(key, value, value_hash, sum) => format!(
+            "KVRefValueHashSum({}, {}, HASH[{}], {})",
+            hex_to_ascii(key),
+            element_hex_to_ascii(value)?,
+            hex::encode(value_hash),
+            sum
+        ),
+        Node::KVDigestSum(key, value_hash, sum) => format!(
+            "KVDigestSum({}, HASH[{}], {})",
+            hex_to_ascii(key),
+            hex::encode(value_hash),
+            sum
+        ),
+        Node::HashWithSum(kv_hash, left_child_hash, right_child_hash, sum) => format!(
+            "HashWithSum(kv_hash=HASH[{}], left=HASH[{}], right=HASH[{}], sum={})",
+            hex::encode(kv_hash),
+            hex::encode(left_child_hash),
+            hex::encode(right_child_hash),
+            sum
+        ),
     };
     Ok(s)
 }
diff --git a/grovedb/src/operations/proof/verify.rs b/grovedb/src/operations/proof/verify.rs
index 0431e92c2..1c175f73d 100644
--- a/grovedb/src/operations/proof/verify.rs
+++ b/grovedb/src/operations/proof/verify.rs
@@ -61,12 +61,7 @@ impl GroveDb {
             ));
         }
 
-        let config = bincode::config::standard()
-            .with_big_endian()
-            .with_limit::<{ 256 * 1024 * 1024 }>();
-        let grovedb_proof: GroveDBProof = 
bincode::decode_from_slice(proof, config) - .map_err(|e| Error::CorruptedData(format!("unable to decode proof: {}", e)))? - .0; + let grovedb_proof = super::decode_grovedb_proof_canonical(proof)?; let (root_hash, _, result) = Self::verify_proof_internal(&grovedb_proof, query, options, grove_version)?; @@ -110,12 +105,7 @@ impl GroveDb { )); } - let config = bincode::config::standard() - .with_big_endian() - .with_limit::<{ 256 * 1024 * 1024 }>(); - let grovedb_proof: GroveDBProof = bincode::decode_from_slice(proof, config) - .map_err(|e| Error::CorruptedData(format!("unable to decode proof: {}", e)))? - .0; + let grovedb_proof = super::decode_grovedb_proof_canonical(proof)?; let (root_hash, tree_feature_type, result) = Self::verify_proof_internal(&grovedb_proof, query, options, grove_version)?; @@ -143,12 +133,7 @@ impl GroveDb { .proof .verify_query_raw ); - let config = bincode::config::standard() - .with_big_endian() - .with_limit::<{ 256 * 1024 * 1024 }>(); - let grovedb_proof: GroveDBProof = bincode::decode_from_slice(proof, config) - .map_err(|e| Error::CorruptedData(format!("unable to decode proof: {}", e)))? - .0; + let grovedb_proof = super::decode_grovedb_proof_canonical(proof)?; let (root_hash, _, result) = Self::verify_proof_raw_internal( &grovedb_proof, @@ -501,6 +486,7 @@ impl GroveDb { | Element::CountSumTree(Some(_), ..) | Element::ProvableCountTree(Some(_), ..) | Element::ProvableCountSumTree(Some(_), ..) + | Element::ProvableSumTree(Some(_), ..) | Element::CommitmentTree(..) | Element::MmrTree(..) | Element::BulkAppendTree(..) @@ -634,6 +620,7 @@ impl GroveDb { | Element::CountSumTree(None, ..) | Element::ProvableCountTree(None, ..) | Element::ProvableCountSumTree(None, ..) + | Element::ProvableSumTree(None, ..) | Element::SumItem(..) | Element::Item(..) | Element::ItemWithSumItem(..) 
@@ -1241,6 +1228,12 @@ impl GroveDb { not on BulkAppendTree", )); } + QueryItem::AggregateSumOnRange(_) => { + return Err(Error::InvalidInput( + "AggregateSumOnRange is only supported on provable sum trees, \ + not on BulkAppendTree", + )); + } } } @@ -1365,6 +1358,12 @@ impl GroveDb { not on this tree type", )); } + QueryItem::AggregateSumOnRange(_) => { + return Err(Error::InvalidInput( + "AggregateSumOnRange is only supported on provable sum trees, \ + not on this tree type", + )); + } } } @@ -1487,7 +1486,8 @@ impl GroveDb { | Element::CountTree(Some(_), ..) | Element::CountSumTree(Some(_), ..) | Element::ProvableCountTree(Some(_), ..) - | Element::ProvableCountSumTree(Some(_), ..) => { + | Element::ProvableCountSumTree(Some(_), ..) + | Element::ProvableSumTree(Some(_), ..) => { path.push(key); *last_parent_tree_type = element.tree_feature_type(); if query.query_items_at_path(&path, grove_version)?.is_none() { @@ -1610,6 +1610,7 @@ impl GroveDb { | Element::CountSumTree(None, ..) | Element::ProvableCountTree(None, ..) | Element::ProvableCountSumTree(None, ..) + | Element::ProvableSumTree(None, ..) | Element::CommitmentTree(..) | Element::MmrTree(..) | Element::BulkAppendTree(..) @@ -1969,12 +1970,7 @@ impl GroveDb { query: &PathTrunkChunkQuery, grove_version: &GroveVersion, ) -> Result<(CryptoHash, GroveTrunkQueryResult), Error> { - let config = bincode::config::standard() - .with_big_endian() - .with_limit::<{ 256 * 1024 * 1024 }>(); - let grovedb_proof: GroveDBProof = bincode::decode_from_slice(proof, config) - .map_err(|e| Error::CorruptedData(format!("unable to decode proof: {}", e)))? - .0; + let grovedb_proof = super::decode_grovedb_proof_canonical(proof)?; match grovedb_proof { GroveDBProof::V0(proof_v0) => { @@ -2596,9 +2592,18 @@ impl GroveDb { )); } } - Node::KVRefValueHash(..) | Node::KVRefValueHashCount(..) => { - // KVRefValueHash carries an opaque node_value_hash that cannot - // be recomputed from the value bytes alone. 
These node types + Node::KVRefValueHash(..) + | Node::KVRefValueHashCount(..) + | Node::KVRefValueHashSum(..) => { + // KVRefValueHash{,Count,Sum} carries an opaque + // node_value_hash that cannot be recomputed from the value + // bytes alone — the hash is `combine_hash(node_value_hash, + // value_hash(referenced_value))`, and the verifier never + // gets to see the referenced_value at this layer. Without + // this rejection, a forged value could ride along in a + // KVRefValueHashSum trunk/branch node while the merk-level + // hash chain still appears valid, because the embedded + // opaque hash is treated as authoritative. These node types // should never appear in trunk/branch chunk proofs. return Err(Error::InvalidProof( PathQuery::new_unsized(Vec::new(), Query::default()), @@ -2681,14 +2686,19 @@ impl GroveDb { | Node::KVValueHashFeatureTypeWithChildHash(key, value, ..) | Node::KVCount(key, value, ..) | Node::KVRefValueHash(key, value, ..) - | Node::KVRefValueHashCount(key, value, ..) => Some((key.clone(), value.clone())), + | Node::KVRefValueHashCount(key, value, ..) + | Node::KVSum(key, value, ..) + | Node::KVRefValueHashSum(key, value, ..) => Some((key.clone(), value.clone())), // These nodes don't have values, only key+hash or just hash Node::KVDigest(..) | Node::KVDigestCount(..) + | Node::KVDigestSum(..) | Node::Hash(_) | Node::KVHash(_) | Node::KVHashCount(..) - | Node::HashWithCount(..) => None, + | Node::HashWithCount(..) + | Node::KVHashSum(..) + | Node::HashWithSum(..) => None, } } diff --git a/grovedb/src/query/mod.rs b/grovedb/src/query/mod.rs index d08c64cd6..c1db5ab9b 100644 --- a/grovedb/src/query/mod.rs +++ b/grovedb/src/query/mod.rs @@ -204,10 +204,31 @@ impl SizedQuery { } Ok(()) } + + /// Mirror of [`Self::validate_aggregate_count_on_range`] for + /// `AggregateSumOnRange`. Forwards to + /// [`Query::validate_aggregate_sum_on_range`] and additionally rejects + /// any non-`None` `limit` or `offset`. 
+    pub fn validate_aggregate_sum_on_range(&self) -> Result<&QueryItem, Error> {
+        if self.limit.is_some() {
+            return Err(Error::InvalidQuery(
+                "AggregateSumOnRange queries may not set SizedQuery::limit",
+            ));
+        }
+        if self.offset.is_some() {
+            return Err(Error::InvalidQuery(
+                "AggregateSumOnRange queries may not set SizedQuery::offset",
+            ));
+        }
+        self.query
+            .validate_aggregate_sum_on_range()
+            .map_err(sum_query_validation_error_to_static_str)
+            .map_err(Error::InvalidQuery)
+    }
 }
 
-/// Converts a `Query::validate_aggregate_count_on_range` error into a
-/// `&'static str`. Validation only ever returns
+/// Converts an aggregate-count-validation error into a `&'static str`.
+/// Validation only ever returns
 /// `grovedb_query::error::Error::InvalidOperation(&'static str)`, so this is
 /// just a projection of that variant; any other error variant (which would
 /// indicate an unrelated bug) is forwarded as a generic catch-all label.
@@ -218,6 +239,18 @@
     }
 }
 
+/// Sum-side mirror of [`query_validation_error_to_static_str`]. Same
+/// projection contract; only the catch-all label differs so logs and
+/// error surfaces stay self-describing per-aggregate-variant.
+pub(crate) fn sum_query_validation_error_to_static_str(
+    e: grovedb_query::error::Error,
+) -> &'static str {
+    match e {
+        grovedb_query::error::Error::InvalidOperation(msg) => msg,
+        _ => "AggregateSumOnRange query validation failed",
+    }
+}
+
 impl PathQuery {
     /// New path query
     pub const fn new(path: Vec<Vec<u8>>, query: SizedQuery) -> Self {
@@ -254,15 +287,63 @@
         Self::new_unsized(path, Query::new_aggregate_count_on_range(range))
     }
 
+    /// Mirror of [`Self::new_aggregate_count_on_range`] for
+    /// `AggregateSumOnRange`. Builds a `PathQuery` whose underlying query
+    /// asks for the cryptographically-verifiable sum of children with keys
+    /// in `range` against the `ProvableSumTree` rooted at `path`.
+    pub fn new_aggregate_sum_on_range(path: Vec<Vec<u8>>, range: QueryItem) -> Self {
+        Self::new_unsized(path, Query::new_aggregate_sum_on_range(range))
+    }
+
     /// Validates that this `PathQuery` is a well-formed
     /// `AggregateCountOnRange` query in either the leaf or carrier shape.
     /// On success, returns a reference to the leaf inner range item.
     ///
+    /// Rejects empty paths up-front. The GroveDB root merk is always a
+    /// `NormalTree` by API construction (and never a `ProvableCountTree`),
+    /// so a root-level aggregate-count query has no valid target —
+    /// `verify_v0_layer` and `verify_v1_layer` would otherwise hit the
+    /// `depth == path_keys.len()` short-circuit at depth 0, going
+    /// straight to the merk-level count verifier without ever invoking
+    /// the terminal-type gate in `enforce_lower_chain`. Although the
+    /// merk-level hash-divergence between `node_hash` and
+    /// `node_hash_with_count` makes a numeric forgery infeasible, an
+    /// up-front rejection gives a clear error and removes the gate
+    /// dependency on cryptographic hash analysis.
+    ///
     /// Forwards to [`SizedQuery::validate_aggregate_count_on_range`].
     pub fn validate_aggregate_count_on_range(&self) -> Result<&QueryItem, Error> {
+        if self.path.is_empty() {
+            return Err(Error::InvalidQuery(
+                "AggregateCountOnRange queries may not target the root merk: \
+                 the GroveDB root is always a NormalTree, never a \
+                 ProvableCountTree / ProvableCountSumTree, so a count \
+                 aggregate at the root layer has no valid target",
+            ));
+        }
         self.query.validate_aggregate_count_on_range()
     }
 
+    /// Validates that this `PathQuery` is a well-formed
+    /// `AggregateSumOnRange` query. On success, returns a reference to the
+    /// inner range item.
+    ///
+    /// Rejects empty paths up-front for the same reason as
+    /// [`Self::validate_aggregate_count_on_range`] — the GroveDB root
+    /// merk is always a `NormalTree`, never a `ProvableSumTree`. Forwards
+    /// to [`SizedQuery::validate_aggregate_sum_on_range`].
+ pub fn validate_aggregate_sum_on_range(&self) -> Result<&QueryItem, Error> { + if self.path.is_empty() { + return Err(Error::InvalidQuery( + "AggregateSumOnRange queries may not target the root merk: \ + the GroveDB root is always a NormalTree, never a \ + ProvableSumTree, so a sum aggregate at the root layer has \ + no valid target", + )); + } + self.query.validate_aggregate_sum_on_range() + } + /// Strict variant of [`Self::validate_aggregate_count_on_range`] that /// only accepts the **leaf** shape (single `AggregateCountOnRange(_)` /// item, no subqueries). @@ -278,6 +359,11 @@ impl PathQuery { self.query.query.aggregate_count_on_range().is_some() } + /// Mirror of [`Self::has_aggregate_count_on_range`] for the sum variant. + pub fn has_aggregate_sum_on_range(&self) -> bool { + self.query.query.aggregate_sum_on_range().is_some() + } + /// The max depth of the query, this is the maximum layers we could get back /// from grovedb /// If the max depth can not be calculated we get None diff --git a/grovedb/src/tests/aggregate_count_query_tests.rs b/grovedb/src/tests/aggregate_count_query_tests.rs index 9bc34e7e0..c02fa444d 100644 --- a/grovedb/src/tests/aggregate_count_query_tests.rs +++ b/grovedb/src/tests/aggregate_count_query_tests.rs @@ -1248,6 +1248,168 @@ mod tests { } } + /// Security regression: empty-path aggregate-count queries are + /// rejected at validation time, before any proof handling. + /// + /// `verify_aggregate_count_query` calls + /// `path_query.validate_aggregate_count_on_range()` at its entry. If + /// the path is empty, validation must fail — otherwise both + /// `verify_v0_layer` and `verify_v1_layer` would hit the + /// `depth == path_keys.len()` short-circuit at depth 0 and go + /// straight to the merk-level leaf verifier, never invoking the + /// terminal-type gate in `enforce_lower_chain`. The GroveDB root + /// merk is always a `NormalTree` by API construction, so a root + /// aggregate-count query has no valid target. 
+ #[test] + fn empty_path_aggregate_count_rejected_at_validation() { + let v = GroveVersion::latest(); + let pq = PathQuery::new_aggregate_count_on_range( + Vec::new(), + QueryItem::RangeFrom(b"a".to_vec()..), + ); + let err = pq + .validate_aggregate_count_on_range() + .expect_err("empty path must be rejected at validation"); + let msg = format!("{err}"); + assert!( + msg.contains("root") + && (msg.contains("ProvableCountTree") || msg.contains("ProvableCountSumTree")), + "expected message naming root + ProvableCountTree, got: {msg}" + ); + + let result = GroveDb::verify_aggregate_count_query(&[0u8; 4], &pq, v); + assert!( + result.is_err(), + "verify_aggregate_count_query must reject empty-path queries" + ); + } + + /// Security regression: empty-leaf type-confusion forgery + /// (parallel of `empty_leaf_type_confusion_forgery_rejected` on the + /// sum side). + /// + /// The honest leaf is an empty NormalTree (root_key=None). Every + /// empty Merk-backed tree stores `inner_root = NULL_HASH`, so its + /// recorded value_hash equals `combine_hash(H(element_bytes), + /// NULL_HASH)`. The merk-level count verifier accepts empty proof + /// bytes as `(NULL_HASH, 0)`. Before the fix the verifier's loose + /// `is_any_tree()` check happily accepted NormalTree element bytes + /// and the chain hash matched by coincidence, letting an attacker + /// prove `count = 0` against a path that wasn't actually a + /// ProvableCountTree. The numeric answer (0) is correct for an + /// empty tree of any type, but the implicit claim "the leaf is a + /// ProvableCountTree" was a soundness gap. 
+ #[test] + fn empty_leaf_type_confusion_forgery_rejected() { + use std::collections::BTreeMap; + + use bincode::config; + use grovedb_version::version::v2::GROVE_V2; + + use crate::operations::proof::{ + GroveDBProof, GroveDBProofV0, MerkOnlyLayerProof, ProveOptions, + }; + + // Use V0 (GROVE_V2) envelope — its MerkOnlyLayerProof is simpler + // to surgically reconstruct than V1's LayerProof/ProofBytes. + let v: &GroveVersion = &GROVE_V2; + let db = make_test_grovedb(v); + db.insert( + [TEST_LEAF].as_ref(), + b"evil", + Element::empty_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert empty normal tree at evil"); + + // Honest probe to harvest the layer-0 merk proof bytes that prove + // `evil` exists in the TEST_LEAF merk with its NormalTree element + // bytes. + let probe = PathQuery::new_single_key(vec![TEST_LEAF.to_vec()], b"evil".to_vec()); + let probe_proof_bytes = db + .grove_db + .prove_query(&probe, None, v) + .unwrap() + .expect("honest probe should succeed"); + + let cfg = config::standard() + .with_big_endian() + .with_limit::<{ 256 * 1024 * 1024 }>(); + let probe_decoded: GroveDBProof = bincode::decode_from_slice(&probe_proof_bytes, cfg) + .unwrap() + .0; + + let (root_mp, test_leaf_mp) = match probe_decoded { + GroveDBProof::V0(GroveDBProofV0 { root_layer, .. 
}) => ( + root_layer.merk_proof, + root_layer + .lower_layers + .get(TEST_LEAF) + .expect("descent") + .merk_proof + .clone(), + ), + GroveDBProof::V1(_) => panic!("expected V0 envelope under GROVE_V2"), + }; + + let leaf = MerkOnlyLayerProof { + merk_proof: Vec::new(), + lower_layers: BTreeMap::new(), + }; + let mut test_leaf_map = BTreeMap::new(); + test_leaf_map.insert(b"evil".to_vec(), leaf); + let test_leaf_layer = MerkOnlyLayerProof { + merk_proof: test_leaf_mp, + lower_layers: test_leaf_map, + }; + let mut root_lower = BTreeMap::new(); + root_lower.insert(TEST_LEAF.to_vec(), test_leaf_layer); + + let forged = GroveDBProof::V0(GroveDBProofV0 { + root_layer: MerkOnlyLayerProof { + merk_proof: root_mp, + lower_layers: root_lower, + }, + prove_options: ProveOptions::default(), + }); + let forged_bytes = bincode::encode_to_vec(&forged, cfg).expect("encode"); + + let attack_pq = PathQuery::new_aggregate_count_on_range( + vec![TEST_LEAF.to_vec(), b"evil".to_vec()], + QueryItem::RangeFrom(b"a".to_vec()..), + ); + + let result = GroveDb::verify_aggregate_count_query(&forged_bytes, &attack_pq, v); + match result { + Err(e) => { + let msg = format!("{e}"); + // The forgery is rejected either by: + // (a) the V0-envelope-not-allowed gate added in PR #663 + // (fires first under GROVE_V2), or + // (b) the terminal-type gate added in this PR (fires + // under V1 envelopes if we reach it). + // Either rejection means the forgery doesn't pass — the + // security property holds. Accept both error shapes here. + assert!( + msg.contains("must be a ProvableCountTree") + || msg.contains("ProvableCountSumTree") + || msg.contains("require V1 proof envelopes"), + "verifier rejected as expected but with an unrelated message: {msg}" + ); + } + Ok((root_hash, count)) => panic!( + "BUG: empty-leaf forgery accepted by aggregate-count verifier! 
\ + Returned (root_hash={}, count={}) — the leaf is a NormalTree, \ + not a ProvableCountTree.", + hex::encode(root_hash), + count + ), + } + } + // ------------------------------------------------------------------- // Tests for the no-proof variant: GroveDb::query_aggregate_count. // @@ -1487,12 +1649,14 @@ mod tests { .query_aggregate_count(&path_query, None, v) .unwrap() .expect_err("NormalTree must be rejected by the merk-level entry"); - // The merk-level error gets wrapped in Error::MerkError; we just - // require *some* error rather than asserting on the exact variant - // since the merk layer's InvalidProofError formatting is internal. + // The merk-level error gets wrapped with contextual `CorruptedData` + // (callsite-specific path info — see `query_aggregate_count` in + // `operations/get/query.rs`). We just require *some* error rather + // than asserting on the exact variant since the merk layer's + // `InvalidProofError` formatting is internal. match err { - crate::Error::MerkError(_) => {} - other => panic!("expected MerkError, got {:?}", other), + crate::Error::CorruptedData(_) => {} + other => panic!("expected CorruptedData wrapper, got {:?}", other), } } diff --git a/grovedb/src/tests/aggregate_sum_query_tests.rs b/grovedb/src/tests/aggregate_sum_query_tests.rs new file mode 100644 index 000000000..6811752ff --- /dev/null +++ b/grovedb/src/tests/aggregate_sum_query_tests.rs @@ -0,0 +1,2456 @@ +//! End-to-end GroveDB tests for `AggregateSumOnRange` queries. +//! +//! These exercise the full prove → encode → decode → verify pipeline against +//! `ProvableSumTree` at various path depths and across the full set of +//! allowed range variants. Mirrors `aggregate_count_query_tests.rs` for the +//! signed-sum flavor, with extra cases covering negative sums, mixed signs +//! at i64 extremes, and the i128-accumulator overflow gate. 
+
+#[cfg(test)]
+mod tests {
+    use grovedb_merk::proofs::query::QueryItem;
+    use grovedb_version::version::{v2::GROVE_V2, GroveVersion};
+
+    use crate::{
+        tests::{make_test_grovedb, TEST_LEAF},
+        Element, GroveDb, PathQuery,
+    };
+
+    /// Insert keys "a".."o" (15 keys) into a `ProvableSumTree` rooted at
+    /// `[TEST_LEAF, "st"]`, with sums 1..=15. Returns the db and root hash.
+    fn setup_15_key_provable_sum_tree(
+        grove_version: &GroveVersion,
+    ) -> (crate::tests::TempGroveDb, [u8; 32]) {
+        let db = make_test_grovedb(grove_version);
+        db.insert(
+            [TEST_LEAF].as_ref(),
+            b"st",
+            Element::empty_provable_sum_tree(),
+            None,
+            None,
+            grove_version,
+        )
+        .unwrap()
+        .expect("insert st");
+        for (i, c) in (b'a'..=b'o').enumerate() {
+            let value = (i as i64) + 1;
+            db.insert(
+                [TEST_LEAF, b"st"].as_ref(),
+                &[c],
+                Element::new_sum_item(value),
+                None,
+                None,
+                grove_version,
+            )
+            .unwrap()
+            .expect("insert sum item");
+        }
+        let root = db
+            .grove_db
+            .root_hash(None, grove_version)
+            .unwrap()
+            .expect("root_hash");
+        (db, root)
+    }
+
+    /// Round-trip: build a PathQuery, prove it, verify it, assert
+    /// `(root, sum)` matches.
+    fn round_trip(
+        db: &crate::tests::TempGroveDb,
+        expected_root: [u8; 32],
+        path: Vec<Vec<u8>>,
+        inner_range: QueryItem,
+        expected_sum: i64,
+        grove_version: &GroveVersion,
+    ) {
+        let path_query = PathQuery::new_aggregate_sum_on_range(path, inner_range);
+        let proof = db
+            .grove_db
+            .prove_query(&path_query, None, grove_version)
+            .unwrap()
+            .expect("prove_query should succeed");
+        let (root, sum) = GroveDb::verify_aggregate_sum_query(&proof, &path_query, grove_version)
+            .expect("verify should succeed");
+        assert_eq!(root, expected_root, "verifier reconstructed wrong root");
+        assert_eq!(sum, expected_sum, "verifier returned wrong sum");
+    }
+
+    // ---------- 1.
Round-trip: single-key sum tree ---------- + /// A `ProvableSumTree` with just one key: the proof should still + /// reconstruct correctly, and the sum should be the single value. + /// (Empty-tree round-trip at the GroveDB-envelope level is covered by + /// the merk-side `integration_empty_merk_sum` test — at GroveDB level + /// an empty subtree produces no `lower_layers` entry, which is a + /// separate routing concern from the proof shape we're testing here.) + #[test] + fn single_key_provable_sum_tree_round_trip() { + let v = GroveVersion::latest(); + let db = make_test_grovedb(v); + db.insert( + [TEST_LEAF].as_ref(), + b"st", + Element::empty_provable_sum_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert st"); + db.insert( + [TEST_LEAF, b"st"].as_ref(), + b"k", + Element::new_sum_item(42), + None, + None, + v, + ) + .unwrap() + .expect("insert single sum item"); + let root = db.grove_db.root_hash(None, v).unwrap().expect("root_hash"); + round_trip( + &db, + root, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::Range(b"a".to_vec()..b"z".to_vec()), + 42, + v, + ); + } + + // ---------- 2. Round-trip full range: sum 1+2+...+15 = 120 ---------- + #[test] + fn provable_sum_tree_full_range_from() { + let v = GroveVersion::latest(); + let (db, root) = setup_15_key_provable_sum_tree(v); + round_trip( + &db, + root, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeFrom(b"a".to_vec()..), + 120, + v, + ); + } + + // ---------- 3. Subrange: c..=l (values 3..=12) → 75 ---------- + #[test] + fn provable_sum_tree_range_inclusive() { + let v = GroveVersion::latest(); + let (db, root) = setup_15_key_provable_sum_tree(v); + round_trip( + &db, + root, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + 75, + v, + ); + } + + // ---------- 3b. 
RangeAfter ---------- + #[test] + fn provable_sum_tree_range_after() { + let v = GroveVersion::latest(); + let (db, root) = setup_15_key_provable_sum_tree(v); + // RangeAfter("b") matches c..o → 3+4+...+15 = 117. + round_trip( + &db, + root, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeAfter(b"b".to_vec()..), + 117, + v, + ); + } + + // ---------- 3c. RangeToInclusive ---------- + #[test] + fn provable_sum_tree_range_to_inclusive() { + let v = GroveVersion::latest(); + let (db, root) = setup_15_key_provable_sum_tree(v); + // ..=e → 1+2+3+4+5 = 15. + round_trip( + &db, + root, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeToInclusive(..=b"e".to_vec()), + 15, + v, + ); + } + + // ---------- 4. Boundary: range [b"c"..=b"c"] → 3 ---------- + #[test] + fn provable_sum_tree_single_key_range() { + let v = GroveVersion::latest(); + let (db, root) = setup_15_key_provable_sum_tree(v); + round_trip( + &db, + root, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"c".to_vec()), + 3, + v, + ); + } + + // ---------- 5. Negative sums: mixed +/- children → net -70 ---------- + #[test] + fn provable_sum_tree_negative_sums_mixed() { + let v = GroveVersion::latest(); + let db = make_test_grovedb(v); + db.insert( + [TEST_LEAF].as_ref(), + b"st", + Element::empty_provable_sum_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert st"); + // Sums: +50, -100, +30, -50 → net -70. + let entries: [(u8, i64); 4] = [(b'a', 50), (b'b', -100), (b'c', 30), (b'd', -50)]; + for (k, val) in entries { + db.insert( + [TEST_LEAF, b"st"].as_ref(), + &[k], + Element::new_sum_item(val), + None, + None, + v, + ) + .unwrap() + .expect("insert sum item"); + } + let root = db.grove_db.root_hash(None, v).unwrap().expect("root_hash"); + round_trip( + &db, + root, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"a".to_vec()..=b"d".to_vec()), + -70, + v, + ); + } + + // ---------- 5b. 
Negative-only: subrange contains only negatives ----------
+    #[test]
+    fn provable_sum_tree_all_negative_subrange() {
+        let v = GroveVersion::latest();
+        let db = make_test_grovedb(v);
+        db.insert(
+            [TEST_LEAF].as_ref(),
+            b"st",
+            Element::empty_provable_sum_tree(),
+            None,
+            None,
+            v,
+        )
+        .unwrap()
+        .expect("insert st");
+        let entries: [(u8, i64); 4] = [(b'a', 5), (b'b', -3), (b'c', -7), (b'd', 8)];
+        for (k, val) in entries {
+            db.insert(
+                [TEST_LEAF, b"st"].as_ref(),
+                &[k],
+                Element::new_sum_item(val),
+                None,
+                None,
+                v,
+            )
+            .unwrap()
+            .expect("insert sum item");
+        }
+        let root = db.grove_db.root_hash(None, v).unwrap().expect("root_hash");
+        // Range b..=c → -3 + -7 = -10.
+        round_trip(
+            &db,
+            root,
+            vec![TEST_LEAF.to_vec(), b"st".to_vec()],
+            QueryItem::RangeInclusive(b"b".to_vec()..=b"c".to_vec()),
+            -10,
+            v,
+        );
+    }
+
+    // ---------- 5c. Plus and minus cancel to zero (NOT a short-circuit case) ----------
+    /// Sum can legitimately be zero with non-zero children. The verifier
+    /// must produce 0 by genuine arithmetic, not by any "if sum == 0 →
+    /// skip" shortcut (a shortcut the count verifier can take safely —
+    /// count 0 does imply an empty match — but the sum verifier cannot).
+    #[test]
+    fn provable_sum_tree_sum_zero_from_offsetting_children() {
+        let v = GroveVersion::latest();
+        let db = make_test_grovedb(v);
+        db.insert(
+            [TEST_LEAF].as_ref(),
+            b"st",
+            Element::empty_provable_sum_tree(),
+            None,
+            None,
+            v,
+        )
+        .unwrap()
+        .expect("insert st");
+        for (k, val) in [(b'a', 5i64), (b'b', -5i64)] {
+            db.insert(
+                [TEST_LEAF, b"st"].as_ref(),
+                &[k],
+                Element::new_sum_item(val),
+                None,
+                None,
+                v,
+            )
+            .unwrap()
+            .expect("insert sum item");
+        }
+        let root = db.grove_db.root_hash(None, v).unwrap().expect("root_hash");
+        round_trip(
+            &db,
+            root,
+            vec![TEST_LEAF.to_vec(), b"st".to_vec()],
+            QueryItem::RangeInclusive(b"a".to_vec()..=b"b".to_vec()),
+            0,
+            v,
+        );
+    }
+
+    // ---------- 6.
i64::MAX + i64::MAX → verify returns overflow error ---------- + /// Two i64::MAX children sum to 2*i64::MAX which doesn't fit in i64. + /// The verifier's final i64-narrowing check must reject. Whether the + /// underlying tree allows insertion depends on the apply path's + /// intermediate-overflow handling — if it doesn't, we exit early; the + /// merk-side test in + /// `merk::aggregate_sum::integration_overflow_at_i64_max_is_rejected` + /// additionally exercises this via a directly-fabricated proof. + #[test] + fn provable_sum_tree_overflow_at_i64_max_is_rejected() { + let v = GroveVersion::latest(); + let db = make_test_grovedb(v); + db.insert( + [TEST_LEAF].as_ref(), + b"st", + Element::empty_provable_sum_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert st"); + // First i64::MAX inserts cleanly. The second insert may or may not + // succeed depending on aggregation overflow rules — accept either, + // we only require that an *eventual* proof+verify can't silently + // produce a wrong i64. + let ok1 = db + .insert( + [TEST_LEAF, b"st"].as_ref(), + b"a", + Element::new_sum_item(i64::MAX), + None, + None, + v, + ) + .unwrap() + .is_ok(); + let ok2 = db + .insert( + [TEST_LEAF, b"st"].as_ref(), + b"b", + Element::new_sum_item(i64::MAX), + None, + None, + v, + ) + .unwrap() + .is_ok(); + let either_insert_rejected = !ok1 || !ok2; + + // If both inserts succeeded, the overflow must be caught later — + // either by the prover or by the verifier. If both inserts AND + // the prover succeed AND the verifier accepts, that's the + // silent-no-op regression we explicitly want to fail on. 
+ let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"a".to_vec()..=b"b".to_vec()), + ); + let (prover_rejected, verifier_rejected) = if either_insert_rejected { + // The insert side already detected the overflow; no need to + // exercise prove/verify (they'd never reach the i128->i64 + // gate without inputs that overflow). + (false, false) + } else { + match db.grove_db.prove_query(&pq, None, v).unwrap() { + Err(_) => (true, false), + Ok(proof) => { + let verify_result = GroveDb::verify_aggregate_sum_query(&proof, &pq, v); + (false, verify_result.is_err()) + } + } + }; + + // Exactly the silent-no-op branch must NEVER be reached: at least + // one of {insert, prove, verify} must reject the i64::MAX + + // i64::MAX overflow. + assert!( + either_insert_rejected || prover_rejected || verifier_rejected, + "BUG: i64::MAX + i64::MAX silently produced a wrong sum — insert, \ + prove, and verify all accepted the overflow" + ); + } + + // ---------- 7. i64::MAX + i64::MIN = -1 (intermediate overflows i64 but final fits) ---------- + #[test] + fn provable_sum_tree_mixed_extremes_sum_to_negative_one() { + let v = GroveVersion::latest(); + let db = make_test_grovedb(v); + db.insert( + [TEST_LEAF].as_ref(), + b"st", + Element::empty_provable_sum_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert st"); + // i64::MAX + i64::MIN. In i128 the intermediate computes to -1 + // cleanly; in i64 it would overflow if computed naively. The + // verifier uses i128 throughout, so it must reach -1. 
+ let ok1 = db + .insert( + [TEST_LEAF, b"st"].as_ref(), + b"a", + Element::new_sum_item(i64::MAX), + None, + None, + v, + ) + .unwrap() + .is_ok(); + let ok2 = db + .insert( + [TEST_LEAF, b"st"].as_ref(), + b"b", + Element::new_sum_item(i64::MIN), + None, + None, + v, + ) + .unwrap() + .is_ok(); + if !ok1 || !ok2 { + return; // tree-level overflow detection; not our scenario today + } + let root = db.grove_db.root_hash(None, v).unwrap().expect("root_hash"); + // Cumulative aggregate at the tree level should already be -1 if + // both inserts succeeded. The range covering both should report + // -1, not panic or wrap. + round_trip( + &db, + root, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"a".to_vec()..=b"b".to_vec()), + -1, + v, + ); + } + + // ---------- 8. Tampering: mutate HashWithSum's sum field ---------- + #[test] + fn tampered_hash_with_sum_byte_is_rejected() { + let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let mut proof = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove"); + // Flip a byte deep enough that it lands inside the leaf merk proof + // (past the envelope's metadata). + let target = proof.len() / 2; + proof[target] = proof[target].wrapping_add(1); + let result = GroveDb::verify_aggregate_sum_query(&proof, &pq, v); + assert!( + result.is_err(), + "tampered proof byte must be rejected, got {:?}", + result.map(|(_, s)| s) + ); + } + + // ---------- 9. Tampering: mutate KVSum's sum field (via byte flip) ---------- + /// The proof envelope contains both `HashWithSum` and `KVDigestSum` + /// nodes; flipping any byte that lands inside their sum encoding + /// must be caught by the chain check. We try several positions to + /// raise the probability of hitting a sum byte. 
+ #[test] + fn multiple_byte_flips_in_leaf_are_all_rejected() { + let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let honest = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove"); + + // Try a handful of bytes in the back half of the proof (the leaf + // merk bytes). For each, the verifier must either error or return + // a different root hash than honest. We deliberately do not pin + // exact byte indices so this test stays robust against encoding + // tweaks. + let honest_decoded = + GroveDb::verify_aggregate_sum_query(&honest, &pq, v).expect("honest verify"); + let mut at_least_one_caught = false; + for offset_frac in [3, 5, 7, 9] { + let target = honest.len() * offset_frac / 10; + if target >= honest.len() { + continue; + } + let mut bytes = honest.clone(); + bytes[target] = bytes[target].wrapping_add(0x5a); + match GroveDb::verify_aggregate_sum_query(&bytes, &pq, v) { + Err(_) => at_least_one_caught = true, + Ok((root, _sum)) if root != honest_decoded.0 => at_least_one_caught = true, + Ok(_) => { + // Same (root, sum) is acceptable — the byte didn't + // change the semantic outcome (e.g. a length-prefix + // padding bit). Keep trying. + } + } + } + assert!( + at_least_one_caught, + "at least one of several leaf byte flips should have been caught" + ); + } + + // ---------- 10. Wrong path: returns root ≠ trusted root ---------- + /// If a caller verifies against a proof for a *different* tree, the + /// returned root won't match their trusted root and the application + /// rejects on that comparison. The verifier itself doesn't take a + /// trusted root; it returns the reconstructed one for the caller to + /// compare. We assert the returned root differs from what an + /// unrelated tree would produce. 
+ #[test] + fn proof_for_different_tree_yields_different_root() { + let v = GroveVersion::latest(); + let (db1, root1) = setup_15_key_provable_sum_tree(v); + // Build a *different* db with the same path shape but different + // values, generate a proof against it, and confirm that proof + // verifies to root2 ≠ root1. + let db2 = make_test_grovedb(v); + db2.insert( + [TEST_LEAF].as_ref(), + b"st", + Element::empty_provable_sum_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert"); + db2.insert( + [TEST_LEAF, b"st"].as_ref(), + b"a", + Element::new_sum_item(999), + None, + None, + v, + ) + .unwrap() + .expect("insert"); + let root2 = db2.grove_db.root_hash(None, v).unwrap().expect("root2"); + assert_ne!(root1, root2); + + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeFrom(b"a".to_vec()..), + ); + let proof2 = db2 + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove2"); + let (got_root, _sum) = + GroveDb::verify_aggregate_sum_query(&proof2, &pq, v).expect("verify against db2 proof"); + assert_eq!(got_root, root2); + assert_ne!( + got_root, root1, + "caller's root check must catch wrong-tree proofs" + ); + } + + // ---------- 11. Wrong query shape: PathQuery with subquery is rejected ---------- + #[test] + fn aggregate_sum_with_subquery_is_rejected_at_validation() { + let v = GroveVersion::latest(); + let mut pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::Range(b"a".to_vec()..b"z".to_vec()), + ); + // Sneak in a subquery — the validator must reject on the + // verifier side. + pq.query + .query + .set_subquery(grovedb_merk::proofs::Query::new_range_full()); + let dummy_proof = vec![0u8; 16]; + assert!(GroveDb::verify_aggregate_sum_query(&dummy_proof, &pq, v).is_err()); + + // Defense-in-depth: the *prover* must also refuse a malformed + // ASOR path query. 
Without this assertion a regression in + // `prove_query_non_serialized` could silently produce a proof + // for a malformed shape while the verifier-side test still + // passed on the dummy bytes. We need an actual db to call + // prove_query; reuse the standard 15-key fixture. + let (db, _root) = setup_15_key_provable_sum_tree(v); + let prove_result = db.grove_db.prove_query(&pq, None, v).unwrap(); + assert!( + prove_result.is_err(), + "prover must refuse to run ASOR with a hidden subquery, got Ok" + ); + } + + // ---------- 12. Empty range (start > end is structurally invalid; use range above all keys → 0) ---------- + #[test] + fn range_above_all_keys_returns_zero_sum() { + let v = GroveVersion::latest(); + let (db, root) = setup_15_key_provable_sum_tree(v); + round_trip( + &db, + root, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"z".to_vec()..=vec![0xff]), + 0, + v, + ); + } + + // ---------- 12b. Range below all keys → 0 ---------- + #[test] + fn range_below_all_keys_returns_zero_sum() { + let v = GroveVersion::latest(); + let (db, root) = setup_15_key_provable_sum_tree(v); + round_trip( + &db, + root, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(vec![0x00]..=vec![0x10]), + 0, + v, + ); + } + + // ---------- 13. Multi-layer path (3 layers) ---------- + /// Outer NormalTree → inner ProvableSumTree. Exercises the chain + /// enforcement that count tests use, with sum semantics. 
+ fn setup_three_layer_provable_sum_tree( + grove_version: &GroveVersion, + ) -> (crate::tests::TempGroveDb, [u8; 32]) { + let db = make_test_grovedb(grove_version); + db.insert( + [TEST_LEAF].as_ref(), + b"outer", + Element::empty_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert outer"); + db.insert( + [TEST_LEAF, b"outer"].as_ref(), + b"inner", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert inner"); + // Five keys: a..=e with values 1..=5; sum = 15. + for (i, c) in (b'a'..=b'e').enumerate() { + db.insert( + [TEST_LEAF, b"outer", b"inner"].as_ref(), + &[c], + Element::new_sum_item((i as i64) + 1), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert sum item"); + } + let root = db + .grove_db + .root_hash(None, grove_version) + .unwrap() + .expect("root_hash"); + (db, root) + } + + #[test] + fn three_layer_path_round_trip_sum() { + let v = GroveVersion::latest(); + let (db, root) = setup_three_layer_provable_sum_tree(v); + // RangeInclusive("b"..="d") matches values 2+3+4 = 9. + round_trip( + &db, + root, + vec![TEST_LEAF.to_vec(), b"outer".to_vec(), b"inner".to_vec()], + QueryItem::RangeInclusive(b"b".to_vec()..=b"d".to_vec()), + 9, + v, + ); + } + + // ---------- 14. Illegal mix: AggregateSumOnRange + AggregateCountOnRange ---------- + /// Constructing a `PathQuery` that contains both aggregate variants is + /// possible at the Vec level, but validation must reject — the two + /// types are explicitly orthogonal. + #[test] + fn mixed_aggregate_sum_and_count_is_rejected() { + let v = GroveVersion::latest(); + let mut pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::Range(b"a".to_vec()..b"z".to_vec()), + ); + // Manually push an AggregateCountOnRange — the surrounding query + // now has two items, which validation rejects ("must be the only + // item"). 
+ pq.query + .query + .items + .push(QueryItem::AggregateCountOnRange(Box::new( + QueryItem::Range(b"a".to_vec()..b"z".to_vec()), + ))); + let dummy_proof = vec![0u8; 16]; + let err = GroveDb::verify_aggregate_sum_query(&dummy_proof, &pq, v) + .expect_err("mixed aggregates must be rejected"); + let msg = format!("{:?}", err); + assert!( + msg.contains("only item") || msg.contains("InvalidQuery"), + "expected validation rejection, got: {msg}" + ); + } + + // ---------- 15. Validation: nested AggregateSumOnRange ---------- + #[test] + fn validate_at_construction_rejects_nested_aggregate_sum_on_range() { + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::AggregateSumOnRange(Box::new(QueryItem::Range( + b"a".to_vec()..b"z".to_vec(), + ))), + ); + assert!(pq.validate_aggregate_sum_on_range().is_err()); + } + + // ---------- 16. Validation: AggregateSumOnRange wrapping AggregateCountOnRange ---------- + #[test] + fn validate_rejects_sum_wrapping_count() { + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::AggregateCountOnRange(Box::new(QueryItem::Range( + b"a".to_vec()..b"z".to_vec(), + ))), + ); + assert!(pq.validate_aggregate_sum_on_range().is_err()); + } + + // ---------- 17. Validation: Key inner is rejected ---------- + #[test] + fn validate_rejects_key_inner() { + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::Key(b"a".to_vec()), + ); + assert!(pq.validate_aggregate_sum_on_range().is_err()); + } + + // ---------- 18. Validation: RangeFull inner is rejected ---------- + #[test] + fn validate_rejects_range_full_inner() { + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeFull(std::ops::RangeFull), + ); + assert!(pq.validate_aggregate_sum_on_range().is_err()); + } + + // ---------- 19. 
Rejected on non-ProvableSumTree (NormalTree) ---------- + #[test] + fn proof_rejected_on_normal_tree_path() { + // The path points to a normal tree, not a ProvableSumTree. The + // prover must refuse — either at the merk-level tree-type gate + // (`prove_aggregate_sum_on_range` errors on non-ProvableSumTree) + // or, if the prover happens to produce some bytes, the verifier + // must reject during the leaf-level shape walk because the proof + // ops won't be the sum-flavor variants the verifier expects. + let v = GroveVersion::latest(); + let db = make_test_grovedb(v); + db.insert( + [TEST_LEAF].as_ref(), + b"normal", + Element::empty_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert normal"); + // Add a child so the subtree isn't empty — empty subtrees can + // short-circuit in places that bypass the tree-type check. + db.insert( + [TEST_LEAF, b"normal"].as_ref(), + b"a", + Element::new_item(b"v".to_vec()), + None, + None, + v, + ) + .unwrap() + .expect("insert child"); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"normal".to_vec()], + QueryItem::Range(b"a".to_vec()..b"z".to_vec()), + ); + match db.grove_db.prove_query(&pq, None, v).unwrap() { + Err(_) => { /* prover rejected — good */ } + Ok(proof) => { + // Prover didn't catch it (e.g. via an unrelated path); + // the verifier must catch it. + let r = GroveDb::verify_aggregate_sum_query(&proof, &pq, v); + assert!( + r.is_err(), + "verifier must reject sum proof against non-ProvableSumTree" + ); + } + } + } + + // ---------- 20. V0 (GROVE_V2) envelope rejected ---------- + /// Mirror of `aggregate_count_rejects_grove_v2_envelope`. GROVE_V2 + /// dispatches to the V0 `prove_query_non_serialized` path, which + /// produces a `MerkOnlyLayerProof` envelope. `AggregateSumOnRange` + /// postdates V0 (it was added alongside / after V1 in the grove + /// version used by Dash Platform v12+), so V0+ASOR is impossible in + /// any deployed Platform release. 
The prover rejects the combination + /// up front to keep callers from emitting a V0 aggregate-sum proof + /// that the verifier would (correctly) refuse. + #[test] + fn aggregate_sum_rejects_grove_v2_envelope() { + let v: &GroveVersion = &GROVE_V2; + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let prove_result = db.grove_db.prove_query(&pq, None, v).unwrap(); + match prove_result { + Err(crate::Error::NotSupported(msg)) => assert!( + msg.contains("V1 proof envelopes"), + "unexpected message: {msg}" + ), + other => panic!( + "expected NotSupported for V0+aggregate-sum, got {:?}", + other.map(|b| b.len()) + ), + } + } + + // ---------- 21. NotSummed-wrapped child tree contributes 0 ---------- + /// `Element::NotSummed` wraps a *sum-tree variant* and tells the parent + /// to skip the wrapped subtree's aggregate sum. Verify the proof + /// honors that exclusion: the wrapped subtree's sum doesn't + /// contribute to the parent ProvableSumTree's `KVDigestSum.aggregate`, + /// so the aggregate query at the parent level sees only the un-wrapped + /// sum items. + #[test] + fn not_summed_child_tree_excluded_from_aggregate_sum() { + let v = GroveVersion::latest(); + let db = make_test_grovedb(v); + db.insert( + [TEST_LEAF].as_ref(), + b"st", + Element::empty_provable_sum_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert st"); + // Two regular sum items (5 + 7 = 12). + db.insert( + [TEST_LEAF, b"st"].as_ref(), + b"a", + Element::new_sum_item(5), + None, + None, + v, + ) + .unwrap() + .expect("insert a"); + db.insert( + [TEST_LEAF, b"st"].as_ref(), + b"b", + Element::new_sum_item(7), + None, + None, + v, + ) + .unwrap() + .expect("insert b"); + // One NotSummed-wrapped ProvableSumTree at key "c". Its inner + // children's sum contributes nothing to the parent's aggregate. 
+ let ns_tree = + Element::new_not_summed(Element::empty_provable_sum_tree()).expect("wrap not_summed"); + db.insert([TEST_LEAF, b"st"].as_ref(), b"c", ns_tree, None, None, v) + .unwrap() + .expect("insert NotSummed tree"); + // Put a value inside the wrapped subtree to confirm it doesn't + // bleed into the parent's aggregate. + db.insert( + [TEST_LEAF, b"st", b"c"].as_ref(), + b"hidden", + Element::new_sum_item(100), + None, + None, + v, + ) + .unwrap() + .expect("insert hidden"); + let root = db.grove_db.root_hash(None, v).unwrap().expect("root_hash"); + round_trip( + &db, + root, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"a".to_vec()..=b"c".to_vec()), + 12, // NotSummed-wrapped subtree contributes 0 → 5+7 = 12 + v, + ); + } + + // ---------- 22. Empty-leaf type-confusion forgery (security regression) - + /// Codex security finding: when an honest tree at the queried leaf path + /// is an empty Merk-backed tree of any non-ProvableSumTree type + /// (NormalTree, SumTree, ProvableCountTree, …), every such tree stores + /// `inner_root = NULL_HASH`, so its recorded value_hash equals + /// `combine_hash(H(element_bytes), NULL_HASH)`. The merk-level sum + /// verifier accepts empty proof bytes as `(NULL_HASH, 0)`. The + /// pre-fix verifier's `is_any_tree()` check happily accepted those + /// non-ProvableSumTree element bytes — and the chain-hash check + /// passed trivially — letting an attacker prove `sum = 0` against a + /// path that wasn't actually a ProvableSumTree. The numeric answer + /// (0) was correct for an empty tree of any type, but the implicit + /// claim "the leaf is a ProvableSumTree" was a soundness gap. + /// + /// This test surgically constructs the forged proof from a real + /// honest single-key envelope and confirms the new + /// terminal-type gate rejects it. 
+ #[test] + fn empty_leaf_type_confusion_forgery_rejected() { + use std::collections::BTreeMap; + + use bincode::config; + + use crate::operations::proof::{ + GroveDBProof, GroveDBProofV0, MerkOnlyLayerProof, ProveOptions, + }; + + // Use V0 (GROVE_V2) envelope — its MerkOnlyLayerProof is simpler to + // surgically reconstruct than V1's LayerProof/ProofBytes. + let v: &GroveVersion = &GROVE_V2; + + // Build the malicious tree state: an empty NormalTree at the path + // we'll later claim is a ProvableSumTree. We exercise the bypass + // on the empty case specifically — that's the only case where the + // pre-fix chain check passes. + let db = make_test_grovedb(v); + db.insert( + [TEST_LEAF].as_ref(), + b"evil", + Element::empty_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert empty normal tree at evil"); + + // Run an honest "does evil exist?" single-key probe via prove_query + // to harvest the layer-0 merk proof bytes (proves `evil` exists in + // TEST_LEAF with its NormalTree element bytes). The result has the + // shape we need for the layer-0 portion of the forgery. + let probe = PathQuery::new_single_key(vec![TEST_LEAF.to_vec()], b"evil".to_vec()); + let probe_proof_bytes = db + .grove_db + .prove_query(&probe, None, v) + .unwrap() + .expect("honest probe should succeed"); + + let cfg = config::standard() + .with_big_endian() + .with_limit::<{ 256 * 1024 * 1024 }>(); + let probe_decoded: GroveDBProof = bincode::decode_from_slice(&probe_proof_bytes, cfg) + .unwrap() + .0; + + // Forge a V0 envelope: + // root_layer.merk_proof = honest proof of TEST_LEAF in root + // root_layer.lower_layers[TEST_LEAF].merk_proof = honest proof of + // "evil" in TEST_LEAF + // root_layer.lower_layers[TEST_LEAF].lower_layers["evil"].merk_proof = [] + // <-- forged empty leaf + let (root_merk_proof_bytes, test_leaf_merk_proof_bytes) = match probe_decoded { + GroveDBProof::V0(GroveDBProofV0 { root_layer, .. 
}) => { + let test_leaf = root_layer + .lower_layers + .get(TEST_LEAF) + .expect("probe must descend into TEST_LEAF") + .merk_proof + .clone(); + (root_layer.merk_proof, test_leaf) + } + GroveDBProof::V1(_) => panic!("expected V0 envelope under GROVE_V2"), + }; + + let leaf_layer = MerkOnlyLayerProof { + merk_proof: Vec::new(), // the forged empty leaf + lower_layers: BTreeMap::new(), + }; + let mut test_leaf_map = BTreeMap::new(); + test_leaf_map.insert(b"evil".to_vec(), leaf_layer); + + let test_leaf_layer = MerkOnlyLayerProof { + merk_proof: test_leaf_merk_proof_bytes, + lower_layers: test_leaf_map, + }; + let mut root_lower = BTreeMap::new(); + root_lower.insert(TEST_LEAF.to_vec(), test_leaf_layer); + + let forged_envelope = GroveDBProof::V0(GroveDBProofV0 { + root_layer: MerkOnlyLayerProof { + merk_proof: root_merk_proof_bytes, + lower_layers: root_lower, + }, + prove_options: ProveOptions::default(), + }); + let forged_bytes = + bincode::encode_to_vec(&forged_envelope, cfg).expect("encode forged envelope"); + + // The attacker submits the forged proof against an aggregate-sum + // path that targets the empty NormalTree as if it were a + // ProvableSumTree. + let attack_pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"evil".to_vec()], + QueryItem::RangeFrom(b"a".to_vec()..), + ); + + let result = GroveDb::verify_aggregate_sum_query(&forged_bytes, &attack_pq, v); + match result { + Err(e) => { + // The forgery is rejected either by: + // (a) the V0-envelope-not-allowed gate (fires first + // because the forged proof uses the V0 + // MerkOnlyLayerProof envelope shape), or + // (b) the terminal-type gate in `enforce_lower_chain` + // (would fire under a V1 envelope if we + // reconstructed the forgery there instead). + // Either rejection means the type-confusion forgery + // doesn't pass — the security property holds. 
+ let msg = format!("{e}"); + assert!( + msg.contains("must be a ProvableSumTree") + || msg.contains("require V1 proof envelopes"), + "verifier rejected as expected but with an unrelated message: {msg}" + ); + } + Ok((root_hash, sum)) => panic!( + "BUG: empty-leaf forgery accepted by verifier! \ + Returned (root_hash={}, sum={}) — the leaf is a NormalTree, \ + not a ProvableSumTree.", + hex::encode(root_hash), + sum + ), + } + } + + /// Security regression: empty-path aggregate-sum queries are + /// rejected at validation time, before any proof handling. + /// + /// `verify_aggregate_sum_query` calls + /// `path_query.validate_aggregate_sum_on_range()` at its entry. If + /// the path is empty, validation must fail — otherwise the V1 leaf- + /// chain walker would hit the `depth == path_keys.len()` + /// short-circuit at depth 0 and go straight to the merk-level leaf + /// verifier, never invoking the terminal-type gate in + /// `enforce_lower_chain`. The GroveDB root merk is always a + /// `NormalTree` by API construction, so a root aggregate-sum query + /// has no valid target. + #[test] + fn empty_path_aggregate_sum_rejected_at_validation() { + let v = GroveVersion::latest(); + let pq = PathQuery::new_aggregate_sum_on_range( + Vec::new(), // empty path → must be rejected + QueryItem::RangeFrom(b"a".to_vec()..), + ); + let err = pq + .validate_aggregate_sum_on_range() + .expect_err("empty path must be rejected at validation"); + let msg = format!("{err}"); + assert!( + msg.contains("root") && msg.contains("ProvableSumTree"), + "expected message naming root + ProvableSumTree, got: {msg}" + ); + + // Also confirm the verifier surface rejects with the same error + // (the validator is called first inside verify_aggregate_sum_query). + // We don't need a real proof — any bytes go in; validation runs + // before proof decode. 
+ let result = GroveDb::verify_aggregate_sum_query(&[0u8; 4], &pq, v); + assert!( + result.is_err(), + "verify_aggregate_sum_query must reject empty-path queries" + ); + } + + /// V1-envelope twin of `empty_leaf_type_confusion_forgery_rejected`. + /// + /// The V0 twin currently gets caught by the V0-envelope-not-allowed + /// gate (which fires first under `GROVE_V2`) before the + /// terminal-type gate in `enforce_lower_chain` runs. That gate is + /// the actual fix this PR landed for empty-leaf type confusion, so + /// we forge the same shape under a V1 envelope here to pin the + /// terminal-type check directly. The error message must name + /// `ProvableSumTree` explicitly — only the terminal-type gate + /// produces that message. + #[test] + fn empty_leaf_type_confusion_forgery_rejected_under_v1_envelope() { + use std::collections::BTreeMap; + + use bincode::config; + + use crate::operations::proof::{GroveDBProof, GroveDBProofV1, LayerProof, ProofBytes}; + + let v = GroveVersion::latest(); + let db = make_test_grovedb(v); + db.insert( + [TEST_LEAF].as_ref(), + b"evil", + Element::empty_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert empty normal tree at evil"); + + // Honest probe to harvest the layer-0 and layer-1 merk proof + // bytes for the forgery. We use prove_query on a single-key + // PathQuery so the prover follows the same descent shape we want + // to forge. 
+ let probe = PathQuery::new_single_key(vec![TEST_LEAF.to_vec()], b"evil".to_vec()); + let probe_proof_bytes = db + .grove_db + .prove_query(&probe, None, v) + .unwrap() + .expect("honest probe should succeed"); + + let cfg = config::standard() + .with_big_endian() + .with_limit::<{ 256 * 1024 * 1024 }>(); + let probe_decoded: GroveDBProof = bincode::decode_from_slice(&probe_proof_bytes, cfg) + .unwrap() + .0; + + let (root_proof_bytes, test_leaf_proof_bytes) = match probe_decoded { + GroveDBProof::V1(GroveDBProofV1 { root_layer }) => { + let test_leaf_bytes = match &root_layer + .lower_layers + .get(TEST_LEAF) + .expect("descent into TEST_LEAF") + .merk_proof + { + ProofBytes::Merk(b) => b.clone(), + other => panic!( + "expected Merk bytes, got {:?}", + std::mem::discriminant(other) + ), + }; + let root_bytes = match root_layer.merk_proof { + ProofBytes::Merk(b) => b, + ref other => panic!( + "expected Merk bytes, got {:?}", + std::mem::discriminant(other) + ), + }; + (root_bytes, test_leaf_bytes) + } + GroveDBProof::V0(_) => panic!("expected V1 envelope under latest grove version"), + }; + + // Forge a V1 envelope: + // root_layer.merk_proof = honest proof of TEST_LEAF in root + // root_layer.lower_layers[TEST_LEAF].merk_proof = + // honest proof of "evil" in TEST_LEAF + // root_layer.lower_layers[TEST_LEAF].lower_layers["evil"].merk_proof = + // [] (forged empty leaf — accepted as (NULL_HASH, 0) by + // the merk-level sum verifier). 
+ let evil_leaf = LayerProof { + merk_proof: ProofBytes::Merk(Vec::new()), + lower_layers: BTreeMap::new(), + }; + let mut test_leaf_map = BTreeMap::new(); + test_leaf_map.insert(b"evil".to_vec(), evil_leaf); + + let test_leaf_layer = LayerProof { + merk_proof: ProofBytes::Merk(test_leaf_proof_bytes), + lower_layers: test_leaf_map, + }; + let mut root_lower = BTreeMap::new(); + root_lower.insert(TEST_LEAF.to_vec(), test_leaf_layer); + + let forged = GroveDBProof::V1(GroveDBProofV1 { + root_layer: LayerProof { + merk_proof: ProofBytes::Merk(root_proof_bytes), + lower_layers: root_lower, + }, + }); + let forged_bytes = bincode::encode_to_vec(&forged, cfg).expect("encode forged envelope"); + + let attack_pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"evil".to_vec()], + QueryItem::RangeFrom(b"a".to_vec()..), + ); + + let result = GroveDb::verify_aggregate_sum_query(&forged_bytes, &attack_pq, v); + match result { + Err(e) => { + // V1 envelope means the V0 gate does NOT fire — the + // terminal-type gate in `enforce_lower_chain` is the + // only thing standing between the forgery and a + // silently-accepted `sum = 0`. Pin its specific error + // message so future refactors that drop the gate are + // caught. + let msg = format!("{e}"); + assert!( + msg.contains("must be a ProvableSumTree"), + "expected terminal-type gate to fire; got: {msg}" + ); + } + Ok((root_hash, sum)) => panic!( + "BUG: empty-leaf forgery accepted by V1 verifier! \ + Returned (root_hash={}, sum={}) — the leaf is a NormalTree, \ + not a ProvableSumTree.", + hex::encode(root_hash), + sum + ), + } + } + + /// Same forgery shape, but the honest leaf is an empty + /// `ProvableCountTree` (the wrong PROVABLE tree type for a sum + /// query). Confirms the terminal-type gate enforces the precise + /// tree-type, not just "any provable aggregate tree". 
+ #[test] + fn empty_provable_count_tree_at_leaf_rejected_for_sum() { + use std::collections::BTreeMap; + + use bincode::config; + + use crate::operations::proof::{ + GroveDBProof, GroveDBProofV0, MerkOnlyLayerProof, ProveOptions, + }; + + let v: &GroveVersion = &GROVE_V2; + let db = make_test_grovedb(v); + db.insert( + [TEST_LEAF].as_ref(), + b"pct", + Element::empty_provable_count_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert empty provable count tree"); + + let probe = PathQuery::new_single_key(vec![TEST_LEAF.to_vec()], b"pct".to_vec()); + let probe_proof_bytes = db + .grove_db + .prove_query(&probe, None, v) + .unwrap() + .expect("honest probe"); + let cfg = config::standard() + .with_big_endian() + .with_limit::<{ 256 * 1024 * 1024 }>(); + let probe_decoded: GroveDBProof = bincode::decode_from_slice(&probe_proof_bytes, cfg) + .unwrap() + .0; + + let (root_mp, test_leaf_mp) = match probe_decoded { + GroveDBProof::V0(GroveDBProofV0 { root_layer, .. }) => ( + root_layer.merk_proof, + root_layer + .lower_layers + .get(TEST_LEAF) + .expect("descent") + .merk_proof + .clone(), + ), + GroveDBProof::V1(_) => panic!("expected V0"), + }; + + let mut leaf = BTreeMap::new(); + leaf.insert( + b"pct".to_vec(), + MerkOnlyLayerProof { + merk_proof: Vec::new(), + lower_layers: BTreeMap::new(), + }, + ); + let mut root_lower = BTreeMap::new(); + root_lower.insert( + TEST_LEAF.to_vec(), + MerkOnlyLayerProof { + merk_proof: test_leaf_mp, + lower_layers: leaf, + }, + ); + let forged = GroveDBProof::V0(GroveDBProofV0 { + root_layer: MerkOnlyLayerProof { + merk_proof: root_mp, + lower_layers: root_lower, + }, + prove_options: ProveOptions::default(), + }); + let forged_bytes = bincode::encode_to_vec(&forged, cfg).expect("encode"); + + let attack_pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"pct".to_vec()], + QueryItem::RangeFrom(b"a".to_vec()..), + ); + + let result = GroveDb::verify_aggregate_sum_query(&forged_bytes, &attack_pq, v); + 
assert!( + result.is_err(), + "ProvableCountTree at leaf must NOT be accepted for an aggregate-sum query" + ); + let msg = format!("{}", result.unwrap_err()); + // The forgery is rejected either by: + // (a) the V0-envelope-not-allowed gate (fires first under + // GROVE_V2 because the forged proof uses the V0 + // MerkOnlyLayerProof envelope shape), or + // (b) the terminal-type gate in `enforce_lower_chain` (would + // fire under a V1 envelope if we constructed the forgery + // there instead). + // Either rejection means the type-confusion forgery doesn't + // pass — the security property holds. Accept both error shapes. + assert!( + msg.contains("must be a ProvableSumTree") || msg.contains("require V1 proof envelopes"), + "expected terminal-type or V0-envelope error, got: {msg}" + ); + } + + // ------------------------------------------------------------------- + // Tests for the no-proof variant: GroveDb::query_aggregate_sum. + // + // Mirrors PR #662's no-proof query_aggregate_count for the signed-sum + // side. The no-proof variant must return the same sum as the proof + // variant for every valid PathQuery shape but should not need to + // produce or verify any proof bytes. + // ------------------------------------------------------------------- + + /// No-proof helper: build the path-query, call query_aggregate_sum, + /// assert the returned sum matches the expected value AND matches + /// what the proof round-trip returns. 
+ fn no_proof_sum_matches_proof(
+ db: &crate::tests::TempGroveDb,
+ path: Vec<Vec<u8>>,
+ inner_range: QueryItem,
+ expected_sum: i64,
+ grove_version: &GroveVersion,
+ ) {
+ let path_query = PathQuery::new_aggregate_sum_on_range(path, inner_range);
+
+ let direct = db
+ .grove_db
+ .query_aggregate_sum(&path_query, None, grove_version)
+ .unwrap()
+ .expect("query_aggregate_sum should succeed");
+ assert_eq!(direct, expected_sum, "no-proof variant returned wrong sum");
+
+ let proof = db
+ .grove_db
+ .prove_query(&path_query, None, grove_version)
+ .unwrap()
+ .expect("prove_query should succeed");
+ let (_root, proved) =
+ GroveDb::verify_aggregate_sum_query(&proof, &path_query, grove_version)
+ .expect("verify should succeed");
+ assert_eq!(
+ direct, proved,
+ "no-proof variant disagrees with proof variant"
+ );
+ }
+
+ #[test]
+ fn no_proof_sum_provable_sum_tree_range_inclusive() {
+ let v = GroveVersion::latest();
+ let (db, _) = setup_15_key_provable_sum_tree(v);
+ no_proof_sum_matches_proof(
+ &db,
+ vec![TEST_LEAF.to_vec(), b"st".to_vec()],
+ QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()),
+ 75,
+ v,
+ );
+ }
+
+ #[test]
+ fn no_proof_sum_provable_sum_tree_range_exclusive() {
+ let v = GroveVersion::latest();
+ let (db, _) = setup_15_key_provable_sum_tree(v);
+ no_proof_sum_matches_proof(
+ &db,
+ vec![TEST_LEAF.to_vec(), b"st".to_vec()],
+ QueryItem::Range(b"c".to_vec()..b"l".to_vec()),
+ 63,
+ v,
+ );
+ }
+
+ #[test]
+ fn no_proof_sum_provable_sum_tree_range_from() {
+ let v = GroveVersion::latest();
+ let (db, _) = setup_15_key_provable_sum_tree(v);
+ no_proof_sum_matches_proof(
+ &db,
+ vec![TEST_LEAF.to_vec(), b"st".to_vec()],
+ QueryItem::RangeFrom(b"c".to_vec()..),
+ 117,
+ v,
+ );
+ }
+
+ #[test]
+ fn no_proof_sum_provable_sum_tree_range_after() {
+ let v = GroveVersion::latest();
+ let (db, _) = setup_15_key_provable_sum_tree(v);
+ no_proof_sum_matches_proof(
+ &db,
+ vec![TEST_LEAF.to_vec(), b"st".to_vec()],
+ 
QueryItem::RangeAfter(b"b".to_vec()..), + 117, + v, + ); + } + + #[test] + fn no_proof_sum_provable_sum_tree_range_to_inclusive() { + let v = GroveVersion::latest(); + let (db, _) = setup_15_key_provable_sum_tree(v); + no_proof_sum_matches_proof( + &db, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeToInclusive(..=b"e".to_vec()), + 15, + v, + ); + } + + #[test] + fn no_proof_sum_provable_sum_tree_disjoint_range() { + let v = GroveVersion::latest(); + let (db, _) = setup_15_key_provable_sum_tree(v); + no_proof_sum_matches_proof( + &db, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(vec![0x00]..=vec![0x10]), + 0, + v, + ); + } + + #[test] + fn no_proof_sum_empty_provable_sum_tree_returns_zero() { + // An empty ProvableSumTree returns sum 0 — same as the merk-level + // empty-merk contract. Inserting nothing under the tree exercises + // this path through the full GroveDB stack. + let v = GroveVersion::latest(); + let db = make_test_grovedb(v); + db.insert( + [TEST_LEAF].as_ref(), + b"st", + Element::empty_provable_sum_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert st"); + let path_query = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeFrom(b"a".to_vec()..), + ); + let direct = db + .grove_db + .query_aggregate_sum(&path_query, None, v) + .unwrap() + .expect("query_aggregate_sum should succeed on empty"); + assert_eq!(direct, 0); + } + + #[test] + fn no_proof_sum_negative_values_matches_proof() { + // Cross-check no-proof and proof on a tree with mixed positive + // and negative sum items. This exercises both the i128 + // accumulator and the signed own_sum subtraction in the no-proof + // walker. 
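// Standalone illustration of the widening idea named above (hypothetical
// values, not a GroveDB API): casting each i64 term to i128 before
// accumulating keeps a mixed-sign sum exact even where intermediate
// i64 arithmetic could overflow.
let widened: i128 = [50i64, -100, 30, -50].iter().map(|&x| x as i128).sum();
assert_eq!(widened, -70);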
+ let v = GroveVersion::latest(); + let db = make_test_grovedb(v); + db.insert( + [TEST_LEAF].as_ref(), + b"st", + Element::empty_provable_sum_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert st"); + let entries: [(u8, i64); 4] = [(b'a', 50), (b'b', -100), (b'c', 30), (b'd', -50)]; + for (k, val) in entries { + db.insert( + [TEST_LEAF, b"st"].as_ref(), + &[k], + Element::new_sum_item(val), + None, + None, + v, + ) + .unwrap() + .expect("insert sum item"); + } + no_proof_sum_matches_proof( + &db, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeFrom(b"a".to_vec()..), + -70, // 50 − 100 + 30 − 50 + v, + ); + no_proof_sum_matches_proof( + &db, + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"b".to_vec()..=b"c".to_vec()), + -70, // −100 + 30 = −70 + v, + ); + } + + #[test] + fn no_proof_sum_invalid_inner_range_rejected_before_storage_reads() { + // The validator runs at the top of query_aggregate_sum; an + // illegal inner range like `Key(_)` is rejected before any merk + // is opened. + let v = GroveVersion::latest(); + let db = make_test_grovedb(v); + let path_query = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::Key(b"a".to_vec()), + ); + let err = db + .grove_db + .query_aggregate_sum(&path_query, None, v) + .unwrap() + .expect_err("Key inner must be rejected at validation"); + match err { + crate::Error::InvalidQuery(_) => {} + other => panic!("expected InvalidQuery, got {:?}", other), + } + } + + #[test] + fn no_proof_sum_empty_path_rejected_at_validation() { + // Mirror of the verify-side empty-path rejection: the no-proof + // entry point must also reject empty-path queries up front, since + // the GroveDB root is always a NormalTree. 
+ let v = GroveVersion::latest(); + let db = make_test_grovedb(v); + let path_query = PathQuery::new_aggregate_sum_on_range( + Vec::new(), + QueryItem::RangeFrom(b"a".to_vec()..), + ); + let err = db + .grove_db + .query_aggregate_sum(&path_query, None, v) + .unwrap() + .expect_err("empty path must be rejected"); + match err { + crate::Error::InvalidQuery(_) => {} + other => panic!("expected InvalidQuery, got {:?}", other), + } + } + + #[test] + fn no_proof_sum_normal_tree_rejected_at_merk() { + // A path that resolves to a NormalTree (not a ProvableSumTree) + // must be rejected by the merk-level tree-type gate. + let v = GroveVersion::latest(); + let db = make_test_grovedb(v); + db.insert( + [TEST_LEAF].as_ref(), + b"normal", + Element::empty_tree(), + None, + None, + v, + ) + .unwrap() + .expect("insert normal tree"); + // Insert a child so the merk isn't empty (an empty merk would + // short-circuit to 0 before hitting the tree-type check on the + // no-proof side, since `Merk::sum_aggregate_on_range` checks + // tree_type before descending — confirm by inserting something). + db.insert( + [TEST_LEAF, b"normal"].as_ref(), + b"a", + Element::new_item(b"v".to_vec()), + None, + None, + v, + ) + .unwrap() + .expect("insert child"); + let path_query = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"normal".to_vec()], + QueryItem::RangeFrom(b"a".to_vec()..), + ); + let err = db + .grove_db + .query_aggregate_sum(&path_query, None, v) + .unwrap() + .expect_err("NormalTree leaf must be rejected by merk-level gate"); + // The merk-level error gets wrapped with contextual `CorruptedData` + // by `query_aggregate_sum` (callsite-specific path info — see + // `operations/get/query.rs`). 
+ match err {
+ crate::Error::CorruptedData(_) => {}
+ other => panic!("expected CorruptedData wrapper, got {:?}", other),
+ }
+ }
+
+ // -------------------------------------------------------------------
+ // Verifier error-path coverage: each test below pins a specific
+ // arm of `verify_v1_leaf_chain` / `verify_sum_leaf` /
+ // `verify_single_key_layer_proof_v0` / `enforce_lower_chain` in
+ // `grovedb/src/operations/proof/aggregate_sum/`. Mirrored from the
+ // count-side mutation tests in `aggregate_count_query_tests.rs`.
+ // -------------------------------------------------------------------
+
+ /// Decode the bincode envelope back into a `GroveDBProof` for surgical
+ /// mutation, mirroring the count-side helper.
+ fn decode_sum_envelope(proof: &[u8]) -> crate::operations::proof::GroveDBProof {
+ bincode::decode_from_slice(
+ proof,
+ bincode::config::standard()
+ .with_big_endian()
+ .with_limit::<{ 256 * 1024 * 1024 }>(),
+ )
+ .expect("decode envelope")
+ .0
+ }
+
+ /// Re-encode a (possibly mutated) `GroveDBProof` envelope using the
+ /// same bincode config the prover uses on the way out.
+ fn reencode_sum_envelope(decoded: crate::operations::proof::GroveDBProof) -> Vec<u8> {
+ bincode::encode_to_vec(
+ decoded,
+ bincode::config::standard()
+ .with_big_endian()
+ .with_no_limit(),
+ )
+ .expect("re-encode envelope")
+ }
+
+ /// Walk to the TEST_LEAF non-leaf merk proof bytes in a V1 envelope,
+ /// run `mutate` over its parsed ops, then re-encode. Mirrors
+ /// `mutate_test_leaf_layer_ops` from the count tests. 
+ fn mutate_sum_test_leaf_layer_ops(
+ proof: &[u8],
+ mutate: impl FnOnce(&mut Vec<grovedb_merk::proofs::Op>),
+ ) -> Vec<u8> {
+ use grovedb_merk::proofs::{encoding::encode_into, Decoder, Op};
+
+ use crate::operations::proof::{GroveDBProof, GroveDBProofV1, ProofBytes};
+
+ let mut decoded = decode_sum_envelope(proof);
+ let GroveDBProof::V1(GroveDBProofV1 { root_layer }) = &mut decoded else {
+ panic!("expected V1 envelope");
+ };
+ let test_leaf_layer = root_layer
+ .lower_layers
+ .get_mut(&TEST_LEAF.to_vec())
+ .expect("TEST_LEAF lower layer");
+ let bytes = match &mut test_leaf_layer.merk_proof {
+ ProofBytes::Merk(b) => b,
+ _ => panic!("expected Merk bytes at TEST_LEAF non-leaf"),
+ };
+ let mut ops: Vec<Op> = Decoder::new(bytes)
+ .map(|r| r.expect("decode existing op"))
+ .collect();
+ mutate(&mut ops);
+ let mut new_bytes = Vec::new();
+ encode_into(ops.iter(), &mut new_bytes);
+ *bytes = new_bytes;
+ reencode_sum_envelope(decoded)
+ }
+
+ #[test]
+ fn sum_non_leaf_proof_without_target_key_is_rejected() {
+ // Replace the KV op carrying the "st" key with a `Hash` op. The
+ // single-key verifier still parses the proof but `result_set` is
+ // empty for the requested key — the "did not contain the expected
+ // key" arm in verify_single_key_layer_proof_v0 fires (or, if the
+ // upstream merk verifier rejects first because the hash op makes
+ // the proof unparsable, that's still the same outcome). 
+ use grovedb_merk::proofs::{Node, Op}; + + let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let proof = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove_query"); + let mutated = mutate_sum_test_leaf_layer_ops(&proof, |ops| { + for op in ops.iter_mut() { + let key_match = matches!( + op, + Op::Push( + Node::KV(k, _) + | Node::KVValueHash(k, _, _) + | Node::KVValueHashFeatureType(k, _, _, _) + | Node::KVValueHashFeatureTypeWithChildHash(k, _, _, _, _) + ) + | Op::PushInverted( + Node::KV(k, _) + | Node::KVValueHash(k, _, _) + | Node::KVValueHashFeatureType(k, _, _, _) + | Node::KVValueHashFeatureTypeWithChildHash(k, _, _, _, _) + ) if k == b"st" + ); + if key_match { + *op = Op::Push(Node::Hash([0u8; 32])); + return; + } + } + panic!("test setup: no `st` KV op found in non-leaf proof"); + }); + let err = GroveDb::verify_aggregate_sum_query(&mutated, &pq, v) + .expect_err("missing target key must be rejected"); + match err { + crate::Error::InvalidProof(_, msg) => assert!( + msg.contains("did not contain the expected key") + || msg.contains("non-leaf single-key proof"), + "unexpected message: {msg}" + ), + other => panic!("expected InvalidProof, got {:?}", other), + } + } + + #[test] + fn sum_non_leaf_proof_with_kv_replaced_by_kvdigest_is_rejected() { + // Replace `st` KV with KVDigest (no value bytes) — hits the "no + // value bytes" arm in verify_single_key_layer_proof_v0 + // (`aggregate_sum/helpers.rs`). 
+ use grovedb_merk::proofs::{Node, Op}; + + let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let proof = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove_query"); + let mutated = mutate_sum_test_leaf_layer_ops(&proof, |ops| { + for op in ops.iter_mut() { + let replaced = match op { + Op::Push(Node::KVValueHash(k, _, vh)) + | Op::PushInverted(Node::KVValueHash(k, _, vh)) + if k == b"st" => + { + Some((k.clone(), *vh)) + } + Op::Push(Node::KVValueHashFeatureType(k, _, vh, _)) + | Op::PushInverted(Node::KVValueHashFeatureType(k, _, vh, _)) + if k == b"st" => + { + Some((k.clone(), *vh)) + } + Op::Push(Node::KVValueHashFeatureTypeWithChildHash(k, _, vh, _, _)) + | Op::PushInverted(Node::KVValueHashFeatureTypeWithChildHash(k, _, vh, _, _)) + if k == b"st" => + { + Some((k.clone(), *vh)) + } + _ => None, + }; + if let Some((k, vh)) = replaced { + *op = Op::Push(Node::KVDigest(k, vh)); + return; + } + } + panic!("test setup: no `st` KVValueHash op"); + }); + let result = GroveDb::verify_aggregate_sum_query(&mutated, &pq, v); + match result { + Err(crate::Error::InvalidProof(_, _)) => {} + other => panic!("expected InvalidProof, got {:?}", other), + } + } + + #[test] + fn sum_non_leaf_proof_with_undeserializable_value_is_rejected() { + // Mutate value bytes to garbage so Element::deserialize fails — + // covers the deserialize-failure arm in enforce_lower_chain + // (lines 341-348). 
+ use grovedb_merk::proofs::{Node, Op};
+
+ let v = GroveVersion::latest();
+ let (db, _root) = setup_15_key_provable_sum_tree(v);
+ let pq = PathQuery::new_aggregate_sum_on_range(
+ vec![TEST_LEAF.to_vec(), b"st".to_vec()],
+ QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()),
+ );
+ let proof = db
+ .grove_db
+ .prove_query(&pq, None, v)
+ .unwrap()
+ .expect("prove_query");
+ let garbage: Vec<u8> = vec![0xff, 0xff, 0xff];
+ let mutated = mutate_sum_test_leaf_layer_ops(&proof, |ops| {
+ for op in ops.iter_mut() {
+ let replaced = match op {
+ Op::Push(Node::KVValueHash(k, val, _))
+ | Op::PushInverted(Node::KVValueHash(k, val, _))
+ if k == b"st" =>
+ {
+ *val = garbage.clone();
+ true
+ }
+ Op::Push(Node::KVValueHashFeatureType(k, val, _, _))
+ | Op::PushInverted(Node::KVValueHashFeatureType(k, val, _, _))
+ if k == b"st" =>
+ {
+ *val = garbage.clone();
+ true
+ }
+ Op::Push(Node::KVValueHashFeatureTypeWithChildHash(k, val, _, _, _))
+ | Op::PushInverted(Node::KVValueHashFeatureTypeWithChildHash(
+ k,
+ val,
+ _,
+ _,
+ _,
+ )) if k == b"st" => {
+ *val = garbage.clone();
+ true
+ }
+ _ => false,
+ };
+ if replaced {
+ return;
+ }
+ }
+ panic!("test setup: no `st` value-bearing KV op");
+ });
+ let result = GroveDb::verify_aggregate_sum_query(&mutated, &pq, v);
+ assert!(
+ matches!(result, Err(crate::Error::InvalidProof(_, _))),
+ "expected InvalidProof, got {:?}",
+ result.map(|(_, s)| s)
+ );
+ }
+
+ #[test]
+ fn sum_non_leaf_proof_with_non_tree_element_is_rejected() {
+ // Replace `outer`'s value with a serialized Item: it deserializes
+ // fine, but enforce_lower_chain's `is_any_tree()` guard rejects it
+ // (`aggregate_sum/helpers.rs`).
+ use grovedb_merk::proofs::{Node, Op};
+
+ let v = GroveVersion::latest();
+ let (db, _root) = setup_three_layer_provable_sum_tree(v);
+ // We need a 3-layer setup so there's an intermediate (non-terminal)
+ // descent at depth 1 (path[1] = "outer"). 
At terminal layer the + // ProvableSumTree gate would fire first; we want the + // intermediate-tree gate. + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"outer".to_vec(), b"inner".to_vec()], + QueryItem::RangeInclusive(b"b".to_vec()..=b"d".to_vec()), + ); + let proof = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove_query"); + let item_bytes = Element::new_item(vec![0xab, 0xcd]) + .serialize(v) + .expect("serialize"); + let mutated = mutate_sum_test_leaf_layer_ops(&proof, |ops| { + for op in ops.iter_mut() { + let replaced = match op { + Op::Push(Node::KVValueHash(k, val, _)) + | Op::PushInverted(Node::KVValueHash(k, val, _)) + if k == b"outer" => + { + *val = item_bytes.clone(); + true + } + Op::Push(Node::KVValueHashFeatureType(k, val, _, _)) + | Op::PushInverted(Node::KVValueHashFeatureType(k, val, _, _)) + if k == b"outer" => + { + *val = item_bytes.clone(); + true + } + Op::Push(Node::KVValueHashFeatureTypeWithChildHash(k, val, _, _, _)) + | Op::PushInverted(Node::KVValueHashFeatureTypeWithChildHash( + k, + val, + _, + _, + _, + )) if k == b"outer" => { + *val = item_bytes.clone(); + true + } + _ => false, + }; + if replaced { + return; + } + } + panic!("test setup: no `outer` value-bearing KV op"); + }); + let result = GroveDb::verify_aggregate_sum_query(&mutated, &pq, v); + assert!( + matches!(result, Err(crate::Error::InvalidProof(_, _))), + "non-tree element on path must be rejected, got {:?}", + result.map(|(_, s)| s) + ); + } + + #[test] + fn sum_v1_envelope_with_non_merk_proof_bytes_is_rejected() { + // Swap leaf layer bytes for MMR variant → triggers V1 walker's + // "unexpected non-merk leaf bytes" arm (lines 189-196). 
+ use crate::operations::proof::{GroveDBProof, GroveDBProofV1, ProofBytes}; + + let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let proof = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove_query"); + + let mut decoded = decode_sum_envelope(&proof); + let GroveDBProof::V1(GroveDBProofV1 { root_layer }) = &mut decoded else { + panic!("expected V1 envelope on latest GroveVersion"); + }; + let leaf_layer = root_layer + .lower_layers + .get_mut(&TEST_LEAF.to_vec()) + .expect("TEST_LEAF") + .lower_layers + .get_mut(&b"st".to_vec()) + .expect("st"); + leaf_layer.merk_proof = ProofBytes::MMR(vec![0u8; 8]); + + let reencoded = reencode_sum_envelope(decoded); + let err = GroveDb::verify_aggregate_sum_query(&reencoded, &pq, v) + .expect_err("non-Merk leaf bytes must be rejected"); + match err { + crate::Error::InvalidProof(_, msg) => { + assert!( + msg.contains("non-merk"), + "expected non-merk rejection, got: {msg}" + ); + } + other => panic!("expected InvalidProof, got {:?}", other), + } + } + + #[test] + fn sum_v1_envelope_with_missing_lower_layer_is_rejected() { + // Drop the leaf layer → triggers the V1 walker's + // "missing lower layer for path key" arm (lines 209-216). 
+ use crate::operations::proof::{GroveDBProof, GroveDBProofV1}; + + let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let proof = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove_query"); + + let mut decoded = decode_sum_envelope(&proof); + let GroveDBProof::V1(GroveDBProofV1 { root_layer }) = &mut decoded else { + panic!("expected V1 envelope"); + }; + let test_leaf_layer = root_layer + .lower_layers + .get_mut(&TEST_LEAF.to_vec()) + .expect("TEST_LEAF"); + let removed = test_leaf_layer.lower_layers.remove(&b"st".to_vec()); + assert!(removed.is_some(), "test setup: st layer should exist"); + + let reencoded = reencode_sum_envelope(decoded); + let err = GroveDb::verify_aggregate_sum_query(&reencoded, &pq, v) + .expect_err("missing lower_layer must be rejected"); + match err { + crate::Error::InvalidProof(_, msg) => { + // Removing the only entry trips the count-shape gate + // (`lower_layers.len() != 1`), which fires before the + // key-shape gate. Both pin the same underlying + // property (the descent into the next path key must + // be present), so either message is acceptable. + assert!( + msg.contains("missing lower layer") + || msg.contains("lower-layer entries at depth"), + "expected missing-lower-layer or entry-count rejection, got: {msg}" + ); + } + other => panic!("expected InvalidProof, got {:?}", other), + } + } + + /// Strict-shape regression: any non-leaf layer must contain + /// exactly one `lower_layers` entry — the descent into the next + /// path key. Adding an extra sibling entry makes two byte-distinct + /// envelopes verify to the same `(root, sum)`, which we reject so + /// proofs are uniquely byte-shaped. 
+ #[test] + fn sum_v1_envelope_with_extra_lower_layer_is_rejected() { + use std::collections::BTreeMap; + + use crate::operations::proof::{GroveDBProof, GroveDBProofV1, LayerProof, ProofBytes}; + + let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let proof = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove_query"); + + let mut decoded = decode_sum_envelope(&proof); + let GroveDBProof::V1(GroveDBProofV1 { root_layer }) = &mut decoded else { + panic!("expected V1 envelope"); + }; + let test_leaf_layer = root_layer + .lower_layers + .get_mut(&TEST_LEAF.to_vec()) + .expect("TEST_LEAF"); + // Smuggle an unrelated layer in alongside the legitimate `st` + // descent. The honest prover never emits this shape, so the + // strict-shape gate must reject it. + test_leaf_layer.lower_layers.insert( + b"intruder".to_vec(), + LayerProof { + merk_proof: ProofBytes::Merk(Vec::new()), + lower_layers: BTreeMap::new(), + }, + ); + + let reencoded = reencode_sum_envelope(decoded); + let err = GroveDb::verify_aggregate_sum_query(&reencoded, &pq, v) + .expect_err("extra lower_layer at non-leaf depth must be rejected"); + match err { + crate::Error::InvalidProof(_, msg) => { + assert!( + msg.contains("lower-layer entries at depth"), + "expected entry-count rejection, got: {msg}" + ); + } + other => panic!("expected InvalidProof, got {:?}", other), + } + } + + /// Strict-shape regression: the sole `lower_layers` entry at a + /// non-leaf depth must be keyed under the expected descent key. + /// Renaming the entry (single entry, but under the wrong key) + /// exercises the key-shape gate, distinct from the count-shape + /// gate covered by `sum_v1_envelope_with_extra_lower_layer_*`. 
+ #[test] + fn sum_v1_envelope_with_wrong_keyed_lower_layer_is_rejected() { + use crate::operations::proof::{GroveDBProof, GroveDBProofV1}; + + let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let proof = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove_query"); + + let mut decoded = decode_sum_envelope(&proof); + let GroveDBProof::V1(GroveDBProofV1 { root_layer }) = &mut decoded else { + panic!("expected V1 envelope"); + }; + let test_leaf_layer = root_layer + .lower_layers + .get_mut(&TEST_LEAF.to_vec()) + .expect("TEST_LEAF"); + // Re-key the sole `st` entry under a different name. The + // count gate (`len == 1`) still passes; the key gate must + // reject because the sole entry is no longer for the expected + // path key. + let st_layer = test_leaf_layer + .lower_layers + .remove(&b"st".to_vec()) + .expect("test setup: st should be present"); + test_leaf_layer + .lower_layers + .insert(b"impostor".to_vec(), st_layer); + + let reencoded = reencode_sum_envelope(decoded); + let err = GroveDb::verify_aggregate_sum_query(&reencoded, &pq, v) + .expect_err("wrong-keyed lower_layer must be rejected"); + match err { + crate::Error::InvalidProof(_, msg) => { + assert!( + msg.contains("not keyed by the expected path key"), + "expected wrong-key rejection, got: {msg}" + ); + } + other => panic!("expected InvalidProof, got {:?}", other), + } + } + + /// Canonical-decode regression: a valid proof with any trailing + /// bytes appended must be rejected, even though the cryptographic + /// chain check would still bind the same `(RootHash, sum)` + /// result. Otherwise the same logical proof would have many + /// distinct byte encodings, which breaks proof-equality / caching + /// assumptions. Mirrors `aggregate_count_proof_with_trailing_bytes_*`. 
+ #[test] + fn sum_proof_with_trailing_bytes_is_rejected() { + let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let mut proof = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove_query should succeed"); + // Sanity: the untouched proof verifies. + GroveDb::verify_aggregate_sum_query(&proof, &pq, v).expect("clean proof should verify"); + // Append a single trailing byte and expect canonical-decode + // rejection. + proof.push(0u8); + let err = GroveDb::verify_aggregate_sum_query(&proof, &pq, v) + .expect_err("trailing-byte proof must be rejected"); + match err { + crate::Error::CorruptedData(msg) => { + assert!(msg.contains("trailing bytes"), "unexpected message: {msg}") + } + other => panic!("expected CorruptedData, got {:?}", other), + } + } + + /// Strict-shape regression: the leaf merk (the layer that holds the + /// actual sum proof) must carry no `lower_layers`. The honest + /// prover always emits an empty map at this depth. 
+ #[test] + fn sum_v1_envelope_with_lower_layers_under_leaf_is_rejected() { + use std::collections::BTreeMap; + + use crate::operations::proof::{GroveDBProof, GroveDBProofV1, LayerProof, ProofBytes}; + + let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let proof = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove_query"); + + let mut decoded = decode_sum_envelope(&proof); + let GroveDBProof::V1(GroveDBProofV1 { root_layer }) = &mut decoded else { + panic!("expected V1 envelope"); + }; + let leaf_layer = root_layer + .lower_layers + .get_mut(&TEST_LEAF.to_vec()) + .expect("TEST_LEAF") + .lower_layers + .get_mut(&b"st".to_vec()) + .expect("st"); + // Attach a dangling sub-layer under the leaf merk. The + // strict-shape gate at `depth == path_keys.len()` must reject + // this even though the smuggled bytes would not affect the + // verified sum. + leaf_layer.lower_layers.insert( + b"dangling".to_vec(), + LayerProof { + merk_proof: ProofBytes::Merk(Vec::new()), + lower_layers: BTreeMap::new(), + }, + ); + + let reencoded = reencode_sum_envelope(decoded); + let err = GroveDb::verify_aggregate_sum_query(&reencoded, &pq, v) + .expect_err("dangling layer under leaf must be rejected"); + match err { + crate::Error::InvalidProof(_, msg) => { + assert!( + msg.contains("unexpected lower layers below the leaf"), + "expected leaf-no-children rejection, got: {msg}" + ); + } + other => panic!("expected InvalidProof, got {:?}", other), + } + } + + #[test] + fn sum_v1_envelope_with_malformed_leaf_sum_proof_is_rejected() { + // Replace leaf merk bytes with a Push(Hash(...)) ops stream that + // the sum verifier's Phase 1 rejects (plain Hash isn't on the + // sum allowlist). Triggers `verify_sum_leaf`'s `.map_err(...)` + // arm (lines 250-254). 
+ use std::collections::LinkedList; + + use grovedb_merk::proofs::{encoding::encode_into, Node, Op}; + + use crate::operations::proof::{GroveDBProof, GroveDBProofV1, ProofBytes}; + + let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let proof = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove_query"); + + let mut decoded = decode_sum_envelope(&proof); + let GroveDBProof::V1(GroveDBProofV1 { root_layer }) = &mut decoded else { + panic!("expected V1 envelope"); + }; + let leaf_layer = root_layer + .lower_layers + .get_mut(&TEST_LEAF.to_vec()) + .expect("TEST_LEAF") + .lower_layers + .get_mut(&b"st".to_vec()) + .expect("st"); + + let mut ops: LinkedList = LinkedList::new(); + ops.push_back(Op::Push(Node::Hash([0u8; 32]))); + let mut bad_bytes = Vec::new(); + encode_into(ops.iter(), &mut bad_bytes); + leaf_layer.merk_proof = ProofBytes::Merk(bad_bytes); + + let reencoded = reencode_sum_envelope(decoded); + let err = GroveDb::verify_aggregate_sum_query(&reencoded, &pq, v) + .expect_err("malformed leaf sum proof must be rejected"); + match err { + crate::Error::InvalidProof(_, msg) => { + assert!( + msg.contains("aggregate-sum leaf proof failed to verify"), + "expected leaf-verify failure message, got: {msg}" + ); + } + other => panic!("expected InvalidProof, got {:?}", other), + } + } + + #[test] + fn sum_v1_envelope_with_corrupted_non_leaf_merk_bytes_is_rejected() { + // Truncate the non-leaf merk proof bytes — the single-key proof + // verifier fails before we ever descend (lines 279-286). 
+ use crate::operations::proof::{GroveDBProof, GroveDBProofV1, ProofBytes}; + + let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let proof = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove_query"); + + let mut decoded = decode_sum_envelope(&proof); + let GroveDBProof::V1(GroveDBProofV1 { root_layer }) = &mut decoded else { + panic!("expected V1 envelope"); + }; + let test_leaf_layer = root_layer + .lower_layers + .get_mut(&TEST_LEAF.to_vec()) + .expect("TEST_LEAF"); + match &mut test_leaf_layer.merk_proof { + ProofBytes::Merk(b) => { + *b = vec![0xff]; + } + other => panic!( + "expected Merk bytes at non-leaf, got discriminant {:?}", + std::mem::discriminant(other) + ), + } + + let reencoded = reencode_sum_envelope(decoded); + let err = GroveDb::verify_aggregate_sum_query(&reencoded, &pq, v) + .expect_err("corrupted non-leaf merk bytes must be rejected"); + match err { + crate::Error::InvalidProof(_, _) => {} + other => panic!("expected InvalidProof, got {:?}", other), + } + } + + // NOTE: There used to be a `sum_v0_envelope_with_missing_lower_layer_is_rejected` + // test here. It was removed alongside the V0 envelope verifier + // walker — V0 (`MerkOnlyLayerProof`) envelopes are now rejected at + // the prover and verifier entry points (see + // `aggregate_sum_rejects_grove_v2_envelope` for the prover gate and + // the `require_v1_envelope` helper in + // `operations/proof/aggregate_sum/mod.rs` for the verifier gate), + // so the V0 missing-layer code path no longer exists. The V1 + // counterpart `sum_v1_envelope_with_missing_lower_layer_is_rejected` + // continues to pin the missing-layer behavior on the only envelope + // shape that can legitimately carry an aggregate-sum proof. 

    #[test]
    fn sum_unparsable_envelope_is_rejected() {
        // Random garbage bytes can't decode as a GroveDBProof — covers
        // the bincode-decode error arm in verify_aggregate_sum_query
        // (around lines 86-88).
        let v = GroveVersion::latest();
        let pq = PathQuery::new_aggregate_sum_on_range(
            vec![TEST_LEAF.to_vec(), b"st".to_vec()],
            QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()),
        );
        let err = GroveDb::verify_aggregate_sum_query(&[0xffu8; 64], &pq, v)
            .expect_err("unparsable bytes must be rejected");
        match err {
            crate::Error::CorruptedData(msg) => {
                assert!(
                    msg.contains("unable to decode proof"),
                    "expected decode-error message, got: {msg}"
                );
            }
            other => panic!("expected CorruptedData, got {:?}", other),
        }
    }

    #[test]
    fn sum_proof_display_includes_sum_node_variants() {
        // Drive the Display arms for ProvableSumTree node variants
        // (KVSum / KVHashSum / KVDigestSum / HashWithSum / KVRefValueHashSum)
        // in `node_to_string` (grovedb/src/operations/proof/mod.rs around
        // lines 753-781). Formatting the decoded proof recursively walks
        // every Op → Node, hitting each per-variant arm that appears in
        // the proof. We don't pin which specific variants the prover
        // emits — for a sum proof on a 15-key tree we expect at least
        // KVDigestSum (boundary) and HashWithSum (Disjoint / Contained
        // leaf), but the exact mix can change. Instead we assert that
        // the formatted output contains a sum-bearing substring.
+ let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let pq = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let proof_bytes = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove_query"); + let decoded = decode_sum_envelope(&proof_bytes); + let printed = format!("{}", decoded); + assert!( + printed.contains("Sum") || printed.contains("HashWith"), + "expected formatted proof to mention sum-bearing nodes: {printed}" + ); + } + + #[test] + fn regular_prove_on_provable_sum_tree_formats_kv_sum_nodes() { + // Drive the KVSum / KVHashSum Display arms specifically. The + // sum-aggregate proof emits KVDigestSum / HashWithSum, but a + // regular `Merk::prove`-style query on a ProvableSumTree emits + // KVSum (for the queried items) and KVHashSum (for non-queried + // path nodes). We hit those by running a normal proof query on + // the same tree and formatting it. + use grovedb_merk::proofs::Query as MerkQuery; + + let v = GroveVersion::latest(); + let (db, _root) = setup_15_key_provable_sum_tree(v); + let mut q = MerkQuery::new(); + q.insert_range_inclusive(b"c".to_vec()..=b"l".to_vec()); + let pq = PathQuery::new( + vec![TEST_LEAF.to_vec(), b"st".to_vec()], + crate::SizedQuery::new(q, None, None), + ); + let proof_bytes = db + .grove_db + .prove_query(&pq, None, v) + .unwrap() + .expect("prove_query"); + let decoded = decode_sum_envelope(&proof_bytes); + let printed = format!("{}", decoded); + // KVSum or KVHashSum must appear in the formatted output for a + // regular range query against a ProvableSumTree. 
+ assert!( + printed.contains("KVSum") || printed.contains("KVHashSum"), + "expected KV-sum-flavored node in printed proof: {printed}" + ); + } +} diff --git a/grovedb/src/tests/mod.rs b/grovedb/src/tests/mod.rs index 58ec31d90..eb17a018f 100644 --- a/grovedb/src/tests/mod.rs +++ b/grovedb/src/tests/mod.rs @@ -7,6 +7,7 @@ mod query_tests; mod sum_tree_tests; mod aggregate_count_query_tests; +mod aggregate_sum_query_tests; mod batch_coverage_tests; mod batch_delete_tree_tests; mod batch_rejection_tests; @@ -40,6 +41,7 @@ mod provable_count_sum_tree_tests; mod provable_count_tree_comprehensive_test; mod provable_count_tree_structure_test; mod provable_count_tree_test; +mod provable_sum_tree_tests; mod query_result_type_tests; mod reference_path_tests; mod reference_with_sum_item_tests; diff --git a/grovedb/src/tests/not_counted_or_summed_tests.rs b/grovedb/src/tests/not_counted_or_summed_tests.rs index ac71f8902..334fd5180 100644 --- a/grovedb/src/tests/not_counted_or_summed_tests.rs +++ b/grovedb/src/tests/not_counted_or_summed_tests.rs @@ -283,6 +283,149 @@ mod tests { assert_eq!(aggregate.as_sum_i64(), 0); } + #[test] + fn direct_insert_not_counted_or_summed_provable_sum_tree_in_count_sum_tree_excludes_both_axes() + { + // A bare ProvableSumTree inserted into a CountSumTree contributes + // (1, internal_sum) to the parent's (count, sum). Wrapped in + // NotCountedOrSummed it must contribute (0, 0). This exercises the + // ProvableSumTree branch of the NotCountedOrSummed inner allow-list + // (added when the wrapper inherited PR #666's contract on top of + // the ProvableSumTree feature). + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + make_count_sum_tree_parent(&db, b"outer", grove_version); + + // A plain sum item contributes (1, 7) so the aggregate is non-zero + // and we can read the precise suppression effect off the wrapped + // subtree. 
+ db.insert( + [TEST_LEAF, b"outer"].as_ref(), + b"plain", + Element::new_sum_item(7), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert plain sum item"); + + // NotCountedOrSummed(ProvableSumTree(_, 0)) wrapper. + let inner = Element::new_provable_sum_tree(None); + let wrapped = Element::new_not_counted_or_summed(inner).expect("wrap ok"); + db.insert( + [TEST_LEAF, b"outer"].as_ref(), + b"w", + wrapped, + None, + None, + grove_version, + ) + .unwrap() + .expect("insert wrapped provable sum tree"); + + // Add a sum item under the wrapped subtree so the inner ProvableSumTree + // carries a non-trivial aggregate that must NOT bubble up to the + // CountSumTree parent. + db.insert( + [TEST_LEAF, b"outer", b"w"].as_ref(), + b"inner", + Element::new_sum_item(42), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert inner sum item under wrapped ProvableSumTree"); + + // Outer aggregate must be (count=1, sum=7): only `plain` contributes; + // the wrapped ProvableSumTree's implicit count-of-one and its + // inner sum=42 are both suppressed. 
+ let batch = StorageBatch::new(); + let tx = db.start_transaction(); + let outer_merk = db + .open_transactional_merk_at_path( + [TEST_LEAF, b"outer"].as_ref().into(), + &tx, + Some(&batch), + grove_version, + ) + .unwrap() + .expect("open outer merk"); + let aggregate = outer_merk + .aggregate_data() + .expect("read outer aggregate data"); + assert_eq!( + aggregate.as_count_u64(), + 1, + "wrapped ProvableSumTree must not contribute to outer count; got {:?}", + aggregate + ); + assert_eq!( + aggregate.as_sum_i64(), + 7, + "wrapped ProvableSumTree must not contribute to outer sum; got {:?}", + aggregate + ); + } + + #[test] + fn direct_insert_not_counted_or_summed_provable_sum_tree_in_provable_count_sum_tree_excludes_both_axes( + ) { + // Same as the test above but with a ProvableCountSumTree parent — + // confirms the wrapper also works against the provable variant of + // the count-and-sum-bearing parent. + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + make_provable_count_sum_tree_parent(&db, b"outer", grove_version); + + // Plain item contributes (1, 0). + db.insert( + [TEST_LEAF, b"outer"].as_ref(), + b"plain", + Element::new_item(b"x".to_vec()), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert plain item"); + + // NotCountedOrSummed(ProvableSumTree) → contributes (0, 0). + let inner = Element::new_provable_sum_tree(None); + let wrapped = Element::new_not_counted_or_summed(inner).expect("wrap ok"); + db.insert( + [TEST_LEAF, b"outer"].as_ref(), + b"w", + wrapped, + None, + None, + grove_version, + ) + .unwrap() + .expect("insert wrapped provable sum tree"); + + // Outer aggregate must be (count=1, sum=0). 
+ let batch = StorageBatch::new(); + let tx = db.start_transaction(); + let outer_merk = db + .open_transactional_merk_at_path( + [TEST_LEAF, b"outer"].as_ref().into(), + &tx, + Some(&batch), + grove_version, + ) + .unwrap() + .expect("open outer merk"); + let aggregate = outer_merk + .aggregate_data() + .expect("read outer aggregate data"); + assert_eq!(aggregate.as_count_u64(), 1); + assert_eq!(aggregate.as_sum_i64(), 0); + } + #[test] fn batch_propagation_preserves_not_counted_or_summed_wrapper_on_subtree() { // A batch that inserts a NotCountedOrSummed(SumTree) AND writes a @@ -490,13 +633,14 @@ mod tests { #[test] fn constructor_invariants() { - // Only the four sum-bearing tree variants are accepted as inner. + // All five sum-bearing tree variants are accepted as inner. assert!(Element::new_not_counted_or_summed(Element::new_sum_tree(None)).is_ok()); assert!(Element::new_not_counted_or_summed(Element::new_big_sum_tree(None)).is_ok()); assert!(Element::new_not_counted_or_summed(Element::new_count_sum_tree(None)).is_ok()); assert!( Element::new_not_counted_or_summed(Element::new_provable_count_sum_tree(None)).is_ok() ); + assert!(Element::new_not_counted_or_summed(Element::new_provable_sum_tree(None)).is_ok()); // Non-sum-tree inners are rejected. assert!(Element::new_not_counted_or_summed(Element::new_item(b"x".to_vec())).is_err()); diff --git a/grovedb/src/tests/proof_advanced_tests.rs b/grovedb/src/tests/proof_advanced_tests.rs index 6f29d611d..5297f9fdb 100644 --- a/grovedb/src/tests/proof_advanced_tests.rs +++ b/grovedb/src/tests/proof_advanced_tests.rs @@ -316,6 +316,66 @@ mod tests { } } + /// Canonical-decode regression for the general verifier path. + /// + /// `GroveDb::verify_query` (and its siblings `verify_query_raw`, + /// `verify_query_with_options`, `verify_trunk_chunk_proof`) all + /// route through `decode_grovedb_proof_canonical`, which rejects + /// any trailing bytes beyond the encoded envelope. 
Without this, + /// the same logical proof would have many distinct byte encodings + /// (a proof and the same proof with arbitrary suffix bytes), all + /// verifying to the same `(RootHash, results)`. The cryptographic + /// chain still binds the answer, so this isn't a soundness break, + /// but it lets two byte-distinct proofs verify identically, which + /// breaks any equality-by-bytes / caching / dedup assumption a + /// consumer might rely on. + /// + /// Mirrors `sum_proof_with_trailing_bytes_is_rejected` and + /// `aggregate_count_proof_with_trailing_bytes_is_rejected` for the + /// non-aggregate `verify_query` entry point. + #[test] + fn verify_query_rejects_proof_with_trailing_bytes() { + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + db.insert( + [TEST_LEAF].as_ref(), + b"k", + Element::new_item(b"v".to_vec()), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert"); + + let mut query = Query::new(); + query.insert_all(); + let path_query = PathQuery::new_unsized(vec![TEST_LEAF.to_vec()], query); + + let mut proof = db + .prove_query(&path_query, None, grove_version) + .unwrap() + .expect("prove_query"); + + // Sanity: the untouched proof verifies. + GroveDb::verify_query(&proof, &path_query, grove_version) + .expect("clean proof should verify"); + + // Append a single trailing byte. The canonical-decode contract + // must reject this even though the cryptographic chain still + // binds the same `(RootHash, results)`. 
+ proof.push(0u8); + let err = GroveDb::verify_query(&proof, &path_query, grove_version) + .expect_err("trailing-byte proof must be rejected"); + match err { + crate::Error::CorruptedData(msg) => { + assert!(msg.contains("trailing bytes"), "unexpected message: {msg}") + } + other => panic!("expected CorruptedData, got {:?}", other), + } + } + #[test] fn verify_query_with_options_limit() { // Prove a query with a limit and verify the results are correctly diff --git a/grovedb/src/tests/provable_count_sum_tree_tests.rs b/grovedb/src/tests/provable_count_sum_tree_tests.rs index 8bee9f4b9..c270b0cc3 100644 --- a/grovedb/src/tests/provable_count_sum_tree_tests.rs +++ b/grovedb/src/tests/provable_count_sum_tree_tests.rs @@ -78,11 +78,15 @@ mod tests { Node::KVDigestCount(k, ..) => k.clone(), Node::KVRefValueHash(k, ..) => k.clone(), Node::KVRefValueHashCount(k, ..) => k.clone(), + Node::KVSum(k, ..) => k.clone(), + Node::KVDigestSum(k, ..) => k.clone(), + Node::KVRefValueHashSum(k, ..) => k.clone(), Node::KVHashCount(..) => vec![], Node::Hash(_) | Node::KVHash(_) => vec![], // HashWithCount is keyless (collapsed subtree representation // for AggregateCountOnRange proofs). Node::HashWithCount(..) => vec![], + Node::KVHashSum(..) | Node::HashWithSum(..) => vec![], }; results.push((key, count)); } diff --git a/grovedb/src/tests/provable_sum_tree_tests.rs b/grovedb/src/tests/provable_sum_tree_tests.rs new file mode 100644 index 000000000..30d0fe9d5 --- /dev/null +++ b/grovedb/src/tests/provable_sum_tree_tests.rs @@ -0,0 +1,1385 @@ +//! End-to-end behavior tests for `ProvableSumTree` in GroveDB. +//! +//! Coverage: +//! 1. Direct insert + read round-trip of a `ProvableSumTree`, with the +//! parent's `sum_value` field reflecting the running total of inserted +//! `SumItem` children. +//! 2. Aggregate propagation across positive, negative, zero, and +//! `i64::MIN`/`i64::MAX` sum values. +//! 3. Hash divergence from a plain `SumTree` populated with identical +//! 
children — `node_hash_with_sum` binds the aggregate sum. +//! 4. Nested `ProvableSumTree` aggregates propagate to the outer tree's +//! aggregate sum and root hash. +//! 5. Wrapper interactions: `NonCounted(ProvableSumTree)` and +//! `NotSummed(ProvableSumTree)` short-circuit parent aggregation +//! without affecting the wrapped tree's own hash. +//! 6. Sum mutation (e.g. deleting a `SumItem` child) changes the +//! `ProvableSumTree`'s root hash because the aggregate sum is bound +//! into the hash. +//! 7. Direct insertion of a non-empty `ProvableSumTree` element with a +//! pre-existing root key + state (mirroring the existing +//! `ProvableCountTree` direct-insert pattern). + +#[cfg(test)] +mod tests { + use grovedb_version::version::GroveVersion; + + use crate::{tests::make_test_grovedb, Element}; + + /// 1. Round-trip a `ProvableSumTree`: insert it, populate with mixed + /// `SumItem` children, read back the parent and verify its tracked + /// `sum_value` matches the running sum of inserted children. + #[test] + fn provable_sum_tree_round_trip_tracks_aggregate_sum() { + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + db.insert( + &[] as &[&[u8]], + b"psum", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("should insert provable sum tree"); + + // Mix of SumItem values: 7, 13, 20. Aggregate = 40. + let mut expected_sum: i64 = 0; + for (key, value) in [(b"a".as_slice(), 7i64), (b"b", 13), (b"c", 20)] { + db.insert( + &[b"psum".as_slice()], + key, + Element::new_sum_item(value), + None, + None, + grove_version, + ) + .unwrap() + .expect("should insert sum item"); + + expected_sum += value; + + let fetched = db + .get(&[] as &[&[u8]], b"psum", None, grove_version) + .unwrap() + .expect("should get parent psum"); + // Each round, the parent's tracked aggregate sum should equal the + // running total of inserted children. 
+ // (The first iteration: 7; second: 20; third: 40.) + assert!(matches!(fetched, Element::ProvableSumTree(_, _, _))); + let running = fetched.as_provable_sum_tree_value().expect("psum value"); + assert_eq!( + running, + expected_sum, + "ProvableSumTree aggregate must equal running total after inserting {:?}", + std::str::from_utf8(key).unwrap_or("") + ); + } + + let parent = db + .get(&[] as &[&[u8]], b"psum", None, grove_version) + .unwrap() + .expect("get parent"); + let sum_value = parent.as_provable_sum_tree_value().expect("psum value"); + assert_eq!(sum_value, 7 + 13 + 20); + + // Children must round-trip identically. + for (key, expected) in [(b"a".as_slice(), 7i64), (b"b", 13), (b"c", 20)] { + let elem = db + .get(&[b"psum".as_slice()], key, None, grove_version) + .unwrap() + .expect("get sum item"); + match elem { + Element::SumItem(v, _) => assert_eq!(v, expected), + other => panic!("expected SumItem, got {:?}", other), + } + } + } + + /// 2. Aggregate propagation across positive, negative, zero, and the + /// extremes of `i64`. We test ranges that won't overflow. + #[test] + fn provable_sum_tree_aggregate_negatives_and_zeros() { + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + db.insert( + &[] as &[&[u8]], + b"psum", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert psum"); + + // -100 + 50 + 50 + (-200) = -200 + let inputs: &[(&[u8], i64)] = &[ + (b"a", -100), + (b"b", 50), + (b"c", 50), + (b"d", -200), + (b"e", 0), + ]; + for (key, value) in inputs { + db.insert( + &[b"psum".as_slice()], + key, + Element::new_sum_item(*value), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert sum item"); + } + let agg = db + .get(&[] as &[&[u8]], b"psum", None, grove_version) + .unwrap() + .expect("get psum") + .as_provable_sum_tree_value() + .expect("psum value"); + assert_eq!(agg, -200); + } + + /// 2b. 
`i64::MAX` and `i64::MIN` alone propagate correctly (not combined, + /// to avoid overflow). + #[test] + fn provable_sum_tree_aggregate_extremes() { + let grove_version = GroveVersion::latest(); + + for &extreme in &[i64::MAX, i64::MIN] { + let db = make_test_grovedb(grove_version); + db.insert( + &[] as &[&[u8]], + b"psum", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert psum"); + + db.insert( + &[b"psum".as_slice()], + b"k", + Element::new_sum_item(extreme), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert sum item"); + + let agg = db + .get(&[] as &[&[u8]], b"psum", None, grove_version) + .unwrap() + .expect("get psum") + .as_provable_sum_tree_value() + .expect("psum value"); + assert_eq!(agg, extreme, "i64 extreme {} should propagate", extreme); + } + } + + /// 3. `ProvableSumTree` root hash diverges from a plain `SumTree` with + /// identical children. This is the hash-binding cornerstone: the sum + /// is part of the node hash. + #[test] + fn provable_sum_tree_hash_diverges_from_sum_tree() { + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + // Two trees with identical children. 
+ db.insert( + &[] as &[&[u8]], + b"plain_sum", + Element::empty_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert plain sum tree"); + db.insert( + &[] as &[&[u8]], + b"provable_sum", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert provable sum tree"); + + for (key, v) in [(b"a".as_slice(), 1i64), (b"b", 2), (b"c", 3)] { + db.insert( + &[b"plain_sum".as_slice()], + key, + Element::new_sum_item(v), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert into plain sum tree"); + db.insert( + &[b"provable_sum".as_slice()], + key, + Element::new_sum_item(v), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert into provable sum tree"); + } + + let plain = db + .get(&[] as &[&[u8]], b"plain_sum", None, grove_version) + .unwrap() + .expect("get plain"); + let provable = db + .get(&[] as &[&[u8]], b"provable_sum", None, grove_version) + .unwrap() + .expect("get provable"); + + // Both should track the same aggregate. + match plain { + Element::SumTree(_, s, _) => assert_eq!(s, 6), + other => panic!("expected SumTree, got {:?}", other), + } + match provable { + Element::ProvableSumTree(_, s, _) => assert_eq!(s, 6), + other => panic!("expected ProvableSumTree, got {:?}", other), + } + + // But the two subtree root hashes (and hence the grovedb root hash + // path through them) must differ because ProvableSumTree binds the + // sum into the node hash via `node_hash_with_sum`. 
+ let tx = db.start_transaction();
+ let plain_merk_root = db
+ .open_transactional_merk_at_path(
+ [b"plain_sum".as_slice()].as_ref().into(),
+ &tx,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("open plain merk")
+ .root_hash()
+ .unwrap();
+ let provable_merk_root = db
+ .open_transactional_merk_at_path(
+ [b"provable_sum".as_slice()].as_ref().into(),
+ &tx,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("open provable merk")
+ .root_hash()
+ .unwrap();
+ assert_ne!(
+ plain_merk_root, provable_merk_root,
+ "Root hash divergence: same children must give different roots \
+ between SumTree and ProvableSumTree"
+ );
+ }
+
+ /// 4. Nested `ProvableSumTree[A] -> ProvableSumTree[B] -> SumItems`:
+ /// B's aggregate propagates up into A's aggregate, and A's root hash
+ /// includes A's aggregate (which transitively reflects B's children).
+ #[test]
+ fn nested_provable_sum_trees_propagate_aggregate_upward() {
+ let grove_version = GroveVersion::latest();
+ let db = make_test_grovedb(grove_version);
+
+ // outer: ProvableSumTree[A]
+ db.insert(
+ &[] as &[&[u8]],
+ b"A",
+ Element::empty_provable_sum_tree(),
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert A");
+
+ // inner: ProvableSumTree[B] inside A
+ db.insert(
+ &[b"A".as_slice()],
+ b"B",
+ Element::empty_provable_sum_tree(),
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert B inside A");
+
+ // Some SumItems in B.
+ for (key, v) in [(b"x".as_slice(), 10i64), (b"y", 20), (b"z", -5)] {
+ db.insert(
+ &[b"A".as_slice(), b"B".as_slice()],
+ key,
+ Element::new_sum_item(v),
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert sum item into B");
+ }
+
+ // And one SumItem directly in A.
+ db.insert(
+ &[b"A".as_slice()],
+ b"direct",
+ Element::new_sum_item(100),
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert direct in A");
+
+ // B's aggregate = 25, contributed to A.
+ let b_elem = db
+ .get(&[b"A".as_slice()], b"B", None, grove_version)
+ .unwrap()
+ .expect("get B");
+ assert_eq!(b_elem.as_provable_sum_tree_value().unwrap(), 25);
+
+ // A's aggregate = B's aggregate (25) + direct sum item (100) = 125.
+ let a_elem = db
+ .get(&[] as &[&[u8]], b"A", None, grove_version)
+ .unwrap()
+ .expect("get A");
+ assert_eq!(a_elem.as_provable_sum_tree_value().unwrap(), 125);
+
+ // Now mutate B's children — A's aggregate (and hash) must shift.
+ let root_before = db
+ .root_hash(None, grove_version)
+ .unwrap()
+ .expect("root hash before");
+
+ db.insert(
+ &[b"A".as_slice(), b"B".as_slice()],
+ b"w",
+ Element::new_sum_item(1000),
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert into B");
+
+ let root_after = db
+ .root_hash(None, grove_version)
+ .unwrap()
+ .expect("root hash after");
+ assert_ne!(
+ root_before, root_after,
+ "nested ProvableSumTree mutation must shift the root hash"
+ );
+
+ let b_after = db
+ .get(&[b"A".as_slice()], b"B", None, grove_version)
+ .unwrap()
+ .expect("get B");
+ assert_eq!(b_after.as_provable_sum_tree_value().unwrap(), 1025);
+ let a_after = db
+ .get(&[] as &[&[u8]], b"A", None, grove_version)
+ .unwrap()
+ .expect("get A");
+ assert_eq!(a_after.as_provable_sum_tree_value().unwrap(), 1125);
+ }
+
+ /// 5a. `NonCounted(ProvableSumTree)` inside a `CountTree` parent:
+ /// the wrapper short-circuits count propagation, so the
+ /// CountTree's aggregate count does NOT include this child as 1.
+ #[test]
+ fn non_counted_provable_sum_tree_does_not_increment_count_parent() {
+ let grove_version = GroveVersion::latest();
+ let db = make_test_grovedb(grove_version);
+
+ db.insert(
+ &[] as &[&[u8]],
+ b"ct",
+ Element::empty_count_tree(),
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert count tree parent");
+
+ // Bare item contributes 1.
+ db.insert( + &[b"ct".as_slice()], + b"plain_item", + Element::new_item(b"x".to_vec()), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert plain item"); + + // NonCounted(ProvableSumTree) should contribute 0. + let nc_pst = Element::new_non_counted(Element::empty_provable_sum_tree()).expect("wrap ok"); + db.insert( + &[b"ct".as_slice()], + b"nc_pst", + nc_pst, + None, + None, + grove_version, + ) + .unwrap() + .expect("insert NonCounted(ProvableSumTree)"); + + let count_tree = db + .get(&[] as &[&[u8]], b"ct", None, grove_version) + .unwrap() + .expect("get count_tree"); + // Only the plain item should count; the wrapped subtree contributes 0. + assert_eq!(count_tree.count_value_or_default(), 1); + } + + /// 5b. `NotSummed(ProvableSumTree)` inside a `SumTree` parent: the + /// wrapper suppresses the inner tree's sum from propagating to the + /// SumTree parent. The wrapped ProvableSumTree's own children's sums + /// still affect its own root hash though. + #[test] + fn not_summed_provable_sum_tree_does_not_propagate_sum_to_sum_parent() { + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + db.insert( + &[] as &[&[u8]], + b"st", + Element::empty_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert sum tree parent"); + + // A bare SumItem(7) contributes 7. + db.insert( + &[b"st".as_slice()], + b"plain_si", + Element::new_sum_item(7), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert plain sum item"); + + // NotSummed(ProvableSumTree) — its own (eventually-populated) + // aggregate must not propagate. 
+ let ns_pst = Element::new_not_summed(Element::empty_provable_sum_tree()).expect("wrap ok");
+ db.insert(
+ &[b"st".as_slice()],
+ b"ns_pst",
+ ns_pst,
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert NotSummed(ProvableSumTree)");
+
+ // Even after populating the inner ProvableSumTree, the SumTree
+ // parent's aggregate sum must NOT advance from the wrapped child.
+ // NOTE: insertion into the wrapped child uses the inner type's
+ // dispatch — but at the parent's aggregate level, NotSummed
+ // already zeroed out the wrapper's contribution.
+ db.insert(
+ &[b"st".as_slice(), b"ns_pst".as_slice()],
+ b"hidden",
+ Element::new_sum_item(9999),
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert into ns_pst inner");
+
+ let sum_tree = db
+ .get(&[] as &[&[u8]], b"st", None, grove_version)
+ .unwrap()
+ .expect("get sum tree parent");
+ match sum_tree {
+ Element::SumTree(_, s, _) => assert_eq!(
+ s, 7,
+ "wrapped ProvableSumTree's children must not propagate"
+ ),
+ other => panic!("expected SumTree, got {:?}", other),
+ }
+
+ // The wrapped inner tree's own aggregate STILL tracks its sum.
+ // Use `get_raw` to preserve the wrapper (db.get strips wrappers via
+ // `into_underlying`).
+ let wrapped = db
+ .get_raw(
+ [b"st".as_slice()].as_ref().into(),
+ b"ns_pst",
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("get_raw wrapped");
+ // wrapped is `NotSummed(Box<Element>)`.
+ match wrapped {
+ Element::NotSummed(inner) => match *inner {
+ Element::ProvableSumTree(_, s, _) => assert_eq!(s, 9999),
+ other => panic!("expected ProvableSumTree, got {:?}", other),
+ },
+ other => panic!("expected NotSummed, got {:?}", other),
+ }
+ }
+
+ /// 6. Mutating the sum (deleting a SumItem child) changes the root
+ /// hash of the ProvableSumTree because the aggregate sum is bound into
+ /// the node hash.
+ #[test] + fn deleting_sum_item_changes_provable_sum_tree_root_hash() { + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + db.insert( + &[] as &[&[u8]], + b"psum", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert provable sum tree"); + + for (key, v) in [(b"a".as_slice(), 10i64), (b"b", 20), (b"c", 30), (b"d", 40)] { + db.insert( + &[b"psum".as_slice()], + key, + Element::new_sum_item(v), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert sum item"); + } + + let root_before = db + .root_hash(None, grove_version) + .unwrap() + .expect("root hash before"); + + db.delete(&[b"psum".as_slice()], b"c", None, None, grove_version) + .unwrap() + .expect("delete c"); + + let root_after = db + .root_hash(None, grove_version) + .unwrap() + .expect("root hash after"); + assert_ne!( + root_before, root_after, + "deleting a SumItem must change the ProvableSumTree root hash" + ); + + let psum = db + .get(&[] as &[&[u8]], b"psum", None, grove_version) + .unwrap() + .expect("get psum"); + assert_eq!(psum.as_provable_sum_tree_value().unwrap(), 10 + 20 + 40); + } + + /// 7. Directly insert a non-empty `ProvableSumTree` element pointing at + /// an existing root key. Mirrors the existing + /// `ProvableCountTree` direct-insert behavior — when no state exists, + /// the insert is structurally accepted but corresponds to an empty + /// Merk. Most importantly, the direct-insert path does not panic + /// and the read path returns the value back faithfully. + #[test] + fn direct_insert_provable_sum_tree_with_root_key_and_sum() { + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + // Step 1: build a populated provable_sum_tree under `template`, + // then snapshot its root key + aggregate sum. 
The direct-insert + // path below cannot fabricate state out of thin air, so the + // canonical pattern is: write a tree the normal way and inspect + // its on-disk shape. + db.insert( + &[] as &[&[u8]], + b"template", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert template"); + + for (key, v) in [(b"a".as_slice(), 1i64), (b"b", 2), (b"c", 3)] { + db.insert( + &[b"template".as_slice()], + key, + Element::new_sum_item(v), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert into template"); + } + + let template = db + .get(&[] as &[&[u8]], b"template", None, grove_version) + .unwrap() + .expect("get template"); + let (captured_root_key, captured_sum) = match template { + Element::ProvableSumTree(root_key, sum, flags) => { + assert!(root_key.is_some()); + assert_eq!(sum, 6); + (root_key, sum) + } + other => panic!("expected ProvableSumTree, got {:?}", other), + }; + + // Step 2: actually exercise the direct-insert path with the + // captured root_key + sum. The non-batch insert path forbids + // inserting a Tree element that already declares a root_key + // ("a tree should be empty at the moment of insertion when not + // using batches"), so the documented "direct-insert" semantics + // are reachable only via the batch path. Use an + // `insert_only_known_to_not_already_exist_op` over a populated + // ProvableSumTree element at a fresh top-level key and apply + // the batch. 
+ use crate::batch::QualifiedGroveDbOp; + let direct = Element::ProvableSumTree(captured_root_key.clone(), captured_sum, None); + let op = QualifiedGroveDbOp::insert_only_known_to_not_already_exist_op( + vec![], + b"direct".to_vec(), + direct, + ); + db.apply_batch(vec![op], None, None, grove_version) + .unwrap() + .expect("batch direct-insert provable sum tree with captured root_key and sum"); + + let round_tripped = db + .get(&[] as &[&[u8]], b"direct", None, grove_version) + .unwrap() + .expect("get direct-inserted element"); + match round_tripped { + Element::ProvableSumTree(rk, s, _) => { + assert_eq!(rk, captured_root_key, "root_key must round-trip"); + assert_eq!(s, captured_sum, "sum must round-trip"); + } + other => panic!( + "expected ProvableSumTree after direct insert, got {:?}", + other + ), + } + } + + /// Bonus regression: NonCounted(ProvableSumTree) round-trips its + /// inner aggregate sum independently of the wrapper. + #[test] + fn non_counted_provable_sum_tree_round_trip_preserves_inner_sum() { + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + db.insert( + &[] as &[&[u8]], + b"ct", + Element::empty_count_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert count tree parent"); + + let nc = Element::new_non_counted(Element::empty_provable_sum_tree()).expect("wrap ok"); + db.insert( + &[b"ct".as_slice()], + b"nc_pst", + nc, + None, + None, + grove_version, + ) + .unwrap() + .expect("insert NonCounted(ProvableSumTree)"); + + db.insert( + &[b"ct".as_slice(), b"nc_pst".as_slice()], + b"item", + Element::new_sum_item(42), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert into nc_pst inner"); + + // Use `get_raw` to preserve the NonCounted wrapper. 
+ let wrapped = db + .get_raw( + [b"ct".as_slice()].as_ref().into(), + b"nc_pst", + None, + grove_version, + ) + .unwrap() + .expect("get_raw wrapped"); + match wrapped { + Element::NonCounted(inner) => match *inner { + Element::ProvableSumTree(_, s, _) => assert_eq!(s, 42), + other => panic!("expected ProvableSumTree, got {:?}", other), + }, + other => panic!("expected NonCounted, got {:?}", other), + } + } + + /// Integrity walk tests for `verify_grovedb`. + /// + /// `verify_grovedb` performs two kinds of check on every tree-bearing + /// element it walks: + /// + /// 1. A **cryptographic** check: + /// `combine_hash(value_hash(parent_bytes), inner_merk_root) == + /// stored_element_value_hash`. + /// + /// This catches every form of *byte-level* tampering: if any value + /// in the inner Merk is altered (and stored value_hash not also + /// fixed up), the inner Merk's root hash changes, and the parent's + /// binding hash no longer matches its stored + /// `element_value_hash`. For SumItems, tampering only the stored + /// value bytes (leaving the stored `value_hash` field alone) is + /// caught at the SumItem arm by `value_hash(bytes) != + /// stored_value_hash`. + /// + /// 2. A **software-consistency** check: + /// the parent's recorded aggregate field (e.g. `sum_value` in + /// `ProvableSumTree(_, sum_value, _)`) must equal the inner Merk's + /// actual `aggregate_data()`. + /// + /// This catches a class of bugs not visible to the crypto check: a + /// parent element whose stored bytes are *internally consistent* but + /// whose claimed aggregate disagrees with reality. + /// + /// The tests below exercise both, covering ProvableSumTree (the Phase + /// 1–3 feature) and ProvableCountTree (a sanity check that the new + /// general check works for all variants the helper covers). 
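The division of labor between the two checks can be modeled without any GroveDB types. The sketch below is illustrative only: `h` is a toy stand-in hash (GroveDB uses blake3 and 32-byte `CryptoHash`es), and `combine`, `ParentNode`, `crypto_check`, and `aggregate_check` are hypothetical names shaped after `combine_hash` and the verifier arms, not the real API. It shows why fixing up the hashes defeats check 1 but not check 2:

```rust
// Toy model of verify_grovedb's two checks. All types here are
// stand-ins (u64 hash instead of a 32-byte CryptoHash); only the
// shape of the argument matches the real code.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn h(data: &[u8]) -> u64 {
    let mut s = DefaultHasher::new();
    data.hash(&mut s);
    s.finish()
}

/// Analogue of `combine_hash`: binds the parent's value bytes to the
/// inner Merk's root hash.
fn combine(value_hash: u64, inner_root: u64) -> u64 {
    let mut buf = value_hash.to_be_bytes().to_vec();
    buf.extend_from_slice(&inner_root.to_be_bytes());
    h(&buf)
}

struct ParentNode {
    element_bytes: Vec<u8>, // serialized tree element (claims an aggregate)
    claimed_sum: i64,       // the aggregate those bytes claim
    stored_value_hash: u64, // combine(h(bytes), inner_root) at write time
}

/// Check 1 (cryptographic): recompute the binding from the current bytes.
fn crypto_check(p: &ParentNode, inner_root: u64) -> bool {
    combine(h(&p.element_bytes), inner_root) == p.stored_value_hash
}

/// Check 2 (consistency): claimed aggregate vs. the inner Merk's real one.
fn aggregate_check(p: &ParentNode, actual_sum: i64) -> bool {
    p.claimed_sum == actual_sum
}
```

A byte tamper that leaves `stored_value_hash` stale fails check 1; a tamper that also re-binds the stored hash passes check 1 but is caught by check 2, because the inner Merk's actual aggregate did not change.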
+ #[cfg(test)]
+ mod integrity_walk_tests {
+ use grovedb_merk::{
+ tree::{combine_hash, kv::ValueDefinedCostType, kv_digest_to_kv_hash, value_hash, TreeNode},
+ CryptoHash,
+ };
+ use grovedb_storage::{Storage, StorageContext};
+ use grovedb_version::version::GroveVersion;
+
+ use crate::{tests::make_empty_grovedb, Element};
+
+ // Helper: read raw TreeNode bytes for `key` from the prefixed
+ // storage at `path`, patch in `new_element` as the value bytes
+ // *without* updating the stored value_hash on the node, and
+ // write back via the immediate storage context.
+ //
+ // This simulates byte-level tampering of a leaf value (e.g. a
+ // SumItem) that leaves the stored value_hash stale. The
+ // verifier's value_hash check is expected to catch it.
+ fn tamper_value_no_hash_update(
+ db: &crate::GroveDb,
+ path: &[&[u8]],
+ key: &[u8],
+ new_element: &Element,
+ grove_version: &GroveVersion,
+ ) {
+ let tx = db.start_transaction();
+ let storage_ctx = db
+ .db
+ .get_immediate_storage_context(path.into(), &tx)
+ .unwrap();
+
+ let raw = storage_ctx
+ .get(key)
+ .unwrap()
+ .expect("storage_ctx get")
+ .expect("tampered key must exist on disk");
+
+ let mut tree_node = TreeNode::decode_raw(
+ &raw,
+ key.to_vec(),
+ None::<&fn(&[u8], &GroveVersion) -> Option<ValueDefinedCostType>>,
+ grove_version,
+ )
+ .expect("decode raw tree node");
+ let new_bytes = new_element
+ .serialize(grove_version)
+ .expect("serialize replacement element");
+ // `set_value` mutates only the value field; hash and
+ // value_hash on the KV are left untouched.
+ tree_node.set_value(new_bytes);
+ let encoded = tree_node.encode();
+
+ storage_ctx
+ .put(key, &encoded, None, None)
+ .unwrap()
+ .expect("put corrupted tree node");
+ db.commit_transaction(tx).unwrap().expect("commit tamper");
+ }
+
+ // Helper: rewrite the parent's stored tree element bytes to
+ // claim a *different* aggregate, AND fix up the stored
+ // hash/value_hash to remain consistent with the inner Merk's
+ // existing root_hash.
The inner Merk is untouched; only the + // parent's view of it changes. + // + // After this tamper, the cryptographic check (combine_hash of + // parent value_hash with inner Merk root_hash equals stored + // element_value_hash) passes, because we update the stored + // hashes to match the new bytes. The new aggregate-consistency + // check is expected to fire because the new bytes claim a sum + // (or count) that disagrees with the inner Merk's + // `aggregate_data()`. + // + // Implementation: + // + // TreeNodeInner encoding is: + // [option_byte u8] left_link? (variable) + // [option_byte u8] right_link? (variable) + // [feature_type encoding] (variable) + // [hash 32 bytes] + // [value_hash 32 bytes] + // [value bytes: rest] + // + // We use `TreeNode::decode_raw` to learn the original value + // length; the offset of `hash` is then `total_len - value_len + // - 64`. We splice in: + // + // raw[..hash_off] + new_kv_hash + new_value_hash + new_bytes + // + // where: + // new_value_hash = combine_hash(value_hash(new_bytes), + // inner_root_hash) + // new_kv_hash = kv_digest_to_kv_hash(key, new_value_hash) + // + // This matches what `KV::new_with_layered_value_hash` produces + // on a real insert (see `merk/src/tree/kv.rs`). 
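A toy version of that splice (plain byte vectors, a hypothetical `splice_value` helper, no real Merk encoding) makes the offset arithmetic concrete:

```rust
// Stand-in for the raw-node layout described above:
//   [links/feature prefix][hash: 32 bytes][value_hash: 32 bytes][value: rest]
// The prefix length is never parsed directly; it is inferred from the
// total length minus the old value length minus the two 32-byte hashes.
fn splice_value(
    raw: &[u8],
    old_value_len: usize,
    new_kv_hash: [u8; 32],
    new_value_hash: [u8; 32],
    new_value: &[u8],
) -> Vec<u8> {
    // The hash field starts 64 bytes before the value; the value is the tail.
    let hash_off = raw.len() - old_value_len - 32 - 32;
    let mut out = Vec::with_capacity(hash_off + 64 + new_value.len());
    out.extend_from_slice(&raw[..hash_off]); // links/feature prefix, untouched
    out.extend_from_slice(&new_kv_hash);
    out.extend_from_slice(&new_value_hash);
    out.extend_from_slice(new_value);
    out
}
```

Note the new value may be any length: because the value is the encoding's tail, no length field elsewhere needs patching in this model.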
+ fn tamper_parent_element_with_consistent_hashes(
+ db: &crate::GroveDb,
+ path: &[&[u8]],
+ key: &[u8],
+ new_element: &Element,
+ inner_root_hash: CryptoHash,
+ grove_version: &GroveVersion,
+ ) {
+ let tx = db.start_transaction();
+ let storage_ctx = db
+ .db
+ .get_immediate_storage_context(path.into(), &tx)
+ .unwrap();
+
+ let raw = storage_ctx
+ .get(key)
+ .unwrap()
+ .expect("storage_ctx get")
+ .expect("tampered key must exist on disk");
+
+ let decoded = TreeNode::decode_raw(
+ &raw,
+ key.to_vec(),
+ None::<&fn(&[u8], &GroveVersion) -> Option<ValueDefinedCostType>>,
+ grove_version,
+ )
+ .expect("decode raw tree node");
+
+ let original_value_len = decoded.value_as_slice().len();
+ let total_len = raw.len();
+ let hash_off = total_len - original_value_len - 32 - 32;
+
+ let new_bytes = new_element
+ .serialize(grove_version)
+ .expect("serialize replacement element");
+ let raw_value_hash = value_hash(&new_bytes).unwrap();
+ let new_combined_value_hash = combine_hash(&raw_value_hash, &inner_root_hash).unwrap();
+ let new_kv_hash = kv_digest_to_kv_hash(key, &new_combined_value_hash).unwrap();
+
+ let mut new_raw = Vec::with_capacity(hash_off + 64 + new_bytes.len());
+ new_raw.extend_from_slice(&raw[..hash_off]);
+ new_raw.extend_from_slice(&new_kv_hash);
+ new_raw.extend_from_slice(&new_combined_value_hash);
+ new_raw.extend_from_slice(&new_bytes);
+
+ storage_ctx
+ .put(key, &new_raw, None, None)
+ .unwrap()
+ .expect("put consistently-rebound tampered tree node");
+ db.commit_transaction(tx).unwrap().expect("commit tamper");
+ }
+
+ // ==============================================================
+ // Test 1: cryptographic tampering of an inner SumItem is
+ // caught by verify_grovedb.
+ // ==============================================================
+ #[test]
+ fn verify_grovedb_catches_inner_sum_item_value_tamper() {
+ let grove_version = GroveVersion::latest();
+ let db = make_empty_grovedb();
+
+ db.insert(
+ &[] as &[&[u8]],
+ b"psum",
+ Element::empty_provable_sum_tree(),
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert provable sum tree");
+
+ for (k, v) in [(b"a".as_slice(), 7i64), (b"b", 13), (b"c", 20)] {
+ db.insert(
+ &[b"psum".as_slice()],
+ k,
+ Element::new_sum_item(v),
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert sum item");
+ }
+
+ // Sanity: clean tree verifies clean.
+ let issues = db
+ .verify_grovedb(None, true, false, grove_version)
+ .expect("verify clean");
+ assert!(
+ issues.is_empty(),
+ "clean tree should verify clean, got: {:?}",
+ issues
+ );
+
+ // Tamper: rewrite SumItem(b"a") -> different SumItem WITHOUT
+ // updating the stored value_hash. The SumItem arm of the
+ // verifier reads stored value_hash and compares against
+ // value_hash(bytes); the comparison must now fail.
+ tamper_value_no_hash_update(
+ &db,
+ &[b"psum"],
+ b"a",
+ &Element::new_sum_item(99),
+ grove_version,
+ );
+
+ let issues = db
+ .verify_grovedb(None, true, false, grove_version)
+ .expect("verify tampered");
+ // Expect exactly the tampered path to be reported.
+ assert!(
+ !issues.is_empty(),
+ "expected verify_grovedb to detect inner SumItem tamper"
+ );
+ let tampered_path: Vec<Vec<u8>> = vec![b"psum".to_vec(), b"a".to_vec()];
+ assert!(
+ issues.contains_key(&tampered_path),
+ "expected issue at tampered path {:?}, got: {:?}",
+ tampered_path,
+ issues
+ );
+ }
+
+ // ==============================================================
+ // Test 2: cryptographic tampering of an inner SumItem with a
+ // value the same length as the original is still caught (this
+ // is a sanity check that hashes — not lengths — are what get
+ // verified).
+ // ============================================================== + #[test] + fn verify_grovedb_catches_inner_sum_item_same_length_tamper() { + let grove_version = GroveVersion::latest(); + let db = make_empty_grovedb(); + + db.insert( + &[] as &[&[u8]], + b"psum", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert provable sum tree"); + + db.insert( + &[b"psum".as_slice()], + b"a", + Element::new_sum_item(7), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert sum item"); + + // SumItem(7) -> SumItem(8); same encoded length. + let old_bytes = Element::new_sum_item(7).serialize(grove_version).unwrap(); + let new_bytes = Element::new_sum_item(8).serialize(grove_version).unwrap(); + assert_eq!( + old_bytes.len(), + new_bytes.len(), + "same-length tamper requires equal serialized sizes" + ); + + tamper_value_no_hash_update( + &db, + &[b"psum"], + b"a", + &Element::new_sum_item(8), + grove_version, + ); + + let issues = db + .verify_grovedb(None, true, false, grove_version) + .expect("verify tampered"); + assert!( + !issues.is_empty(), + "expected verify_grovedb to detect same-length SumItem tamper" + ); + } + + // ============================================================== + // Test 3: the new aggregate-consistency check fires when the + // parent's stored sum_value disagrees with the inner Merk's + // actual aggregate, even though the cryptographic binding is + // still consistent. 
+ // ============================================================== + #[test] + fn verify_grovedb_catches_parent_aggregate_mismatch_provable_sum_tree() { + let grove_version = GroveVersion::latest(); + let db = make_empty_grovedb(); + + db.insert( + &[] as &[&[u8]], + b"psum", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert provable sum tree"); + + for (k, v) in [(b"a".as_slice(), 7i64), (b"b", 13), (b"c", 20)] { + db.insert( + &[b"psum".as_slice()], + k, + Element::new_sum_item(v), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert sum item"); + } + + // Read the inner Merk's actual root hash + the parent's + // current ProvableSumTree element to reuse the root key. + let parent = db + .get(&[] as &[&[u8]], b"psum", None, grove_version) + .unwrap() + .expect("get parent"); + let (root_key, _real_sum, flags) = match parent { + Element::ProvableSumTree(rk, s, f) => (rk, s, f), + other => panic!("expected ProvableSumTree, got {:?}", other), + }; + + // Compute the actual inner Merk root hash by opening the + // inner Merk and reading it. + let inner_root = { + let tx = db.start_transaction(); + let inner_merk = db + .open_transactional_merk_at_path( + [b"psum".as_slice()].as_ref().into(), + &tx, + None, + grove_version, + ) + .unwrap() + .expect("open inner merk"); + inner_merk.root_hash().unwrap() + }; + + // Craft a corrupted parent that claims sum=999 while the + // inner Merk actually sums to 40. + let corrupted_parent = Element::ProvableSumTree(root_key.clone(), 999, flags.clone()); + + tamper_parent_element_with_consistent_hashes( + &db, + &[], + b"psum", + &corrupted_parent, + inner_root, + grove_version, + ); + + let issues = db + .verify_grovedb(None, true, false, grove_version) + .expect("verify tampered"); + // The parent path should appear in issues because of the + // aggregate-consistency check. 
+ let tampered_path: Vec<Vec<u8>> = vec![b"psum".to_vec()];
+ assert!(
+ issues.contains_key(&tampered_path),
+ "expected aggregate-consistency issue at {:?}, got: {:?}",
+ tampered_path,
+ issues
+ );
+ }
+
+ // ==============================================================
+ // Test 4: clean ProvableSumTree verifies clean.
+ // ==============================================================
+ #[test]
+ fn verify_grovedb_clean_provable_sum_tree_reports_no_issues() {
+ let grove_version = GroveVersion::latest();
+ let db = make_empty_grovedb();
+
+ db.insert(
+ &[] as &[&[u8]],
+ b"psum",
+ Element::empty_provable_sum_tree(),
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert provable sum tree");
+
+ for (k, v) in [(b"a".as_slice(), 1i64), (b"b", -2), (b"c", 0), (b"d", 100)] {
+ db.insert(
+ &[b"psum".as_slice()],
+ k,
+ Element::new_sum_item(v),
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert sum item");
+ }
+
+ let issues = db
+ .verify_grovedb(None, true, false, grove_version)
+ .expect("verify");
+ assert!(
+ issues.is_empty(),
+ "clean ProvableSumTree should verify clean, got: {:?}",
+ issues
+ );
+ }
+
+ // ==============================================================
+ // Test 5: same general check works for ProvableCountTree.
+ // (One positive case + one aggregate-mismatch case.)
+ // ============================================================== + #[test] + fn verify_grovedb_clean_provable_count_tree_reports_no_issues() { + let grove_version = GroveVersion::latest(); + let db = make_empty_grovedb(); + + db.insert( + &[] as &[&[u8]], + b"pcount", + Element::empty_provable_count_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert provable count tree"); + + for k in [b"a".as_slice(), b"b", b"c", b"d", b"e"] { + db.insert( + &[b"pcount".as_slice()], + k, + Element::new_item(b"v".to_vec()), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert item"); + } + + let issues = db + .verify_grovedb(None, true, false, grove_version) + .expect("verify"); + assert!( + issues.is_empty(), + "clean ProvableCountTree should verify clean, got: {:?}", + issues + ); + } + + #[test] + fn verify_grovedb_catches_parent_aggregate_mismatch_provable_count_tree() { + let grove_version = GroveVersion::latest(); + let db = make_empty_grovedb(); + + db.insert( + &[] as &[&[u8]], + b"pcount", + Element::empty_provable_count_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert provable count tree"); + + for k in [b"a".as_slice(), b"b", b"c"] { + db.insert( + &[b"pcount".as_slice()], + k, + Element::new_item(b"v".to_vec()), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert item"); + } + + let parent = db + .get(&[] as &[&[u8]], b"pcount", None, grove_version) + .unwrap() + .expect("get parent"); + let (root_key, _real_count, flags) = match parent { + Element::ProvableCountTree(rk, c, f) => (rk, c, f), + other => panic!("expected ProvableCountTree, got {:?}", other), + }; + + let inner_root = { + let tx = db.start_transaction(); + let inner_merk = db + .open_transactional_merk_at_path( + [b"pcount".as_slice()].as_ref().into(), + &tx, + None, + grove_version, + ) + .unwrap() + .expect("open inner merk"); + inner_merk.root_hash().unwrap() + }; + + // Parent claims 9999 items; inner Merk actually has 3. 
+ let corrupted_parent =
+ Element::ProvableCountTree(root_key.clone(), 9999, flags.clone());
+ tamper_parent_element_with_consistent_hashes(
+ &db,
+ &[],
+ b"pcount",
+ &corrupted_parent,
+ inner_root,
+ grove_version,
+ );
+
+ let issues = db
+ .verify_grovedb(None, true, false, grove_version)
+ .expect("verify tampered");
+ let tampered_path: Vec<Vec<u8>> = vec![b"pcount".to_vec()];
+ assert!(
+ issues.contains_key(&tampered_path),
+ "expected aggregate-consistency issue at {:?}, got: {:?}",
+ tampered_path,
+ issues
+ );
+ }
+
+ // ==============================================================
+ // Test 6: reload-after-write determinism. Insert, drop the
+ // db handle, reopen, run verify_grovedb. Zero issues.
+ // ==============================================================
+ #[test]
+ fn verify_grovedb_persists_clean_across_reopen() {
+ let grove_version = GroveVersion::latest();
+ let tmp_dir = tempfile::TempDir::new().expect("temp dir");
+
+ {
+ let db = crate::GroveDb::open(tmp_dir.path()).expect("open db");
+ db.insert(
+ &[] as &[&[u8]],
+ b"psum",
+ Element::empty_provable_sum_tree(),
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert provable sum tree");
+ for (k, v) in [(b"a".as_slice(), 5i64), (b"b", 7), (b"c", 11)] {
+ db.insert(
+ &[b"psum".as_slice()],
+ k,
+ Element::new_sum_item(v),
+ None,
+ None,
+ grove_version,
+ )
+ .unwrap()
+ .expect("insert sum item");
+ }
+ } // db dropped here
+
+ // Reopen + verify.
+ let db = crate::GroveDb::open(tmp_dir.path()).expect("reopen db");
+ let issues = db
+ .verify_grovedb(None, true, false, grove_version)
+ .expect("verify after reopen");
+ assert!(
+ issues.is_empty(),
+ "freshly-reopened DB should verify clean, got: {:?}",
+ issues
+ );
+
+ // And the parent's stored aggregate sum is intact.
+ let parent = db + .get(&[] as &[&[u8]], b"psum", None, grove_version) + .unwrap() + .expect("get parent"); + assert_eq!(parent.as_provable_sum_tree_value().expect("psum value"), 23); + } + } +} diff --git a/grovedb/src/tests/reference_with_sum_item_tests.rs b/grovedb/src/tests/reference_with_sum_item_tests.rs index 0f3facc93..c3af56568 100644 --- a/grovedb/src/tests/reference_with_sum_item_tests.rs +++ b/grovedb/src/tests/reference_with_sum_item_tests.rs @@ -2122,4 +2122,325 @@ mod tests { other => panic!("expected resolved target Item, got {:?}", other), } } + + // ==================================================================== + // Crossover: Element::ReferenceWithSumItem × Element::ProvableSumTree + // -------------------------------------------------------------------- + // `ReferenceWithSumItem` (added in PR #667) and `ProvableSumTree` + // (added in this PR) were developed in parallel. They interact at one + // critical surface: a `ReferenceWithSumItem` inserted into a + // `ProvableSumTree` parent must propagate its explicit `sum_value` + // into the parent's CRYPTOGRAPHICALLY-BOUND aggregate sum (the sum + // that `node_hash_with_sum` bakes into every node hash, which makes + // `AggregateSumOnRange` proofs verifiable). Plain `SumTree` already + // had this exercised in `insert_in_sum_tree_aggregates_sum`; the + // tests below verify the same contract for the provable flavor and + // for the full proof round-trip. + // ==================================================================== + + /// Insert a `ReferenceWithSumItem` directly into a `ProvableSumTree` + /// parent. Its explicit sum_value must propagate into the parent's + /// running sum just like it does in a plain SumTree — but the binding + /// here is stronger: the sum is baked into the parent merk's node + /// hashes via `node_hash_with_sum`, so any forge attempt would + /// produce a different root hash. 
+ #[test] + fn insert_reference_with_sum_item_in_provable_sum_tree_aggregates_sum() { + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + db.insert( + [TEST_LEAF].as_ref(), + b"pst", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert provable sum tree"); + + insert_target_item( + &db, + [TEST_LEAF].as_ref(), + b"target", + b"target_payload", + grove_version, + ); + + // Reference-with-sum-item pointing at the target, carrying sum 50. + let ref_path = + ReferencePathType::AbsolutePathReference(vec![TEST_LEAF.to_vec(), b"target".to_vec()]); + let element = Element::new_reference_with_sum_item(ref_path, 50); + db.insert( + [TEST_LEAF, b"pst"].as_ref(), + b"link", + element, + None, + None, + grove_version, + ) + .unwrap() + .expect("insert ref-with-sum-item into provable sum tree"); + + // The parent ProvableSumTree's aggregate must equal the explicit + // sum carried on the link — independent of the resolved target. + let agg = open_merk_aggregate(&db, &[TEST_LEAF, b"pst"], grove_version); + assert_eq!( + agg, + AggregateData::ProvableSum(50), + "RWSI sum_value must propagate into ProvableSumTree's bound aggregate" + ); + } + + /// Multiple `ReferenceWithSumItem`s in the same `ProvableSumTree` + /// each independently contribute their sum_values. Verifies the + /// hash-bound aggregate adds them correctly (including negative + /// values). 
+ #[test] + fn multiple_reference_with_sum_items_in_provable_sum_tree_accumulate() { + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + db.insert( + [TEST_LEAF].as_ref(), + b"pst", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert provable sum tree"); + + insert_target_item(&db, [TEST_LEAF].as_ref(), b"target_a", b"a", grove_version); + insert_target_item(&db, [TEST_LEAF].as_ref(), b"target_b", b"b", grove_version); + + // Three RWSI links: 30 + (-8) + 100 = 122. Two of them point to + // the same target — sum is link-level, not target-level. + for (key, target, sum) in [ + (b"link_a".as_ref(), b"target_a".as_ref(), 30i64), + (b"link_b".as_ref(), b"target_b".as_ref(), -8), + (b"link_c".as_ref(), b"target_a".as_ref(), 100), + ] { + let ref_path = + ReferencePathType::AbsolutePathReference(vec![TEST_LEAF.to_vec(), target.to_vec()]); + db.insert( + [TEST_LEAF, b"pst"].as_ref(), + key, + Element::new_reference_with_sum_item(ref_path, sum), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert link"); + } + + let agg = open_merk_aggregate(&db, &[TEST_LEAF, b"pst"], grove_version); + assert_eq!(agg, AggregateData::ProvableSum(122)); + } + + /// Aggregate-sum proof round-trip: build a `ProvableSumTree` + /// populated entirely with `ReferenceWithSumItem` entries, run a + /// range query that covers a subset of them, prove + verify, and + /// assert that the verified sum is exactly the sum of the + /// link-carried weights inside the range. + /// + /// This is the strongest combination test: it exercises the merk + /// `aggregate_sum` prover/verifier, the GroveDB envelope walker, + /// the terminal-type gate (must accept ProvableSumTree of RWSI), + /// and the `node_hash_with_sum` binding all at once. 
+ #[test] + fn aggregate_sum_on_range_over_reference_with_sum_item_in_provable_sum_tree() { + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + db.insert( + [TEST_LEAF].as_ref(), + b"pst", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert provable sum tree"); + + // Target the references point at. Sum on the link, NOT on the + // target, so the target itself is just a plain Item. + insert_target_item( + &db, + [TEST_LEAF].as_ref(), + b"target", + b"payload", + grove_version, + ); + let ref_path = + ReferencePathType::AbsolutePathReference(vec![TEST_LEAF.to_vec(), b"target".to_vec()]); + + // 15 RWSI links keyed a..=o with weights 1..=15 — same shape as + // setup_15_key_provable_sum_tree in aggregate_sum_query_tests but + // with RWSI in place of plain SumItem. + for (i, c) in (b'a'..=b'o').enumerate() { + let weight = (i as i64) + 1; + db.insert( + [TEST_LEAF, b"pst"].as_ref(), + &[c], + Element::new_reference_with_sum_item(ref_path.clone(), weight), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert RWSI link"); + } + + let root_hash = db + .grove_db + .root_hash(None, grove_version) + .unwrap() + .expect("root hash"); + + // Sub-range c..=l: weights 3 + 4 + ... + 12 = 75. 
+ let path_query = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"pst".to_vec()], + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + ); + let proof = db + .grove_db + .prove_query(&path_query, None, grove_version) + .unwrap() + .expect("prove_query should succeed"); + let (verified_root, verified_sum) = + GroveDb::verify_aggregate_sum_query(&proof, &path_query, grove_version) + .expect("verify_aggregate_sum_query should succeed"); + assert_eq!(verified_root, root_hash); + assert_eq!( + verified_sum, 75, + "verified sum must equal the sum of in-range RWSI link weights" + ); + } + + /// Negative-sum proof round-trip: a single RWSI with a negative + /// weight in a `ProvableSumTree` must verify to that exact negative + /// sum. Confirms the i64-signed contract on the verifier side + /// works for RWSI-carried weights. + #[test] + fn aggregate_sum_handles_negative_weight_from_reference_with_sum_item() { + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + db.insert( + [TEST_LEAF].as_ref(), + b"pst", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert pst"); + insert_target_item(&db, [TEST_LEAF].as_ref(), b"t", b"v", grove_version); + + let ref_path = + ReferencePathType::AbsolutePathReference(vec![TEST_LEAF.to_vec(), b"t".to_vec()]); + db.insert( + [TEST_LEAF, b"pst"].as_ref(), + b"link", + Element::new_reference_with_sum_item(ref_path, -42), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert RWSI with negative weight"); + + let path_query = PathQuery::new_aggregate_sum_on_range( + vec![TEST_LEAF.to_vec(), b"pst".to_vec()], + QueryItem::RangeFrom(b"a".to_vec()..), + ); + let proof = db + .grove_db + .prove_query(&path_query, None, grove_version) + .unwrap() + .expect("prove"); + let (_root, sum) = GroveDb::verify_aggregate_sum_query(&proof, &path_query, grove_version) + .expect("verify"); + assert_eq!(sum, -42); + } + + /// 
`NonCounted(ReferenceWithSumItem)` is REJECTED at insert time + /// when the parent is a `ProvableSumTree` — the `NonCounted` + /// wrapper's contract is "may only be inserted into count-bearing + /// trees", and `ProvableSumTree` is sum-only, not count-bearing. + /// This is the same rule that rejects `NonCounted(RWSI)` from + /// `SumTree`/`BigSumTree`; it's just easy to forget when thinking + /// about RWSI specifically because the wrapper conceptually does + /// nothing here (no count to suppress). + /// + /// If a caller wants a sum-only contribution they should insert the + /// bare `RWSI` directly — that's exactly what the + /// `insert_reference_with_sum_item_in_provable_sum_tree_aggregates_sum` + /// test above covers. + #[test] + fn non_counted_reference_with_sum_item_rejected_in_provable_sum_tree() { + let grove_version = GroveVersion::latest(); + let db = make_test_grovedb(grove_version); + + db.insert( + [TEST_LEAF].as_ref(), + b"pst", + Element::empty_provable_sum_tree(), + None, + None, + grove_version, + ) + .unwrap() + .expect("insert pst"); + insert_target_item(&db, [TEST_LEAF].as_ref(), b"t", b"v", grove_version); + + let ref_path = + ReferencePathType::AbsolutePathReference(vec![TEST_LEAF.to_vec(), b"t".to_vec()]); + let rwsi = Element::new_reference_with_sum_item(ref_path, 77); + let wrapped = Element::new_non_counted(rwsi).expect("NonCounted wrap accepted"); + + let err = db + .insert( + [TEST_LEAF, b"pst"].as_ref(), + b"link", + wrapped, + None, + None, + grove_version, + ) + .unwrap() + .expect_err("NonCounted parent-type guard must reject"); + let msg = format!("{err:?}"); + assert!( + msg.contains("count-bearing") + || msg.contains("non-counted") + || msg.contains("NonCounted"), + "expected NonCounted parent-type guard error, got: {msg}" + ); + } + + /// `NotCountedOrSummed` may only wrap sum-BEARING tree variants + /// (SumTree/BigSumTree/CountSumTree/ProvableCountSumTree/ + /// ProvableSumTree). 
`ReferenceWithSumItem` is a reference — not a
+    /// tree — so the constructor must reject it. PR #667 already covers
+    /// the `NotSummed` rejection in `new_not_summed_rejects_reference_with_sum_item`;
+    /// this is the matching parity test for the `NotCountedOrSummed`
+    /// wrapper rule.
+    #[test]
+    fn new_not_counted_or_summed_rejects_reference_with_sum_item() {
+        let rwsi = Element::new_reference_with_sum_item(
+            ReferencePathType::SiblingReference(b"k".to_vec()),
+            10,
+        );
+        assert!(
+            Element::new_not_counted_or_summed(rwsi).is_err(),
+            "NotCountedOrSummed must reject ReferenceWithSumItem inner — only \
+             sum-bearing tree variants are allowed"
+        );
+    }
 }
diff --git a/grovedb/src/tests/trunk_proof_tests.rs b/grovedb/src/tests/trunk_proof_tests.rs
index da3af87a9..180bd5fca 100644
--- a/grovedb/src/tests/trunk_proof_tests.rs
+++ b/grovedb/src/tests/trunk_proof_tests.rs
@@ -1911,6 +1911,80 @@ mod tests {
         assert!(result.is_err(), "should reject KVRefValueHash node");
     }
 
+    /// Defense-in-depth sibling of the KVRefValueHash rejection test for
+    /// the KVRefValueHashSum variant. KVRefValueHashSum carries an opaque
+    /// combined hash (the combine_hash of the node's value hash and the
+    /// referenced value hash) that the trunk verifier cannot recompute, so
+    /// a forged value bundled into such a node must be rejected by the
+    /// trunk extractor regardless of whether the merk hash chain catches
+    /// it first. Mirrors `test_trunk_proof_v1_rejects_kv_ref_value_hash_node`.
+    #[test]
+    fn test_trunk_proof_v1_rejects_kv_ref_value_hash_sum_node() {
+        let grove_version = GroveVersion::latest();
+        let (proof_v1, query, _) = make_single_level_v1_proof();
+
+        let target_layer = proof_v1
+            .root_layer
+            .lower_layers
+            .get(b"cst".as_slice())
+            .expect("should have cst layer");
+        let merk_bytes = match &target_layer.merk_proof {
+            ProofBytes::Merk(bytes) => bytes.clone(),
+            _ => panic!("expected Merk"),
+        };
+
+        let ops: Vec<Op> = Decoder::new(&merk_bytes)
+            .collect::<Result<Vec<_>, _>>()
+            .expect("decode ops");
+
+        // Replace a KV node with KVRefValueHashSum bundling a forged value
+        // and a placeholder opaque hash + sum.
+        let mut tampered_ops: Vec<Op> = Vec::new();
+        let mut found_kv = false;
+        for op in &ops {
+            match op {
+                Op::Push(Node::KV(key, _value)) if !found_kv => {
+                    let forged_element = Element::new_sum_item(9999);
+                    let forged_value = forged_element.serialize(grove_version).expect("serialize");
+                    let fake_hash = [0xAB; 32];
+                    tampered_ops.push(Op::Push(Node::KVRefValueHashSum(
+                        key.clone(),
+                        forged_value,
+                        fake_hash,
+                        42,
+                    )));
+                    found_kv = true;
+                }
+                other => tampered_ops.push(other.clone()),
+            }
+        }
+        assert!(found_kv, "should have found a KV node to replace");
+
+        let mut tampered_merk = Vec::new();
+        encode_into(tampered_ops.iter(), &mut tampered_merk);
+
+        let mut tampered_v1 = proof_v1;
+        tampered_v1
+            .root_layer
+            .lower_layers
+            .get_mut(b"cst".as_slice())
+            .unwrap()
+            .merk_proof = ProofBytes::Merk(tampered_merk);
+
+        let config = bincode::config::standard()
+            .with_big_endian()
+            .with_no_limit();
+        let tampered_proof =
+            bincode::encode_to_vec(&GroveDBProof::V1(tampered_v1), config).expect("encode");
+
+        let result = GroveDb::verify_trunk_chunk_proof(&tampered_proof, &query, grove_version);
+        // Either the merk hash chain catches the tag swap first or the
+        // trunk extractor's explicit rejection of KVRefValueHashSum nodes
+        // (mirroring KVRefValueHash and KVRefValueHashCount) catches it.
+        // Both are acceptable rejection paths.
+        assert!(result.is_err(), "should reject KVRefValueHashSum node");
+    }
+
     /// Injecting a KVValueHashFeatureTypeWithChildHash node with a forged
     /// value should be caught by the combine_hash check.
     #[test]
diff --git a/grovedbg-types/src/lib.rs b/grovedbg-types/src/lib.rs
index cb9e1e693..12dfbb93b 100644
--- a/grovedbg-types/src/lib.rs
+++ b/grovedbg-types/src/lib.rs
@@ -162,6 +162,13 @@ pub enum Element {
         #[serde_as(as = "Option<Base64>")]
         element_flags: Option<Vec<u8>>,
     },
+    ProvableSumTree {
+        #[serde_as(as = "Option<Base64>")]
+        root_key: Option<Vec<u8>>,
+        sum: i64,
+        #[serde_as(as = "Option<Base64>")]
+        element_flags: Option<Vec<u8>>,
+    },
     Item {
         #[serde_as(as = "Base64")]
         value: Vec<u8>,
@@ -320,6 +327,11 @@ pub enum TreeFeatureType {
     CountedSummedMerkNode(u64, i64),
     ProvableCountedMerkNode(u64),
     ProvableCountedSummedMerkNode(u64, i64),
+    /// Provable sum node: sum included in node hash. Mirrors
+    /// `SummedMerkNode` for serialization; the debugger renders both
+    /// identically (the on-the-wire distinction is by node hash, not by
+    /// serialization shape).
+    ProvableSummedMerkNode(i64),
 }
 
 #[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
diff --git a/merk/src/element/costs.rs b/merk/src/element/costs.rs
index 053547ebd..06e0d3888 100644
--- a/merk/src/element/costs.rs
+++ b/merk/src/element/costs.rs
@@ -72,6 +72,10 @@ impl ElementCostPrivateExtensions for Element {
             Element::CountSumTree(..) => Ok(COUNT_SUM_TREE_COST_SIZE),
             Element::ProvableCountTree(..) => Ok(COUNT_TREE_COST_SIZE),
             Element::ProvableCountSumTree(..) => Ok(COUNT_SUM_TREE_COST_SIZE),
+            // ProvableSumTree has the same on-disk layout as SumTree:
+            // (Option<Vec<u8>>, i64, Option<Vec<u8>>). It uses the same
+            // SUM_TREE_COST_SIZE.
+            Element::ProvableSumTree(..)
=> Ok(SUM_TREE_COST_SIZE), Element::NonCounted(inner) | Element::NotSummed(inner) | Element::NotCountedOrSummed(inner) => { @@ -191,6 +195,17 @@ impl ElementCostExtensions for Element { key_len, value_len, node_type, ) } + Element::ProvableSumTree(_, _sum_value, flags) => { + let flags_len = flags.map_or(0, |flags| { + let flags_len = flags.len() as u32; + flags_len + flags_len.required_space() as u32 + }); + let value_len = SUM_TREE_COST_SIZE + flags_len + wrapper_overhead; + let key_len = key.len() as u32; + KV::layered_value_byte_cost_size_for_key_and_value_lengths( + key_len, value_len, node_type, + ) + } Element::CommitmentTree(_, _, flags) => { let flags_len = flags.map_or(0, |flags| { let flags_len = flags.len() as u32; @@ -305,6 +320,7 @@ impl ElementCostExtensions for Element { | Element::CountSumTree(..) | Element::ProvableCountTree(..) | Element::ProvableCountSumTree(..) + | Element::ProvableSumTree(..) | Element::CommitmentTree(..) | Element::MmrTree(..) | Element::BulkAppendTree(..) @@ -335,6 +351,7 @@ impl ElementCostExtensions for Element { Element::CountSumTree(..) => Some(LayeredValueDefinedCost(cost)), Element::ProvableCountTree(..) => Some(LayeredValueDefinedCost(cost)), Element::ProvableCountSumTree(..) => Some(LayeredValueDefinedCost(cost)), + Element::ProvableSumTree(..) => Some(LayeredValueDefinedCost(cost)), Element::SumItem(..) => Some(SpecializedValueDefinedCost(cost)), Element::ItemWithSumItem(item, ..) 
=> { let item_len = item.len() as u32; diff --git a/merk/src/element/delete.rs b/merk/src/element/delete.rs index b8ef69db4..4fe29c3c7 100644 --- a/merk/src/element/delete.rs +++ b/merk/src/element/delete.rs @@ -68,6 +68,7 @@ impl ElementDeleteFromStorageExtensions for Element { | (TreeType::CountSumTree, true) | (TreeType::ProvableCountTree, true) | (TreeType::ProvableCountSumTree, true) + | (TreeType::ProvableSumTree, true) | (TreeType::CommitmentTree(_), true) | (TreeType::MmrTree, true) | (TreeType::BulkAppendTree(_), true) @@ -80,6 +81,7 @@ impl ElementDeleteFromStorageExtensions for Element { | (TreeType::CountSumTree, false) | (TreeType::ProvableCountTree, false) | (TreeType::ProvableCountSumTree, false) + | (TreeType::ProvableSumTree, false) | (TreeType::CommitmentTree(_), false) | (TreeType::MmrTree, false) | (TreeType::BulkAppendTree(_), false) @@ -138,6 +140,7 @@ impl ElementDeleteFromStorageExtensions for Element { | (TreeType::CountSumTree, true) | (TreeType::ProvableCountTree, true) | (TreeType::ProvableCountSumTree, true) + | (TreeType::ProvableSumTree, true) | (TreeType::CommitmentTree(_), true) | (TreeType::MmrTree, true) | (TreeType::BulkAppendTree(_), true) @@ -150,6 +153,7 @@ impl ElementDeleteFromStorageExtensions for Element { | (TreeType::CountSumTree, false) | (TreeType::ProvableCountTree, false) | (TreeType::ProvableCountSumTree, false) + | (TreeType::ProvableSumTree, false) | (TreeType::CommitmentTree(_), false) | (TreeType::MmrTree, false) | (TreeType::BulkAppendTree(_), false) @@ -204,6 +208,7 @@ impl ElementDeleteFromStorageExtensions for Element { | (TreeType::CountSumTree, true) | (TreeType::ProvableCountTree, true) | (TreeType::ProvableCountSumTree, true) + | (TreeType::ProvableSumTree, true) | (TreeType::CommitmentTree(_), true) | (TreeType::MmrTree, true) | (TreeType::BulkAppendTree(_), true) @@ -216,6 +221,7 @@ impl ElementDeleteFromStorageExtensions for Element { | (TreeType::CountSumTree, false) | (TreeType::ProvableCountTree, 
false) | (TreeType::ProvableCountSumTree, false) + | (TreeType::ProvableSumTree, false) | (TreeType::CommitmentTree(_), false) | (TreeType::MmrTree, false) | (TreeType::BulkAppendTree(_), false) diff --git a/merk/src/element/get.rs b/merk/src/element/get.rs index 00d8750da..e651e4e5d 100644 --- a/merk/src/element/get.rs +++ b/merk/src/element/get.rs @@ -460,6 +460,7 @@ impl ElementFetchFromStoragePrivateExtensions for Element { | Some(Element::CountSumTree(.., flags)) | Some(Element::ProvableCountTree(_, _, flags)) | Some(Element::ProvableCountSumTree(.., flags)) + | Some(Element::ProvableSumTree(_, _, flags)) | Some(Element::CommitmentTree(_, _, flags)) | Some(Element::MmrTree(_, flags)) | Some(Element::BulkAppendTree(.., flags)) @@ -576,6 +577,7 @@ impl ElementFetchFromStoragePrivateExtensions for Element { | Element::CountSumTree(.., flags) | Element::ProvableCountTree(_, _, flags) | Element::ProvableCountSumTree(.., flags) + | Element::ProvableSumTree(_, _, flags) | Element::CommitmentTree(_, _, flags) | Element::MmrTree(_, flags) | Element::BulkAppendTree(.., flags) diff --git a/merk/src/element/reconstruct.rs b/merk/src/element/reconstruct.rs index 3fa4146b2..e6845080d 100644 --- a/merk/src/element/reconstruct.rs +++ b/merk/src/element/reconstruct.rs @@ -58,6 +58,11 @@ impl ElementReconstructExtensions for Element { aggregate_data.as_sum_i64(), f.clone(), )), + Element::ProvableSumTree(.., f) => Some(Element::ProvableSumTree( + maybe_root_key, + aggregate_data.as_sum_i64(), + f.clone(), + )), Element::CommitmentTree(tc, cp, f) => { Some(Element::CommitmentTree(*tc, *cp, f.clone())) } diff --git a/merk/src/element/tree_type.rs b/merk/src/element/tree_type.rs index 98b5d52e5..7360551a5 100644 --- a/merk/src/element/tree_type.rs +++ b/merk/src/element/tree_type.rs @@ -50,6 +50,7 @@ impl ElementTreeTypeExtensions for Element { Element::ProvableCountSumTree(root_key, ..) 
=> { Some((root_key, TreeType::ProvableCountSumTree)) } + Element::ProvableSumTree(root_key, ..) => Some((root_key, TreeType::ProvableSumTree)), Element::CommitmentTree(_, chunk_power, _) => { Some((None, TreeType::CommitmentTree(chunk_power))) } @@ -84,6 +85,7 @@ impl ElementTreeTypeExtensions for Element { Element::ProvableCountSumTree(root_key, ..) => { Some((root_key, TreeType::ProvableCountSumTree)) } + Element::ProvableSumTree(root_key, ..) => Some((root_key, TreeType::ProvableSumTree)), Element::CommitmentTree(_, chunk_power, _) => { Some((&NONE_ROOT_KEY, TreeType::CommitmentTree(*chunk_power))) } @@ -115,6 +117,7 @@ impl ElementTreeTypeExtensions for Element { Element::ProvableCountSumTree(.., flags) => { Some((flags, TreeType::ProvableCountSumTree)) } + Element::ProvableSumTree(_, _, flags) => Some((flags, TreeType::ProvableSumTree)), Element::CommitmentTree(_, chunk_power, flags) => { Some((flags, TreeType::CommitmentTree(*chunk_power))) } @@ -143,6 +146,7 @@ impl ElementTreeTypeExtensions for Element { Element::CountSumTree(..) => Some(TreeType::CountSumTree), Element::ProvableCountTree(..) => Some(TreeType::ProvableCountTree), Element::ProvableCountSumTree(..) => Some(TreeType::ProvableCountSumTree), + Element::ProvableSumTree(..) => Some(TreeType::ProvableSumTree), Element::CommitmentTree(_, chunk_power, _) => { Some(TreeType::CommitmentTree(*chunk_power)) } @@ -175,6 +179,9 @@ impl ElementTreeTypeExtensions for Element { Element::ProvableCountSumTree(_, count, sum, _) => { Some(TreeFeatureType::ProvableCountedSummedMerkNode(*count, *sum)) } + Element::ProvableSumTree(_, value, _) => { + Some(TreeFeatureType::ProvableSummedMerkNode(*value)) + } Element::CommitmentTree(..) => Some(BasicMerkNode), Element::MmrTree(..) => Some(BasicMerkNode), Element::BulkAppendTree(..) => Some(BasicMerkNode), @@ -197,6 +204,7 @@ impl ElementTreeTypeExtensions for Element { Element::CountSumTree(..) 
=> MaybeTree::Tree(TreeType::CountSumTree), Element::ProvableCountTree(..) => MaybeTree::Tree(TreeType::ProvableCountTree), Element::ProvableCountSumTree(..) => MaybeTree::Tree(TreeType::ProvableCountSumTree), + Element::ProvableSumTree(..) => MaybeTree::Tree(TreeType::ProvableSumTree), Element::CommitmentTree(_, chunk_power, _) => { MaybeTree::Tree(TreeType::CommitmentTree(*chunk_power)) } @@ -245,6 +253,14 @@ impl ElementTreeTypeExtensions for Element { TreeType::MmrTree => Ok(BasicMerkNode), TreeType::BulkAppendTree(_) => Ok(BasicMerkNode), TreeType::DenseAppendOnlyFixedSizeTree(_) => Ok(BasicMerkNode), + // ProvableSumTree aggregates an i64 sum (same arithmetic + // shape as plain SumTree) but carries it via + // `ProvableSummedMerkNode` so the sum is baked into every + // node's hash via `node_hash_with_sum` — making sum + // tampering catchable through proof verification. + TreeType::ProvableSumTree => Ok(TreeFeatureType::ProvableSummedMerkNode( + self.sum_value_or_default(), + )), } } } @@ -461,5 +477,138 @@ mod tests { } other => panic!("expected ProvableCountedSummedMerkNode, got {:?}", other), } + + // Sum-bearing parent: ProvableSumTree must also zero out the wrapped + // sum so the wrapper semantics stay consistent across the family. + // The sum-bearing branch uses the `ProvableSummedMerkNode(0)` + // feature type. + match ns.get_feature_type(TreeType::ProvableSumTree).unwrap() { + TreeFeatureType::ProvableSummedMerkNode(s) => assert_eq!(s, 0), + other => panic!("expected ProvableSummedMerkNode(0), got {:?}", other), + } + } + + // ------------------------------------------------------------------- + // Per-variant extension-method coverage: each tree-bearing variant + // has match arms in `root_key_and_tree_type`, `tree_flags_and_type`, + // `tree_type`, `tree_feature_type`, and `maybe_tree_type`. The + // existing tests above don't drive every variant directly; the + // following do. 
+ // ------------------------------------------------------------------- + + fn assert_provable_sum_tree_arms(e: &Element) { + let (rk, tt) = e.root_key_and_tree_type().expect("Some"); + assert!(rk.is_none()); + assert_eq!(tt, TreeType::ProvableSumTree); + + assert_eq!(e.tree_type(), Some(TreeType::ProvableSumTree)); + assert_eq!( + e.maybe_tree_type(), + MaybeTree::Tree(TreeType::ProvableSumTree) + ); + let (flags, tt) = e.tree_flags_and_type().expect("Some"); + assert_eq!(flags, e.get_flags()); + assert_eq!(tt, TreeType::ProvableSumTree); + + match e.tree_feature_type() { + Some(TreeFeatureType::ProvableSummedMerkNode(_)) => {} + other => panic!("expected ProvableSummedMerkNode, got {:?}", other), + } + } + + #[test] + fn provable_sum_tree_extension_arms_direct() { + // Directly drive every per-variant arm for ProvableSumTree without + // wrappers, covering the lines that the wrapper-delegation test + // can't reach. + let e = Element::ProvableSumTree(None, 42, Some(vec![9, 8])); + assert_provable_sum_tree_arms(&e); + } + + #[test] + fn commitment_tree_extension_arms_direct() { + // CommitmentTree carries a chunk_power that flows through every + // helper. Drive the per-variant arms directly. 
+ let chunk_power = 4u8; + let e = Element::CommitmentTree(0, chunk_power, Some(vec![1])); + + let (rk, tt) = e.root_key_and_tree_type().expect("Some"); + assert!(rk.is_none()); + assert_eq!(tt, TreeType::CommitmentTree(chunk_power)); + + assert_eq!(e.tree_type(), Some(TreeType::CommitmentTree(chunk_power))); + assert_eq!( + e.maybe_tree_type(), + MaybeTree::Tree(TreeType::CommitmentTree(chunk_power)) + ); + + let (flags, tt) = e.tree_flags_and_type().expect("Some"); + assert!(flags.is_some()); + assert_eq!(tt, TreeType::CommitmentTree(chunk_power)); + assert_eq!(e.tree_feature_type(), Some(BasicMerkNode)); + } + + #[test] + fn bulk_append_tree_extension_arms_direct() { + let chunk_power = 8u8; + let e = Element::BulkAppendTree(0, chunk_power, None); + + let (rk, tt) = e.root_key_and_tree_type().expect("Some"); + assert!(rk.is_none()); + assert_eq!(tt, TreeType::BulkAppendTree(chunk_power)); + + assert_eq!(e.tree_type(), Some(TreeType::BulkAppendTree(chunk_power))); + assert_eq!( + e.maybe_tree_type(), + MaybeTree::Tree(TreeType::BulkAppendTree(chunk_power)) + ); + let (flags, tt) = e.tree_flags_and_type().expect("Some"); + assert!(flags.is_none()); + assert_eq!(tt, TreeType::BulkAppendTree(chunk_power)); + assert_eq!(e.tree_feature_type(), Some(BasicMerkNode)); + } + + #[test] + fn dense_append_only_tree_extension_arms_direct() { + let height = 5u8; + let e = Element::DenseAppendOnlyFixedSizeTree(0, height, Some(vec![])); + + let (rk, tt) = e.root_key_and_tree_type().expect("Some"); + assert!(rk.is_none()); + assert_eq!(tt, TreeType::DenseAppendOnlyFixedSizeTree(height)); + + assert_eq!( + e.tree_type(), + Some(TreeType::DenseAppendOnlyFixedSizeTree(height)) + ); + assert_eq!( + e.maybe_tree_type(), + MaybeTree::Tree(TreeType::DenseAppendOnlyFixedSizeTree(height)) + ); + + let (flags, tt) = e.tree_flags_and_type().expect("Some"); + assert!(flags.is_some()); + assert_eq!(tt, TreeType::DenseAppendOnlyFixedSizeTree(height)); + assert_eq!(e.tree_feature_type(), 
Some(BasicMerkNode)); + } + + #[test] + fn mmr_tree_extension_arms_direct() { + let e = Element::MmrTree(0, None); + let (rk, tt) = e.root_key_and_tree_type().expect("Some"); + assert!(rk.is_none()); + assert_eq!(tt, TreeType::MmrTree); + + assert_eq!(e.tree_type(), Some(TreeType::MmrTree)); + assert_eq!(e.maybe_tree_type(), MaybeTree::Tree(TreeType::MmrTree)); + assert_eq!(e.tree_feature_type(), Some(BasicMerkNode)); + } + + #[test] + fn provable_sum_tree_through_not_summed_wrapper() { + // Drive the look-through arm for ProvableSumTree specifically. + let inner = Element::ProvableSumTree(None, 99, None); + let ns = Element::new_not_summed(inner).expect("wrap ok"); + assert_provable_sum_tree_arms(&ns); } } diff --git a/merk/src/merk/chunks.rs b/merk/src/merk/chunks.rs index 6e383ce08..70ff3c00b 100644 --- a/merk/src/merk/chunks.rs +++ b/merk/src/merk/chunks.rs @@ -490,6 +490,13 @@ mod test { // HashWithCount is hash-equivalent to Hash for the verifier; // count it under `hash` for the test counter. Node::HashWithCount(..) => counts.hash += 1, + // ProvableSumTree proof variants count under the same + // buckets as their structural Count counterparts. + Node::KVSum(..) => counts.kv += 1, + Node::KVHashSum(..) => counts.kv_hash += 1, + Node::KVDigestSum(..) => counts.kv_digest += 1, + Node::KVRefValueHashSum(..) => counts.kv_ref_value_hash += 1, + Node::HashWithSum(..) => counts.hash += 1, }; }); diff --git a/merk/src/merk/get.rs b/merk/src/merk/get.rs index 0950d184b..8f040ebdd 100644 --- a/merk/src/merk/get.rs +++ b/merk/src/merk/get.rs @@ -398,6 +398,53 @@ where } }) } + + /// Execute an `AggregateSumOnRange` query without producing a proof, + /// returning just the in-range signed sum. + /// + /// This is the no-proof counterpart of + /// [`Self::prove_aggregate_sum_on_range`]. 
It walks the same + /// classification path the proof emitter does — using each internal + /// node's stored aggregate sum to short-circuit Contained / Disjoint + /// subtrees — but skips the proof-op emission and serialization. The + /// merk-level cost is O(log n) in the number of distinct keys, the + /// same as the proof variant. + /// + /// The merk's `tree_type` must be `ProvableSumTree`; any other tree + /// type is rejected with `Error::InvalidProofError` before any + /// walking happens. On an empty merk this returns `sum = 0`. + /// + /// The accumulator carries `i128` end-to-end and narrows to `i64` at + /// the very last step (parallel to the prover and verifier). An + /// out-of-i64 result is treated as corruption — a real + /// `ProvableSumTree` maintains every aggregate as `i64` at every + /// level, so an out-of-range i128 result implies inconsistent tree + /// state. + /// + /// The returned sum is **not** independently verifiable — callers + /// trust the merk's reads. Use `prove_aggregate_sum_on_range` + + /// `verify_aggregate_sum_on_range_proof` for a verifiable sum. 
+    pub fn sum_aggregate_on_range(
+        &self,
+        inner_range: &QueryItem,
+        grove_version: &GroveVersion,
+    ) -> CostResult<i64, Error> {
+        let tree_type = self.tree_type;
+        if !matches!(tree_type, crate::TreeType::ProvableSumTree) {
+            return Err(Error::InvalidProofError(format!(
+                "AggregateSumOnRange is only valid against ProvableSumTree, got {:?}",
+                tree_type
+            )))
+            .wrap_with_cost(Default::default());
+        }
+        self.use_tree_mut(|maybe_tree| match maybe_tree {
+            None => Ok(0i64).wrap_with_cost(Default::default()),
+            Some(tree) => {
+                let mut ref_walker = RefWalker::new(tree, self.source());
+                ref_walker.sum_aggregate_on_range(inner_range, grove_version)
+            }
+        })
+    }
 }
 
 #[cfg(test)]
diff --git a/merk/src/merk/prove.rs b/merk/src/merk/prove.rs
index 151098cf8..18b5f3191 100644
--- a/merk/src/merk/prove.rs
+++ b/merk/src/merk/prove.rs
@@ -184,6 +184,39 @@ where
             }
         })
     }
+
+    /// Generate a sum-only proof for an `AggregateSumOnRange` query.
+    /// Mirror of [`Self::prove_aggregate_count_on_range`] for the
+    /// `ProvableSumTree` flavor.
+    ///
+    /// The merk's `tree_type` must be `ProvableSumTree`; any other tree type
+    /// is rejected with `Error::InvalidProofError` before any walking
+    /// happens. Empty merk: returns `(empty proof, sum = 0)`.
+    pub fn prove_aggregate_sum_on_range(
+        &self,
+        inner_range: &QueryItem,
+        grove_version: &GroveVersion,
+    ) -> CostResult<(LinkedList<Op>, i64), Error> {
+        let tree_type = self.tree_type;
+        if !matches!(tree_type, crate::TreeType::ProvableSumTree) {
+            return Err(Error::InvalidProofError(format!(
+                "AggregateSumOnRange is only valid against ProvableSumTree, got {:?}",
+                tree_type
+            )))
+            .wrap_with_cost(Default::default());
+        }
+        self.use_tree_mut(|maybe_tree| match maybe_tree {
+            None => Ok((LinkedList::new(), 0i64)).wrap_with_cost(Default::default()),
+            Some(tree) => {
+                let mut ref_walker = RefWalker::new(tree, self.source());
+                ref_walker.create_aggregate_sum_on_range_proof(
+                    inner_range,
+                    tree_type,
+                    grove_version,
+                )
+            }
+        })
+    }
 }
 
 type Proof = (LinkedList<Op>, Option<u16>);
diff --git a/merk/src/proofs/branch/mod.rs b/merk/src/proofs/branch/mod.rs
index 3d8f27e36..b7ab8e939 100644
--- a/merk/src/proofs/branch/mod.rs
+++ b/merk/src/proofs/branch/mod.rs
@@ -119,10 +119,16 @@ impl TrunkQueryResult {
             | Node::KVDigestCount(key, ..)
             | Node::KVRefValueHash(key, ..)
             | Node::KVCount(key, ..)
-            | Node::KVRefValueHashCount(key, ..) => Some(key.clone()),
-            Node::Hash(_) | Node::KVHash(_) | Node::KVHashCount(..) | Node::HashWithCount(..) => {
-                None
-            }
+            | Node::KVRefValueHashCount(key, ..)
+            | Node::KVSum(key, ..)
+            | Node::KVDigestSum(key, ..)
+            | Node::KVRefValueHashSum(key, ..) => Some(key.clone()),
+            Node::Hash(_)
+            | Node::KVHash(_)
+            | Node::KVHashCount(..)
+            | Node::HashWithCount(..)
+            | Node::KVHashSum(..)
+            | Node::HashWithSum(..) => None,
         }
     }
 
@@ -384,10 +390,16 @@ impl BranchQueryResult {
             | Node::KVDigestCount(key, ..)
             | Node::KVRefValueHash(key, ..)
             | Node::KVCount(key, ..)
-            | Node::KVRefValueHashCount(key, ..) => Some(key.clone()),
-            Node::Hash(_) | Node::KVHash(_) | Node::KVHashCount(..) | Node::HashWithCount(..) => {
-                None
-            }
+            | Node::KVRefValueHashCount(key, ..)
+            | Node::KVSum(key, ..)
+            | Node::KVDigestSum(key, ..)
+            | Node::KVRefValueHashSum(key, ..)
=> Some(key.clone()), + Node::Hash(_) + | Node::KVHash(_) + | Node::KVHashCount(..) + | Node::HashWithCount(..) + | Node::KVHashSum(..) + | Node::HashWithSum(..) => None, } } } diff --git a/merk/src/proofs/chunk/chunk.rs b/merk/src/proofs/chunk/chunk.rs index d188eb32e..8061d56e0 100644 --- a/merk/src/proofs/chunk/chunk.rs +++ b/merk/src/proofs/chunk/chunk.rs @@ -166,12 +166,14 @@ where match proof_node_type { ProofNodeType::Kv => self.to_kv_node(), ProofNodeType::KvCount => self.to_kv_count_node(), + ProofNodeType::KvSum => self.to_kv_sum_node(), ProofNodeType::KvValueHash => self.to_kv_value_hash_node(), ProofNodeType::KvValueHashFeatureType => self.to_kv_value_hash_feature_type_node(), // References: at merk level, generate same node type as non-ref counterpart // GroveDB will post-process if needed ProofNodeType::KvRefValueHash => self.to_kv_value_hash_node(), ProofNodeType::KvRefValueHashCount => self.to_kv_value_hash_feature_type_node(), + ProofNodeType::KvRefValueHashSum => self.to_kv_value_hash_feature_type_node(), } } diff --git a/merk/src/proofs/query/aggregate_common.rs b/merk/src/proofs/query/aggregate_common.rs new file mode 100644 index 000000000..b97f93ba2 --- /dev/null +++ b/merk/src/proofs/query/aggregate_common.rs @@ -0,0 +1,129 @@ +//! Shared helpers for `AggregateCountOnRange` and `AggregateSumOnRange` +//! provers/verifiers. +//! +//! Range-bound classification is independent of the aggregate flavor +//! (count vs. sum) — it depends only on how a subtree's possible-key +//! window overlaps the query's inner range. The same `SubtreeClassification` +//! and the same `classify_subtree` decision drive both aggregate variants +//! identically. Keeping a single source of truth here prevents drift +//! between the two paths and removes a malleability surface (one of the +//! verifiers accepting a subtree the other rejects, or vice versa). +//! +//! Items exported here are `pub(super)` so only the two aggregate modules +//! 
that live alongside this one can use them. + +#[cfg(any(feature = "minimal", feature = "verify"))] +use grovedb_query::QueryItem; + +#[cfg(any(feature = "minimal", feature = "verify"))] +use crate::CryptoHash; + +/// All-zero `CryptoHash`. Used as the placeholder for missing children +/// in `Node::HashWithCount` / `Node::HashWithSum` proof reconstruction. +#[cfg(any(feature = "minimal", feature = "verify"))] +pub(super) const NULL_HASH: CryptoHash = [0u8; 32]; + +/// How a subtree's possible-key window relates to the inner range we're +/// aggregating over. +#[cfg(any(feature = "minimal", feature = "verify"))] +#[derive(Clone, Copy, Debug, PartialEq, Eq)] +pub(super) enum SubtreeClassification { + /// Every possible key in this subtree falls **outside** the range. + Disjoint, + /// Every possible key in this subtree falls **inside** the range. + Contained, + /// The subtree straddles a range boundary (or directly contains one). + Boundary, +} + +/// Classify a subtree relative to the inner range. +/// +/// `subtree_lo_excl` and `subtree_hi_excl` are the **exclusive** bounds +/// on what keys can appear under the subtree (derived from ancestors +/// during the walk; both `None` at the root). The range bounds come +/// from the inner `QueryItem`'s `lower_bound` / `upper_bound`. +/// +/// The comparisons treat `subtree_hi_excl` as exclusive (subtree keys +/// are strictly < `subtree_hi_excl`) and `subtree_lo_excl` as exclusive +/// (subtree keys are strictly > `subtree_lo_excl`). For the range +/// bounds, the inclusivity flag returned by `lower_bound` / +/// `upper_bound` is **not** load-bearing for the disjoint/contained +/// tests below — see the inline proofs. 
+#[cfg(any(feature = "minimal", feature = "verify"))] +pub(super) fn classify_subtree( + subtree_lo_excl: Option<&[u8]>, + subtree_hi_excl: Option<&[u8]>, + range: &QueryItem, +) -> SubtreeClassification { + let (range_lo, _range_lo_excl) = range.lower_bound(); + let (range_hi, _range_hi_incl) = range.upper_bound(); + + // Disjoint-LEFT: subtree entirely below the range. + // + // Subtree keys are < subtree_hi_excl. If subtree_hi_excl <= range_lo, + // every subtree key < subtree_hi_excl <= range_lo is also < range_lo, + // so excluded regardless of whether range_lo is inclusive or exclusive. + if let (Some(s_hi), Some(r_lo)) = (subtree_hi_excl, range_lo) + && s_hi <= r_lo + { + return SubtreeClassification::Disjoint; + } + + // Disjoint-RIGHT: subtree entirely above the range. + // + // Subtree keys are > subtree_lo_excl. If subtree_lo_excl >= range_hi, + // every subtree key > subtree_lo_excl >= range_hi is also > range_hi, + // so excluded regardless of whether range_hi is inclusive or exclusive. + if let (Some(s_lo), Some(r_hi)) = (subtree_lo_excl, range_hi) + && s_lo >= r_hi + { + return SubtreeClassification::Disjoint; + } + + // Contained: subtree (s_lo, s_hi) ⊆ range. + // + // Lower side: every subtree key > s_lo. If s_lo >= r_lo, every subtree + // key > s_lo >= r_lo, so > r_lo, satisfying both inclusive and exclusive + // r_lo. If subtree has no lower bound (s_lo = -inf) but range does, the + // subtree could include arbitrarily small keys → not contained. + let lower_contained = match range_lo { + None => true, + Some(r_lo) => match subtree_lo_excl { + Some(s_lo) => s_lo >= r_lo, + None => false, + }, + }; + // Upper side: every subtree key < s_hi. If s_hi <= r_hi, every subtree + // key < s_hi <= r_hi, so < r_hi, satisfying both inclusive and exclusive + // r_hi. (We forgo the slightly tighter "s_hi <= r_hi+1" optimization for + // inclusive r_hi because we don't have key arithmetic.) 
+ let upper_contained = match range_hi { + None => true, + Some(r_hi) => match subtree_hi_excl { + Some(s_hi) => s_hi <= r_hi, + None => false, + }, + }; + + if lower_contained && upper_contained { + SubtreeClassification::Contained + } else { + SubtreeClassification::Boundary + } +} + +/// Returns true when `key` lies strictly between the exclusive bounds +/// `(lo, hi)`, where `None` represents `-inf` / `+inf`. +/// +/// Used at every `Boundary` node during the shape walk to verify that a +/// `KVDigest{Count,Sum}` carries a key consistent with its inherited +/// subtree window. Without this check, a forged proof could place a +/// boundary key outside the window its ancestors implied, and the +/// classification logic would silently miscount/misadd children that +/// don't actually exist at that position in the tree. +#[cfg(any(feature = "minimal", feature = "verify"))] +pub(super) fn key_strictly_inside(key: &[u8], lo: Option<&[u8]>, hi: Option<&[u8]>) -> bool { + let lo_ok = lo.is_none_or(|l| key > l); + let hi_ok = hi.is_none_or(|h| key < h); + lo_ok && hi_ok +} diff --git a/merk/src/proofs/query/aggregate_count.rs b/merk/src/proofs/query/aggregate_count.rs deleted file mode 100644 index 7e5944ca5..000000000 --- a/merk/src/proofs/query/aggregate_count.rs +++ /dev/null @@ -1,2113 +0,0 @@ -//! Proof generation and verification for `AggregateCountOnRange` queries. -//! -//! This module implements the count-only proof shape described in the GroveDB -//! book chapter "Aggregate Count Queries". It is intentionally **separate** -//! from `create_proof_internal`: regular proofs always descend into a queried -//! subtree, but count proofs *stop* at fully-inside subtree roots and emit a -//! single `HashWithCount` op for the entire collapsed subtree. -//! -//! The proof targets a `ProvableCountTree` or `ProvableCountSumTree` (or -//! their `NonCounted*` wrapper variants — wrappers only affect whether the -//! 
tree contributes to its parent's count, not its own internal count -//! mechanics). On any other tree type the entry point returns -//! `Error::InvalidProofError`. - -#[cfg(feature = "minimal")] -use std::collections::LinkedList; - -use grovedb_costs::{ - cost_return_on_error, cost_return_on_error_no_add, CostResult, CostsExt, OperationCost, -}; -#[cfg(feature = "minimal")] -use grovedb_version::version::GroveVersion; - -#[cfg(feature = "minimal")] -use crate::{ - proofs::Op, - tree::{kv::ValueDefinedCostType, AggregateData, Fetch, RefWalker}, - TreeType, -}; -use crate::{ - proofs::{ - query::QueryItem, - tree::{execute_with_options, Tree as ProofTree}, - Decoder, Node, - }, - CryptoHash, Error, -}; - -/// All-zero `CryptoHash`, used in `Node::HashWithCount` for missing children. -const NULL_HASH: CryptoHash = [0u8; 32]; - -/// How a subtree's possible-key window relates to the inner range we're -/// counting over. -#[derive(Clone, Copy, Debug, PartialEq, Eq)] -enum SubtreeClassification { - /// Every possible key in this subtree falls **outside** the range. - Disjoint, - /// Every possible key in this subtree falls **inside** the range. - Contained, - /// The subtree straddles a range boundary (or directly contains one). - Boundary, -} - -/// Classify a subtree relative to the inner range. -/// -/// `subtree_lo_excl` and `subtree_hi_excl` are the **exclusive** bounds on -/// what keys can appear under the subtree (derived from ancestors during the -/// walk; both `None` at the root). The range bounds come from the inner -/// `QueryItem`'s `lower_bound` / `upper_bound`. -/// -/// The comparisons treat `subtree_hi_excl` as exclusive (subtree keys are -/// strictly < `subtree_hi_excl`) and `subtree_lo_excl` as exclusive (subtree -/// keys are strictly > `subtree_lo_excl`). For the range bounds, the -/// inclusivity flag returned by `lower_bound`/`upper_bound` is **not** -/// load-bearing for the disjoint/contained tests below — see the inline -/// proofs. 
-fn classify_subtree( - subtree_lo_excl: Option<&[u8]>, - subtree_hi_excl: Option<&[u8]>, - range: &QueryItem, -) -> SubtreeClassification { - let (range_lo, _range_lo_excl) = range.lower_bound(); - let (range_hi, _range_hi_incl) = range.upper_bound(); - - // Disjoint-LEFT: subtree entirely below the range. - // - // Subtree keys are < subtree_hi_excl. If subtree_hi_excl <= range_lo, - // every subtree key < subtree_hi_excl <= range_lo is also < range_lo, - // so excluded regardless of whether range_lo is inclusive or exclusive. - if let (Some(s_hi), Some(r_lo)) = (subtree_hi_excl, range_lo) - && s_hi <= r_lo - { - return SubtreeClassification::Disjoint; - } - - // Disjoint-RIGHT: subtree entirely above the range. - // - // Subtree keys are > subtree_lo_excl. If subtree_lo_excl >= range_hi, - // every subtree key > subtree_lo_excl >= range_hi is also > range_hi, - // so excluded regardless of whether range_hi is inclusive or exclusive. - if let (Some(s_lo), Some(r_hi)) = (subtree_lo_excl, range_hi) - && s_lo >= r_hi - { - return SubtreeClassification::Disjoint; - } - - // Contained: subtree (s_lo, s_hi) ⊆ range. - // - // Lower side: every subtree key > s_lo. If s_lo >= r_lo, every subtree - // key > s_lo >= r_lo, so > r_lo, satisfying both inclusive and exclusive - // r_lo. If subtree has no lower bound (s_lo = -inf) but range does, the - // subtree could include arbitrarily small keys → not contained. - let lower_contained = match range_lo { - None => true, - Some(r_lo) => match subtree_lo_excl { - Some(s_lo) => s_lo >= r_lo, - None => false, - }, - }; - // Upper side: every subtree key < s_hi. If s_hi <= r_hi, every subtree - // key < s_hi <= r_hi, so < r_hi, satisfying both inclusive and exclusive - // r_hi. (We forgo the slightly tighter "s_hi <= r_hi+1" optimization for - // inclusive r_hi because we don't have key arithmetic.) 
-    let upper_contained = match range_hi {
-        None => true,
-        Some(r_hi) => match subtree_hi_excl {
-            Some(s_hi) => s_hi <= r_hi,
-            None => false,
-        },
-    };
-
-    if lower_contained && upper_contained {
-        SubtreeClassification::Contained
-    } else {
-        SubtreeClassification::Boundary
-    }
-}
-
-/// Returns true if `tree_type` is one of the four tree types that can host an
-/// `AggregateCountOnRange` proof. Wrapper types are accepted by stripping
-/// down to the inner tree type via `is_provable_count_bearing`.
-#[cfg(feature = "minimal")]
-fn is_provable_count_bearing(tree_type: TreeType) -> bool {
-    matches!(
-        tree_type,
-        TreeType::ProvableCountTree | TreeType::ProvableCountSumTree
-    )
-}
-
-/// Pull the count out of a `ProvableCount` / `ProvableCountAndSum` aggregate.
-/// Returns `Err(InvalidProofError)` for any other variant — the entry point
-/// has already gated `tree_type`, so reaching the error means the tree's
-/// in-memory state disagrees with its declared type.
-#[cfg(feature = "minimal")]
-fn provable_count_from_aggregate(data: AggregateData) -> Result<u64, Error> {
-    match data {
-        AggregateData::ProvableCount(c) => Ok(c),
-        AggregateData::ProvableCountAndSum(c, _) => Ok(c),
-        other => Err(Error::InvalidProofError(format!(
-            "expected ProvableCount aggregate data on a provable count tree, got {:?}",
-            other
-        ))),
-    }
-}
-
-#[cfg(feature = "minimal")]
-impl<S> RefWalker<'_, S>
-where
-    S: Fetch + Sized + Clone,
-{
-    /// Generate a count-only proof for an `AggregateCountOnRange` query.
-    ///
-    /// `inner_range` is the `QueryItem` wrapped by `AggregateCountOnRange`
-    /// (already stripped at the caller). `tree_type` must be one of
-    /// `ProvableCountTree` or `ProvableCountSumTree`; any other tree type is
-    /// rejected with `Error::InvalidProofError` before any walking happens.
-    ///
-    /// The returned tuple is `(proof_ops, count)`:
-    /// - `proof_ops` is the linear stream the verifier will replay to
-    ///   reconstruct the tree's root hash.
-    /// - `count` is the prover-side computed count (the verifier independently
-    ///   recomputes it from the proof and compares against the expected root
-    ///   hash; this value is returned as a convenience, not as ground truth).
-    pub fn create_aggregate_count_on_range_proof(
-        &mut self,
-        inner_range: &QueryItem,
-        tree_type: TreeType,
-        grove_version: &GroveVersion,
-    ) -> CostResult<(LinkedList<Op>, u64), Error> {
-        if !is_provable_count_bearing(tree_type) {
-            return Err(Error::InvalidProofError(format!(
-                "AggregateCountOnRange is only valid against ProvableCountTree or \
-                 ProvableCountSumTree, got {:?}",
-                tree_type
-            )))
-            .wrap_with_cost(OperationCost::default());
-        }
-
-        let mut cost = OperationCost::default();
-        let mut ops = LinkedList::new();
-        let count = cost_return_on_error!(
-            &mut cost,
-            emit_count_proof(self, inner_range, None, None, &mut ops, grove_version)
-        );
-        Ok((ops, count)).wrap_with_cost(cost)
-    }
-
-    /// Walk the tree for an `AggregateCountOnRange` query and return the
-    /// in-range count, **without** producing a proof.
-    ///
-    /// This is the no-proof counterpart of
-    /// [`Self::create_aggregate_count_on_range_proof`]. It performs the same
-    /// classification walk (Contained / Disjoint / Boundary) and reads each
-    /// node's aggregate count directly from the merk, so it is O(log n) in
-    /// the number of distinct keys under the indexed subtree — the same
-    /// complexity as the proof variant but without the proof-op allocations,
-    /// hash recomputations, or serialization round-trip.
-    ///
-    /// The caller (`Merk::count_aggregate_on_range`) is expected to have
-    /// already validated `tree_type` is `ProvableCountTree` or
-    /// `ProvableCountSumTree`; the per-node `provable_count_from_aggregate`
-    /// check inside the walk surfaces any disagreement between the declared
-    /// tree type and the in-memory aggregate.
-    ///
-    /// The result is **not** independently verifiable: the caller is trusting
-    /// their own merk read path. Callers that need a verifiable count must
-    /// use `prove_aggregate_count_on_range` + `verify_aggregate_count_on_range_proof`.
-    pub fn count_aggregate_on_range(
-        &mut self,
-        inner_range: &QueryItem,
-        grove_version: &GroveVersion,
-    ) -> CostResult<u64, Error> {
-        walk_count_only(self, inner_range, None, None, grove_version)
-    }
-}
-
-/// Recursive proof emitter. Always called on a non-empty subtree.
-///
-/// At entry, `subtree_lo_excl` / `subtree_hi_excl` are the inherited
-/// exclusive key bounds for the subtree this walker points at (both `None`
-/// at the root call).
-#[cfg(feature = "minimal")]
-fn emit_count_proof<S>(
-    walker: &mut RefWalker<'_, S>,
-    range: &QueryItem,
-    subtree_lo_excl: Option<&[u8]>,
-    subtree_hi_excl: Option<&[u8]>,
-    ops: &mut LinkedList<Op>,
-    grove_version: &GroveVersion,
-) -> CostResult<u64, Error>
-where
-    S: Fetch + Sized + Clone,
-{
-    let mut cost = OperationCost::default();
-
-    // Step 1: classify the current subtree against the inner range.
-    let class = classify_subtree(subtree_lo_excl, subtree_hi_excl, range);
-
-    if matches!(
-        class,
-        SubtreeClassification::Disjoint | SubtreeClassification::Contained
-    ) {
-        // Whole subtree is either entirely outside or entirely inside the
-        // range. Either way we emit a single self-verifying
-        // `HashWithCount(kv_hash, left_child_hash, right_child_hash, count)`
-        // op for the subtree's root.
-        //
-        // Why HashWithCount even for Disjoint subtrees (rather than the
-        // smaller `Hash(node_hash)` that an in-range count would never
-        // need)? Because the parent's `own_count` is computed by the
-        // verifier as `parent_aggregate − left_struct − right_struct` (see
-        // `verify_count_shape`), so the *structural* count of every child
-        // — including disjoint outside subtrees — has to be
-        // cryptographically bound to the parent's hash chain.
The only - // node type that carries a hash-bound count is `HashWithCount` - // (its four committed fields recompute `node_hash_with_count` and - // would diverge under any count tampering). Plain `Hash(node_hash)` - // carries no count, so a malicious prover could lie about the - // structural count and skew the parent's `own_count` - // derivation — leading to silent over/under-counts at boundary - // ancestors. - let aggregate = match walker.tree().aggregate_data() { - Ok(a) => a, - Err(e) => { - return Err(Error::InvalidProofError(format!("aggregate_data: {}", e))) - .wrap_with_cost(cost); - } - }; - let subtree_count = match provable_count_from_aggregate(aggregate) { - Ok(c) => c, - Err(e) => return Err(e).wrap_with_cost(cost), - }; - let kv_hash = *walker.tree().kv_hash(); - let left_child_hash = walker - .tree() - .link(true) - .map(|l| *l.hash()) - .unwrap_or(NULL_HASH); - let right_child_hash = walker - .tree() - .link(false) - .map(|l| *l.hash()) - .unwrap_or(NULL_HASH); - ops.push_back(Op::Push(Node::HashWithCount( - kv_hash, - left_child_hash, - right_child_hash, - subtree_count, - ))); - // For the prover-side in-range total: Contained contributes its - // entire subtree count (which already excludes NonCounted entries - // because their stored aggregate is 0); Disjoint contributes 0. - let in_range_contribution = match class { - SubtreeClassification::Contained => subtree_count, - SubtreeClassification::Disjoint => 0, - SubtreeClassification::Boundary => unreachable!(), - }; - return Ok(in_range_contribution).wrap_with_cost(cost); - } - // class == Boundary — fall through to descent + KVDigestCount emission. - - // Step 2: snapshot what we need from the current node before walking. - // walk(true/false) takes &mut self.tree, so we must drop any existing - // borrows on walker.tree() before calling it. 
- let node_key: Vec = walker.tree().key().to_vec(); - let node_value_hash: CryptoHash = *walker.tree().value_hash(); - let node_count: u64 = match walker - .tree() - .aggregate_data() - .map_err(|e| Error::InvalidProofError(format!("aggregate_data: {}", e))) - { - Ok(data) => match provable_count_from_aggregate(data) { - Ok(c) => c, - Err(e) => return Err(e).wrap_with_cost(cost), - }, - Err(e) => return Err(e).wrap_with_cost(cost), - }; - - // Snapshot each child link's structural aggregate count from the link - // itself (avoids loading the child for this lookup). The verifier needs - // these to compute `own_count = node_count − left_struct − right_struct` - // at this boundary node. - let left_link_aggregate: u64 = walker - .tree() - .link(true) - .map(|l| l.aggregate_data().as_count_u64()) - .unwrap_or(0); - let right_link_aggregate: u64 = walker - .tree() - .link(false) - .map(|l| l.aggregate_data().as_count_u64()) - .unwrap_or(0); - let left_link_present = walker.tree().link(true).is_some(); - let right_link_present = walker.tree().link(false).is_some(); - - let mut total: u64 = 0; - - // Step 3: handle the LEFT child. Both Disjoint and Contained require a - // one-level walk so the recursive Disjoint/Contained arm can emit a - // self-verifying `HashWithCount` (plain `Hash` is no longer used here - // — see the Disjoint branch comment above). 
- let left_emitted = if left_link_present { - let left_lo = subtree_lo_excl; - let left_hi: Option<&[u8]> = Some(node_key.as_slice()); - let walked = cost_return_on_error!( - &mut cost, - walker.walk( - true, - None::<&fn(&[u8], &GroveVersion) -> Option>, - grove_version, - ) - ); - let mut left_walker = match walked { - Some(lw) => lw, - None => { - return Err(Error::CorruptedState( - "tree.link(true) was Some but walk(true) returned None", - )) - .wrap_with_cost(cost) - } - }; - let n = cost_return_on_error!( - &mut cost, - emit_count_proof( - &mut left_walker, - range, - left_lo, - left_hi, - ops, - grove_version, - ) - ); - total = total.saturating_add(n); - true - } else { - false - }; - - // Step 4: emit the current node as a boundary KVDigestCount + attach left - // as its left child. The node's own contribution to the in-range count - // is `own_count` (0 for `NonCounted`-wrapped, 1 for normal), derived as - // `node_count − left_struct − right_struct`. This is what makes - // NonCounted entries fall out of the count: a NonCounted leaf has - // node_count = 0 and no children, so own_count = 0. - ops.push_back(Op::Push(Node::KVDigestCount( - node_key.clone(), - node_value_hash, - node_count, - ))); - if left_emitted { - ops.push_back(Op::Parent); - } - if range.contains(&node_key) { - let own_count = node_count - .saturating_sub(left_link_aggregate) - .saturating_sub(right_link_aggregate); - total = total.saturating_add(own_count); - } - - // Step 5: handle the RIGHT child. Same descent pattern as LEFT. 
- let right_emitted = if right_link_present { - let right_lo: Option<&[u8]> = Some(node_key.as_slice()); - let right_hi = subtree_hi_excl; - let walked = cost_return_on_error!( - &mut cost, - walker.walk( - false, - None::<&fn(&[u8], &GroveVersion) -> Option>, - grove_version, - ) - ); - let mut right_walker = match walked { - Some(rw) => rw, - None => { - return Err(Error::CorruptedState( - "tree.link(false) was Some but walk(false) returned None", - )) - .wrap_with_cost(cost) - } - }; - let n = cost_return_on_error!( - &mut cost, - emit_count_proof( - &mut right_walker, - range, - right_lo, - right_hi, - ops, - grove_version, - ) - ); - total = total.saturating_add(n); - true - } else { - false - }; - - if right_emitted { - ops.push_back(Op::Child); - } - - Ok(total).wrap_with_cost(cost) -} - -/// Read the provable-count aggregate off the walker's current tree node. -/// Shared error-mapping helper used by [`walk_count_only`] at both the -/// Contained-leaf and Boundary positions. -#[cfg(feature = "minimal")] -fn provable_count_from_walker(walker: &RefWalker<'_, S>) -> Result -where - S: Fetch + Sized + Clone, -{ - let aggregate = walker - .tree() - .aggregate_data() - .map_err(|e| Error::InvalidProofError(format!("aggregate_data: {}", e)))?; - provable_count_from_aggregate(aggregate) -} - -/// No-proof variant of [`emit_count_proof`]: walks the same classification -/// path (Contained / Disjoint / Boundary) but only returns the running -/// in-range count. -/// -/// At entry, `subtree_lo_excl` / `subtree_hi_excl` are the inherited -/// exclusive key bounds for the subtree this walker points at (both `None` -/// at the root call). The walk reads each node's `aggregate_data()` and -/// each child link's `aggregate_data().as_count_u64()` exactly the same way -/// the proof emitter does, so the returned count is identical to the -/// `count` field returned by `create_aggregate_count_on_range_proof`. 
-#[cfg(feature = "minimal")]
-fn walk_count_only<S>(
-    walker: &mut RefWalker<'_, S>,
-    range: &QueryItem,
-    subtree_lo_excl: Option<&[u8]>,
-    subtree_hi_excl: Option<&[u8]>,
-    grove_version: &GroveVersion,
-) -> CostResult<u64, Error>
-where
-    S: Fetch + Sized + Clone,
-{
-    let mut cost = OperationCost::default();
-
-    // Classify the current subtree against the inner range.
-    match classify_subtree(subtree_lo_excl, subtree_hi_excl, range) {
-        // Disjoint: subtree contributes 0 to the in-range count.
-        SubtreeClassification::Disjoint => Ok(0).wrap_with_cost(cost),
-        // Contained: subtree contributes its full stored aggregate
-        // (NonCounted entries are already excluded — their stored
-        // aggregate is 0).
-        SubtreeClassification::Contained => {
-            let count = cost_return_on_error_no_add!(cost, provable_count_from_walker(walker));
-            Ok(count).wrap_with_cost(cost)
-        }
-        // Boundary: descend into both children and add own_count.
-        SubtreeClassification::Boundary => {
-            // Snapshot what we need from the current node before walking.
-            // walk(...) takes &mut self.tree, so we must drop any existing
-            // borrows on walker.tree() before calling it.
-            let node_key: Vec<u8> = walker.tree().key().to_vec();
-            let node_count = cost_return_on_error_no_add!(cost, provable_count_from_walker(walker));
-            let left_link_aggregate: u64 = walker
-                .tree()
-                .link(true)
-                .map(|l| l.aggregate_data().as_count_u64())
-                .unwrap_or(0);
-            let right_link_aggregate: u64 = walker
-                .tree()
-                .link(false)
-                .map(|l| l.aggregate_data().as_count_u64())
-                .unwrap_or(0);
-            let left_link_present = walker.tree().link(true).is_some();
-            let right_link_present = walker.tree().link(false).is_some();
-
-            let mut total: u64 = 0;
-
-            // LEFT child. If link is Some, walk(true) must yield Some; the
-            // proof variant has the verifier to catch silent inconsistencies,
-            // but this no-proof path returns the count straight to the
-            // caller — so we fail loudly on impossible state rather than
-            // silently undercounting.
- if left_link_present { - let walked = cost_return_on_error!( - &mut cost, - walker.walk( - true, - None::<&fn(&[u8], &GroveVersion) -> Option>, - grove_version, - ) - ); - let mut left_walker = match walked { - Some(lw) => lw, - None => { - return Err(Error::CorruptedState( - "tree.link(true) was Some but walk(true) returned None", - )) - .wrap_with_cost(cost); - } - }; - let n = cost_return_on_error!( - &mut cost, - walk_count_only( - &mut left_walker, - range, - subtree_lo_excl, - Some(node_key.as_slice()), - grove_version, - ) - ); - total = total.saturating_add(n); - } - - // Current node's own_count: 1 if in-range and counted, 0 for - // NonCounted-wrapped (which has stored aggregate 0, so the - // subtraction yields 0). `checked_sub` (not `saturating_sub`) - // because children claiming more keys than the parent's - // aggregate is corrupted state, not something to silently - // clamp to 0. - if range.contains(&node_key) { - let own_count = node_count - .checked_sub(left_link_aggregate) - .and_then(|n| n.checked_sub(right_link_aggregate)) - .ok_or(Error::CorruptedState( - "child structural counts exceed parent's aggregate count", - )); - let own_count = cost_return_on_error_no_add!(cost, own_count); - total = total.saturating_add(own_count); - } - - // RIGHT child — same fail-fast pattern as LEFT. 
- if right_link_present { - let walked = cost_return_on_error!( - &mut cost, - walker.walk( - false, - None::<&fn(&[u8], &GroveVersion) -> Option>, - grove_version, - ) - ); - let mut right_walker = match walked { - Some(rw) => rw, - None => { - return Err(Error::CorruptedState( - "tree.link(false) was Some but walk(false) returned None", - )) - .wrap_with_cost(cost); - } - }; - let n = cost_return_on_error!( - &mut cost, - walk_count_only( - &mut right_walker, - range, - Some(node_key.as_slice()), - subtree_hi_excl, - grove_version, - ) - ); - total = total.saturating_add(n); - } - - Ok(total).wrap_with_cost(cost) - } - } -} - -/// Verify a count-only proof for an `AggregateCountOnRange` query. -/// -/// `proof_bytes` is the encoded `Vec` produced by -/// [`Merk::prove_aggregate_count_on_range`]; `inner_range` is the same -/// `QueryItem` the prover counted over (caller-supplied — typically extracted -/// from the verifier's `PathQuery`). -/// -/// On success returns `(merk_root_hash, count)`: -/// - `merk_root_hash` is the root hash of the reconstructed merk; the -/// caller must compare it against the expected root hash to complete -/// verification. -/// - `count` is the number of keys in the inner range, computed by replaying -/// the prover's classification walk against the reconstructed proof tree. -/// -/// **Two-phase verification.** Allowlisting node types alone is unsound: -/// a malicious prover can substitute `Hash` for an in-range subtree (to -/// undercount), attach extra `KVDigestCount` children below a keyless -/// `Hash` / `HashWithCount` (to overcount, since their hash recomputation -/// ignores attached children and the root hash would still match), or send -/// a single `Push(Hash(expected_root))` for a non-empty tree (to receive a -/// count of 0 with the trusted root). To prevent all three, this function: -/// -/// 1. 
Decodes the proof into a `ProofTree` via `execute_with_options` with -/// the AVL balance check disabled (count proofs intentionally collapse -/// one side to height 1) and **does not** count anything in the -/// `visit_node` callback. -/// 2. Walks the reconstructed tree with the same inherited exclusive -/// subtree-key bounds the prover used (`(None, None)` at the root). -/// At each position it calls `classify_subtree(bounds, inner_range)` and -/// requires the proof-tree node type to match the classification: -/// - `Disjoint` → must be a leaf `Hash(_)`. Contributes 0. -/// - `Contained` → must be a leaf `HashWithCount(...)`. Contributes its -/// count. -/// - `Boundary` → must be `KVDigestCount(key, ...)` with `key` strictly -/// inside `bounds`. Recurse left with `(lo, key)` and right with -/// `(key, hi)`; add 1 if `inner_range.contains(key)`. -/// -/// Counts are summed with `checked_add`; an overflow is treated as proof -/// corruption (`u64::MAX` keys is not a real merk shape). The caller is -/// still responsible for verifying the returned `merk_root_hash` against -/// their trusted root. -/// -/// **Empty merk case.** An empty merk is represented by an empty proof byte -/// stream and yields `(NULL_HASH, 0)`. Callers chaining this in a -/// multi-layer proof should recognize that shape explicitly. -pub fn verify_aggregate_count_on_range_proof( - proof_bytes: &[u8], - inner_range: &QueryItem, -) -> CostResult<(CryptoHash, u64), Error> { - if proof_bytes.is_empty() { - // Empty merk → empty proof → count = 0, hash = NULL_HASH. This - // matches the prover-side behavior of returning an empty op stream - // for an empty subtree. - return Ok((NULL_HASH, 0u64)).wrap_with_cost(OperationCost::default()); - } - - let mut cost = OperationCost::default(); - let decoder = Decoder::new(proof_bytes); - - // Phase 1: reconstruct the proof tree. 
The visit_node closure only - // performs a coarse allowlist; the per-position type/shape check happens - // in Phase 2 below. We still reject blatantly wrong node types here so - // execute() bails early on garbage input. - let tree_result: CostResult = - execute_with_options(decoder, false, false, |node| match node { - // The count proof emits only `HashWithCount` (for collapsed - // Disjoint or Contained subtrees) and `KVDigestCount` (for - // Boundary nodes). Plain `Hash(_)` is no longer used here - // because the structural count it would otherwise stand in - // for is needed by the verifier's `own_count` derivation and - // would not be hash-bound. - Node::HashWithCount(_, _, _, _) | Node::KVDigestCount(_, _, _) => Ok(()), - other => Err(Error::InvalidProofError(format!( - "unexpected node type in aggregate count proof: {}", - other - ))), - }); - let tree = cost_return_on_error!(&mut cost, tree_result); - - // Phase 2: shape-check + count by replaying the prover's classification - // walk. This binds each leaf node's type to the (subtree_bounds × range) - // classification, so the only valid count is the one a faithful prover - // would have produced for this exact range. - let (count, _structural) = match verify_count_shape(&tree, inner_range, None, None) { - Ok(pair) => pair, - Err(e) => return Err(e).wrap_with_cost(cost), - }; - - let root_hash = tree.hash().unwrap_add_cost(&mut cost); - Ok((root_hash, count)).wrap_with_cost(cost) -} - -/// Recursive shape-walk over the reconstructed proof tree. Returns the -/// pair `(in_range_count, structural_count)`: -/// -/// - `in_range_count` — number of keys in the subtree that fall inside the -/// inner range AND have a non-zero own-count (i.e. are not -/// `NonCounted`-wrapped). This is what bubbles up to the verifier's -/// return value. -/// - `structural_count` — the merk-recorded aggregate count of this subtree -/// (counting normal entries as 1 and `NonCounted` entries as 0). 
The -/// parent uses it to compute its own `own_count` as -/// `parent_node_count − left_struct − right_struct` (since -/// `parent_node_count = own + left_struct + right_struct`). -/// -/// The structural count of every child is **cryptographically bound** to -/// the parent's hash chain because every count-bearing node in a count -/// proof (`KVDigestCount`, `HashWithCount`) has its count fed into -/// `node_hash_with_count` for hash recomputation. Plain `Hash(_)` would -/// not carry a bound count and is therefore not allowed in count proofs; -/// see the prover-side comment in `emit_count_proof` for the full -/// justification. -/// -/// At each node: -/// -/// - Compute the expected classification from the inherited subtree bounds -/// and the inner range. -/// - Require the node's type to match the classification (and reject any -/// children attached under a leaf-shape classification — a malicious -/// prover could otherwise hide counted children under a `HashWithCount` -/// leaf, since its hash recomputation ignores reconstructed children). -/// - Recurse with tightened bounds at `Boundary` nodes, summing with -/// `checked_add` and computing `own_count` via `checked_sub`. -fn verify_count_shape( - tree: &ProofTree, - range: &QueryItem, - lo: Option<&[u8]>, - hi: Option<&[u8]>, -) -> Result<(u64, u64), Error> { - let class = classify_subtree(lo, hi, range); - match class { - SubtreeClassification::Disjoint => match &tree.node { - Node::HashWithCount(_, _, _, count) => { - if tree.left.is_some() || tree.right.is_some() { - return Err(Error::InvalidProofError( - "aggregate-count proof: HashWithCount node at a Disjoint position \ - must be a leaf" - .to_string(), - )); - } - // Disjoint subtree contributes 0 to the in-range count but - // its full structural count to the parent's `own_count` - // computation. 
-                Ok((0, *count))
-            }
-            other => Err(Error::InvalidProofError(format!(
-                "aggregate-count proof: expected HashWithCount at Disjoint position, got {}",
-                other
-            ))),
-        },
-        SubtreeClassification::Contained => match &tree.node {
-            Node::HashWithCount(_, _, _, count) => {
-                if tree.left.is_some() || tree.right.is_some() {
-                    return Err(Error::InvalidProofError(
-                        "aggregate-count proof: HashWithCount node at a Contained position \
-                         must be a leaf"
-                            .to_string(),
-                    ));
-                }
-                // Contained subtree's structural count (which excludes
-                // NonCounted entries because their stored aggregate is 0)
-                // is exactly its in-range count.
-                Ok((*count, *count))
-            }
-            other => Err(Error::InvalidProofError(format!(
-                "aggregate-count proof: expected HashWithCount at Contained position, got {}",
-                other
-            ))),
-        },
-        SubtreeClassification::Boundary => match &tree.node {
-            Node::KVDigestCount(key, _, aggregate) => {
-                if !key_strictly_inside(key.as_slice(), lo, hi) {
-                    return Err(Error::InvalidProofError(format!(
-                        "aggregate-count proof: KVDigestCount key {} falls outside its \
-                         inherited subtree bounds (lo={:?}, hi={:?})",
-                        hex::encode(key),
-                        lo.map(hex::encode),
-                        hi.map(hex::encode),
-                    )));
-                }
-                let key_slice = key.as_slice();
-                let (left_in, left_struct) = match &tree.left {
-                    Some(child) => verify_count_shape(&child.tree, range, lo, Some(key_slice))?,
-                    None => (0, 0),
-                };
-                let (right_in, right_struct) = match &tree.right {
-                    Some(child) => verify_count_shape(&child.tree, range, Some(key_slice), hi)?,
-                    None => (0, 0),
-                };
-                // own_count = aggregate − left_struct − right_struct.
-                // Saturating sub here would silently mask a malformed
-                // proof (children claiming more keys than the parent's
-                // aggregate), so use checked_sub and reject.
-                let own_count = aggregate
-                    .checked_sub(left_struct)
-                    .and_then(|s| s.checked_sub(right_struct))
-                    .ok_or_else(|| {
-                        Error::InvalidProofError(format!(
-                            "aggregate-count proof: child structural counts ({} + {}) exceed \
-                             parent's aggregate count ({}) at key {}",
-                            left_struct,
-                            right_struct,
-                            aggregate,
-                            hex::encode(key)
-                        ))
-                    })?;
-                let self_contribution = if range.contains(key_slice) {
-                    own_count
-                } else {
-                    0
-                };
-                let in_range = left_in
-                    .checked_add(right_in)
-                    .and_then(|s| s.checked_add(self_contribution))
-                    .ok_or_else(|| {
-                        Error::InvalidProofError(
-                            "aggregate-count proof: in-range count overflowed u64".to_string(),
-                        )
-                    })?;
-                Ok((in_range, *aggregate))
-            }
-            other => Err(Error::InvalidProofError(format!(
-                "aggregate-count proof: expected KVDigestCount at Boundary position, got {}",
-                other
-            ))),
-        },
-    }
-}
-
-/// Returns true when `key` lies strictly between the exclusive bounds
-/// `(lo, hi)`, where `None` represents `-inf` / `+inf`. Used to validate that
-/// a `Boundary` `KVDigestCount` carries a key consistent with its inherited
-/// subtree window.
-fn key_strictly_inside(key: &[u8], lo: Option<&[u8]>, hi: Option<&[u8]>) -> bool {
-    let lo_ok = lo.is_none_or(|l| key > l);
-    let hi_ok = hi.is_none_or(|h| key < h);
-    lo_ok && hi_ok
-}
-
-#[cfg(test)]
-mod tests {
-    use super::*;
-
-    /// Asserts the hardcoded fixture in the `verify_only_tests` module
-    /// still matches the bytes a fresh prove run produces. If the proof
-    /// encoding ever changes, this test fails and prints the new
-    /// constants — copy them into `verify_only_tests`.
-    #[test]
-    fn verify_only_fixture_matches_fresh_prover_output() {
-        let v = GroveVersion::latest();
-        let (merk, root) = make_15_key_provable_count_tree(v);
-        let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec());
-        let (ops, count) = merk
-            .prove_aggregate_count_on_range(&inner_range, v)
-            .unwrap()
-            .expect("prove");
-        let proof_hex = hex::encode(encode_proof(&ops));
-        let root_hex = hex::encode(root);
-
-        let drift_msg = format!(
-            "aggregate_count proof encoding has drifted — update verify_only_tests:\n\
-             const FIXTURE_15_KEY_C_TO_L_PROOF_HEX: &str = \"{}\";\n\
-             const FIXTURE_15_KEY_C_TO_L_ROOT_HEX: &str = \"{}\";\n\
-             const FIXTURE_15_KEY_C_TO_L_COUNT: u64 = {};",
-            proof_hex, root_hex, count
-        );
-        assert_eq!(
-            proof_hex,
-            super::verify_only_tests::FIXTURE_15_KEY_C_TO_L_PROOF_HEX,
-            "{}",
-            drift_msg
-        );
-        assert_eq!(
-            root_hex,
-            super::verify_only_tests::FIXTURE_15_KEY_C_TO_L_ROOT_HEX,
-            "{}",
-            drift_msg
-        );
-        assert_eq!(
-            count,
-            super::verify_only_tests::FIXTURE_15_KEY_C_TO_L_COUNT,
-            "{}",
-            drift_msg
-        );
-    }
-
-    fn range_inclusive(lo: &[u8], hi: &[u8]) -> QueryItem {
-        QueryItem::RangeInclusive(lo.to_vec()..=hi.to_vec())
-    }
-
-    fn range_full() -> QueryItem {
-        QueryItem::RangeFull(std::ops::RangeFull)
-    }
-
-    fn range_from(lo: &[u8]) -> QueryItem {
-        QueryItem::RangeFrom(lo.to_vec()..)
-    }
-
-    fn range_after(lo: &[u8]) -> QueryItem {
-        QueryItem::RangeAfter(lo.to_vec()..)
-    }
-
-    #[test]
-    fn classify_disjoint_below() {
-        let r = range_inclusive(b"d", b"f");
-        // subtree (None, b"c") — keys < "c", entirely below ["d", "f"].
-        assert_eq!(
-            classify_subtree(None, Some(b"c"), &r),
-            SubtreeClassification::Disjoint,
-        );
-    }
-
-    #[test]
-    fn classify_disjoint_above() {
-        let r = range_inclusive(b"d", b"f");
-        // subtree (b"g", None) — keys > "g", entirely above ["d", "f"].
-        assert_eq!(
-            classify_subtree(Some(b"g"), None, &r),
-            SubtreeClassification::Disjoint,
-        );
-    }
-
-    #[test]
-    fn classify_disjoint_at_lower_boundary_inclusive() {
-        let r = range_inclusive(b"d", b"f");
-        // subtree (None, b"d") — keys < "d", just below the inclusive bound.
-        assert_eq!(
-            classify_subtree(None, Some(b"d"), &r),
-            SubtreeClassification::Disjoint,
-        );
-    }
-
-    #[test]
-    fn classify_disjoint_at_upper_boundary_inclusive() {
-        let r = range_inclusive(b"d", b"f");
-        // subtree (b"f", None) — keys > "f", just above the inclusive bound.
-        assert_eq!(
-            classify_subtree(Some(b"f"), None, &r),
-            SubtreeClassification::Disjoint,
-        );
-    }
-
-    #[test]
-    fn classify_contained_simple() {
-        let r = range_inclusive(b"a", b"z");
-        // subtree (b"d", b"f") — keys in ("d", "f"), all in ["a", "z"].
-        assert_eq!(
-            classify_subtree(Some(b"d"), Some(b"f"), &r),
-            SubtreeClassification::Contained,
-        );
-    }
-
-    #[test]
-    fn classify_contained_full_range_full_subtree() {
-        let r = range_full();
-        // The full range matches everything — even an unbounded subtree is
-        // contained.
-        assert_eq!(
-            classify_subtree(None, None, &r),
-            SubtreeClassification::Contained,
-        );
-    }
-
-    #[test]
-    fn classify_boundary_overlapping_lower() {
-        let r = range_inclusive(b"d", b"f");
-        // subtree (b"c", b"e") — keys in ("c", "e"), straddles the lower bound.
-        assert_eq!(
-            classify_subtree(Some(b"c"), Some(b"e"), &r),
-            SubtreeClassification::Boundary,
-        );
-    }
-
-    #[test]
-    fn classify_boundary_overlapping_upper() {
-        let r = range_inclusive(b"d", b"f");
-        // subtree (b"e", b"g") — keys in ("e", "g"), straddles the upper bound.
-        assert_eq!(
-            classify_subtree(Some(b"e"), Some(b"g"), &r),
-            SubtreeClassification::Boundary,
-        );
-    }
-
-    #[test]
-    fn classify_boundary_unbounded_below_with_bounded_range() {
-        let r = range_from(b"d");
-        // subtree (None, b"e") — could include keys < "d", so boundary.
-        assert_eq!(
-            classify_subtree(None, Some(b"e"), &r),
-            SubtreeClassification::Boundary,
-        );
-    }
-
-    #[test]
-    fn classify_contained_range_after_exclusive() {
-        let r = range_after(b"b");
-        // RangeAfter(b"b") = (b, +inf). subtree (b"b", b"e") — keys > "b" and
-        // < "e", all in (b, +inf). Contained.
-        assert_eq!(
-            classify_subtree(Some(b"b"), Some(b"e"), &r),
-            SubtreeClassification::Contained,
-        );
-    }
-
-    // ---------- end-to-end integration tests on a real merk ----------
-    //
-    // These tests build a small ProvableCountTree, generate count proofs
-    // through the merk-level API, then verify them with the count verifier.
-    // They cover the four documented categories: open-range (lower-only and
-    // upper-only) and closed-range (inclusive and after-to-inclusive). Empty
-    // tree and single-bound edge cases are also exercised.
-
-    use grovedb_costs::CostsExt as _;
-    use grovedb_version::version::GroveVersion;
-
-    use crate::{
-        proofs::{encode_into, Op as ProofOp},
-        test_utils::TempMerk,
-        tree::{Op, TreeFeatureType::ProvableCountedMerkNode},
-        Merk, TreeType,
-    };
-
-    /// Build a fresh `ProvableCountTree` populated with single-byte keys
-    /// "a".."o" (15 keys) — same shape as the running example in the book
-    /// chapter's "Closed ranges" section. Returns the merk and its current
-    /// root hash.
-    fn make_15_key_provable_count_tree(grove_version: &GroveVersion) -> (TempMerk, [u8; 32]) {
-        let mut merk = TempMerk::new_with_tree_type(grove_version, TreeType::ProvableCountTree);
-        let keys: Vec<Vec<u8>> = (b'a'..=b'o').map(|c| vec![c]).collect();
-        let entries: Vec<(Vec<u8>, Op)> = keys
-            .iter()
-            .enumerate()
-            .map(|(i, k)| {
-                (
-                    k.clone(),
-                    Op::Put(vec![i as u8], ProvableCountedMerkNode(1)),
-                )
-            })
-            .collect();
-        merk.apply::<_, Vec<_>>(&entries, &[], None, grove_version)
-            .unwrap()
-            .expect("apply should succeed");
-        merk.commit(grove_version);
-        let root_hash = merk.root_hash().unwrap();
-        (merk, root_hash)
-    }
-
-    /// Encode a `LinkedList<ProofOp>` into the wire format that the verifier
-    /// consumes.
-    fn encode_proof(ops: &LinkedList<ProofOp>) -> Vec<u8> {
-        let mut bytes = Vec::with_capacity(128);
-        encode_into(ops.iter(), &mut bytes);
-        bytes
-    }
-
-    /// Round-trip helper: prove the inner range, encode the proof, verify it,
-    /// assert the recovered root hash matches and the recovered count matches
-    /// `expected_count`.
-    fn round_trip(
-        merk: &Merk<PrefixedRocksDbImmediateStorageContext<'static>>,
-        expected_root: [u8; 32],
-        inner_range: QueryItem,
-        expected_count: u64,
-        grove_version: &GroveVersion,
-    ) {
-        let (ops, prover_count) = merk
-            .prove_aggregate_count_on_range(&inner_range, grove_version)
-            .unwrap()
-            .expect("prove should succeed");
-        assert_eq!(
-            prover_count, expected_count,
-            "prover count mismatch for range {:?}",
-            inner_range
-        );
-        let bytes = encode_proof(&ops);
-        let (root, verifier_count) = verify_aggregate_count_on_range_proof(&bytes, &inner_range)
-            .unwrap()
-            .expect("verify should succeed");
-        assert_eq!(
-            root, expected_root,
-            "verifier reconstructed wrong root for range {:?}",
-            inner_range
-        );
-        assert_eq!(
-            verifier_count, expected_count,
-            "verifier count mismatch for range {:?}",
-            inner_range
-        );
-    }
-
-    #[test]
-    fn integration_open_range_from() {
-        let v = GroveVersion::latest();
-        let (merk, root) = make_15_key_provable_count_tree(v);
-        // RangeFrom("c"..) → keys c..o (13 keys).
-        round_trip(&merk, root, QueryItem::RangeFrom(b"c".to_vec()..), 13, v);
-    }
-
-    #[test]
-    fn integration_open_range_after() {
-        let v = GroveVersion::latest();
-        let (merk, root) = make_15_key_provable_count_tree(v);
-        // RangeAfter(("b", ..)) → keys c..o (13 keys), same set as RangeFrom("c"..)
-        // but proof shape differs — the boundary lands on "b" exclusive.
-        round_trip(&merk, root, QueryItem::RangeAfter(b"b".to_vec()..), 13, v);
-    }
-
-    #[test]
-    fn integration_open_range_to() {
-        let v = GroveVersion::latest();
-        let (merk, root) = make_15_key_provable_count_tree(v);
-        // RangeTo(..b"e") → keys a..d (4 keys, exclusive upper).
-        round_trip(&merk, root, QueryItem::RangeTo(..b"e".to_vec()), 4, v);
-    }
-
-    #[test]
-    fn integration_open_range_to_inclusive() {
-        let v = GroveVersion::latest();
-        let (merk, root) = make_15_key_provable_count_tree(v);
-        // RangeToInclusive(..=b"e") → keys a..e (5 keys, inclusive upper).
-        round_trip(
-            &merk,
-            root,
-            QueryItem::RangeToInclusive(..=b"e".to_vec()),
-            5,
-            v,
-        );
-    }
-
-    #[test]
-    fn integration_closed_range_inclusive() {
-        let v = GroveVersion::latest();
-        let (merk, root) = make_15_key_provable_count_tree(v);
-        // RangeInclusive("c"..="l") → 10 keys.
-        round_trip(
-            &merk,
-            root,
-            QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()),
-            10,
-            v,
-        );
-    }
-
-    #[test]
-    fn integration_closed_range_exclusive() {
-        let v = GroveVersion::latest();
-        let (merk, root) = make_15_key_provable_count_tree(v);
-        // Range("c".."l") → c..k (9 keys, exclusive upper).
-        round_trip(
-            &merk,
-            root,
-            QueryItem::Range(b"c".to_vec()..b"l".to_vec()),
-            9,
-            v,
-        );
-    }
-
-    #[test]
-    fn integration_closed_range_after_to_inclusive() {
-        let v = GroveVersion::latest();
-        let (merk, root) = make_15_key_provable_count_tree(v);
-        // RangeAfterToInclusive(("c", "l")) → keys d..l (9 keys: d..=l excluding c).
-        round_trip(
-            &merk,
-            root,
-            QueryItem::RangeAfterToInclusive(b"c".to_vec()..=b"l".to_vec()),
-            9,
-            v,
-        );
-    }
-
-    #[test]
-    fn integration_closed_range_after_to_exclusive() {
-        let v = GroveVersion::latest();
-        let (merk, root) = make_15_key_provable_count_tree(v);
-        // RangeAfterTo(("c", "l")) → keys d..l (8 keys, both exclusive).
-        round_trip(
-            &merk,
-            root,
-            QueryItem::RangeAfterTo(b"c".to_vec()..b"l".to_vec()),
-            8,
-            v,
-        );
-    }
-
-    #[test]
-    fn integration_range_below_all_keys() {
-        let v = GroveVersion::latest();
-        let (merk, root) = make_15_key_provable_count_tree(v);
-        // Entire range below the smallest key — should produce count = 0
-        // and a Disjoint proof at the root level.
-        round_trip(
-            &merk,
-            root,
-            QueryItem::RangeInclusive(vec![0x00]..=vec![0x10]),
-            0,
-            v,
-        );
-    }
-
-    #[test]
-    fn integration_range_above_all_keys() {
-        let v = GroveVersion::latest();
-        let (merk, root) = make_15_key_provable_count_tree(v);
-        // Entire range above the largest key.
-        round_trip(
-            &merk,
-            root,
-            QueryItem::RangeInclusive(b"z".to_vec()..=vec![0xff]),
-            0,
-            v,
-        );
-    }
-
-    #[test]
-    fn integration_empty_merk() {
-        let v = GroveVersion::latest();
-        let merk = TempMerk::new_with_tree_type(v, TreeType::ProvableCountTree);
-        let (ops, prover_count) = merk
-            .prove_aggregate_count_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v)
-            .unwrap()
-            .expect("prove on empty merk should succeed");
-        assert_eq!(prover_count, 0);
-        // Empty proof means the verifier returns NULL_HASH and count = 0.
-        let bytes = encode_proof(&ops);
-        let (root, verifier_count) = verify_aggregate_count_on_range_proof(
-            &bytes,
-            &QueryItem::Range(b"a".to_vec()..b"z".to_vec()),
-        )
-        .unwrap()
-        .expect("verify on empty merk should succeed");
-        assert_eq!(root, NULL_HASH);
-        assert_eq!(verifier_count, 0);
-    }
-
-    #[test]
-    fn integration_rejected_on_normal_tree() {
-        let v = GroveVersion::latest();
-        let merk = TempMerk::new(v); // NormalTree
-        let err = merk
-            .prove_aggregate_count_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v)
-            .unwrap();
-        assert!(
-            err.is_err(),
-            "expected an InvalidProofError on NormalTree, got Ok({:?})",
-            err.ok().map(|(_, c)| c)
-        );
-    }
-
-    #[test]
-    fn integration_count_forgery_is_rejected() {
-        // Demonstrates the cryptographic binding: tamper with the count in a
-        // HashWithCount op and the verifier's root-hash recomputation must
-        // diverge from the expected root.
-        let v = GroveVersion::latest();
-        let (merk, expected_root) = make_15_key_provable_count_tree(v);
-        let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec());
-        let (mut ops, _prover_count) = merk
-            .prove_aggregate_count_on_range(&inner_range, v)
-            .unwrap()
-            .expect("prove should succeed");
-
-        // Forge: bump the count on the first HashWithCount op we see.
-        let mut tampered = false;
-        for op in ops.iter_mut() {
-            if let ProofOp::Push(Node::HashWithCount(_, _, _, count))
-            | ProofOp::PushInverted(Node::HashWithCount(_, _, _, count)) = op
-            {
-                *count = count.saturating_add(1);
-                tampered = true;
-                break;
-            }
-        }
-        assert!(
-            tampered,
-            "test setup: expected at least one HashWithCount op"
-        );
-
-        let bytes = encode_proof(&ops);
-        let (root, _count) = verify_aggregate_count_on_range_proof(&bytes, &inner_range)
-            .unwrap()
-            .expect("verify should still complete (root mismatch is the caller's job)");
-        assert_ne!(
-            root, expected_root,
-            "tampered count must produce a different reconstructed root hash"
-        );
-    }
-
-    // ---------- no-proof variant: count_aggregate_on_range ----------
-    //
-    // The no-proof entry point must return exactly the same count as the
-    // proof path for every range shape, without producing any proof ops.
-    // These tests cross-check the two paths on the same merk.
-
-    /// Cross-check: assert that `count_aggregate_on_range` and the count
-    /// returned by `prove_aggregate_count_on_range` agree for the given
-    /// range, and that both equal `expected_count`.
-    fn no_proof_matches_prover(
-        merk: &Merk<PrefixedRocksDbImmediateStorageContext<'static>>,
-        inner_range: QueryItem,
-        expected_count: u64,
-        grove_version: &GroveVersion,
-    ) {
-        let no_proof = merk
-            .count_aggregate_on_range(&inner_range, grove_version)
-            .unwrap()
-            .expect("count_aggregate_on_range should succeed");
-        assert_eq!(
-            no_proof, expected_count,
-            "no-proof variant returned wrong count for range {:?}",
-            inner_range
-        );
-        let (_ops, prover_count) = merk
-            .prove_aggregate_count_on_range(&inner_range, grove_version)
-            .unwrap()
-            .expect("prove should succeed");
-        assert_eq!(
-            no_proof, prover_count,
-            "no-proof variant disagrees with prover count for range {:?}",
-            inner_range
-        );
-    }
-
-    #[test]
-    fn no_proof_matches_prover_closed_range_inclusive() {
-        let v = GroveVersion::latest();
-        let (merk, _root) = make_15_key_provable_count_tree(v);
-        no_proof_matches_prover(
-            &merk,
-            QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()),
-            10,
-            v,
-        );
-    }
-
-    #[test]
-    fn no_proof_matches_prover_closed_range_exclusive() {
-        let v = GroveVersion::latest();
-        let (merk, _root) = make_15_key_provable_count_tree(v);
-        no_proof_matches_prover(&merk, QueryItem::Range(b"c".to_vec()..b"l".to_vec()), 9, v);
-    }
-
-    #[test]
-    fn no_proof_matches_prover_open_range_from() {
-        let v = GroveVersion::latest();
-        let (merk, _root) = make_15_key_provable_count_tree(v);
-        no_proof_matches_prover(&merk, QueryItem::RangeFrom(b"c".to_vec()..), 13, v);
-    }
-
-    #[test]
-    fn no_proof_matches_prover_range_below_all_keys() {
-        let v = GroveVersion::latest();
-        let (merk, _root) = make_15_key_provable_count_tree(v);
-        no_proof_matches_prover(
-            &merk,
-            QueryItem::RangeInclusive(vec![0x00]..=vec![0x10]),
-            0,
-            v,
-        );
-    }
-
-    #[test]
-    fn no_proof_empty_merk_returns_zero() {
-        let v = GroveVersion::latest();
-        let merk = TempMerk::new_with_tree_type(v, TreeType::ProvableCountTree);
-        let count = merk
-            .count_aggregate_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v)
-            .unwrap()
-            .expect("count_aggregate_on_range on empty merk should succeed");
-        assert_eq!(count, 0);
-    }
-
-    #[test]
-    fn no_proof_rejected_on_normal_tree() {
-        let v = GroveVersion::latest();
-        let merk = TempMerk::new(v); // NormalTree
-        let result = merk
-            .count_aggregate_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v)
-            .unwrap();
-        assert!(
-            result.is_err(),
-            "expected InvalidProofError on NormalTree, got Ok({:?})",
-            result.ok()
-        );
-    }
-
-    #[test]
-    fn no_proof_matches_prover_range_after() {
-        // RangeAfter at the root pushes the left boundary exclusive to "b",
-        // which causes the walk to descend into the right subtree from the
-        // root — exercising the right-child arm of walk_count_only.
-        let v = GroveVersion::latest();
-        let (merk, _root) = make_15_key_provable_count_tree(v);
-        no_proof_matches_prover(&merk, QueryItem::RangeAfter(b"b".to_vec()..), 13, v);
-    }
-
-    #[test]
-    fn no_proof_matches_prover_range_to() {
-        let v = GroveVersion::latest();
-        let (merk, _root) = make_15_key_provable_count_tree(v);
-        // RangeTo(..b"e") — exclusive upper, keys a..d (4 keys).
-        no_proof_matches_prover(&merk, QueryItem::RangeTo(..b"e".to_vec()), 4, v);
-    }
-
-    #[test]
-    fn no_proof_matches_prover_range_to_inclusive() {
-        let v = GroveVersion::latest();
-        let (merk, _root) = make_15_key_provable_count_tree(v);
-        // RangeToInclusive(..=b"e") — keys a..=e (5 keys).
-        no_proof_matches_prover(&merk, QueryItem::RangeToInclusive(..=b"e".to_vec()), 5, v);
-    }
-
-    #[test]
-    fn no_proof_matches_prover_range_after_to_inclusive() {
-        let v = GroveVersion::latest();
-        let (merk, _root) = make_15_key_provable_count_tree(v);
-        // RangeAfterToInclusive(("c", "l")) — keys d..=l (9 keys).
-        no_proof_matches_prover(
-            &merk,
-            QueryItem::RangeAfterToInclusive(b"c".to_vec()..=b"l".to_vec()),
-            9,
-            v,
-        );
-    }
-
-    #[test]
-    fn no_proof_provable_count_sum_tree() {
-        // Exercise the ProvableCountSumTree branch of the tree-type gate —
-        // it should accept the walk and return the same count as a
-        // ProvableCountTree with the same key set.
-        let v = GroveVersion::latest();
-        let mut merk = TempMerk::new_with_tree_type(v, TreeType::ProvableCountSumTree);
-        // ProvableCountedSummedMerkNode(count=1, sum=0): treats each
-        // entry as count-1 with sum-contribution 0.
-        let entries: Vec<(Vec<u8>, Op)> = (b'a'..=b'o')
-            .enumerate()
-            .map(|(i, c)| {
-                (
-                    vec![c],
-                    Op::Put(
-                        vec![i as u8],
-                        crate::tree::TreeFeatureType::ProvableCountedSummedMerkNode(1, 0),
-                    ),
-                )
-            })
-            .collect();
-        merk.apply::<_, Vec<_>>(&entries, &[], None, v)
-            .unwrap()
-            .expect("apply ProvableCountSumTree entries");
-        merk.commit(v);
-
-        let count = merk
-            .count_aggregate_on_range(&QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), v)
-            .unwrap()
-            .expect("count_aggregate_on_range on ProvableCountSumTree should succeed");
-        assert_eq!(count, 10, "c..=l should be 10 keys");
-    }
-
-    // ---------- attack tests for the shape-walk verifier ----------
-    //
-    // These three tests exercise attacks the old allowlist-only verifier let
-    // through. With the shape walk in `verify_count_shape`, each one is
-    // rejected before the caller's root-hash check.
-
-    /// A malicious prover sends a single `Push(Hash(expected_root))` for a
-    /// non-empty tree. Without the shape check this would return
-    /// `(expected_root, 0)` for any range. The shape check classifies the
-    /// root with `(None, None)` against a bounded inner range as `Boundary`,
-    /// expects `KVDigestCount`, and rejects.
-    #[test]
-    fn shape_walk_rejects_single_hash_undercount() {
-        let v = GroveVersion::latest();
-        let (merk, expected_root) = make_15_key_provable_count_tree(v);
-        let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec());
-
-        // Forged proof: a single Hash op carrying the genuine root hash.
-        let mut forged: LinkedList<ProofOp> = LinkedList::new();
-        forged.push_back(ProofOp::Push(Node::Hash(expected_root)));
-        let bytes = encode_proof(&forged);
-
-        let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap();
-        let err = result.expect_err("single-Hash forgery must be rejected");
-        // keep merk alive for clarity in the test scope
-        let _ = merk;
-        // Plain `Hash` is no longer in the count-proof allowlist (it would
-        // carry an unbound structural count), so the rejection now lands
-        // in Phase 1's coarse allowlist rather than Phase 2's shape walk.
-        // Either error message is fine — the attack is rejected.
-        match err {
-            Error::InvalidProofError(msg) => {
-                assert!(
-                    msg.contains("unexpected node type")
-                        || msg.contains("expected KVDigestCount")
-                        || msg.contains("Boundary"),
-                    "unexpected message: {msg}"
-                );
-            }
-            other => panic!("expected InvalidProofError, got {other:?}"),
-        }
-    }
-
-    /// A malicious prover replaces an in-range `HashWithCount` subtree with
-    /// a `Hash` carrying that subtree's node_hash, undercounting by the
-    /// subtree's count. The hash chain still matches (same node_hash), so
-    /// the old allowlist verifier would have happily returned a wrong
-    /// count. The shape walk classifies that position as `Contained` and
-    /// requires `HashWithCount`, rejecting the swap.
-    #[test]
-    fn shape_walk_rejects_hash_swap_for_contained_subtree() {
-        let v = GroveVersion::latest();
-        let (merk, _root) = make_15_key_provable_count_tree(v);
-        let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec());
-        let (mut ops, _) = merk
-            .prove_aggregate_count_on_range(&inner_range, v)
-            .unwrap()
-            .expect("prove succeeds");
-
-        // Swap the first HashWithCount op for a Hash op carrying the
-        // computed node_hash for that subtree (so the chain check still
-        // matches and only the shape walk can detect the attack).
-        let mut swapped = false;
-        for op in ops.iter_mut() {
-            if let ProofOp::Push(Node::HashWithCount(kv_hash, l, r, c)) = op {
-                let node_hash = crate::tree::node_hash_with_count(kv_hash, l, r, *c).unwrap();
-                *op = ProofOp::Push(Node::Hash(node_hash));
-                swapped = true;
-                break;
-            }
-        }
-        assert!(
-            swapped,
-            "test setup: expected at least one HashWithCount op"
-        );
-
-        let bytes = encode_proof(&ops);
-        let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap();
-        assert!(
-            result.is_err(),
-            "HashWithCount→Hash swap on a Contained subtree must be rejected by the shape walk"
-        );
-    }
-
-    /// A malicious prover attaches a `KVDigestCount` child under a leaf
-    /// `HashWithCount`. Because `Tree::hash()` for `HashWithCount` is
-    /// computed from the four embedded fields and ignores any reconstructed
-    /// children, the root hash check passes — but a naive verifier that
-    /// counts every visited node would credit the bogus child as +1. The
-    /// shape walk requires `Contained` positions to be **leaves**, so it
-    /// rejects the smuggled-in child.
-    #[test]
-    fn shape_walk_rejects_keyless_node_with_attached_children() {
-        let v = GroveVersion::latest();
-        let (merk, _root) = make_15_key_provable_count_tree(v);
-        let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec());
-        let (mut ops, _honest_count) = merk
-            .prove_aggregate_count_on_range(&inner_range, v)
-            .unwrap()
-            .expect("prove succeeds");
-
-        // Smuggle a fake +1 child under the first HashWithCount op. After
-        // any HashWithCount(...), insert: Push(Hash(zero)) Parent — that
-        // attaches an extra hashed node as the LEFT child of the
-        // HashWithCount during reconstruction. Then add a fake
-        // Push(KVDigestCount) Child that would be picked up by an
-        // allowlist verifier counting visited keys.
-        //
-        // Concretely we splice 4 ops right after the HashWithCount:
-        //   Push(KVDigestCount(in_range_key, value_hash, 1))
-        //   Parent (attach KVDigestCount as the LEFT child of HashWithCount)
-        //   Push(Hash([0; 32]))
-        //   Child (attach Hash as the RIGHT child of HashWithCount)
-        //
-        // The HashWithCount's hash() ignores these children, so the root
-        // hash recomputation is unaffected. The shape walk catches the
-        // Contained-position-with-children violation.
-        let mut new_ops: LinkedList<ProofOp> = LinkedList::new();
-        let mut spliced = false;
-        for op in ops.iter() {
-            new_ops.push_back(op.clone());
-            if !spliced && matches!(op, ProofOp::Push(Node::HashWithCount(_, _, _, _))) {
-                let in_range_key = b"d".to_vec();
-                new_ops.push_back(ProofOp::Push(Node::KVDigestCount(
-                    in_range_key,
-                    [0u8; 32],
-                    1,
-                )));
-                new_ops.push_back(ProofOp::Parent);
-                new_ops.push_back(ProofOp::Push(Node::Hash([0u8; 32])));
-                new_ops.push_back(ProofOp::Child);
-                spliced = true;
-            }
-        }
-        assert!(
-            spliced,
-            "test setup: expected to splice into a HashWithCount"
-        );
-        ops = new_ops;
-
-        let bytes = encode_proof(&ops);
-        let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap();
-        assert!(
-            result.is_err(),
-            "attaching children under HashWithCount must be rejected (root hash alone wouldn't catch it)"
-        );
-    }
-
-    /// `HashWithCount` is only safe inside the dedicated aggregate-count
-    /// verifier (which shape-checks the collapsed subtree). The plain
-    /// `Query::execute_proof` verifier must reject it on sight — otherwise
-    /// a malicious prover could include `HashWithCount` in a regular
-    /// query proof, attach fake KV children to it (whose pushes the
-    /// verifier would credit as query results via `execute_node`), and
-    /// have the parent's hash chain still verify because
-    /// `Tree::hash()` for `HashWithCount` ignores attached children.
-    #[test]
-    fn regular_query_verifier_rejects_hash_with_count_node() {
-        use crate::proofs::query::QueryProofVerify;
-        let v = GroveVersion::latest();
-
-        // Build a regular merk and a regular range query against it.
-        let mut merk = TempMerk::new(v); // NormalTree
-        for i in 0u8..5 {
-            merk.apply::<_, Vec<_>>(
-                &[(
-                    vec![i],
-                    Op::Put(vec![i], crate::TreeFeatureType::BasicMerkNode),
-                )],
-                &[],
-                None,
-                v,
-            )
-            .unwrap()
-            .expect("apply");
-        }
-        merk.commit(v);
-        let q = crate::proofs::query::Query::new_single_query_item(QueryItem::Range(
-            vec![0u8]..vec![5u8],
-        ));
-
-        // Generate an honest proof, then splice a `HashWithCount` push into
-        // it. The exact op sequence doesn't matter for what we're testing —
-        // we just need the regular verifier to refuse to process the proof
-        // because it contains a `HashWithCount`.
-        let (mut ops, _) = merk
-            .prove_unchecked_query_items(&[QueryItem::Range(vec![0u8]..vec![5u8])], None, true, v)
-            .unwrap()
-            .expect("prove");
-        ops.push_front(ProofOp::Push(Node::HashWithCount(
-            [0u8; 32], [0u8; 32], [0u8; 32], 0,
-        )));
-        let bytes = encode_proof(&ops);
-
-        let result = q.execute_proof(&bytes, None, true, 0).unwrap();
-        let err = result.expect_err("regular query verifier must reject HashWithCount on sight");
-        let msg = format!("{}", err);
-        assert!(
-            msg.contains("HashWithCount") || msg.contains("aggregate-count"),
-            "expected HashWithCount-rejection message, got: {msg}"
-        );
-    }
-
-    // ---------- byte-mutation fuzzer ----------
-    //
-    // Stronger forgery-resistance check than the three hand-crafted attack
-    // tests above: enumerate every byte of an honest proof, flip it to
-    // each of three different values, and assert the verifier never
-    // produces a "silent forgery" — i.e. an `Ok((root, count))` where
-    // the root **matches** the honest one but the count **differs**.
-    //
-    // Three safe outcomes per mutation:
-    //   - **Rejection** — Phase 1 decode error, or Phase 2 shape mismatch.
-    //   - **Divergence** — `Ok((root', _))` where `root' != honest_root`,
-    //     so any caller comparing against their trusted root catches it.
-    //   - **Same outcome** — `Ok((honest_root, honest_count))`. This can
-    //     happen for non-canonical re-encodings (e.g. swapping
-    //     `Push` ↔ `PushInverted` doesn't change the reconstructed tree's
-    //     root or the shape walk's count). Harmless: the verifier is
-    //     deterministic on (root, count), and that pair is what the
-    //     caller acts on.
-    //
-    // The **unsafe** outcome is `Ok((honest_root, count'))` where
-    // `count' != honest_count`. The hash chain binds count via
-    // `node_hash_with_count`, so this should be impossible — the test
-    // panics if it ever happens.
-    //
-    // We also assert each safe branch fires at least once as a sanity
-    // check that the test is actually exercising the surface.
-    #[test]
-    fn fuzz_byte_mutation_no_silent_forgery() {
-        let v = GroveVersion::latest();
-        let (merk, honest_root) = make_15_key_provable_count_tree(v);
-        let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec());
-        let (ops, honest_count) = merk
-            .prove_aggregate_count_on_range(&inner_range, v)
-            .unwrap()
-            .expect("prove");
-        let honest_bytes = encode_proof(&ops);
-        assert!(!honest_bytes.is_empty());
-
-        let mut rejected = 0usize;
-        let mut diverged = 0usize;
-        let mut same_outcome = 0usize;
-        let mut total = 0usize;
-
-        // Three different mutations per byte: +1, +0x55, XOR 0xff.
-        let deltas: [u8; 3] = [1, 0x55, 0xff];
-        for byte_idx in 0..honest_bytes.len() {
-            for &delta in &deltas {
-                let mut bytes = honest_bytes.clone();
-                let original = bytes[byte_idx];
-                let mutated = if delta == 0xff {
-                    original ^ 0xff
-                } else {
-                    original.wrapping_add(delta)
-                };
-                if mutated == original {
-                    continue; // no-op, don't count
-                }
-                bytes[byte_idx] = mutated;
-                total += 1;
-
-                let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap();
-                match result {
-                    Err(_) => rejected += 1,
-                    Ok((root, count)) => {
-                        if root == honest_root {
-                            // Same root — the verifier MUST also produce
-                            // the same count, otherwise we have a silent
-                            // count-forgery: the caller would accept the
-                            // forged count thinking it's the honest one.
-                            assert_eq!(
-                                count, honest_count,
-                                "SILENT FORGERY at byte index {} (delta=0x{:02x}): \
-                                 verifier returned the honest root but a wrong count \
-                                 ({} != {}). The hash chain should bind count.",
-                                byte_idx, delta, count, honest_count
-                            );
-                            same_outcome += 1;
-                        } else {
-                            // Different root — caller's root check catches it.
-                            diverged += 1;
-                        }
-                    }
-                }
-            }
-        }
-
-        // Sanity: each safe branch should fire at least once on a real proof.
-        assert!(
-            rejected > 0,
-            "expected at least one mutation to be rejected outright"
-        );
-        assert!(
-            diverged > 0,
-            "expected at least one mutation to diverge the root hash"
-        );
-        // `same_outcome` may legitimately be zero on some encoders, so we
-        // don't require it. We just require no silent forgery occurred,
-        // which the inner assert_eq! guarantees.
-        let _ = same_outcome;
-        assert_eq!(rejected + diverged + same_outcome, total);
-    }
-
-    // ---------- randomized round-trip property test ----------
-    //
-    // Build merks with varying sizes and key shapes from a deterministic
-    // RNG, run a bunch of randomly-chosen ranges through the prove → encode
-    // → verify pipeline, and assert the verifier's count agrees with a
-    // ground-truth count computed by directly intersecting the inserted
-    // keys with the range. Catches silent miscounts that the fixed
-    // examples above would miss (off-by-one, edge-of-tree, exact-bound
-    // matches against multi-byte keys, etc.).
-    #[test]
-    fn fuzz_random_trees_and_ranges_round_trip() {
-        // Tiny custom xorshift RNG so we don't have to add a dev-dep.
-        struct XorShift(u64);
-        impl XorShift {
-            fn next_u64(&mut self) -> u64 {
-                let mut x = self.0;
-                x ^= x << 13;
-                x ^= x >> 7;
-                x ^= x << 17;
-                self.0 = x;
-                x
-            }
-            fn gen_range(&mut self, lo: usize, hi: usize) -> usize {
-                lo + (self.next_u64() as usize) % (hi - lo)
-            }
-            fn gen_key(&mut self, max_len: usize) -> Vec<u8> {
-                let len = 1 + self.gen_range(0, max_len);
-                (0..len).map(|_| (self.next_u64() & 0xff) as u8).collect()
-            }
-        }
-
-        let v = GroveVersion::latest();
-        let mut rng = XorShift(0xDEAD_BEEF_C0FFEE);
-        let trials = 16;
-        for trial in 0..trials {
-            let key_count = rng.gen_range(1, 64);
-            let mut keys: Vec<Vec<u8>> = (0..key_count).map(|_| rng.gen_key(8)).collect();
-            keys.sort();
-            keys.dedup();
-
-            let mut merk = TempMerk::new_with_tree_type(v, TreeType::ProvableCountTree);
-            let entries: Vec<(Vec<u8>, Op)> = keys
-                .iter()
-                .map(|k| (k.clone(), Op::Put(vec![0xAB], ProvableCountedMerkNode(1))))
-                .collect();
-            merk.apply::<_, Vec<_>>(&entries, &[], None, v)
-                .unwrap()
-                .expect("apply");
-            merk.commit(v);
-            let root = merk.root_hash().unwrap();
-
-            // Try several random ranges per tree, picking shapes that
-            // exercise both bounded and half-bounded variants.
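As an aside on the RNG used above: the deleted test relies on the classic xorshift64 step (shift triple 13/7/17). A minimal standalone sketch of that step, outside the diff, shows the two properties the test depends on — determinism from the seed and the fact that zero is the only fixed point (hence the nonzero seed constant):

```rust
// Standalone sketch of the xorshift64 step (13/7/17 shift triple) used by the
// test's dev-dep-free RNG. Illustration only; not part of the diff.
fn xorshift64(mut x: u64) -> u64 {
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    x
}

fn main() {
    // Same seed ⇒ same stream, which is what makes the fuzz trials reproducible.
    let (mut a, mut b) = (0xDEAD_BEEF_C0FFEE_u64, 0xDEAD_BEEF_C0FFEE_u64);
    for _ in 0..8 {
        a = xorshift64(a);
        b = xorshift64(b);
        assert_eq!(a, b);
    }
    // Zero is the lone fixed point of the map, so seeds must be nonzero.
    assert_eq!(xorshift64(0), 0);
    assert_ne!(a, 0);
}
```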
- for sub_trial in 0..6 { - let lo = rng.gen_key(8); - let hi = rng.gen_key(8); - let (lo, hi) = if lo <= hi { (lo, hi) } else { (hi, lo) }; - - let inner_range = match sub_trial % 6 { - 0 => QueryItem::Range(lo.clone()..hi.clone()), - 1 => QueryItem::RangeInclusive(lo.clone()..=hi.clone()), - 2 => QueryItem::RangeFrom(lo.clone()..), - 3 => QueryItem::RangeAfter(lo.clone()..), - 4 => QueryItem::RangeTo(..hi.clone()), - _ => QueryItem::RangeToInclusive(..=hi.clone()), - }; - - let expected = keys - .iter() - .filter(|k| inner_range.contains(k.as_slice())) - .count() as u64; - - let (ops, prover_count) = merk - .prove_aggregate_count_on_range(&inner_range, v) - .unwrap() - .expect("prove"); - assert_eq!( - prover_count, expected, - "trial {} sub {}: prover count mismatch for range {:?}", - trial, sub_trial, inner_range - ); - let bytes = encode_proof(&ops); - let (vroot, vcount) = verify_aggregate_count_on_range_proof(&bytes, &inner_range) - .unwrap() - .expect("verify"); - assert_eq!( - vroot, root, - "trial {} sub {}: verifier root mismatch", - trial, sub_trial - ); - assert_eq!( - vcount, expected, - "trial {} sub {}: verifier count mismatch for range {:?}", - trial, sub_trial, inner_range - ); - } - } - } - - // ---------- shape-walk rejection of malformed proof shapes ---------- - // - // These tests synthesize op streams that are well-formed bytes (Phase 1 - // decode succeeds) but violate the structural invariants the shape walk - // requires (Phase 2 rejection). They exist to lock down the defensive - // error branches in `verify_count_shape` so future refactors that - // accidentally relax them are caught by the test suite. - - /// `HashWithCount` is only valid as a leaf in the proof tree. 
If the prover attaches children to a Disjoint-position `HashWithCount`,
-    /// the shape walk must reject — even though the parent's hash chain
-    /// (which uses `Tree::hash()` for `HashWithCount`, computed from the
-    /// four embedded fields and ignoring children) would still verify.
-    #[test]
-    fn shape_walk_rejects_disjoint_hashwithcount_with_children() {
-        let v = GroveVersion::latest();
-        let (merk, _root) = make_15_key_provable_count_tree(v);
-        // RangeAfter("o") → all 15 keys are below; the entire tree is
-        // Disjoint relative to the inner range, so the honest proof is a
-        // single Push(HashWithCount(...)).
-        let inner_range = QueryItem::RangeAfter(b"o".to_vec()..);
-        let (mut ops, _) = merk
-            .prove_aggregate_count_on_range(&inner_range, v)
-            .unwrap()
-            .expect("prove succeeds");
-
-        // Splice in another HashWithCount as the child (no key, so no
-        // ordering constraint at Phase 1) so we exercise Phase 2's
-        // leaf-only assertion at the Disjoint position.
-        let mut spliced = LinkedList::<ProofOp>::new();
-        let mut done = false;
-        for op in ops.iter() {
-            spliced.push_back(op.clone());
-            if !done && matches!(op, ProofOp::Push(Node::HashWithCount(_, _, _, _))) {
-                spliced.push_back(ProofOp::Push(Node::HashWithCount(
-                    [0u8; 32], [0u8; 32], [0u8; 32], 1,
-                )));
-                spliced.push_back(ProofOp::Parent);
-                done = true;
-            }
-        }
-        assert!(done, "test setup: expected at least one HashWithCount op");
-        ops = spliced;
-
-        let bytes = encode_proof(&ops);
-        let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap();
-        let err = result.expect_err("Disjoint HashWithCount with children must be rejected");
-        match err {
-            Error::InvalidProofError(msg) => assert!(
-                msg.contains("Disjoint position must be a leaf"),
-                "unexpected message: {msg}"
-            ),
-            other => panic!("expected InvalidProofError, got {:?}", other),
-        }
-    }
-
-    /// At a Disjoint position the shape walk requires `HashWithCount` (only
-    /// node type with a hash-bound count).
A `Hash` op there would carry an - /// untrusted structural count for the parent's `own_count` derivation, - /// so it must be rejected. - #[test] - fn shape_walk_rejects_non_hashwithcount_at_disjoint() { - let v = GroveVersion::latest(); - let (merk, _root) = make_15_key_provable_count_tree(v); - let inner_range = QueryItem::RangeAfter(b"o".to_vec()..); - let (mut ops, _) = merk - .prove_aggregate_count_on_range(&inner_range, v) - .unwrap() - .expect("prove succeeds"); - - // Replace the single Disjoint HashWithCount with a plain Hash. - let mut swapped = false; - for op in ops.iter_mut() { - if let ProofOp::Push(Node::HashWithCount(kv, l, r, c)) = op { - let node_hash = crate::tree::node_hash_with_count(kv, l, r, *c).unwrap(); - *op = ProofOp::Push(Node::Hash(node_hash)); - swapped = true; - break; - } - } - assert!(swapped, "test setup: expected a HashWithCount op to swap"); - - let bytes = encode_proof(&ops); - let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap(); - // Phase 1 rejects plain Hash via the allowlist; Phase 2 would also - // reject "expected HashWithCount at Disjoint position". Either is fine. - let err = result.expect_err("plain Hash at Disjoint must be rejected"); - match err { - Error::InvalidProofError(_) => {} - other => panic!("expected InvalidProofError, got {:?}", other), - } - } - - /// At a Boundary position the shape walk requires the node's key to - /// fall strictly inside the inherited subtree bounds. A prover that - /// emits a `KVDigestCount` whose key is outside those bounds is trying - /// to confuse the recursion's bound tracking — it must be rejected. 
- #[test] - fn shape_walk_rejects_kvdigestcount_outside_inherited_bounds() { - let v = GroveVersion::latest(); - let (merk, _root) = make_15_key_provable_count_tree(v); - let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()); - let (mut ops, _) = merk - .prove_aggregate_count_on_range(&inner_range, v) - .unwrap() - .expect("prove succeeds"); - - // Find a Boundary KVDigestCount and rewrite its key to something - // outside the tree (way past 'z'). This will violate the inherited - // (lo, hi) bounds at the verifier's recursion frame. - let mut rewrote = false; - for op in ops.iter_mut() { - if let ProofOp::Push(Node::KVDigestCount(key, _, _)) = op { - *key = vec![0xff, 0xff]; - rewrote = true; - break; - } - } - assert!(rewrote, "test setup: expected a KVDigestCount to rewrite"); - - let bytes = encode_proof(&ops); - let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap(); - let err = result.expect_err("KVDigestCount outside bounds must be rejected"); - match err { - Error::InvalidProofError(_) => {} - other => panic!("expected InvalidProofError, got {:?}", other), - } - } -} - -/// Verifier-only smoke tests that exercise the leaf-level verifier without -/// pulling in any prover-side machinery (no `TempMerk`, no `RefWalker`). -/// They consume hardcoded fixtures kept in lockstep with the prover by -/// `tests::verify_only_fixture_matches_fresh_prover_output` above — -/// when that drift check fails it prints fresh constants to paste here. -#[cfg(test)] -mod verify_only_tests { - use super::*; - - /// Hex-encoded proof bytes for a 15-key `ProvableCountTree` (keys - /// "a"..="o", each with feature `ProvableCountedMerkNode(1)`) queried - /// with `RangeInclusive("c"..="l")`. Captured from - /// `dump_verify_only_fixtures`; regenerate if the proof encoding ever - /// changes. 
- pub(super) const FIXTURE_15_KEY_C_TO_L_PROOF_HEX: &str = "1e76f2d62fbefb076d8902b2f25bcf9acbd1e903b740c98ea0d926473922f6bbb50000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011a01622022ec9d571ba774cf9e83d0194962f5d1e3aa1a48d486a67e2762a6c79590150000000000000003101a0163b7d770040f780e9deff6bc038abea66e108b88d098d16d24cd7486eb671060b20000000000000001111a0164d2ad1a0bb9fdf4450bd87151c08b9968cd046bda6654aabdba2430b0a981e7900000000000000007101e28b724715b1fab1f72e0be7e944488dcfbeeb875867d27c06ad9bad8c739997207ce95cd4d1789e01c0a079a3f2c18a3888f2d69fa6d0eabf51c2b434f7cb99e212f1a0042798fb890e8203f007f4f58b033a72cb4e070bfddceb27687d641fe0000000000000003111a01683fc14ed7ecde203a90425ee191e9db5966336d737f0398ec93b764517b6df400000000000000000f101e6da2e2f8e4bdead2a8ac51909f0fa0fb88d47d6bc3b84858bb739fb28a36501031b7c191d5ac70764f815bd7a6c7d0e628f48cef5b813933c07d5ce0ac1dbd5a995443ca10193ebf20e64468deaecc061a981a6dbf4f30e7154b5e9ab806866d00000000000000031a016c55c024f95ca4cc338f7cc2e25db37be2a3fa3a40b151017e460bfc0779cf369f0000000000000007101e3673296561a4d6c3e1ec5cd02c5c468acbd3c8ccd4a42906e8ed06d3fb587a0d2b6d9e310b7c94d3f91fcbb3d5f7547b76c6d1ab3ac3d3540752c5f0b46be24a2f66bf541434a53eae46fa4e6092c03511538c0e1a2c5fc0f0deb72de08a71e500000000000000031111"; - pub(super) const FIXTURE_15_KEY_C_TO_L_ROOT_HEX: &str = - "19ed16776ebe6643b342a238baf7508ddf687fc4bdd53e98f91df8bffb605d96"; - pub(super) const FIXTURE_15_KEY_C_TO_L_COUNT: u64 = 10; - - /// Empty proof bytes encode "empty merk" — the verifier returns - /// `(NULL_HASH, 0)`. This case has no prover dependency at all and is - /// the most basic compile-time signal that the verifier path is wired - /// up correctly under `--no-default-features --features verify`. 
- #[test] - fn empty_merk_returns_null_hash_and_zero_count() { - let inner = QueryItem::Range(b"a".to_vec()..b"z".to_vec()); - let (root, count) = verify_aggregate_count_on_range_proof(&[], &inner) - .unwrap() - .expect("verify on empty proof must succeed"); - assert_eq!(root, NULL_HASH); - assert_eq!(count, 0); - } - - /// A real `RangeInclusive("c"..="l")` proof against a 15-key - /// `ProvableCountTree`. The verifier must reconstruct the expected - /// merk root hash and recover count = 10. - #[test] - fn fixture_15_key_range_c_to_l_verifies() { - let proof = hex::decode(FIXTURE_15_KEY_C_TO_L_PROOF_HEX).expect("valid hex"); - let mut expected_root = [0u8; 32]; - expected_root - .copy_from_slice(&hex::decode(FIXTURE_15_KEY_C_TO_L_ROOT_HEX).expect("valid hex")); - - let inner = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()); - let (root, count) = verify_aggregate_count_on_range_proof(&proof, &inner) - .unwrap() - .expect("fixture proof must verify"); - assert_eq!( - root, expected_root, - "verifier reconstructed an unexpected root — fixture stale?" - ); - assert_eq!(count, FIXTURE_15_KEY_C_TO_L_COUNT); - } - - /// Mutating any single byte of the fixture proof must not yield a - /// `(honest_root, wrong_count)` outcome — the hash chain binds count via - /// `node_hash_with_count`, so any successful verify with the honest root - /// must reproduce the honest count. Single-fixture analogue of - /// `fuzz_byte_mutation_no_silent_forgery` that runs without the prover. - #[test] - fn fixture_byte_mutation_does_not_silently_forge_count() { - let proof = hex::decode(FIXTURE_15_KEY_C_TO_L_PROOF_HEX).expect("valid hex"); - let mut expected_root = [0u8; 32]; - expected_root - .copy_from_slice(&hex::decode(FIXTURE_15_KEY_C_TO_L_ROOT_HEX).expect("valid hex")); - let inner = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()); - - // Sanity check: the honest fixture verifies under the same code path - // the mutation loop will exercise. 
Without this, an `Err`-on-honest - // fixture would silently make every mutation a vacuous pass. - let (honest_root, honest_count) = verify_aggregate_count_on_range_proof(&proof, &inner) - .unwrap() - .expect("honest fixture must verify (regenerate fixture if this fails)"); - assert_eq!(honest_root, expected_root); - assert_eq!(honest_count, FIXTURE_15_KEY_C_TO_L_COUNT); - - for byte_idx in 0..proof.len() { - for &delta in &[1u8, 0x55, 0xff] { - let mut bytes = proof.clone(); - let original = bytes[byte_idx]; - let mutated = if delta == 0xff { - original ^ 0xff - } else { - original.wrapping_add(delta) - }; - if mutated == original { - continue; - } - bytes[byte_idx] = mutated; - if let Ok((root, count)) = - verify_aggregate_count_on_range_proof(&bytes, &inner).unwrap() - && root == expected_root - { - assert_eq!( - count, FIXTURE_15_KEY_C_TO_L_COUNT, - "SILENT FORGERY at byte {} (delta=0x{:02x}): \ - verifier returned the honest root but a wrong count \ - ({} != {}).", - byte_idx, delta, count, FIXTURE_15_KEY_C_TO_L_COUNT - ); - } - // Err and Ok-with-different-root are both safe outcomes. - } - } - } -} diff --git a/merk/src/proofs/query/aggregate_count/emit.rs b/merk/src/proofs/query/aggregate_count/emit.rs new file mode 100644 index 000000000..eaf5be947 --- /dev/null +++ b/merk/src/proofs/query/aggregate_count/emit.rs @@ -0,0 +1,263 @@ +//! Recursive proof-emission engine for `AggregateCountOnRange`. +//! +//! For each subtree we visit, the bound classification (Disjoint / +//! Contained / Boundary) determines what op to push and whether to +//! descend: +//! +//! - **Disjoint** / **Contained** → emit a single `HashWithCount` op +//! for the collapsed subtree root. Contained contributes its full +//! subtree count to the running in-range total; Disjoint contributes +//! 0. (Both still need the count hash-bound so the verifier can +//! reconstruct the parent's `own_count` later — see the inline +//! comment on the `HashWithCount` emit for the long form.) +//! 
- **Boundary** → emit `KVDigestCount(key, value_hash, node_count)`
+//!   for the current node, recurse into both children for descent, and
+//!   add `own_count = node_count − left_struct − right_struct` to the
+//!   running total iff the node's key is itself in range. This is what
+//!   makes `NonCounted`-wrapped entries fall out of the in-range total
+//!   automatically (their node_count is 0).
+
+use std::collections::LinkedList;
+
+use grovedb_costs::{cost_return_on_error, CostResult, CostsExt, OperationCost};
+use grovedb_version::version::GroveVersion;
+
+use super::provable_count_from_aggregate;
+use crate::{
+    proofs::{
+        query::{
+            aggregate_common::{classify_subtree, SubtreeClassification, NULL_HASH},
+            QueryItem,
+        },
+        Node, Op,
+    },
+    tree::{kv::ValueDefinedCostType, Fetch, RefWalker},
+    CryptoHash, Error,
+};
+
+/// Recursive proof emitter. Always called on a non-empty subtree.
+///
+/// At entry, `subtree_lo_excl` / `subtree_hi_excl` are the inherited
+/// exclusive key bounds for the subtree this walker points at (both `None`
+/// at the root call).
+pub(super) fn emit_count_proof<S>(
+    walker: &mut RefWalker<'_, S>,
+    range: &QueryItem,
+    subtree_lo_excl: Option<&[u8]>,
+    subtree_hi_excl: Option<&[u8]>,
+    ops: &mut LinkedList<Op>,
+    grove_version: &GroveVersion,
+) -> CostResult<u64, Error>
+where
+    S: Fetch + Sized + Clone,
+{
+    let mut cost = OperationCost::default();
+
+    // Step 1: classify the current subtree against the inner range.
+    let class = classify_subtree(subtree_lo_excl, subtree_hi_excl, range);
+
+    if matches!(
+        class,
+        SubtreeClassification::Disjoint | SubtreeClassification::Contained
+    ) {
+        // Whole subtree is either entirely outside or entirely inside the
+        // range. Either way we emit a single self-verifying
+        // `HashWithCount(kv_hash, left_child_hash, right_child_hash, count)`
+        // op for the subtree's root.
+ // + // Why HashWithCount even for Disjoint subtrees (rather than the + // smaller `Hash(node_hash)` that an in-range count would never + // need)? Because the parent's `own_count` is computed by the + // verifier as `parent_aggregate − left_struct − right_struct` (see + // `verify_count_shape`), so the *structural* count of every child + // — including disjoint outside subtrees — has to be + // cryptographically bound to the parent's hash chain. The only + // node type that carries a hash-bound count is `HashWithCount` + // (its four committed fields recompute `node_hash_with_count` and + // would diverge under any count tampering). Plain `Hash(node_hash)` + // carries no count, so a malicious prover could lie about the + // structural count and skew the parent's `own_count` + // derivation — leading to silent over/under-counts at boundary + // ancestors. + let aggregate = match walker.tree().aggregate_data() { + Ok(a) => a, + Err(e) => { + return Err(Error::InvalidProofError(format!("aggregate_data: {}", e))) + .wrap_with_cost(cost); + } + }; + let subtree_count = match provable_count_from_aggregate(aggregate) { + Ok(c) => c, + Err(e) => return Err(e).wrap_with_cost(cost), + }; + let kv_hash = *walker.tree().kv_hash(); + let left_child_hash = walker + .tree() + .link(true) + .map(|l| *l.hash()) + .unwrap_or(NULL_HASH); + let right_child_hash = walker + .tree() + .link(false) + .map(|l| *l.hash()) + .unwrap_or(NULL_HASH); + ops.push_back(Op::Push(Node::HashWithCount( + kv_hash, + left_child_hash, + right_child_hash, + subtree_count, + ))); + // For the prover-side in-range total: Contained contributes its + // entire subtree count (which already excludes NonCounted entries + // because their stored aggregate is 0); Disjoint contributes 0. 
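The per-subtree contribution rule described in the comments above (Disjoint adds 0, Contained adds its whole hash-bound subtree count, a Boundary node adds only its own weight) can be sketched standalone. The `Class`/`contribution` names below are hypothetical illustration helpers, not the crate's API; the real code reads these values from hash-bound aggregates:

```rust
enum Class {
    Disjoint,
    Contained,
    Boundary,
}

// Per-subtree contribution to the in-range total: Disjoint adds 0, Contained
// adds its whole subtree count, and a Boundary node adds only its own weight,
// own_count = node_count − left_struct − right_struct (saturating, matching
// the emitter's defensive arithmetic).
fn contribution(class: Class, node_count: u64, left: u64, right: u64, key_in_range: bool) -> u64 {
    match class {
        Class::Disjoint => 0,
        Class::Contained => node_count,
        Class::Boundary => {
            if key_in_range {
                node_count.saturating_sub(left).saturating_sub(right)
            } else {
                0
            }
        }
    }
}

fn main() {
    // Boundary node with 4-node and 2-node child subtrees and node_count 7
    // contributes exactly its own weight: 7 − 4 − 2 = 1.
    assert_eq!(contribution(Class::Boundary, 7, 4, 2, true), 1);
    // A NonCounted boundary leaf (node_count 0, no children) contributes 0.
    assert_eq!(contribution(Class::Boundary, 0, 0, 0, true), 0);
    // Contained subtrees contribute their full count; Disjoint contribute none.
    assert_eq!(contribution(Class::Contained, 5, 3, 1, false), 5);
    assert_eq!(contribution(Class::Disjoint, 5, 3, 1, true), 0);
}
```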
+        let in_range_contribution = match class {
+            SubtreeClassification::Contained => subtree_count,
+            SubtreeClassification::Disjoint => 0,
+            SubtreeClassification::Boundary => unreachable!(),
+        };
+        return Ok(in_range_contribution).wrap_with_cost(cost);
+    }
+    // class == Boundary — fall through to descent + KVDigestCount emission.
+
+    // Step 2: snapshot what we need from the current node before walking.
+    // walk(true/false) takes &mut self.tree, so we must drop any existing
+    // borrows on walker.tree() before calling it.
+    let node_key: Vec<u8> = walker.tree().key().to_vec();
+    let node_value_hash: CryptoHash = *walker.tree().value_hash();
+    let node_count: u64 = match walker
+        .tree()
+        .aggregate_data()
+        .map_err(|e| Error::InvalidProofError(format!("aggregate_data: {}", e)))
+    {
+        Ok(data) => match provable_count_from_aggregate(data) {
+            Ok(c) => c,
+            Err(e) => return Err(e).wrap_with_cost(cost),
+        },
+        Err(e) => return Err(e).wrap_with_cost(cost),
+    };
+
+    // Snapshot each child link's structural aggregate count from the link
+    // itself (avoids loading the child for this lookup). The verifier needs
+    // these to compute `own_count = node_count − left_struct − right_struct`
+    // at this boundary node.
+    let left_link_aggregate: u64 = walker
+        .tree()
+        .link(true)
+        .map(|l| l.aggregate_data().as_count_u64())
+        .unwrap_or(0);
+    let right_link_aggregate: u64 = walker
+        .tree()
+        .link(false)
+        .map(|l| l.aggregate_data().as_count_u64())
+        .unwrap_or(0);
+    let left_link_present = walker.tree().link(true).is_some();
+    let right_link_present = walker.tree().link(false).is_some();
+
+    let mut total: u64 = 0;
+
+    // Step 3: handle the LEFT child. Both Disjoint and Contained require a
+    // one-level walk so the recursive Disjoint/Contained arm can emit a
+    // self-verifying `HashWithCount` (plain `Hash` is no longer used here
+    // — see the Disjoint branch comment above).
+    let left_emitted = if left_link_present {
+        let left_lo = subtree_lo_excl;
+        let left_hi: Option<&[u8]> = Some(node_key.as_slice());
+        let walked = cost_return_on_error!(
+            &mut cost,
+            walker.walk(
+                true,
+                None::<&fn(&[u8], &GroveVersion) -> Option<ValueDefinedCostType>>,
+                grove_version,
+            )
+        );
+        let mut left_walker = match walked {
+            Some(lw) => lw,
+            None => {
+                return Err(Error::CorruptedState(
+                    "tree.link(true) was Some but walk(true) returned None",
+                ))
+                .wrap_with_cost(cost)
+            }
+        };
+        let n = cost_return_on_error!(
+            &mut cost,
+            emit_count_proof(
+                &mut left_walker,
+                range,
+                left_lo,
+                left_hi,
+                ops,
+                grove_version,
+            )
+        );
+        total = total.saturating_add(n);
+        true
+    } else {
+        false
+    };
+
+    // Step 4: emit the current node as a boundary KVDigestCount + attach left
+    // as its left child. The node's own contribution to the in-range count
+    // is `own_count` (0 for `NonCounted`-wrapped, 1 for normal), derived as
+    // `node_count − left_struct − right_struct`. This is what makes
+    // NonCounted entries fall out of the count: a NonCounted leaf has
+    // node_count = 0 and no children, so own_count = 0.
+    ops.push_back(Op::Push(Node::KVDigestCount(
+        node_key.clone(),
+        node_value_hash,
+        node_count,
+    )));
+    if left_emitted {
+        ops.push_back(Op::Parent);
+    }
+    if range.contains(&node_key) {
+        let own_count = node_count
+            .saturating_sub(left_link_aggregate)
+            .saturating_sub(right_link_aggregate);
+        total = total.saturating_add(own_count);
+    }
+
+    // Step 5: handle the RIGHT child. Same descent pattern as LEFT.
+    let right_emitted = if right_link_present {
+        let right_lo: Option<&[u8]> = Some(node_key.as_slice());
+        let right_hi = subtree_hi_excl;
+        let walked = cost_return_on_error!(
+            &mut cost,
+            walker.walk(
+                false,
+                None::<&fn(&[u8], &GroveVersion) -> Option<ValueDefinedCostType>>,
+                grove_version,
+            )
+        );
+        let mut right_walker = match walked {
+            Some(rw) => rw,
+            None => {
+                return Err(Error::CorruptedState(
+                    "tree.link(false) was Some but walk(false) returned None",
+                ))
+                .wrap_with_cost(cost)
+            }
+        };
+        let n = cost_return_on_error!(
+            &mut cost,
+            emit_count_proof(
+                &mut right_walker,
+                range,
+                right_lo,
+                right_hi,
+                ops,
+                grove_version,
+            )
+        );
+        total = total.saturating_add(n);
+        true
+    } else {
+        false
+    };
+
+    if right_emitted {
+        ops.push_back(Op::Child);
+    }
+
+    Ok(total).wrap_with_cost(cost)
+}
diff --git a/merk/src/proofs/query/aggregate_count/mod.rs b/merk/src/proofs/query/aggregate_count/mod.rs
new file mode 100644
index 000000000..30a3a190f
--- /dev/null
+++ b/merk/src/proofs/query/aggregate_count/mod.rs
@@ -0,0 +1,76 @@
+//! Proof generation and verification for `AggregateCountOnRange` queries.
+//!
+//! This module implements the count-only proof shape described in the GroveDB
+//! book chapter "Aggregate Count Queries". It is intentionally **separate**
+//! from `create_proof_internal`: regular proofs always descend into a queried
+//! subtree, but count proofs *stop* at fully-inside subtree roots and emit a
+//! single `HashWithCount` op for the entire collapsed subtree.
+//!
+//! The proof targets a `ProvableCountTree` or `ProvableCountSumTree` (or
+//! their `NonCounted*` wrapper variants — wrappers only affect whether the
+//! tree contributes to its parent's count, not its own internal count
+//! mechanics). On any other tree type the entry point returns
+//! `Error::InvalidProofError`.
+//!
+//! ## Module layout
+//!
+//! - [`prove`] — `impl RefWalker` block holding the public prover entry
+//!   points (`create_aggregate_count_on_range_proof` and the no-proof
+//!   `count_aggregate_on_range`).
+//! - [`emit`] — the recursive proof-emission engine (`emit_count_proof`).
+//! - [`walk`] — the no-proof equivalent walk (`walk_count_only`).
+//! - [`verify`] — the verifier (`verify_aggregate_count_on_range_proof`)
+//!   and its recursive shape-walker.
+//! - [`tests`] / [`verify_only_tests`] — unit + integration tests.
+//!
+//! Range-bound classification is shared with the sum side via
+//! [`super::aggregate_common`].

+#[cfg(feature = "minimal")]
+mod emit;
+#[cfg(feature = "minimal")]
+mod prove;
+#[cfg(test)]
+mod tests;
+#[cfg(any(feature = "minimal", feature = "verify"))]
+mod verify;
+#[cfg(test)]
+mod verify_only_tests;
+#[cfg(feature = "minimal")]
+mod walk;

+#[cfg(any(feature = "minimal", feature = "verify"))]
+pub use verify::verify_aggregate_count_on_range_proof;

+#[cfg(feature = "minimal")]
+use crate::{
+    tree::AggregateData,
+    {Error, TreeType},
+};

+/// Returns true if `tree_type` is one of the tree types that can host an
+/// `AggregateCountOnRange` proof. `NonCounted*` wrapper variants are
+/// expected to have been stripped down to the inner tree type before this
+/// check is reached.
+#[cfg(feature = "minimal")]
+pub(super) fn is_provable_count_bearing(tree_type: TreeType) -> bool {
+    matches!(
+        tree_type,
+        TreeType::ProvableCountTree | TreeType::ProvableCountSumTree
+    )
+}

+/// Pull the count out of a `ProvableCount` / `ProvableCountAndSum` aggregate.
+/// Returns `Err(InvalidProofError)` for any other variant — the entry point
+/// has already gated `tree_type`, so reaching the error means the tree's
+/// in-memory state disagrees with its declared type.
+#[cfg(feature = "minimal")]
+pub(super) fn provable_count_from_aggregate(data: AggregateData) -> Result<u64, Error> {
+    match data {
+        AggregateData::ProvableCount(c) => Ok(c),
+        AggregateData::ProvableCountAndSum(c, _) => Ok(c),
+        other => Err(Error::InvalidProofError(format!(
+            "expected ProvableCount aggregate data on a provable count tree, got {:?}",
+            other
+        ))),
+    }
+}
diff --git a/merk/src/proofs/query/aggregate_count/prove.rs b/merk/src/proofs/query/aggregate_count/prove.rs
new file mode 100644
index 000000000..1f25ea6a6
--- /dev/null
+++ b/merk/src/proofs/query/aggregate_count/prove.rs
@@ -0,0 +1,87 @@
+//! Public prover entry points for `AggregateCountOnRange` queries.
+//!
+//! `impl RefWalker` block holding both the proof-emitting entry point
+//! (`create_aggregate_count_on_range_proof`) and its no-proof read
+//! counterpart (`count_aggregate_on_range`).

+use std::collections::LinkedList;

+use grovedb_costs::{cost_return_on_error, CostResult, CostsExt, OperationCost};
+use grovedb_version::version::GroveVersion;

+use super::{emit::emit_count_proof, is_provable_count_bearing, walk::walk_count_only};
+use crate::{
+    proofs::{query::QueryItem, Op},
+    tree::{Fetch, RefWalker},
+    {Error, TreeType},
+};

+impl<S> RefWalker<'_, S>
+where
+    S: Fetch + Sized + Clone,
+{
+    /// Generate a count-only proof for an `AggregateCountOnRange` query.
+    ///
+    /// `inner_range` is the `QueryItem` wrapped by `AggregateCountOnRange`
+    /// (already stripped at the caller). `tree_type` must be one of
+    /// `ProvableCountTree` or `ProvableCountSumTree`; any other tree type is
+    /// rejected with `Error::InvalidProofError` before any walking happens.
+    ///
+    /// The returned tuple is `(proof_ops, count)`:
+    /// - `proof_ops` is the linear stream the verifier will replay to
+    ///   reconstruct the tree's root hash.
+    /// - `count` is the prover-side computed count (the verifier independently
+    ///   recomputes it from the proof and compares against the expected root
+    ///   hash; this value is returned as a convenience, not as ground truth).
+    pub fn create_aggregate_count_on_range_proof(
+        &mut self,
+        inner_range: &QueryItem,
+        tree_type: TreeType,
+        grove_version: &GroveVersion,
+    ) -> CostResult<(LinkedList<Op>, u64), Error> {
+        if !is_provable_count_bearing(tree_type) {
+            return Err(Error::InvalidProofError(format!(
+                "AggregateCountOnRange is only valid against ProvableCountTree or \
+                 ProvableCountSumTree, got {:?}",
+                tree_type
+            )))
+            .wrap_with_cost(OperationCost::default());
+        }

+        let mut cost = OperationCost::default();
+        let mut ops = LinkedList::new();
+        let count = cost_return_on_error!(
+            &mut cost,
+            emit_count_proof(self, inner_range, None, None, &mut ops, grove_version)
+        );
+        Ok((ops, count)).wrap_with_cost(cost)
+    }

+    /// Walk the tree for an `AggregateCountOnRange` query and return the
+    /// in-range count, **without** producing a proof.
+    ///
+    /// This is the no-proof counterpart of
+    /// [`Self::create_aggregate_count_on_range_proof`]. It performs the same
+    /// classification walk (Contained / Disjoint / Boundary) and reads each
+    /// node's aggregate count directly from the merk, so it is O(log n) in
+    /// the number of distinct keys under the indexed subtree — the same
+    /// complexity as the proof variant but without the proof-op allocations,
+    /// hash recomputations, or serialization round-trip.
+    ///
+    /// The caller (`Merk::count_aggregate_on_range`) is expected to have
+    /// already validated `tree_type` is `ProvableCountTree` or
+    /// `ProvableCountSumTree`; the per-node `provable_count_from_aggregate`
+    /// check inside the walk surfaces any disagreement between the declared
+    /// tree type and the in-memory aggregate.
+    ///
+    /// The result is **not** independently verifiable: the caller is trusting
+    /// their own merk read path. Callers that need a verifiable count must
+    /// use `prove_aggregate_count_on_range` +
+    /// `verify_aggregate_count_on_range_proof`.
+    pub fn count_aggregate_on_range(
+        &mut self,
+        inner_range: &QueryItem,
+        grove_version: &GroveVersion,
+    ) -> CostResult<u64, Error> {
+        walk_count_only(self, inner_range, None, None, grove_version)
+    }
+}
diff --git a/merk/src/proofs/query/aggregate_count/tests.rs b/merk/src/proofs/query/aggregate_count/tests.rs
new file mode 100644
index 000000000..96788448e
--- /dev/null
+++ b/merk/src/proofs/query/aggregate_count/tests.rs
@@ -0,0 +1,1164 @@
+//! Unit + integration tests for the aggregate-count prover/verifier.
+//!
+//! Split out of the legacy single-file `aggregate_count.rs` along with
+//! the prover/walker/verifier when the module became a directory. Body
+//! is byte-identical to the previous in-file `mod tests { ... }` block;
+//! only the `use super::*;` line at the top expanded into explicit
+//! imports from the new sub-modules.

+use std::collections::LinkedList;

+use grovedb_costs::CostsExt;
+use grovedb_version::version::GroveVersion;

+use super::verify_aggregate_count_on_range_proof;
+use crate::{
+    proofs::{
+        encode_into,
+        query::{
+            aggregate_common::{classify_subtree, SubtreeClassification, NULL_HASH},
+            QueryItem,
+        },
+        Node, Op as ProofOp,
+    },
+    test_utils::TempMerk,
+    tree::{Op, TreeFeatureType::ProvableCountedMerkNode},
+    Error, Merk, TreeType,
+};

+/// Asserts the hardcoded fixture in the `verify_only_tests` module
+/// still matches the bytes a fresh prove run produces. If the proof
+/// encoding ever changes, this test fails and prints the new
+/// constants — copy them into `verify_only_tests`.
+#[test] +fn verify_only_fixture_matches_fresh_prover_output() { + let v = GroveVersion::latest(); + let (merk, root) = make_15_key_provable_count_tree(v); + let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()); + let (ops, count) = merk + .prove_aggregate_count_on_range(&inner_range, v) + .unwrap() + .expect("prove"); + let proof_hex = hex::encode(encode_proof(&ops)); + let root_hex = hex::encode(root); + + let drift_msg = format!( + "aggregate_count proof encoding has drifted — update verify_only_tests:\n\ + const FIXTURE_15_KEY_C_TO_L_PROOF_HEX: &str = \"{}\";\n\ + const FIXTURE_15_KEY_C_TO_L_ROOT_HEX: &str = \"{}\";\n\ + const FIXTURE_15_KEY_C_TO_L_COUNT: u64 = {};", + proof_hex, root_hex, count + ); + assert_eq!( + proof_hex, + super::verify_only_tests::FIXTURE_15_KEY_C_TO_L_PROOF_HEX, + "{}", + drift_msg + ); + assert_eq!( + root_hex, + super::verify_only_tests::FIXTURE_15_KEY_C_TO_L_ROOT_HEX, + "{}", + drift_msg + ); + assert_eq!( + count, + super::verify_only_tests::FIXTURE_15_KEY_C_TO_L_COUNT, + "{}", + drift_msg + ); +} + +fn range_inclusive(lo: &[u8], hi: &[u8]) -> QueryItem { + QueryItem::RangeInclusive(lo.to_vec()..=hi.to_vec()) +} + +fn range_full() -> QueryItem { + QueryItem::RangeFull(std::ops::RangeFull) +} + +fn range_from(lo: &[u8]) -> QueryItem { + QueryItem::RangeFrom(lo.to_vec()..) +} + +fn range_after(lo: &[u8]) -> QueryItem { + QueryItem::RangeAfter(lo.to_vec()..) +} + +#[test] +fn classify_disjoint_below() { + let r = range_inclusive(b"d", b"f"); + // subtree (None, b"c") — keys < "c", entirely below ["d", "f"]. + assert_eq!( + classify_subtree(None, Some(b"c"), &r), + SubtreeClassification::Disjoint, + ); +} + +#[test] +fn classify_disjoint_above() { + let r = range_inclusive(b"d", b"f"); + // subtree (b"g", None) — keys > "g", entirely above ["d", "f"]. 
+ assert_eq!( + classify_subtree(Some(b"g"), None, &r), + SubtreeClassification::Disjoint, + ); +} + +#[test] +fn classify_disjoint_at_lower_boundary_inclusive() { + let r = range_inclusive(b"d", b"f"); + // subtree (None, b"d") — keys < "d", just below the inclusive bound. + assert_eq!( + classify_subtree(None, Some(b"d"), &r), + SubtreeClassification::Disjoint, + ); +} + +#[test] +fn classify_disjoint_at_upper_boundary_inclusive() { + let r = range_inclusive(b"d", b"f"); + // subtree (b"f", None) — keys > "f", just above the inclusive bound. + assert_eq!( + classify_subtree(Some(b"f"), None, &r), + SubtreeClassification::Disjoint, + ); +} + +#[test] +fn classify_contained_simple() { + let r = range_inclusive(b"a", b"z"); + // subtree (b"d", b"f") — keys in ("d", "f"), all in ["a", "z"]. + assert_eq!( + classify_subtree(Some(b"d"), Some(b"f"), &r), + SubtreeClassification::Contained, + ); +} + +#[test] +fn classify_contained_full_range_full_subtree() { + let r = range_full(); + // The full range matches everything — even an unbounded subtree is + // contained. + assert_eq!( + classify_subtree(None, None, &r), + SubtreeClassification::Contained, + ); +} + +#[test] +fn classify_boundary_overlapping_lower() { + let r = range_inclusive(b"d", b"f"); + // subtree (b"c", b"e") — keys in ("c", "e"), straddles the lower bound. + assert_eq!( + classify_subtree(Some(b"c"), Some(b"e"), &r), + SubtreeClassification::Boundary, + ); +} + +#[test] +fn classify_boundary_overlapping_upper() { + let r = range_inclusive(b"d", b"f"); + // subtree (b"e", b"g") — keys in ("e", "g"), straddles the upper bound. + assert_eq!( + classify_subtree(Some(b"e"), Some(b"g"), &r), + SubtreeClassification::Boundary, + ); +} + +#[test] +fn classify_boundary_unbounded_below_with_bounded_range() { + let r = range_from(b"d"); + // subtree (None, b"e") — could include keys < "d", so boundary. 
+    assert_eq!(
+        classify_subtree(None, Some(b"e"), &r),
+        SubtreeClassification::Boundary,
+    );
+}
+
+#[test]
+fn classify_contained_range_after_exclusive() {
+    let r = range_after(b"b");
+    // RangeAfter(b"b") = (b, +inf). subtree (b"b", b"e") — keys > "b" and
+    // < "e", all in (b, +inf). Contained.
+    assert_eq!(
+        classify_subtree(Some(b"b"), Some(b"e"), &r),
+        SubtreeClassification::Contained,
+    );
+}
+
+// ---------- end-to-end integration tests on a real merk ----------
+//
+// These tests build a small ProvableCountTree, generate count proofs
+// through the merk-level API, then verify them with the count verifier.
+// They cover the four documented categories: open-range (lower-only and
+// upper-only) and closed-range (inclusive and after-to-inclusive). Empty
+// tree and single-bound edge cases are also exercised.
+
+/// Build a fresh `ProvableCountTree` populated with single-byte keys
+/// "a".."o" (15 keys) — same shape as the running example in the book
+/// chapter's "Closed ranges" section. Returns the merk and its current
+/// root hash.
+fn make_15_key_provable_count_tree(grove_version: &GroveVersion) -> (TempMerk, [u8; 32]) {
+    let mut merk = TempMerk::new_with_tree_type(grove_version, TreeType::ProvableCountTree);
+    let keys: Vec<Vec<u8>> = (b'a'..=b'o').map(|c| vec![c]).collect();
+    let entries: Vec<(Vec<u8>, Op)> = keys
+        .iter()
+        .enumerate()
+        .map(|(i, k)| {
+            (
+                k.clone(),
+                Op::Put(vec![i as u8], ProvableCountedMerkNode(1)),
+            )
+        })
+        .collect();
+    merk.apply::<_, Vec<_>>(&entries, &[], None, grove_version)
+        .unwrap()
+        .expect("apply should succeed");
+    merk.commit(grove_version);
+    let root_hash = merk.root_hash().unwrap();
+    (merk, root_hash)
+}
+
+/// Encode a `LinkedList<ProofOp>` into the wire format that the verifier
+/// consumes.
+fn encode_proof(ops: &LinkedList<ProofOp>) -> Vec<u8> {
+    let mut bytes = Vec::with_capacity(128);
+    encode_into(ops.iter(), &mut bytes);
+    bytes
+}
+
+/// Round-trip helper: prove the inner range, encode the proof, verify it,
+/// assert the recovered root hash matches and the recovered count matches
+/// `expected_count`.
+fn round_trip(
+    merk: &TempMerk,
+    expected_root: [u8; 32],
+    inner_range: QueryItem,
+    expected_count: u64,
+    grove_version: &GroveVersion,
+) {
+    let (ops, prover_count) = merk
+        .prove_aggregate_count_on_range(&inner_range, grove_version)
+        .unwrap()
+        .expect("prove should succeed");
+    assert_eq!(
+        prover_count, expected_count,
+        "prover count mismatch for range {:?}",
+        inner_range
+    );
+    let bytes = encode_proof(&ops);
+    let (root, verifier_count) = verify_aggregate_count_on_range_proof(&bytes, &inner_range)
+        .unwrap()
+        .expect("verify should succeed");
+    assert_eq!(
+        root, expected_root,
+        "verifier reconstructed wrong root for range {:?}",
+        inner_range
+    );
+    assert_eq!(
+        verifier_count, expected_count,
+        "verifier count mismatch for range {:?}",
+        inner_range
+    );
+}
+
+#[test]
+fn integration_open_range_from() {
+    let v = GroveVersion::latest();
+    let (merk, root) = make_15_key_provable_count_tree(v);
+    // RangeFrom("c"..) → keys c..o (13 keys).
+    round_trip(&merk, root, QueryItem::RangeFrom(b"c".to_vec()..), 13, v);
+}
+
+#[test]
+fn integration_open_range_after() {
+    let v = GroveVersion::latest();
+    let (merk, root) = make_15_key_provable_count_tree(v);
+    // RangeAfter("b"..) → keys c..o (13 keys), same set as RangeFrom("c"..)
+    // but proof shape differs — the boundary lands on "b" exclusive.
+    round_trip(&merk, root, QueryItem::RangeAfter(b"b".to_vec()..), 13, v);
+}
+
+#[test]
+fn integration_open_range_to() {
+    let v = GroveVersion::latest();
+    let (merk, root) = make_15_key_provable_count_tree(v);
+    // RangeTo(..b"e") → keys a..d (4 keys, exclusive upper).
+    round_trip(&merk, root, QueryItem::RangeTo(..b"e".to_vec()), 4, v);
+}
+
+#[test]
+fn integration_open_range_to_inclusive() {
+    let v = GroveVersion::latest();
+    let (merk, root) = make_15_key_provable_count_tree(v);
+    // RangeToInclusive(..=b"e") → keys a..e (5 keys, inclusive upper).
+    round_trip(
+        &merk,
+        root,
+        QueryItem::RangeToInclusive(..=b"e".to_vec()),
+        5,
+        v,
+    );
+}
+
+#[test]
+fn integration_closed_range_inclusive() {
+    let v = GroveVersion::latest();
+    let (merk, root) = make_15_key_provable_count_tree(v);
+    // RangeInclusive("c"..="l") → 10 keys.
+    round_trip(
+        &merk,
+        root,
+        QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()),
+        10,
+        v,
+    );
+}
+
+#[test]
+fn integration_closed_range_exclusive() {
+    let v = GroveVersion::latest();
+    let (merk, root) = make_15_key_provable_count_tree(v);
+    // Range("c".."l") → keys c..k (9 keys, exclusive upper).
+    round_trip(
+        &merk,
+        root,
+        QueryItem::Range(b"c".to_vec()..b"l".to_vec()),
+        9,
+        v,
+    );
+}
+
+#[test]
+fn integration_closed_range_after_to_inclusive() {
+    let v = GroveVersion::latest();
+    let (merk, root) = make_15_key_provable_count_tree(v);
+    // RangeAfterToInclusive("c"..="l") → keys d..l (9 keys: "c" excluded,
+    // "l" included).
+    round_trip(
+        &merk,
+        root,
+        QueryItem::RangeAfterToInclusive(b"c".to_vec()..=b"l".to_vec()),
+        9,
+        v,
+    );
+}
+
+#[test]
+fn integration_closed_range_after_to_exclusive() {
+    let v = GroveVersion::latest();
+    let (merk, root) = make_15_key_provable_count_tree(v);
+    // RangeAfterTo("c".."l") → keys d..k (8 keys: both bounds excluded).
+    round_trip(
+        &merk,
+        root,
+        QueryItem::RangeAfterTo(b"c".to_vec()..b"l".to_vec()),
+        8,
+        v,
+    );
+}
+
+#[test]
+fn integration_range_below_all_keys() {
+    let v = GroveVersion::latest();
+    let (merk, root) = make_15_key_provable_count_tree(v);
+    // Entire range below the smallest key — should produce count = 0
+    // and a Disjoint proof at the root level.
+ round_trip( + &merk, + root, + QueryItem::RangeInclusive(vec![0x00]..=vec![0x10]), + 0, + v, + ); +} + +#[test] +fn integration_range_above_all_keys() { + let v = GroveVersion::latest(); + let (merk, root) = make_15_key_provable_count_tree(v); + // Entire range above the largest key. + round_trip( + &merk, + root, + QueryItem::RangeInclusive(b"z".to_vec()..=vec![0xff]), + 0, + v, + ); +} + +#[test] +fn integration_empty_merk() { + let v = GroveVersion::latest(); + let merk = TempMerk::new_with_tree_type(v, TreeType::ProvableCountTree); + let (ops, prover_count) = merk + .prove_aggregate_count_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v) + .unwrap() + .expect("prove on empty merk should succeed"); + assert_eq!(prover_count, 0); + // Empty proof means the verifier returns NULL_HASH and count = 0. + let bytes = encode_proof(&ops); + let (root, verifier_count) = verify_aggregate_count_on_range_proof( + &bytes, + &QueryItem::Range(b"a".to_vec()..b"z".to_vec()), + ) + .unwrap() + .expect("verify on empty merk should succeed"); + assert_eq!(root, NULL_HASH); + assert_eq!(verifier_count, 0); +} + +#[test] +fn integration_rejected_on_normal_tree() { + let v = GroveVersion::latest(); + let merk = TempMerk::new(v); // NormalTree + let err = merk + .prove_aggregate_count_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v) + .unwrap(); + assert!( + err.is_err(), + "expected an InvalidProofError on NormalTree, got Ok({:?})", + err.ok().map(|(_, c)| c) + ); +} + +#[test] +fn integration_count_forgery_is_rejected() { + // Demonstrates the cryptographic binding: tamper with the count in a + // HashWithCount op and the verifier's root-hash recomputation must + // diverge from the expected root. 
+ let v = GroveVersion::latest(); + let (merk, expected_root) = make_15_key_provable_count_tree(v); + let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()); + let (mut ops, _prover_count) = merk + .prove_aggregate_count_on_range(&inner_range, v) + .unwrap() + .expect("prove should succeed"); + + // Forge: bump the count on the first HashWithCount op we see. + let mut tampered = false; + for op in ops.iter_mut() { + if let ProofOp::Push(Node::HashWithCount(_, _, _, count)) + | ProofOp::PushInverted(Node::HashWithCount(_, _, _, count)) = op + { + *count = count.saturating_add(1); + tampered = true; + break; + } + } + assert!( + tampered, + "test setup: expected at least one HashWithCount op" + ); + + let bytes = encode_proof(&ops); + let (root, _count) = verify_aggregate_count_on_range_proof(&bytes, &inner_range) + .unwrap() + .expect("verify should still complete (root mismatch is the caller's job)"); + assert_ne!( + root, expected_root, + "tampered count must produce a different reconstructed root hash" + ); +} + +// ---------- no-proof variant: count_aggregate_on_range ---------- +// +// The no-proof entry point must return exactly the same count as the +// proof path for every range shape, without producing any proof ops. +// These tests cross-check the two paths on the same merk. + +/// Cross-check: assert that `count_aggregate_on_range` and the count +/// returned by `prove_aggregate_count_on_range` agree for the given +/// range, and that both equal `expected_count`. 
+fn no_proof_matches_prover(
+    merk: &TempMerk,
+    inner_range: QueryItem,
+    expected_count: u64,
+    grove_version: &GroveVersion,
+) {
+    let no_proof = merk
+        .count_aggregate_on_range(&inner_range, grove_version)
+        .unwrap()
+        .expect("count_aggregate_on_range should succeed");
+    assert_eq!(
+        no_proof, expected_count,
+        "no-proof variant returned wrong count for range {:?}",
+        inner_range
+    );
+    let (_ops, prover_count) = merk
+        .prove_aggregate_count_on_range(&inner_range, grove_version)
+        .unwrap()
+        .expect("prove should succeed");
+    assert_eq!(
+        no_proof, prover_count,
+        "no-proof variant disagrees with prover count for range {:?}",
+        inner_range
+    );
+}
+
+#[test]
+fn no_proof_matches_prover_closed_range_inclusive() {
+    let v = GroveVersion::latest();
+    let (merk, _root) = make_15_key_provable_count_tree(v);
+    no_proof_matches_prover(
+        &merk,
+        QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()),
+        10,
+        v,
+    );
+}
+
+#[test]
+fn no_proof_matches_prover_closed_range_exclusive() {
+    let v = GroveVersion::latest();
+    let (merk, _root) = make_15_key_provable_count_tree(v);
+    no_proof_matches_prover(&merk, QueryItem::Range(b"c".to_vec()..b"l".to_vec()), 9, v);
+}
+
+#[test]
+fn no_proof_matches_prover_open_range_from() {
+    let v = GroveVersion::latest();
+    let (merk, _root) = make_15_key_provable_count_tree(v);
+    no_proof_matches_prover(&merk, QueryItem::RangeFrom(b"c".to_vec()..), 13, v);
+}
+
+#[test]
+fn no_proof_matches_prover_range_below_all_keys() {
+    let v = GroveVersion::latest();
+    let (merk, _root) = make_15_key_provable_count_tree(v);
+    no_proof_matches_prover(
+        &merk,
+        QueryItem::RangeInclusive(vec![0x00]..=vec![0x10]),
+        0,
+        v,
+    );
+}
+
+#[test]
+fn no_proof_empty_merk_returns_zero() {
+    let v = GroveVersion::latest();
+    let merk = TempMerk::new_with_tree_type(v, TreeType::ProvableCountTree);
+    let count = merk
+        .count_aggregate_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v)
+        .unwrap()
+        .expect("count_aggregate_on_range on empty merk should succeed");
+    assert_eq!(count, 0);
+}
+
+#[test]
+fn no_proof_rejected_on_normal_tree() {
+    let v = GroveVersion::latest();
+    let merk = TempMerk::new(v); // NormalTree
+    let result = merk
+        .count_aggregate_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v)
+        .unwrap();
+    assert!(
+        result.is_err(),
+        "expected InvalidProofError on NormalTree, got Ok({:?})",
+        result.ok()
+    );
+}
+
+#[test]
+fn no_proof_matches_prover_range_after() {
+    // RangeAfter at the root pushes the left boundary exclusive to "b",
+    // which causes the walk to descend into the right subtree from the
+    // root — exercising the right-child arm of walk_count_only.
+    let v = GroveVersion::latest();
+    let (merk, _root) = make_15_key_provable_count_tree(v);
+    no_proof_matches_prover(&merk, QueryItem::RangeAfter(b"b".to_vec()..), 13, v);
+}
+
+#[test]
+fn no_proof_matches_prover_range_to() {
+    let v = GroveVersion::latest();
+    let (merk, _root) = make_15_key_provable_count_tree(v);
+    // RangeTo(..b"e") — exclusive upper, keys a..d (4 keys).
+    no_proof_matches_prover(&merk, QueryItem::RangeTo(..b"e".to_vec()), 4, v);
+}
+
+#[test]
+fn no_proof_matches_prover_range_to_inclusive() {
+    let v = GroveVersion::latest();
+    let (merk, _root) = make_15_key_provable_count_tree(v);
+    // RangeToInclusive(..=b"e") — keys a..=e (5 keys).
+    no_proof_matches_prover(&merk, QueryItem::RangeToInclusive(..=b"e".to_vec()), 5, v);
+}
+
+#[test]
+fn no_proof_matches_prover_range_after_to_inclusive() {
+    let v = GroveVersion::latest();
+    let (merk, _root) = make_15_key_provable_count_tree(v);
+    // RangeAfterToInclusive("c"..="l") — keys d..=l (9 keys).
+    no_proof_matches_prover(
+        &merk,
+        QueryItem::RangeAfterToInclusive(b"c".to_vec()..=b"l".to_vec()),
+        9,
+        v,
+    );
+}
+
+#[test]
+fn no_proof_provable_count_sum_tree() {
+    // Exercise the ProvableCountSumTree branch of the tree-type gate —
+    // it should accept the walk and return the same count as a
+    // ProvableCountTree with the same key set.
+    let v = GroveVersion::latest();
+    let mut merk = TempMerk::new_with_tree_type(v, TreeType::ProvableCountSumTree);
+    // ProvableCountedSummedMerkNode(1, 0): treats each entry as
+    // count 1 with sum contribution 0.
+    let entries: Vec<(Vec<u8>, Op)> = (b'a'..=b'o')
+        .enumerate()
+        .map(|(i, c)| {
+            (
+                vec![c],
+                Op::Put(
+                    vec![i as u8],
+                    crate::tree::TreeFeatureType::ProvableCountedSummedMerkNode(1, 0),
+                ),
+            )
+        })
+        .collect();
+    merk.apply::<_, Vec<_>>(&entries, &[], None, v)
+        .unwrap()
+        .expect("apply ProvableCountSumTree entries");
+    merk.commit(v);
+
+    let count = merk
+        .count_aggregate_on_range(&QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), v)
+        .unwrap()
+        .expect("count_aggregate_on_range on ProvableCountSumTree should succeed");
+    assert_eq!(count, 10, "c..=l should be 10 keys");
+}
+
+// ---------- attack tests for the shape-walk verifier ----------
+//
+// These three tests exercise attacks the old allowlist-only verifier let
+// through. With the shape walk in `verify_count_shape`, each one is
+// rejected before the caller's root-hash check.
+
+/// A malicious prover sends a single `Push(Hash(expected_root))` for a
+/// non-empty tree. Without the shape check this would return
+/// `(expected_root, 0)` for any range. The shape check classifies the
+/// root with `(None, None)` against a bounded inner range as `Boundary`,
+/// expects `KVDigestCount`, and rejects.
+#[test]
+fn shape_walk_rejects_single_hash_undercount() {
+    let v = GroveVersion::latest();
+    let (merk, expected_root) = make_15_key_provable_count_tree(v);
+    let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec());
+
+    // Forged proof: a single Hash op carrying the genuine root hash.
+    let mut forged: LinkedList<ProofOp> = LinkedList::new();
+    forged.push_back(ProofOp::Push(Node::Hash(expected_root)));
+    let bytes = encode_proof(&forged);
+
+    let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap();
+    let err = result.expect_err("single-Hash forgery must be rejected");
+    // keep merk alive for clarity in the test scope
+    let _ = merk;
+    // Plain `Hash` is no longer in the count-proof allowlist (it would
+    // carry an unbound structural count), so the rejection now lands
+    // in Phase 1's coarse allowlist rather than Phase 2's shape walk.
+    // Either error message is fine — the attack is rejected.
+    match err {
+        Error::InvalidProofError(msg) => {
+            assert!(
+                msg.contains("unexpected node type")
+                    || msg.contains("expected KVDigestCount")
+                    || msg.contains("Boundary"),
+                "unexpected message: {msg}"
+            );
+        }
+        other => panic!("expected InvalidProofError, got {other:?}"),
+    }
+}
+
+/// A malicious prover replaces an in-range `HashWithCount` subtree with
+/// a `Hash` carrying that subtree's node_hash, undercounting by the
+/// subtree's count. The hash chain still matches (same node_hash), so
+/// the old allowlist verifier would have happily returned a wrong
+/// count. The shape walk classifies that position as `Contained` and
+/// requires `HashWithCount`, rejecting the swap.
+#[test] +fn shape_walk_rejects_hash_swap_for_contained_subtree() { + let v = GroveVersion::latest(); + let (merk, _root) = make_15_key_provable_count_tree(v); + let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()); + let (mut ops, _) = merk + .prove_aggregate_count_on_range(&inner_range, v) + .unwrap() + .expect("prove succeeds"); + + // Swap the first HashWithCount op for a Hash op carrying the + // computed node_hash for that subtree (so the chain check still + // matches and only the shape walk can detect the attack). + let mut swapped = false; + for op in ops.iter_mut() { + if let ProofOp::Push(Node::HashWithCount(kv_hash, l, r, c)) = op { + let node_hash = crate::tree::node_hash_with_count(kv_hash, l, r, *c).unwrap(); + *op = ProofOp::Push(Node::Hash(node_hash)); + swapped = true; + break; + } + } + assert!( + swapped, + "test setup: expected at least one HashWithCount op" + ); + + let bytes = encode_proof(&ops); + let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap(); + assert!( + result.is_err(), + "HashWithCount→Hash swap on a Contained subtree must be rejected by the shape walk" + ); +} + +/// A malicious prover attaches a `KVDigestCount` child under a leaf +/// `HashWithCount`. Because `Tree::hash()` for `HashWithCount` is +/// computed from the four embedded fields and ignores any reconstructed +/// children, the root hash check passes — but a naive verifier that +/// counts every visited node would credit the bogus child as +1. The +/// shape walk requires `Contained` positions to be **leaves**, so it +/// rejects the smuggled-in child. 
+#[test] +fn shape_walk_rejects_keyless_node_with_attached_children() { + let v = GroveVersion::latest(); + let (merk, _root) = make_15_key_provable_count_tree(v); + let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()); + let (mut ops, _honest_count) = merk + .prove_aggregate_count_on_range(&inner_range, v) + .unwrap() + .expect("prove succeeds"); + + // Smuggle a fake +1 child under the first HashWithCount op. After + // any HashWithCount(...), insert: Push(Hash(zero)) Parent — that + // attaches an extra hashed node as the LEFT child of the + // HashWithCount during reconstruction. Then add a fake + // Push(KVDigestCount) Child that would be picked up by an + // allowlist verifier counting visited keys. + // + // Concretely we splice 4 ops right after the HashWithCount: + // Push(KVDigestCount(in_range_key, value_hash, 1)) + // Parent (attach KVDigestCount as the LEFT child of HashWithCount) + // Push(Hash([0; 32])) + // Child (attach Hash as the RIGHT child of HashWithCount) + // + // The HashWithCount's hash() ignores these children, so the root + // hash recomputation is unaffected. The shape walk catches the + // Contained-position-with-children violation. 
+    let mut new_ops: LinkedList<ProofOp> = LinkedList::new();
+    let mut spliced = false;
+    for op in ops.iter() {
+        new_ops.push_back(op.clone());
+        if !spliced && matches!(op, ProofOp::Push(Node::HashWithCount(_, _, _, _))) {
+            let in_range_key = b"d".to_vec();
+            new_ops.push_back(ProofOp::Push(Node::KVDigestCount(
+                in_range_key,
+                [0u8; 32],
+                1,
+            )));
+            new_ops.push_back(ProofOp::Parent);
+            new_ops.push_back(ProofOp::Push(Node::Hash([0u8; 32])));
+            new_ops.push_back(ProofOp::Child);
+            spliced = true;
+        }
+    }
+    assert!(
+        spliced,
+        "test setup: expected to splice into a HashWithCount"
+    );
+    ops = new_ops;
+
+    let bytes = encode_proof(&ops);
+    let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap();
+    assert!(
+        result.is_err(),
+        "attaching children under HashWithCount must be rejected (root hash alone wouldn't catch it)"
+    );
+}
+
+/// `HashWithCount` is only safe inside the dedicated aggregate-count
+/// verifier (which shape-checks the collapsed subtree). The plain
+/// `Query::execute_proof` verifier must reject it on sight — otherwise
+/// a malicious prover could include `HashWithCount` in a regular
+/// query proof, attach fake KV children to it (whose pushes the
+/// verifier would credit as query results via `execute_node`), and
+/// have the parent's hash chain still verify because
+/// `Tree::hash()` for `HashWithCount` ignores attached children.
+#[test]
+fn regular_query_verifier_rejects_hash_with_count_node() {
+    use crate::proofs::query::QueryProofVerify;
+    let v = GroveVersion::latest();
+
+    // Build a regular merk and a regular range query against it.
+ let mut merk = TempMerk::new(v); + for i in 0u8..5 { + merk.apply::<_, Vec<_>>( + &[( + vec![i], + Op::Put(vec![i], crate::TreeFeatureType::BasicMerkNode), + )], + &[], + None, + v, + ) + .unwrap() + .expect("apply"); + } + merk.commit(v); + let q = + crate::proofs::query::Query::new_single_query_item(QueryItem::Range(vec![0u8]..vec![5u8])); + + // Generate an honest proof, then splice a `HashWithCount` push into + // it. The exact op sequence doesn't matter for what we're testing — + // we just need the regular verifier to refuse to process the proof + // because it contains a `HashWithCount`. + let (mut ops, _) = merk + .prove_unchecked_query_items(&[QueryItem::Range(vec![0u8]..vec![5u8])], None, true, v) + .unwrap() + .expect("prove"); + ops.push_front(ProofOp::Push(Node::HashWithCount( + [0u8; 32], [0u8; 32], [0u8; 32], 0, + ))); + let bytes = encode_proof(&ops); + + let result = q.execute_proof(&bytes, None, true, 0).unwrap(); + let err = result.expect_err("regular query verifier must reject HashWithCount on sight"); + let msg = format!("{}", err); + assert!( + msg.contains("HashWithCount") || msg.contains("aggregate-count"), + "expected HashWithCount-rejection message, got: {msg}" + ); +} + +// ---------- byte-mutation fuzzer ---------- +// +// Stronger forgery-resistance check than the three hand-crafted attack +// tests above: enumerate every byte of an honest proof, flip it to +// each of three different values, and assert the verifier never +// produces a "silent forgery" — i.e. an `Ok((root, count))` where +// the root **matches** the honest one but the count **differs**. +// +// Three safe outcomes per mutation: +// - **Rejection** — Phase 1 decode error, or Phase 2 shape mismatch. +// - **Divergence** — `Ok((root', _))` where `root' != honest_root`, +// so any caller comparing against their trusted root catches it. +// - **Same outcome** — `Ok((honest_root, honest_count))`. This can +// happen for non-canonical re-encodings (e.g. 
swapping +// `Push` ↔ `PushInverted` doesn't change the reconstructed tree's +// root or the shape walk's count). Harmless: the verifier is +// deterministic on (root, count), and that pair is what the +// caller acts on. +// +// The **unsafe** outcome is `Ok((honest_root, count'))` where +// `count' != honest_count`. The hash chain binds count via +// `node_hash_with_count`, so this should be impossible — the test +// panics if it ever happens. +// +// We also assert each safe branch fires at least once as a sanity +// check that the test is actually exercising the surface. +#[test] +fn fuzz_byte_mutation_no_silent_forgery() { + let v = GroveVersion::latest(); + let (merk, honest_root) = make_15_key_provable_count_tree(v); + let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()); + let (ops, honest_count) = merk + .prove_aggregate_count_on_range(&inner_range, v) + .unwrap() + .expect("prove"); + let honest_bytes = encode_proof(&ops); + assert!(!honest_bytes.is_empty()); + + let mut rejected = 0usize; + let mut diverged = 0usize; + let mut same_outcome = 0usize; + let mut total = 0usize; + + // Three different mutations per byte: +1, +0x55, XOR 0xff. + let deltas: [u8; 3] = [1, 0x55, 0xff]; + for byte_idx in 0..honest_bytes.len() { + for &delta in &deltas { + let mut bytes = honest_bytes.clone(); + let original = bytes[byte_idx]; + let mutated = if delta == 0xff { + original ^ 0xff + } else { + original.wrapping_add(delta) + }; + if mutated == original { + continue; // no-op, don't count + } + bytes[byte_idx] = mutated; + total += 1; + + let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap(); + match result { + Err(_) => rejected += 1, + Ok((root, count)) => { + if root == honest_root { + // Same root — the verifier MUST also produce + // the same count, otherwise we have a silent + // count-forgery: the caller would accept the + // forged count thinking it's the honest one. 
+ assert_eq!( + count, honest_count, + "SILENT FORGERY at byte index {} (delta=0x{:02x}): \ + verifier returned the honest root but a wrong count \ + ({} != {}). The hash chain should bind count.", + byte_idx, delta, count, honest_count + ); + same_outcome += 1; + } else { + // Different root — caller's root check catches it. + diverged += 1; + } + } + } + } + } + + // Sanity: each safe branch should fire at least once on a real proof. + assert!( + rejected > 0, + "expected at least one mutation to be rejected outright" + ); + assert!( + diverged > 0, + "expected at least one mutation to diverge the root hash" + ); + // `same_outcome` may legitimately be zero on some encoders, so we + // don't require it. We just require no silent forgery occurred, + // which the inner assert_eq! guarantees. + let _ = same_outcome; + assert_eq!(rejected + diverged + same_outcome, total); +} + +// ---------- randomized round-trip property test ---------- +// +// Build merks with varying sizes and key shapes from a deterministic +// RNG, run a bunch of randomly-chosen ranges through the prove → encode +// → verify pipeline, and assert the verifier's count agrees with a +// ground-truth count computed by directly intersecting the inserted +// keys with the range. Catches silent miscounts that the fixed +// examples above would miss (off-by-one, edge-of-tree, exact-bound +// matches against multi-byte keys, etc.). +#[test] +fn fuzz_random_trees_and_ranges_round_trip() { + // Tiny custom xorshift RNG so we don't have to add a dev-dep. 
+    struct XorShift(u64);
+    impl XorShift {
+        fn next_u64(&mut self) -> u64 {
+            let mut x = self.0;
+            x ^= x << 13;
+            x ^= x >> 7;
+            x ^= x << 17;
+            self.0 = x;
+            x
+        }
+        fn gen_range(&mut self, lo: usize, hi: usize) -> usize {
+            lo + (self.next_u64() as usize) % (hi - lo)
+        }
+        fn gen_key(&mut self, max_len: usize) -> Vec<u8> {
+            let len = 1 + self.gen_range(0, max_len);
+            (0..len).map(|_| (self.next_u64() & 0xff) as u8).collect()
+        }
+    }
+
+    let v = GroveVersion::latest();
+    let mut rng = XorShift(0xDEAD_BEEF_C0FFEE);
+    let trials = 16;
+    for trial in 0..trials {
+        let key_count = rng.gen_range(1, 64);
+        let mut keys: Vec<Vec<u8>> = (0..key_count).map(|_| rng.gen_key(8)).collect();
+        keys.sort();
+        keys.dedup();
+
+        let mut merk = TempMerk::new_with_tree_type(v, TreeType::ProvableCountTree);
+        let entries: Vec<(Vec<u8>, Op)> = keys
+            .iter()
+            .map(|k| (k.clone(), Op::Put(vec![0xAB], ProvableCountedMerkNode(1))))
+            .collect();
+        merk.apply::<_, Vec<_>>(&entries, &[], None, v)
+            .unwrap()
+            .expect("apply");
+        merk.commit(v);
+        let root = merk.root_hash().unwrap();
+
+        // Try several random ranges per tree, picking shapes that
+        // exercise both bounded and half-bounded variants.
+ for sub_trial in 0..6 { + let lo = rng.gen_key(8); + let hi = rng.gen_key(8); + let (lo, hi) = if lo <= hi { (lo, hi) } else { (hi, lo) }; + + let inner_range = match sub_trial % 6 { + 0 => QueryItem::Range(lo.clone()..hi.clone()), + 1 => QueryItem::RangeInclusive(lo.clone()..=hi.clone()), + 2 => QueryItem::RangeFrom(lo.clone()..), + 3 => QueryItem::RangeAfter(lo.clone()..), + 4 => QueryItem::RangeTo(..hi.clone()), + _ => QueryItem::RangeToInclusive(..=hi.clone()), + }; + + let expected = keys + .iter() + .filter(|k| inner_range.contains(k.as_slice())) + .count() as u64; + + let (ops, prover_count) = merk + .prove_aggregate_count_on_range(&inner_range, v) + .unwrap() + .expect("prove"); + assert_eq!( + prover_count, expected, + "trial {} sub {}: prover count mismatch for range {:?}", + trial, sub_trial, inner_range + ); + let bytes = encode_proof(&ops); + let (vroot, vcount) = verify_aggregate_count_on_range_proof(&bytes, &inner_range) + .unwrap() + .expect("verify"); + assert_eq!( + vroot, root, + "trial {} sub {}: verifier root mismatch", + trial, sub_trial + ); + assert_eq!( + vcount, expected, + "trial {} sub {}: verifier count mismatch for range {:?}", + trial, sub_trial, inner_range + ); + } + } +} + +// ---------- shape-walk rejection of malformed proof shapes ---------- +// +// These tests synthesize op streams that are well-formed bytes (Phase 1 +// decode succeeds) but violate the structural invariants the shape walk +// requires (Phase 2 rejection). They exist to lock down the defensive +// error branches in `verify_count_shape` so future refactors that +// accidentally relax them are caught by the test suite. + +/// `HashWithCount` is only valid as a leaf in the proof tree. 
If the
+/// prover attaches children to a Disjoint-position `HashWithCount`,
+/// the shape walk must reject — even though the parent's hash chain
+/// (which uses `Tree::hash()` for `HashWithCount`, computed from the
+/// four embedded fields and ignoring children) would still verify.
+#[test]
+fn shape_walk_rejects_disjoint_hashwithcount_with_children() {
+    let v = GroveVersion::latest();
+    let (merk, _root) = make_15_key_provable_count_tree(v);
+    // RangeAfter("o") → all 15 keys are below; the entire tree is
+    // Disjoint relative to the inner range, so the honest proof is a
+    // single Push(HashWithCount(...)).
+    let inner_range = QueryItem::RangeAfter(b"o".to_vec()..);
+    let (mut ops, _) = merk
+        .prove_aggregate_count_on_range(&inner_range, v)
+        .unwrap()
+        .expect("prove succeeds");
+
+    // Splice in another HashWithCount as the child (no key, so no
+    // ordering constraint at Phase 1) so we exercise Phase 2's
+    // leaf-only assertion at the Disjoint position.
+    let mut spliced = LinkedList::<ProofOp>::new();
+    let mut done = false;
+    for op in ops.iter() {
+        spliced.push_back(op.clone());
+        if !done && matches!(op, ProofOp::Push(Node::HashWithCount(_, _, _, _))) {
+            spliced.push_back(ProofOp::Push(Node::HashWithCount(
+                [0u8; 32], [0u8; 32], [0u8; 32], 1,
+            )));
+            spliced.push_back(ProofOp::Parent);
+            done = true;
+        }
+    }
+    assert!(done, "test setup: expected at least one HashWithCount op");
+    ops = spliced;
+
+    let bytes = encode_proof(&ops);
+    let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap();
+    let err = result.expect_err("Disjoint HashWithCount with children must be rejected");
+    match err {
+        Error::InvalidProofError(msg) => assert!(
+            msg.contains("Disjoint position must be a leaf"),
+            "unexpected message: {msg}"
+        ),
+        other => panic!("expected InvalidProofError, got {:?}", other),
+    }
+}
+
+/// At a Disjoint position the shape walk requires `HashWithCount` (only
+/// node type with a hash-bound count).
A `Hash` op there would carry an +/// untrusted structural count for the parent's `own_count` derivation, +/// so it must be rejected. +#[test] +fn shape_walk_rejects_non_hashwithcount_at_disjoint() { + let v = GroveVersion::latest(); + let (merk, _root) = make_15_key_provable_count_tree(v); + let inner_range = QueryItem::RangeAfter(b"o".to_vec()..); + let (mut ops, _) = merk + .prove_aggregate_count_on_range(&inner_range, v) + .unwrap() + .expect("prove succeeds"); + + // Replace the single Disjoint HashWithCount with a plain Hash. + let mut swapped = false; + for op in ops.iter_mut() { + if let ProofOp::Push(Node::HashWithCount(kv, l, r, c)) = op { + let node_hash = crate::tree::node_hash_with_count(kv, l, r, *c).unwrap(); + *op = ProofOp::Push(Node::Hash(node_hash)); + swapped = true; + break; + } + } + assert!(swapped, "test setup: expected a HashWithCount op to swap"); + + let bytes = encode_proof(&ops); + let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap(); + // Phase 1 rejects plain Hash via the allowlist; Phase 2 would also + // reject "expected HashWithCount at Disjoint position". Either is fine. + let err = result.expect_err("plain Hash at Disjoint must be rejected"); + match err { + Error::InvalidProofError(_) => {} + other => panic!("expected InvalidProofError, got {:?}", other), + } +} + +/// At a Boundary position the shape walk requires the node's key to +/// fall strictly inside the inherited subtree bounds. A prover that +/// emits a `KVDigestCount` whose key is outside those bounds is trying +/// to confuse the recursion's bound tracking — it must be rejected. 
+#[test] +fn shape_walk_rejects_kvdigestcount_outside_inherited_bounds() { + let v = GroveVersion::latest(); + let (merk, _root) = make_15_key_provable_count_tree(v); + let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()); + let (mut ops, _) = merk + .prove_aggregate_count_on_range(&inner_range, v) + .unwrap() + .expect("prove succeeds"); + + // Find a Boundary KVDigestCount and rewrite its key to something + // outside the tree (way past 'z'). This will violate the inherited + // (lo, hi) bounds at the verifier's recursion frame. + let mut rewrote = false; + for op in ops.iter_mut() { + if let ProofOp::Push(Node::KVDigestCount(key, _, _)) = op { + *key = vec![0xff, 0xff]; + rewrote = true; + break; + } + } + assert!(rewrote, "test setup: expected a KVDigestCount to rewrite"); + + let bytes = encode_proof(&ops); + let result = verify_aggregate_count_on_range_proof(&bytes, &inner_range).unwrap(); + let err = result.expect_err("KVDigestCount outside bounds must be rejected"); + match err { + Error::InvalidProofError(_) => {} + other => panic!("expected InvalidProofError, got {:?}", other), + } +} diff --git a/merk/src/proofs/query/aggregate_count/verify.rs b/merk/src/proofs/query/aggregate_count/verify.rs new file mode 100644 index 000000000..52a1a7ebb --- /dev/null +++ b/merk/src/proofs/query/aggregate_count/verify.rs @@ -0,0 +1,261 @@ +//! Verifier for `AggregateCountOnRange` proofs. +//! +//! Two-phase structure: +//! +//! 1. **Phase 1** — replay the prover's op stream through +//! `execute_with_options`, allowlisting the two node types the honest +//! prover ever emits (`HashWithCount` for collapsed Disjoint/Contained +//! subtrees, `KVDigestCount` for boundary nodes). Plain `Hash(_)` is +//! no longer used here because the structural count it would stand +//! in for is needed by the verifier's `own_count` derivation and +//! would not be hash-bound. +//! +//! 2. **Phase 2** — walk the reconstructed tree and re-derive the +//! 
in-range count, asserting that each node's type matches the +//! classification its inherited bounds imply. This is the +//! type-shape binding that makes the proof non-malleable. + +use grovedb_costs::{cost_return_on_error, CostResult, CostsExt, OperationCost}; + +use crate::{ + proofs::{ + query::{ + aggregate_common::{ + classify_subtree, key_strictly_inside, SubtreeClassification, NULL_HASH, + }, + QueryItem, + }, + tree::{execute_with_options, Tree as ProofTree}, + Decoder, Node, + }, + CryptoHash, Error, +}; + +/// Verify a count-only proof for an `AggregateCountOnRange` query. +/// +/// `proof_bytes` is the encoded `Vec` produced by +/// [`crate::Merk::prove_aggregate_count_on_range`]; `inner_range` is the same +/// `QueryItem` the prover counted over (caller-supplied — typically extracted +/// from the verifier's `PathQuery`). +/// +/// On success returns `(merk_root_hash, count)`: +/// - `merk_root_hash` is the root hash of the reconstructed merk; the +/// caller must compare it against the expected root hash to complete +/// verification. +/// - `count` is the number of keys in the inner range, computed by replaying +/// the prover's classification walk against the reconstructed proof tree. +/// +/// **Two-phase verification.** Allowlisting node types alone is unsound: +/// a malicious prover can substitute `Hash` for an in-range subtree (to +/// undercount), attach extra `KVDigestCount` children below a keyless +/// `Hash` / `HashWithCount` (to overcount, since their hash recomputation +/// ignores attached children and the root hash would still match), or send +/// a single `Push(Hash(expected_root))` for a non-empty tree (to receive a +/// count of 0 with the trusted root). To prevent all three, this function: +/// +/// 1. 
Decodes the proof into a `ProofTree` via `execute_with_options` with +/// the AVL balance check disabled (count proofs intentionally collapse +/// one side to height 1) and **does not** count anything in the +/// `visit_node` callback. +/// 2. Walks the reconstructed tree with the same inherited exclusive +/// subtree-key bounds the prover used (`(None, None)` at the root). +/// At each position it calls `classify_subtree(bounds, inner_range)` and +/// requires the proof-tree node type to match the classification: +/// - `Disjoint` → must be a leaf `HashWithCount(_)`. Contributes 0. +/// - `Contained` → must be a leaf `HashWithCount(...)`. Contributes its +/// count. +/// - `Boundary` → must be `KVDigestCount(key, ...)` with `key` strictly +/// inside `bounds`. Recurse left with `(lo, key)` and right with +/// `(key, hi)`; add 1 if `inner_range.contains(key)`. +/// +/// Counts are summed with `checked_add`; an overflow is treated as proof +/// corruption (`u64::MAX` keys is not a real merk shape). The caller is +/// still responsible for verifying the returned `merk_root_hash` against +/// their trusted root. +/// +/// **Empty merk case.** An empty merk is represented by an empty proof byte +/// stream and yields `(NULL_HASH, 0)`. Callers chaining this in a +/// multi-layer proof should recognize that shape explicitly. +pub fn verify_aggregate_count_on_range_proof( + proof_bytes: &[u8], + inner_range: &QueryItem, +) -> CostResult<(CryptoHash, u64), Error> { + if proof_bytes.is_empty() { + // Empty merk → empty proof → count = 0, hash = NULL_HASH. This + // matches the prover-side behavior of returning an empty op stream + // for an empty subtree. + return Ok((NULL_HASH, 0u64)).wrap_with_cost(OperationCost::default()); + } + + let mut cost = OperationCost::default(); + let decoder = Decoder::new(proof_bytes); + + // Phase 1: reconstruct the proof tree. 
+    // The visit_node closure only performs a coarse allowlist; the
+    // per-position type/shape check happens in Phase 2 below. We still
+    // reject blatantly wrong node types here so execute() bails early
+    // on garbage input.
+    let tree_result: CostResult<ProofTree, Error> =
+        execute_with_options(decoder, false, false, |node| match node {
+            // The count proof emits only `HashWithCount` (for collapsed
+            // Disjoint or Contained subtrees) and `KVDigestCount` (for
+            // Boundary nodes). Plain `Hash(_)` is no longer used here
+            // because the structural count it would otherwise stand in
+            // for is needed by the verifier's `own_count` derivation and
+            // would not be hash-bound.
+            Node::HashWithCount(_, _, _, _) | Node::KVDigestCount(_, _, _) => Ok(()),
+            other => Err(Error::InvalidProofError(format!(
+                "unexpected node type in aggregate count proof: {}",
+                other
+            ))),
+        });
+    let tree = cost_return_on_error!(&mut cost, tree_result);
+
+    // Phase 2: shape-check + count by replaying the prover's classification
+    // walk. This binds each leaf node's type to the (subtree_bounds × range)
+    // classification, so the only valid count is the one a faithful prover
+    // would have produced for this exact range.
+    let (count, _structural) = match verify_count_shape(&tree, inner_range, None, None) {
+        Ok(pair) => pair,
+        Err(e) => return Err(e).wrap_with_cost(cost),
+    };
+
+    let root_hash = tree.hash().unwrap_add_cost(&mut cost);
+    Ok((root_hash, count)).wrap_with_cost(cost)
+}
+
+/// Recursive shape-walk over the reconstructed proof tree. Returns the
+/// pair `(in_range_count, structural_count)`:
+///
+/// - `in_range_count` — number of keys in the subtree that fall inside the
+///   inner range AND have a non-zero own-count (i.e. are not
+///   `NonCounted`-wrapped). This is what bubbles up to the verifier's
+///   return value.
+/// - `structural_count` — the merk-recorded aggregate count of this subtree
+///   (counting normal entries as 1 and `NonCounted` entries as 0).
The +/// parent uses it to compute its own `own_count` as +/// `parent_node_count − left_struct − right_struct` (since +/// `parent_node_count = own + left_struct + right_struct`). +/// +/// The structural count of every child is **cryptographically bound** to +/// the parent's hash chain because every count-bearing node in a count +/// proof (`KVDigestCount`, `HashWithCount`) has its count fed into +/// `node_hash_with_count` for hash recomputation. Plain `Hash(_)` would +/// not carry a bound count and is therefore not allowed in count proofs; +/// see the prover-side comment in `emit_count_proof` for the full +/// justification. +/// +/// At each node: +/// +/// - Compute the expected classification from the inherited subtree bounds +/// and the inner range. +/// - Require the node's type to match the classification (and reject any +/// children attached under a leaf-shape classification — a malicious +/// prover could otherwise hide counted children under a `HashWithCount` +/// leaf, since its hash recomputation ignores reconstructed children). +/// - Recurse with tightened bounds at `Boundary` nodes, summing with +/// `checked_add` and computing `own_count` via `checked_sub`. +fn verify_count_shape( + tree: &ProofTree, + range: &QueryItem, + lo: Option<&[u8]>, + hi: Option<&[u8]>, +) -> Result<(u64, u64), Error> { + let class = classify_subtree(lo, hi, range); + match class { + SubtreeClassification::Disjoint => match &tree.node { + Node::HashWithCount(_, _, _, count) => { + if tree.left.is_some() || tree.right.is_some() { + return Err(Error::InvalidProofError( + "aggregate-count proof: HashWithCount node at a Disjoint position \ + must be a leaf" + .to_string(), + )); + } + // Disjoint subtree contributes 0 to the in-range count but + // its full structural count to the parent's `own_count` + // computation. 
+ Ok((0, *count)) + } + other => Err(Error::InvalidProofError(format!( + "aggregate-count proof: expected HashWithCount at Disjoint position, got {}", + other + ))), + }, + SubtreeClassification::Contained => match &tree.node { + Node::HashWithCount(_, _, _, count) => { + if tree.left.is_some() || tree.right.is_some() { + return Err(Error::InvalidProofError( + "aggregate-count proof: HashWithCount node at a Contained position \ + must be a leaf" + .to_string(), + )); + } + // Contained subtree's structural count (which excludes + // NonCounted entries because their stored aggregate is 0) + // is exactly its in-range count. + Ok((*count, *count)) + } + other => Err(Error::InvalidProofError(format!( + "aggregate-count proof: expected HashWithCount at Contained position, got {}", + other + ))), + }, + SubtreeClassification::Boundary => match &tree.node { + Node::KVDigestCount(key, _, aggregate) => { + if !key_strictly_inside(key.as_slice(), lo, hi) { + return Err(Error::InvalidProofError(format!( + "aggregate-count proof: KVDigestCount key {} falls outside its \ + inherited subtree bounds (lo={:?}, hi={:?})", + hex::encode(key), + lo.map(hex::encode), + hi.map(hex::encode), + ))); + } + let key_slice = key.as_slice(); + let (left_in, left_struct) = match &tree.left { + Some(child) => verify_count_shape(&child.tree, range, lo, Some(key_slice))?, + None => (0, 0), + }; + let (right_in, right_struct) = match &tree.right { + Some(child) => verify_count_shape(&child.tree, range, Some(key_slice), hi)?, + None => (0, 0), + }; + // own_count = aggregate − left_struct − right_struct. + // Saturating sub here would silently mask a malformed + // proof (children claiming more keys than the parent's + // aggregate), so use checked_sub and reject. 
+ let own_count = aggregate + .checked_sub(left_struct) + .and_then(|s| s.checked_sub(right_struct)) + .ok_or_else(|| { + Error::InvalidProofError(format!( + "aggregate-count proof: child structural counts ({} + {}) exceed \ + parent's aggregate count ({}) at key {}", + left_struct, + right_struct, + aggregate, + hex::encode(key) + )) + })?; + let self_contribution = if range.contains(key_slice) { + own_count + } else { + 0 + }; + let in_range = left_in + .checked_add(right_in) + .and_then(|s| s.checked_add(self_contribution)) + .ok_or_else(|| { + Error::InvalidProofError( + "aggregate-count proof: in-range count overflowed u64".to_string(), + ) + })?; + Ok((in_range, *aggregate)) + } + other => Err(Error::InvalidProofError(format!( + "aggregate-count proof: expected KVDigestCount at Boundary position, got {}", + other + ))), + }, + } +} diff --git a/merk/src/proofs/query/aggregate_count/verify_only_tests.rs b/merk/src/proofs/query/aggregate_count/verify_only_tests.rs new file mode 100644 index 000000000..90191ed16 --- /dev/null +++ b/merk/src/proofs/query/aggregate_count/verify_only_tests.rs @@ -0,0 +1,112 @@ +//! Verify-only fixture tests for `verify_aggregate_count_on_range_proof`. +//! +//! These tests do NOT generate proofs — they verify pre-captured proof +//! bytes against expected `(root_hash, count)` pairs. The fixtures are +//! produced by `dump_verify_only_fixtures` (in the main tests module) +//! and copied here as hex constants. The `verify_only_fixture_matches_fresh_prover_output` +//! test in `tests.rs` re-runs the prover and fails loudly if any +//! fixture drifts from the live encoding — so an encoding change is a +//! one-line edit here after that guard fires. +//! +//! Lives in its own file so the verify-only crate feature (`verify`) +//! can build and exercise just these — `tests.rs` pulls in the full +//! `minimal` feature for prover-side helpers and would not compile +//! under the leaner build. 
+ +use super::verify_aggregate_count_on_range_proof; +use crate::proofs::query::{aggregate_common::NULL_HASH, QueryItem}; + +/// Hex-encoded proof bytes for a 15-key `ProvableCountTree` (keys +/// "a"..="o", each with feature `ProvableCountedMerkNode(1)`) queried +/// with `RangeInclusive("c"..="l")`. Captured from +/// `dump_verify_only_fixtures`; regenerate if the proof encoding ever +/// changes. +pub(super) const FIXTURE_15_KEY_C_TO_L_PROOF_HEX: &str = "1e76f2d62fbefb076d8902b2f25bcf9acbd1e903b740c98ea0d926473922f6bbb50000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011a01622022ec9d571ba774cf9e83d0194962f5d1e3aa1a48d486a67e2762a6c79590150000000000000003101a0163b7d770040f780e9deff6bc038abea66e108b88d098d16d24cd7486eb671060b20000000000000001111a0164d2ad1a0bb9fdf4450bd87151c08b9968cd046bda6654aabdba2430b0a981e7900000000000000007101e28b724715b1fab1f72e0be7e944488dcfbeeb875867d27c06ad9bad8c739997207ce95cd4d1789e01c0a079a3f2c18a3888f2d69fa6d0eabf51c2b434f7cb99e212f1a0042798fb890e8203f007f4f58b033a72cb4e070bfddceb27687d641fe0000000000000003111a01683fc14ed7ecde203a90425ee191e9db5966336d737f0398ec93b764517b6df400000000000000000f101e6da2e2f8e4bdead2a8ac51909f0fa0fb88d47d6bc3b84858bb739fb28a36501031b7c191d5ac70764f815bd7a6c7d0e628f48cef5b813933c07d5ce0ac1dbd5a995443ca10193ebf20e64468deaecc061a981a6dbf4f30e7154b5e9ab806866d00000000000000031a016c55c024f95ca4cc338f7cc2e25db37be2a3fa3a40b151017e460bfc0779cf369f0000000000000007101e3673296561a4d6c3e1ec5cd02c5c468acbd3c8ccd4a42906e8ed06d3fb587a0d2b6d9e310b7c94d3f91fcbb3d5f7547b76c6d1ab3ac3d3540752c5f0b46be24a2f66bf541434a53eae46fa4e6092c03511538c0e1a2c5fc0f0deb72de08a71e500000000000000031111"; +pub(super) const FIXTURE_15_KEY_C_TO_L_ROOT_HEX: &str = + "19ed16776ebe6643b342a238baf7508ddf687fc4bdd53e98f91df8bffb605d96"; +pub(super) const FIXTURE_15_KEY_C_TO_L_COUNT: u64 = 10; + +/// Empty proof bytes encode "empty merk" — the verifier 
returns +/// `(NULL_HASH, 0)`. This case has no prover dependency at all and is +/// the most basic compile-time signal that the verifier path is wired +/// up correctly under `--no-default-features --features verify`. +#[test] +fn empty_merk_returns_null_hash_and_zero_count() { + let inner = QueryItem::Range(b"a".to_vec()..b"z".to_vec()); + let (root, count) = verify_aggregate_count_on_range_proof(&[], &inner) + .unwrap() + .expect("verify on empty proof must succeed"); + assert_eq!(root, NULL_HASH); + assert_eq!(count, 0); +} + +/// A real `RangeInclusive("c"..="l")` proof against a 15-key +/// `ProvableCountTree`. The verifier must reconstruct the expected +/// merk root hash and recover count = 10. +#[test] +fn fixture_15_key_range_c_to_l_verifies() { + let proof = hex::decode(FIXTURE_15_KEY_C_TO_L_PROOF_HEX).expect("valid hex"); + let mut expected_root = [0u8; 32]; + expected_root.copy_from_slice(&hex::decode(FIXTURE_15_KEY_C_TO_L_ROOT_HEX).expect("valid hex")); + + let inner = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()); + let (root, count) = verify_aggregate_count_on_range_proof(&proof, &inner) + .unwrap() + .expect("fixture proof must verify"); + assert_eq!( + root, expected_root, + "verifier reconstructed an unexpected root — fixture stale?" + ); + assert_eq!(count, FIXTURE_15_KEY_C_TO_L_COUNT); +} + +/// Mutating any single byte of the fixture proof must not yield a +/// `(honest_root, wrong_count)` outcome — the hash chain binds count via +/// `node_hash_with_count`, so any successful verify with the honest root +/// must reproduce the honest count. Single-fixture analogue of +/// `fuzz_byte_mutation_no_silent_forgery` that runs without the prover. 
+#[test] +fn fixture_byte_mutation_does_not_silently_forge_count() { + let proof = hex::decode(FIXTURE_15_KEY_C_TO_L_PROOF_HEX).expect("valid hex"); + let mut expected_root = [0u8; 32]; + expected_root.copy_from_slice(&hex::decode(FIXTURE_15_KEY_C_TO_L_ROOT_HEX).expect("valid hex")); + let inner = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()); + + // Sanity check: the honest fixture verifies under the same code path + // the mutation loop will exercise. Without this, an `Err`-on-honest + // fixture would silently make every mutation a vacuous pass. + let (honest_root, honest_count) = verify_aggregate_count_on_range_proof(&proof, &inner) + .unwrap() + .expect("honest fixture must verify (regenerate fixture if this fails)"); + assert_eq!(honest_root, expected_root); + assert_eq!(honest_count, FIXTURE_15_KEY_C_TO_L_COUNT); + + for byte_idx in 0..proof.len() { + for &delta in &[1u8, 0x55, 0xff] { + let mut bytes = proof.clone(); + let original = bytes[byte_idx]; + let mutated = if delta == 0xff { + original ^ 0xff + } else { + original.wrapping_add(delta) + }; + if mutated == original { + continue; + } + bytes[byte_idx] = mutated; + if let Ok((root, count)) = + verify_aggregate_count_on_range_proof(&bytes, &inner).unwrap() + && root == expected_root + { + assert_eq!( + count, FIXTURE_15_KEY_C_TO_L_COUNT, + "SILENT FORGERY at byte {} (delta=0x{:02x}): \ + verifier returned the honest root but a wrong count \ + ({} != {}).", + byte_idx, delta, count, FIXTURE_15_KEY_C_TO_L_COUNT + ); + } + // Err and Ok-with-different-root are both safe outcomes. + } + } +} diff --git a/merk/src/proofs/query/aggregate_count/walk.rs b/merk/src/proofs/query/aggregate_count/walk.rs new file mode 100644 index 000000000..74cd7bf6f --- /dev/null +++ b/merk/src/proofs/query/aggregate_count/walk.rs @@ -0,0 +1,177 @@ +//! No-proof walker: same classification logic as the proof emitter, but +//! returns only the in-range count without allocating proof ops. 
+
+use grovedb_costs::{
+    cost_return_on_error, cost_return_on_error_no_add, CostResult, CostsExt, OperationCost,
+};
+use grovedb_version::version::GroveVersion;
+
+use super::provable_count_from_aggregate;
+use crate::{
+    proofs::query::{
+        aggregate_common::{classify_subtree, SubtreeClassification},
+        QueryItem,
+    },
+    tree::{kv::ValueDefinedCostType, Fetch, RefWalker},
+    Error,
+};
+
+/// Read the provable-count aggregate off the walker's current tree node.
+/// Shared error-mapping helper used by [`walk_count_only`] at both the
+/// Contained-leaf and Boundary positions.
+fn provable_count_from_walker<S>(walker: &RefWalker<'_, S>) -> Result<u64, Error>
+where
+    S: Fetch + Sized + Clone,
+{
+    let aggregate = walker
+        .tree()
+        .aggregate_data()
+        .map_err(|e| Error::InvalidProofError(format!("aggregate_data: {}", e)))?;
+    provable_count_from_aggregate(aggregate)
+}
+
+/// No-proof variant of [`super::emit::emit_count_proof`]: walks the same
+/// classification path (Contained / Disjoint / Boundary) but only
+/// returns the running in-range count.
+///
+/// At entry, `subtree_lo_excl` / `subtree_hi_excl` are the inherited
+/// exclusive key bounds for the subtree this walker points at (both
+/// `None` at the root call). The walk reads each node's
+/// `aggregate_data()` and each child link's `aggregate_data().as_count_u64()`
+/// exactly the same way the proof emitter does, so the returned count
+/// is identical to the `count` field returned by
+/// `create_aggregate_count_on_range_proof`.
+pub(super) fn walk_count_only<S>(
+    walker: &mut RefWalker<'_, S>,
+    range: &QueryItem,
+    subtree_lo_excl: Option<&[u8]>,
+    subtree_hi_excl: Option<&[u8]>,
+    grove_version: &GroveVersion,
+) -> CostResult<u64, Error>
+where
+    S: Fetch + Sized + Clone,
+{
+    let mut cost = OperationCost::default();
+
+    // Classify the current subtree against the inner range.
+    match classify_subtree(subtree_lo_excl, subtree_hi_excl, range) {
+        // Disjoint: subtree contributes 0 to the in-range count.
+        SubtreeClassification::Disjoint => Ok(0).wrap_with_cost(cost),
+        // Contained: subtree contributes its full stored aggregate
+        // (NonCounted entries are already excluded — their stored
+        // aggregate is 0).
+        SubtreeClassification::Contained => {
+            let count = cost_return_on_error_no_add!(cost, provable_count_from_walker(walker));
+            Ok(count).wrap_with_cost(cost)
+        }
+        // Boundary: descend into both children and add own_count.
+        SubtreeClassification::Boundary => {
+            // Snapshot what we need from the current node before walking.
+            // walk(...) takes &mut self.tree, so we must drop any existing
+            // borrows on walker.tree() before calling it.
+            let node_key: Vec<u8> = walker.tree().key().to_vec();
+            let node_count = cost_return_on_error_no_add!(cost, provable_count_from_walker(walker));
+            let left_link_aggregate: u64 = walker
+                .tree()
+                .link(true)
+                .map(|l| l.aggregate_data().as_count_u64())
+                .unwrap_or(0);
+            let right_link_aggregate: u64 = walker
+                .tree()
+                .link(false)
+                .map(|l| l.aggregate_data().as_count_u64())
+                .unwrap_or(0);
+            let left_link_present = walker.tree().link(true).is_some();
+            let right_link_present = walker.tree().link(false).is_some();
+
+            let mut total: u64 = 0;
+
+            // LEFT child. If link is Some, walk(true) must yield Some; the
+            // proof variant has the verifier to catch silent inconsistencies,
+            // but this no-proof path returns the count straight to the
+            // caller — so we fail loudly on impossible state rather than
+            // silently undercounting.
+            if left_link_present {
+                let walked = cost_return_on_error!(
+                    &mut cost,
+                    walker.walk(
+                        true,
+                        None::<&fn(&[u8], &GroveVersion) -> Option<ValueDefinedCostType>>,
+                        grove_version,
+                    )
+                );
+                let mut left_walker = match walked {
+                    Some(lw) => lw,
+                    None => {
+                        return Err(Error::CorruptedState(
+                            "tree.link(true) was Some but walk(true) returned None",
+                        ))
+                        .wrap_with_cost(cost);
+                    }
+                };
+                let n = cost_return_on_error!(
+                    &mut cost,
+                    walk_count_only(
+                        &mut left_walker,
+                        range,
+                        subtree_lo_excl,
+                        Some(node_key.as_slice()),
+                        grove_version,
+                    )
+                );
+                total = total.saturating_add(n);
+            }
+
+            // Current node's own_count: 1 if in-range and counted, 0 for
+            // NonCounted-wrapped (which has stored aggregate 0, so the
+            // subtraction yields 0). `checked_sub` (not `saturating_sub`)
+            // because children claiming more keys than the parent's
+            // aggregate is corrupted state, not something to silently
+            // clamp to 0.
+            if range.contains(&node_key) {
+                let own_count = node_count
+                    .checked_sub(left_link_aggregate)
+                    .and_then(|n| n.checked_sub(right_link_aggregate))
+                    .ok_or(Error::CorruptedState(
+                        "child structural counts exceed parent's aggregate count",
+                    ));
+                let own_count = cost_return_on_error_no_add!(cost, own_count);
+                total = total.saturating_add(own_count);
+            }
+
+            // RIGHT child — same fail-fast pattern as LEFT.
+            if right_link_present {
+                let walked = cost_return_on_error!(
+                    &mut cost,
+                    walker.walk(
+                        false,
+                        None::<&fn(&[u8], &GroveVersion) -> Option<ValueDefinedCostType>>,
+                        grove_version,
+                    )
+                );
+                let mut right_walker = match walked {
+                    Some(rw) => rw,
+                    None => {
+                        return Err(Error::CorruptedState(
+                            "tree.link(false) was Some but walk(false) returned None",
+                        ))
+                        .wrap_with_cost(cost);
+                    }
+                };
+                let n = cost_return_on_error!(
+                    &mut cost,
+                    walk_count_only(
+                        &mut right_walker,
+                        range,
+                        Some(node_key.as_slice()),
+                        subtree_hi_excl,
+                        grove_version,
+                    )
+                );
+                total = total.saturating_add(n);
+            }
+
+            Ok(total).wrap_with_cost(cost)
+        }
+    }
+}
diff --git a/merk/src/proofs/query/aggregate_sum/emit.rs b/merk/src/proofs/query/aggregate_sum/emit.rs
new file mode 100644
index 000000000..67ac6c5e2
--- /dev/null
+++ b/merk/src/proofs/query/aggregate_sum/emit.rs
@@ -0,0 +1,265 @@
+//! Recursive proof-emission engine for `AggregateSumOnRange`.
+//!
+//! For each subtree we visit, the bound classification (Disjoint /
+//! Contained / Boundary) determines what op to push and whether to
+//! descend:
+//!
+//! - **Disjoint** / **Contained** → emit a single `HashWithSum` op for
+//!   the collapsed subtree root. Contained contributes its full subtree
+//!   sum to the running in-range total; Disjoint contributes 0. (Both
+//!   still need the sum hash-bound so the verifier can reconstruct the
+//!   parent's `own_sum` later — see the inline comment on the
+//!   `HashWithSum` emit for the long form.)
+//! - **Boundary** → emit `KVDigestSum(key, value_hash, node_sum)` for
+//!   the current node, recurse into both children for descent, and add
+//!   `own_sum = node_sum − left_struct − right_struct` to the running
+//!   total iff the node's key is itself in range.
+
+use std::collections::LinkedList;
+
+use grovedb_costs::{cost_return_on_error, CostResult, CostsExt, OperationCost};
+use grovedb_version::version::GroveVersion;
+
+use super::provable_sum_from_aggregate;
+use crate::{
+    proofs::{
+        query::{
+            aggregate_common::{classify_subtree, SubtreeClassification, NULL_HASH},
+            QueryItem,
+        },
+        Node, Op,
+    },
+    tree::{kv::ValueDefinedCostType, Fetch, RefWalker},
+    CryptoHash, Error,
+};
+
+/// Recursive proof emitter. Always called on a non-empty subtree.
+///
+/// At entry, `subtree_lo_excl` / `subtree_hi_excl` are the inherited
+/// exclusive key bounds for the subtree this walker points at (both `None`
+/// at the root call). The accumulator is `i128` so the prover side never
+/// overflows mid-walk on adversarial intermediate sums.
+pub(super) fn emit_sum_proof<S>(
+    walker: &mut RefWalker<'_, S>,
+    range: &QueryItem,
+    subtree_lo_excl: Option<&[u8]>,
+    subtree_hi_excl: Option<&[u8]>,
+    ops: &mut LinkedList<Op>,
+    grove_version: &GroveVersion,
+) -> CostResult<i128, Error>
+where
+    S: Fetch + Sized + Clone,
+{
+    let mut cost = OperationCost::default();
+
+    // Step 1: classify the current subtree against the inner range.
+    let class = classify_subtree(subtree_lo_excl, subtree_hi_excl, range);
+
+    if matches!(
+        class,
+        SubtreeClassification::Disjoint | SubtreeClassification::Contained
+    ) {
+        // Whole subtree is either entirely outside or entirely inside the
+        // range. Either way we emit a single self-verifying
+        // `HashWithSum(kv_hash, left_child_hash, right_child_hash, sum)`
+        // op for the subtree's root.
+        //
+        // Why `HashWithSum` even for Disjoint subtrees? Same reason the
+        // count proof uses `HashWithCount` at Disjoint positions: the
+        // verifier derives the parent boundary node's `own_sum` as
+        // `parent_aggregate − left_struct − right_struct`, so the
+        // *structural* sum of every child — including disjoint outside
+        // subtrees — has to be cryptographically bound to the parent's
+        // hash chain.
+        // Plain `Hash(node_hash)` would carry an unbound sum
+        // and let a malicious prover skew the boundary's `own_sum`
+        // derivation. See the count-side comment for the long form.
+        let aggregate = match walker.tree().aggregate_data() {
+            Ok(a) => a,
+            Err(e) => {
+                // Local prover-side walk over our own merk — if the
+                // node refuses to surface aggregate_data, that is a
+                // storage/state corruption, not a peer-supplied
+                // invalid proof.
+                return Err(Error::CorruptedData(format!("aggregate_data: {}", e)))
+                    .wrap_with_cost(cost);
+            }
+        };
+        let subtree_sum = match provable_sum_from_aggregate(aggregate) {
+            Ok(s) => s,
+            Err(e) => return Err(e).wrap_with_cost(cost),
+        };
+        let kv_hash = *walker.tree().kv_hash();
+        let left_child_hash = walker
+            .tree()
+            .link(true)
+            .map(|l| *l.hash())
+            .unwrap_or(NULL_HASH);
+        let right_child_hash = walker
+            .tree()
+            .link(false)
+            .map(|l| *l.hash())
+            .unwrap_or(NULL_HASH);
+        ops.push_back(Op::Push(Node::HashWithSum(
+            kv_hash,
+            left_child_hash,
+            right_child_hash,
+            subtree_sum,
+        )));
+        // For the prover-side in-range total: Contained contributes its
+        // entire subtree sum (which already excludes `NotSummed` entries
+        // because their stored aggregate is 0); Disjoint contributes 0.
+        let in_range_contribution: i128 = match class {
+            SubtreeClassification::Contained => subtree_sum as i128,
+            SubtreeClassification::Disjoint => 0,
+            SubtreeClassification::Boundary => unreachable!(),
+        };
+        return Ok(in_range_contribution).wrap_with_cost(cost);
+    }
+    // class == Boundary — fall through to descent + KVDigestSum emission.
+
+    // Step 2: snapshot what we need from the current node before walking.
+    let node_key: Vec<u8> = walker.tree().key().to_vec();
+    let node_value_hash: CryptoHash = *walker.tree().value_hash();
+    let node_sum: i64 = match walker
+        .tree()
+        .aggregate_data()
+        // Local prover-side walk over our own merk — failure to read
+        // aggregate_data is local state corruption, not a peer-supplied
+        // invalid proof.
+        .map_err(|e| Error::CorruptedData(format!("aggregate_data: {}", e)))
+    {
+        Ok(data) => match provable_sum_from_aggregate(data) {
+            Ok(s) => s,
+            Err(e) => return Err(e).wrap_with_cost(cost),
+        },
+        Err(e) => return Err(e).wrap_with_cost(cost),
+    };
+
+    // Snapshot each child link's structural aggregate sum from the link
+    // itself (avoids loading the child for this lookup). The verifier needs
+    // these to compute `own_sum = node_sum − left_struct − right_struct`
+    // at this boundary node.
+    let left_link_aggregate: i64 = walker
+        .tree()
+        .link(true)
+        .map(|l| l.aggregate_data().as_sum_i64())
+        .unwrap_or(0);
+    let right_link_aggregate: i64 = walker
+        .tree()
+        .link(false)
+        .map(|l| l.aggregate_data().as_sum_i64())
+        .unwrap_or(0);
+    let left_link_present = walker.tree().link(true).is_some();
+    let right_link_present = walker.tree().link(false).is_some();
+
+    let mut total: i128 = 0;
+
+    // Step 3: handle the LEFT child.
+    let left_emitted = if left_link_present {
+        let left_lo = subtree_lo_excl;
+        let left_hi: Option<&[u8]> = Some(node_key.as_slice());
+        let walked = cost_return_on_error!(
+            &mut cost,
+            walker.walk(
+                true,
+                None::<&fn(&[u8], &GroveVersion) -> Option<ValueDefinedCostType>>,
+                grove_version,
+            )
+        );
+        let mut left_walker = match walked {
+            Some(lw) => lw,
+            None => {
+                return Err(Error::CorruptedState(
+                    "tree.link(true) was Some but walk(true) returned None",
+                ))
+                .wrap_with_cost(cost)
+            }
+        };
+        let n = cost_return_on_error!(
+            &mut cost,
+            emit_sum_proof(
+                &mut left_walker,
+                range,
+                left_lo,
+                left_hi,
+                ops,
+                grove_version
+            )
+        );
+        // Plain `+` on i128 cannot overflow with i64-sized inputs at the
+        // realistic depths a Merk tree reaches, so no saturating-add
+        // safeguard here (the i128 range is ~3.4e38, more than enough for
+        // any tree of i64 children).
+        total += n;
+        true
+    } else {
+        false
+    };
+
+    // Step 4: emit the current node as a boundary KVDigestSum + attach left
+    // as its left child.
The node's own contribution to the in-range sum + // is `own_sum = node_sum − left_struct − right_struct`. `NotSummed` + // wrapping forces `node_sum = 0` so its own contribution is 0 by + // construction. + ops.push_back(Op::Push(Node::KVDigestSum( + node_key.clone(), + node_value_hash, + node_sum, + ))); + if left_emitted { + ops.push_back(Op::Parent); + } + if range.contains(&node_key) { + // Compute own_sum in i128 to mirror the verifier's overflow-safe + // accumulator. Saturating semantics would silently mask malformed + // intermediates; we propagate the literal arithmetic here and the + // verifier rejects any overflow at the final i64-narrow step. + let own_sum_i128 = + (node_sum as i128) - (left_link_aggregate as i128) - (right_link_aggregate as i128); + total += own_sum_i128; + } + + // Step 5: handle the RIGHT child. + let right_emitted = if right_link_present { + let right_lo: Option<&[u8]> = Some(node_key.as_slice()); + let right_hi = subtree_hi_excl; + let walked = cost_return_on_error!( + &mut cost, + walker.walk( + false, + None::<&fn(&[u8], &GroveVersion) -> Option>, + grove_version, + ) + ); + let mut right_walker = match walked { + Some(rw) => rw, + None => { + return Err(Error::CorruptedState( + "tree.link(false) was Some but walk(false) returned None", + )) + .wrap_with_cost(cost) + } + }; + let n = cost_return_on_error!( + &mut cost, + emit_sum_proof( + &mut right_walker, + range, + right_lo, + right_hi, + ops, + grove_version, + ) + ); + total += n; + true + } else { + false + }; + + if right_emitted { + ops.push_back(Op::Child); + } + + Ok(total).wrap_with_cost(cost) +} diff --git a/merk/src/proofs/query/aggregate_sum/mod.rs b/merk/src/proofs/query/aggregate_sum/mod.rs new file mode 100644 index 000000000..b685ebe29 --- /dev/null +++ b/merk/src/proofs/query/aggregate_sum/mod.rs @@ -0,0 +1,90 @@ +//! Proof generation and verification for `AggregateSumOnRange` queries. +//! +//! This module is the sum-only twin of [`super::aggregate_count`]. 
It
+//! implements the proof shape described in the GroveDB book chapter
+//! "Aggregate Sum on Range Queries": instead of returning the number of
+//! keys in the inner range, the query returns the **signed `i64` sum** of
+//! children with keys in that range against a `ProvableSumTree`.
+//!
+//! Like its count sibling, this module is intentionally **separate** from
+//! `create_proof_internal`: regular proofs always descend into a queried
+//! subtree, but sum proofs *stop* at fully-inside subtree roots and emit a
+//! single `HashWithSum` op for the entire collapsed subtree.
+//!
+//! The proof targets a `ProvableSumTree` exclusively (the `NotSummed`
+//! wrapper variant only affects whether the tree contributes to its parent's
+//! sum, not its own internal sum mechanics). On any other tree type the
+//! entry point returns `Error::InvalidProofError`.
+//!
+//! ## Module layout
+//!
+//! - [`prove`] — `impl RefWalker` block holding the public prover entry
+//!   points (`create_aggregate_sum_on_range_proof` and the no-proof
+//!   `sum_aggregate_on_range`).
+//! - [`emit`] — the recursive proof-emission engine (`emit_sum_proof`).
+//! - [`walk`] — the no-proof equivalent walk (`walk_sum_only`).
+//! - [`verify`] — the verifier (`verify_aggregate_sum_on_range_proof`)
+//!   and its recursive shape-walker.
+//! - [`tests`] — unit + integration tests.
+//!
+//! Range-bound classification is shared with the count side via
+//! [`super::aggregate_common`].
+//!
+//! ## Negative-sum gotchas mirrored from the count side
+//!
+//! - The accumulator can legitimately reach zero with non-zero children
+//!   (e.g. `+5` plus `-5`), so there is no "if sum == 0 → short-circuit"
+//!   shortcut here — the count code uses `if count == 0` in a few places
+//!   that would be unsound here. The only zero-skip pattern that's
+//!   correct for sum is "subtree is fully outside range → contributes 0",
+//!   driven purely by the bound classification.
+//!
- The verifier accumulates in `i128` and narrows to `i64` at the end so
+//!   adversarial inputs like `i64::MAX + i64::MAX` are detected as
+//!   overflow instead of silently wrapping.
+
+#[cfg(feature = "minimal")]
+mod emit;
+#[cfg(feature = "minimal")]
+mod prove;
+#[cfg(test)]
+mod tests;
+#[cfg(any(feature = "minimal", feature = "verify"))]
+mod verify;
+#[cfg(feature = "minimal")]
+mod walk;
+
+#[cfg(any(feature = "minimal", feature = "verify"))]
+pub use verify::verify_aggregate_sum_on_range_proof;
+
+#[cfg(feature = "minimal")]
+use crate::{
+    tree::AggregateData,
+    {Error, TreeType},
+};
+
+/// Returns true if `tree_type` is one that can host an `AggregateSumOnRange`
+/// proof. Only `ProvableSumTree` is valid — the `Sum` / `BigSum` trees use
+/// different hash dispatches (the inserted-value hash is not bound through
+/// `node_hash_with_sum` for those) and can't produce verifiable sum proofs.
+#[cfg(feature = "minimal")]
+pub(super) fn is_provable_sum_bearing(tree_type: TreeType) -> bool {
+    matches!(tree_type, TreeType::ProvableSumTree)
+}
+
+/// Pull the sum out of a `ProvableSum` aggregate. Returns
+/// `Err(CorruptedData)` for any other variant — the entry point has
+/// already gated `tree_type`, so reaching the error means the tree's
+/// in-memory state disagrees with its declared type. This is a local
+/// invariant failure on the prover side (we are walking *our own*
+/// merk), so `CorruptedData` is the appropriate classification per the
+/// repo error-handling convention.
+#[cfg(feature = "minimal")]
+pub(super) fn provable_sum_from_aggregate(data: AggregateData) -> Result<i64, Error> {
+    match data {
+        AggregateData::ProvableSum(s) => Ok(s),
+        other => Err(Error::CorruptedData(format!(
+            "expected ProvableSum aggregate data on a provable sum tree, got {:?}",
+            other
+        ))),
+    }
+}
diff --git a/merk/src/proofs/query/aggregate_sum/prove.rs b/merk/src/proofs/query/aggregate_sum/prove.rs
new file mode 100644
index 000000000..bbdaac7c5
--- /dev/null
+++ b/merk/src/proofs/query/aggregate_sum/prove.rs
@@ -0,0 +1,127 @@
+//! Public prover entry points for `AggregateSumOnRange` queries.
+//!
+//! `impl RefWalker` block holding both the proof-emitting entry point
+//! (`create_aggregate_sum_on_range_proof`) and its no-proof read
+//! counterpart (`sum_aggregate_on_range`). Both narrow the prover-side
+//! `i128` accumulator down to the on-the-wire `i64`, rejecting any
+//! out-of-range result as corruption (the prover walks its own merk, so
+//! an out-of-range honest result is unreachable — defense-in-depth here
+//! keeps the contract symmetric with the verifier).
+
+use std::collections::LinkedList;
+
+use grovedb_costs::{cost_return_on_error, CostResult, CostsExt, OperationCost};
+use grovedb_version::version::GroveVersion;
+
+use super::{emit::emit_sum_proof, is_provable_sum_bearing, walk::walk_sum_only};
+use crate::{
+    proofs::{query::QueryItem, Op},
+    tree::{Fetch, RefWalker},
+    {Error, TreeType},
+};
+
+impl<S> RefWalker<'_, S>
+where
+    S: Fetch + Sized + Clone,
+{
+    /// Generate a sum-only proof for an `AggregateSumOnRange` query.
+    ///
+    /// `inner_range` is the `QueryItem` wrapped by `AggregateSumOnRange`
+    /// (already stripped at the caller). `tree_type` must be
+    /// `ProvableSumTree`; any other tree type is rejected with
+    /// `Error::InvalidProofError` before any walking happens.
+    ///
+    /// The returned tuple is `(proof_ops, sum)`:
+    /// - `proof_ops` is the linear stream the verifier will replay to
+    ///   reconstruct the tree's root hash.
+    /// - `sum` is the prover-side computed signed sum (the verifier
+    ///   independently recomputes it from the proof and compares against
+    ///   the expected root hash; this value is returned as a convenience,
+    ///   not as ground truth).
+    pub fn create_aggregate_sum_on_range_proof(
+        &mut self,
+        inner_range: &QueryItem,
+        tree_type: TreeType,
+        grove_version: &GroveVersion,
+    ) -> CostResult<(LinkedList<Op>, i64), Error> {
+        if !is_provable_sum_bearing(tree_type) {
+            return Err(Error::InvalidProofError(format!(
+                "AggregateSumOnRange is only valid against ProvableSumTree, got {:?}",
+                tree_type
+            )))
+            .wrap_with_cost(OperationCost::default());
+        }
+
+        let mut cost = OperationCost::default();
+        let mut ops = LinkedList::new();
+        let sum_i128 = cost_return_on_error!(
+            &mut cost,
+            emit_sum_proof(self, inner_range, None, None, &mut ops, grove_version)
+        );
+        // Narrow the prover-side i128 accumulator to i64. The verifier does
+        // the same narrowing; if the honest sum doesn't fit in i64 we treat
+        // it as proof corruption (a real ProvableSumTree maintains all
+        // intermediate aggregates as i64, so an i128-only honest result is
+        // unreachable — but defending here keeps the contract symmetric with
+        // the verifier).
+        let sum: i64 = match i64::try_from(sum_i128) {
+            Ok(v) => v,
+            Err(_) => {
+                return Err(Error::InvalidProofError(format!(
+                    "aggregate-sum proof: in-range sum overflowed i64 ({})",
+                    sum_i128
+                )))
+                .wrap_with_cost(cost);
+            }
+        };
+        Ok((ops, sum)).wrap_with_cost(cost)
+    }
+
+    /// Walk the tree for an `AggregateSumOnRange` query and return the
+    /// in-range signed sum, **without** producing a proof.
+    ///
+    /// This is the no-proof counterpart of
+    /// [`Self::create_aggregate_sum_on_range_proof`].
It performs the same
+    /// classification walk (Contained / Disjoint / Boundary) and reads each
+    /// node's aggregate sum directly from the merk, so it is O(log n) in
+    /// the number of distinct keys under the indexed subtree — the same
+    /// complexity as the proof variant but without the proof-op allocations,
+    /// hash recomputations, or serialization round-trip.
+    ///
+    /// The caller (`Merk::sum_aggregate_on_range`) is expected to have
+    /// already validated `tree_type` is `ProvableSumTree`; the per-node
+    /// `provable_sum_from_aggregate` check inside the walk surfaces any
+    /// disagreement between the declared tree type and the in-memory
+    /// aggregate.
+    ///
+    /// The accumulator carries `i128` end-to-end and narrows to `i64` at
+    /// the very last step, exactly the way the prover and verifier do.
+    /// Any value outside `i64` range is treated as corruption (a real
+    /// `ProvableSumTree` maintains every aggregate as `i64` at every
+    /// level, so the i128 path only ever holds an out-of-range value if
+    /// the tree state is internally inconsistent).
+    ///
+    /// The result is **not** independently verifiable: the caller is
+    /// trusting their own merk read path. Callers that need a verifiable
+    /// sum must use `prove_aggregate_sum_on_range` +
+    /// `verify_aggregate_sum_on_range_proof`.
+    pub fn sum_aggregate_on_range(
+        &mut self,
+        inner_range: &QueryItem,
+        grove_version: &GroveVersion,
+    ) -> CostResult<i64, Error> {
+        let mut cost = OperationCost::default();
+        let sum_i128 = cost_return_on_error!(
+            &mut cost,
+            walk_sum_only(self, inner_range, None, None, grove_version)
+        );
+        match i64::try_from(sum_i128) {
+            Ok(v) => Ok(v).wrap_with_cost(cost),
+            Err(_) => Err(Error::CorruptedData(format!(
+                "no-proof aggregate-sum: in-range sum overflowed i64 ({})",
+                sum_i128
+            )))
+            .wrap_with_cost(cost),
+        }
+    }
+}
diff --git a/merk/src/proofs/query/aggregate_sum/tests.rs b/merk/src/proofs/query/aggregate_sum/tests.rs
new file mode 100644
index 000000000..b62138754
--- /dev/null
+++ b/merk/src/proofs/query/aggregate_sum/tests.rs
@@ -0,0 +1,771 @@
+//! Unit + integration tests for the aggregate-sum prover/verifier.
+//!
+//! Split out of the legacy single-file `aggregate_sum.rs` along with the
+//! prover/walker/verifier when the module became a directory. The body
+//! is otherwise byte-identical to the previous in-file `mod tests { ... }`
+//! block; only the `use super::*;` line at the top was expanded into
+//! explicit imports from the new sub-modules so the test bodies can reach
+//! the private helpers (`walk_sum_only`, `classify_subtree`, etc.) they
+//! were already exercising.
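The fixed expectations in the integration tests that follow (120, 75, 63, and so on) all come from one fixture: keys `b"a"..=b"o"` carrying sums 1 through 15. A tree-free reference model makes those literals easy to re-derive; `expected_sum_inclusive` is an illustrative helper for this write-up, not part of the crate:

```rust
/// Tree-free reference model for the 15-key fixture used in these tests:
/// keys b"a"..=b"o" carry sums 1..=15. Returns the expected total for an
/// inclusive byte-range query, computed by plain filtering instead of a
/// merk walk.
fn expected_sum_inclusive(lo: &[u8], hi: &[u8]) -> i64 {
    (b'a'..=b'o')
        .enumerate()
        .filter(|&(_, c)| {
            // Each key is the single byte `c`; keep it if it falls in [lo, hi].
            let key = [c];
            key.as_slice() >= lo && key.as_slice() <= hi
        })
        // Key at index i carries sum i + 1.
        .map(|(i, _)| (i as i64) + 1)
        .sum()
}
```

Cross-checking `expected_sum_inclusive(b"c", b"l")` against the `75` literal in `integration_closed_range_inclusive_sum` is a one-liner, so any future change to the fixture shows up in both places.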
+ +use std::collections::LinkedList; + +use grovedb_version::version::GroveVersion; + +use super::{ + is_provable_sum_bearing, provable_sum_from_aggregate, verify_aggregate_sum_on_range_proof, +}; +use crate::{ + proofs::{ + encode_into, + query::{ + aggregate_common::{ + classify_subtree, key_strictly_inside, SubtreeClassification, NULL_HASH, + }, + QueryItem, + }, + Node, Op as ProofOp, + }, + test_utils::TempMerk, + tree::{AggregateData, Op, TreeFeatureType::ProvableSummedMerkNode}, + Error, Merk, TreeType, +}; + +fn range_inclusive(lo: &[u8], hi: &[u8]) -> QueryItem { + QueryItem::RangeInclusive(lo.to_vec()..=hi.to_vec()) +} + +fn range_full() -> QueryItem { + QueryItem::RangeFull(std::ops::RangeFull) +} + +#[test] +fn classify_disjoint_below_sum() { + let r = range_inclusive(b"d", b"f"); + assert_eq!( + classify_subtree(None, Some(b"c"), &r), + SubtreeClassification::Disjoint, + ); +} + +#[test] +fn classify_contained_full_range_full_subtree_sum() { + let r = range_full(); + assert_eq!( + classify_subtree(None, None, &r), + SubtreeClassification::Contained, + ); +} + +#[test] +fn classify_boundary_overlapping_lower_sum() { + let r = range_inclusive(b"d", b"f"); + assert_eq!( + classify_subtree(Some(b"c"), Some(b"e"), &r), + SubtreeClassification::Boundary, + ); +} + +// ---------- end-to-end integration tests on a real merk ---------- + +/// Build a fresh `ProvableSumTree` populated with single-byte keys +/// "a".."o" (15 keys), each carrying sum 1, 2, ..., 15 respectively. +/// Returns the merk and its current root hash. 
+fn make_15_key_provable_sum_tree(grove_version: &GroveVersion) -> (TempMerk, [u8; 32]) { + let mut merk = TempMerk::new_with_tree_type(grove_version, TreeType::ProvableSumTree); + let keys: Vec> = (b'a'..=b'o').map(|c| vec![c]).collect(); + let entries: Vec<(Vec, Op)> = keys + .iter() + .enumerate() + .map(|(i, k)| { + let s = (i as i64) + 1; + (k.clone(), Op::Put(vec![i as u8], ProvableSummedMerkNode(s))) + }) + .collect(); + merk.apply::<_, Vec<_>>(&entries, &[], None, grove_version) + .unwrap() + .expect("apply should succeed"); + merk.commit(grove_version); + let root_hash = merk.root_hash().unwrap(); + (merk, root_hash) +} + +/// Encode a `LinkedList` into the wire format. +fn encode_proof(ops: &LinkedList) -> Vec { + let mut bytes = Vec::with_capacity(128); + encode_into(ops.iter(), &mut bytes); + bytes +} + +/// Round-trip: prove → encode → verify, assert root + sum match. +fn round_trip( + merk: &Merk>, + expected_root: [u8; 32], + inner_range: QueryItem, + expected_sum: i64, + grove_version: &GroveVersion, +) { + let (ops, prover_sum) = merk + .prove_aggregate_sum_on_range(&inner_range, grove_version) + .unwrap() + .expect("prove should succeed"); + assert_eq!( + prover_sum, expected_sum, + "prover sum mismatch for range {:?}", + inner_range + ); + let bytes = encode_proof(&ops); + let (root, verifier_sum) = verify_aggregate_sum_on_range_proof(&bytes, &inner_range) + .unwrap() + .expect("verify should succeed"); + assert_eq!( + root, expected_root, + "verifier reconstructed wrong root for range {:?}", + inner_range + ); + assert_eq!( + verifier_sum, expected_sum, + "verifier sum mismatch for range {:?}", + inner_range + ); +} + +#[test] +fn integration_full_range_sum_of_1_to_15() { + let v = GroveVersion::latest(); + let (merk, root) = make_15_key_provable_sum_tree(v); + // Full range with RangeFrom("a"..) — sum = 1+2+...+15 = 120. 
+ round_trip(&merk, root, QueryItem::RangeFrom(b"a".to_vec()..), 120, v); +} + +#[test] +fn integration_closed_range_inclusive_sum() { + let v = GroveVersion::latest(); + let (merk, root) = make_15_key_provable_sum_tree(v); + // Keys "c"..="l" → values 3..=12 → sum = 75. + round_trip( + &merk, + root, + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + 75, + v, + ); +} + +#[test] +fn integration_range_below_all_keys_sum() { + let v = GroveVersion::latest(); + let (merk, root) = make_15_key_provable_sum_tree(v); + round_trip( + &merk, + root, + QueryItem::RangeInclusive(vec![0x00]..=vec![0x10]), + 0, + v, + ); +} + +#[test] +fn integration_range_above_all_keys_sum() { + let v = GroveVersion::latest(); + let (merk, root) = make_15_key_provable_sum_tree(v); + round_trip( + &merk, + root, + QueryItem::RangeInclusive(b"z".to_vec()..=vec![0xff]), + 0, + v, + ); +} + +#[test] +fn integration_empty_merk_sum() { + let v = GroveVersion::latest(); + let merk = TempMerk::new_with_tree_type(v, TreeType::ProvableSumTree); + let (ops, prover_sum) = merk + .prove_aggregate_sum_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v) + .unwrap() + .expect("prove on empty merk should succeed"); + assert_eq!(prover_sum, 0); + let bytes = encode_proof(&ops); + let (root, verifier_sum) = verify_aggregate_sum_on_range_proof( + &bytes, + &QueryItem::Range(b"a".to_vec()..b"z".to_vec()), + ) + .unwrap() + .expect("verify on empty merk should succeed"); + assert_eq!(root, NULL_HASH); + assert_eq!(verifier_sum, 0); +} + +#[test] +fn integration_rejected_on_normal_tree() { + let v = GroveVersion::latest(); + let merk = TempMerk::new(v); + let err = merk + .prove_aggregate_sum_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v) + .unwrap(); + assert!( + err.is_err(), + "expected InvalidProofError on NormalTree, got Ok({:?})", + err.ok().map(|(_, s)| s) + ); +} + +#[test] +fn integration_rejected_on_provable_count_tree() { + // ProvableSumTree-only — count trees use a 
different hash dispatch + // and are not valid input here. + let v = GroveVersion::latest(); + let merk = TempMerk::new_with_tree_type(v, TreeType::ProvableCountTree); + let err = merk + .prove_aggregate_sum_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v) + .unwrap(); + assert!( + err.is_err(), + "expected InvalidProofError on ProvableCountTree, got Ok" + ); +} + +#[test] +fn integration_sum_forgery_is_rejected() { + // Tamper with a HashWithSum's sum field — the verifier's root-hash + // recomputation must diverge from the expected root. + let v = GroveVersion::latest(); + let (merk, expected_root) = make_15_key_provable_sum_tree(v); + let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()); + let (mut ops, _prover_sum) = merk + .prove_aggregate_sum_on_range(&inner_range, v) + .unwrap() + .expect("prove should succeed"); + + let mut tampered = false; + for op in ops.iter_mut() { + if let ProofOp::Push(Node::HashWithSum(_, _, _, sum)) + | ProofOp::PushInverted(Node::HashWithSum(_, _, _, sum)) = op + { + *sum = sum.saturating_add(1); + tampered = true; + break; + } + } + assert!(tampered, "test setup: expected at least one HashWithSum op"); + + let bytes = encode_proof(&ops); + let (root, _sum) = verify_aggregate_sum_on_range_proof(&bytes, &inner_range) + .unwrap() + .expect("verify should still complete (root mismatch is the caller's job)"); + assert_ne!( + root, expected_root, + "tampered sum must produce a different reconstructed root hash" + ); +} + +#[test] +fn shape_walk_rejects_single_hash_undercount_sum() { + let v = GroveVersion::latest(); + let (merk, expected_root) = make_15_key_provable_sum_tree(v); + let inner_range = QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()); + + // Forged proof: a single Hash op carrying the genuine root hash. 
+ let mut forged: LinkedList = LinkedList::new(); + forged.push_back(ProofOp::Push(Node::Hash(expected_root))); + let bytes = encode_proof(&forged); + + let result = verify_aggregate_sum_on_range_proof(&bytes, &inner_range).unwrap(); + let err = result.expect_err("single-Hash forgery must be rejected"); + let _ = merk; + match err { + Error::InvalidProofError(msg) => { + assert!( + msg.contains("unexpected node type") + || msg.contains("expected KVDigestSum") + || msg.contains("Boundary"), + "unexpected message: {msg}" + ); + } + other => panic!("expected InvalidProofError, got {other:?}"), + } +} + +#[test] +fn shape_walk_rejects_disjoint_hashwithsum_with_children() { + let v = GroveVersion::latest(); + let (merk, _root) = make_15_key_provable_sum_tree(v); + let inner_range = QueryItem::RangeAfter(b"o".to_vec()..); + let (mut ops, _) = merk + .prove_aggregate_sum_on_range(&inner_range, v) + .unwrap() + .expect("prove succeeds"); + + let mut spliced = LinkedList::::new(); + let mut done = false; + for op in ops.iter() { + spliced.push_back(op.clone()); + if !done && matches!(op, ProofOp::Push(Node::HashWithSum(_, _, _, _))) { + spliced.push_back(ProofOp::Push(Node::HashWithSum( + [0u8; 32], [0u8; 32], [0u8; 32], 1, + ))); + spliced.push_back(ProofOp::Parent); + done = true; + } + } + assert!(done, "test setup: expected at least one HashWithSum op"); + ops = spliced; + + let bytes = encode_proof(&ops); + let result = verify_aggregate_sum_on_range_proof(&bytes, &inner_range).unwrap(); + let err = result.expect_err("Disjoint HashWithSum with children must be rejected"); + match err { + Error::InvalidProofError(msg) => assert!( + msg.contains("Disjoint position must be a leaf"), + "unexpected message: {msg}" + ), + other => panic!("expected InvalidProofError, got {:?}", other), + } +} + +/// Regular `Merk::prove` on a `ProvableSumTree` must emit the sum-bearing +/// proof node variants. 
Queried items yield `KVSum` (via `to_kv_sum_node`), +/// non-queried path nodes yield `KVHashSum` (via `to_kvhash_sum_node`). +/// This exercises the sum-node helper functions whose only callers are +/// inside `create_proof_internal`. +#[test] +fn regular_prove_on_provable_sum_tree_emits_kv_sum_and_kvhash_sum() { + use crate::proofs::{query::Query, Decoder, Node, Op as ProofOp}; + + let v = GroveVersion::latest(); + let (merk, _root) = make_15_key_provable_sum_tree(v); + + // Query a few keys, leaving most unqueried so we get both queried + // (KVSum) and path (KVHashSum) nodes. + let mut q = Query::new(); + q.insert_key(b"a".to_vec()); + q.insert_key(b"h".to_vec()); // middle + q.insert_key(b"o".to_vec()); + + let proof_result = merk.prove(q, None, v).unwrap().expect("regular prove"); + let proof_bytes = proof_result.proof; + + let ops: Vec = Decoder::new(&proof_bytes) + .collect::, _>>() + .expect("decode"); + + let mut saw_kvsum = false; + let mut saw_kvhashsum = false; + for op in &ops { + match op { + ProofOp::Push(node) | ProofOp::PushInverted(node) => match node { + Node::KVSum(..) => saw_kvsum = true, + Node::KVHashSum(..) => saw_kvhashsum = true, + _ => {} + }, + _ => {} + } + } + assert!( + saw_kvsum, + "expected at least one KVSum node from queried Items on a ProvableSumTree" + ); + assert!( + saw_kvhashsum, + "expected at least one KVHashSum node on the proof path" + ); +} + +/// Querying an out-of-range absent key on a `ProvableSumTree` must emit a +/// boundary `KVDigestSum` node — i.e. the result of `to_kvdigest_sum_node`. +/// We do this on a single-key tree so that one of the absence-flank keys +/// IS on the tree's boundary, forcing the `on_boundary_not_found` branch. 
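The variant-presence assertions in the surrounding tests all reduce to one pattern: decode the proof, then scan the pushed nodes for a given variant. Isolated as a sketch below; `Node` and `Op` here are simplified stand-ins for illustration, not the crate's real proof enums:

```rust
/// Simplified stand-ins for proof ops, just enough to demonstrate the
/// "did the proof push variant X?" scan the tests rely on.
#[derive(Debug)]
enum Node {
    KvSum(Vec<u8>, i64),
    KvHashSum([u8; 32], i64),
    KvDigestSum(Vec<u8>, [u8; 32], i64),
}

#[derive(Debug)]
enum Op {
    Push(Node),
    Parent,
    Child,
}

/// True if any `Push`ed node satisfies `pred`; structural ops
/// (`Parent`/`Child`) are skipped.
fn proof_contains(ops: &[Op], pred: impl Fn(&Node) -> bool) -> bool {
    ops.iter().any(|op| matches!(op, Op::Push(n) if pred(n)))
}
```

The real tests decode first (`Decoder::new(..).collect::<Result<Vec<_>, _>>()`) and then run this scan; keeping the predicate a closure lets one helper cover the `KVSum`, `KVHashSum`, and `KVDigestSum` checks alike.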
+#[test] +fn regular_prove_on_provable_sum_tree_emits_kvdigest_sum() { + use crate::proofs::{query::Query, Decoder, Node, Op as ProofOp}; + + let v = GroveVersion::latest(); + let mut merk = TempMerk::new_with_tree_type(v, TreeType::ProvableSumTree); + // Single-key tree: querying any absent key forces a boundary emission. + merk.apply::<_, Vec<_>>( + &[(b"m".to_vec(), Op::Put(vec![0], ProvableSummedMerkNode(7)))], + &[], + None, + v, + ) + .unwrap() + .expect("apply"); + merk.commit(v); + + let mut q = Query::new(); + q.insert_key(b"zz".to_vec()); // absent, above the single key + let proof_result = merk.prove(q, None, v).unwrap().expect("regular prove"); + let ops: Vec = Decoder::new(&proof_result.proof) + .collect::, _>>() + .expect("decode"); + + let saw_kvdigestsum = ops.iter().any(|op| { + matches!( + op, + ProofOp::Push(Node::KVDigestSum(..)) | ProofOp::PushInverted(Node::KVDigestSum(..)) + ) + }); + assert!( + saw_kvdigestsum, + "expected KVDigestSum boundary node for absent-key proof, got ops: {:?}", + ops + ); +} + +/// Two i64::MAX children sum to 2*i64::MAX, which exceeds i64. The +/// verifier's final i64-narrowing check must surface this as a +/// proof-error. This exercises the i128 accumulator + overflow gate. +#[test] +fn integration_overflow_at_i64_max_is_rejected() { + let v = GroveVersion::latest(); + let mut merk = TempMerk::new_with_tree_type(v, TreeType::ProvableSumTree); + // Two children, each i64::MAX. Sum exceeds i64::MAX. + let entries: Vec<(Vec, Op)> = vec![ + ( + b"a".to_vec(), + Op::Put(vec![0], ProvableSummedMerkNode(i64::MAX)), + ), + ( + b"b".to_vec(), + Op::Put(vec![0], ProvableSummedMerkNode(i64::MAX)), + ), + ]; + // Insertion itself may or may not succeed depending on the apply + // path's intermediate-overflow handling. Skip if not; this scenario + // is additionally exercised at the verify layer via fabricated + // proofs. 
+ if merk + .apply::<_, Vec<_>>(&entries, &[], None, v) + .unwrap() + .is_err() + { + return; + } + merk.commit(v); + let inner_range = QueryItem::RangeFrom(b"a".to_vec()..); + let result = merk.prove_aggregate_sum_on_range(&inner_range, v).unwrap(); + // Either the prover detects the overflow during its narrowing pass, + // or it produces a proof whose verifier-side narrowing catches it. + // Both are acceptable end states for this safety net. + match result { + Err(_) => { /* prover-side overflow detection — done */ } + Ok((ops, _)) => { + let bytes = encode_proof(&ops); + let v_result = verify_aggregate_sum_on_range_proof(&bytes, &inner_range).unwrap(); + assert!( + v_result.is_err(), + "verifier must reject an i128-sized sum that doesn't fit in i64" + ); + } + } +} + +// ---------- no-proof variant: sum_aggregate_on_range ---------- +// +// The no-proof entry point must return exactly the same sum as the +// proof path for every range shape, without producing any proof ops. +// These tests cross-check the two paths on the same merk and also +// cover the failure modes unique to the no-proof variant (wrong tree +// type, empty merk, overflow narrowing). + +/// Cross-check: assert `sum_aggregate_on_range` and the sum returned +/// by `prove_aggregate_sum_on_range` agree for the given range, and +/// that both equal `expected_sum`. 
+fn no_proof_sum_matches_prover( + merk: &Merk>, + inner_range: QueryItem, + expected_sum: i64, + grove_version: &GroveVersion, +) { + let no_proof = merk + .sum_aggregate_on_range(&inner_range, grove_version) + .unwrap() + .expect("sum_aggregate_on_range should succeed"); + assert_eq!( + no_proof, expected_sum, + "no-proof variant returned wrong sum for range {:?}", + inner_range + ); + let (_ops, prover_sum) = merk + .prove_aggregate_sum_on_range(&inner_range, grove_version) + .unwrap() + .expect("prove should succeed"); + assert_eq!( + no_proof, prover_sum, + "no-proof variant disagrees with prover sum for range {:?}", + inner_range + ); +} + +#[test] +fn no_proof_sum_matches_prover_closed_range_inclusive() { + let v = GroveVersion::latest(); + let (merk, _root) = make_15_key_provable_sum_tree(v); + // sums for keys c..=l are 3..=12 → 75 + no_proof_sum_matches_prover( + &merk, + QueryItem::RangeInclusive(b"c".to_vec()..=b"l".to_vec()), + 75, + v, + ); +} + +#[test] +fn no_proof_sum_matches_prover_closed_range_exclusive() { + let v = GroveVersion::latest(); + let (merk, _root) = make_15_key_provable_sum_tree(v); + // sums for keys c..l are 3..=11 → 63 + no_proof_sum_matches_prover(&merk, QueryItem::Range(b"c".to_vec()..b"l".to_vec()), 63, v); +} + +#[test] +fn no_proof_sum_matches_prover_open_range_from() { + let v = GroveVersion::latest(); + let (merk, _root) = make_15_key_provable_sum_tree(v); + // c..o → 3+4+...+15 = 117 + no_proof_sum_matches_prover(&merk, QueryItem::RangeFrom(b"c".to_vec()..), 117, v); +} + +#[test] +fn no_proof_sum_matches_prover_range_after() { + // RangeAfter at the root pushes the left boundary exclusive to + // "b", exercising the right-child arm of walk_sum_only. 
+ let v = GroveVersion::latest(); + let (merk, _root) = make_15_key_provable_sum_tree(v); + no_proof_sum_matches_prover(&merk, QueryItem::RangeAfter(b"b".to_vec()..), 117, v); +} + +#[test] +fn no_proof_sum_matches_prover_range_to_inclusive() { + let v = GroveVersion::latest(); + let (merk, _root) = make_15_key_provable_sum_tree(v); + // ..=e → 1+2+3+4+5 = 15 + no_proof_sum_matches_prover(&merk, QueryItem::RangeToInclusive(..=b"e".to_vec()), 15, v); +} + +#[test] +fn no_proof_sum_matches_prover_range_below_all_keys() { + let v = GroveVersion::latest(); + let (merk, _root) = make_15_key_provable_sum_tree(v); + no_proof_sum_matches_prover( + &merk, + QueryItem::RangeInclusive(vec![0x00]..=vec![0x10]), + 0, + v, + ); +} + +#[test] +fn no_proof_sum_empty_merk_returns_zero() { + let v = GroveVersion::latest(); + let merk = TempMerk::new_with_tree_type(v, TreeType::ProvableSumTree); + let sum = merk + .sum_aggregate_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v) + .unwrap() + .expect("sum_aggregate_on_range on empty merk should succeed"); + assert_eq!(sum, 0); +} + +#[test] +fn no_proof_sum_rejected_on_normal_tree() { + let v = GroveVersion::latest(); + let merk = TempMerk::new(v); // NormalTree + let result = merk + .sum_aggregate_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v) + .unwrap(); + assert!( + result.is_err(), + "expected InvalidProofError on NormalTree, got Ok({:?})", + result.ok() + ); +} + +#[test] +fn no_proof_sum_rejected_on_provable_count_tree() { + // Sum variant must reject ProvableCountTree too (precise tree-type + // match), parallel to the verify-side terminal-type gate. 
+ let v = GroveVersion::latest(); + let merk = TempMerk::new_with_tree_type(v, TreeType::ProvableCountTree); + let result = merk + .sum_aggregate_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v) + .unwrap(); + assert!( + result.is_err(), + "expected InvalidProofError on ProvableCountTree for a sum query, got Ok({:?})", + result.ok() + ); +} + +// ---------- Unit tests for helper-function error paths -------------- +// +// These exercise small internal helpers that the integration tests +// can only reach indirectly. Each one pins a specific Err-classification +// arm so that future refactors can't silently drop the diagnostic. + +#[test] +fn provable_sum_from_aggregate_rejects_non_provable_sum_variants() { + // Cover every non-`ProvableSum` arm of `provable_sum_from_aggregate`. + // The fallback "other" arm should fire for each. + let cases = [ + AggregateData::NoAggregateData, + AggregateData::Sum(5), + AggregateData::BigSum(5), + AggregateData::Count(5), + AggregateData::CountAndSum(2, 3), + AggregateData::ProvableCount(5), + AggregateData::ProvableCountAndSum(2, 3), + ]; + for case in cases { + let result = provable_sum_from_aggregate(case); + match result { + Err(Error::CorruptedData(msg)) => { + assert!( + msg.contains("expected ProvableSum"), + "wrong message for {:?}: {msg}", + case + ); + } + other => panic!("expected CorruptedData for {:?}, got {:?}", case, other), + } + } +} + +#[test] +fn provable_sum_from_aggregate_accepts_provable_sum() { + // Sanity: the happy-path arm preserves the inner value (including + // negative values). 
+ assert_eq!( + provable_sum_from_aggregate(AggregateData::ProvableSum(0)).unwrap(), + 0 + ); + assert_eq!( + provable_sum_from_aggregate(AggregateData::ProvableSum(-42)).unwrap(), + -42 + ); + assert_eq!( + provable_sum_from_aggregate(AggregateData::ProvableSum(i64::MAX)).unwrap(), + i64::MAX + ); + assert_eq!( + provable_sum_from_aggregate(AggregateData::ProvableSum(i64::MIN)).unwrap(), + i64::MIN + ); +} + +#[test] +fn is_provable_sum_bearing_only_for_provable_sum_tree() { + // Every TreeType variant must return false except ProvableSumTree. + // This pins the matches!(...) gate against accidental loosening. + assert!(is_provable_sum_bearing(TreeType::ProvableSumTree)); + for t in [ + TreeType::NormalTree, + TreeType::SumTree, + TreeType::BigSumTree, + TreeType::CountTree, + TreeType::CountSumTree, + TreeType::ProvableCountTree, + TreeType::ProvableCountSumTree, + TreeType::CommitmentTree(0), + TreeType::MmrTree, + TreeType::BulkAppendTree(0), + TreeType::DenseAppendOnlyFixedSizeTree(0), + ] { + assert!(!is_provable_sum_bearing(t), "false expected for {:?}", t); + } +} + +#[test] +fn classify_subtree_disjoint_above_sum() { + // Subtree entirely above the range → Disjoint. Mirror of + // classify_disjoint_below_sum. + let r = range_inclusive(b"d", b"f"); + assert_eq!( + classify_subtree(Some(b"g"), None, &r), + SubtreeClassification::Disjoint, + ); +} + +#[test] +fn classify_subtree_boundary_overlapping_upper_sum() { + let r = range_inclusive(b"d", b"f"); + assert_eq!( + classify_subtree(Some(b"e"), Some(b"h"), &r), + SubtreeClassification::Boundary, + ); +} + +#[test] +fn classify_subtree_contained_within_inclusive_sum() { + // Subtree (b, c] with range [a..=z] → Contained. + let r = range_inclusive(b"a", b"z"); + assert_eq!( + classify_subtree(Some(b"b"), Some(b"c"), &r), + SubtreeClassification::Contained, + ); +} + +#[test] +fn key_strictly_inside_handles_unbounded_endpoints() { + // -inf lower bound: any key > None is true. 
+ assert!(key_strictly_inside(b"a", None, Some(b"z"))); + // +inf upper bound: any key < None is true. + assert!(key_strictly_inside(b"z", Some(b"a"), None)); + // Both unbounded: trivially true. + assert!(key_strictly_inside(b"m", None, None)); + // Strictly outside lo. + assert!(!key_strictly_inside(b"a", Some(b"a"), None)); + assert!(!key_strictly_inside(b"a", Some(b"z"), None)); + // Strictly outside hi. + assert!(!key_strictly_inside(b"z", None, Some(b"z"))); + assert!(!key_strictly_inside(b"z", None, Some(b"a"))); +} + +#[test] +fn empty_provable_sum_tree_proof_round_trip() { + // Hits the "empty merk" branch of `prove_aggregate_sum_on_range` + // (the no-proof side has its own test; this is the prover side). + let v = GroveVersion::latest(); + let merk = TempMerk::new_with_tree_type(v, TreeType::ProvableSumTree); + let (ops, sum) = merk + .prove_aggregate_sum_on_range(&QueryItem::Range(b"a".to_vec()..b"z".to_vec()), v) + .unwrap() + .expect("prove on empty merk should succeed"); + assert_eq!(sum, 0); + // The empty-merk proof should verify to (NULL_HASH, 0). + let bytes = encode_proof(&ops); + let (_root, verified) = verify_aggregate_sum_on_range_proof( + &bytes, + &QueryItem::Range(b"a".to_vec()..b"z".to_vec()), + ) + .unwrap() + .expect("verify on empty proof should succeed"); + assert_eq!(verified, 0); +} + +#[test] +fn no_proof_sum_with_negative_values_matches_prover() { + // A tree with mixed positive and negative sum items must yield the + // same net sum from both the no-proof and proof paths. 
+    let v = GroveVersion::latest();
+    let mut merk = TempMerk::new_with_tree_type(v, TreeType::ProvableSumTree);
+    let entries: [(&[u8], i64); 4] = [(b"a", 50), (b"b", -100), (b"c", 30), (b"d", -50)];
+    let ops: Vec<(Vec<u8>, Op)> = entries
+        .iter()
+        .map(|(k, val)| (k.to_vec(), Op::Put(vec![], ProvableSummedMerkNode(*val))))
+        .collect();
+    merk.apply::<_, Vec<_>>(&ops, &[], None, v)
+        .unwrap()
+        .expect("apply mixed-sign items");
+    merk.commit(v);
+    // Full range → 50 − 100 + 30 − 50 = −70
+    no_proof_sum_matches_prover(&merk, QueryItem::RangeFrom(b"a".to_vec()..), -70, v);
+    // Subrange b..=c → −100 + 30 = −70
+    no_proof_sum_matches_prover(
+        &merk,
+        QueryItem::RangeInclusive(b"b".to_vec()..=b"c".to_vec()),
+        -70,
+        v,
+    );
+}
diff --git a/merk/src/proofs/query/aggregate_sum/verify.rs b/merk/src/proofs/query/aggregate_sum/verify.rs
new file mode 100644
index 000000000..12295a576
--- /dev/null
+++ b/merk/src/proofs/query/aggregate_sum/verify.rs
@@ -0,0 +1,256 @@
+//! Verifier for `AggregateSumOnRange` proofs.
+//!
+//! Two-phase structure, mirroring the count side:
+//!
+//! 1. **Phase 1** — replay the prover's op stream through
+//!    `execute_with_options`, allowlisting the two node types the honest
+//!    prover ever emits (`HashWithSum` for collapsed Disjoint/Contained
+//!    subtrees, `KVDigestSum` for boundary nodes). Anything else is
+//!    rejected up front — including plain `Hash(_)`, whose sum is not
+//!    hash-bound and would let a malicious prover skew the boundary
+//!    arithmetic.
+//!
+//! 2. **Phase 2** — walk the reconstructed tree and re-derive the
+//!    in-range sum, asserting that each node's type matches the
+//!    classification its inherited bounds imply
+//!    (Disjoint/Contained → leaf `HashWithSum`; Boundary →
+//!    `KVDigestSum` whose key is strictly inside the inherited window).
+//!    This is the type-shape binding that makes the proof
+//!    non-malleable — re-arranging the ops would change the bound
classification at some node and that node's emitted type would no
+//!    longer match.
+//!
+//! All accumulation is done in `i128`. The narrow to `i64` happens once
+//! at the very end so adversarial inputs like `i64::MAX + i64::MAX`
+//! cleanly surface as overflow instead of silently wrapping.
+
+use grovedb_costs::{cost_return_on_error, CostResult, CostsExt, OperationCost};
+
+use crate::{
+    proofs::{
+        query::{
+            aggregate_common::{
+                classify_subtree, key_strictly_inside, SubtreeClassification, NULL_HASH,
+            },
+            QueryItem,
+        },
+        tree::{execute_with_options, Tree as ProofTree},
+        Decoder, Node,
+    },
+    CryptoHash, Error,
+};
+
+/// Verify a sum-only proof for an `AggregateSumOnRange` query.
+///
+/// `proof_bytes` is the encoded `Vec<Op>` produced by
+/// [`crate::Merk::prove_aggregate_sum_on_range`]; `inner_range` is the same
+/// `QueryItem` the prover summed over (caller-supplied — typically extracted
+/// from the verifier's `PathQuery`).
+///
+/// On success returns `(merk_root_hash, sum)`:
+/// - `merk_root_hash` is the root hash of the reconstructed merk; the
+///   caller must compare it against the expected root hash to complete
+///   verification.
+/// - `sum` is the signed `i64` sum of keys' contributions in the inner
+///   range, computed by replaying the prover's classification walk against
+///   the reconstructed proof tree.
+///
+/// **Two-phase verification.** Same defensive structure as the count proof
+/// verifier — allowlisting node types alone is unsound, so we both reject
+/// blatantly wrong types up front and then run a structural shape walk that
+/// binds each leaf's type to the (subtree_bounds × range) classification.
+///
+/// **Overflow handling.** The shape walk accumulates in `i128` (so two
+/// `i64::MAX` children sum cleanly to `2 * i64::MAX` rather than wrapping)
+/// and narrows to `i64` at the end.
If the i128 result doesn't fit in i64,
+/// the verifier returns `Error::InvalidProofError` — this is the safety net
+/// against adversarial proofs that compose extremes into a sum that
+/// can't be represented in the on-the-wire `i64` field.
+///
+/// **Empty merk case.** An empty merk is represented by an empty proof byte
+/// stream and yields `(NULL_HASH, 0)`. Callers chaining this in a
+/// multi-layer proof should recognize that shape explicitly.
+pub fn verify_aggregate_sum_on_range_proof(
+    proof_bytes: &[u8],
+    inner_range: &QueryItem,
+) -> CostResult<(CryptoHash, i64), Error> {
+    if proof_bytes.is_empty() {
+        // Empty merk → empty proof → sum = 0, hash = NULL_HASH.
+        return Ok((NULL_HASH, 0i64)).wrap_with_cost(OperationCost::default());
+    }
+
+    let mut cost = OperationCost::default();
+    let decoder = Decoder::new(proof_bytes);
+
+    // Phase 1: reconstruct the proof tree. Allowlist the only two node types
+    // the honest prover emits — `HashWithSum` (collapsed Disjoint/Contained
+    // subtrees) and `KVDigestSum` (Boundary nodes). Plain `Hash(_)` is not
+    // accepted: the structural sum it would carry must be hash-bound, and
+    // only `HashWithSum` provides that.
+    let tree_result: CostResult<ProofTree, Error> =
+        execute_with_options(decoder, false, false, |node| match node {
+            Node::HashWithSum(_, _, _, _) | Node::KVDigestSum(_, _, _) => Ok(()),
+            other => Err(Error::InvalidProofError(format!(
+                "unexpected node type in aggregate sum proof: {}",
+                other
+            ))),
+        });
+    let tree = cost_return_on_error!(&mut cost, tree_result);
+
+    // Phase 2: shape-check + sum by replaying the prover's classification
+    // walk. The accumulator is i128 so adversarial extremes don't wrap;
+    // we narrow to i64 at the end below.
+    let (sum_i128, _structural) = match verify_sum_shape(&tree, inner_range, None, None) {
+        Ok(pair) => pair,
+        Err(e) => return Err(e).wrap_with_cost(cost),
+    };
+
+    // Final overflow gate: narrow the i128 accumulator to i64.
A + // well-formed `ProvableSumTree` maintains its aggregate as i64 at every + // level, so an honest verify lands here with a value already inside + // i64's range. Anything outside is a forgery or a tree that violates + // its invariants. + let sum: i64 = match i64::try_from(sum_i128) { + Ok(v) => v, + Err(_) => { + return Err(Error::InvalidProofError(format!( + "aggregate-sum proof: in-range sum overflowed i64 ({})", + sum_i128 + ))) + .wrap_with_cost(cost); + } + }; + + let root_hash = tree.hash().unwrap_add_cost(&mut cost); + Ok((root_hash, sum)).wrap_with_cost(cost) +} + +/// Recursive shape-walk over the reconstructed proof tree. Returns the +/// pair `(in_range_sum_i128, structural_sum_i128)`: +/// +/// - `in_range_sum_i128` — signed sum of keys in the subtree that fall +/// inside the inner range AND have a non-zero own-sum (i.e. are not +/// `NotSummed`-wrapped). Accumulated in i128; narrowed to i64 once at +/// the outer entry point. +/// - `structural_sum_i128` — the merk-recorded aggregate sum of this +/// subtree (counting normal entries as their value and `NotSummed` +/// entries as 0). The parent uses it to compute its own `own_sum` as +/// `parent_node_sum − left_struct − right_struct` (since +/// `parent_node_sum = own + left_struct + right_struct`). Also kept in +/// i128 throughout. +/// +/// The structural sum of every child is **cryptographically bound** to +/// the parent's hash chain because every sum-bearing node in a sum proof +/// (`KVDigestSum`, `HashWithSum`) has its sum fed into +/// `node_hash_with_sum` for hash recomputation. Plain `Hash(_)` would +/// not carry a bound sum and is therefore not allowed in sum proofs. +/// +/// At each node we run the same type ↔ classification binding as the +/// count side: +/// +/// - `Disjoint` → must be a leaf `HashWithSum`. Contributes 0 to +/// in_range_sum, full sum to structural_sum. +/// - `Contained` → must be a leaf `HashWithSum`. Contributes its sum to +/// both. 
+/// - `Boundary` → must be `KVDigestSum(key, ...)` with `key` strictly +/// inside `bounds`. Recurse left with `(lo, key)` and right with +/// `(key, hi)`; add `own_sum` if `inner_range.contains(key)`. +/// +/// **Negative-sum caveat:** unlike count's `checked_sub` (where +/// `parent_aggregate < left_struct + right_struct` would indicate +/// corruption), the sum arithmetic is naturally signed and *cannot* be +/// detected by sign alone — a negative own_sum is perfectly legal. We +/// just compute `node_sum - left_struct - right_struct` in i128 and trust +/// the final overflow gate to catch any meaningful corruption (it's hash- +/// bound regardless, so a mismatch in own_sum's arithmetic would change +/// the reconstructed root hash and the caller's root check catches it). +fn verify_sum_shape( + tree: &ProofTree, + range: &QueryItem, + lo: Option<&[u8]>, + hi: Option<&[u8]>, +) -> Result<(i128, i128), Error> { + let class = classify_subtree(lo, hi, range); + match class { + SubtreeClassification::Disjoint => match &tree.node { + Node::HashWithSum(_, _, _, sum) => { + if tree.left.is_some() || tree.right.is_some() { + return Err(Error::InvalidProofError( + "aggregate-sum proof: HashWithSum node at a Disjoint position \ + must be a leaf" + .to_string(), + )); + } + // Disjoint subtree contributes 0 to the in-range sum but + // its full structural sum to the parent's `own_sum` + // computation. 
+ Ok((0i128, *sum as i128)) + } + other => Err(Error::InvalidProofError(format!( + "aggregate-sum proof: expected HashWithSum at Disjoint position, got {}", + other + ))), + }, + SubtreeClassification::Contained => match &tree.node { + Node::HashWithSum(_, _, _, sum) => { + if tree.left.is_some() || tree.right.is_some() { + return Err(Error::InvalidProofError( + "aggregate-sum proof: HashWithSum node at a Contained position \ + must be a leaf" + .to_string(), + )); + } + // Contained subtree's structural sum (which excludes + // NotSummed entries because their stored aggregate is 0) + // is exactly its in-range sum. + Ok((*sum as i128, *sum as i128)) + } + other => Err(Error::InvalidProofError(format!( + "aggregate-sum proof: expected HashWithSum at Contained position, got {}", + other + ))), + }, + SubtreeClassification::Boundary => match &tree.node { + Node::KVDigestSum(key, _, aggregate) => { + if !key_strictly_inside(key.as_slice(), lo, hi) { + return Err(Error::InvalidProofError(format!( + "aggregate-sum proof: KVDigestSum key {} falls outside its \ + inherited subtree bounds (lo={:?}, hi={:?})", + hex::encode(key), + lo.map(hex::encode), + hi.map(hex::encode), + ))); + } + let key_slice = key.as_slice(); + let (left_in, left_struct) = match &tree.left { + Some(child) => verify_sum_shape(&child.tree, range, lo, Some(key_slice))?, + None => (0i128, 0i128), + }; + let (right_in, right_struct) = match &tree.right { + Some(child) => verify_sum_shape(&child.tree, range, Some(key_slice), hi)?, + None => (0i128, 0i128), + }; + // own_sum = aggregate − left_struct − right_struct, in + // i128. There's no "child sum exceeds parent" check that + // makes sense for signed sums — any combination of + // children's structural sums is plausible (one positive, + // one negative, etc.). The hash chain binds the values + // regardless, so any wrong arithmetic here would change + // the reconstructed root hash. 
+                let aggregate_i128 = *aggregate as i128;
+                let own_sum = aggregate_i128 - left_struct - right_struct;
+                let self_contribution = if range.contains(key_slice) {
+                    own_sum
+                } else {
+                    0
+                };
+                let in_range = left_in + right_in + self_contribution;
+                Ok((in_range, aggregate_i128))
+            }
+            other => Err(Error::InvalidProofError(format!(
+                "aggregate-sum proof: expected KVDigestSum at Boundary position, got {}",
+                other
+            ))),
+        },
+    }
+}
diff --git a/merk/src/proofs/query/aggregate_sum/walk.rs b/merk/src/proofs/query/aggregate_sum/walk.rs
new file mode 100644
index 000000000..e3f14ea3d
--- /dev/null
+++ b/merk/src/proofs/query/aggregate_sum/walk.rs
@@ -0,0 +1,179 @@
+//! No-proof walker: same classification logic as the proof emitter, but
+//! returns only the in-range signed sum without allocating proof ops.
+
+use grovedb_costs::{
+    cost_return_on_error, cost_return_on_error_no_add, CostResult, CostsExt, OperationCost,
+};
+use grovedb_version::version::GroveVersion;
+
+use super::provable_sum_from_aggregate;
+use crate::{
+    proofs::query::{
+        aggregate_common::{classify_subtree, SubtreeClassification},
+        QueryItem,
+    },
+    tree::{kv::ValueDefinedCostType, Fetch, RefWalker},
+    Error,
+};
+
+/// Read the provable-sum aggregate off the walker's current tree node.
+/// Shared error-mapping helper used by [`walk_sum_only`] at both the
+/// Contained-leaf and Boundary positions.
+fn provable_sum_from_walker<S>(walker: &RefWalker<'_, S>) -> Result<i64, Error>
+where
+    S: Fetch + Sized + Clone,
+{
+    let aggregate = walker
+        .tree()
+        .aggregate_data()
+        .map_err(|e| Error::CorruptedData(format!("aggregate_data: {}", e)))?;
+    provable_sum_from_aggregate(aggregate)
+}
+
+/// No-proof variant of [`super::emit::emit_sum_proof`]: walks the same
+/// classification path (Contained / Disjoint / Boundary) but only
+/// returns the running in-range sum.
+///
+/// At entry, `subtree_lo_excl` / `subtree_hi_excl` are the inherited
+/// exclusive key bounds for the subtree this walker points at (both
+/// `None` at the root call). The walk reads each node's
+/// `aggregate_data()` and each child link's `aggregate_data().as_sum_i64()`
+/// exactly the same way the proof emitter does, so the returned sum is
+/// identical to the `sum` value returned by
+/// `create_aggregate_sum_on_range_proof`.
+///
+/// The accumulator is `i128` so the no-proof side never overflows
+/// mid-walk on adversarial intermediate sums (matching the prover's
+/// guarantee). Narrowing to `i64` happens in the public entry point
+/// `Merk::sum_aggregate_on_range`.
+pub(super) fn walk_sum_only<S>(
+    walker: &mut RefWalker<'_, S>,
+    range: &QueryItem,
+    subtree_lo_excl: Option<&[u8]>,
+    subtree_hi_excl: Option<&[u8]>,
+    grove_version: &GroveVersion,
+) -> CostResult<i128, Error>
+where
+    S: Fetch + Sized + Clone,
+{
+    let mut cost = OperationCost::default();
+
+    match classify_subtree(subtree_lo_excl, subtree_hi_excl, range) {
+        // Disjoint: subtree contributes 0 to the in-range sum.
+        SubtreeClassification::Disjoint => Ok(0i128).wrap_with_cost(cost),
+        // Contained: subtree contributes its full stored aggregate sum
+        // (NotSummed-wrapped entries are already excluded — their stored
+        // aggregate is 0 by the wrapper's contract).
+        SubtreeClassification::Contained => {
+            let sum = cost_return_on_error_no_add!(cost, provable_sum_from_walker(walker));
+            Ok(sum as i128).wrap_with_cost(cost)
+        }
+        // Boundary: descend into both children and add own_sum.
+        SubtreeClassification::Boundary => {
+            // Snapshot what we need from the current node before walking.
+            // walk(...) takes &mut self.tree, so we must drop any existing
+            // borrows on walker.tree() before calling it.
+            let node_key: Vec<u8> = walker.tree().key().to_vec();
+            let node_sum = cost_return_on_error_no_add!(cost, provable_sum_from_walker(walker));
+            let left_link_aggregate: i64 = walker
+                .tree()
+                .link(true)
+                .map(|l| l.aggregate_data().as_sum_i64())
+                .unwrap_or(0);
+            let right_link_aggregate: i64 = walker
+                .tree()
+                .link(false)
+                .map(|l| l.aggregate_data().as_sum_i64())
+                .unwrap_or(0);
+            let left_link_present = walker.tree().link(true).is_some();
+            let right_link_present = walker.tree().link(false).is_some();
+
+            let mut total: i128 = 0;
+
+            // LEFT child. If link is Some, walk(true) must yield Some;
+            // the proof variant has the verifier to catch silent
+            // inconsistencies, but this no-proof path returns the sum
+            // straight to the caller — so we fail loudly on impossible
+            // state rather than silently under-summing.
+            if left_link_present {
+                let walked = cost_return_on_error!(
+                    &mut cost,
+                    walker.walk(
+                        true,
+                        None::<&fn(&[u8], &GroveVersion) -> Option<ValueDefinedCostType>>,
+                        grove_version,
+                    )
+                );
+                let mut left_walker = match walked {
+                    Some(lw) => lw,
+                    None => {
+                        return Err(Error::CorruptedState(
+                            "tree.link(true) was Some but walk(true) returned None",
+                        ))
+                        .wrap_with_cost(cost);
+                    }
+                };
+                let s = cost_return_on_error!(
+                    &mut cost,
+                    walk_sum_only(
+                        &mut left_walker,
+                        range,
+                        subtree_lo_excl,
+                        Some(node_key.as_slice()),
+                        grove_version,
+                    )
+                );
+                total = total.saturating_add(s);
+            }
+
+            // Current node's own_sum: when the key is in range, the
+            // contribution is `node_sum − left_struct − right_struct`.
+            // Signed arithmetic — unlike the count side this can be
+            // negative (and so cannot be checked-sub-vs-corruption like
+            // count's). The hash chain in the verifying variant catches
+            // tampering; here we trust the merk read path per the API
+            // contract. `i128` accumulation keeps adversarial inputs
+            // from wrapping mid-walk.
+            if range.contains(&node_key) {
+                let own_sum: i128 = (node_sum as i128)
+                    .wrapping_sub(left_link_aggregate as i128)
+                    .wrapping_sub(right_link_aggregate as i128);
+                total = total.saturating_add(own_sum);
+            }
+
+            // RIGHT child — same fail-fast pattern as LEFT.
+            if right_link_present {
+                let walked = cost_return_on_error!(
+                    &mut cost,
+                    walker.walk(
+                        false,
+                        None::<&fn(&[u8], &GroveVersion) -> Option<ValueDefinedCostType>>,
+                        grove_version,
+                    )
+                );
+                let mut right_walker = match walked {
+                    Some(rw) => rw,
+                    None => {
+                        return Err(Error::CorruptedState(
+                            "tree.link(false) was Some but walk(false) returned None",
+                        ))
+                        .wrap_with_cost(cost);
+                    }
+                };
+                let s = cost_return_on_error!(
+                    &mut cost,
+                    walk_sum_only(
+                        &mut right_walker,
+                        range,
+                        Some(node_key.as_slice()),
+                        subtree_hi_excl,
+                        grove_version,
+                    )
+                );
+                total = total.saturating_add(s);
+            }
+
+            Ok(total).wrap_with_cost(cost)
+        }
+    }
+}
diff --git a/merk/src/proofs/query/mod.rs b/merk/src/proofs/query/mod.rs
index fcff0ad2b..6d8068baf 100644
--- a/merk/src/proofs/query/mod.rs
+++ b/merk/src/proofs/query/mod.rs
@@ -5,15 +5,21 @@ pub use grovedb_query::*;
 #[cfg(test)]
 mod merk_integration_tests;
+#[cfg(any(feature = "minimal", feature = "verify"))]
+mod aggregate_common;
 #[cfg(any(feature = "minimal", feature = "verify"))]
 pub mod aggregate_count;
 #[cfg(any(feature = "minimal", feature = "verify"))]
+pub mod aggregate_sum;
+#[cfg(any(feature = "minimal", feature = "verify"))]
 mod map;
 #[cfg(any(feature = "minimal", feature = "verify"))]
 mod verify;
 #[cfg(any(feature = "minimal", feature = "verify"))]
 pub use aggregate_count::verify_aggregate_count_on_range_proof;
+#[cfg(any(feature = "minimal", feature = "verify"))]
+pub use aggregate_sum::verify_aggregate_sum_on_range_proof;
 #[cfg(feature = "minimal")]
 use grovedb_costs::{cost_return_on_error, CostContext, CostResult, CostsExt, OperationCost};
@@ -76,12 +82,12 @@ where
 /// Creates a `Node::KVValueHashFeatureType` from the key/value pair of the
 /// root node
-    /// Note:
For ProvableCountTree and ProvableCountSumTree, uses aggregate - /// count to match hash calculation + /// Note: For ProvableCountTree, ProvableCountSumTree, and ProvableSumTree, + /// uses aggregate value to match hash calculation pub(crate) fn to_kv_value_hash_feature_type_node(&self) -> Node { - // For ProvableCountTree and ProvableCountSumTree, we need to use the aggregate - // count (sum of self + children) because the hash calculation uses - // aggregate_data(), not feature_type() + // For ProvableCountTree, ProvableCountSumTree, and ProvableSumTree + // we need to use the aggregate value (sum of self + children) because + // the hash calculation uses aggregate_data(), not feature_type() let feature_type = match self.tree().aggregate_data() { Ok(AggregateData::ProvableCount(count)) => { TreeFeatureType::ProvableCountedMerkNode(count) @@ -89,6 +95,7 @@ where Ok(AggregateData::ProvableCountAndSum(count, sum)) => { TreeFeatureType::ProvableCountedSummedMerkNode(count, sum) } + Ok(AggregateData::ProvableSum(sum)) => TreeFeatureType::ProvableSummedMerkNode(sum), _ => self.tree().feature_type(), }; Node::KVValueHashFeatureType( @@ -169,6 +176,45 @@ where ) } + /// Creates a `Node::KVDigestSum` from the key/value_hash pair and sum + /// of the root node. Parallel to `to_kvdigest_count_node` for + /// ProvableSumTree boundary nodes (proving absence). Uses aggregate sum + /// (self + children) to match the `node_hash_with_sum` calculation. + pub(crate) fn to_kvdigest_sum_node(&self) -> Node { + let sum = match self.tree().aggregate_data() { + Ok(AggregateData::ProvableSum(sum)) => sum, + _ => 0, + }; + Node::KVDigestSum(self.tree().key().to_vec(), *self.tree().value_hash(), sum) + } + + /// Creates a `Node::KVHashSum` from the kv hash and sum of the root + /// node. Parallel to `to_kvhash_count_node` for ProvableSumTree + /// non-queried nodes on the path. 
+ pub(crate) fn to_kvhash_sum_node(&self) -> Node { + let sum = match self.tree().aggregate_data() { + Ok(AggregateData::ProvableSum(sum)) => sum, + _ => 0, + }; + Node::KVHashSum(*self.tree().kv_hash(), sum) + } + + /// Creates a `Node::KVSum` from the key/value pair and sum of the root + /// node. Parallel to `to_kv_count_node` for queried Items in a + /// ProvableSumTree. Tamper-resistant (verifier computes hash from value) + /// while including the sum in the node hash. + pub(crate) fn to_kv_sum_node(&self) -> Node { + let sum = match self.tree().aggregate_data() { + Ok(AggregateData::ProvableSum(sum)) => sum, + _ => 0, + }; + Node::KVSum( + self.tree().key().to_vec(), + self.tree().value_as_slice().to_vec(), + sum, + ) + } + #[cfg(feature = "minimal")] pub(crate) fn create_proof( &mut self, @@ -312,13 +358,22 @@ where TreeFeatureType::ProvableCountedMerkNode(_) | TreeFeatureType::ProvableCountedSummedMerkNode(..) ); + // Sibling family for ProvableSumTree, whose nodes carry the i64 sum + // in their feature_type. + let is_provable_sum_tree = matches!( + self.tree().feature_type(), + TreeFeatureType::ProvableSummedMerkNode(_) + ); + let is_provable_aggregate_tree = is_provable_count_tree || is_provable_sum_tree; - // Convert is_provable_count_tree to parent tree type for proof_node_type() - // Both ProvableCountTree and ProvableCountSumTree use count in hash + // Convert the tree kind to an `ElementType` so `proof_node_type()` + // can dispatch — the Count family folds to `ProvableCountTree` + // (count-in-hash) and the Sum family folds to `ProvableSumTree` + // (sum-in-hash). The two families are distinct. 
let parent_tree_type = if is_provable_count_tree { - // Use ProvableCountTree for both since proof handling is the same (count in - // hash) Some(ElementType::ProvableCountTree) + } else if is_provable_sum_tree { + Some(ElementType::ProvableSumTree) } else { None // Regular tree or unknown - treated the same }; @@ -350,6 +405,7 @@ where let node = match proof_node_type { ProofNodeType::Kv => self.to_kv_node(), ProofNodeType::KvCount => self.to_kv_count_node(), + ProofNodeType::KvSum => self.to_kv_sum_node(), ProofNodeType::KvValueHash => self.to_kv_value_hash_node(), ProofNodeType::KvValueHashFeatureType => self.to_kv_value_hash_feature_type_node(), // References: at merk level, generate same node type as non-ref counterpart @@ -358,6 +414,11 @@ where // ProvableCountTree references: generate KVValueHashFeatureType // GroveDB will post-process to KVRefValueHashCount with dereferenced value ProofNodeType::KvRefValueHashCount => self.to_kv_value_hash_feature_type_node(), + // ProvableSumTree references: same merk-level shape as + // KvRefValueHashCount — emit KVValueHashFeatureType so the + // feature_type carries the sum, then GroveDB post-processes + // to KVRefValueHashSum with the dereferenced value. + ProofNodeType::KvRefValueHashSum => self.to_kv_value_hash_feature_type_node(), }; if proof_params.left_to_right { @@ -366,10 +427,12 @@ where Op::PushInverted(node) } } else if on_boundary_not_found || left_absence.1 || right_absence.0 { - // On boundary (proving absence): use KVDigest or KVDigestCount - // depending on whether this is a ProvableCountTree + // On boundary (proving absence): use KVDigest / KVDigestCount / + // KVDigestSum depending on the parent's aggregate kind. 
let node = if is_provable_count_tree { self.to_kvdigest_count_node() + } else if is_provable_sum_tree { + self.to_kvdigest_sum_node() } else { self.to_kvdigest_node() }; @@ -378,11 +441,19 @@ where } else { Op::PushInverted(node) } - } else if is_provable_count_tree { + } else if is_provable_aggregate_tree { + // Non-queried path nodes carry the aggregate (count or sum) so + // the verifier can recompute the node hash. + let node = if is_provable_count_tree { + self.to_kvhash_count_node() + } else { + // is_provable_sum_tree + self.to_kvhash_sum_node() + }; if proof_params.left_to_right { - Op::Push(self.to_kvhash_count_node()) + Op::Push(node) } else { - Op::PushInverted(self.to_kvhash_count_node()) + Op::PushInverted(node) } } else if proof_params.left_to_right { Op::Push(self.to_kvhash_node()) diff --git a/merk/src/proofs/query/verify.rs b/merk/src/proofs/query/verify.rs index 822fec0fc..f84f2fb05 100644 --- a/merk/src/proofs/query/verify.rs +++ b/merk/src/proofs/query/verify.rs @@ -206,12 +206,15 @@ impl QueryProofVerify for Query { Some(Node::KV(..)) => {} Some(Node::KVDigest(..)) => {} Some(Node::KVDigestCount(..)) => {} + Some(Node::KVDigestSum(..)) => {} Some(Node::KVRefValueHash(..)) => {} Some(Node::KVValueHash(..)) => {} Some(Node::KVValueHashFeatureType(..)) => {} Some(Node::KVValueHashFeatureTypeWithChildHash(..)) => {} Some(Node::KVRefValueHashCount(..)) => {} + Some(Node::KVRefValueHashSum(..)) => {} Some(Node::KVCount(..)) => {} + Some(Node::KVSum(..)) => {} // cannot verify lower bound - we have an abridged // tree, so we cannot tell what the preceding key was @@ -239,12 +242,15 @@ impl QueryProofVerify for Query { Some(Node::KV(..)) => {} Some(Node::KVDigest(..)) => {} Some(Node::KVDigestCount(..)) => {} + Some(Node::KVDigestSum(..)) => {} Some(Node::KVRefValueHash(..)) => {} Some(Node::KVValueHash(..)) => {} Some(Node::KVValueHashFeatureType(..)) => {} Some(Node::KVValueHashFeatureTypeWithChildHash(..)) => {} Some(Node::KVRefValueHashCount(..)) 
=> {} + Some(Node::KVRefValueHashSum(..)) => {} Some(Node::KVCount(..)) => {} + Some(Node::KVSum(..)) => {} // cannot verify upper bound - we have an abridged // tree so we cannot tell what the previous key was @@ -476,7 +482,7 @@ impl QueryProofVerify for Query { } execute_node(key, Some(value), *node_value_hash, true)?; } - Node::Hash(_) | Node::KVHash(_) | Node::KVHashCount(..) => { + Node::Hash(_) | Node::KVHash(_) | Node::KVHashCount(..) | Node::KVHashSum(..) => { if in_range { return Err(Error::InvalidProofError(format!( "Proof is missing data for query range. Encountered unexpected node \ @@ -504,6 +510,38 @@ impl QueryProofVerify for Query { .to_string(), )); } + Node::HashWithSum(..) => { + // Same fail-fast rationale as `HashWithCount` above. + // `HashWithSum` is reserved for the dedicated + // aggregate-sum verifier; it must never reach the + // regular query verifier. + return Err(Error::InvalidProofError( + "HashWithSum node is only valid in aggregate-sum proofs; \ + encountered in regular query verification" + .to_string(), + )); + } + Node::KVSum(key, value, _sum) => { + #[cfg(feature = "proof_debug")] + { + println!("Processing KVSum node"); + } + execute_node(key, Some(value), value_hash(value).unwrap(), false)?; + } + Node::KVDigestSum(key, value_hash, _sum) => { + #[cfg(feature = "proof_debug")] + { + println!("Processing KVDigestSum node"); + } + execute_node(key, None, *value_hash, false)?; + } + Node::KVRefValueHashSum(key, value, value_hash, _sum) => { + #[cfg(feature = "proof_debug")] + { + println!("Processing KVRefValueHashSum node"); + } + execute_node(key, Some(value), *value_hash, false)?; + } } last_push = Some(node.clone()); @@ -537,6 +575,11 @@ impl QueryProofVerify for Query { Some(Node::KVValueHashFeatureType(..)) => {} Some(Node::KVValueHashFeatureTypeWithChildHash(..)) => {} Some(Node::KVRefValueHashCount(..)) => {} + // ProvableSumTree key-bearing nodes are also acceptable + // absence-proof boundaries. 
+ Some(Node::KVSum(..)) => {} + Some(Node::KVDigestSum(..)) => {} + Some(Node::KVRefValueHashSum(..)) => {} // proof contains abridged data so we cannot verify absence of // remaining query items @@ -694,8 +737,10 @@ impl fmt::Display for ProofVerificationResult { } /// Checks whether a key exists as a boundary element in the given merk proof -/// bytes. A boundary element is a `KVDigest` or `KVDigestCount` node — it -/// proves the key exists in the tree without revealing the value. +/// bytes. A boundary element is a `KVDigest`, `KVDigestCount`, or +/// `KVDigestSum` node — it proves the key exists in the tree without +/// revealing the value. (Same node-type coverage as +/// [`boundaries_in_proof`]; the two helpers must agree.) /// /// This is useful for exclusive range queries (e.g. `RangeAfter(10)`) where /// the boundary key (10) is included in the proof as a digest node to anchor @@ -709,6 +754,8 @@ pub fn key_exists_as_boundary_in_proof(proof_bytes: &[u8], key: &[u8]) -> Result | Op::PushInverted(Node::KVDigest(k, _)) | Op::Push(Node::KVDigestCount(k, _, _)) | Op::PushInverted(Node::KVDigestCount(k, _, _)) + | Op::Push(Node::KVDigestSum(k, _, _)) + | Op::PushInverted(Node::KVDigestSum(k, _, _)) if k.as_slice() == key => { return Ok(true); @@ -719,9 +766,146 @@ pub fn key_exists_as_boundary_in_proof(proof_bytes: &[u8], key: &[u8]) -> Result Ok(false) } +#[cfg(test)] +mod provable_sum_tree_bound_regression_tests { + //! Regression coverage for a Codex-flagged bug: the lower- and + //! upper-bound `last_push` checks in `execute_proof` accepted + //! `KVCount` / `KVDigestCount` / `KVRefValueHashCount` (the Count + //! family) but omitted the parallel `KVSum` / `KVDigestSum` / + //! `KVRefValueHashSum` variants, so a multi-item query like + //! `Key(...)` + `Range(...)` against a `ProvableSumTree` would + //! reject a perfectly valid proof with + //! `Cannot verify lower bound of queried range` whenever the + //! 
preceding boundary happened to be a `KVDigestSum` node.
+    //!
+    //! These tests build a populated `ProvableSumTree`, prove a
+    //! `Key("aa") + Range("g".."j")` query in both directions, and
+    //! verify the resulting proof. Without the fix in
+    //! `merk/src/proofs/query/verify.rs::execute_proof` these
+    //! verifications return `InvalidProofError`.
+
+    use grovedb_version::version::GroveVersion;
+
+    use crate::{
+        proofs::{
+            query::{
+                verify::{QueryProofVerify, PROOF_VERSION_LATEST},
+                QueryItem,
+            },
+            Query,
+        },
+        test_utils::TempMerk,
+        tree::Op,
+        TreeFeatureType::ProvableSummedMerkNode,
+        TreeType,
+    };
+
+    /// Build a `ProvableSumTree` populated with single-byte keys
+    /// "a", "b", ..., "o" (15 keys), each carrying sum `i+1`.
+    fn make_15_key_provable_sum_tree(grove_version: &GroveVersion) -> TempMerk {
+        let mut merk = TempMerk::new_with_tree_type(grove_version, TreeType::ProvableSumTree);
+        let entries: Vec<(Vec<u8>, Op)> = (b'a'..=b'o')
+            .enumerate()
+            .map(|(i, c)| {
+                let s = (i as i64) + 1;
+                (vec![c], Op::Put(vec![i as u8], ProvableSummedMerkNode(s)))
+            })
+            .collect();
+        merk.apply::<_, Vec<_>>(&entries, &[], None, grove_version)
+            .unwrap()
+            .expect("apply should succeed");
+        merk.commit(grove_version);
+        merk
+    }
+
+    /// Helper: prove a `[Key("aa"), Range("g".."j")]` query in a given
+    /// direction and verify the resulting proof. With the fix in place
+    /// this must succeed. The query mixes an absence boundary
+    /// (`Key("aa")` — between "a" and "b") with a range, which is the
+    /// shape that surfaces the `KVDigestSum`-as-prior-boundary case.
+    fn run_multi_item_query_verifies(left_to_right: bool, grove_version: &GroveVersion) {
+        let merk = make_15_key_provable_sum_tree(grove_version);
+        let mut query = Query::new();
+        // Absent key — proves absence via a `KVDigest`-family boundary.
+        query.insert_item(QueryItem::Key(b"aa".to_vec()));
+        // Range that doesn't touch "aa".
The verifier must accept the + // sequence regardless of which boundary node preceded it. + query.insert_item(QueryItem::Range(b"g".to_vec()..b"j".to_vec())); + query.left_to_right = left_to_right; + + let proof = merk + .prove(query.clone(), None, grove_version) + .unwrap() + .expect("prove should succeed"); + + let (_root_hash, _result) = query + .execute_proof(&proof.proof, None, left_to_right, PROOF_VERSION_LATEST) + .unwrap() + .expect( + "Key+Range verify on ProvableSumTree must succeed; failure here means the \ + KVDigestSum boundary still isn't accepted by the bound checks", + ); + } + + #[test] + fn key_plus_range_on_provable_sum_tree_left_to_right_verifies() { + let v = GroveVersion::latest(); + run_multi_item_query_verifies(true, v); + } + + #[test] + fn key_plus_range_on_provable_sum_tree_right_to_left_verifies() { + let v = GroveVersion::latest(); + run_multi_item_query_verifies(false, v); + } + + /// Boundary-extraction parallel: `KVDigestSum` produced by a + /// `ProvableSumTree` proof must surface in `boundaries_in_proof` + /// just like its `KVDigest` / `KVDigestCount` siblings, AND + /// `key_exists_as_boundary_in_proof` must agree (the two helpers + /// are documented to behave identically). + #[test] + fn kv_digest_sum_appears_in_both_boundary_helpers() { + use crate::proofs::query::verify::{boundaries_in_proof, key_exists_as_boundary_in_proof}; + + let v = GroveVersion::latest(); + let merk = make_15_key_provable_sum_tree(v); + // Querying an absent key emits a `KVDigestSum` boundary. 
+ let mut query = Query::new(); + query.insert_item(QueryItem::Key(b"aa".to_vec())); + + let proof = merk + .prove(query, None, v) + .unwrap() + .expect("prove should succeed"); + + let boundaries = boundaries_in_proof(&proof.proof).expect("boundaries"); + assert!( + !boundaries.is_empty(), + "boundaries_in_proof must report KVDigestSum nodes from ProvableSumTree proofs" + ); + + // Every boundary key surfaced by `boundaries_in_proof` must + // round-trip through `key_exists_as_boundary_in_proof` as well — + // the two helpers must agree on the same node-type coverage. + for boundary in &boundaries { + let found = key_exists_as_boundary_in_proof(&proof.proof, boundary) + .expect("key_exists_as_boundary_in_proof"); + assert!( + found, + "key_exists_as_boundary_in_proof disagreed with boundaries_in_proof on {:?}", + boundary + ); + } + } +} + /// Returns all boundary keys found in the given merk proof bytes. -/// Boundary keys appear as `KVDigest` or `KVDigestCount` nodes — they -/// prove a key exists in the tree without revealing the value. +/// Boundary keys appear as `KVDigest`, `KVDigestCount`, or `KVDigestSum` +/// nodes — they prove a key exists in the tree without revealing the +/// value. (The Sum variant is the `ProvableSumTree` analogue of the +/// Count variant; both behave identically for boundary-detection +/// purposes.) 
 pub fn boundaries_in_proof(proof_bytes: &[u8]) -> Result<Vec<Vec<u8>>, Error> {
     let decoder = Decoder::new(proof_bytes);
     let mut keys = Vec::new();
@@ -731,7 +915,9 @@ pub fn boundaries_in_proof(proof_bytes: &[u8]) -> Result<Vec<Vec<u8>>, Error> {
             Op::Push(Node::KVDigest(k, _))
             | Op::PushInverted(Node::KVDigest(k, _))
             | Op::Push(Node::KVDigestCount(k, _, _))
-            | Op::PushInverted(Node::KVDigestCount(k, _, _)) => {
+            | Op::PushInverted(Node::KVDigestCount(k, _, _))
+            | Op::Push(Node::KVDigestSum(k, _, _))
+            | Op::PushInverted(Node::KVDigestSum(k, _, _)) => {
                 keys.push(k);
             }
             _ => {}
diff --git a/merk/src/proofs/tree.rs b/merk/src/proofs/tree.rs
index 09cffe090..9456e0160 100644
--- a/merk/src/proofs/tree.rs
+++ b/merk/src/proofs/tree.rs
@@ -13,8 +13,8 @@ use grovedb_costs::{
 use super::{Node, Op};
 #[cfg(any(feature = "minimal", feature = "verify"))]
 use crate::tree::{
-    combine_hash, kv_digest_to_kv_hash, kv_hash, node_hash, node_hash_with_count, value_hash,
-    NULL_HASH,
+    combine_hash, kv_digest_to_kv_hash, kv_hash, node_hash, node_hash_with_count,
+    node_hash_with_sum, value_hash, NULL_HASH,
 };
 #[cfg(any(feature = "minimal", feature = "verify"))]
 use crate::{
@@ -61,6 +61,7 @@ impl Child {
             (key.as_slice(), (*feature_type).into())
         }
         Node::KVCount(key, _, count) => (key.as_slice(), AggregateData::ProvableCount(*count)),
+        Node::KVSum(key, _, sum) => (key.as_slice(), AggregateData::ProvableSum(*sum)),
         // for the connection between the trunk and leaf chunks, we don't
         // have the child key so we must first write in an empty one. once
         // the leaf gets verified, we can write in this key to its parent
@@ -160,6 +161,9 @@ impl Tree {
         kv_digest_to_kv_hash(key.as_slice(), value_hash).flat_map(|kv_hash| {
             // For ProvableCountTree and ProvableCountSumTree, use node_hash_with_count
             // Note: ProvableCountSumTree only includes count in hash, not sum
+            // For ProvableSumTree, use node_hash_with_sum so the sum
+            // baked into the parent's hash matches the link
+            // verifier's reconstruction.
match feature_type { TreeFeatureType::ProvableCountedMerkNode(count) => node_hash_with_count( &kv_hash, @@ -176,6 +180,12 @@ impl Tree { *count, ) } + TreeFeatureType::ProvableSummedMerkNode(sum) => node_hash_with_sum( + &kv_hash, + &self.child_hash(true), + &self.child_hash(false), + *sum, + ), _ => compute_hash(self, kv_hash), } }) @@ -233,6 +243,56 @@ impl Tree { ) }) } + // ProvableSumTree proof-node hash dispatch. All five + // sum-bearing variants pipe through `node_hash_with_sum`, the + // same hash function used by `Tree::hash_for_link` and the + // commit path for `TreeType::ProvableSumTree`. This is what + // makes the proof verifier's reconstructed root hash agree + // with the prover's root hash for ProvableSumTree proofs. + Node::HashWithSum(kv_hash, left_child_hash, right_child_hash, sum) => { + node_hash_with_sum(kv_hash, left_child_hash, right_child_hash, *sum) + } + Node::KVSum(key, value, sum) => { + kv_hash(key.as_slice(), value.as_slice()).flat_map(|kv_hash| { + node_hash_with_sum( + &kv_hash, + &self.child_hash(true), + &self.child_hash(false), + *sum, + ) + }) + } + Node::KVHashSum(kv_hash, sum) => node_hash_with_sum( + kv_hash, + &self.child_hash(true), + &self.child_hash(false), + *sum, + ), + Node::KVDigestSum(key, value_hash, sum) => kv_digest_to_kv_hash(key, value_hash) + .flat_map(|kv_hash| { + node_hash_with_sum( + &kv_hash, + &self.child_hash(true), + &self.child_hash(false), + *sum, + ) + }), + Node::KVRefValueHashSum(key, referenced_value, node_value_hash, sum) => { + let mut cost = OperationCost::default(); + let referenced_value_hash = + value_hash(referenced_value.as_slice()).unwrap_add_cost(&mut cost); + let combined_value_hash = combine_hash(node_value_hash, &referenced_value_hash) + .unwrap_add_cost(&mut cost); + + kv_digest_to_kv_hash(key.as_slice(), &combined_value_hash).flat_map(|kv_hash| { + node_hash_with_sum( + &kv_hash, + &self.child_hash(true), + &self.child_hash(false), + *sum, + ) + }) + } } } @@ -391,8 +451,8 @@ impl 
Tree { } /// Returns the key from this tree node if it's a KV-type node with a key. - /// Returns None for Hash, KVHash, KVHashCount, or HashWithCount node - /// types (which only have hashes, not keys). + /// Returns None for Hash, KVHash, KVHashCount, KVHashSum, HashWithCount, + /// or HashWithSum node types (which only have hashes, not keys). #[cfg(any(feature = "minimal", feature = "verify"))] pub fn key(&self) -> Option<&[u8]> { match &self.node { @@ -404,11 +464,17 @@ impl Tree { | Node::KVDigest(key, ..) | Node::KVDigestCount(key, ..) | Node::KVCount(key, ..) - | Node::KVRefValueHashCount(key, ..) => Some(key.as_slice()), + | Node::KVRefValueHashCount(key, ..) + | Node::KVSum(key, ..) + | Node::KVDigestSum(key, ..) + | Node::KVRefValueHashSum(key, ..) => Some(key.as_slice()), // These nodes don't have keys, only hashes - Node::Hash(_) | Node::KVHash(_) | Node::KVHashCount(..) | Node::HashWithCount(..) => { - None - } + Node::Hash(_) + | Node::KVHash(_) + | Node::KVHashCount(..) + | Node::HashWithCount(..) + | Node::KVHashSum(..) + | Node::HashWithSum(..) => None, } } @@ -421,6 +487,9 @@ impl Tree { } Node::KVCount(_, _, count) => Ok(AggregateData::ProvableCount(*count)), Node::HashWithCount(.., count) => Ok(AggregateData::ProvableCount(*count)), + // ProvableSumTree proof nodes map to ProvableSum. + Node::KVSum(_, _, sum) => Ok(AggregateData::ProvableSum(*sum)), + Node::HashWithSum(.., sum) => Ok(AggregateData::ProvableSum(*sum)), Node::KV(..) | Node::KVValueHash(..) => Ok(AggregateData::NoAggregateData), _ => Err(Error::InvalidProofError( "Cannot extract aggregate data from this node type".to_string(), @@ -655,7 +724,10 @@ where | Node::KVCount(key, ..) | Node::KVRefValueHashCount(key, ..) | Node::KVDigest(key, _) - | Node::KVDigestCount(key, ..) = &node + | Node::KVDigestCount(key, ..) + | Node::KVSum(key, ..) + | Node::KVDigestSum(key, ..) + | Node::KVRefValueHashSum(key, ..) 
= &node
        {
            // keys should always increase
            if let Some(last_key) = &maybe_last_key
@@ -693,7 +765,10 @@ where
            | Node::KVCount(key, ..)
            | Node::KVRefValueHashCount(key, ..)
            | Node::KVDigest(key, _)
-            | Node::KVDigestCount(key, ..) = &node
+            | Node::KVDigestCount(key, ..)
+            | Node::KVSum(key, ..)
+            | Node::KVDigestSum(key, ..)
+            | Node::KVRefValueHashSum(key, ..) = &node
        {
            // keys should always decrease
            if let Some(last_key) = &maybe_last_key
@@ -1128,7 +1203,7 @@ mod test {
        let mut ops: Vec<Result<Op, Error>> = Vec::new();
        let mut current_stack_depth: usize = 0;
-        // Phase 1: push height-(max_height-1) chains onto the stack.
+        // Step 1: push height-(max_height-1) chains onto the stack.
        // Each chain: 1 Push + (max_height-2) × (Push + Parent) = 2*(max_height-1) - 1 ops.
        // Each chain leaves 1 item on the stack (a tree of height max_height-1).
        let ops_per_chain = 2 * (max_height - 1) - 1;
@@ -1142,14 +1217,14 @@
            current_stack_depth += 1;
        }
-        // Phase 2: fill remaining op budget with bare Push(Hash) ops,
+        // Step 2: fill remaining op budget with bare Push(Hash) ops,
        // growing the stack further.
        while ops.len() < max_ops && current_stack_depth < max_stack {
            ops.push(Ok(Op::Push(Node::Hash([0xEE; 32]))));
            current_stack_depth += 1;
        }
-        // Phase 3: one more Push to exceed whichever limit is tighter.
+        // Step 3: one more Push to exceed whichever limit is tighter.
        ops.push(Ok(Op::Push(Node::Hash([0xFF; 32]))));
        let result = execute(ops.into_iter(), true, |_| Ok(())).unwrap();
@@ -1261,4 +1336,185 @@
            .unwrap();
        assert!(result.key().is_some());
    }
+
+    /// `Node::HashWithSum` with a forged sum recomputes to a different node
+    /// hash than the same kv/l/r with the correct sum. The verifier's
+    /// root-hash check therefore catches sum tampering, just as
+    /// `HashWithCount` catches count tampering.
+ #[test] + fn hashwithsum_forged_sum_changes_root_hash() { + use crate::tree::HASH_LENGTH; + let kv = [0xAB; HASH_LENGTH]; + let l = [0xCD; HASH_LENGTH]; + let r = [0xEF; HASH_LENGTH]; + + let honest: ProofTree = Node::HashWithSum(kv, l, r, 100).into(); + let forged: ProofTree = Node::HashWithSum(kv, l, r, 999).into(); + + let honest_hash = honest.hash().unwrap(); + let forged_hash = forged.hash().unwrap(); + assert_ne!( + honest_hash, forged_hash, + "forged sum on HashWithSum must produce a different node hash" + ); + } + + /// `Node::KVSum` hash recomputation is sum-bound. Changing the sum alone + /// (with the same key/value) produces a different node hash, so a + /// malicious prover can't claim a different sum without breaking the + /// parent's hash chain. + #[test] + fn kvsum_forged_sum_changes_root_hash() { + let honest: ProofTree = Node::KVSum(vec![1], vec![2, 3], 7).into(); + let forged: ProofTree = Node::KVSum(vec![1], vec![2, 3], 8).into(); + + let honest_hash = honest.hash().unwrap(); + let forged_hash = forged.hash().unwrap(); + assert_ne!(honest_hash, forged_hash); + } + + /// `Node::KVHashSum` hash recomputation is sum-bound. + #[test] + fn kvhashsum_forged_sum_changes_root_hash() { + use crate::tree::HASH_LENGTH; + let kv_hash = [0x55; HASH_LENGTH]; + let honest: ProofTree = Node::KVHashSum(kv_hash, 0).into(); + let forged: ProofTree = Node::KVHashSum(kv_hash, 1).into(); + assert_ne!(honest.hash().unwrap(), forged.hash().unwrap()); + } + + /// Cornerstone: a ProvableSumTree's root hash diverges from a plain + /// SumTree's root hash for the same {key/value/sum} contents. This + /// proves that the per-node sum participates in the node hash (via + /// `node_hash_with_sum`) instead of being merely tracked alongside + /// `node_hash`. 
+ #[test] + fn provable_sum_tree_diverges_from_plain_sum_tree() { + use crate::tree::{ + node_hash, node_hash_with_sum, + TreeFeatureType::{ProvableSummedMerkNode, SummedMerkNode}, + }; + + // Same key+value, same per-node sum, same children. Only the + // feature_type differs — `SummedMerkNode(5)` vs + // `ProvableSummedMerkNode(5)`. The plain-sum proof tree's hash + // should use `node_hash`, the provable-sum proof tree's hash + // should use `node_hash_with_sum`. + + let key = b"k".to_vec(); + let value = b"v".to_vec(); + let value_hash_v = value_hash(&value).unwrap(); + let computed_kv_hash = kv_digest_to_kv_hash(&key, &value_hash_v).unwrap(); + + // Build standalone leaves (no children) for both proof-node + // variants so the hash math is unambiguous. + let plain_sum: ProofTree = Node::KVValueHashFeatureType( + key.clone(), + value.clone(), + value_hash_v, + SummedMerkNode(5), + ) + .into(); + let provable_sum: ProofTree = + Node::KVValueHashFeatureType(key, value, value_hash_v, ProvableSummedMerkNode(5)) + .into(); + + let plain_hash = plain_sum.hash().unwrap(); + let provable_hash = provable_sum.hash().unwrap(); + + // Sanity-check the math against the underlying primitives: + // plain SumTree node hashes via `node_hash` (sum not in hash). + // ProvableSumTree node hashes via `node_hash_with_sum`. + let expected_plain = node_hash(&computed_kv_hash, &NULL_HASH, &NULL_HASH).unwrap(); + let expected_provable = + node_hash_with_sum(&computed_kv_hash, &NULL_HASH, &NULL_HASH, 5).unwrap(); + assert_eq!(plain_hash, expected_plain); + assert_eq!(provable_hash, expected_provable); + + // The cornerstone: same contents, different cryptographic identity. + assert_ne!( + plain_hash, provable_hash, + "ProvableSumTree root hash must diverge from plain SumTree root \ + hash with identical contents" + ); + } + + /// `Node::KVDigestSum` hash recomputation is sum-bound. Changing the sum + /// alone produces a different node hash. 
+ #[test] + fn kvdigestsum_forged_sum_changes_root_hash() { + use crate::tree::HASH_LENGTH; + let key = b"k".to_vec(); + let value_hash_bytes = [0x77; HASH_LENGTH]; + let honest: ProofTree = Node::KVDigestSum(key.clone(), value_hash_bytes, 42).into(); + let forged: ProofTree = Node::KVDigestSum(key, value_hash_bytes, 43).into(); + assert_ne!(honest.hash().unwrap(), forged.hash().unwrap()); + } + + /// `Node::KVRefValueHashSum` hash recomputation is sum-bound. Exercises + /// the full combined-hash path (combine(referenced_value_hash, + /// node_value_hash) → kv_digest_to_kv_hash → node_hash_with_sum). This + /// is the only place in the proof verifier where the reference's combined + /// hash logic is wired up to the sum-bearing hash. + #[test] + fn kvrefvaluehashsum_forged_sum_changes_root_hash() { + use crate::tree::HASH_LENGTH; + let key = b"k".to_vec(); + let value = b"v".to_vec(); + let node_value_hash = [0x33; HASH_LENGTH]; + let honest: ProofTree = + Node::KVRefValueHashSum(key.clone(), value.clone(), node_value_hash, 100).into(); + let forged: ProofTree = Node::KVRefValueHashSum(key, value, node_value_hash, 101).into(); + assert_ne!( + honest.hash().unwrap(), + forged.hash().unwrap(), + "forged sum on KVRefValueHashSum must produce a different node hash" + ); + } + + /// `aggregate_data()` on a Sum-bearing proof node must surface + /// `AggregateData::ProvableSum(_)`. This covers both `Node::KVSum` and + /// `Node::HashWithSum` arms of the `aggregate_data` match. 
+ #[test] + fn aggregate_data_returns_provable_sum_for_sum_nodes() { + use crate::tree::{AggregateData, HASH_LENGTH}; + + let kv_sum: ProofTree = Node::KVSum(b"k".to_vec(), b"v".to_vec(), -42).into(); + match kv_sum.aggregate_data().expect("aggregate_data ok") { + AggregateData::ProvableSum(s) => assert_eq!(s, -42), + other => panic!("expected ProvableSum, got {:?}", other), + } + + let hash_with_sum: ProofTree = + Node::HashWithSum([0; HASH_LENGTH], [0; HASH_LENGTH], [0; HASH_LENGTH], 1234).into(); + match hash_with_sum.aggregate_data().expect("aggregate_data ok") { + AggregateData::ProvableSum(s) => assert_eq!(s, 1234), + other => panic!("expected ProvableSum, got {:?}", other), + } + } + + /// `Tree::key()` must return the key for the three keyed Sum variants + /// (`KVSum`, `KVDigestSum`, `KVRefValueHashSum`) and `None` for the + /// keyless variants (`KVHashSum`, `HashWithSum`). + #[test] + fn key_returns_correct_key_for_sum_nodes() { + use crate::tree::HASH_LENGTH; + + let kv_sum: ProofTree = Node::KVSum(b"a".to_vec(), vec![1], 0).into(); + assert_eq!(kv_sum.key(), Some(b"a".as_slice())); + + let kv_digest_sum: ProofTree = Node::KVDigestSum(b"b".to_vec(), [0; HASH_LENGTH], 0).into(); + assert_eq!(kv_digest_sum.key(), Some(b"b".as_slice())); + + let kv_ref: ProofTree = + Node::KVRefValueHashSum(b"c".to_vec(), vec![1], [0; HASH_LENGTH], 0).into(); + assert_eq!(kv_ref.key(), Some(b"c".as_slice())); + + let kv_hash_sum: ProofTree = Node::KVHashSum([0; HASH_LENGTH], 0).into(); + assert_eq!(kv_hash_sum.key(), None); + + let hash_with_sum: ProofTree = + Node::HashWithSum([0; HASH_LENGTH], [0; HASH_LENGTH], [0; HASH_LENGTH], 0).into(); + assert_eq!(hash_with_sum.key(), None); + } } diff --git a/merk/src/tree/hash.rs b/merk/src/tree/hash.rs index 3fec80cf4..4153a479b 100644 --- a/merk/src/tree/hash.rs +++ b/merk/src/tree/hash.rs @@ -164,3 +164,103 @@ pub fn node_hash_with_count( ..Default::default() }) } + +#[cfg(any(feature = "minimal", feature = "verify"))] +/// 
Hashes a node for ProvableSumTree, including the aggregate sum.
+///
+/// Parallel to `node_hash_with_count` but for sum-bearing aggregates.
+/// The i64 sum is appended via its big-endian byte representation (8 bytes,
+/// fixed-width, deterministic). This is content-binding only — no order
+/// preservation is needed since the bytes are part of the hash input.
+/// Negative sums hash via their two's-complement big-endian form, which is
+/// deterministic regardless of the platform.
+pub fn node_hash_with_sum(
+    kv: &CryptoHash,
+    left: &CryptoHash,
+    right: &CryptoHash,
+    sum: i64,
+) -> CostContext<CryptoHash> {
+    let mut hasher = blake3::Hasher::new();
+    hasher.update(kv);
+    hasher.update(left);
+    hasher.update(right);
+    hasher.update(&sum.to_be_bytes());
+
+    // hashes will always be 2 (same shape as node_hash_with_count)
+    let hashes = 2;
+
+    let res = hasher.finalize();
+    let mut hash: CryptoHash = Default::default();
+    hash.copy_from_slice(res.as_bytes());
+    hash.wrap_with_cost(OperationCost {
+        hash_node_calls: hashes,
+        ..Default::default()
+    })
+}
+
+#[cfg(test)]
+#[cfg(feature = "minimal")]
+mod tests {
+    use grovedb_costs::CostsExt;
+
+    use super::{node_hash, node_hash_with_sum, CryptoHash, HASH_LENGTH};
+
+    fn h(byte: u8) -> CryptoHash {
+        [byte; HASH_LENGTH]
+    }
+
+    #[test]
+    fn node_hash_with_sum_differs_from_node_hash_at_zero_sum() {
+        // The sum bytes are appended even when zero, so the hash must
+        // differ from a plain `node_hash` with the same kv/l/r inputs.
+ let kv = h(1); + let l = h(2); + let r = h(3); + let with_sum = node_hash_with_sum(&kv, &l, &r, 0).unwrap(); + let without_sum = node_hash(&kv, &l, &r).unwrap(); + assert_ne!( + with_sum, without_sum, + "node_hash_with_sum at sum=0 must NOT equal node_hash" + ); + } + + #[test] + fn node_hash_with_sum_different_sums_produce_different_hashes() { + let kv = h(4); + let l = h(5); + let r = h(6); + let a = node_hash_with_sum(&kv, &l, &r, 0).unwrap(); + let b = node_hash_with_sum(&kv, &l, &r, 1).unwrap(); + let c = node_hash_with_sum(&kv, &l, &r, -1).unwrap(); + let d = node_hash_with_sum(&kv, &l, &r, 42).unwrap(); + assert_ne!(a, b); + assert_ne!(a, c); + assert_ne!(a, d); + assert_ne!(b, c); + assert_ne!(b, d); + assert_ne!(c, d); + } + + #[test] + fn node_hash_with_sum_extremes_distinct() { + let kv = h(7); + let l = h(8); + let r = h(9); + let min = node_hash_with_sum(&kv, &l, &r, i64::MIN).unwrap(); + let max = node_hash_with_sum(&kv, &l, &r, i64::MAX).unwrap(); + let zero = node_hash_with_sum(&kv, &l, &r, 0).unwrap(); + assert_ne!(min, max); + assert_ne!(min, zero); + assert_ne!(max, zero); + } + + #[test] + fn node_hash_with_sum_is_deterministic() { + let kv = h(0xaa); + let l = h(0xbb); + let r = h(0xcc); + let a = node_hash_with_sum(&kv, &l, &r, -7).unwrap(); + let b = node_hash_with_sum(&kv, &l, &r, -7).unwrap(); + assert_eq!(a, b); + } +} diff --git a/merk/src/tree/link.rs b/merk/src/tree/link.rs index 3b6f0173d..325817a59 100644 --- a/merk/src/tree/link.rs +++ b/merk/src/tree/link.rs @@ -314,7 +314,8 @@ impl Link { AggregateData::NoAggregateData => key.len() + 36, // 1 + HASH_LENGTH + 2 + 1, AggregateData::Count(_) | AggregateData::Sum(_) - | AggregateData::ProvableCount(_) => { + | AggregateData::ProvableCount(_) + | AggregateData::ProvableSum(_) => { // 1 for key len // key_len for keys // 32 for hash @@ -358,7 +359,8 @@ impl Link { AggregateData::NoAggregateData => tree.key().len() + 36, // 1 + 32 + 2 + 1, AggregateData::Count(_) | AggregateData::Sum(_) 
- | AggregateData::ProvableCount(_) => { + | AggregateData::ProvableCount(_) + | AggregateData::ProvableSum(_) => { tree.key().len() + 44 // 1 + 32 + 2 + 1 + 8 } AggregateData::BigSum(_) @@ -442,6 +444,15 @@ impl Encode for Link { out.write_varint(*count_value)?; out.write_varint(*sum_value)?; } + // Tag byte 7 parallels the `TreeFeatureType::ProvableSummedMerkNode` + // tag in `tree_feature_type.rs`. Sum encoded as varint i64 — same + // layout as `AggregateData::Sum`. The hash divergence happens + // upstream in `hash_for_link` / `commit`; the on-link encoding + // just preserves the variant for later dispatch. + AggregateData::ProvableSum(sum_value) => { + out.write_all(&[7])?; + out.write_varint(*sum_value)?; + } } Ok(()) @@ -507,6 +518,10 @@ impl Encode for Link { let encoded_count_value = count.encode_var_vec(); key.len() + encoded_sum_value.len() + encoded_count_value.len() + 36 } + AggregateData::ProvableSum(sum_value) => { + let encoded_sum_value = sum_value.encode_var_vec(); + key.len() + encoded_sum_value.len() + 36 + } }, Link::Modified { .. } => { return Err(ed::Error::IOError(std::io::Error::new( @@ -550,6 +565,10 @@ impl Encode for Link { let encoded_count_value = count.encode_var_vec(); tree.key().len() + encoded_sum_value.len() + encoded_count_value.len() + 36 } + AggregateData::ProvableSum(sum_value) => { + let encoded_sum_value = sum_value.encode_var_vec(); + tree.key().len() + encoded_sum_value.len() + 36 + } }, }) } @@ -631,6 +650,11 @@ impl Decode for Link { let encoded_sum: i64 = input.read_varint()?; AggregateData::ProvableCountAndSum(encoded_count, encoded_sum) } + // ProvableSum decode — matches encode tag 7. 
+ 7 => { + let encoded_sum: i64 = input.read_varint()?; + AggregateData::ProvableSum(encoded_sum) + } byte => return Err(ed::Error::UnexpectedByte(byte)), }; } else { @@ -899,4 +923,47 @@ mod test { let link = Link::decode(bytes.as_slice()).expect("expected to decode a link"); assert_eq!(link.aggregate_data(), AggregateData::NoAggregateData); } + + /// Wire-format regression: `AggregateData::ProvableSum` is encoded with + /// tag byte 7 followed by a varint-encoded i64. Pin down both the tag + /// byte and the round-trip so any drift in the link encoding surface is + /// caught immediately. Uses a negative value to also exercise the i64 + /// varint encoding (ProvableSum is signed). + #[test] + fn round_trip_aggregate_data_provable_sum_negative() { + let original = Link::Reference { + hash: [55; 32], + aggregate_data: AggregateData::ProvableSum(-42), + child_heights: (1, 2), + key: vec![9, 9, 9], + }; + let mut bytes = vec![]; + original + .encode_into(&mut bytes) + .expect("encode ProvableSum link"); + // Tag byte 7 lives at the end of the encoded record, just before + // the varint sum. We don't pin the exact varint bytes (they + // depend on the integer encoding), but we do pin tag 7's + // presence. + assert!( + bytes.contains(&7u8), + "ProvableSum encoding must include tag byte 7, got {:?}", + bytes + ); + let decoded = Link::decode(bytes.as_slice()).expect("decode ProvableSum link"); + assert_eq!(decoded.aggregate_data(), AggregateData::ProvableSum(-42)); + if let Link::Reference { + hash, + child_heights, + key, + .. 
+ } = decoded + { + assert_eq!(hash, [55; 32]); + assert_eq!(child_heights, (1, 2)); + assert_eq!(key, vec![9, 9, 9]); + } else { + panic!("expected Link::Reference after decode"); + } + } } diff --git a/merk/src/tree/mod.rs b/merk/src/tree/mod.rs index 37e04e21e..18f722c16 100644 --- a/merk/src/tree/mod.rs +++ b/merk/src/tree/mod.rs @@ -46,8 +46,8 @@ use grovedb_costs::{ use grovedb_version::version::GroveVersion; #[cfg(any(feature = "minimal", feature = "verify"))] pub use hash::{ - combine_hash, kv_digest_to_kv_hash, kv_hash, node_hash, node_hash_with_count, value_hash, - CryptoHash, HASH_LENGTH, NULL_HASH, + combine_hash, kv_digest_to_kv_hash, kv_hash, node_hash, node_hash_with_count, + node_hash_with_sum, value_hash, CryptoHash, HASH_LENGTH, NULL_HASH, }; #[cfg(feature = "minimal")] pub use hash::{HASH_BLOCK_SIZE, HASH_BLOCK_SIZE_U32, HASH_LENGTH_U32, HASH_LENGTH_U32_X2}; @@ -470,6 +470,7 @@ impl TreeNode { AggregateData::ProvableCountAndSum(c, s) => { s.encode_var_vec().len() as u32 + c.encode_var_vec().len() as u32 } + AggregateData::ProvableSum(s) => s.encode_var_vec().len() as u32, }, ) }) @@ -549,6 +550,12 @@ impl TreeNode { AggregateData::CountAndSum(_, s) => Ok(s), AggregateData::ProvableCount(_) => Ok(0), AggregateData::ProvableCountAndSum(_, s) => Ok(s), + // `ProvableSum` contributes to sum aggregation exactly like + // `Sum`. GroveDB enforces homogeneous feature types per Merk + // tree, so a sum-bearing parent will only see sum-bearing + // children (Sum or ProvableSum) — this arm is reached when + // the tree itself is a ProvableSumTree. + AggregateData::ProvableSum(s) => Ok(s), }, _ => Ok(0), } @@ -578,6 +585,8 @@ impl TreeNode { AggregateData::CountAndSum(c, _) => Ok(c), AggregateData::ProvableCount(c) => Ok(c), AggregateData::ProvableCountAndSum(c, _) => Ok(c), + // `ProvableSum` carries no count; behaves like `Sum`. 
+ AggregateData::ProvableSum(_) => Ok(0), }, _ => Ok(0), } @@ -605,6 +614,8 @@ impl TreeNode { AggregateData::CountAndSum(_, s) => s as i128, AggregateData::ProvableCount(_) => 0, AggregateData::ProvableCountAndSum(_, s) => s as i128, + // `ProvableSum` widens to i128 the same way `Sum` does. + AggregateData::ProvableSum(s) => s as i128, }, _ => 0, } @@ -622,7 +633,28 @@ impl TreeNode { } /// Computes and returns the hash of the root node, including aggregate data - /// for ProvableCountTree and ProvableCountSumTree. + /// for ProvableCountTree, ProvableCountSumTree, and ProvableSumTree. + /// + /// # Fail-closed invariant + /// + /// The three `Provable*` arms expect `aggregate_data()` to return a + /// matching `AggregateData::Provable*` variant — that pairing is the + /// definition of a well-formed provable tree. If `aggregate_data()` + /// fails (e.g. arithmetic overflow during sum/count computation) or + /// returns an unrelated variant (which would mean the node's + /// `feature_type` field is inconsistent with the requested + /// `tree_type`), we **panic** instead of silently downgrading to + /// `self.hash()`. + /// + /// Earlier revisions of this code returned a plain `node_hash(...)` + /// in the fallback. That preserved soundness end-to-end (any caller + /// who later re-hashes the same tree via the provable path would get + /// a different result and verification would fail), but it stripped + /// the sum/count commitment at the source — a verifier expecting + /// `node_hash_with_sum` would see a plain `node_hash` and report a + /// confusing root mismatch with no indication of the underlying + /// invariant break. The fail-closed gate surfaces the corruption at + /// the failing call site instead. 
    #[inline]
    pub fn hash_for_link(&self, tree_type: TreeType) -> CostContext<CryptoHash> {
        match tree_type {
@@ -630,7 +662,7 @@
                // For ProvableCountTree, include the aggregate count in the hash
                let aggregate_data = self
                    .aggregate_data()
-                    .unwrap_or(AggregateData::NoAggregateData);
+                    .expect("ProvableCountTree::hash_for_link: aggregate_data() failed");
                if let AggregateData::ProvableCount(count) = aggregate_data {
                    node_hash_with_count(
                        self.inner.kv.hash(),
@@ -639,15 +671,19 @@
                        count,
                    )
                } else {
-                    // Fallback to regular hash if aggregate data is unexpected
-                    self.hash()
+                    panic!(
+                        "ProvableCountTree::hash_for_link: expected \
+                         AggregateData::ProvableCount, got {:?}; the node's \
+                         feature_type is inconsistent with its tree_type",
+                        aggregate_data
+                    );
                }
            }
            TreeType::ProvableCountSumTree => {
                // For ProvableCountSumTree, include only the count in the hash (not the sum)
                let aggregate_data = self
                    .aggregate_data()
-                    .unwrap_or(AggregateData::NoAggregateData);
+                    .expect("ProvableCountSumTree::hash_for_link: aggregate_data() failed");
                if let AggregateData::ProvableCountAndSum(count, _) = aggregate_data {
                    node_hash_with_count(
                        self.inner.kv.hash(),
@@ -656,8 +692,35 @@
                        count,
                    )
                } else {
-                    // Fallback to regular hash if aggregate data is unexpected
-                    self.hash()
+                    panic!(
+                        "ProvableCountSumTree::hash_for_link: expected \
+                         AggregateData::ProvableCountAndSum, got {:?}; the node's \
+                         feature_type is inconsistent with its tree_type",
+                        aggregate_data
+                    );
+                }
+            }
+            TreeType::ProvableSumTree => {
+                // For ProvableSumTree, include the aggregate sum in the hash
+                // via `node_hash_with_sum`. This is what makes the root hash
+                // diverge from a plain SumTree containing the same elements.
+ let aggregate_data = self + .aggregate_data() + .expect("ProvableSumTree::hash_for_link: aggregate_data() failed"); + if let AggregateData::ProvableSum(sum) = aggregate_data { + node_hash_with_sum( + self.inner.kv.hash(), + self.child_hash(true), + self.child_hash(false), + sum, + ) + } else { + panic!( + "ProvableSumTree::hash_for_link: expected \ + AggregateData::ProvableSum, got {:?}; the node's \ + feature_type is inconsistent with its tree_type", + aggregate_data + ); } } _ => self.hash(), @@ -762,6 +825,21 @@ impl TreeNode { aggregated_sum_value, )) } + // `ProvableSummedMerkNode` aggregates exactly like a plain + // `SummedMerkNode` arithmetically, but yields a distinct + // `AggregateData::ProvableSum` so the hash dispatch can route + // through `node_hash_with_sum` (which bakes the sum into the + // node hash). The child helpers above treat `ProvableSum` + // identically to `Sum` for sum collection. + TreeFeatureType::ProvableSummedMerkNode(value) => { + let left = self.child_aggregate_sum_data_as_i64(true)?; + let right = self.child_aggregate_sum_data_as_i64(false)?; + value + .checked_add(left) + .and_then(|a| a.checked_add(right)) + .ok_or(Overflow("provable sum is overflowing")) + .map(AggregateData::ProvableSum) + } } } @@ -1183,7 +1261,8 @@ impl TreeNode { cost_return_on_error!(&mut cost, tree.commit(c, old_specialized_cost,)); let aggregate_data = cost_return_on_error_default!(tree.aggregate_data()); - // Use special hash for ProvableCountTree and ProvableCountSumTree + // Use special hash for ProvableCountTree, ProvableCountSumTree, + // and ProvableSumTree. 
let hash = match &aggregate_data { AggregateData::ProvableCount(count) => node_hash_with_count( tree.inner.kv.hash(), @@ -1199,6 +1278,13 @@ impl TreeNode { *count, ) .unwrap_add_cost(&mut cost), + AggregateData::ProvableSum(sum) => node_hash_with_sum( + tree.inner.kv.hash(), + tree.child_hash(true), + tree.child_hash(false), + *sum, + ) + .unwrap_add_cost(&mut cost), _ => tree.hash().unwrap_add_cost(&mut cost), }; self.inner.left = Some(Link::Loaded { @@ -1223,7 +1309,8 @@ impl TreeNode { // println!("key is {}", std::str::from_utf8(tree.key()).unwrap()); cost_return_on_error!(&mut cost, tree.commit(c, old_specialized_cost,)); let aggregate_data = cost_return_on_error_default!(tree.aggregate_data()); - // Use special hash for ProvableCountTree and ProvableCountSumTree + // Use special hash for ProvableCountTree, ProvableCountSumTree, + // and ProvableSumTree. let hash = match &aggregate_data { AggregateData::ProvableCount(count) => node_hash_with_count( tree.inner.kv.hash(), @@ -1239,6 +1326,13 @@ impl TreeNode { *count, ) .unwrap_add_cost(&mut cost), + AggregateData::ProvableSum(sum) => node_hash_with_sum( + tree.inner.kv.hash(), + tree.child_hash(true), + tree.child_hash(false), + *sum, + ) + .unwrap_add_cost(&mut cost), _ => tree.hash().unwrap_add_cost(&mut cost), }; self.inner.right = Some(Link::Loaded { @@ -1578,4 +1672,151 @@ mod test { .expect("expected to get sum from tree") ); } + + /// A `ProvableSumTree`-style tree (built from `ProvableSummedMerkNode` + /// features) aggregates to `ProvableSum(N)` where N is the sum of its + /// node values + children. The root hash for such a tree, computed via + /// `hash_for_link(TreeType::ProvableSumTree)`, must equal + /// `node_hash_with_sum(kv_hash, l, r, N)` rather than the plain + /// `node_hash`. 
+ #[test] + fn provable_sum_tree_aggregates_and_hashes_sum() { + use crate::tree::{ + hash::{node_hash, node_hash_with_sum, NULL_HASH}, + tree_feature_type::TreeFeatureType::ProvableSummedMerkNode, + }; + use crate::TreeType; + + let mut tree = TreeNode::new(vec![0], vec![1], None, ProvableSummedMerkNode(3)) + .unwrap() + .attach( + false, + Some(TreeNode::new(vec![2], vec![3], None, ProvableSummedMerkNode(5)).unwrap()), + ) + .unwrap(); + tree.commit(&mut NoopCommit {}, &|_, _| Ok(0)) + .unwrap() + .expect("commit failed"); + + // Aggregate is `ProvableSum` (not plain `Sum`), so the hash dispatch + // routes through `node_hash_with_sum`. + assert_eq!( + AggregateData::ProvableSum(8), + tree.aggregate_data() + .expect("expected to get provable sum from tree") + ); + + // The root hash via the ProvableSumTree dispatch matches + // `node_hash_with_sum(kv, l, r, 8)`. It does NOT match the plain + // `node_hash` — that's the cryptographic divergence of ProvableSumTree. + let kv_hash = *tree.inner.kv.hash(); + let l = *tree.child_hash(true); + let r = *tree.child_hash(false); + + let expected_with_sum = node_hash_with_sum(&kv_hash, &l, &r, 8).unwrap(); + let expected_without_sum = node_hash(&kv_hash, &l, &r).unwrap(); + assert_ne!(expected_with_sum, expected_without_sum); + + let actual = tree.hash_for_link(TreeType::ProvableSumTree).unwrap(); + assert_eq!(actual, expected_with_sum); + + // Sanity: the plain `Tree::hash()` method still produces the + // sum-less hash. Only `hash_for_link(ProvableSumTree)` baked the + // sum in. + assert_eq!(tree.hash().unwrap(), expected_without_sum); + assert_ne!(tree.hash().unwrap(), actual); + } + + /// Mutating any node's sum changes the root hash for a ProvableSumTree. + /// This is the proof-tampering detection at the Merk-tree level. 
+ #[test] + fn provable_sum_tree_root_hash_changes_on_sum_mutation() { + use crate::tree::tree_feature_type::TreeFeatureType::ProvableSummedMerkNode; + use crate::TreeType; + + let make_tree = |right_sum: i64| -> TreeNode { + let mut t = TreeNode::new(vec![0], vec![1], None, ProvableSummedMerkNode(3)) + .unwrap() + .attach( + false, + Some( + TreeNode::new(vec![2], vec![3], None, ProvableSummedMerkNode(right_sum)) + .unwrap(), + ), + ) + .unwrap(); + t.commit(&mut NoopCommit {}, &|_, _| Ok(0)) + .unwrap() + .expect("commit failed"); + t + }; + + let tree_a = make_tree(5); + let tree_b = make_tree(6); // right child's sum bumped by 1 + + let hash_a = tree_a.hash_for_link(TreeType::ProvableSumTree).unwrap(); + let hash_b = tree_b.hash_for_link(TreeType::ProvableSumTree).unwrap(); + assert_ne!( + hash_a, hash_b, + "mutating a node's sum must change the ProvableSumTree root hash" + ); + } + + /// Calling `hash_for_link(TreeType::ProvableSumTree)` on a node whose + /// `feature_type` is anything other than `ProvableSummedMerkNode` is a + /// contract violation — the caller asked for a sum-bearing hash but + /// the node can't produce a `ProvableSum`. Earlier revisions of + /// `hash_for_link` silently downgraded to `self.hash()` in this + /// case, stripping the sum commitment and producing a hash that any + /// honest verifier would later reject with a confusing root + /// mismatch. The fail-closed gate now panics so the corruption is + /// surfaced at the failing call site. + #[test] + #[should_panic(expected = "ProvableSumTree::hash_for_link")] + fn provable_sum_tree_hash_for_link_panics_on_feature_type_mismatch() { + use crate::TreeType; + + // Node with a plain `SummedMerkNode` (not provable). 
+ // The merk tree-builder normally only emits a matching feature_type + // for a given tree_type, but a corrupted on-disk record or a buggy + // caller could route a non-provable-summed node through this + // path, so `hash_for_link` must abort instead of silently + // producing a stripped hash. + let mut tree = TreeNode::new(vec![0], vec![1], None, SummedMerkNode(42)).unwrap(); + tree.commit(&mut NoopCommit {}, &|_, _| Ok(0)) + .unwrap() + .expect("commit failed"); + + // Trips the fail-closed `else` arm. + let _ = tree.hash_for_link(TreeType::ProvableSumTree); + } + + /// Mirror for `ProvableCountTree`: a non-counted feature_type + /// shouldn't silently emit a stripped hash. + #[test] + #[should_panic(expected = "ProvableCountTree::hash_for_link")] + fn provable_count_tree_hash_for_link_panics_on_feature_type_mismatch() { + use crate::TreeType; + + let mut tree = TreeNode::new(vec![0], vec![1], None, BasicMerkNode).unwrap(); + tree.commit(&mut NoopCommit {}, &|_, _| Ok(0)) + .unwrap() + .expect("commit failed"); + + let _ = tree.hash_for_link(TreeType::ProvableCountTree); + } + + /// Mirror for `ProvableCountSumTree`. + #[test] + #[should_panic(expected = "ProvableCountSumTree::hash_for_link")] + fn provable_count_sum_tree_hash_for_link_panics_on_feature_type_mismatch() { + use crate::TreeType; + + let mut tree = TreeNode::new(vec![0], vec![1], None, BasicMerkNode).unwrap(); + tree.commit(&mut NoopCommit {}, &|_, _| Ok(0)) + .unwrap() + .expect("commit failed"); + + let _ = tree.hash_for_link(TreeType::ProvableCountSumTree); + } } diff --git a/merk/src/tree/tree_feature_type.rs b/merk/src/tree/tree_feature_type.rs index e28db40e0..e788bb910 100644 --- a/merk/src/tree/tree_feature_type.rs +++ b/merk/src/tree/tree_feature_type.rs @@ -29,6 +29,14 @@ pub enum AggregateData { ProvableCount(u64), /// A provable combined element count and sum. ProvableCountAndSum(u64, i64), + /// A provable signed 64-bit sum value (sum baked into node hash).
+ /// + /// Distinct from `Sum` so the hash dispatch in `Tree::hash_for_link` and + /// the `commit` path can route a `ProvableSumTree` aggregate through + /// `node_hash_with_sum` instead of the plain `node_hash`. Arithmetic + /// semantics are identical to `Sum` (i64, checked-add aggregation); + /// only the hash treatment differs. + ProvableSum(i64), } #[cfg(feature = "minimal")] @@ -43,6 +51,7 @@ impl AggregateData { AggregateData::CountAndSum(..) => TreeType::CountSumTree, AggregateData::ProvableCount(_) => TreeType::ProvableCountTree, AggregateData::ProvableCountAndSum(..) => TreeType::ProvableCountSumTree, + AggregateData::ProvableSum(_) => TreeType::ProvableSumTree, } } @@ -64,6 +73,7 @@ impl AggregateData { AggregateData::CountAndSum(_, s) => *s, AggregateData::ProvableCount(_) => 0, AggregateData::ProvableCountAndSum(_, s) => *s, + AggregateData::ProvableSum(s) => *s, } } @@ -77,6 +87,7 @@ impl AggregateData { AggregateData::CountAndSum(c, _) => *c, AggregateData::ProvableCount(c) => *c, AggregateData::ProvableCountAndSum(c, _) => *c, + AggregateData::ProvableSum(_) => 0, } } @@ -90,6 +101,7 @@ impl AggregateData { AggregateData::CountAndSum(_, s) => *s as i128, AggregateData::ProvableCount(_) => 0, AggregateData::ProvableCountAndSum(_, s) => *s as i128, + AggregateData::ProvableSum(s) => *s as i128, } } } @@ -107,6 +119,12 @@ impl From for AggregateData { TreeFeatureType::ProvableCountedSummedMerkNode(count, sum) => { AggregateData::ProvableCountAndSum(count, sum) } + // `ProvableSummedMerkNode` maps to its own + // `AggregateData::ProvableSum` variant so the hash dispatch + // (in `Tree::hash_for_link` and `commit`) can route a + // ProvableSumTree through `node_hash_with_sum`. Arithmetic + // semantics still mirror a plain `Sum` aggregation. 
+ TreeFeatureType::ProvableSummedMerkNode(val) => AggregateData::ProvableSum(val), } } } @@ -143,6 +161,10 @@ mod tests { AggregateData::ProvableCountAndSum(1, 2).parent_tree_type(), TreeType::ProvableCountSumTree ); + assert_eq!( + AggregateData::ProvableSum(7).parent_tree_type(), + TreeType::ProvableSumTree + ); } #[test] @@ -165,6 +187,8 @@ mod tests { assert_eq!(AggregateData::CountAndSum(5, 20).as_sum_i64(), 20); assert_eq!(AggregateData::ProvableCount(3).as_sum_i64(), 0); assert_eq!(AggregateData::ProvableCountAndSum(1, -7).as_sum_i64(), -7); + assert_eq!(AggregateData::ProvableSum(42).as_sum_i64(), 42); + assert_eq!(AggregateData::ProvableSum(-1).as_sum_i64(), -1); } #[test] @@ -176,6 +200,7 @@ mod tests { assert_eq!(AggregateData::CountAndSum(5, 20).as_count_u64(), 5); assert_eq!(AggregateData::ProvableCount(3).as_count_u64(), 3); assert_eq!(AggregateData::ProvableCountAndSum(7, -1).as_count_u64(), 7); + assert_eq!(AggregateData::ProvableSum(42).as_count_u64(), 0); } #[test] @@ -190,6 +215,8 @@ mod tests { AggregateData::ProvableCountAndSum(1, 50).as_summed_i128(), 50 ); + assert_eq!(AggregateData::ProvableSum(42).as_summed_i128(), 42); + assert_eq!(AggregateData::ProvableSum(-1).as_summed_i128(), -1); } #[test] @@ -222,5 +249,15 @@ mod tests { AggregateData::from(TreeFeatureType::ProvableCountedSummedMerkNode(1, 2)), AggregateData::ProvableCountAndSum(1, 2) ); + // `ProvableSummedMerkNode` maps to its dedicated + // `AggregateData::ProvableSum` variant. 
+ assert_eq!( + AggregateData::from(TreeFeatureType::ProvableSummedMerkNode(42)), + AggregateData::ProvableSum(42) + ); + assert_eq!( + AggregateData::from(TreeFeatureType::ProvableSummedMerkNode(-1)), + AggregateData::ProvableSum(-1) + ); } } diff --git a/merk/src/tree_type/costs.rs b/merk/src/tree_type/costs.rs index eb6f615bc..a2726659e 100644 --- a/merk/src/tree_type/costs.rs +++ b/merk/src/tree_type/costs.rs @@ -65,6 +65,8 @@ impl CostSize for TreeType { TreeType::MmrTree => MMR_TREE_COST_SIZE, TreeType::BulkAppendTree(_) => BULK_APPEND_TREE_COST_SIZE, TreeType::DenseAppendOnlyFixedSizeTree(_) => DENSE_TREE_COST_SIZE, + // ProvableSumTree mirrors SumTree's cost. + TreeType::ProvableSumTree => SUM_TREE_COST_SIZE, } } } diff --git a/merk/src/tree_type/mod.rs b/merk/src/tree_type/mod.rs index 5154f393d..23a02e9bd 100644 --- a/merk/src/tree_type/mod.rs +++ b/merk/src/tree_type/mod.rs @@ -44,6 +44,16 @@ pub enum TreeType { BulkAppendTree(u8), /// A dense append-only tree with fixed-size entries and a configurable height. DenseAppendOnlyFixedSizeTree(u8), + /// A sum tree with provable sum support — the aggregate `i64` sum is + /// baked into every node's hash via `node_hash_with_sum`. This is the + /// sum-side counterpart to `ProvableCountTree`: tampering with the + /// stored sum changes the node hash and is therefore catchable by + /// proof verification, unlike the plain `SumTree` where the sum is + /// stored alongside but not bound into the hash. Uses dedicated + /// proof-node families (`KVSum`, `KVHashSum`, `KVDigestSum`, + /// `KVRefValueHashSum`, `HashWithSum`, and the `AggregateSumOnRange` + /// query). 
+ ProvableSumTree, } impl TreeType { @@ -63,6 +73,7 @@ impl TreeType { TreeType::MmrTree => 8, TreeType::BulkAppendTree(_) => 9, TreeType::DenseAppendOnlyFixedSizeTree(_) => 10, + TreeType::ProvableSumTree => 11, } } } @@ -83,7 +94,8 @@ impl TryFrom for TreeType { 8 => Ok(TreeType::MmrTree), 9 => Ok(TreeType::BulkAppendTree(0)), 10 => Ok(TreeType::DenseAppendOnlyFixedSizeTree(0)), - n => Err(Error::UnknownTreeType(format!("got {}, max is 10", n))), + 11 => Ok(TreeType::ProvableSumTree), + n => Err(Error::UnknownTreeType(format!("got {}, max is 11", n))), } } } @@ -102,6 +114,7 @@ impl fmt::Display for TreeType { TreeType::MmrTree => "MMR Tree", TreeType::BulkAppendTree(_) => "BulkAppendTree", TreeType::DenseAppendOnlyFixedSizeTree(_) => "Dense Tree", + TreeType::ProvableSumTree => "Provable Sum Tree", }; write!(f, "{}", s) } @@ -112,6 +125,8 @@ impl TreeType { /// non-Merk entries. These types have an always-empty Merk subtree and /// never contain child subtrees. pub fn uses_non_merk_data_storage(&self) -> bool { + // NOTE: `ProvableSumTree` is intentionally NOT in this list — it is + // a standard Merk-backed tree, just like `SumTree`. 
matches!( self, TreeType::CommitmentTree(_) @@ -146,6 +161,7 @@ impl TreeType { | TreeType::BigSumTree | TreeType::CountSumTree | TreeType::ProvableCountSumTree + | TreeType::ProvableSumTree ) } @@ -176,6 +192,7 @@ impl TreeType { TreeType::MmrTree => false, TreeType::BulkAppendTree(_) => false, TreeType::DenseAppendOnlyFixedSizeTree(_) => false, + TreeType::ProvableSumTree => true, } } @@ -194,6 +211,7 @@ impl TreeType { TreeType::MmrTree => NodeType::NormalNode, TreeType::BulkAppendTree(_) => NodeType::NormalNode, TreeType::DenseAppendOnlyFixedSizeTree(_) => NodeType::NormalNode, + TreeType::ProvableSumTree => NodeType::ProvableSumNode, } } @@ -211,6 +229,7 @@ impl TreeType { TreeType::MmrTree => TreeFeatureType::BasicMerkNode, TreeType::BulkAppendTree(_) => TreeFeatureType::BasicMerkNode, TreeType::DenseAppendOnlyFixedSizeTree(_) => TreeFeatureType::BasicMerkNode, + TreeType::ProvableSumTree => TreeFeatureType::ProvableSummedMerkNode(0), } } @@ -235,6 +254,7 @@ impl TreeType { TreeType::DenseAppendOnlyFixedSizeTree(_) => { Some(ElementType::DenseAppendOnlyFixedSizeTree) } + TreeType::ProvableSumTree => Some(ElementType::ProvableSumTree), } } } @@ -257,6 +277,7 @@ mod tests { TreeType::MmrTree, TreeType::BulkAppendTree(3), TreeType::DenseAppendOnlyFixedSizeTree(8), + TreeType::ProvableSumTree, ]; for v in &variants { let d = v.discriminant(); @@ -268,7 +289,7 @@ mod tests { #[test] fn tree_type_try_from_invalid() { - assert!(TreeType::try_from(11u8).is_err()); + assert!(TreeType::try_from(12u8).is_err()); assert!(TreeType::try_from(255u8).is_err()); } @@ -297,6 +318,10 @@ mod tests { format!("{}", TreeType::DenseAppendOnlyFixedSizeTree(0)), "Dense Tree" ); + assert_eq!( + format!("{}", TreeType::ProvableSumTree), + "Provable Sum Tree" + ); } #[test] @@ -312,6 +337,7 @@ mod tests { assert!(TreeType::MmrTree.uses_non_merk_data_storage()); assert!(TreeType::BulkAppendTree(0).uses_non_merk_data_storage()); 
assert!(TreeType::DenseAppendOnlyFixedSizeTree(0).uses_non_merk_data_storage()); + assert!(!TreeType::ProvableSumTree.uses_non_merk_data_storage()); } #[test] @@ -327,6 +353,8 @@ mod tests { assert!(!TreeType::MmrTree.is_count_bearing()); assert!(!TreeType::BulkAppendTree(0).is_count_bearing()); assert!(!TreeType::DenseAppendOnlyFixedSizeTree(0).is_count_bearing()); + // ProvableSumTree carries a sum aggregate, not a count. + assert!(!TreeType::ProvableSumTree.is_count_bearing()); } #[test] @@ -342,6 +370,7 @@ mod tests { assert!(!TreeType::MmrTree.is_sum_bearing()); assert!(!TreeType::BulkAppendTree(0).is_sum_bearing()); assert!(!TreeType::DenseAppendOnlyFixedSizeTree(0).is_sum_bearing()); + assert!(TreeType::ProvableSumTree.is_sum_bearing()); } #[test] @@ -395,6 +424,7 @@ mod tests { assert!(!TreeType::MmrTree.allows_sum_item()); assert!(!TreeType::BulkAppendTree(0).allows_sum_item()); assert!(!TreeType::DenseAppendOnlyFixedSizeTree(0).allows_sum_item()); + assert!(TreeType::ProvableSumTree.allows_sum_item()); } #[test] @@ -443,6 +473,10 @@ mod tests { TreeType::DenseAppendOnlyFixedSizeTree(0).empty_tree_feature_type(), TreeFeatureType::BasicMerkNode ); + assert_eq!( + TreeType::ProvableSumTree.empty_tree_feature_type(), + TreeFeatureType::ProvableSummedMerkNode(0) + ); } #[test] @@ -491,5 +525,9 @@ mod tests { TreeType::DenseAppendOnlyFixedSizeTree(0).to_element_type(), Some(ElementType::DenseAppendOnlyFixedSizeTree) ); + assert_eq!( + TreeType::ProvableSumTree.to_element_type(), + Some(ElementType::ProvableSumTree) + ); } }
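The security property this diff builds on — that binding the aggregate sum into the node hash makes sum tampering detectable, while storing the sum alongside the hash does not — can be sketched outside GroveDB. Everything below is a toy model: `toy_node_hash` / `toy_node_hash_with_sum` and the use of `std::hash::DefaultHasher` are illustrative stand-ins, not GroveDB's actual `node_hash` / `node_hash_with_sum` (which use a cryptographic hash).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for `node_hash`: commits to the kv hash and the two
/// child hashes only. The aggregate sum is NOT an input.
pub fn toy_node_hash(kv: u64, left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (kv, left, right).hash(&mut h);
    h.finish()
}

/// Toy stand-in for `node_hash_with_sum`: additionally commits to the
/// aggregate sum, so changing the sum changes the node hash.
pub fn toy_node_hash_with_sum(kv: u64, left: u64, right: u64, sum: i64) -> u64 {
    let mut h = DefaultHasher::new();
    (kv, left, right, sum).hash(&mut h);
    h.finish()
}

fn main() {
    let (kv, l, r) = (1, 2, 3);
    // Plain hash: the sum is not hashed, so a tampered sum is invisible —
    // this is the plain SumTree situation.
    assert_eq!(toy_node_hash(kv, l, r), toy_node_hash(kv, l, r));
    // Sum-bound hash: bumping the sum by one yields a different hash, so a
    // verifier holding the honest root detects the tampering. This mirrors
    // the `provable_sum_tree_root_hash_changes_on_sum_mutation` test above.
    let honest = toy_node_hash_with_sum(kv, l, r, 8);
    let tampered = toy_node_hash_with_sum(kv, l, r, 9);
    assert_ne!(honest, tampered);
}
```

The same reasoning explains why `hash_for_link(ProvableSumTree)` must fail closed on a feature-type mismatch: silently falling back to the sum-less hash would reintroduce exactly this blind spot.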
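The aggregation arithmetic in the `TreeFeatureType::ProvableSummedMerkNode` arm — own value plus both children's aggregates, with overflow reported rather than wrapped — has this shape in isolation. The function name and the `Option`-based child inputs here are illustrative; the real code pulls child sums through `child_aggregate_sum_data_as_i64` and returns a cost-wrapped error.

```rust
/// Aggregate a node's own value with its (optional) children's aggregates
/// using checked arithmetic, mirroring the shape of the
/// `ProvableSummedMerkNode` arm in the diff. A missing child contributes 0.
pub fn aggregate_provable_sum(
    value: i64,
    left: Option<i64>,
    right: Option<i64>,
) -> Result<i64, &'static str> {
    let left = left.unwrap_or(0);
    let right = right.unwrap_or(0);
    value
        .checked_add(left)
        .and_then(|acc| acc.checked_add(right))
        .ok_or("provable sum is overflowing")
}

fn main() {
    // 3 (node) + 5 (right child) = 8, matching the two-node test tree in
    // `provable_sum_tree_aggregates_and_hashes_sum`.
    assert_eq!(aggregate_provable_sum(3, None, Some(5)), Ok(8));
    // Overflow is surfaced as an error, never wrapped silently.
    assert!(aggregate_provable_sum(i64::MAX, Some(1), None).is_err());
}
```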
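The discriminant bookkeeping in `mod.rs` — `ProvableSumTree` takes the next free byte, 11, and `TryFrom<u8>` round-trips it while the error message's "max" moves up — can be sketched with a cut-down enum. `ToyTreeType` is a stand-in with only three of the real variants; the discriminant values 0, 1, and 11 match the real `TreeType`, but everything else is elided.

```rust
use std::convert::TryFrom;

/// Cut-down stand-in for GroveDB's `TreeType`, keeping only the variants
/// needed to show the round-trip. Discriminants match the real enum.
#[derive(Debug, PartialEq)]
enum ToyTreeType {
    NormalTree,      // discriminant 0
    SumTree,         // discriminant 1
    ProvableSumTree, // discriminant 11, the next free byte after 10
}

impl ToyTreeType {
    fn discriminant(&self) -> u8 {
        match self {
            ToyTreeType::NormalTree => 0,
            ToyTreeType::SumTree => 1,
            ToyTreeType::ProvableSumTree => 11,
        }
    }
}

impl TryFrom<u8> for ToyTreeType {
    type Error = String;

    fn try_from(v: u8) -> Result<Self, Self::Error> {
        match v {
            0 => Ok(ToyTreeType::NormalTree),
            1 => Ok(ToyTreeType::SumTree),
            11 => Ok(ToyTreeType::ProvableSumTree),
            // The "max" in the message moves from 10 to 11 with this change.
            n => Err(format!("got {}, max is 11", n)),
        }
    }
}

fn main() {
    // Round-trip: discriminant 11 decodes back to ProvableSumTree, as the
    // `tree_type_discriminant_roundtrip` test asserts for every variant.
    let d = ToyTreeType::ProvableSumTree.discriminant();
    assert_eq!(ToyTreeType::try_from(d), Ok(ToyTreeType::ProvableSumTree));
    // 12 is now the first unknown discriminant, matching the updated
    // `tree_type_try_from_invalid` test.
    assert!(ToyTreeType::try_from(12u8).is_err());
}
```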