feat: add AggregateCountOnRange query for provable count trees by QuantumExplorer · Pull Request #656 · dashpay/grovedb

QuantumExplorer · 2026-05-09T19:24:43Z

Issue being fixed or feature implemented

Counting a sub-range of keys in a ProvableCountTree was previously forced through a regular query, paying O(result-size) proof bytes for an answer that's already cryptographically committed at every internal node via node_hash_with_count. This PR adds a dedicated count-only proof shape with O(log n) size regardless of how many keys fall in the range.

What was done?

A new QueryItem variant AggregateCountOnRange(Box<QueryItem>) that wraps an inner range and asks for a single u64 count + cryptographic proof, instead of returning the elements themselves. Restricted to ProvableCountTree and ProvableCountSumTree (plus their NonCounted* wrapper variants); rejected for any other tree type at proof time.

Cryptographic core:

New proof node Node::HashWithCount(kv_hash, left_child_hash, right_child_hash, count) which is self-verifying: the verifier recomputes node_hash_with_count(...) from the four committed fields. A forged count diverges from the parent's expected hash and the Merkle-root chain check fails — so the count is bound by the proof, not trusted on faith. (An earlier draft used (node_hash, count) only, which would have been forge-able; that version was rejected during design review.)

Proof generation (merk):

Merk::prove_aggregate_count_on_range walks the AVL tree, classifying each subtree as Disjoint / Contained / Boundary using inherited exclusive key-bound windows. Disjoint short-circuits at the link level (no I/O); Contained walks one level for its kv_hash + grandchild hashes; Boundary recurses. Emits Hash / HashWithCount / KVDigestCount accordingly.

Proof verification (merk):

verify_aggregate_count_on_range_proof replays Ops via the new execute_with_options(.., verify_avl_balance=false, ..). The AVL-balance check is bypassed only here because count proofs intentionally collapse one side of the tree into a single op while descending the other — the resulting reconstructed tree is unbalanced by design. All existing callers keep the balance check via the unchanged execute() wrapper.
Allowlists Hash / HashWithCount / KVDigestCount; anything else is InvalidProofError.

GroveDB integration:

One prove_query entry — short-circuits inside both prove_subqueries (v0) and prove_subqueries_v1 when the leaf-merk's items contain AggregateCountOnRange. Envelope structure is unchanged so existing serialization works.
New GroveDb::verify_aggregate_count_query(proof, path_query, version) -> Result<(CryptoHash, u64), Error> walks the multi-layer envelope, verifies single-key existence proofs at each non-leaf layer, delegates to the merk count verifier at the leaf, and enforces combine_hash(H(value), lower_root) == parent_proof_hash for the chain.

Validation:
Enforced at Query::validate_aggregate_count_on_range / SizedQuery::validate_aggregate_count_on_range / PathQuery::validate_aggregate_count_on_range:

Must be the only item in the query
Inner item not Key (use has_raw / get_raw)
Inner item not RangeFull (read the parent Element::ProvableCountTree bytes)
Inner item not nested AggregateCountOnRange
No default_subquery_branch
No conditional_subquery_branches
No SizedQuery::limit / offset
left_to_right ignored

How Has This Been Tested?

Unit (10 tests): classification of every Disjoint / Contained / Boundary case across all range-bound shapes.

Merk integration (13 tests): every allowed range variant (Range, RangeInclusive, RangeFrom, RangeAfter, RangeTo, RangeToInclusive, RangeAfterTo, RangeAfterToInclusive) + range-below-all-keys + range-above-all-keys + empty merk + tree-type rejection + a count-forgery test that tampers with the count in a HashWithCount op and asserts the reconstructed root diverges from the expected one.

GroveDB integration (11 tests, grovedb/src/tests/aggregate_count_query_tests.rs): end-to-end prove → encode → decode → verify on ProvableCountTree and ProvableCountSumTree at multi-layer paths, plus validation rejections (Key inner, RangeFull inner, NormalTree target) and a GroveDB-level forgery test that flips a count byte in the encoded proof and asserts the verifier rejects it via the layer-chain check.

Workspace state: cargo build --workspace, cargo clippy --workspace --all-targets, and cargo test --workspace all pass with no regressions (1465+ grovedb lib tests + 387+ merk lib tests). cargo fmt clean.

Documentation

New chapter at docs/book/src/aggregate-count-queries.md covering the contract, allowed range variants, validation rules, and per-case proof shapes (open and closed ranges, with mermaid diagrams using a 15-key example tree). Cross-linked from the query-system chapter.

Breaking Changes

None for callers — the new variant is additive and existing prove_query / verify_query paths are untouched. Sites that pattern-match exhaustively on QueryItem or Node were updated (one &'static str rejection arm each in dense-tree, bulk-append-tree, and the grovedb proof generator/verifier — none of those tree types support count queries).

Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have added or updated relevant unit/integration/functional/e2e tests
I have made corresponding changes to the documentation

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added aggregate count queries: count elements in bounded key ranges with verifiable cryptographic proofs and accompanying prove/verify support.
Documentation
- New "Aggregate Count Queries" doc and SUMMARY entry; added client-side JS fix to ensure Mermaid diagrams render correctly in the HTML docs.
Tests
- Extensive unit and end-to-end tests covering aggregate count proofs across tree types.

Adds a new QueryItem variant `AggregateCountOnRange(Box<QueryItem>)` that counts the elements matched by an inner range and returns a single u64 together with an O(log n) cryptographic proof, instead of returning the elements themselves. Targets `ProvableCountTree` and `ProvableCountSumTree` (plus their `NonCounted*` wrappers); rejects all other tree types at proof time. Why: counting a sub-range of keys in a provable count tree was previously forced through a regular query, paying O(result-size) bytes for an answer that's already cryptographically committed at every internal node via `node_hash_with_count`. This adds a dedicated proof shape that collapses fully-inside subtrees into a single self-verifying op. Mechanics: - New proof node `Node::HashWithCount(kv_hash, left_child_hash, right_child_hash, count)` that is *self-verifying*: the verifier recomputes `node_hash_with_count(...)` from the four committed fields, so a forged count diverges from the parent's expected hash and the Merkle-root chain check fails. - `Merk::prove_aggregate_count_on_range` walks the AVL tree, classifying each subtree (Disjoint / Contained / Boundary) using inherited exclusive key-bound windows; emits `Hash` / `HashWithCount` / `KVDigestCount` accordingly. - Multi-layer GroveDB proof glue routes leaf-merk emission inside both `prove_subqueries` and `prove_subqueries_v1` short-circuit paths; envelope structure is unchanged so existing serialization works. - `GroveDb::verify_aggregate_count_query(proof, path_query, version) -> Result<(CryptoHash, u64), Error>` walks the layer chain, verifies single-key existence proofs at each non-leaf layer, delegates to the merk count verifier at the leaf, and enforces the `combine_hash(H(value), lower_root) == parent_proof_hash` chain. - Validation enforced at `Query` / `SizedQuery` / `PathQuery`: AggregateCountOnRange must be the only item, no subqueries, no pagination, inner item not Key / RangeFull / nested AggregateCountOnRange. - `execute` refactored to expose `execute_with_options(verify_avl_balance: bool)`; count proofs intentionally collapse one side to height 1 while descending the other, so the AVL balance check is bypassed only for the count verifier (existing callers unchanged). Documentation: new GroveDB book chapter at `docs/book/src/aggregate-count-queries.md` covering the contract, allowed range variants, validation rules, and per-case proof shapes (open and closed ranges, with mermaid diagrams). Cross-linked from the query-system chapter. Tests: - 10 classification unit tests covering Disjoint/Contained/Boundary decisions across every range bound shape. - 13 merk-level integration tests covering every allowed range variant + empty merk + tree-type rejection + a count-forgery test that proves the cryptographic binding. - 11 GroveDB end-to-end tests at `grovedb/src/tests/aggregate_count_query_tests.rs` covering ProvableCountTree, ProvableCountSumTree, multi-layer paths, validation rejections, normal-tree rejection, and a GroveDB-level forgery test. Full workspace: builds clean, clippy clean, all 1465+ grovedb lib tests + 387+ merk lib tests pass with no regressions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-09T19:24:58Z

Warning

Rate limit exceeded

@QuantumExplorer has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 4 minutes and 19 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 780e4de0-96de-4a7c-b461-dca7eea84744

📥 Commits

Reviewing files that changed from the base of the PR and between 30a960a and ce8e98d.

📒 Files selected for processing (5)

docs/book/src/aggregate-count-queries.md
grovedb-query/src/query.rs
grovedb/src/operations/proof/generate.rs
grovedb/src/tests/aggregate_count_query_tests.rs
merk/src/proofs/query/aggregate_count.rs

📝 Walkthrough

Walkthrough

This PR implements end-to-end support for aggregate-count-on-range queries: adds QueryItem::AggregateCountOnRange and Node::HashWithCount, safe serialization and opcodes, merk-level prover and verifier with subtree classification, GroveDB multi-layer envelope verification, integration into proof-gen, and extensive tests and docs.

Changes

Aggregate Count On Range Query Implementation

Layer / File(s)	Summary
Documentation `docs/book/book.toml`, `docs/book/mermaid-fixup.js`, `docs/book/src/SUMMARY.md`, `docs/book/src/aggregate-count-queries.md`, `docs/book/src/query-system.md`	Book docs, TOC entry, and small mermaid client fixup; documents ACOR semantics, proof shape, verifier algorithm, and API sketch.
Core Types `grovedb-query/src/query_item/mod.rs`, `grovedb-query/src/proofs/mod.rs`	Adds `QueryItem::AggregateCountOnRange(Box<QueryItem>)` and `Node::HashWithCount(CryptoHash, CryptoHash, CryptoHash, u64)` and delegates QueryItem behaviors to inner item.
Serialization & Encoding `grovedb-query/src/query_item/mod.rs`, `grovedb-query/src/proofs/encoding.rs`, `grovedb-query/Cargo.toml`	Serde/bincode support for new QueryItem variant with NonAggregateInner and depth-bounded decode; opcodes `0x1e`/`0x1f` for HashWithCount with encode/decode/length and round-trip tests; `serde_test` added to dev-deps.
Query APIs & Validation `grovedb-query/src/query.rs`, `grovedb/src/query/mod.rs`	Adds constructors, detection, and `validate_aggregate_count_on_range` at Query/SizedQuery/PathQuery levels; unit tests enforcing terminal-only shape and forbidding pagination/forbidden inner variants.
GroveDB Prove Integration `grovedb/src/operations/proof/generate.rs`, `grovedb/src/operations/proof/mod.rs`	Short-circuits V0/V1 proof generation to emit count-only proofs when ACOR is detected and returns layer proofs with empty lower_layers.
Merk Prover & Emission `merk/src/merk/prove.rs`, `merk/src/proofs/query/aggregate_count.rs`	Adds `Merk::prove_aggregate_count_on_range` and `RefWalker::create_aggregate_count_on_range_proof`; classifies subtrees (Disjoint/Contained/Boundary) and emits compact `HashWithCount`/`KVDigestCount` ops while accumulating prover count.
Tree & Proof Execution `merk/src/proofs/tree.rs`	Handles `HashWithCount` in Tree::hash/key/aggregate_data and adds `execute_with_options` to control AVL-balance verification.
Merk Verifier `merk/src/proofs/query/aggregate_count.rs`, `merk/src/proofs/query/verify.rs`	Implements `verify_aggregate_count_on_range_proof` with two-phase decode+shape validation, strict node-shape enforcement, and checked arithmetic to prevent forgery; extensive tests added.
GroveDB Envelope Verification `grovedb/src/operations/proof/aggregate_count.rs`	Adds `GroveDb::verify_aggregate_count_query` and V0/V1 envelope walkers that verify non-leaf single-key merk proofs, enforce lower-layer chain links, and delegate leaf verification to merk verifier.
Unsupported Tree Rejections `grovedb-bulk-append-tree/src/proof/mod.rs`, `grovedb-dense-fixed-sized-merkle-tree/src/proof/mod.rs`, `grovedb/src/operations/proof/generate.rs`, `grovedb/src/operations/proof/verify.rs`	Explicitly reject ACOR for BulkAppendTree, Dense fixed-size Merkle, and MMR with appropriate InvalidInput/InvalidProof errors.
Node Extraction, Benches & Tests `merk/src/proofs/branch/mod.rs`, `merk/benches/branch_queries.rs`, `merk/src/merk/chunks.rs`, `grovedb/src/tests/provable_count_sum_tree_tests.rs`, `grovedb/src/operations/proof/verify.rs`	Treats `HashWithCount` as key-less, updates node-to-string formatting, and adjusts test/bench helpers to count/handle `HashWithCount` nodes.
Debugger & Module Exports `grovedb/src/debugger.rs`, `merk/src/proofs/query/mod.rs`, `grovedb/src/operations/proof/mod.rs`	Debugger maps `HashWithCount` to `ProvableCountedMerkNode(count)`; exposes aggregate_count module under feature gating and re-exports verifier.
End-to-End Tests `grovedb/src/tests/aggregate_count_query_tests.rs`, `grovedb/src/tests/mod.rs`	Comprehensive prove→verify test suite: fixtures, round-trip helpers, range variants, validation rejections, tamper detection, multi-layer paths, NonCounted handling, and proof-size snapshot assertions.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

dashpay/grovedb#633: Adds proof Node variants and opcodes; overlaps with encoding/Node/verification changes.

🐇 Count proofs so fine, compressed with care,
Hashes nested in branches, counts hidden there,
Walk down the bounds, recompute the root,
O(log n) whispers — compact and astute,
The forest yields totals, verified with flair.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically summarizes the main feature: adding a new AggregateCountOnRange query type for provable count trees, which is the core focus of this large PR.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/aggregate-count-query-item

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…t 1.95) CI's clippy on Rust 1.95 caught a `collapsible_if` lint at `grovedb-query/src/query.rs:402` that local clippy on an older toolchain didn't surface. Rewrite the nested `if let Some(branches) = ... { if !branches.is_empty() { ... } }` as a single let-chain `if let Some(...) && !branches.is_empty()`. No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…debugger CI's clippy --all-features run surfaced two more `collapsible_if` lints in the count-proof classifier (rust 1.95) plus a non-exhaustive match on the new `Node::HashWithCount` variant in the debugger module that the default feature set hadn't compiled previously. - Collapse `if let (Some(s_hi), Some(r_lo)) = ... { if s_hi <= r_lo { ... } }` into let-chain form. Same for the symmetric upper-bound check. No behavior change. - Add a HashWithCount arm to `merk_proof_node_to_grovedbg`. The debugger UI doesn't have a dedicated rendering for the new variant yet, so we surface its committed `node_hash_with_count(...)` and the count via the existing `KVValueHashFeatureType` slot — same approach already used for `KVHashCount` in this function. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov · 2026-05-09T19:35:40Z

Codecov Report

❌ Patch coverage is 91.02151% with 167 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.63%. Comparing base (618affa) to head (ce8e98d).
⚠️ Report is 1 commits behind head on develop.

Files with missing lines	Patch %	Lines
merk/src/proofs/query/aggregate_count.rs	93.17%	57 Missing ⚠️
grovedb-query/src/query_item/mod.rs	79.79%	39 Missing ⚠️
grovedb/src/operations/proof/aggregate_count.rs	86.44%	32 Missing ⚠️
grovedb/src/operations/proof/verify.rs	0.00%	9 Missing ⚠️
grovedb-query/src/query.rs	96.80%	7 Missing ⚠️
grovedb-bulk-append-tree/src/proof/mod.rs	0.00%	5 Missing ⚠️
...edb-dense-fixed-sized-merkle-tree/src/proof/mod.rs	0.00%	5 Missing ⚠️
grovedb/src/query/mod.rs	95.74%	4 Missing ⚠️
grovedb/src/operations/proof/generate.rs	96.47%	3 Missing ⚠️
grovedb-query/src/query_item/intersect.rs	0.00%	2 Missing ⚠️
... and 3 more

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #656      +/-   ##
===========================================
+ Coverage    90.54%   90.63%   +0.09%     
===========================================
  Files          182      184       +2     
  Lines        52873    54763    +1890     
===========================================
+ Hits         47875    49637    +1762     
- Misses        4998     5126     +128

Components	Coverage Δ
grovedb-core	`88.59% <88.78%> (+<0.01%)`	⬆️
merk	`92.07% <93.13%> (+0.15%)`	⬆️
storage	`86.36% <ø> (ø)`
commitment-tree	`96.43% <ø> (ø)`
mmr	`96.76% <ø> (ø)`
bulk-append-tree	`89.14% <0.00%> (-0.52%)`	⬇️
element	`95.24% <ø> (ø)`

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

QuantumExplorer · 2026-05-09T19:38:46Z

Review findings:

[P0] Count verifier accepts forged proof shapes
merk/src/proofs/query/aggregate_count.rs:475-488

The verifier only allowlists node types; it never proves that a Hash is actually for a disjoint subtree or that HashWithCount is a leaf in a contained subtree. A prover can send Push(Hash(expected_root)) for a non-empty tree and get the trusted root back with count 0 for any range. They can also replace in-range HashWithCount subtrees with Hash to undercount, or attach extra counted children under keyless Hash/HashWithCount nodes to overcount while preserving the root hash. The verifier needs to replay the aggregate-count traversal against inner_range and inherited subtree bounds, not just execute arbitrary allowed ops.
[P2] Prover bypasses aggregate-count validation
grovedb/src/operations/proof/generate.rs:276-284

The short-circuit checks only whether the first item is AggregateCountOnRange. That lets invalid queries through proof generation: extra items are ignored, subquery branches are ignored, limits can be ignored, and invalid inner items such as Key or RangeFull still produce a count proof. This should use PathQuery::validate_aggregate_count_on_range() before routing to the count prover, or at least enforce the same single-item/no-subquery/no-pagination/inner-range rules here.
[P2] Recursive QueryItem decoding has no depth limit
grovedb-query/src/query_item/mod.rs:357-359

AggregateCountOnRange is invalid when nested, but decoding recursively accepts another QueryItem before validation runs. A small malicious payload made of repeated variant-10 bytes can recurse deeply enough to overflow the stack during bincode or borrow decoding. Either reject an inner aggregate variant during decode or add a bounded decode_with_depth path like the existing Query/Subquery decoding guards.

QuantumExplorer · 2026-05-09T19:40:54Z

Suggested fix direction for the findings above:

Fix the aggregate-count verifier by replaying the proof shape, not just allowlisting node types.

I would make verify_aggregate_count_on_range_proof two-phase:

Decode/execute the proof into a ProofTree as today, but do not accumulate count in the visit_node closure.
Walk the reconstructed proof tree with the same inherited exclusive subtree bounds used by the prover: (subtree_lo_excl, subtree_hi_excl).
At each proof-tree node, call classify_subtree(bounds, inner_range) and require the node shape to match the classification:
- Disjoint => node must be a leaf Node::Hash(_), contributes 0.
- Contained => node must be a leaf Node::HashWithCount(.., count), contributes count.
- Boundary => node must be Node::KVDigestCount(key, ..), the key must be strictly inside the inherited bounds, then recurse left with (lo, key) and right with (key, hi), plus 1 if inner_range.contains(key).
Reject children under Hash or HashWithCount; their hash computation ignores attached proof-tree children, so accepting them lets a proof overcount without changing the root hash.
Use checked addition rather than saturating_add; overflow should be an invalid proof.
Optionally reject inverted structural ops for this proof flavor, or at least rely on the shape walk to validate the final left/right tree topology.

In pseudocode, the important part is roughly:

fn verify_count_shape(
    tree: &ProofTree,
    range: &QueryItem,
    lo: Option<&[u8]>,
    hi: Option<&[u8]>,
) -> Result<u64, Error> {
    match classify_subtree(lo, hi, range) {
        Disjoint => match &tree.node {
            Node::Hash(_) if tree.left.is_none() && tree.right.is_none() => Ok(0),
            _ => Err(invalid("expected disjoint Hash leaf")),
        },
        Contained => match &tree.node {
            Node::HashWithCount(_, _, _, count)
                if tree.left.is_none() && tree.right.is_none() => Ok(*count),
            _ => Err(invalid("expected contained HashWithCount leaf")),
        },
        Boundary => match &tree.node {
            Node::KVDigestCount(key, _, _) => {
                ensure_key_inside_bounds(key, lo, hi)?;
                let left = tree.left.as_ref()
                    .map_or(Ok(0), |c| verify_count_shape(&c.tree, range, lo, Some(key)))?;
                let right = tree.right.as_ref()
                    .map_or(Ok(0), |c| verify_count_shape(&c.tree, range, Some(key), hi))?;
                checked_sum(left, right, u64::from(range.contains(key)))
            }
            _ => Err(invalid("expected boundary KVDigestCount")),
        },
    }
}

Then verify_aggregate_count_on_range_proof returns verify_count_shape(&tree, inner_range, None, None) alongside tree.hash(). That rejects the single-Hash(expected_root) undercount and the keyless-node child overcount cases.

Fix proof generation by centralizing validation before the short-circuit.

The prove_subqueries / prove_subqueries_v1 short-circuit should not look only at query.items.first(). I would make it trigger only when the current level contains an aggregate item anywhere, then call path_query.validate_aggregate_count_on_range() and use the returned inner range. That makes invalid shapes fail instead of being partially interpreted.

Concretely, change the short-circuit from “first item is aggregate” to “any aggregate item means this path query must validate as the exact aggregate-count form”:
```
if query.items.iter().any(QueryItem::is_aggregate_count_on_range) {
    let inner_range = path_query.validate_aggregate_count_on_range()?;
    // prove aggregate count with inner_range
}
```
This catches extra query items, nested/subquery aggregate usage, limits/offsets, and invalid inner Key / RangeFull before the merk prover is called. The same guard should be applied in both v0 and v1 proof-generation paths.
Fix recursive QueryItem decoding with a bounded decode helper.

I would add QueryItem::decode_with_depth(decoder, depth) and borrow_decode_with_depth(decoder, depth) with a small max depth, similar to the existing Query / SubqueryBranch decode-depth guard. The public Decode impl calls depth 0; variant 10 decodes the inner item at depth + 1 and errors once the max is exceeded.

Since nested AggregateCountOnRange is invalid anyway, it is also reasonable to reject an inner AggregateCountOnRange immediately after decoding the inner item. The depth guard is still useful so maliciously deep variant-10 payloads fail before stack exhaustion.

Codecov flagged this PR's patch coverage at 70.67%, below the 80% target. This commit raises it by adding focused unit + integration tests that exercise the previously-uncovered branches: grovedb-query/src/proofs/encoding.rs (+5 tests): - HashWithCount Push round-trip with all-different hashes and varint count. - HashWithCount PushInverted round-trip with u64::MAX count. - HashWithCount with count=0 + all-zero hashes (leaf-shaped collapsed subtree). - Decoder iterator covering a mixed Op stream of HashWithCount, Hash, KVDigestCount, Parent, Child, and PushInverted(HashWithCount). grovedb-query/src/query.rs (+10 tests): Cover every numbered rule in `Query::validate_aggregate_count_on_range` independently: happy path, extra items, non-ACOR-only item, inner Key, inner RangeFull, nested ACOR, default subquery branch, default subquery_path, conditional subquery branches non-empty, conditional branches empty-map (accepted). Plus a test for the `aggregate_count_on_range` helper's well-shaped detection. grovedb/src/query/mod.rs (+5 tests): SizedQuery rejects non-None limit and non-None offset; SizedQuery forwards Query-level rejections as InvalidQuery; SizedQuery happy path; PathQuery::validate_aggregate_count_on_range delegation; PathQuery::has_aggregate_count_on_range recognition. grovedb/src/tests/aggregate_count_query_tests.rs (+5 tests): - Three-layer path round trip (TEST_LEAF -> outer Tree -> inner ProvableCountTree) — exercises multi-layer chain enforcement. - Tampered non-leaf-layer byte rejected by the chain check. - Round trip through the V0 envelope (`MerkOnlyLayerProof`) by running against `GROVE_V2`, complementing the existing V1-envelope coverage. - Verifier rejects malformed path_query (inner Key) before any decoding. - Construction-time rejection of nested AggregateCountOnRange. All ~29 new tests pass; full workspace test suite remains green; clippy --workspace --all-features clean; cargo fmt clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three review findings from the PR review, all now addressed: [P0] Count verifier accepts forged proof shapes (merk/src/proofs/query/aggregate_count.rs) The previous verifier only allowlisted node types (Hash / HashWithCount / KVDigestCount) inside the execute() visit_node closure. That let a malicious prover: 1. Send a single Push(Hash(expected_root)) for a non-empty tree and receive (expected_root, 0) for any range. 2. Replace an in-range HashWithCount with a Hash carrying the same node_hash (chain still matches), undercounting by the missing count. 3. Attach extra KVDigestCount children below a keyless Hash / HashWithCount; their hash() ignores reconstructed children, so the root still matches but a count-everything verifier would credit the bogus child as +1. The fix is a two-phase verifier: - Phase 1: reconstruct the proof tree via execute_with_options, with a coarse allowlist on node types but NO counting. - Phase 2: walk the reconstructed tree with the same inherited exclusive subtree-key bounds the prover used (None at the root), calling classify_subtree(bounds, range) at each position and binding the node type to the classification: Disjoint -> must be a leaf Hash; contributes 0. Contained -> must be a leaf HashWithCount; contributes its count. Boundary -> must be a KVDigestCount whose key is strictly inside the inherited bounds; recurse into both children with tightened bounds, +1 if range.contains(key). Counts are summed with checked_add (overflow -> invalid proof). Three new tests prove the attacks are now rejected: - shape_walk_rejects_single_hash_undercount - shape_walk_rejects_hash_swap_for_contained_subtree - shape_walk_rejects_keyless_node_with_attached_children [P2] Prover bypassed aggregate-count validation (grovedb/src/operations/proof/generate.rs, both v0 and v1 paths) The short-circuit checked only `query.items.first()`, so a query with extra items, subquery branches, limits, or an invalid inner (Key / RangeFull / nested AggregateCountOnRange) silently produced a count proof. Now the short-circuit fires whenever any item at the level is an AggregateCountOnRange, and immediately calls PathQuery::validate_aggregate_count_on_range() to extract the inner range, surfacing the precise validation error if the surrounding PathQuery is not a well-formed aggregate-count query. Same change in both v0 and v1 prove_subqueries paths. [P2] Recursive QueryItem decoding had no depth limit (grovedb-query/src/query_item/mod.rs) Variant 10 (AggregateCountOnRange) recursively decoded an inner QueryItem before any validation, so a small payload of repeated variant-10 bytes could stack-overflow the bincode / borrow decoder. Added a bounded decode_with_depth + borrow_decode_with_depth helper matching the existing Query / SubqueryBranch decode-depth guard pattern. MAX_QUERY_ITEM_DECODE_DEPTH = 4 (single legal level + headroom). Defense-in-depth: an inner AggregateCountOnRange is also rejected immediately at decode time, since it is invalid by validation rules anyway. Three new tests: - decode_rejects_nested_aggregate_count_on_range - decode_caps_depth_for_malicious_payload - decode_accepts_valid_one_level_aggregate_count_on_range Workspace state: cargo build clean; cargo clippy --workspace --all-features -D warnings clean; cargo fmt clean; all 1475+ grovedb lib tests + 27 merk aggregate_count tests + 3 query_item decode tests pass; full workspace test suite green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

QuantumExplorer · 2026-05-09T20:07:15Z

This is Claude. Pushed 5989cc49 addressing all three findings:

[P0] Count verifier shape walk — verify_aggregate_count_on_range_proof is now two-phase. Phase 1 reconstructs the proof tree via execute_with_options (no counting in the visit_node closure). Phase 2 walks the reconstructed tree with the same inherited exclusive subtree-key bounds the prover used and binds each position's node type to the classification:

Classification	Required node	Contribution
Disjoint	leaf `Hash`	0
Contained	leaf `HashWithCount`	committed `count`
Boundary	`KVDigestCount` with key strictly inside bounds	recurse + (1 if `range.contains(key)`)

Counts use checked_add; overflow is treated as proof corruption. Three new tests in merk/src/proofs/query/aggregate_count.rs prove each attack you described is rejected:

shape_walk_rejects_single_hash_undercount — Push(Hash(expected_root)) for a non-empty tree.
shape_walk_rejects_hash_swap_for_contained_subtree — replacing an in-range HashWithCount with a Hash carrying the same node_hash.
shape_walk_rejects_keyless_node_with_attached_children — splicing a fake +1 child under a HashWithCount (whose hash() ignores attached children).

[P2] Prover validation — both prove_subqueries (v0) and prove_subqueries_v1 short-circuits now fire on `query.items.iter().any(QueryItem::is_aggregate_count_on_range)` and immediately call `PathQuery::validate_aggregate_count_on_range()` to extract the inner range. Extra items, subquery branches, limits/offsets, and invalid inner items are now rejected before the count prover runs.

[P2] QueryItem decode depth + nested rejection — added QueryItem::decode_with_depth and borrow_decode_with_depth mirroring the existing Query/SubqueryBranch depth-guard pattern. MAX_QUERY_ITEM_DECODE_DEPTH = 4 (one legal level + headroom). Defense-in-depth: an inner AggregateCountOnRange is also rejected immediately at decode time. Three new tests cover the legal one-level shape, the nested rejection, and a malicious deep payload.

CI:

cargo clippy --workspace --all-features -D warnings: clean
cargo fmt --all: clean
cargo test --workspace: all green (1475+ grovedb lib + 27 merk aggregate_count + 3 query_item decode)

QuantumExplorer · 2026-05-09T20:17:56Z

Re-reviewed the latest head (5989cc49). The three earlier findings are materially addressed:

The aggregate-count verifier now reconstructs the proof tree first, then shape-walks it against inherited subtree bounds. The previous single-Hash(expected_root), HashWithCount -> Hash, and keyless-node child-smuggling attacks are covered by tests and rejected.
The v0/v1 proof-generation short-circuits now call PathQuery::validate_aggregate_count_on_range() instead of routing on items.first().
The bincode Decode / BorrowDecode paths for QueryItem now have bounded recursive decode helpers and reject nested AggregateCountOnRange.

One remaining item I would still fix before merging:

[P2] Serde QueryItem deserialization still recursively accepts nested AggregateCountOnRange
grovedb-query/src/query_item/mod.rs:225-227

The bincode decode path is now depth-bounded, but the optional serde path still does let inner: QueryItem = variant_access.newtype_variant()? and then wraps it. With the serde feature enabled, a nested AggregateCountOnRange payload still recurses through QueryItem::deserialize before any validation runs, so serde-backed clients keep the invalid-nesting/DoS class that the bincode fix closed. Since nested aggregate count is invalid anyway, the serde variant should deserialize the inner value through a non-recursive helper that accepts only non-aggregate QueryItem variants, or otherwise enforce a depth limit before recursing. Please add a serde-feature test for nested AggregateCountOnRange rejection too.

Verification I ran locally:

cargo test -p grovedb-merk aggregate_count --features minimal
cargo test -p grovedb aggregate_count --features minimal
cargo test -p grovedb-query aggregate_count --features serde
cargo test -p grovedb-query query_item::test --features serde

@QuantumExplorer

…ange Follow-up to @QuantumExplorer's third review pass: the bincode decode path was depth-bounded in 5989cc4, but the optional `serde` `Deserialize` path still recursively accepted nested `AggregateCountOnRange` payloads via `variant_access.newtype_variant::<QueryItem>()`. With the `serde` feature enabled, a serde-backed client could send arbitrarily deep nested AggregateCountOnRange and exhaust the stack inside `QueryItem::deserialize` before any validation ran — re-introducing the same DoS class the bincode fix closed. Fix: when `Field::AggregateCountOnRange` dispatches in the QueryItem deserialize visitor, the inner item is now deserialized via a new `NonAggregateInner(QueryItem)` newtype wrapper. Its `Deserialize` impl mirrors the QueryItem variant set but **omits** `AggregateCountOnRange` from its `Field` enum entirely — so a nested-aggregate payload is rejected by serde's enum dispatcher immediately, with no recursion into `QueryItem::deserialize`. Tests (gated on `feature = "serde"`, using `serde_test`'s token-level driver to bypass an unrelated pre-existing PascalCase/snake_case mismatch in the `Serialize`/`Deserialize` impls that breaks textual formats): - `serde_decode_rejects_nested_aggregate_count_on_range` — token stream for `AggregateCountOnRange(AggregateCountOnRange(...))` produces an `unknown field 'aggregate_count_on_range'` error from `NonAggregateInner`'s field dispatcher. - `serde_decode_accepts_valid_one_level_aggregate_count_on_range` — token stream for the only legal shape (`AggregateCountOnRange` wrapping a non-aggregate range) deserializes successfully. Added `serde_test = "1.0"` as a dev-dependency. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

QuantumExplorer · 2026-05-09T20:25:58Z

This is Claude. Pushed fd272c7b addressing the remaining serde-feature finding.

Fix: when Field::AggregateCountOnRange dispatches in the QueryItem deserialize visitor, the inner item is now deserialized via a new NonAggregateInner(QueryItem) newtype wrapper. Its Deserialize impl mirrors the QueryItem variant set but omits AggregateCountOnRange from its Field enum entirely — so a nested-aggregate payload is rejected by serde's enum dispatcher immediately, with no recursion through QueryItem::deserialize. This is the serde counterpart to the bincode decode_with_depth + nested-rejection in 5989cc4.

Tests (gated on feature = "serde", using serde_test's token-level driver):

serde_decode_rejects_nested_aggregate_count_on_range — token stream for AggregateCountOnRange(AggregateCountOnRange(...)) produces an `unknown field `aggregate_count_on_range`` error from NonAggregateInner's field dispatcher.
serde_decode_accepts_valid_one_level_aggregate_count_on_range — token stream for the only legal shape (one wrap around a non-aggregate range) deserializes successfully.

Added serde_test = \"1.0\" as a dev-dependency. Used token-level testing rather than a textual format because while writing this I noticed a pre-existing PascalCase/snake_case mismatch between the existing Serialize impl (variant tags emitted as PascalCase) and the existing Field enum (`#[serde(field_identifier, rename_all = "snake_case")]` — expects snake_case). That mismatch breaks JSON round-trip but is invisible to formats that don't carry variant names textually. Happy to fix it as a small follow-up commit on this PR if you'd like — it's unrelated to this work but worth surfacing.

Workspace state: `cargo clippy --workspace --all-features -D warnings` clean, `cargo test --workspace` green.

The aggregate-count-queries chapter was written before the implementation landed and several sections drifted from what actually shipped. This commit reconciles them: - Result type: replace the AggregateCountQueryResult struct sketch with the actual bare-tuple API (verify_aggregate_count_query -> Result<(CryptoHash, u64), Error>). The struct was rejected in review in favor of the bare tuple since the count is a u64 and the path_query already echoes the inner range. - HashWithCount self-verifying form: update the node-types table, every mermaid diagram, and the role explanations to reflect the shipped HashWithCount(kv_hash, left_child_hash, right_child_hash, count) form. The earlier draft used "KVHashCount + 2 child Hash" ops, which the implementation discarded in favor of one self-verifying op. The diagrams no longer show e/g/i/k as separate Hash children under HashWithCount nodes (their hashes now live inside the parent HashWithCount's embedded child-hash fields). The closed-range example drops from "13 push ops" to "9 push ops" accordingly. - New "Verifier shape walk" section: documents the two-phase verification (decode -> shape walk against classify_subtree with inherited bounds) and explicitly enumerates the three attack classes the shape walk catches that a naive allowlist-only verifier would let through (single-Hash undercount, HashWithCount->Hash swap, keyless child smuggling). Mirrors the security findings addressed in 5989cc4. - New "Decode safety" section: documents the MAX_QUERY_ITEM_DECODE_DEPTH = 4 bound and NonAggregateInner serde wrapper that prevent stack-exhaustion via repeated variant-10 payloads. Mirrors the bincode + serde decode hardening from 5989cc4 and fd272c7. - API sketch: switch to PathQuery::new_aggregate_count_on_range helper; destructure the verifier result as (root, count) to match the actual signature. - "Open Design Questions" -> "Settled design choices": items 1-3 from the original list are now locked in by the shipped validation. Added a fourth bullet documenting the HashWithCount 4-field decision and the rationale (review rejection of the simpler (node_hash, count) form). Cost-limit interaction stays as the only remaining design note. Book builds cleanly via mdbook. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The section was a leftover from the pre-implementation "Open Design Questions" list. Every bullet is now redundant: - Items 1-3 (one ACOR per Query, add_parent_tree_on_subquery forbidden, SizedQuery limit/offset rejected) are covered by the numbered Validation rules earlier in the chapter. - Item 4 (HashWithCount is self-verifying) is covered by the "Why HashWithCount is self-verifying" callout under the node-types table and elaborated in the Verifier shape walk section. - Item 5 (cost-limit interaction) is covered by the Cost Model section. The section ended on a "noted for review" framing that no longer matches reality — everything in it is shipped, validated, and tested. Removing it tightens the chapter and avoids duplicating rationale across two sections that could drift apart. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@grovedb-query/src/query.rs`:
- Around line 330-335: The helper aggregate_count_on_range currently only
returns Some when self.items is exactly one ACOR item; change it to scan
self.items for any item where is_aggregate_count_on_range() is true and return a
reference to that item (e.g., find/iter().find(...)/iter().position then
&self.items[idx]) so malformed queries that include an ACOR still get detected;
keep shape enforcement delegated to validate_aggregate_count_on_range so that
only detection is changed.

In `@merk/src/proofs/query/aggregate_count.rs`:
- Around line 123-128: The match in is_provable_count_bearing only checks bare
ProvableCountTree/ProvableCountSumTree and rejects wrapped variants; before
matching, normalize tree_type by stripping any wrapper variants to the inner
tree kind (i.e., apply the same wrapper-unwrapping/normalization used elsewhere)
and then match against ProvableCountTree and ProvableCountSumTree; apply the
identical normalization/fix in Merk::prove_aggregate_count_on_range so wrapped
count trees are accepted there too.

In `@merk/src/proofs/query/verify.rs`:
- Around line 479-489: The match arm must reject Node::HashWithCount in the
plain query verifier: update the match handling around
Node::Hash(_)|Node::KVHash(_)|Node::KVHashCount(..)|Node::HashWithCount(..) so
that when a Node::HashWithCount(..) is encountered (and in_range is true) the
verifier fails fast (return an Err / propagate a verification failure) instead
of treating it like a benign collapsed subtree; this prevents forged proofs that
rely on Tree::hash() recomputing counts and avoids letting execute_node be
satisfied by hidden keyed children—reserve Node::HashWithCount for the
aggregate-count verifier only.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1fc5ce85-ba3b-4c5e-9725-2902ea6a8f7e

📥 Commits

Reviewing files that changed from the base of the PR and between 8823ff1 and fd272c7.

📒 Files selected for processing (30)

docs/book/book.toml
docs/book/mermaid-fixup.js
docs/book/src/SUMMARY.md
docs/book/src/aggregate-count-queries.md
docs/book/src/query-system.md
grovedb-bulk-append-tree/src/proof/mod.rs
grovedb-dense-fixed-sized-merkle-tree/src/proof/mod.rs
grovedb-query/Cargo.toml
grovedb-query/src/proofs/encoding.rs
grovedb-query/src/proofs/mod.rs
grovedb-query/src/query.rs
grovedb-query/src/query_item/intersect.rs
grovedb-query/src/query_item/mod.rs
grovedb/src/debugger.rs
grovedb/src/operations/proof/aggregate_count.rs
grovedb/src/operations/proof/generate.rs
grovedb/src/operations/proof/mod.rs
grovedb/src/operations/proof/verify.rs
grovedb/src/query/mod.rs
grovedb/src/tests/aggregate_count_query_tests.rs
grovedb/src/tests/mod.rs
grovedb/src/tests/provable_count_sum_tree_tests.rs
merk/benches/branch_queries.rs
merk/src/merk/chunks.rs
merk/src/merk/prove.rs
merk/src/proofs/branch/mod.rs
merk/src/proofs/query/aggregate_count.rs
merk/src/proofs/query/mod.rs
merk/src/proofs/query/verify.rs
merk/src/proofs/tree.rs

👮 Files not reviewed due to content moderation or server errors (5)

grovedb-query/src/query_item/mod.rs
grovedb-bulk-append-tree/src/proof/mod.rs
grovedb-dense-fixed-sized-merkle-tree/src/proof/mod.rs
grovedb-query/src/query_item/intersect.rs
grovedb/src/operations/proof/verify.rs

[Major] Detect ACOR presence even on malformed queries (grovedb-query/src/query.rs) `Query::aggregate_count_on_range()` previously returned `None` unless the query was already the well-shaped single-item ACOR form. Queries that *contain* AggregateCountOnRange plus extra items (or with ACOR not at items[0]) reported `None` and could be mistakenly routed through the regular-query path. The helper now scans the whole `items` vec for any ACOR item; shape enforcement remains delegated to `validate_aggregate_count_on_range`. This is detection-only — the prover-side routing (which already used `iter().any()`) was unaffected, but downstream callers using the helper as a precondition gate now correctly hand malformed queries to the validator. [Major] Reject HashWithCount in the plain query verifier (merk/src/proofs/query/verify.rs) The plain `Query::execute_proof` verifier was treating `HashWithCount` the same as `Hash`/`KVHash`/`KVHashCount` — accepted as a "non-keyed node, OK if not in_range". That was unsafe: `Tree::hash()` for `HashWithCount` recomputes its hash from the embedded `(kv_hash, l, r, count)` while ignoring any reconstructed children, so a malicious prover could include a `HashWithCount` in a regular query proof, hang fake KV pushes under it (which `execute_node` would credit as query results), and still preserve the parent's hash chain. `HashWithCount` is now split into its own match arm that fails fast with a clear error. The dedicated aggregate-count verifier (`verify_aggregate_count_on_range_proof`) remains the only path that accepts the variant; that path also enforces the shape walk so attached children are independently rejected. New test: `regular_query_verifier_rejects_hash_with_count_node` builds an honest range proof against a normal merk, splices a `HashWithCount` push at the front, and asserts `Query::execute_proof` returns an `InvalidProofError`. [Skipped — not a bug] is_provable_count_bearing missing NonCounted variants CodeRabbit suggested normalizing `tree_type` for `NonCounted*` wrapper variants. There is no such `TreeType` variant — `NonCounted` is an `ElementType` wrapper that the merk-open path strips via `Element::NonCounted(inner) => inner.root_key_and_tree_type_owned()` (merk/src/element/tree_type.rs:63). The merk's `tree_type` for any NonCounted-wrapped provable count tree element is already the bare `TreeType::ProvableCountTree` / `ProvableCountSumTree`, so the existing gate is correct. Workspace state: cargo build clean; cargo clippy --workspace --all-features -D warnings clean; cargo test --workspace green (including the two new tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

QuantumExplorer · 2026-05-09T20:46:40Z

This is Claude. Pushed `02e75c15` addressing the CodeRabbit findings.

[Major #1] Detect ACOR presence even on malformed queries — Fixed. `Query::aggregate_count_on_range` now scans the entire `items` vec via `iter().find`, so queries with extra items or with the ACOR not at index 0 are still detected. Shape enforcement remains in `validate_aggregate_count_on_range`. Updated the `aggregate_count_on_range_helper_*` test to cover ACOR-not-at-index-0 and ACOR-with-extra-item cases.

[Major #3] Reject HashWithCount in the plain query verifier — Fixed. Real attack: `Tree::hash()` for `HashWithCount` recomputes from its embedded `(kv_hash, l, r, count)` and ignores any reconstructed children, so a malicious prover could include a `HashWithCount` in a regular query proof, attach fake KV pushes under it (which `execute_node` would credit as query results), and the parent's hash chain would still verify. Now `Node::HashWithCount(..)` has its own match arm in the plain verifier that fails fast with `"HashWithCount node is only valid in aggregate-count proofs; encountered in regular query verification"`. New test `regular_query_verifier_rejects_hash_with_count_node` exercises this end-to-end.

[Major #2] is_provable_count_bearing missing NonCounted variants — Skipped, not a bug. CodeRabbit suggested normalizing `tree_type` for `NonCounted*` wrapper variants, but there's no such `TreeType` variant — `NonCounted` is an `ElementType` wrapper that the merk-open path strips via `Element::NonCounted(inner) => inner.root_key_and_tree_type_owned()` (merk/src/element/tree_type.rs:63). The merk's `tree_type` for any NonCounted-wrapped provable count tree element is already the bare `TreeType::ProvableCountTree` / `ProvableCountSumTree`, so the existing gate is correct. (The book wording "NonCounted* wrapper variants" was the source of confusion; it referred to ElementType wrappers, not TreeType.)

`cargo clippy --workspace --all-features -D warnings` clean; full workspace test suite green including the two new tests.

QuantumExplorer

Reviewed

…+ proof size Pushes test coverage from ~7/10 to ~9/10 by adding the four missing classes the original review called out: merk/src/proofs/query/aggregate_count.rs (+2 tests): - fuzz_byte_mutation_no_silent_forgery: enumerates every byte of an honest count proof, flips it to three different values, and asserts the verifier never produces a "silent forgery" — Ok((honest_root, count')) where count' != honest_count. Three safe outcomes are permitted: rejection, root divergence, or same-(root, count) (which happens for non-canonical re-encodings like Push <-> PushInverted). The unsafe case panics with a precise location. Asserts both rejection and divergence branches fire as a sanity check. - fuzz_random_trees_and_ranges_round_trip: deterministic xorshift RNG builds 16 ProvableCountTrees of varying sizes with random multi-byte keys, runs 6 random ranges per tree (covering all 6 bounded / half-bounded variants), and asserts the verifier's count matches a brute-force keys.iter().filter(range.contains).count(). Catches off-by-one / edge-of-tree / multi-byte-key bugs the example-based tests would miss. grovedb/src/tests/aggregate_count_query_tests.rs (+2 tests): - non_counted_children_are_included_in_aggregate_count_v1_contract: pins the v1 contract that AggregateCountOnRange counts every in-range key including NonCounted-wrapped ones. Inserts 5 normal items + 1 NonCounted item into a ProvableCountTree and asserts the count is 6, not 5. The book chapter previously claimed exclusion; that doc was aspirational and is corrected in the same commit. The test's doc-comment points callers to read the parent Element::ProvableCountTree(_, count, _) bytes directly when they want the aggregate-excluding-NonCounted total. - proof_size_snapshot_for_15_key_closed_range: pins the proof byte size for the canonical 15-key + RangeInclusive("c"..="l") setup at ~650 bytes (window [300, 900]). Catches gross regressions in the proof shape — e.g. if the count short-circuit stops firing or if every node starts emitting full child hashes. docs/book/src/aggregate-count-queries.md: Replace the "NonCounted children are excluded by design" callout with an accurate description of the v1 contract: in-range keys are counted regardless of NonCounted-ness; callers wanting the parent's aggregate-excluding-NonCounted total should read the parent element bytes directly. Notes that a NonCounted-aware count mode could be added in a future revision (would require tracking structural-vs-in-range counts separately during the shape walk). Workspace state: cargo build clean, cargo clippy --workspace --all-features -D warnings clean, cargo test --workspace green including the four new tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The `_verify_options_imported_marker` fn was a placeholder added to keep the `VerifyOptions` import alive "for future count-aware verify options." There's no concrete plan for those options on this branch, and a placeholder fn that exists only to satisfy a dead import is worse than dropping both — if/when the options arrive, restoring the import is a one-liner. Removed: - the `_verify_options_imported_marker` fn - `VerifyOptions` from the use block (kept `QueryProofVerify` since `execute_proof` still depends on it) Build + clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

NonCounted-wrapped children should not contribute to the parent's aggregate count (they zero out their contribution at insertion time), but they still have a provable count equal to their left+right descendants. Implement the exclusion end-to-end: - Prover (`emit_count_proof`): Disjoint and Contained subtrees now both emit `HashWithCount(kv_hash, l, r, count)` so the structural count of every outside subtree is cryptographically bound to the parent's hash chain. Boundary nodes derive their own_count as `node_count - left_link_aggregate - right_link_aggregate`. - Verifier (`verify_count_shape`): returns `(in_range_count, structural_count)`. At Boundary nodes, `own_count` is derived via `checked_sub` from the parent aggregate and the left/right structural counts (rejects malformed proofs that would saturate). - Phase-1 allowlist no longer accepts plain `Node::Hash` in count proofs — only `HashWithCount` and `KVDigestCount`. Plain `Hash` carries no count, so a malicious prover could lie about the structural count and skew the parent's `own_count` derivation. - Test `non_counted_children_are_excluded_from_aggregate_count` inserts 5 normal + 1 NonCounted item and asserts count = 5. - Book chapter updated to document the new exclusion semantics and the `(in_range, structural)` verifier return type. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

grovedb/src/tests/aggregate_count_query_tests.rs (1)
282-297: ⚡ Quick win

Mutate a real HashWithCount node instead of the first matching byte.

This loop scans the entire serialized GroveDB proof and flips the first 0x1e/0x1f byte it finds, but those values can also occur inside hashes or envelope bytes. That means the test can fail for unrelated corruption while no longer proving that a forged HashWithCount count is rejected. Parsing down to the merk proof before mutating the count field would make this check much more precise.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@grovedb/src/tests/aggregate_count_query_tests.rs` around lines 282 - 297, The
test currently flips the first byte equal to 0x1e/0x1f in the serialized proof
(variables proof, tampered, count_offset), which can hit bytes inside
hashes/envelopes; instead parse the serialized merk proof into node/opcode
boundaries and locate a genuine HashWithCount node before mutating its count
varint. Update the loop to walk the proof with a cursor that reads an opcode
then the exact node payload lengths (opcode 0x1e/0x1f => opcode | kv_hash[32] |
left[32] | right[32] | count_varint) so you compute count_offset only when the
opcode is at a true node boundary, then increment that varint and set tampered =
true; ensure the parser skips over non-opcode bytes rather than matching raw
bytes inside hashes.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/book/src/aggregate-count-queries.md`:
- Around line 217-220: Update the excluded-subtree example diagrams and
descriptions so that any out-of-range or excluded leaf positions are shown and
treated as HashWithCount rather than plain Hash: replace occurrences of `Hash`
used to represent excluded subtrees in the excluded-subtree examples (including
the blocks referenced around the existing diagrams and the verifier walk) with
`HashWithCount`, and ensure the text and role-table consistency reflects that
excluded leaves must be disjoint positions represented by `HashWithCount` (since
count proofs explicitly reject plain `Hash`). Adjust the captions/labels and any
verifier-walk narrative that checks leaf types to reference `HashWithCount` in
those examples.

---

Nitpick comments:
In `@grovedb/src/tests/aggregate_count_query_tests.rs`:
- Around line 282-297: The test currently flips the first byte equal to
0x1e/0x1f in the serialized proof (variables proof, tampered, count_offset),
which can hit bytes inside hashes/envelopes; instead parse the serialized merk
proof into node/opcode boundaries and locate a genuine HashWithCount node before
mutating its count varint. Update the loop to walk the proof with a cursor that
reads an opcode then the exact node payload lengths (opcode 0x1e/0x1f => opcode
| kv_hash[32] | left[32] | right[32] | count_varint) so you compute count_offset
only when the opcode is at a true node boundary, then increment that varint and
set tampered = true; ensure the parser skips over non-opcode bytes rather than
matching raw bytes inside hashes.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d731dbfa-432c-46a4-9f1c-476a1b34be0c

📥 Commits

Reviewing files that changed from the base of the PR and between fd272c7 and 30a960a.

📒 Files selected for processing (6)

docs/book/src/aggregate-count-queries.md
grovedb-query/src/query.rs
grovedb/src/operations/proof/aggregate_count.rs
grovedb/src/tests/aggregate_count_query_tests.rs
merk/src/proofs/query/aggregate_count.rs
merk/src/proofs/query/verify.rs

🚧 Files skipped from review as they are similar to previous changes (3)

merk/src/proofs/query/aggregate_count.rs
grovedb/src/operations/proof/aggregate_count.rs
grovedb-query/src/query.rs

- Book chapter: update the per-case example diagrams (Case 1 RangeFrom / RangeAfter, Case 2 RangeInclusive) so excluded-subtree leaves are shown as `HashWithCount` rather than plain `Hash`. The role table and verifier-walk section were already correct (out-of-range subtrees must use `HashWithCount` so the structural count is bound to the parent's hash chain), but the example diagrams still showed the old `Hash`-only shape and contradicted the security model. Verifier total tables also updated to include the outside subtrees with their +0 contributions. - count_forgery_is_caught_at_grovedb_level: rewrite to parse the GroveDB envelope (bincode) and the leaf merk proof (merk::proofs::Decoder) properly, mutating the count of a real `Op::Push(Node::HashWithCount)` op at a true op boundary instead of byte-scanning for 0x1e/0x1f. The previous approach could match a 0x1e byte inside an embedded 32-byte hash — the verifier would still reject, but for the wrong reason (root mismatch from a tampered hash, not a tampered count). The new approach actually proves the property the test name claims. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

QuantumExplorer · 2026-05-09T21:57:50Z

This is Claude. Addressed both CodeRabbit comments in abf86d0:

1. Excluded-subtree examples (docs/book/src/aggregate-count-queries.md)
Updated the per-case diagrams (Case 1 RangeFrom + RangeAfter, Case 2 RangeInclusive) so out-of-range leaves are shown as HashWithCount instead of plain Hash. The role table and verifier-walk section were already consistent — only the diagrams contradicted the security model. Also added the outside subtrees (with +0 contribution) to each verifier total table for clarity, and updated the gray legend entry to reflect the new shape.

2. Forgery test (grovedb/src/tests/aggregate_count_query_tests.rs)
Replaced the byte-scan with a proper parse:

Decode the GroveDBProof envelope via bincode (same config as the verifier).
Walk down lower_layers along the path keys to the leaf merk proof bytes.
Use grovedb_merk::proofs::Decoder to parse the merk proof at true op boundaries, find the first Op::Push(Node::HashWithCount(_, _, _, count)) (or PushInverted), increment the count, re-encode with encode_into.
Re-encode the envelope.

This guarantees we mutate an actual count varint and not a 0x1e byte that happens to live inside an embedded 32-byte hash. The test name promises that a forged count is rejected; the new flow actually proves that.

All 19 grovedb-level + 30 merk-level aggregate-count tests pass. cargo fmt --check and the typos/cargo fmt pre-commit hooks all clean.

QuantumExplorer · 2026-05-09T22:07:15Z

+        {
+            let inner_range = cost_return_on_error_no_add!(
+                cost,
+                path_query.validate_aggregate_count_on_range().cloned()


[P2] Aggregate-count validation still runs too late

This validation only runs after traversal reaches the target query level. If the path is missing, the recursive prover never sees the AggregateCountOnRange item, so malformed aggregate-count queries can still return a regular path/absence proof instead of being rejected. I reproduced this with PathQuery::new_aggregate_count_on_range([TEST_LEAF, "missing"], QueryItem::Key(...)): prove_query returned Ok(Some(proof)) even though the inner Key is invalid for aggregate count.

I would fix this by validating aggregate-count path queries at the prove_query / v1 entry point before prove_subqueries starts, for example: if path_query.has_aggregate_count_on_range() { path_query.validate_aggregate_count_on_range()?; }. If aggregate-count items can be constructed inside nested subquery branches, make the detector recursive and reject those shapes too, since aggregate-count is terminal and should not be hidden under a regular subquery path.

…ejections Adds nine new tests targeting previously-uncovered branches in the aggregate-count code, lifting per-file line coverage on the new files from 62.45%/92.17% to 76.68%/92.98%. GroveDB-side (`tests/aggregate_count_query_tests.rs`): - v1_envelope_with_non_merk_proof_bytes_is_rejected: swaps a leaf layer's `ProofBytes::Merk(_)` for `MMR(_)` and asserts the V1 walker rejects with InvalidProof. - v1_envelope_with_missing_lower_layer_is_rejected: drops the leaf layer's pointer entry in `lower_layers`. - v1_envelope_with_corrupted_non_leaf_merk_bytes_is_rejected: truncates a non-leaf merk proof to one byte to exercise the single-key proof error path. - v1_envelope_with_malformed_leaf_count_proof_is_rejected: replaces the leaf merk proof with a single Push(Hash) op stream so the leaf-level count verifier surfaces a wrapped `InvalidProof` error rather than reaching the chain check. Generate.rs (`operations/proof/generate.rs`): - {dense_tree, mmr_tree, bulk_append_tree}_rejects_aggregate_count_on_range: unit-test the `query_items_to_*` private helpers directly so we exercise the AggregateCountOnRange rejection arms without needing to set up populated dense/MMR/bulk-append trees in a real grove. Merk-side (`proofs/query/aggregate_count.rs`): - shape_walk_rejects_disjoint_hashwithcount_with_children: splices a HashWithCount child under a Disjoint-position HashWithCount (Phase-1-allowed but Phase-2-rejected) to exercise the leaf-only invariant at Disjoint positions. - shape_walk_rejects_non_hashwithcount_at_disjoint: swaps a Disjoint HashWithCount for a plain Hash carrying the same node hash; either Phase 1 (allowlist) or Phase 2 (expected-type) rejects. - shape_walk_rejects_kvdigestcount_outside_inherited_bounds: rewrites a KVDigestCount key to fall outside the parent's inherited subtree bounds. 26 grovedb-level + 33 merk-level aggregate-count tests now pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Codex finding (PR #656 review): the AggregateCountOnRange validation only fires inside `prove_subqueries` when recursion reaches the ACOR-bearing leaf level. If the path doesn't exist (e.g. a missing key under TEST_LEAF), the recursive prover never sees the ACOR item, so a malformed aggregate-count query — invalid inner range, ACOR hidden in a subquery branch, etc. — would silently route through the regular-proof path and return Ok with a regular path/absence proof. Reproduced as `aggregate_count_with_missing_path_and_invalid_inner_is_ rejected_at_entry`: a PathQuery with path = [TEST_LEAF, "missing"] and an inner `Key` (invalid for ACOR) returned `Ok(86)` (an 86-byte proof) before this fix. Fix: - Add `Query::has_aggregate_count_on_range_anywhere` — recursive detector that walks the top-level items, the default subquery branch, and every conditional subquery branch. - At the entry of `prove_query_non_serialized` (before v0/v1 dispatch), if the recursive detector finds any ACOR, immediately call `validate_aggregate_count_on_range`. This rejects: * invalid inner items (Key, RangeFull, nested ACOR) * ACOR with extra top-level items or with subquery branches * ACOR hidden inside a subquery branch (top-level shape isn't canonical, validate fails) Tests: - `has_aggregate_count_on_range_anywhere_walks_subqueries` (grovedb-query): plain query, top-level ACOR, ACOR in default subquery, ACOR in conditional subquery. - `aggregate_count_with_missing_path_and_invalid_inner_is_rejected_at_entry` (grovedb): the exact reproducer from Codex's review. - `aggregate_count_hidden_in_subquery_branch_is_rejected_at_entry` (grovedb): broader concern — terminal ACOR cannot be hidden under a subquery branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

QuantumExplorer · 2026-05-09T22:30:28Z

This is Claude. Real bug — thanks for the catch and the precise reproducer. Fixed in c13b5f4.

Reproduction confirmed first. Added aggregate_count_with_missing_path_and_invalid_inner_is_rejected_at_entry, which builds the exact shape you described (PathQuery::new_aggregate_count_on_range([TEST_LEAF, "missing"], QueryItem::Key(...))). Before the fix, prove_query returned Ok(86) (an 86-byte proof for the malformed query). Test went red, fix made it green.

Fix follows your recommended structure:

New Query::has_aggregate_count_on_range_anywhere — recursive detector. Walks the top-level items, then default_subquery_branch.subquery, then every conditional_subquery_branches[*].subquery. Returns true if any AggregateCountOnRange is reachable from this query.
New gate at the entry of GroveDb::prove_query_non_serialized (before V0/V1 dispatch):
```
if path_query.query.query.has_aggregate_count_on_range_anywhere()
    && let Err(e) = path_query.validate_aggregate_count_on_range()
{
    return Err(e).wrap_with_cost(OperationCost::default());
}
```
validate_aggregate_count_on_range already enforces the canonical single-ACOR-no-subqueries shape, so detection-anywhere + validate-strict together close the surface.

Coverage of the broader concern. Added aggregate_count_hidden_in_subquery_branch_is_rejected_at_entry — constructs a query whose top-level items are a regular Range and whose default_subquery_branch.subquery carries the AggregateCountOnRange. Top-level aggregate_count_on_range() misses it, but the new recursive detector finds it; validate then rejects because the surrounding query isn't the canonical shape. Plus has_aggregate_count_on_range_anywhere_walks_subqueries in grovedb-query covers default+conditional subquery branches at the unit level.

Workspace: 1486 grovedb lib tests + 28 aggregate_count grovedb + 4 grovedb-query aggregate_count tests pass. cargo fmt/typos hooks clean.

…ure paths Adds four new tests targeting previously-uncovered defensive branches in `verify_single_key_layer_proof_v0` and `enforce_lower_chain`: - non_leaf_proof_without_target_key_is_rejected: replace the "ct" KV op with a Hash op so the merk single-key verifier returns Ok with no matching key in result_set, hitting the "did not contain the expected key" arm. - non_leaf_proof_with_kv_replaced_by_kvdigest_is_rejected: replace the "ct" KV variant with KVDigest (key + value_hash, no value), so the result_set contains "ct" but `value = None`, hitting the "no value bytes" arm. - non_leaf_proof_with_undeserializable_value_is_rejected: mutate the "ct" value bytes to garbage that no Element variant tag matches, hitting the deserialize-failure arm of `enforce_lower_chain`. - non_leaf_proof_with_non_tree_element_is_rejected: mutate the "ct" value bytes to a serialized Element::Item, exercising the `is_any_tree()` guard — aggregate-count proofs can only descend through tree elements. All four are gated to `Err(InvalidProof)` and accept either the specific defensive branch firing or an upstream merk verifier rejection, since the order in which the validation steps fail can shift across mutations. Either path closes the attack surface. Also adds a `mutate_test_leaf_layer_ops` test helper that decodes the V1 envelope, walks to the TEST_LEAF non-leaf merk proof bytes, runs an arbitrary closure over the parsed ops, and re-encodes — shared by all four new tests. Per-file line coverage on `grovedb/src/operations/proof/aggregate_count.rs`: 73.91% → 84.58%. 32 grovedb-level aggregate_count tests now pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…erify feature (#658) * feat(merk,grovedb): expose aggregate_count proof verification under verify feature The AggregateCountOnRange proof primitive (PR #656) had its modules gated behind feature = "minimal" in both grovedb-merk and grovedb, making the verifier entry points (`verify_aggregate_count_on_range_proof` and `GroveDb::verify_aggregate_count_query`) unreachable from downstream lean-verifier crates that depend on `grovedb`/`grovedb-merk` with `default-features = false, features = ["verify"]`. Widen the module gates to `any(feature = "minimal", feature = "verify")` to match the pattern already used by `query::common`, `query::merge`, `query::query_item`, and `query::verify` in the same file. Refactor aggregate_count.rs to keep prover-only items (`RefWalker`, `Fetch`, `AggregateData`, `emit_count_proof`, etc.) gated `feature = "minimal"` while exposing the verifier (`verify_aggregate_count_on_range_proof`, `verify_count_shape`, `classify_subtree`, `key_strictly_inside`) under the wider gate. Add a `cfg(all(test, feature = "verify"))` test module with hex-fixture round-trip, empty-merk, and byte-mutation no-silent-forgery tests that exercise the verifier without any prover dependency, plus an `#[ignore]`d helper to regenerate the fixture if the proof encoding ever changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: widen verify_only_tests gate so default CI runs them The verify_only_tests module was gated cfg(all(test, feature = "verify")), which hid it from CI runs that only enable the default `full` feature set (which doesn't include `verify`). Codecov flagged the test bodies as uncovered patch lines as a result. Widen to cfg(all(test, any(feature = "minimal", feature = "verify"))). The verifier function is available under both gates and these tests have no prover dependency, so the `verify`-only gate was buying nothing except missing CI coverage. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: convert ignored fixture-dump helper into active parity check The #[ignore]d dump_verify_only_fixtures test never ran in CI, so its ~17 lines counted as uncovered patch lines. Replace it with an active verify_only_fixture_matches_fresh_prover_output test that runs every build, asserts the hardcoded fixture in verify_only_tests still matches fresh prover output byte-for-byte, and prints regeneration-ready constants on mismatch. This catches encoding drift automatically and gets the fixture maintenance lines into CI coverage. Promote the FIXTURE_*_HEX/COUNT constants to pub(super) so the parity test can compare against them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: drop redundant feature gates on test modules Both test modules were gated with feature = "minimal" / any(minimal, verify), but the crate convention (e.g., merk/src/element/tree_type.rs) is plain #[cfg(test)] — tests rely on default = ["full"] (which pulls in minimal) being active at test time. The extra gates were defensive beyond what the surrounding code does. Also tighten the verify_only_tests doc comment now that the drift check runs on every build instead of being a separate ignored helper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…rely V0 proof envelopes (`GroveDBProofV0` / `MerkOnlyLayerProof`) predate the ACOR feature — ACOR was introduced in #656, well after V1 envelopes became the default for grove versions used by Dash Platform v12+. The V0+ACOR combination is impossible in any deployed Platform release, so the previous compatibility shims are dead code. This change makes that explicit: - Prover entry: `prove_query_non_serialized` rejects ACOR queries paired with grove versions that dispatch to V0 with `Error::NotSupported("requires V1 proof envelopes")`. The check fires before any work happens, so callers can't accidentally emit a V0 ACOR proof. - Verifier entries: both `verify_aggregate_count_query` (leaf) and `verify_aggregate_count_query_per_key` (leaf or carrier) refuse `GroveDBProof::V0` envelopes via a shared `require_v1_envelope` helper, surfacing `Error::InvalidProof`. A forged V0 envelope carrying ACOR-looking bytes can't even reach the verification logic. - Removed dead code: `verify_v0_leaf_chain` and `verify_v0_with_classification` are deleted. `MerkOnlyLayerProof` and `GroveDBProofV0` are no longer imported by `aggregate_count.rs`. - Tests updated: `provable_count_tree_works_on_grove_v2_envelope` inverts to `aggregate_count_rejects_grove_v2_envelope`, asserting the prover refuses the combination. `acor_carrier_rejects_v0_envelope` similarly tightens its assertion to expect `NotSupported` from the prover instead of `InvalidProof` from the verifier. The V0 short-circuit in `prove_subqueries` (the v0 path) stays as defense-in-depth — unreachable through the entry gate but a safety net for any future internal caller that bypasses `prove_query_non_serialized`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…663) * feat(grovedb,query): allow AggregateCountOnRange as carrier subquery Allow `AggregateCountOnRange` as a subquery item under an outer `Keys`/ `Range*` carrier query — the natural multi-outer-key extension of the existing single-leg ACOR. The carrier returns one `(outer_key, u64)` pair per matched outer key while leaving the leaf wire format and existing entry points byte-identical. Why: Dash Platform's count-query module needs to answer `SELECT brand, COUNT(*) ... WHERE brand IN (...) GROUP BY brand` against `byBrand / <brand> / color / <ProvableCountTree>` layouts. Today grovedb rejects the nested shape at `Query::validate_*`, forcing callers to either fan out one proof per outer key (k× larger proofs) or fall back to the unproven path. With this change the same GroveDB proof contains all k counts and is verified in one pass. Changes: - `grovedb-query`: split validation into a leaf validator (renamed from the old `validate_aggregate_count_on_range`) and a new carrier validator that requires `Key`/`Range*` outer items, a leaf-ACOR subquery, and no conditional branches. The top-level `validate_aggregate_count_on_range` now dispatches by query shape and returns the leaf inner range either way. - `grovedb`: add `validate_leaf_aggregate_count_on_range` at SizedQuery/PathQuery for entry points that must reject the carrier. No prover changes needed — the existing recursion naturally walks outer items -> subquery_path -> leaf ACOR via `query_items_at_path`. - `grovedb`: add `verify_aggregate_count_query_per_key` returning `(RootHash, Vec<(Vec<u8>, u64)>)`. Leaf queries surface as a single entry with empty key + the same count `verify_aggregate_count_query` returns. The legacy `verify_aggregate_count_query` now strictly validates the leaf shape; carrier queries must use the new entry. - Tests: 10 new carrier-shape validation tests in grovedb-query and 7 end-to-end carrier proof tests in grovedb (two-outer-key happy path, absent-outer-key returns present-only, both-levels rejection, Range-outer carrier, leaf symmetry, non-ACOR rejection, and count forgery rejection through the carrier envelope). Asymptotic complexity for |outer matches| = k, outer tree B, leaf tree C: O(k * (log B + log C)). Out of scope (deferred): conditional-branch carriers, multi-In on prefix (IN x IN), drive-side integration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(grovedb,query): address review feedback on carrier ACOR - Move ACOR construction/validation out of `grovedb-query/src/query.rs` (it had grown past 1500 lines) into a focused `grovedb-query/src/aggregate_count.rs` with its own `impl Query` block and tests. `query.rs` shrinks from ~1580 to ~910 lines. - Per CodeRabbit major: route verifier validation through the PathQuery layer so `SizedQuery::limit`/`offset` are rejected up front for both leaf and carrier shapes. New tests pin the pagination rejection. - V0 proof envelopes can't carry the carrier shape (they're produced only by pre-feature grove versions); the new `verify_v0_with_classification` rejects carrier classification with `InvalidProof("only supported on V1 proof envelopes")` instead of attempting a never-emitted carrier walk. Removes the unused `verify_v0_carrier_layer` / `verify_v0_subquery_path`. - Per CodeRabbit minor: ordering doc on `verify_aggregate_count_query_per_key` now says "query-direction order" (matches the merk walker) instead of asserting strict ascending lex, since the carrier honors `Query::left_to_right`. - Per CodeRabbit nitpick: pin the malformed-leaf-inner-Key rejection message in the carrier path so future refactors can't silently re-route the error. - New end-to-end tests: V0+carrier rejection, long subquery_path (2 intermediate single-key layers), corrupted outer-layer merk bytes, undecodable envelope, legacy `verify_aggregate_count_query` rejection of carrier shapes, carrier pagination rejection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(grovedb,query): boost carrier ACOR patch coverage Patch coverage was 89.86% (just under codecov's 90% target). The biggest gaps were untested error paths in the carrier verifier and a few carrier-validator branches the top-level dispatcher masks. New tests: grovedb (carrier proof verifier error paths): - `acor_carrier_missing_outer_lower_layer_is_rejected` — drops one `lower_layers[outer_key]` entry from a real envelope; verifier must surface "missing lower layer for outer key". - `acor_carrier_missing_subquery_path_layer_is_rejected` — drops the subquery_path "color" layer; exercises `verify_v1_subquery_path`'s missing-layer branch. - `acor_carrier_non_merk_proof_bytes_is_rejected` — swaps a `ProofBytes::Merk(...)` layer for `ProofBytes::MMR(...)`; the `expect_merk_bytes` helper rejects. grovedb-query (direct carrier-validator branches): - `validate_carrier_acor_direct_rejects_missing_subquery` - `validate_carrier_acor_direct_rejects_acor_outer_item` - `validate_carrier_acor_direct_rejects_range_full_outer` These call `validate_carrier_aggregate_count_on_range` directly to hit branches the top-level dispatcher routes around (the dispatcher classifies first and sends ACOR-outer-item shapes to the leaf validator). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(grovedb): reject V0 envelopes for AggregateCountOnRange entirely V0 proof envelopes (`GroveDBProofV0` / `MerkOnlyLayerProof`) predate the ACOR feature — ACOR was introduced in #656, well after V1 envelopes became the default for grove versions used by Dash Platform v12+. The V0+ACOR combination is impossible in any deployed Platform release, so the previous compatibility shims are dead code. This change makes that explicit: - Prover entry: `prove_query_non_serialized` rejects ACOR queries paired with grove versions that dispatch to V0 with `Error::NotSupported("requires V1 proof envelopes")`. The check fires before any work happens, so callers can't accidentally emit a V0 ACOR proof. - Verifier entries: both `verify_aggregate_count_query` (leaf) and `verify_aggregate_count_query_per_key` (leaf or carrier) refuse `GroveDBProof::V0` envelopes via a shared `require_v1_envelope` helper, surfacing `Error::InvalidProof`. A forged V0 envelope carrying ACOR-looking bytes can't even reach the verification logic. - Removed dead code: `verify_v0_leaf_chain` and `verify_v0_with_classification` are deleted. `MerkOnlyLayerProof` and `GroveDBProofV0` are no longer imported by `aggregate_count.rs`. - Tests updated: `provable_count_tree_works_on_grove_v2_envelope` inverts to `aggregate_count_rejects_grove_v2_envelope`, asserting the prover refuses the combination. `acor_carrier_rejects_v0_envelope` similarly tightens its assertion to expect `NotSupported` from the prover instead of `InvalidProof` from the verifier. The V0 short-circuit in `prove_subqueries` (the v0 path) stays as defense-in-depth — unreachable through the entry gate but a safety net for any future internal caller that bypasses `prove_query_non_serialized`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(grovedb): support right-to-left direction in carrier ACOR verifier The carrier-layer verifier was hardcoding `left_to_right=true` when calling `merk::execute_proof`, even though the rest of the carrier classification (`AcorClassification::carrier_left_to_right`) and the prover (`generate_merk_proof(..., query.left_to_right, ...)`) both honor the query's direction. Consequence: a carrier query with `left_to_right=false` would produce a correct multi-key proof emitted in reverse order, but the verifier would feed it through `execute_proof` walking left-to-right and discover only the last matched key (the merk walker terminates as soon as the first "impossible direction" boundary is hit), returning a single result instead of all matches. The single-key descent (`verify_single_key_layer_proof_v0`) is unaffected because it's exactly-one-key, so order doesn't matter. The fix is a one-line change: pass `left_to_right` through to `execute_proof` in `execute_carrier_layer_proof`. Adds a new test `acor_subquery_right_to_left_returns_descending_order` that builds a 3-brand carrier with `left_to_right=false` and asserts results come back in descending lex order (brand_002, brand_001, brand_000), each with the correct per-brand count. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(grovedb): tone down carrier ACOR direction comment Strip the "CRITICAL:" framing and the hypothetical "hardcoding true would..." — a short factual note about what would actually go wrong is enough. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(grovedb,query): drop "ACOR" abbreviation for forward-compat The "ACOR" / "Acor" shorthand baked count into every identifier and prose mention — but the leaf/carrier shape generalizes to forthcoming aggregate variants (sum, average, …), and "ASOR" / "AAOR" don't make that lineage obvious. This commit drops the abbreviation so future aggregate modules can use parallel naming without inheriting a count-specific prefix. Renames: - `AcorClassification` → `AggregateCountClassification` - `classify_path_query` → `classify_aggregate_count_path_query` - `acor_subquery_*` test functions → `carrier_*` (the `aggregate_count_query_tests` module already scopes them) - `acor_carrier_*` → `carrier_*` - `acor_leaf_*` → `leaf_*` - `acor_per_key_*` → `per_key_*` - `carrier_acor_path_query` helper → `carrier_count_path_query` - `validate_acor_*` → `validate_aggregate_count_*` - `validate_carrier_acor_*` → `validate_carrier_aggregate_count_*` - `make_acor_query` → `make_aggregate_count_query` - `make_leaf_acor_subquery` → `make_leaf_aggregate_count_subquery` - `inner_acor*` → `inner_aggregate_count*` - `bad_inner_acor` → `bad_inner_aggregate_count` Docs/comments/error-message prose: "ACOR" → "aggregate-count" (or the full `AggregateCountOnRange` when referring to the QueryItem). Module docs note that future variants will live in sibling modules (`aggregate_sum`, `aggregate_average`, …) with parallel naming. No behavioral changes — pure rename. All 62 grovedb + 28 grovedb-query aggregate-count tests still pass; workspace tests + clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(grovedb): split aggregate_count proof module into submodules The single 750-line `aggregate_count.rs` had grown to cover four distinct concerns: public API, classification, the leaf-only chain walker, and the per-key carrier walker (plus a pile of shared helpers). Split into a directory module with one file per concern. ``` grovedb/src/operations/proof/aggregate_count/ ├── mod.rs 227 lines — public API (impl GroveDb, │ verify_aggregate_count_query + │ verify_aggregate_count_query_per_key), │ require_v1_envelope, submodule wiring ├── classification.rs 77 lines — AggregateCountClassification struct + │ classify_aggregate_count_path_query ├── leaf_chain.rs 77 lines — verify_v1_leaf_chain (legacy │ single-u64 walker) ├── per_key.rs 221 lines — per-key carrier walker │ (with_classification, _per_key, │ _carrier_layer, _subquery_path) └── helpers.rs 270 lines — shared utilities (decode envelope, verify_count_leaf, expect_merk_bytes, verify_single_key_layer_proof_v0, OuterMatch + execute_carrier_layer_proof, enforce_lower_chain) ``` Visibility is `pub(super)` everywhere — the submodules are crate implementation detail, only `verify_aggregate_count_query` and `verify_aggregate_count_query_per_key` are public. Pure code reorganization — no behavior change. All 62 grovedb + 28 grovedb-query aggregate-count tests still pass; workspace + clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(grovedb): canonical decoding for aggregate-count proofs `bincode::decode_from_slice` returns the number of bytes consumed but we were discarding it, so a valid proof with arbitrary trailing bytes appended would still decode to the same envelope and verify successfully. The chain-bound cryptographic check still gives the same `(RootHash, count)`, so this isn't a correctness/security problem — but it does mean a single logical proof has infinitely many byte encodings, which breaks any equality-by-bytes assumption (caching, deduplication, hashing the proof itself). `decode_grovedb_proof` now compares `consumed` to `proof.len()` and rejects with `Error::CorruptedData("trailing bytes after the encoded envelope")` when they differ. Test `aggregate_count_proof_with_trailing_bytes_is_rejected` pins the rejection at both entry points (`verify_aggregate_count_query` and `verify_aggregate_count_query_per_key`) with a real proof + one appended zero byte. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(grovedb,query): pin nested-carrier rejection The original spec explicitly defers the "Range × Range × ACOR" / "IN × IN on prefix" composition: a carrier whose subquery is itself another carrier (rather than a leaf `AggregateCountOnRange`). The carrier validator already rejects this — it calls `validate_leaf_aggregate_count_on_range` on the subquery, which catches the inner carrier's outer items — but we didn't have a test documenting that contract. Two new tests, one per layer: - `grovedb-query::validate_carrier_aggregate_count_rejects_nested_carrier` drives the static validator with `outer.set_subquery(inner_carrier)` and asserts `InvalidOperation`. - `grovedb::rejects_nested_carrier_range_range_aggregate_count` builds a full `PathQuery` with the same shape and asserts both the static `PathQuery::validate_aggregate_count_on_range` and the prover's entry-point gate (`prove_query` → `InvalidQuery`) refuse. If we ever decide to support the nested shape, these tests are the contract that has to change first. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(grovedb): pin SQL-style A=1, B>4, COUNT(C>4) carrier query Adds an explicit end-to-end test that mirrors the canonical SQL aggregate-count query SELECT COUNT(*) FROM t WHERE a = 1 AND b > 4 AND c > 4 against an `(a, b, c)`-indexed grove laid out as TEST_LEAF / byA / <a_val> / byB / <b_val> / byC / <count_tree> The mapping to the carrier ACOR shape is: - `A = 1` is a fixed prefix → `path_query.path` - `B > 4` is the variable outer dimension → carrier's `RangeAfter` - per matched `B`, walk `byC` → carrier's `subquery_path` - `COUNT(C > 4)` is the leaf aggregate-count subquery The tree contains both `A = 1` and `A = 2` so we also confirm the fixed prefix actually scopes the result (only `A = 1` rows count). Expected: two `(b_val, 5)` entries — `b_5` and `b_7` each have 5 `C > c_4` matches. This shape was already supported via the existing carrier machinery — the test just makes the SQL → grove mapping obvious for callers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(grovedb): emit lower-layer for empty count tree under carrier ACOR When a carrier `AggregateCountOnRange` query's descent reached an empty `Element::ProvableCountTree(None, ..)` or `ProvableCountSumTree(None, ..)`, the V1 prover treated it as a terminal "tree with no children" result and inserted no `lower_layer` for that step. The carrier verifier then rejected the proof at the chain walk with `"carrier aggregate-count proof missing subquery_path layer for key …"`. The correct behavior is to return `count = 0` for that outer key — an existing empty leaf count tree is still a valid match. The fix is a new, narrow arm in `prove_subqueries_v1` that recurses into empty `Provable{Count,CountSum}Tree(None, ..)` elements when both: 1. The surrounding `PathQuery` is an aggregate-count query (`has_aggregate_count_on_range_anywhere() == true`), and 2. The current level's query has a subquery / matching subquery_path step on this key. The recursion opens the empty merk, hits the aggregate-count short-circuit, runs `prove_aggregate_count_on_range` on the empty subtree (which returns an empty op stream), and emits a `LayerProof { merk_proof: ProofBytes::Merk(empty), .. }`. The verifier's `verify_count_leaf` reads empty bytes as `(NULL_HASH, 0)`, and the chain check `combine_hash(H(empty_tree_value), NULL_HASH) == parent_value_hash` holds because the parent committed the tree element with the same `NULL_HASH` child root. Non-aggregate-count queries keep their existing "empty tree = terminal result" semantics. The narrow arm fires *before* the general empty-tree arm and only matches on the two provable-count flavors, so behavior is unchanged for every other tree type. Adds `carrier_returns_zero_count_for_empty_leaf_subtree` — inserts `byBrand/brand_000/color` as an empty `ProvableCountTree`, runs a carrier query, and asserts the returned vector is `[(b"brand_000", 0)]` with matching root hash. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(grovedb): tighten query_aggregate_count to leaf-only validation `query_aggregate_count` is the no-proof counterpart of `verify_aggregate_count_query`: it returns a single `u64` and has no way to surface per-outer-key carrier counts. But the validator it called — `path_query.validate_aggregate_count_on_range` — is the top-level dispatcher that accepts BOTH leaf and carrier shapes. A carrier-shape path query would slip past validation here, then the subsequent `count_aggregate_on_range` call would either error (wrong merk tree type at `path_query.path`) or silently miscompute because the function only ever counts the inner range against the single subtree at `path_query.path`. Same fix we applied to `verify_aggregate_count_query` earlier: switch to `validate_leaf_aggregate_count_on_range`, which rejects the carrier shape up front. The docstring is updated to point carrier-shape callers at `verify_aggregate_count_query_per_key` (the proof-based per-key entry). New test `no_proof_rejects_carrier_shape` constructs a valid carrier path query, sanity-checks that the dispatcher-level validator still accepts it (so the rejection below is specifically from the leaf-only tightening), and asserts the no-proof entry returns `InvalidQuery` with no storage reads. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(grovedb): surface InvalidProof instead of panicking on classification drift `verify_v1_carrier_layer` was `.expect()`-ing that `classification.carrier_subquery_path` is `Some` whenever `carrier_outer_items` is `Some`. That invariant is held today by `classify_aggregate_count_path_query`, but a panic is the wrong failure mode if the invariant ever drifts — a verifier should surface verification failures, not abort. Replace with `ok_or_else` returning `Error::InvalidProof` with a contextual message. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(grovedb): add no-proof query_aggregate_count_per_key entry point Closes the leaf/carrier asymmetry on the no-proof side. Before this commit: proof path: verify_aggregate_count_query (leaf only, -> u64) verify_aggregate_count_query_per_key (leaf or carrier, -> Vec<(key, u64)>) no-proof: query_aggregate_count (leaf only, -> u64) <missing carrier sibling> A carrier-shape caller that didn't need a proof had to go through `prove_query` + `verify_aggregate_count_query_per_key` anyway — paying proof generation, serialization, and chain verification just to throw the proof bytes away. `query_aggregate_count_per_key` mirrors the proof-path surface (returns `Vec<(Vec<u8>, u64)>`) and accepts the same shapes: - Leaf path query: delegates to `query_aggregate_count` and wraps as a one-entry vec with an empty key — matches the proof-path leaf-symmetry contract. - Carrier path query: opens the carrier subtree via `query_raw` with a "shallow" path query (carrier's outer items, no subquery) to enumerate matched outer keys. For each match, walks `subquery_path` manually and calls `count_aggregate_on_range` on the leaf merk. Absent outer keys contribute no entry; outer keys whose leaf is an empty `ProvableCountTree` contribute `(key, 0)`. Validation is the dispatcher-level `validate_aggregate_count_on_range` (accepts both shapes); non-ACOR queries get `InvalidQuery` before any storage reads. Non-tree matches on the carrier level surface `InvalidQuery` rather than the merk-level `count_aggregate_on_range` error. Six tests pin the contract: - `no_proof_per_key_leaf_matches_single_count` — leaf shape returns the same `u64` as `query_aggregate_count`, wrapped. - `no_proof_per_key_carrier_returns_per_outer_count` — two outer brands, 500 colors each -> two entries. - `no_proof_per_key_skips_absent_outer_keys` — absent outer key contributes no entry. - `no_proof_per_key_empty_leaf_returns_zero` — outer key exists, leaf is empty `ProvableCountTree` -> `(key, 0)`. - `no_proof_per_key_matches_proof_path_per_key` — element-for-element equality with `verify_aggregate_count_query_per_key` on a non-trivial 3-brand carrier query. - `no_proof_per_key_rejects_non_aggregate_count_query` — non-ACOR path queries rejected with `InvalidQuery`. Reuses the existing `query_aggregate_count_on_range` version gate since this is another entry point for the same feature, not a distinct protocol change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

QuantumExplorer and others added 2 commits May 10, 2026 02:26

QuantumExplorer and others added 2 commits May 10, 2026 02:44

QuantumExplorer and others added 2 commits May 10, 2026 03:36

coderabbitai Bot reviewed May 9, 2026

View reviewed changes

Comment thread grovedb-query/src/query.rs

Comment thread merk/src/proofs/query/aggregate_count.rs

Comment thread merk/src/proofs/query/verify.rs Outdated

QuantumExplorer commented May 9, 2026

View reviewed changes

QuantumExplorer and others added 3 commits May 10, 2026 04:10

coderabbitai Bot reviewed May 9, 2026

View reviewed changes

Comment thread docs/book/src/aggregate-count-queries.md

QuantumExplorer commented May 9, 2026

View reviewed changes

QuantumExplorer and others added 2 commits May 10, 2026 05:22

QuantumExplorer merged commit 347bd9b into develop May 9, 2026
12 of 13 checks passed

QuantumExplorer deleted the feat/aggregate-count-query-item branch May 9, 2026 22:48

QuantumExplorer mentioned this pull request May 10, 2026

feat(merk,grovedb): expose aggregate_count proof verification under verify feature #658

Merged

7 tasks

This was referenced May 11, 2026

feat: add CountIndexedTree element with auto-cascading secondary index #657

Open

feat: add Element::ProvableSumTree + AggregateSumOnRange query #661

Open

Conversation

QuantumExplorer commented May 9, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Issue being fixed or feature implemented

What was done?

How Has This Been Tested?

Documentation

Breaking Changes

Checklist:

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Uh oh!

codecov Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

QuantumExplorer commented May 9, 2026

Uh oh!

QuantumExplorer commented May 9, 2026

Uh oh!

QuantumExplorer commented May 9, 2026

Uh oh!

QuantumExplorer commented May 9, 2026

Uh oh!

QuantumExplorer commented May 9, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

QuantumExplorer commented May 9, 2026

Uh oh!

QuantumExplorer left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

QuantumExplorer commented May 9, 2026

Uh oh!

QuantumExplorer May 9, 2026

Choose a reason for hiding this comment

Uh oh!

QuantumExplorer commented May 9, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

QuantumExplorer commented May 9, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 9, 2026 •

edited

Loading

codecov Bot commented May 9, 2026 •

edited

Loading