Skip to content

feat(grovedb,query): allow AggregateCountOnRange as carrier subquery#663

Merged
QuantumExplorer merged 15 commits into
developfrom
claude/stoic-ardinghelli-508a69
May 14, 2026
Merged

feat(grovedb,query): allow AggregateCountOnRange as carrier subquery#663
QuantumExplorer merged 15 commits into
developfrom
claude/stoic-ardinghelli-508a69

Conversation

@QuantumExplorer
Copy link
Copy Markdown
Member

@QuantumExplorer QuantumExplorer commented May 14, 2026

Summary

  • Allow AggregateCountOnRange (ACOR) as a subquery item under an outer Keys/Range* carrier query — the natural multi-outer-key extension of the existing single-leg ACOR. Returns one (outer_key, u64) pair per matched outer key while leaving the leaf wire format byte-identical.
  • Split Query::validate_aggregate_count_on_range into a leaf validator (existing semantics, renamed) and a new carrier validator, with the top-level dispatcher routing by query shape and returning the leaf inner range either way.
  • Add GroveDb::verify_aggregate_count_query_per_key(...) -> Result<(RootHash, Vec<(Vec<u8>, u64)>), Error> that accepts both leaf and carrier shapes (leaf collapses to a single-entry Vec with an empty key). The legacy verify_aggregate_count_query now strictly validates the leaf shape.

Why

Dash Platform's count-query module needs to answer

SELECT brand, COUNT(*) FROM widgets
WHERE brand IN ('brand_000', 'brand_001') AND color > 'color_00000500'
GROUP BY brand

against a byBrand / <brand> / color / <ProvableCountTree> layout. Today grovedb rejects the nested shape at Query::validate_aggregate_count_on_range, forcing callers to fan out one proof per outer key (k× larger proofs) or fall back to the unproven path. With this change the same GroveDB proof contains all k counts and is verified in one pass.

Asymptotic complexity for |outer matches| = k, outer tree B, leaf tree C: O(k · (log B + log C)).

Wire-format compat

  • Existing leaf-only ACOR proofs are byte-identical: drive's existing aggregate_count_path_query + verify_aggregate_count_query round-trip works without changes.
  • The new carrier shape is structurally different in the GroveDBProof envelope (multi-key outer merk proof + per-outer-key lower_layers) — it cannot be confused for a leaf proof.

What's NOT in this PR

  • Drive-side consumer changes (separate PR in dashpay/platform).
  • Conditional-subquery carriers (out of scope per spec).
  • Multi-In on prefix (IN × IN) — the carrier accepts ONE outer fan-out dimension.

Test plan

  • 10 new validation tests in grovedb-query exercising carrier rules: happy paths, RangeFull/ACOR outer rejection, missing/non-ACOR subquery rejection, conditional-branch rejection, empty-items rejection, Range-outer acceptance, dispatch behavior of the top-level validator.
  • 7 new integration tests in grovedb:
    • acor_subquery_two_outer_keys_succeeds — 3-deep tree, two outer brands, 1 000 colors each, RangeAfter on color leaf; expect 2 entries in lex-asc order with the correct per-brand count.
    • acor_subquery_with_unknown_outer_key_returns_present_keys_only — absent outer key contributes no entry (prover doesn't emit a lower layer for it).
    • acor_subquery_rejects_acor_at_both_levels — validation and prove-entry gate both reject.
    • acor_subquery_carrier_with_range_outer_succeeds — carrier with a RangeAfter outer item.
    • acor_leaf_unchanged_under_per_key_verifier — leaf ACOR returns a one-entry Vec with an empty key and the same count verify_aggregate_count_query returns.
    • acor_per_key_rejects_non_acor_path_query — non-ACOR path query rejected up front.
    • acor_subquery_count_forgery_is_caught — bumping a HashWithCount in the leaf bytes through the carrier envelope is rejected by the chain check.
  • cargo test --workspace --features 'minimal verify' — all 1516 lib tests pass.
  • cargo clippy -p grovedb -p grovedb-query --features minimal -- -D warnings — clean.
  • Drive-side integration (separate PR in dashpay/platform).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Per-key aggregate-count verification for carrier-shaped queries and clearer leaf vs carrier query validation
    • Better handling of empty leaf subtrees so carrier proofs can report zero counts per outer key
    • query API now enforces leaf-only shape when returning a single aggregate count
  • Bug Fixes

    • Require V1 proof envelopes for aggregate-count verification; V0 envelopes rejected with clear errors
  • Tests

    • Large expansion of unit and end-to-end tests covering carrier scenarios, validation, and forgery detection

Review Change Stack

Allow `AggregateCountOnRange` as a subquery item under an outer `Keys`/
`Range*` carrier query — the natural multi-outer-key extension of the
existing single-leg ACOR. The carrier returns one `(outer_key, u64)`
pair per matched outer key while leaving the leaf wire format and
existing entry points byte-identical.

Why: Dash Platform's count-query module needs to answer
`SELECT brand, COUNT(*) ... WHERE brand IN (...) GROUP BY brand`
against `byBrand / <brand> / color / <ProvableCountTree>` layouts.
Today grovedb rejects the nested shape at `Query::validate_*`,
forcing callers to either fan out one proof per outer key (k× larger
proofs) or fall back to the unproven path. With this change the same
GroveDB proof contains all k counts and is verified in one pass.

Changes:
- `grovedb-query`: split validation into a leaf validator (renamed
  from the old `validate_aggregate_count_on_range`) and a new carrier
  validator that requires `Key`/`Range*` outer items, a leaf-ACOR
  subquery, and no conditional branches. The top-level
  `validate_aggregate_count_on_range` now dispatches by query shape
  and returns the leaf inner range either way.
- `grovedb`: add `validate_leaf_aggregate_count_on_range` at
  SizedQuery/PathQuery for entry points that must reject the carrier.
  No prover changes needed — the existing recursion naturally walks
  outer items -> subquery_path -> leaf ACOR via `query_items_at_path`.
- `grovedb`: add `verify_aggregate_count_query_per_key` returning
  `(RootHash, Vec<(Vec<u8>, u64)>)`. Leaf queries surface as a single
  entry with empty key + the same count `verify_aggregate_count_query`
  returns. The legacy `verify_aggregate_count_query` now strictly
  validates the leaf shape; carrier queries must use the new entry.
- Tests: 10 new carrier-shape validation tests in grovedb-query and
  7 end-to-end carrier proof tests in grovedb (two-outer-key happy
  path, absent-outer-key returns present-only, both-levels rejection,
  Range-outer carrier, leaf symmetry, non-ACOR rejection, and count
  forgery rejection through the carrier envelope).

Asymptotic complexity for |outer matches| = k, outer tree B,
leaf tree C: O(k * (log B + log C)).

Out of scope (deferred): conditional-branch carriers, multi-In on
prefix (IN x IN), drive-side integration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 14, 2026

Warning

Rate limit exceeded

@QuantumExplorer has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 43 minutes and 34 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c6419b0e-1a71-4b13-9dd0-d50b485c4401

📥 Commits

Reviewing files that changed from the base of the PR and between 2bbf7d1 and d1cd151.

📒 Files selected for processing (3)
  • grovedb/src/operations/get/query.rs
  • grovedb/src/operations/proof/aggregate_count/per_key.rs
  • grovedb/src/tests/aggregate_count_query_tests.rs
📝 Walkthrough

Walkthrough

This PR splits AggregateCountOnRange handling into explicit leaf vs carrier validations, adds query-level ACOR constructors/detectors, gates proof generation to V1 for ACOR, and implements classification-driven V1 verification (leaf-chain and per-key carrier walkers) with extensive carrier-focused tests.

Changes

ACOR Leaf and Carrier Shape Separation

Layer / File(s) Summary
Query Construction, Detection, and Validation
grovedb-query/src/aggregate_count.rs, grovedb-query/src/lib.rs
New module provides Query::new_aggregate_count_on_range constructor, detection helpers (aggregate_count_on_range, has_aggregate_count_on_range_anywhere), and independent validators for leaf and carrier ACOR shapes with comprehensive unit tests.
Query Type Wrapper Validators & API gating
grovedb/src/query/mod.rs, grovedb/src/operations/get/query.rs
Adds SizedQuery::validate_leaf_aggregate_count_on_range and PathQuery::validate_leaf_aggregate_count_on_range, updates query_aggregate_count docs and runtime validation to enforce leaf-only shape at the single-u64 API.
Proof Generation Entry Point Version Gating
grovedb/src/operations/proof/generate.rs
Detects ACOR requests early in prove_query_non_serialized, validates shape immediately, rejects V0 for ACOR proofs, and ensures V1 generation includes empty-count lower layers for carrier queries.
Proof helpers and lower-chain enforcement
grovedb/src/operations/proof/aggregate_count/helpers.rs
Adds canonical GroveDBProof decoding with trailing-bytes check, merk-bytes guard, single-key and carrier-layer verification helpers, OuterMatch, and enforce_lower_chain.
Leaf-chain V1 verifier
grovedb/src/operations/proof/aggregate_count/leaf_chain.rs
Implements recursive V1 leaf-chain verification that validates successive single-key proofs and delegates final leaf count verification.
AggregateCount classification
grovedb/src/operations/proof/aggregate_count/classification.rs
Adds AggregateCountClassification and classify_aggregate_count_path_query to normalize leaf vs carrier shapes and extract inner-range/subquery_path.
Proof envelope glue / GroveDb entry points
grovedb/src/operations/proof/aggregate_count/mod.rs
Adds GroveDb::verify_aggregate_count_query (leaf) and verify_aggregate_count_query_per_key (leaf-or-carrier) with V1-envelope enforcement and dispatch to the new walkers.
Per-key carrier verifier and subquery descent
grovedb/src/operations/proof/aggregate_count/per_key.rs
Implements per-key V1 walker that executes carrier outer proofs, recurses subquery_path per matched outer key, and returns per-key counts in query order.
Carrier ACOR End-to-End Tests
grovedb/src/tests/aggregate_count_query_tests.rs
Adds extensive carrier-focused integration tests covering multi-key results, absent-key handling, nested ACOR rejection, Range outer-item support, ordering, proof tampering detection, V0-envelope rejection, zero-count behavior, corruption/trailing-bytes checks, and entry-point validations.

🎯 4 (Complex) | ⏱️ ~45 minutes

🐰 A tale of shapes now split so clear,
Leaf and carrier, both so near,
With V1 proof and query-path so deep,
Our counts and keys safe we shall keep!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the main change: enabling AggregateCountOnRange to function as a carrier subquery, which is the core feature addition enabling multi-outer-key aggregate count verification.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/stoic-ardinghelli-508a69

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 14, 2026

Codecov Report

❌ Patch coverage is 91.21996% with 95 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.76%. Comparing base (a917d92) to head (d1cd151).

Files with missing lines Patch % Lines
...db/src/operations/proof/aggregate_count/helpers.rs 81.76% 33 Missing ⚠️
...rovedb/src/operations/proof/aggregate_count/mod.rs 70.49% 18 Missing ⚠️
grovedb-query/src/aggregate_count.rs 96.71% 16 Missing ⚠️
...db/src/operations/proof/aggregate_count/per_key.rs 90.18% 16 Missing ⚠️
grovedb/src/operations/get/query.rs 83.58% 11 Missing ⚠️
grovedb/src/operations/proof/generate.rs 97.22% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #663      +/-   ##
===========================================
- Coverage    90.76%   90.76%   -0.01%     
===========================================
  Files          184      189       +5     
  Lines        55791    56407     +616     
===========================================
+ Hits         50639    51196     +557     
- Misses        5152     5211      +59     
Components Coverage Δ
grovedb-core 88.49% <86.72%> (-0.03%) ⬇️
merk 92.32% <ø> (ø)
storage 86.36% <ø> (ø)
commitment-tree 96.43% <ø> (ø)
mmr 96.76% <ø> (ø)
bulk-append-tree 89.14% <ø> (ø)
element 95.75% <ø> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
grovedb/src/tests/aggregate_count_query_tests.rs (2)

1656-1672: ⚡ Quick win

Make the ordering contract observable in the explicit-key test.

The test currently supplies brand_000 then brand_001 and asserts the same order, so it also passes if results preserve request order rather than canonical lexicographic order. Reversing the input keys would make the ordering guarantee real.

Suggested tweak
         let path_query = carrier_acor_path_query(
-            &[b"brand_000", b"brand_001"],
+            &[b"brand_001", b"brand_000"],
             QueryItem::RangeAfter(b"color_00499".to_vec()..),
         );
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@grovedb/src/tests/aggregate_count_query_tests.rs` around lines 1656 - 1672,
The test's ordering assertion isn't robust because it passes the keys in lex
order; change the carrier_acor_path_query call to provide the input keys in
reverse (e.g., pass b"brand_001" then b"brand_000") and update the subsequent
expectations for results[0].0 and results[1].0 accordingly so the test verifies
that GroveDb::verify_aggregate_count_query_per_key (used after prove_query)
returns canonical lexicographic ordering rather than preserving request order.

791-796: ⚡ Quick win

Assert the rejection reason, not just InvalidQuery.

This passes on any entry-point InvalidQuery, so a future regression could still look green even if the carrier/leaf-inner validation path changes. Pin the malformed-inner Key reason here as well.

Suggested tightening
-        assert!(
-            matches!(prove_result, Err(crate::Error::InvalidQuery(_))),
-            "carrier ACOR with malformed leaf-inner Key must be rejected at entry, got {:?}",
-            prove_result.map(|b| b.len())
-        );
+        match prove_result {
+            Err(crate::Error::InvalidQuery(msg)) => assert!(
+                msg.contains("AggregateCountOnRange may not wrap Key"),
+                "expected malformed inner ACOR rejection, got: {msg}"
+            ),
+            other => panic!(
+                "carrier ACOR with malformed leaf-inner Key must be rejected at entry, got {:?}",
+                other.map(|b| b.len())
+            ),
+        }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@grovedb/src/tests/aggregate_count_query_tests.rs` around lines 791 - 796, The
test currently only checks for Err(crate::Error::InvalidQuery(_)) when calling
prove_query; tighten it to assert the specific invalid-query reason for a
malformed leaf-inner Key (the carrier ACOR path) by matching the concrete
InvalidQuery variant produced (e.g.,
Err(crate::Error::InvalidQuery(crate::InvalidQueryReason::MalformedLeafInnerKey))
or the exact enum/variant used in your codebase), or by asserting the error's
description/message equals the expected malformed-inner-Key reason; update the
assert to match that specific variant/message instead of the wildcard
underscore.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@grovedb/src/operations/proof/aggregate_count.rs`:
- Around line 95-100: The code currently runs ACOR validation on the inner Query
by calling path_query.query.query.validate_leaf_aggregate_count_on_range(),
which skips SizedQuery pagination checks (limit/offset); change the validation
call to run on the SizedQuery/PathQuery level (e.g., call
validate_leaf_aggregate_count_on_range on path_query.query or otherwise ensure
the SizedQuery wrapper is validated) and keep mapping errors via
crate::query_validation_error_to_invalid_query; apply the same change to the
other aggregate-count verification sites in this file that mirror this pattern
(the later blocks referencing inner_range/path_query and
validate_leaf_aggregate_count_on_range).
- Around line 140-141: The doc comments in aggregate_count.rs claim the returned
vector is "ascending lexicographic order" but the implementation respects the
query's left_to_right direction, so update the docs (in the comments around the
vector/return description in aggregate_count.rs and the similar blocks at the
other noted locations) to say "in query-direction (left_to_right) order" instead
of "ascending lexicographic order", or alternatively change the implementation
to sort the per-key result vector into ascending lexicographic order before
returning; mention or use the existing left_to_right flag in your change so the
behavior is explicit.

---

Nitpick comments:
In `@grovedb/src/tests/aggregate_count_query_tests.rs`:
- Around line 1656-1672: The test's ordering assertion isn't robust because it
passes the keys in lex order; change the carrier_acor_path_query call to provide
the input keys in reverse (e.g., pass b"brand_001" then b"brand_000") and update
the subsequent expectations for results[0].0 and results[1].0 accordingly so the
test verifies that GroveDb::verify_aggregate_count_query_per_key (used after
prove_query) returns canonical lexicographic ordering rather than preserving
request order.
- Around line 791-796: The test currently only checks for
Err(crate::Error::InvalidQuery(_)) when calling prove_query; tighten it to
assert the specific invalid-query reason for a malformed leaf-inner Key (the
carrier ACOR path) by matching the concrete InvalidQuery variant produced (e.g.,
Err(crate::Error::InvalidQuery(crate::InvalidQueryReason::MalformedLeafInnerKey))
or the exact enum/variant used in your codebase), or by asserting the error's
description/message equals the expected malformed-inner-Key reason; update the
assert to match that specific variant/message instead of the wildcard
underscore.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f96586f0-6339-4473-8c8c-6234f3da9593

📥 Commits

Reviewing files that changed from the base of the PR and between a917d92 and f5e39e3.

📒 Files selected for processing (5)
  • grovedb-query/src/query.rs
  • grovedb/src/lib.rs
  • grovedb/src/operations/proof/aggregate_count.rs
  • grovedb/src/query/mod.rs
  • grovedb/src/tests/aggregate_count_query_tests.rs

Comment thread grovedb/src/operations/proof/aggregate_count.rs Outdated
Comment thread grovedb/src/operations/proof/aggregate_count.rs Outdated
QuantumExplorer and others added 5 commits May 15, 2026 04:42
- Move ACOR construction/validation out of `grovedb-query/src/query.rs`
  (it had grown past 1500 lines) into a focused
  `grovedb-query/src/aggregate_count.rs` with its own `impl Query` block
  and tests. `query.rs` shrinks from ~1580 to ~910 lines.

- Per CodeRabbit major: route verifier validation through the
  PathQuery layer so `SizedQuery::limit`/`offset` are rejected up
  front for both leaf and carrier shapes. New tests pin the
  pagination rejection.

- V0 proof envelopes can't carry the carrier shape (they're produced
  only by pre-feature grove versions); the new `verify_v0_with_classification`
  rejects carrier classification with `InvalidProof("only supported on V1
  proof envelopes")` instead of attempting a never-emitted carrier walk.
  Removes the unused `verify_v0_carrier_layer` / `verify_v0_subquery_path`.

- Per CodeRabbit minor: ordering doc on
  `verify_aggregate_count_query_per_key` now says "query-direction
  order" (matches the merk walker) instead of asserting strict
  ascending lex, since the carrier honors `Query::left_to_right`.

- Per CodeRabbit nitpick: pin the malformed-leaf-inner-Key rejection
  message in the carrier path so future refactors can't silently
  re-route the error.

- New end-to-end tests: V0+carrier rejection, long subquery_path
  (2 intermediate single-key layers), corrupted outer-layer merk
  bytes, undecodable envelope, legacy `verify_aggregate_count_query`
  rejection of carrier shapes, carrier pagination rejection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Patch coverage was 89.86% (just under codecov's 90% target). The
biggest gaps were untested error paths in the carrier verifier and
a few carrier-validator branches the top-level dispatcher masks.

New tests:

grovedb (carrier proof verifier error paths):
- `acor_carrier_missing_outer_lower_layer_is_rejected` — drops one
  `lower_layers[outer_key]` entry from a real envelope; verifier
  must surface "missing lower layer for outer key".
- `acor_carrier_missing_subquery_path_layer_is_rejected` — drops the
  subquery_path "color" layer; exercises `verify_v1_subquery_path`'s
  missing-layer branch.
- `acor_carrier_non_merk_proof_bytes_is_rejected` — swaps a
  `ProofBytes::Merk(...)` layer for `ProofBytes::MMR(...)`; the
  `expect_merk_bytes` helper rejects.

grovedb-query (direct carrier-validator branches):
- `validate_carrier_acor_direct_rejects_missing_subquery`
- `validate_carrier_acor_direct_rejects_acor_outer_item`
- `validate_carrier_acor_direct_rejects_range_full_outer`

These call `validate_carrier_aggregate_count_on_range` directly to
hit branches the top-level dispatcher routes around (the dispatcher
classifies first and sends ACOR-outer-item shapes to the leaf
validator).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rely

V0 proof envelopes (`GroveDBProofV0` / `MerkOnlyLayerProof`) predate
the ACOR feature — ACOR was introduced in #656, well after V1
envelopes became the default for grove versions used by Dash Platform
v12+. The V0+ACOR combination is impossible in any deployed Platform
release, so the previous compatibility shims are dead code.

This change makes that explicit:

- Prover entry: `prove_query_non_serialized` rejects ACOR queries
  paired with grove versions that dispatch to V0 with
  `Error::NotSupported("requires V1 proof envelopes")`. The check
  fires before any work happens, so callers can't accidentally emit
  a V0 ACOR proof.

- Verifier entries: both `verify_aggregate_count_query` (leaf) and
  `verify_aggregate_count_query_per_key` (leaf or carrier) refuse
  `GroveDBProof::V0` envelopes via a shared `require_v1_envelope`
  helper, surfacing `Error::InvalidProof`. A forged V0 envelope
  carrying ACOR-looking bytes can't even reach the verification logic.

- Removed dead code: `verify_v0_leaf_chain` and
  `verify_v0_with_classification` are deleted. `MerkOnlyLayerProof`
  and `GroveDBProofV0` are no longer imported by `aggregate_count.rs`.

- Tests updated: `provable_count_tree_works_on_grove_v2_envelope`
  inverts to `aggregate_count_rejects_grove_v2_envelope`, asserting
  the prover refuses the combination. `acor_carrier_rejects_v0_envelope`
  similarly tightens its assertion to expect `NotSupported` from the
  prover instead of `InvalidProof` from the verifier.

The V0 short-circuit in `prove_subqueries` (the v0 path) stays as
defense-in-depth — unreachable through the entry gate but a safety
net for any future internal caller that bypasses
`prove_query_non_serialized`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The carrier-layer verifier was hardcoding `left_to_right=true` when
calling `merk::execute_proof`, even though the rest of the carrier
classification (`AcorClassification::carrier_left_to_right`) and the
prover (`generate_merk_proof(..., query.left_to_right, ...)`) both
honor the query's direction. Consequence: a carrier query with
`left_to_right=false` would produce a correct multi-key proof
emitted in reverse order, but the verifier would feed it through
`execute_proof` walking left-to-right and discover only the last
matched key (the merk walker terminates as soon as the first
"impossible direction" boundary is hit), returning a single result
instead of all matches.

The single-key descent (`verify_single_key_layer_proof_v0`) is
unaffected because it's exactly-one-key, so order doesn't matter.

The fix is a one-line change: pass `left_to_right` through to
`execute_proof` in `execute_carrier_layer_proof`. Adds a new test
`acor_subquery_right_to_left_returns_descending_order` that builds
a 3-brand carrier with `left_to_right=false` and asserts results
come back in descending lex order (brand_002, brand_001, brand_000),
each with the correct per-brand count.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Strip the "CRITICAL:" framing and the hypothetical "hardcoding true
would..." — a short factual note about what would actually go wrong
is enough.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
grovedb/src/operations/proof/aggregate_count.rs (1)

261-270: 💤 Low value

Trailing-byte tolerance in proof decoding.

bincode::decode_from_slice returns the number of bytes consumed but the value is discarded, so any trailing bytes appended to a valid proof are silently ignored. Since the resulting (RootHash, count) is still bound by the cryptographic chain check, this is not a correctness/security issue, but it does make the encoding non-canonical (multiple byte-strings can verify to the same result). If strict canonical-encoding is desired here, consider rejecting proofs whose consumed != proof.len().

♻️ Optional canonical-encoding tweak
-    let (proof, _) = bincode::decode_from_slice(proof, config)
-        .map_err(|e| Error::CorruptedData(format!("unable to decode proof: {}", e)))?;
-    Ok(proof)
+    let (decoded, consumed) = bincode::decode_from_slice(proof, config)
+        .map_err(|e| Error::CorruptedData(format!("unable to decode proof: {}", e)))?;
+    if consumed != proof.len() {
+        return Err(Error::CorruptedData(format!(
+            "proof has {} trailing bytes after decoding",
+            proof.len() - consumed
+        )));
+    }
+    Ok(decoded)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@grovedb/src/operations/proof/aggregate_count.rs` around lines 261 - 270,
decode_grovedb_proof currently discards the byte-count returned by
bincode::decode_from_slice which allows trailing bytes to be ignored and makes
the encoding non-canonical; change decode_grovedb_proof to capture the returned
consumed length from bincode::decode_from_slice(proof, config) and if consumed
!= proof.len() return an Error::CorruptedData (or a new descriptive error)
indicating trailing bytes, so only proofs that consume the entire input are
accepted.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@grovedb/src/operations/proof/aggregate_count.rs`:
- Around line 261-270: decode_grovedb_proof currently discards the byte-count
returned by bincode::decode_from_slice which allows trailing bytes to be ignored
and makes the encoding non-canonical; change decode_grovedb_proof to capture the
returned consumed length from bincode::decode_from_slice(proof, config) and if
consumed != proof.len() return an Error::CorruptedData (or a new descriptive
error) indicating trailing bytes, so only proofs that consume the entire input
are accepted.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c8ccea73-86cd-40f1-bdfc-f9828934a20a

📥 Commits

Reviewing files that changed from the base of the PR and between f5e39e3 and d8ddda5.

📒 Files selected for processing (7)
  • grovedb-query/src/aggregate_count.rs
  • grovedb-query/src/lib.rs
  • grovedb-query/src/query.rs
  • grovedb/src/operations/proof/aggregate_count.rs
  • grovedb/src/operations/proof/generate.rs
  • grovedb/src/query/mod.rs
  • grovedb/src/tests/aggregate_count_query_tests.rs
💤 Files with no reviewable changes (1)
  • grovedb-query/src/query.rs
✅ Files skipped from review due to trivial changes (1)
  • grovedb-query/src/lib.rs

QuantumExplorer and others added 7 commits May 15, 2026 05:34
The "ACOR" / "Acor" shorthand baked count into every identifier and
prose mention — but the leaf/carrier shape generalizes to forthcoming
aggregate variants (sum, average, …), and "ASOR" / "AAOR" don't make
that lineage obvious. This commit drops the abbreviation so future
aggregate modules can use parallel naming without inheriting a
count-specific prefix.

Renames:

- `AcorClassification` → `AggregateCountClassification`
- `classify_path_query` → `classify_aggregate_count_path_query`
- `acor_subquery_*` test functions → `carrier_*` (the
  `aggregate_count_query_tests` module already scopes them)
- `acor_carrier_*` → `carrier_*`
- `acor_leaf_*` → `leaf_*`
- `acor_per_key_*` → `per_key_*`
- `carrier_acor_path_query` helper → `carrier_count_path_query`
- `validate_acor_*` → `validate_aggregate_count_*`
- `validate_carrier_acor_*` → `validate_carrier_aggregate_count_*`
- `make_acor_query` → `make_aggregate_count_query`
- `make_leaf_acor_subquery` → `make_leaf_aggregate_count_subquery`
- `inner_acor*` → `inner_aggregate_count*`
- `bad_inner_acor` → `bad_inner_aggregate_count`

Docs/comments/error-message prose: "ACOR" → "aggregate-count" (or the
full `AggregateCountOnRange` when referring to the QueryItem). Module
docs note that future variants will live in sibling modules
(`aggregate_sum`, `aggregate_average`, …) with parallel naming.

No behavioral changes — pure rename. All 62 grovedb + 28 grovedb-query
aggregate-count tests still pass; workspace tests + clippy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The single 750-line `aggregate_count.rs` had grown to cover four
distinct concerns: public API, classification, the leaf-only chain
walker, and the per-key carrier walker (plus a pile of shared
helpers). Split into a directory module with one file per concern.

```
grovedb/src/operations/proof/aggregate_count/
├── mod.rs            227 lines — public API (impl GroveDb,
│                                  verify_aggregate_count_query +
│                                  verify_aggregate_count_query_per_key),
│                                  require_v1_envelope, submodule wiring
├── classification.rs  77 lines — AggregateCountClassification struct +
│                                  classify_aggregate_count_path_query
├── leaf_chain.rs      77 lines — verify_v1_leaf_chain (legacy
│                                  single-u64 walker)
├── per_key.rs        221 lines — per-key carrier walker
│                                  (with_classification, _per_key,
│                                  _carrier_layer, _subquery_path)
└── helpers.rs        270 lines — shared utilities (decode envelope,
                                  verify_count_leaf, expect_merk_bytes,
                                  verify_single_key_layer_proof_v0,
                                  OuterMatch + execute_carrier_layer_proof,
                                  enforce_lower_chain)
```

Visibility is `pub(super)` everywhere — the submodules are crate
implementation detail, only `verify_aggregate_count_query` and
`verify_aggregate_count_query_per_key` are public.

Pure code reorganization — no behavior change. All 62 grovedb +
28 grovedb-query aggregate-count tests still pass; workspace +
clippy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`bincode::decode_from_slice` returns the number of bytes consumed but
we were discarding it, so a valid proof with arbitrary trailing bytes
appended would still decode to the same envelope and verify
successfully. The chain-bound cryptographic check still gives the
same `(RootHash, count)`, so this isn't a correctness/security
problem — but it does mean a single logical proof has infinitely
many byte encodings, which breaks any equality-by-bytes assumption
(caching, deduplication, hashing the proof itself).

`decode_grovedb_proof` now compares `consumed` to `proof.len()` and
rejects with `Error::CorruptedData("trailing bytes after the encoded
envelope")` when they differ.

Test `aggregate_count_proof_with_trailing_bytes_is_rejected` pins
the rejection at both entry points (`verify_aggregate_count_query`
and `verify_aggregate_count_query_per_key`) with a real proof +
one appended zero byte.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The original spec explicitly defers the "Range × Range × ACOR" /
"IN × IN on prefix" composition: a carrier whose subquery is itself
another carrier (rather than a leaf `AggregateCountOnRange`). The
carrier validator already rejects this — it calls
`validate_leaf_aggregate_count_on_range` on the subquery, which
catches the inner carrier's outer items — but we didn't have a test
documenting that contract.

Two new tests, one per layer:

- `grovedb-query::validate_carrier_aggregate_count_rejects_nested_carrier`
  drives the static validator with `outer.set_subquery(inner_carrier)`
  and asserts `InvalidOperation`.

- `grovedb::rejects_nested_carrier_range_range_aggregate_count` builds
  a full `PathQuery` with the same shape and asserts both the static
  `PathQuery::validate_aggregate_count_on_range` and the prover's
  entry-point gate (`prove_query` → `InvalidQuery`) refuse.

If we ever decide to support the nested shape, these tests are the
contract that has to change first.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an explicit end-to-end test that mirrors the canonical SQL
aggregate-count query

    SELECT COUNT(*) FROM t WHERE a = 1 AND b > 4 AND c > 4

against an `(a, b, c)`-indexed grove laid out as

    TEST_LEAF / byA / <a_val> / byB / <b_val> / byC / <count_tree>

The mapping to the carrier ACOR shape is:
- `A = 1` is a fixed prefix → `path_query.path`
- `B > 4` is the variable outer dimension → carrier's `RangeAfter`
- per matched `B`, walk `byC` → carrier's `subquery_path`
- `COUNT(C > 4)` is the leaf aggregate-count subquery

The tree contains both `A = 1` and `A = 2` so we also confirm the
fixed prefix actually scopes the result (only `A = 1` rows count).

Expected: two `(b_val, 5)` entries — `b_5` and `b_7` each have 5
`C > c_4` matches.

This shape was already supported via the existing carrier
machinery — the test just makes the SQL → grove mapping obvious
for callers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a carrier `AggregateCountOnRange` query's descent reached an
empty `Element::ProvableCountTree(None, ..)` or
`ProvableCountSumTree(None, ..)`, the V1 prover treated it as a
terminal "tree with no children" result and inserted no
`lower_layer` for that step. The carrier verifier then rejected the
proof at the chain walk with `"carrier aggregate-count proof
missing subquery_path layer for key …"`. The correct behavior is to
return `count = 0` for that outer key — an existing empty leaf
count tree is still a valid match.

The fix is a new, narrow arm in `prove_subqueries_v1` that recurses
into empty `Provable{Count,CountSum}Tree(None, ..)` elements when
both:

1. The surrounding `PathQuery` is an aggregate-count query
   (`has_aggregate_count_on_range_anywhere() == true`), and
2. The current level's query has a subquery / matching subquery_path
   step on this key.

The recursion opens the empty merk, hits the aggregate-count
short-circuit, runs `prove_aggregate_count_on_range` on the empty
subtree (which returns an empty op stream), and emits a
`LayerProof { merk_proof: ProofBytes::Merk(empty), .. }`. The
verifier's `verify_count_leaf` reads empty bytes as `(NULL_HASH, 0)`,
and the chain check
`combine_hash(H(empty_tree_value), NULL_HASH) == parent_value_hash`
holds because the parent committed the tree element with the same
`NULL_HASH` child root.

Non-aggregate-count queries keep their existing "empty tree =
terminal result" semantics. The narrow arm fires *before* the
general empty-tree arm and only matches on the two provable-count
flavors, so behavior is unchanged for every other tree type.

Adds `carrier_returns_zero_count_for_empty_leaf_subtree` —
inserts `byBrand/brand_000/color` as an empty
`ProvableCountTree`, runs a carrier query, and asserts the
returned vector is `[(b"brand_000", 0)]` with matching root hash.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`query_aggregate_count` is the no-proof counterpart of
`verify_aggregate_count_query`: it returns a single `u64` and has no
way to surface per-outer-key carrier counts. But the validator it
called — `path_query.validate_aggregate_count_on_range` — is the
top-level dispatcher that accepts BOTH leaf and carrier shapes. A
carrier-shape path query would slip past validation here, then the
subsequent `count_aggregate_on_range` call would either error
(wrong merk tree type at `path_query.path`) or silently miscompute
because the function only ever counts the inner range against the
single subtree at `path_query.path`.

Same fix we applied to `verify_aggregate_count_query` earlier:
switch to `validate_leaf_aggregate_count_on_range`, which rejects
the carrier shape up front. The docstring is updated to point
carrier-shape callers at
`verify_aggregate_count_query_per_key` (the proof-based per-key
entry).

New test `no_proof_rejects_carrier_shape` constructs a valid
carrier path query, sanity-checks that the dispatcher-level
validator still accepts it (so the rejection below is specifically
from the leaf-only tightening), and asserts the no-proof entry
returns `InvalidQuery` with no storage reads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@grovedb/src/operations/proof/aggregate_count/per_key.rs`:
- Around line 132-136: Replace the panic caused by
classification.carrier_subquery_path.as_ref().expect(...) with error
propagation: change to an ok_or_else that returns Error::InvalidProof with a
clear contextual message (e.g. "missing carrier_subquery_path in
classification") so the verifier treats this as a verification failure instead
of aborting; where subsequent fallible calls are made on that path, wrap/map
errors with context using .map_err(|e| Error::CorruptedData(format!("context:
{}", e))) as per the coding guideline so all failures are returned with useful
context.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4a72e90e-06f9-4f51-ad86-08bd46a9d9e6

📥 Commits

Reviewing files that changed from the base of the PR and between d8ddda5 and 2bbf7d1.

📒 Files selected for processing (10)
  • grovedb-query/src/aggregate_count.rs
  • grovedb/src/operations/get/query.rs
  • grovedb/src/operations/proof/aggregate_count.rs
  • grovedb/src/operations/proof/aggregate_count/classification.rs
  • grovedb/src/operations/proof/aggregate_count/helpers.rs
  • grovedb/src/operations/proof/aggregate_count/leaf_chain.rs
  • grovedb/src/operations/proof/aggregate_count/mod.rs
  • grovedb/src/operations/proof/aggregate_count/per_key.rs
  • grovedb/src/operations/proof/generate.rs
  • grovedb/src/tests/aggregate_count_query_tests.rs
💤 Files with no reviewable changes (1)
  • grovedb/src/operations/proof/aggregate_count.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • grovedb-query/src/aggregate_count.rs
  • grovedb/src/tests/aggregate_count_query_tests.rs

Comment thread grovedb/src/operations/proof/aggregate_count/per_key.rs
QuantumExplorer and others added 2 commits May 15, 2026 06:31
…tion drift

`verify_v1_carrier_layer` was `.expect()`-ing that
`classification.carrier_subquery_path` is `Some` whenever
`carrier_outer_items` is `Some`. That invariant is held today by
`classify_aggregate_count_path_query`, but a panic is the wrong
failure mode if the invariant ever drifts — a verifier should
surface verification failures, not abort. Replace with `ok_or_else`
returning `Error::InvalidProof` with a contextual message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the leaf/carrier asymmetry on the no-proof side. Before this
commit:

  proof path: verify_aggregate_count_query (leaf only, -> u64)
              verify_aggregate_count_query_per_key (leaf or carrier,
                                                    -> Vec<(key, u64)>)
  no-proof:   query_aggregate_count (leaf only, -> u64)
              <missing carrier sibling>

A carrier-shape caller that didn't need a proof had to go through
`prove_query` + `verify_aggregate_count_query_per_key` anyway —
paying proof generation, serialization, and chain verification just
to throw the proof bytes away.

`query_aggregate_count_per_key` mirrors the proof-path surface
(returns `Vec<(Vec<u8>, u64)>`) and accepts the same shapes:

- Leaf path query: delegates to `query_aggregate_count` and wraps
  as a one-entry vec with an empty key — matches the proof-path
  leaf-symmetry contract.

- Carrier path query: opens the carrier subtree via `query_raw`
  with a "shallow" path query (carrier's outer items, no subquery)
  to enumerate matched outer keys. For each match, walks
  `subquery_path` manually and calls `count_aggregate_on_range` on
  the leaf merk. Absent outer keys contribute no entry; outer keys
  whose leaf is an empty `ProvableCountTree` contribute `(key, 0)`.

Validation is the dispatcher-level
`validate_aggregate_count_on_range` (accepts both shapes); non-ACOR
queries get `InvalidQuery` before any storage reads. Non-tree
matches on the carrier level surface `InvalidQuery` rather than the
merk-level `count_aggregate_on_range` error.

Six tests pin the contract:
- `no_proof_per_key_leaf_matches_single_count` — leaf shape returns
  the same `u64` as `query_aggregate_count`, wrapped.
- `no_proof_per_key_carrier_returns_per_outer_count` — two outer
  brands, 500 colors each -> two entries.
- `no_proof_per_key_skips_absent_outer_keys` — absent outer key
  contributes no entry.
- `no_proof_per_key_empty_leaf_returns_zero` — outer key exists,
  leaf is empty `ProvableCountTree` -> `(key, 0)`.
- `no_proof_per_key_matches_proof_path_per_key` — element-for-element
  equality with `verify_aggregate_count_query_per_key` on a
  non-trivial 3-brand carrier query.
- `no_proof_per_key_rejects_non_aggregate_count_query` — non-ACOR
  path queries rejected with `InvalidQuery`.

Reuses the existing `query_aggregate_count_on_range` version gate
since this is another entry point for the same feature, not a
distinct protocol change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@QuantumExplorer
Copy link
Copy Markdown
Member Author

Self Reviewed.

@QuantumExplorer QuantumExplorer merged commit 8755418 into develop May 14, 2026
10 of 11 checks passed
@QuantumExplorer QuantumExplorer deleted the claude/stoic-ardinghelli-508a69 branch May 14, 2026 23:51
QuantumExplorer added a commit to dashpay/platform that referenced this pull request May 15, 2026
…+ bench probe

Pulls in dashpay/grovedb#663 ("feat(grovedb,query): allow AggregateCountOnRange
as carrier subquery"), merge commit 87554188ad — bumps every pinned
`grovedb*` dep across the workspace from a917d92d to 87554188.

The new grovedb APIs:
- `Query::validate_carrier_aggregate_count_on_range`
- `Query::validate_leaf_aggregate_count_on_range`
- `GroveDb::query_aggregate_count_per_key`
- `GroveDb::verify_aggregate_count_query_per_key`

These unblock the chapter 30 G7 shape (`brand IN[...] AND color > floor` with
`group_by = [brand]`) at the grovedb layer. Drive wire-up is the next step
(lift mode_detection.rs:140 rejection + new path-query builder + new
DocumentCountMode variant + per-key verifier wrapper).

This commit also adds `probe_carrier_acor` to document_count_worst_case bench
as a feasibility smoke test against the existing 100 000-row widget fixture:

  [carrier-acor] no-proof entries (2):
    ("brand_000", 499)
    ("brand_001", 499)
  [carrier-acor] proof bytes: 4332 B
  [carrier-acor] verified root_hash: 0x62ee7348f4d28dd9d7cf86a6c725fa8276...
  [carrier-acor] verified entries (2):
    ("brand_000", 499)
    ("brand_001", 499)

4 332 B for k=2, matching the complexity estimate documented in chapter 30
(`O(k · (log B + log C'))`) and ~17 % smaller than the concatenated-leaf-ACOR
alternative.

Verification:
- cargo check --workspace clean
- cargo test -p drive --lib drive_document_count_query → 45 tests passing
- cargo test -p drive --lib verify → 240 tests passing
- All chapter 29 / chapter 30 proof sizes (585 / 1041 / 1327 / 1911 / 1102 /
  1381 / 2072 / 2656 B for Q1..Q8) unchanged — bump is purely additive
- Carrier-ACOR end-to-end: query → prove → verify all round-trip with root
  hash matching the chapter's known 0x62ee7348... fixture root

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
QuantumExplorer added a commit that referenced this pull request May 15, 2026
…0 envelopes

V0 (`MerkOnlyLayerProof`) envelopes predate the aggregate-sum feature and
cannot legitimately carry an aggregate-sum proof, so the V0 layer walker
was unreachable in any honestly-produced proof. Mirror the count-side
changes from PR #663:

  - Convert `aggregate_sum.rs` into a subdirectory mirroring
    `aggregate_count/`: `mod.rs` (public API + V0 rejection),
    `helpers.rs` (envelope decode, single-key layer verification,
    chain enforcement, leaf sum verification), `leaf_chain.rs` (V1
    leaf-chain walker). Removes the dead `verify_v0_layer` path.

  - Add a prover-side V0 gate in `prove_query_non_serialized`: when
    grove_version dispatches to V0 and the path query carries an
    `AggregateSumOnRange` anywhere, return `NotSupported` instead of
    emitting a V0 envelope the verifier would (correctly) reject.

  - Update tests: replace the V0 round-trip test with a V0-rejection
    test; broaden the empty-leaf type-confusion test to accept either
    the V0-rejection or the terminal-type-gate error; remove the now-
    unreachable V0 missing-lower-layer test (V1 counterpart already
    pins the missing-layer behavior); refresh stale doc-comments that
    pointed at `aggregate_sum.rs` line numbers.

No carrier-shape support yet — `AggregateSumOnRange` is still leaf-only
on the merk side, so there is no `classification.rs` / `per_key.rs`
mirror. The leaf-only shape can be extended later in parallel with
matching merk-level work, just like the count side did.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant