feat(merk,grovedb): add no-proof query_aggregate_count entry point#662
Conversation
Adds an O(log n) execution variant of `AggregateCountOnRange` that returns the count directly, without producing or verifying a proof. Server-side handlers answering `prove=false` count requests no longer need to compute a proof just to discard it. - merk: `Merk::count_aggregate_on_range` walks the same classification path as `prove_aggregate_count_on_range` (Contained / Disjoint / Boundary, using each node's stored aggregate count to short-circuit fully-inside / fully-outside subtrees) but skips proof-op emission. NonCounted-correctness is preserved via the same `own_count = node_count − left − right` derivation the prover uses. - grovedb: `GroveDb::query_aggregate_count` validates the PathQuery shape via `validate_aggregate_count_on_range`, opens the leaf merk at the given path, and delegates to the merk-level walk. Tree-type rejection (`ProvableCountTree` / `ProvableCountSumTree` only) is enforced at the merk entry. - version: new `query_aggregate_count_on_range` field on `GroveDBOperationsQueryVersions`, wired through v1/v2/v3 at version 0. The returned count is not independently verifiable — callers trust their own merk read path. For a verifiable count, callers continue to use `prove_query` + `verify_aggregate_count_query`. Tests: 6 merk-level tests cross-check the no-proof count against the prover's count across all range variants, the empty-merk and wrong-tree-type cases. 11 GroveDB-level tests cover the public API on single- and three-layer paths, all range variants, empty trees, malformed inner ranges, and the NormalTree rejection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThis PR adds a no-proof variant of aggregate count on range queries to GroveDB. It extends version management, implements counting traversal in the merk layer, exposes a public ChangesNo-Proof Aggregate Count Query
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #662 +/- ##
===========================================
+ Coverage 90.74% 90.76% +0.01%
===========================================
Files 184 184
Lines 55532 55791 +259
===========================================
+ Hits 50395 50639 +244
- Misses 5137 5152 +15
🚀 New features to boost your workflow:
|
Addresses codecov/patch feedback (PR #662 patch coverage was 87.89%, target 90%): 1. Refactor walk_count_only to collapse error branches: - Extract `provable_count_from_walker` helper to share the aggregate_data + provable_count_from_aggregate error mapping between the Contained-leaf and Boundary positions. - Replace match-on-Option<RefWalker> with if-let-Some so the "link is Some but walk returned None" arm — defensive, unreachable in practice — is no longer counted as an uncovered branch. 2. Drop the redundant tree_type check on `RefWalker::count_aggregate_on_range`: the caller (`Merk::count_aggregate_on_range`) already validates, and the per-node `provable_count_from_aggregate` check catches any mismatch between declared and in-memory type. 3. Add positive tests for more code paths: - merk: ProvableCountSumTree happy-path, RangeAfter, RangeTo, RangeToInclusive, RangeAfterToInclusive (each cross-checks against the prover). - grovedb: transactional read (`TransactionArg = Some(&tx)`) and the path-not-found error path (`open_transactional_merk_at_path` error arm). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t.rs
The no-proof aggregate-count walk does not produce a proof — it's a
read operation that happens to share its tree-walking pattern with
`prove_aggregate_count_on_range`. `prove.rs` is documented as
"Generating Merkle proofs for queries against a Merk tree"
(merk/src/merk/mod.rs:47), so the function fits more naturally in
`get.rs` ("Getting values by key from a Merk tree") alongside the
other read entry points.
No functional change — just file relocation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
grovedb/src/tests/aggregate_count_query_tests.rs (1)
1443-1459: ⚡ Quick winMake the transaction test observe uncommitted state.
Starting the transaction after all data is already committed only proves that
Some(&tx)doesn't error. This still passes ifquery_aggregate_countignores the transaction and reads from the base view. A small uncommitted insert/delete insidetx, plus aNonevsSome(&tx)assertion, would turn this into a real regression test for transaction threading.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@grovedb/src/tests/aggregate_count_query_tests.rs` around lines 1443 - 1459, The test no_proof_uses_provided_transaction currently starts the transaction after all data is committed, so change it to make the transaction observe uncommitted state: after calling let tx = db.start_transaction() perform a mutating operation inside that transaction (e.g., insert or remove a key under the same path used by PathQuery via the transaction API) that will change the expected aggregate count by ±1, then call grove_db.query_aggregate_count(&path_query, Some(&tx), v) and assert the returned count reflects the uncommitted change, and also call grove_db.query_aggregate_count(&path_query, None, v) (or before the tx mutation) to assert the base view does not include the change; use the existing symbols tx, query_aggregate_count, PathQuery::new_aggregate_count_on_range and TEST_LEAF to locate where to add the transactional insert/delete and the paired assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@merk/src/proofs/query/aggregate_count.rs`:
- Around line 524-580: The code currently treats impossible mismatches as silent
undercounts; change both child-walk branches and the own_count computation to
fail fast: for the left/right walker, replace the if let Some(...) pattern with
a match on walker.walk(...) and if it returns None while the corresponding link
is present, propagate an error instead of skipping (use the same error
propagation mechanism as cost_return_on_error! to return a corrupted-state
error); likewise, after computing own_count =
node_count.saturating_sub(left_link_aggregate).saturating_sub(right_link_aggregate),
detect if node_count < left_link_aggregate + right_link_aggregate and return an
error rather than clamping — update the logic in walk_count_only / walker.walk
usage sites (functions/symbols: walker.walk, walk_count_only,
cost_return_on_error!, node_count, left_link_aggregate, right_link_aggregate,
total, range.contains(&node_key)) so callers no longer receive silently
truncated totals.
---
Nitpick comments:
In `@grovedb/src/tests/aggregate_count_query_tests.rs`:
- Around line 1443-1459: The test no_proof_uses_provided_transaction currently
starts the transaction after all data is committed, so change it to make the
transaction observe uncommitted state: after calling let tx =
db.start_transaction() perform a mutating operation inside that transaction
(e.g., insert or remove a key under the same path used by PathQuery via the
transaction API) that will change the expected aggregate count by ±1, then call
grove_db.query_aggregate_count(&path_query, Some(&tx), v) and assert the
returned count reflects the uncommitted change, and also call
grove_db.query_aggregate_count(&path_query, None, v) (or before the tx mutation)
to assert the base view does not include the change; use the existing symbols
tx, query_aggregate_count, PathQuery::new_aggregate_count_on_range and TEST_LEAF
to locate where to add the transactional insert/delete and the paired
assertions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f53d71ec-bfd9-40d0-9b76-c6194db3ce0f
📒 Files selected for processing (8)
grovedb-version/src/version/grovedb_versions.rsgrovedb-version/src/version/v1.rsgrovedb-version/src/version/v2.rsgrovedb-version/src/version/v3.rsgrovedb/src/operations/get/query.rsgrovedb/src/tests/aggregate_count_query_tests.rsmerk/src/merk/prove.rsmerk/src/proofs/query/aggregate_count.rs
Address CodeRabbit feedback on PR #662. **Major (correctness)**: The proof variant's verifier catches inconsistent tree state at verify time, but the no-proof variant returns the count straight to the caller — so it must fail loudly instead of silently undercounting. Restore the corrupted-state arms my earlier coverage- focused refactor had collapsed: - Re-introduce `CorruptedState` errors when `tree.link(true/false)` is `Some` but `walker.walk(...)` returns `None`. - Switch own_count derivation from `saturating_sub` to `checked_sub`: children claiming more keys than the parent's aggregate is corruption, not something to clamp to 0. **Nitpick (test quality)**: Tighten `no_proof_uses_provided_transaction` so it actually verifies transaction threading rather than just proving `Some(&tx)` doesn't error: - Read the base count (10). - Insert an in-range key inside a transaction. - Assert the transactional read sees 11 (uncommitted insert visible). - Assert the non-transactional read still sees 10 (base view isolated). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
This is Claude — reply to @coderabbitai's review. Nitpick on
This catches a regression where Both pieces of feedback (the major fail-fast change + this test) are in 395b1e0b. Inline reply to the major one is on the diff thread. |
|
Tip For best results, initiate chat on the files or code changes.
If you found this review helpful, would you consider giving us a shout-out on X? Thank you for using CodeRabbit! |
CI linting flagged `ok_or_else(|| Error::CorruptedState(...))` with a zero-cost &'static str argument — the closure is unnecessary. Use `ok_or(...)` instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors PR #662's `query_aggregate_count` for the signed-sum side. Callers that need a sum value but not a proof (e.g. server handlers answering `prove=false` sum requests) can now bypass proof construction, serialization, and verification entirely. The merk-level walk is `O(log n + |boundary|)` in the number of distinct keys, identical complexity to the prover but without the proof-op allocations or hash recomputations. The signed-sum arithmetic carries the same `i128` accumulator the prover and verifier use (so adversarial intermediate sums never wrap), and narrows to `i64` at the public entry point. An out-of-i64 result is classified as `Error::CorruptedData` since a real `ProvableSumTree` maintains every aggregate as `i64` at every level. NEW APIS - `Merk::sum_aggregate_on_range(&inner_range, grove_version) -> CostResult<i64, Error>` in `merk/src/merk/get.rs`. Checks `tree_type == ProvableSumTree`; rejects any other tree type with `Error::InvalidProofError`. Returns 0 for an empty merk. - `RefWalker::sum_aggregate_on_range(&inner_range, grove_version)` in `merk/src/proofs/query/aggregate_sum.rs`. Walks the same Contained / Disjoint / Boundary classification path as `create_aggregate_sum_on_range_proof`, but emits no proof ops. - `GroveDb::query_aggregate_sum(path_query, transaction, grove_version) -> CostResult<i64, Error>` in `grovedb/src/operations/get/query.rs`. Validates the PathQuery up-front via `validate_aggregate_sum_on_range` (same gate the prover and verifier use — catches malformed ASOR queries plus the empty-path rejection from the prior commit before any storage reads), opens the leaf merk at `path_query.path`, and delegates to the merk-level walk. - New `query_aggregate_sum_on_range` field on `GroveDBOperationsQueryVersions`, wired through v1/v2/v3 at version `0`. NotSummed-correctness is preserved via the same `own_sum = node_sum - left_struct - right_struct` derivation the prover uses. NotSummed-wrapped subtrees have stored aggregate 0, so the subtraction yields 0 at the wrapper boundary - they do not contribute to the in-range total. The returned sum is **not** independently verifiable: callers are trusting their own merk read path. For a verifiable sum, continue using `prove_query` + `verify_aggregate_sum_query`. Documented explicitly on both entry points. TESTS - 10 new merk-level cross-checks (`merk/src/proofs/query/aggregate_sum.rs::tests`): each range variant against `prove_aggregate_sum_on_range`'s computed sum, plus empty-merk-returns-0, NormalTree rejection, ProvableCountTree rejection (precise tree-type match, not "any provable aggregate tree"), and a mixed-positive/negative scenario that exercises the signed `own_sum` subtraction. - 11 new GroveDB-level cross-checks (`grovedb/src/tests/aggregate_sum_query_tests.rs::tests`): every range shape on a populated `ProvableSumTree`, empty subtree returns 0, negative-sum scenario, invalid-inner-range (`Key`) rejected with `InvalidQuery`, empty-path rejected with `InvalidQuery`, NormalTree leaf rejected with `MerkError` from the merk-level gate. Workspace `cargo test --all-features`: 2985 passing / 0 failing (was 2964 / 0). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Issue being fixed or feature implemented
Adds a no-proof execution variant of
AggregateCountOnRange. Callers that need a count value but not a proof (e.g. server handlers answeringprove=falsecount requests) previously had to callprove_queryand discard the proof bytes. That wastes CPU and allocations on proof construction / serialization for a count that's then thrown away, and reads as confused intent.Caller context: dashpay/platform#3623 wires up the unified
GetDocumentsCountendpoint with range-count support overrange_countableindexes; its summed-mode path walks every emitted element in Rust today (O(distinct values in range)). With this primitive available, that path becomes O(log n) and drops directly intoexecute_range_count_no_proof.What was done?
Merk::count_aggregate_on_rangewalks the same Contained / Disjoint / Boundary classification asprove_aggregate_count_on_range, using each internal node's stored aggregate count to short-circuit fully-inside / fully-outside subtrees, but emits no proof ops. NonCounted-correctness is preserved via the sameown_count = node_count − left_struct − right_structderivation the prover uses (NonCountedleaves have stored aggregate 0 → own_count 0). Tree-type gate (ProvableCountTree/ProvableCountSumTreeonly) and empty-merk-returns-0 contract are identical to the proof variant.GroveDb::query_aggregate_count(path_query, transaction, grove_version) -> CostResult<u64, Error>. Validates the PathQuery shape up front viavalidate_aggregate_count_on_range(same gate the prover and verifier use), opens the leaf merk atpath_query.path, and delegates to the merk-level walk.query_aggregate_count_on_rangefield onGroveDBOperationsQueryVersions, wired through v1/v2/v3 at version0.The returned count is not independently verifiable — callers trust their own merk read path. For a verifiable count, callers continue to use
prove_query+verify_aggregate_count_query. The doc comments on both entry points say so explicitly.How Has This Been Tested?
Merk-level (6 new tests, merk/src/proofs/query/aggregate_count.rs):
no_proof_matches_prover_closed_range_inclusiveno_proof_matches_prover_closed_range_exclusiveno_proof_matches_prover_open_range_fromno_proof_matches_prover_range_below_all_keysno_proof_empty_merk_returns_zerono_proof_rejected_on_normal_treeEach cross-checks
count_aggregate_on_rangeagainst the prover's count for the same merk + range, so any divergence between the two walks fails the test.GroveDB-level (11 new tests, grovedb/src/tests/aggregate_count_query_tests.rs):
ProvableCountTreeandProvableCountSumTreeProvableCountTreereturns 0Key) rejected withError::InvalidQuerybefore storage readsNormalTreerejected viaError::MerkErrorfrom the merk-level gateEach test cross-checks the no-proof result against the proof variant (
prove_query+verify_aggregate_count_query).Suite: full merk lib (
cargo test -p grovedb-merk --lib) → 415 passing; full grovedb lib (cargo test -p grovedb --lib) → 1507 passing, 0 failed.Breaking Changes
None. The new field on
GroveDBOperationsQueryVersionsis additive and starts at0across all versions. No existing API signatures changed.Checklist:
🤖 Generated with Claude Code
Summary by CodeRabbit