feat: subtree get/insert#1
Conversation
530f0e3 to
3fc13af
Compare
3f36eba to
a00ba1b
Compare
| Merk::open("./test4.db")?, | ||
| Merk::open("./test5.db")?, | ||
| Merk::open("./test6.db")?, | ||
| Merk::open(Path::new(SUBTREES_DIR).join("test1.db"))?, |
There was a problem hiding this comment.
All trees need to be stored in the same RocksDb database to ensure atomicity and data consistency.
There was a problem hiding this comment.
Yeah, it will be a scope of a next task
There was a problem hiding this comment.
Actually, done in this PR, please take a look
| .ok_or(Error::InvalidPath("key not found in Merk"))? | ||
| .as_slice(), | ||
| )?; | ||
| element.follow_reference(merk) |
There was a problem hiding this comment.
You could get into an infinite loop here if references are pointing to each other. This should never be possible, but it's still probably a good idea to protect against them.
There was a problem hiding this comment.
In order to protect against this, I would suggest passing all previous keys in the recursion in order to prevent cyclical recursions, and to limit recursions to 16.
There was a problem hiding this comment.
Good catch, thank you
| .ok_or(Error::InvalidPath("key not found in Merk"))? | ||
| .as_slice(), | ||
| )?; | ||
| element.is_tree()?; |
There was a problem hiding this comment.
I think we should eventually pass an option to bypass the path/tree verification.
|
Very good initial work guys! |
a00ba1b to
08108ad
Compare
fix: circular reference path
Rename auto traits feature
Codex review of PR #654 surfaced five real bugs in the NonCounted implementation. All fixed; regression tests added. P1 #1 — serialize.rs: stack-overflow vector still open The nested-wrapper guard ran AFTER bincode::decode_from_slice had already recursed through the Box<Element> chain. A payload of repeated [15, 15, 15, ...] bytes could blow the stack before any check fired. Fix: pre-check the leading two bytes for a wrapper-of- wrapper before invoking bincode. The post-check stays as defense in depth. Test: deserialize_rejects_long_nested_wrapper_chain_without_recursion feeds 1024 leading wrapper bytes — pre-check stops it on byte 1. P1 #2 — batch propagation dropped the wrapper apply_batch_structure converted a NonCounted(Tree) into the internal InsertTreeWithRootHash op but stored no information that the original was wrapped. On execution the reconstructed element was bare, so the on-disk parent element lost the wrapper byte AND the parent count tree's aggregate started counting the subtree. Fix: add `non_counted: bool` to both InsertTreeWithRootHash and InsertNonMerkTree (#[non_exhaustive]). Set it from element.is_non_counted() during apply_batch_structure; re-wrap the reconstructed element via into_non_counted() during execution. Test: batch_propagation_preserves_non_counted_wrapper_on_subtree inserts a NonCounted(CountTree) plus a child under it in one batch and asserts get_raw on the parent still returns NonCounted. P1 #3 — reconstruct_with_root_key dropped the wrapper No NonCounted arm meant existing wrapped trees couldn't propagate. Any subtree mutation that flowed through update_tree_item_preserve_flag reached the helper and got None, failing the operation. Fix: add a NonCounted arm that recurses on the inner element and re-wraps the reconstructed result. Test: reconstruct_preserves_non_counted_wrapper. P2 #4 — batch insert path bypassed the parent-type guard Direct Merk insertion already rejected NonCounted children outside count-bearing parents, but the batch path skipped the same check. Users could persist NonCounted into Normal/Sum/BigSum trees via batch. Fix: mirror the per-merk guard with the same predicate (element.is_non_counted() && !in_tree_type.is_count_bearing()). Test: batch_insert_rejects_non_counted_into_normal_tree. P2 #5 — get path didn't follow NonCounted(Reference) get_caching_optional matched only the bare Reference variant; a NonCounted-wrapped reference fell through and was returned as a value, contradicting the wrapper-transparency contract. Fix: into_underlying() on the get result before dispatching, and same in the follow_reference inner loop so reference chains hop through wrapped intermediate references. Test: get_follows_non_counted_reference inserts a NonCounted-wrapped reference inside a count tree and asserts get() returns the referenced item. Test counts (with --features minimal,test_utils,full): - grovedb-element: 45 (was 44, +1 stack-overflow regression) - grovedb-merk: 364 (was 362, +2: reconstruct + reconstruct-non-tree) - grovedb: 1452 (was 1449, +3: NonCounted regressions) Build paths touched: GroveOp::InsertTreeWithRootHash and GroveOp::InsertNonMerkTree gained a non_counted: bool field. The struct literal construction sites in tests/cost-estimation modules get non_counted: false (they predate the wrapper feature). Match-only sites (e.g. cost estimators destructuring with `..`) are unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#654) * feat: add Element::NonCounted wrapper to opt-out of count propagation A new `Element::NonCounted(Box<Element>)` variant that behaves identically to its inner element except it contributes 0 to the parent count tree's aggregate. Sums still propagate. May only be inserted into count-bearing trees (CountTree, ProvableCountTree, CountSumTree, ProvableCountSumTree). Use case: insert housekeeping rows, schema markers, or auxiliary indexes inside a count tree without inflating its count. Single-variant wrapper design (instead of 15+ explicit variants) keeps the diff surface and audit footprint small. Existing `Element` predicates and helpers look through the wrapper via a new `underlying()` accessor; cost, proof, and pattern-match sites dispatch on the underlying element. The wrapper byte is included in the serialized value so the value hash distinguishes wrapped from unwrapped at the cryptographic layer. Nested `NonCounted(NonCounted(...))` is rejected at construction, serialization, and deserialization to close a stack-overflow vector. Tests: 10 in grovedb-element (constructor, helpers, bincode round-trip, nested-rejection), 5 in grovedb-merk (insert validation, aggregate_data end-to-end). Full workspace test suite still passes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address NonCounted review feedback CodeRabbit review findings on PR #654: - batch/mod.rs (process_reference_with_hop_count_greater_than_one): hash must be over the WRAPPED bytes (which is what's on disk), not the underlying bytes. Now matches on element.underlying() for dispatch but serializes the outer element for value_hash. - batch/mod.rs (apply_batch_structure): the long if/else-if chain that rewrites parent insert ops into InsertTreeWithRootHash / InsertNonMerkTree was matching on the outer Element, so a NonCounted-wrapped tree fell through to the "non tree" error path. Now rebinds via element.underlying() before the chain. - query.rs (follow_element): a Reference resolving to NonCounted(Item) returned the wrapper while a directly-queried NonCounted(Item) was unwrapped. Normalize the resolved value via into_underlying() too. - proof/verify.rs (extract_count_from_element): trunk proofs rooted at a NonCounted(CountTree/...) failed with "target is not a count tree element". Look through the wrapper. - merk/element/costs.rs (specialized_costs_for_key_value): the +1 wrapper byte was a deliberate under-count; replace with explicit wrapper_overhead added to all constant-size value_len computations so storage cost accounting matches on-disk bytes exactly. - merk/element/get.rs (V0 + V1 cost paths): same wrapper_overhead fix for storage_loaded_bytes on tree and sum-item arms. - merk/element/insert.rs: the NonCounted insert guard only ran in insert(); add the same check at the start of insert_reference and insert_subtree so NonCounted(Reference) and NonCounted(Tree) cannot be inserted into non-count-bearing trees through those paths. - grovedb/operations/insert/mod.rs: replace unreachable!() for Element::NonCounted(_) on the public insert path with a typed Error::InvalidInput so a hand-built nested wrapper returns an error rather than panicking. Tests: - merk/tree_type/mod.rs: add is_count_bearing test for all variants (CodeRabbit nitpick). - merk/element/insert.rs: add insert_subtree path validation test. Full workspace tests still pass (1449 grovedb, 362 grovedb-merk, 41 grovedb-element). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: bump orchard pin to 898258d (dashified rebased on 0.13.1) Old orchard rev pulled core2 ^0.3, all of whose published versions are yanked from crates.io, breaking fresh `cargo update` runs. The new dashified branch is rebased on orchard 0.13.1 which uses corez instead. This unblocks CI dependency resolution. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor: ElementType gets NonCountedXxx twins at 0x80|base Drops the flat ElementType::NonCounted = 15 variant and replaces it with 15 NonCountedXxx variants whose discriminants are the base type's discriminant with the high bit set (e.g. NonCountedItem = 128, NonCountedTree = 130, ...). Bit 7 cleanly encodes "is this wrapped?" and bits 0-6 encode the underlying base type. The on-disk format is unchanged: Element::NonCounted still serializes as [15, ...inner bytes]. ElementType::from_serialized_value detects this by peeking byte 1 when byte 0 is the new NON_COUNTED_WRAPPER_DISCRIMINANT (15) constant, then synthesizes the NonCountedXxx variant. TryFrom<u8> on byte 15 alone is rejected (it needs the inner byte to disambiguate) — every other base byte and the new 128..=142 range round-trip normally. New helpers on ElementType: - is_non_counted(self) -> bool: (self as u8) & 0x80 != 0 - base(self) -> ElementType: strips the high bit, returns underlying base type (or self if already a base) is_tree, is_item, is_reference, has_simple_value_hash, and proof_node_type all dispatch via base(), so a NonCountedTree is still a tree, a NonCountedItem still uses Kv proof nodes, etc. Element::element_type() returns the synthetic NonCountedXxx for a wrapped element, so callers get both "what kind" and "is wrapped" from a single ElementType value. Tests: - test_element_type_from_discriminant updated: TryFrom(15) is now Err; added 128, 129, 142 mappings and out-of-range 127, 143. - test_non_counted_helpers: is_non_counted and base() correctness + the discriminant relationship (twin = base | 0x80). - test_simple_vs_combined_hash: NonCounted twins inherit base hash kind. - test_proof_node_type_through_non_counted_wrapper: wrapper transparent for proof-node-type selection. - test_from_serialized_value: 2-byte peek for byte-15 case, including truncated/nested/unknown-inner rejections. - test_is_tree: covers NonCountedXxx + spot checks is_item/is_reference. - test_non_counted_wrapper_discriminant_pinned: pins the bincode discriminant for Element::NonCounted to 15 and verifies the synthetic ElementType resolution end-to-end. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(lint): clear pre-existing clippy warnings + cover NonCounted in debugger CI runs `cargo clippy --workspace --all-features -- -D warnings`. Three pre-existing warnings in untouched files were already promoted to errors by a recent clippy update and were blocking CI: - merk/src/proofs/query/verify.rs:693: collapsible-match — fold the inner `if k.as_slice() == key` into the outer match arm guard. - merk/src/test_utils/mod.rs:268: explicit-counter-loop — replace the hand-rolled `seed += 1` counter with `for seed in initial_seed..end`. - grovedb/src/replication.rs:163: useless-conversion — drop the redundant `.into_iter()` call. Also covers a non-exhaustive match in `grovedb/src/debugger.rs` (only compiled with the `grovedbg` feature, hence missed earlier): the visualizer wire format has no NonCounted variant, so we render the inner element transparently — the wrapper is invisible in the debug UI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(non_counted): address Codex review (P1 + P2 findings) Codex review of PR #654 surfaced five real bugs in the NonCounted implementation. All fixed; regression tests added. P1 #1 — serialize.rs: stack-overflow vector still open The nested-wrapper guard ran AFTER bincode::decode_from_slice had already recursed through the Box<Element> chain. A payload of repeated [15, 15, 15, ...] bytes could blow the stack before any check fired. Fix: pre-check the leading two bytes for a wrapper-of- wrapper before invoking bincode. The post-check stays as defense in depth. Test: deserialize_rejects_long_nested_wrapper_chain_without_recursion feeds 1024 leading wrapper bytes — pre-check stops it on byte 1. P1 #2 — batch propagation dropped the wrapper apply_batch_structure converted a NonCounted(Tree) into the internal InsertTreeWithRootHash op but stored no information that the original was wrapped. On execution the reconstructed element was bare, so the on-disk parent element lost the wrapper byte AND the parent count tree's aggregate started counting the subtree. Fix: add `non_counted: bool` to both InsertTreeWithRootHash and InsertNonMerkTree (#[non_exhaustive]). Set it from element.is_non_counted() during apply_batch_structure; re-wrap the reconstructed element via into_non_counted() during execution. Test: batch_propagation_preserves_non_counted_wrapper_on_subtree inserts a NonCounted(CountTree) plus a child under it in one batch and asserts get_raw on the parent still returns NonCounted. P1 #3 — reconstruct_with_root_key dropped the wrapper No NonCounted arm meant existing wrapped trees couldn't propagate. Any subtree mutation that flowed through update_tree_item_preserve_flag reached the helper and got None, failing the operation. Fix: add a NonCounted arm that recurses on the inner element and re-wraps the reconstructed result. Test: reconstruct_preserves_non_counted_wrapper. P2 #4 — batch insert path bypassed the parent-type guard Direct Merk insertion already rejected NonCounted children outside count-bearing parents, but the batch path skipped the same check. Users could persist NonCounted into Normal/Sum/BigSum trees via batch. Fix: mirror the per-merk guard with the same predicate (element.is_non_counted() && !in_tree_type.is_count_bearing()). Test: batch_insert_rejects_non_counted_into_normal_tree. P2 #5 — get path didn't follow NonCounted(Reference) get_caching_optional matched only the bare Reference variant; a NonCounted-wrapped reference fell through and was returned as a value, contradicting the wrapper-transparency contract. Fix: into_underlying() on the get result before dispatching, and same in the follow_reference inner loop so reference chains hop through wrapped intermediate references. Test: get_follows_non_counted_reference inserts a NonCounted-wrapped reference inside a count tree and asserts get() returns the referenced item. Test counts (with --features minimal,test_utils,full): - grovedb-element: 45 (was 44, +1 stack-overflow regression) - grovedb-merk: 364 (was 362, +2: reconstruct + reconstruct-non-tree) - grovedb: 1452 (was 1449, +3: NonCounted regressions) Build paths touched: GroveOp::InsertTreeWithRootHash and GroveOp::InsertNonMerkTree gained a non_counted: bool field. The struct literal construction sites in tests/cost-estimation modules get non_counted: false (they predate the wrapper feature). Match-only sites (e.g. cost estimators destructuring with `..`) are unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(non_counted): address second Codex review pass (3 fixes + tighter tests) CodeRabbit follow-up review on 10d3821 surfaced three more issues: 1. ElementType::from_serialized_value accepted invalid inner discriminants. The bitwise OR `Self::try_from(NON_COUNTED_FLAG | inner_byte)` is a no-op when `inner_byte` already has bit 7 set. A payload like `[15, 128, ...]` silently parsed as `NonCountedItem` (since 0x80 | 0x80 == 0x80) instead of being rejected. Fix: explicitly require `inner_byte < 15` (i.e. a real base discriminant 0..=14) before applying the bit. New rejection tests cover [15, 128], [15, 142], [15, 16], [15, 100]. 2. GroveOp::RefreshReference rejected wrapped references. The arm matched `&element` directly as `Element::Reference`, so a stored `NonCounted(Reference)` failed refresh as "not a reference". Fix: match `element.underlying()` so the wrapper is transparent for refresh — the inner reference's path is what matters. 3. check_subtree_exists rejected wrapped tree parents. Although get_caching_optional now unwraps NonCounted, the subtree validation helper still only accepted bare tree variants. Any path through a `NonCounted(Tree)` parent failed validation, breaking the wrapper-transparency contract. Fix: `element.map(|e| e.into_underlying())` before matching tree variants. Regression test `check_subtree_exists_through_non_counted_wrapper` inserts a child into a wrapped subtree and asserts it succeeds. Also tightened existing tests per nitpicks: - batch_insert_rejects_non_counted_into_normal_tree now asserts the error message contains "non-counted" (catches the parent-type guard specifically, not any unrelated batch failure). - batch_propagation_preserves_non_counted_wrapper_on_subtree now also reads the outer count tree's aggregate and asserts it equals 1 (i.e. the NonCounted subtree did NOT contribute to the count). This protects both invariants the test documents. Test counts (with --features minimal,test_utils,full): - grovedb-element: 45 (unchanged; from_serialized_value rejection cases added to existing test) - grovedb-merk: 364 (unchanged) - grovedb: 1453 (was 1452, +1: check_subtree_exists regression) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(non_counted): wrapper transparency in typed APIs and cost paths Targeted audit (security-auditor agent) found 5 categories of catchall match arms over Element where NonCounted was silently rejected or mis-handled instead of dispatching through to the inner element. HIGH: typed non-Merk tree APIs rejected wrapped trees The mmr_tree_*, bulk_*, commitment_tree_*, and dense_tree_* methods all matched bare variants like `Element::MmrTree(..)` and treated `NonCounted(MmrTree)` as "not the expected tree", returning InvalidInput. NonCounted-wrapped non-Merk trees are a perfectly valid use case (insert a wrapped MmrTree inside a CountTree to suppress its count contribution), so this broke every read/write path. - operations/mmr_tree.rs (5 sites) - operations/bulk_append_tree.rs (7 sites) - operations/commitment_tree.rs (5 sites) - operations/dense_tree.rs (5 sites) Fix: dispatch via element.underlying() / into_underlying() everywhere. Test: typed_mmr_api_works_through_non_counted_wrapper exercises a wrapped MmrTree end-to-end via mmr_tree_leaf_count. HIGH: query_encoded_many rejected wrapped references and resolved items operations/get/query.rs:90 matched QueryResultElement::ElementResultItem(Element::Reference(..)) directly, so any wrapped reference fell into the catchall as "path_queries can only refer to references". The resolved value at line 106 had the same problem. Fix: extract the element first then match .into_underlying(), mirroring the pattern already used elsewhere in the file. MEDIUM: aggregate_sum_query rejected NonCounted-wrapped sum items element/aggregate_sum_query/mod.rs:775 — exactly the kind of element the wrapper exists for (suppress count, keep sum). Now matches .into_underlying(). MEDIUM: batch flag-update returned wrong/None cost for wrapped trees batch/mod.rs:2443 — the just-in-time flag-update path dispatched on bare new_element and routed `NonCounted(Tree)` through the `_ => None` arm, dropping the layered cost. Fix: capture wrapper_overhead, then match on .underlying() and add wrapper_overhead to value_len in each constant-size branch (parallel to merk/src/element/costs.rs). MEDIUM: estimated_costs replace paths inconsistent with insert paths estimated_costs/average_case_costs.rs:329 (replace_element) and estimated_costs/worst_case_costs.rs:189, 244 (insert_element + replace_element) all routed `NonCounted(Tree)` through the serialized_size fallback instead of the layered cost arm. Average's insert path was already correct (uses tree_flags_and_type) — the replace paths and worst_case insert/replace were inconsistent. Fix: same wrapper_overhead pattern + match .underlying(). Test counts (with --features minimal,test_utils,full): - grovedb-element: 45 (unchanged) - grovedb-merk: 364 (unchanged) - grovedb: 1454 (+1: typed_mmr_api_works_through_non_counted_wrapper) Clippy clean with --workspace --all-features. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This is an attempt to implement simple insert and get operations for GroveDb (inside subtrees without exposing to public API yet) to store binary data