Skip to content

Commit da53ef0

Browse files
fix(verify): require provable tree type at aggregate query terminal layer
Security finding (Codex): the `verify_aggregate_sum_query` and `verify_aggregate_count_query` chain walkers only checked `element.is_any_tree()` for path elements. At the terminal (leaf) layer this is insufficient — if the honest tree at the queried path happens to be an EMPTY Merk-backed tree of any type (NormalTree, SumTree, BigSumTree, CountTree, CountSumTree, ProvableCountTree, ProvableCountSumTree, ProvableSumTree), its stored `value_hash = combine_hash(H(element_bytes), NULL_HASH)`. The merk verifier accepts empty proof bytes as `(NULL_HASH, 0)`, so an attacker can construct a forged proof with: - layer 0: honest single-key proof of the leaf path key in its parent - layer 1: empty bytes (forged) and the chain check passes uniformly. The verifier returns `sum = 0` (or `count = 0`) against the trusted root hash, even though the leaf isn't a Provable{Sum,Count}Tree. The numeric answer is correct (an empty tree has sum 0 / count 0), so this isn't a value forgery — but it IS a type-confusion soundness gap: a caller that infers "leaf is a ProvableSumTree" from "the aggregate verifier accepted" is deceived. The prover-side gate in `Merk::prove_aggregate_{sum,count}_on_range` already rejects non-provable inputs, but the verifier didn't mirror that invariant. THE FIX In `enforce_lower_chain`, add an `is_terminal: bool` parameter. At intermediate depths nothing changes (`is_any_tree()` still suffices — the GroveDB grove can route through any tree type on the way down). At the terminal depth — passed `is_terminal = true` when `depth + 1 == path_keys.len()` — the verifier now requires: - aggregate-sum: `matches!(element, Element::ProvableSumTree(..))` - aggregate-count: `matches!(element, Element::ProvableCountTree(..) | Element::ProvableCountSumTree(..))` Wrapper variants (NonCounted, NotSummed) are stripped via the existing `into_underlying()` so they continue to work transparently. TESTS Three new regression tests that surgically construct the forgery from a real honest single-key envelope and confirm the verifier now rejects: - `empty_leaf_type_confusion_forgery_rejected` (sum side, empty NormalTree at leaf) - `empty_provable_count_tree_at_leaf_rejected_for_sum` (sum side, empty ProvableCountTree at leaf — confirms type-specificity) - `empty_leaf_type_confusion_forgery_rejected` (count side, empty NormalTree at leaf) The path == 0 case is unaffected: the merk-level hash divergence between `node_hash` and `node_hash_with_sum` / `node_hash_with_count` makes it computationally infeasible to forge a proof that matches the trusted root, so the path-elements check is unnecessary at the root. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent ff5645b commit da53ef0

4 files changed

Lines changed: 438 additions & 24 deletions

File tree

grovedb/src/operations/proof/aggregate_count.rs

Lines changed: 43 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -147,13 +147,19 @@ fn verify_v0_layer(
147147
)?;
148148

149149
// Chain check: combine_hash(H(tree_value), lower_hash) must equal the
150-
// value_hash recorded by the parent merk for this tree element.
150+
// value_hash recorded by the parent merk for this tree element. When
151+
// the next descent IS the leaf, also require that the element is
152+
// specifically a ProvableCountTree / ProvableCountSumTree (parallel to
153+
// the sum-side guard) — closes the empty-Merk-tree type-confusion
154+
// bypass.
155+
let is_terminal = depth + 1 == path_keys.len();
151156
enforce_lower_chain(
152157
path_query,
153158
&next_key,
154159
&proven_value_bytes,
155160
&lower_hash,
156161
&parent_proof_hash,
162+
is_terminal,
157163
grove_version,
158164
)?;
159165

@@ -211,12 +217,14 @@ fn verify_v1_layer(
211217
grove_version,
212218
)?;
213219

220+
let is_terminal = depth + 1 == path_keys.len();
214221
enforce_lower_chain(
215222
path_query,
216223
&next_key,
217224
&proven_value_bytes,
218225
&lower_hash,
219226
&parent_proof_hash,
227+
is_terminal,
220228
grove_version,
221229
)?;
222230

@@ -304,25 +312,27 @@ fn verify_single_key_layer_proof_v0(
304312
Ok((value_bytes, root_hash, proved.proof))
305313
}
306314

307-
/// Enforce the layer-chain hash equality: the parent merk's recorded
308-
/// value_hash for the tree element must equal `combine_hash(H(value),
309-
/// lower_layer_root_hash)`. This is what makes the count cryptographically
310-
/// bound to the GroveDB root hash — the leaf count proof's reconstructed
311-
/// `lower_hash` must agree with the parent's commitment, transitively up to
312-
/// the root.
315+
/// Enforce the layer-chain hash equality plus, at the terminal layer, the
316+
/// leaf-tree-type invariant.
313317
///
314-
/// Intermediate path elements may be any tree type — the GroveDB grove can
315-
/// route through Normal/Sum/Count/etc. trees on the way down to the
316-
/// provable-count leaf. The leaf-level tree-type check is enforced by the
317-
/// merk prover (`Merk::prove_aggregate_count_on_range`); here we only
318-
/// require that each non-leaf element on the path *is* some non-empty tree,
319-
/// since only trees have a lower layer to chain into.
318+
/// At intermediate depths the only requirement is that the element be
319+
/// *some* tree (we have to descend further). At the terminal depth — the
320+
/// last path element, whose inner Merk is the actual count target — the
321+
/// element MUST deserialize to `ProvableCountTree` or `ProvableCountSumTree`
322+
/// (after wrapper unwrapping). Without this, an empty Merk-backed tree of
323+
/// any other type at the leaf accepts a forged empty leaf proof, because
324+
/// every empty Merk-backed tree has `inner_root = NULL_HASH` and so its
325+
/// stored `value_hash = combine_hash(H(bytes), NULL_HASH)` matches the
326+
/// recomputation uniformly. The honest prover-side gate in
327+
/// `Merk::prove_aggregate_count_on_range` already rejects non-provable-count
328+
/// inputs; this is the matching verifier-side gate.
320329
fn enforce_lower_chain(
321330
path_query: &PathQuery,
322331
target_key: &[u8],
323332
proven_value_bytes: &[u8],
324333
lower_hash: &CryptoHash,
325334
parent_proof_hash: &CryptoHash,
335+
is_terminal: bool,
326336
grove_version: &GroveVersion,
327337
) -> Result<(), Error> {
328338
let element = Element::deserialize(proven_value_bytes, grove_version)
@@ -337,14 +347,30 @@ fn enforce_lower_chain(
337347
)
338348
})?
339349
.into_underlying();
340-
if !element.is_any_tree() {
350+
if is_terminal {
351+
if !matches!(
352+
element,
353+
Element::ProvableCountTree(..) | Element::ProvableCountSumTree(..)
354+
) {
355+
return Err(Error::InvalidProof(
356+
path_query.clone(),
357+
format!(
358+
"aggregate-count proof's terminal path element at key {} must be a \
359+
ProvableCountTree or ProvableCountSumTree (got {}); a count aggregate \
360+
is only meaningful against a tree that binds its count into the node hash",
361+
hex::encode(target_key),
362+
element.type_str()
363+
),
364+
));
365+
}
366+
} else if !element.is_any_tree() {
341367
return Err(Error::InvalidProof(
342368
path_query.clone(),
343369
format!(
344-
"aggregate-count proof's path element at key {} is not a tree element \
345-
(got {:?}); count queries can only descend through tree elements",
370+
"aggregate-count proof's intermediate path element at key {} is not a tree \
371+
element (got {}); count queries can only descend through tree elements",
346372
hex::encode(target_key),
347-
std::mem::discriminant(&element)
373+
element.type_str()
348374
),
349375
));
350376
}

grovedb/src/operations/proof/aggregate_sum.rs

Lines changed: 45 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -151,12 +151,23 @@ fn verify_v0_layer(
151151
grove_version,
152152
)?;
153153

154+
// When the next descent IS the leaf, require that the element we're
155+
// about to bottom out into is specifically a ProvableSumTree. Without
156+
// this gate, an empty Merk-backed tree of any other type (Tree,
157+
// SumTree, CountTree, …) at the leaf path would accept a forged empty
158+
// leaf proof — its stored value_hash already equals
159+
// `combine_hash(H(bytes), NULL_HASH)`, so the chain check passes — and
160+
// the verifier would silently return sum=0 for a non-ProvableSumTree
161+
// leaf (type-confusion, not value forgery, but a soundness gap all
162+
// the same).
163+
let is_terminal = depth + 1 == path_keys.len();
154164
enforce_lower_chain(
155165
path_query,
156166
&next_key,
157167
&proven_value_bytes,
158168
&lower_hash,
159169
&parent_proof_hash,
170+
is_terminal,
160171
grove_version,
161172
)?;
162173

@@ -212,12 +223,14 @@ fn verify_v1_layer(
212223
grove_version,
213224
)?;
214225

226+
let is_terminal = depth + 1 == path_keys.len();
215227
enforce_lower_chain(
216228
path_query,
217229
&next_key,
218230
&proven_value_bytes,
219231
&lower_hash,
220232
&parent_proof_hash,
233+
is_terminal,
221234
grove_version,
222235
)?;
223236

@@ -300,15 +313,27 @@ fn verify_single_key_layer_proof_v0(
300313
Ok((value_bytes, root_hash, proved.proof))
301314
}
302315

303-
/// Enforce the layer-chain hash equality. Identical contract to the count
304-
/// side: the parent merk's recorded value_hash for the tree element must
305-
/// equal `combine_hash(H(value), lower_layer_root_hash)`.
316+
/// Enforce the layer-chain hash equality plus, at the terminal layer,
317+
/// the leaf-tree-type invariant.
318+
///
319+
/// At intermediate depths the only requirement is that the element be
320+
/// *some* tree (we have to descend further). At the terminal depth — the
321+
/// last path element, whose inner Merk is the actual aggregate target —
322+
/// the element MUST deserialize to `Element::ProvableSumTree` (after
323+
/// wrapper unwrapping). Without this check, an empty Merk-backed tree of
324+
/// any other type at the leaf accepts a forged empty leaf proof, because
325+
/// every empty Merk-backed tree has `inner_root = NULL_HASH` and so its
326+
/// stored `value_hash = combine_hash(H(bytes), NULL_HASH)` — the chain
327+
/// check passes uniformly. The honest prover-side gate in
328+
/// `Merk::prove_aggregate_sum_on_range` already rejects non-ProvableSumTree
329+
/// inputs; this is the matching verifier-side gate.
306330
fn enforce_lower_chain(
307331
path_query: &PathQuery,
308332
target_key: &[u8],
309333
proven_value_bytes: &[u8],
310334
lower_hash: &CryptoHash,
311335
parent_proof_hash: &CryptoHash,
336+
is_terminal: bool,
312337
grove_version: &GroveVersion,
313338
) -> Result<(), Error> {
314339
let element = Element::deserialize(proven_value_bytes, grove_version)
@@ -323,14 +348,27 @@ fn enforce_lower_chain(
323348
)
324349
})?
325350
.into_underlying();
326-
if !element.is_any_tree() {
351+
if is_terminal {
352+
if !matches!(element, Element::ProvableSumTree(..)) {
353+
return Err(Error::InvalidProof(
354+
path_query.clone(),
355+
format!(
356+
"aggregate-sum proof's terminal path element at key {} must be a \
357+
ProvableSumTree (got {}); a sum aggregate is only meaningful against \
358+
a tree that binds its sum into the node hash",
359+
hex::encode(target_key),
360+
element.type_str()
361+
),
362+
));
363+
}
364+
} else if !element.is_any_tree() {
327365
return Err(Error::InvalidProof(
328366
path_query.clone(),
329367
format!(
330-
"aggregate-sum proof's path element at key {} is not a tree element \
331-
(got {:?}); sum queries can only descend through tree elements",
368+
"aggregate-sum proof's intermediate path element at key {} is not a tree \
369+
element (got {}); sum queries can only descend through tree elements",
332370
hex::encode(target_key),
333-
std::mem::discriminant(&element)
371+
element.type_str()
334372
),
335373
));
336374
}

grovedb/src/tests/aggregate_count_query_tests.rs

Lines changed: 118 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1230,4 +1230,122 @@ mod tests {
12301230
other => panic!("expected InvalidProof, got {:?}", other),
12311231
}
12321232
}
1233+
1234+
/// Security regression: empty-leaf type-confusion forgery
1235+
/// (parallel of `empty_leaf_type_confusion_forgery_rejected` on the
1236+
/// sum side).
1237+
///
1238+
/// The honest leaf is an empty NormalTree (root_key=None). Every
1239+
/// empty Merk-backed tree stores `inner_root = NULL_HASH`, so its
1240+
/// recorded value_hash equals `combine_hash(H(element_bytes),
1241+
/// NULL_HASH)`. The merk-level count verifier accepts empty proof
1242+
/// bytes as `(NULL_HASH, 0)`. Before the fix the verifier's loose
1243+
/// `is_any_tree()` check happily accepted NormalTree element bytes
1244+
/// and the chain hash matched by coincidence, letting an attacker
1245+
/// prove `count = 0` against a path that wasn't actually a
1246+
/// ProvableCountTree. The numeric answer (0) is correct for an
1247+
/// empty tree of any type, but the implicit claim "the leaf is a
1248+
/// ProvableCountTree" was a soundness gap.
1249+
#[test]
1250+
fn empty_leaf_type_confusion_forgery_rejected() {
1251+
use std::collections::BTreeMap;
1252+
1253+
use bincode::config;
1254+
use grovedb_version::version::v2::GROVE_V2;
1255+
1256+
use crate::operations::proof::{
1257+
GroveDBProof, GroveDBProofV0, MerkOnlyLayerProof, ProveOptions,
1258+
};
1259+
1260+
// Use V0 (GROVE_V2) envelope — its MerkOnlyLayerProof is simpler
1261+
// to surgically reconstruct than V1's LayerProof/ProofBytes.
1262+
let v: &GroveVersion = &GROVE_V2;
1263+
let db = make_test_grovedb(v);
1264+
db.insert(
1265+
[TEST_LEAF].as_ref(),
1266+
b"evil",
1267+
Element::empty_tree(),
1268+
None,
1269+
None,
1270+
v,
1271+
)
1272+
.unwrap()
1273+
.expect("insert empty normal tree at evil");
1274+
1275+
// Honest probe to harvest the layer-0 merk proof bytes that prove
1276+
// `evil` exists in the TEST_LEAF merk with its NormalTree element
1277+
// bytes.
1278+
let probe = PathQuery::new_single_key(vec![TEST_LEAF.to_vec()], b"evil".to_vec());
1279+
let probe_proof_bytes = db
1280+
.grove_db
1281+
.prove_query(&probe, None, v)
1282+
.unwrap()
1283+
.expect("honest probe should succeed");
1284+
1285+
let cfg = config::standard()
1286+
.with_big_endian()
1287+
.with_limit::<{ 256 * 1024 * 1024 }>();
1288+
let probe_decoded: GroveDBProof = bincode::decode_from_slice(&probe_proof_bytes, cfg)
1289+
.unwrap()
1290+
.0;
1291+
1292+
let (root_mp, test_leaf_mp) = match probe_decoded {
1293+
GroveDBProof::V0(GroveDBProofV0 { root_layer, .. }) => (
1294+
root_layer.merk_proof,
1295+
root_layer
1296+
.lower_layers
1297+
.get(TEST_LEAF)
1298+
.expect("descent")
1299+
.merk_proof
1300+
.clone(),
1301+
),
1302+
GroveDBProof::V1(_) => panic!("expected V0 envelope under GROVE_V2"),
1303+
};
1304+
1305+
let leaf = MerkOnlyLayerProof {
1306+
merk_proof: Vec::new(),
1307+
lower_layers: BTreeMap::new(),
1308+
};
1309+
let mut test_leaf_map = BTreeMap::new();
1310+
test_leaf_map.insert(b"evil".to_vec(), leaf);
1311+
let test_leaf_layer = MerkOnlyLayerProof {
1312+
merk_proof: test_leaf_mp,
1313+
lower_layers: test_leaf_map,
1314+
};
1315+
let mut root_lower = BTreeMap::new();
1316+
root_lower.insert(TEST_LEAF.to_vec(), test_leaf_layer);
1317+
1318+
let forged = GroveDBProof::V0(GroveDBProofV0 {
1319+
root_layer: MerkOnlyLayerProof {
1320+
merk_proof: root_mp,
1321+
lower_layers: root_lower,
1322+
},
1323+
prove_options: ProveOptions::default(),
1324+
});
1325+
let forged_bytes = bincode::encode_to_vec(&forged, cfg).expect("encode");
1326+
1327+
let attack_pq = PathQuery::new_aggregate_count_on_range(
1328+
vec![TEST_LEAF.to_vec(), b"evil".to_vec()],
1329+
QueryItem::RangeFrom(b"a".to_vec()..),
1330+
);
1331+
1332+
let result = GroveDb::verify_aggregate_count_query(&forged_bytes, &attack_pq, v);
1333+
match result {
1334+
Err(e) => {
1335+
let msg = format!("{e}");
1336+
assert!(
1337+
msg.contains("must be a ProvableCountTree")
1338+
|| msg.contains("ProvableCountSumTree"),
1339+
"verifier rejected as expected but with an unrelated message: {msg}"
1340+
);
1341+
}
1342+
Ok((root_hash, count)) => panic!(
1343+
"BUG: empty-leaf forgery accepted by aggregate-count verifier! \
1344+
Returned (root_hash={}, count={}) — the leaf is a NormalTree, \
1345+
not a ProvableCountTree.",
1346+
hex::encode(root_hash),
1347+
count
1348+
),
1349+
}
1350+
}
12331351
}

0 commit comments

Comments
 (0)