feat(merk): support large value opcodes in encoding and decoding #406

Merged
QuantumExplorer merged 2 commits into develop from featsupport-large-value-opcodes
Jan 20, 2026

Conversation

Member

@QuantumExplorer QuantumExplorer commented Jan 20, 2026

This update introduces support for large value opcodes in the encoding and decoding processes. Values of 64 KB (65,536 bytes) or more now use a u32 value length instead of a u16.

The changes include:

  • Updated encoding logic to differentiate between small and large values, adjusting the opcode range accordingly.
  • Enhanced decoding logic to correctly interpret the new opcodes for large values.
  • Adjustments to the encoding length calculations to accommodate the new header sizes for large values.

With these changes the proof encoding can carry larger values, while values below 64 KB keep their existing opcodes and u16 length headers, so encoding of smaller values stays backward compatible.
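The small/large split described above can be sketched as a header-selection step. This is a minimal illustration, not the merged code: `ValueLenHeader`, `choose_header`, and `LARGE_VALUE_THRESHOLD` are hypothetical names, and the 65,536-byte threshold is taken from the review discussion in this PR.

```rust
use std::convert::TryFrom;

/// Values at or above this length need the 4-byte length field
/// (hypothetical constant name; the threshold comes from this PR).
const LARGE_VALUE_THRESHOLD: usize = 1 << 16; // 65,536 bytes

/// Hypothetical description of the length field chosen for a value.
#[derive(Debug, PartialEq)]
enum ValueLenHeader {
    /// Existing encoding: value length fits in a u16.
    Small(u16),
    /// New encoding: value length is written as a u32.
    Large(u32),
}

/// Picks the length field for a value, or None if the length
/// cannot be represented in a u32 at all.
fn choose_header(value_len: usize) -> Option<ValueLenHeader> {
    if value_len < LARGE_VALUE_THRESHOLD {
        // Safe cast: value_len is below 65,536 here.
        Some(ValueLenHeader::Small(value_len as u16))
    } else {
        u32::try_from(value_len).ok().map(ValueLenHeader::Large)
    }
}
```

A value of exactly 65,536 bytes is the first one that takes the large path, which is why the boundary deserves its own test.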

Summary by CodeRabbit

  • Improvements
    • Proofs now support much larger value payloads (up to 64 MB) with more efficient header handling, improved size calculations, and stricter validation to prevent oversized or malformed data.
  • Documentation
    • Clarified inline documentation describing large-value opcode behavior and length semantics.

Contributor

coderabbitai Bot commented Jan 20, 2026

📝 Walkthrough

Adds large-value proof encoding/decoding support: new large opcodes for Push (0x20–0x25) and PushInverted (0x28–0x2d), conditional headers using 16-bit or 32-bit value lengths based on size (MAX_VALUE_LEN = 64 MB), and corresponding updates to encoding, decoding, and length calculation.

Changes

Cohort: Large-Value Proof Encoding/Decoding
File(s): merk/src/proofs/encoding.rs
Summary: Adds MAX_VALUE_LEN (64 MB) and large-value opcode variants (0x20–0x25, 0x28–0x2d). Encoding now chooses small headers (1B opcode + 1B key_len + 2B value_len) or large headers (1B large opcode + 1B key_len + 4B value_len) and writes the value length as u16 or u32 accordingly. Decoding recognizes the new opcodes, enforces MAX_VALUE_LEN, and parses the 32-bit length, the value, and any optional value_hash, counts, or feature types. encoding_length now accounts for the 16-bit vs. 32-bit value-length header sizes. Also includes minor formatting changes and integration into the existing Push/PushInverted paths.
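The header sizes in the summary translate into a simple length calculation. The sketch below covers only a plain key/value push and assumes the byte counts quoted above; `kv_header_len` and `kv_encoding_length` are hypothetical names, and variants that carry a value_hash, count, or feature type would add further bytes that are not modeled here.

```rust
/// Threshold at which the 4-byte value length is used (from this PR).
const LARGE_VALUE_THRESHOLD: usize = 65_536;

/// Bytes written before the key and value payloads:
/// opcode + key_len + value_len field.
fn kv_header_len(value_len: usize) -> usize {
    if value_len < LARGE_VALUE_THRESHOLD {
        1 + 1 + 2 // small header: u16 value length
    } else {
        1 + 1 + 4 // large header: u32 value length
    }
}

/// Encoded size of a plain key/value push under these assumptions:
/// header, then key bytes, then value bytes.
fn kv_encoding_length(key_len: usize, value_len: usize) -> usize {
    kv_header_len(value_len) + key_len + value_len
}
```

Crossing the threshold costs two extra header bytes, so a 3-byte key with a 65,536-byte value would occupy 65,545 bytes in this model.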

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I nibble bytes beneath the moon,

New opcodes hum a larger tune,
From small to grand the values leap,
Sixty-four megabytes to keep—
A hop, a stitch, encoded soon.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name: Docstring Coverage
Status: ⚠️ Warning
Explanation: Docstring coverage is 0.00%, which is below the required threshold of 80.00%.
Resolution: Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name: Description Check
Status: ✅ Passed
Explanation: Check skipped - CodeRabbit’s high-level summary is enabled.
Check name: Title check
Status: ✅ Passed
Explanation: The title accurately describes the main change: adding support for large value opcodes in encoding and decoding operations.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
merk/src/proofs/encoding.rs (1)

895-1593: Missing test coverage for large value variants.

The test suite covers small value encoding/decoding thoroughly, but there are no tests for the new large value opcodes (0x20-0x25, 0x28-0x2d). Consider adding tests that:

  1. Encode/decode values at the boundary (exactly 65536 bytes)
  2. Encode/decode values larger than 65536 bytes
  3. Verify round-trip encoding/decoding for each large value variant

Do you want me to generate test cases for the large value variants?
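A boundary round-trip test could look roughly like the sketch below. It is not written against the real `Op`/`Node` types: `encode_kv`, `decode_kv`, and `round_trip` are hypothetical stand-ins built on std only, the opcodes 0x03 and 0x20 are the ones shown for `Op::Push(Node::KV)` in this PR, big-endian length fields are an assumption, and "64 MB" is read as 64 * 1024 * 1024 bytes.

```rust
use std::convert::TryFrom;
use std::io::{self, Read, Write};

// Opcodes for a plain KV push as they appear in this PR's review.
const SMALL_KV: u8 = 0x03;
const LARGE_KV: u8 = 0x20;
// Assumed cap: "64 MB" interpreted as 64 * 1024 * 1024 bytes.
const MAX_VALUE_LEN: usize = 64 * 1024 * 1024;

fn invalid(msg: &'static str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg)
}

/// Stand-in encoder: opcode, key_len, key, value_len (u16 or u32), value.
fn encode_kv<W: Write>(dest: &mut W, key: &[u8], value: &[u8]) -> io::Result<()> {
    let key_len = u8::try_from(key.len()).map_err(|_| invalid("key too long"))?;
    if value.len() > MAX_VALUE_LEN {
        return Err(invalid("value too long"));
    }
    match u16::try_from(value.len()) {
        Ok(len) => {
            dest.write_all(&[SMALL_KV, key_len])?;
            dest.write_all(key)?;
            dest.write_all(&len.to_be_bytes())?;
        }
        Err(_) => {
            dest.write_all(&[LARGE_KV, key_len])?;
            dest.write_all(key)?;
            // Safe cast: the length was checked against MAX_VALUE_LEN above.
            dest.write_all(&(value.len() as u32).to_be_bytes())?;
        }
    }
    dest.write_all(value)
}

/// Stand-in decoder mirroring encode_kv.
fn decode_kv<R: Read>(src: &mut R) -> io::Result<(Vec<u8>, Vec<u8>)> {
    let mut head = [0u8; 2];
    src.read_exact(&mut head)?;
    let mut key = vec![0u8; head[1] as usize];
    src.read_exact(&mut key)?;
    let value_len = match head[0] {
        SMALL_KV => {
            let mut len = [0u8; 2];
            src.read_exact(&mut len)?;
            u16::from_be_bytes(len) as usize
        }
        LARGE_KV => {
            let mut len = [0u8; 4];
            src.read_exact(&mut len)?;
            u32::from_be_bytes(len) as usize
        }
        _ => return Err(invalid("unknown opcode")),
    };
    if value_len > MAX_VALUE_LEN {
        return Err(invalid("value too long"));
    }
    let mut value = vec![0u8; value_len];
    src.read_exact(&mut value)?;
    Ok((key, value))
}

/// Encodes then decodes, returning the opcode written and the decoded pair.
fn round_trip(key: &[u8], value: &[u8]) -> io::Result<(u8, Vec<u8>, Vec<u8>)> {
    let mut buf = Vec::new();
    encode_kv(&mut buf, key, value)?;
    let opcode = buf[0];
    let (k, v) = decode_kv(&mut buf.as_slice())?;
    Ok((opcode, k, v))
}
```

Real tests would exercise each of the twelve large-value variants through the crate's own encode and decode paths; the point here is only the shape of the boundary cases at 65,535 and 65,536 bytes.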

🤖 Fix all issues with AI agents
In `@merk/src/proofs/encoding.rs`:
- Around line 629-804: The decoding paths for opcodes 0x20-0x25 and 0x28-0x2d
allocate value buffers using an unchecked u32 `value_len`, allowing a
maliciously large length to trigger memory exhaustion; add a constant
MAX_VALUE_LEN (e.g. usize) and validate `value_len as usize <= MAX_VALUE_LEN`
immediately after decoding `value_len` and before vec allocation in all branches
handling Node::KV, Node::KVValueHash, Node::KVRefValueHash,
Node::KVValueHashFeatureType, Node::KVCount, and Node::KVRefValueHashCount (both
Push and PushInverted variants); factor the logic into a small helper (e.g.
read_validated_value or ensure_max_value_len) to perform
decode->validate->allocate->read_exact and return a Decode error if the length
exceeds the max to avoid duplication.
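The decode, validate, allocate, read_exact order requested above can be sketched as follows. This is an illustration under assumptions rather than the merged helper: it uses `std::io` errors in place of the crate's Decode error, reads the length as a big-endian u32, and takes "64 MB" to mean 64 * 1024 * 1024 bytes.

```rust
use std::io::{self, Read};

// Assumed cap: "64 MB" interpreted as 64 * 1024 * 1024 bytes.
const MAX_VALUE_LEN: usize = 64 * 1024 * 1024;

/// Reads a u32 length prefix, rejects oversized lengths before any
/// allocation happens, then reads exactly that many value bytes.
fn read_validated_value<R: Read>(src: &mut R) -> io::Result<Vec<u8>> {
    // 1. Decode the claimed length (big-endian is an assumption here).
    let mut len_bytes = [0u8; 4];
    src.read_exact(&mut len_bytes)?;
    let value_len = u32::from_be_bytes(len_bytes) as usize;

    // 2. Validate before allocating, so a hostile length cannot
    //    force a multi-gigabyte buffer.
    if value_len > MAX_VALUE_LEN {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "value length exceeds MAX_VALUE_LEN",
        ));
    }

    // 3. Allocate and 4. fill; read_exact fails on truncated input.
    let mut value = vec![0u8; value_len];
    src.read_exact(&mut value)?;
    Ok(value)
}
```

Keeping the check in one helper means every Push and PushInverted branch gets the same bound without repeating it.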
🧹 Nitpick comments (1)
merk/src/proofs/encoding.rs (1)

33-275: Consider extracting helper functions to reduce code duplication.

The encoding logic for small vs. large values is repeated for each variant with minor differences. A helper function could handle the common pattern:

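use std::io::Write;

use ed::Encode;

/// Writes the opcode, key length, key, and value length for one KV node.
/// The caller still writes the value bytes and any trailing fields.
fn write_kv_header<W: Write>(
    dest: &mut W,
    small_opcode: u8,
    large_opcode: u8,
    key: &[u8],
    value_len: usize,
) -> ed::Result<()> {
    if value_len < 65536 {
        // Small value: the length fits in a u16.
        dest.write_all(&[small_opcode, key.len() as u8])?;
        dest.write_all(key)?;
        (value_len as u16).encode_into(dest)
    } else {
        // Large value: switch to the large opcode and a u32 length.
        dest.write_all(&[large_opcode, key.len() as u8])?;
        dest.write_all(key)?;
        (value_len as u32).encode_into(dest)
    }
}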

This would reduce the repetitive branching logic across 12 match arms.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
merk/src/proofs/encoding.rs (1)

41-148: Enforce MAX_VALUE_LEN during encoding to avoid invalid proofs and u32 truncation.

value.len() is written as u32 without a cap. Values larger than MAX_VALUE_LEN (64 MB) produce proofs that the decoder rejects, and values larger than u32::MAX are silently truncated by the cast. Add a guard and fail early in all value-bearing branches.

🛠️ Suggested fix (apply to all value-bearing variants)
 Op::Push(Node::KV(key, value)) => {
     debug_assert!(key.len() < 256);
-    if value.len() < 65536 {
+    if value.len() > MAX_VALUE_LEN as usize {
+        return Err(ed::Error::IOError(std::io::Error::new(
+            std::io::ErrorKind::InvalidInput,
+            "value length exceeds MAX_VALUE_LEN",
+        )));
+    }
+    if value.len() < 65536 {
         dest.write_all(&[0x03, key.len() as u8])?;
         dest.write_all(key)?;
         (value.len() as u16).encode_into(dest)?;
         dest.write_all(value)?;
     } else {
         dest.write_all(&[0x20, key.len() as u8])?;
         dest.write_all(key)?;
         (value.len() as u32).encode_into(dest)?;
         dest.write_all(value)?;
     }
 }

Also applies to: 169-280
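One way to make the guard reusable across branches is a checked conversion that replaces the bare `as u32` cast. The sketch below is hypothetical: `checked_value_len` is not a function in the crate, the error type is a plain `String` for brevity, and MAX_VALUE_LEN is assumed to be a u32 equal to 64 * 1024 * 1024.

```rust
use std::convert::TryFrom;

// Assumed cap: "64 MB" interpreted as 64 * 1024 * 1024 bytes.
const MAX_VALUE_LEN: u32 = 64 * 1024 * 1024;

/// Converts a value length to u32 without truncation and enforces
/// the cap, so the encoder never emits a length the decoder rejects.
fn checked_value_len(len: usize) -> Result<u32, String> {
    // try_from fails instead of wrapping when len exceeds u32::MAX.
    let len32 = u32::try_from(len)
        .map_err(|_| format!("value length {} does not fit in u32", len))?;
    if len32 > MAX_VALUE_LEN {
        return Err(format!(
            "value length {} exceeds MAX_VALUE_LEN ({})",
            len32, MAX_VALUE_LEN
        ));
    }
    Ok(len32)
}
```

Each value-bearing arm could call such a function once and then branch on whether the result is below 65,536.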

@QuantumExplorer QuantumExplorer changed the title from "feat: support large value opcodes in encoding and decoding" to "feat(merk): support large value opcodes in encoding and decoding" Jan 20, 2026
@QuantumExplorer QuantumExplorer merged commit 86562f6 into develop Jan 20, 2026
8 checks passed
@QuantumExplorer QuantumExplorer deleted the featsupport-large-value-opcodes branch January 20, 2026 12:26