
feat: upgrade to quantum-safe cryptography (ChaCha20-Poly1305 + BLAKE3) #440

Merged
grumbach merged 19 commits into maidsafe:master from grumbach:post_quatum
Mar 13, 2026

Conversation


@grumbach grumbach commented Mar 10, 2026

Summary

Upgrades the cryptographic primitives to quantum-resistant alternatives, preparing self_encryption for post-quantum security:

  • AES-128-CBC → ChaCha20-Poly1305: 256-bit key, 96-bit nonce, authenticated encryption (AEAD) — provides both confidentiality and integrity in a single pass
  • SHA3-256 → BLAKE3: Faster content hashing with 256-bit output, used for chunk addressing and key derivation
  • Major version bump: 0.34.3 → 1.0.0 (breaking change — encrypted data is not backward-compatible)

What changed

New files

  • src/cipher.rs — ChaCha20-Poly1305 encrypt/decrypt replacing the old src/aes.rs (CBC mode)
  • src/hash.rs — content_hash() using BLAKE3, replacing XorName::from_content() (SHA3-256)

Key derivation changes (src/utils.rs)

  • get_pad_key_and_iv() → get_pad_key_and_nonce() — now returns Result with bounds-checked indexing
  • Key: full 32 bytes from neighboring chunk hash (was 16 bytes)
  • Nonce: 12 bytes from second neighbor hash (was 16-byte IV)
  • get_n_1_n_2() now validates total_num_chunks >= 3 and returns Result
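The neighbour-selection logic described above can be sketched as follows. This is an illustrative stdlib-only reimplementation, not the crate's code — the exact wrap-around rule and error type are assumptions:

```rust
// Hypothetical sketch of get_n_1_n_2-style neighbour selection with the
// validation described in the PR: at least 3 chunks, index in range.
fn get_n_1_n_2(chunk_index: usize, total_num_chunks: usize) -> Result<(usize, usize), String> {
    if total_num_chunks < 3 {
        return Err(format!("need at least 3 chunks, got {total_num_chunks}"));
    }
    if chunk_index >= total_num_chunks {
        return Err(format!(
            "chunk index {chunk_index} out of range for {total_num_chunks} chunks"
        ));
    }
    // Predecessors wrap around, so chunk 0 borrows hashes from the end.
    let n_1 = (chunk_index + total_num_chunks - 1) % total_num_chunks;
    let n_2 = (chunk_index + total_num_chunks - 2) % total_num_chunks;
    Ok((n_1, n_2))
}
```

Returning Result here (rather than panicking) is what lets the caller, get_pad_key_and_nonce(), surface a clear error on malformed input.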

Encrypt/decrypt pipeline (src/encrypt.rs, src/decrypt.rs)

  • Encrypt: compress → ChaCha20-Poly1305 encrypt → XOR obfuscate (was: compress → AES-CBC encrypt → XOR)
  • Decrypt: XOR deobfuscate → ChaCha20-Poly1305 decrypt → decompress
  • AEAD tag (16 bytes) now authenticates each chunk — tampered chunks are rejected at the crypto layer
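The XOR obfuscation step in the pipeline above is its own involution: applying the same pad twice restores the input, which is why decrypt can simply run the stages in reverse. A minimal sketch (the cyclic-pad behaviour is an assumption for illustration, not the crate's exact code):

```rust
// XOR the data against a pad that repeats cyclically. Running this twice
// with the same pad returns the original bytes, so the obfuscation layer
// is trivially reversible once the pad is derived.
fn xor_with_pad(data: &[u8], pad: &[u8]) -> Vec<u8> {
    data.iter().zip(pad.iter().cycle()).map(|(d, p)| d ^ p).collect()
}
```

Note that this layer provides obfuscation only; confidentiality and integrity come from the ChaCha20-Poly1305 stage beneath it.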

DataMap changes (src/data_map.rs)

  • infos() now returns &[ChunkInfo] instead of Vec<ChunkInfo> (zero-copy)
  • Serialization uses versioned format with DataMap::to_bytes() / DataMap::from_bytes()
  • debug_bytes() uses safe .get() indexing instead of direct array access
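The versioned serialization format mentioned above can be illustrated with a minimal envelope: a leading version byte gates parsing, so a future format change fails loudly instead of being misparsed. The layout below is hypothetical — the real DataMap encoding carries chunk infos, not an opaque payload:

```rust
// Illustrative versioned envelope in the spirit of DataMap::to_bytes /
// DataMap::from_bytes: one version byte, then the payload.
const DATA_MAP_VERSION: u8 = 1;

fn to_bytes(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + payload.len());
    out.push(DATA_MAP_VERSION);
    out.extend_from_slice(payload);
    out
}

fn from_bytes(bytes: &[u8]) -> Result<&[u8], String> {
    match bytes.first() {
        Some(&DATA_MAP_VERSION) => Ok(&bytes[1..]),
        Some(v) => Err(format!("unsupported DataMap version: {v}")),
        None => Err("empty DataMap bytes".to_string()),
    }
}
```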

Recursive DataMap handling (src/lib.rs)

  • get_root_data_map() and get_root_data_map_parallel() now use DataMap::from_bytes() instead of test_helpers::deserialise()
  • Added recursion depth limit (100) to prevent stack overflow on malicious DataMaps
  • All XorName::from_content() calls replaced with hash::content_hash()
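The recursion depth guard works by threading a depth counter through the recursive resolution and bailing out past the limit. A toy model (the Map type and limit placement are illustrative; only the limit value 100 comes from the PR):

```rust
// Depth-limited recursion: a maliciously deep chain of child DataMaps
// returns an error instead of overflowing the stack.
const MAX_RECURSION_DEPTH: usize = 100;

enum Map {
    Leaf(u32),
    Child(Box<Map>),
}

fn resolve(map: &Map, depth: usize) -> Result<u32, String> {
    if depth > MAX_RECURSION_DEPTH {
        return Err("DataMap recursion depth limit exceeded".to_string());
    }
    match map {
        Map::Leaf(v) => Ok(*v),
        Map::Child(inner) => resolve(inner, depth + 1),
    }
}
```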

Streaming APIs (src/stream_encrypt.rs, src/stream_decrypt.rs, src/stream_file.rs)

  • Updated to use new cipher module and BLAKE3 hashing throughout
  • All four streaming functions work with the new crypto: streaming_encrypt_from_file, streaming_decrypt_from_storage, stream_encrypt, streaming_decrypt

Safety improvements

  • Eliminated all direct array indexing (vec[0]) — replaced with .get(), .first(), .ok_or()
  • get_pad_key_and_nonce() returns Result instead of panicking on bad indices
  • Inline format variables throughout (format!("{var}") instead of format!("{}", var))
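The indexing pattern above, in isolation (a hypothetical helper, not taken from the crate):

```rust
// Bounds-checked access: .first()/.get() return Option, and .ok_or()
// converts a miss into a recoverable error instead of a panic.
fn first_two(v: &[u8]) -> Result<(u8, u8), &'static str> {
    let a = *v.first().ok_or("input is empty")?;
    let b = *v.get(1).ok_or("need at least two bytes")?;
    Ok((a, b))
}
```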

Dependencies

  • Added: chacha20poly1305 = "0.10", blake3 = "1"
  • Removed: aes, cbc
  • Updated deny.toml license allow list

New tests (19 added in tests/integration_tests.rs)

| Test | What it verifies |
| --- | --- |
| test_stream_encrypt_decrypt_roundtrip | Iterator-based streaming encrypt → decrypt |
| test_file_stream_encrypt_decrypt_roundtrip | File-based streaming encrypt → file decrypt |
| test_stream_encrypt_file_decrypt_storage_cross | Cross-API: stream_encrypt → streaming_decrypt_from_storage |
| test_file_encrypt_stream_decrypt_cross | Cross-API: file encrypt → streaming_decrypt |
| test_memory_encrypt_stream_decrypt | Cross-API: in-memory encrypt → streaming_decrypt |
| test_stream_encrypt_memory_decrypt | Cross-API: stream_encrypt → in-memory decrypt |
| test_streaming_decrypt_range_access | Random-access decryption of byte ranges |
| test_large_file_streaming_roundtrip | 10MB file streaming roundtrip |
| test_minimum_size_data | Minimum encryptable size (3 bytes) |
| test_single_chunk_boundary | Exactly at single chunk boundary |
| test_multi_chunk_boundary | Exactly at multi-chunk boundary |
| test_non_aligned_sizes | Non-aligned sizes (prime numbers) |
| test_encrypted_chunks_do_not_contain_plaintext | Sentinel pattern not in encrypted output |
| test_tampered_chunk_fails_decryption | Bit-flipped chunk rejected by AEAD |
| test_wrong_datamap_fails_decryption | Wrong DataMap cannot decrypt chunks |
| test_aead_tag_protects_integrity | Truncated/modified AEAD tag → decryption failure |
| test_content_hash_is_blake3 | content_hash() matches raw blake3::hash() |
| test_encrypted_chunk_dst_hash_is_blake3 | Chunk addresses are BLAKE3 hashes |
| test_key_derivation_sizes | Key=32 bytes, Nonce=12 bytes, Pad=52 bytes |

Breaking changes

  • Encrypted data format is incompatible with previous versions (different cipher, different hashing)
  • DataMap::infos() returns &[ChunkInfo] instead of Vec<ChunkInfo>
  • get_pad_key_and_nonce() returns Result (was infallible get_pad_key_and_iv())
  • Content addresses change (BLAKE3 vs SHA3-256) — existing chunk stores are invalidated

Why quantum-safe

AES-128 has an effective security level of 64 bits against Grover's algorithm on a quantum computer. ChaCha20-Poly1305 with a 256-bit key retains 128-bit security post-quantum. BLAKE3 similarly maintains its security margin against quantum attacks. While the XOR obfuscation layer and convergent key derivation are symmetric constructions unaffected by Shor's algorithm, upgrading to 256-bit primitives across the board ensures the entire pipeline remains secure in a post-quantum world.
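The arithmetic behind the paragraph above: Grover's algorithm gives a quadratic speedup over brute-force key search, so an n-bit symmetric key retains roughly n/2 bits of post-quantum security. As a trivial illustrative helper (not part of the crate):

```rust
// Effective post-quantum security level of a symmetric key under Grover's
// algorithm: the 2^n search space shrinks to about 2^(n/2) quantum queries.
fn post_quantum_security_bits(key_bits: u32) -> u32 {
    key_bits / 2
}
```

Hence AES-128 drops to 64-bit effective security, while a 256-bit ChaCha20 key retains 128 bits.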

@grumbach
Member Author

@claude please review

- Allow clippy::type_complexity on test helper function
- Ignore RUSTSEC-2025-0141 (bincode unmaintained, no upgrade path)
@grumbach grumbach changed the title "feat: post quatum crypto" to "feat: upgrade to quantum-safe cryptography (ChaCha20-Poly1305 + BLAKE3)" Mar 10, 2026
Replace all internal usages with stream_encrypt-based implementations.
Python and Node.js bindings retain their interface but use stream_encrypt internally.
Integration tests use a local helper function instead.
Replace all internal usages with streaming_decrypt-based implementations.
Python and Node.js bindings retain their interface but use streaming_decrypt internally.
Integration tests use a local decrypt_to_file helper.

@dirvine dirvine left a comment


Thanks for the PR. I'm requesting changes for three blocking issues.

  1. The ChaCha20-Poly1305 swap introduces a nonce-reuse hazard. In the current derivation, the AEAD key comes from n_1_src_hash and the nonce comes from n_2_src_hash, so two different current chunks can be encrypted under the same key/nonce pair whenever their two predecessor chunks match. That's not safe for ChaCha20-Poly1305.

    AES-GCM would not fix this; it has the same nonce-uniqueness requirement. Safer options would be:

    • a misuse-resistant AEAD such as AES-GCM-SIV or AES-SIV, if the goal is to stay safe even when nonces repeat, or
    • keep ChaCha20-Poly1305, but derive (key, nonce, pad) through a KDF / rehash step over richer context such as src_hash || n_1_src_hash || n_2_src_hash || chunk_index || child_level with domain separation. That way, different current chunks do not reuse the same nonce under the same key.
  2. The Python streaming_decrypt_from_storage callback contract appears to have changed from list[chunk_name] -> list[bytes] to indexed tuples, but the public-facing tests still exercise the older shape. That looks like a runtime API break in the bindings.

  3. The Node TypeScript declaration for streamingDecryptFromStorage is out of sync with the Rust implementation: the declaration says the callback returns a single Uint8Array, while the Rust side iterates the result as an array of chunk buffers.

Happy to re-review once those are addressed.


dirvine commented Mar 11, 2026

Design note for the ChaCha20-Poly1305 path

The core constraint here is not just “pick a stronger AEAD”, it is “pick a stronger AEAD without breaking the self-encryption model”. This code path still needs to preserve:

  • deterministic / convergent behaviour for the same chunking context,
  • context binding to neighbouring chunks / DataMap structure,
  • no key/nonce reuse across different plaintexts,
  • stable cross-language derivation logic.

The current ChaCha20-Poly1305 design fails on the third point because the AEAD key is derived from n_1_src_hash and the nonce from n_2_src_hash, so two different current chunks can reuse the same (key, nonce) whenever the two predecessor hashes are the same.

Important: switching to AES-GCM would not solve this. AES-GCM has the same nonce-uniqueness requirement. If the derivation stays the same, the problem stays the same.

Two viable directions:

  1. Preferred if the goal is robustness under deterministic derivation:
    use a misuse-resistant AEAD such as AES-GCM-SIV or AES-SIV.

    That gives much better failure behaviour if a nonce repeats, which is exactly the class of mistake this design is currently vulnerable to.

  2. If we keep ChaCha20-Poly1305:
    derive pad || key || nonce from a domain-separated KDF / rehash over the full chunk context, not just the two neighbour hashes.

    For example, derive from something like:

    version || src_hash || n_1_src_hash || n_2_src_hash || chunk_index || child_level

    A practical way to do that in this PR would be to use a domain-separated BLAKE3 derivation / XOF, since BLAKE3 is already being introduced:

    okm = KDF("self_encryption/chunk/v2", src_hash || n1 || n2 || chunk_index || child_level)

    and then split okm into:

    • pad
    • key
    • nonce

    This fixes the immediate ChaCha20-Poly1305 issue because a different current chunk implies a different src_hash, and therefore a different derived nonce and/or key. It also keeps the process deterministic.

A few concrete recommendations if you go with option 2:

  • Put src_hash into the nonce derivation at minimum. Right now the current chunk content only affects the XOR pad.
  • Add explicit domain separation and a format/version tag so future crypto changes do not collide with this derivation.
  • Include chunk_index and child_level so the same three hashes in a different structural position do not accidentally collide.
  • Add a regression test that constructs two files whose current chunk differs but whose predecessor chunk hashes are identical, and assert that the derived (key, nonce) pair is different.
  • Treat this as a format bump, because ciphertext compatibility has already changed.

In short: AES-GCM is not the fix here; either use a nonce-misuse-resistant AEAD, or keep ChaCha20-Poly1305 and make the derivation depend on the full current chunk context.
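The derivation in option 2 can be made concrete. The sketch below is illustrative only: toy_xof is a deterministic but NON-cryptographic placeholder standing in for a domain-separated BLAKE3 derive_key/XOF call, and the function names are hypothetical. The sizes (pad=52, key=32, nonce=12) and the context string follow the thread:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const PAD_SIZE: usize = 52;
const KEY_SIZE: usize = 32;
const NONCE_SIZE: usize = 12;

// NOT cryptographic: a stdlib placeholder for a domain-separated BLAKE3
// XOF, used here only to show the input layout and the okm split.
fn toy_xof(context: &str, input: &[u8], out: &mut [u8]) {
    for (i, chunk) in out.chunks_mut(8).enumerate() {
        let mut h = DefaultHasher::new();
        context.hash(&mut h);
        input.hash(&mut h);
        (i as u64).hash(&mut h);
        let block = h.finish().to_le_bytes();
        let n = chunk.len();
        chunk.copy_from_slice(&block[..n]);
    }
}

// Derive (pad, key, nonce) from the FULL chunk context. Because src is an
// input, two different current chunks can never share a (key, nonce) pair.
fn derive_pad_key_nonce(
    src: &[u8; 32],
    n1: &[u8; 32],
    n2: &[u8; 32],
    chunk_index: u64,
    child_level: u32,
) -> ([u8; PAD_SIZE], [u8; KEY_SIZE], [u8; NONCE_SIZE]) {
    let mut input = Vec::new();
    input.extend_from_slice(src);
    input.extend_from_slice(n1);
    input.extend_from_slice(n2);
    input.extend_from_slice(&chunk_index.to_le_bytes());
    input.extend_from_slice(&child_level.to_le_bytes());

    let mut okm = [0u8; PAD_SIZE + KEY_SIZE + NONCE_SIZE];
    toy_xof("self_encryption/chunk/v2", &input, &mut okm);

    let mut pad = [0u8; PAD_SIZE];
    let mut key = [0u8; KEY_SIZE];
    let mut nonce = [0u8; NONCE_SIZE];
    pad.copy_from_slice(&okm[..PAD_SIZE]);
    key.copy_from_slice(&okm[PAD_SIZE..PAD_SIZE + KEY_SIZE]);
    nonce.copy_from_slice(&okm[PAD_SIZE + KEY_SIZE..]);
    (pad, key, nonce)
}
```

In the real implementation the placeholder would be a keyed BLAKE3 derivation (e.g. derive_key with the context string, extended via the XOF to 96 bytes); the split and the input ordering are what matter for the nonce-reuse fix.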

…mprove Python docs

- Replace naive byte-slicing key derivation with domain-separated BLAKE3 KDF
  over full chunk context (src_hash || n1 || n2 || chunk_index || child_level)
  using context string 'self_encryption/chunk/v2' to eliminate nonce-reuse hazard
- Fix Node.js TS declaration: streaming_decrypt_from_storage callback returns
  Uint8Array[] not Uint8Array
- Improve Python binding docstrings for callback contract clarity
- Add regression tests for KDF domain separation by src_hash, chunk_index,
  and child_level
Leave 4KiB headroom below 4MiB to accommodate occasional compression
growth that can cause chunks to slightly exceed the 4MiB boundary.
GitHub no longer supports macos-13 runners.
Add three new unit tests for the BLAKE3 KDF in get_pki():
- test_kdf_deterministic: same inputs always produce identical output
- test_kdf_avalanche_single_bit_flip: single bit change causes >30% output divergence
- test_kdf_output_sizes: verify pad=52, key=32, nonce=12 bytes
The get_chunks callback now unpacks (index, name) tuples and returns
(index, bytes) tuples, matching the expected contract.
Drop all backward-compatible v0/legacy DataMap format handling,
keeping only the clean v1 versioned format. Removes old format
fallback in from_bytes(), visit_seq v0 handler in Deserialize impl,
and 6 legacy tests.
@grumbach grumbach enabled auto-merge (rebase) March 12, 2026 07:44

mickvandijke commented Mar 12, 2026

Thanks for the thorough work on this @grumbach — the crypto upgrade is well-designed. The BLAKE3 KDF with domain separation is a big improvement over the old byte-slicing approach, and the bounds-checked indexing throughout eliminates a whole class of panics. Solid stuff.

I did a deep review and found a few things worth looking at before merge:


1. 🔴 Critical — No test verifies AEAD integrity through the library's own pipeline

test_tampered_chunk_fails_decryption flips a bit in a stored chunk, but decrypt() looks up chunks by dst_hash — the hash no longer matches after tampering, so the chunk is "not found" before AEAD verification ever runs. test_aead_tag_protects_integrity tests the raw chacha20poly1305 crate directly, bypassing the XOR pad and compression layers. Together, there's no coverage proving the library's pipeline rejects tampered data via AEAD.

Suggestion: Add a test that calls decrypt_chunk() directly with tampered ciphertext and valid src_hashes — this bypasses the hash lookup and forces the AEAD layer to be the actual gate.

2. 🟠 High — test_file_encrypt_stream_decrypt_cross is a duplicate

Both test_stream_encrypt_file_decrypt_storage_cross and test_file_encrypt_stream_decrypt_cross use stream_encrypt → streaming_decrypt. The second claims to test file-encrypt → stream-decrypt cross-compat but is identical to the first. One of them should use a different API path.

3. 🟡 Medium — shrink_data_map hardcodes child_level=0

When shrink_data_map recursively encrypts a DataMap, it calls encrypt(bytes)? which uses child_level=0 in the KDF. The DataMap::with_child() metadata tracks the level, but the KDF doesn't use it. In practice content uniqueness at each level makes collisions astronomically unlikely, but threading the actual child level through would be good defense-in-depth for nonce domain separation.

4. 🟡 Medium — Parallel decryption removed — rayon dep possibly dead

decrypt_sorted_set now processes chunks sequentially instead of using rayon. This is a performance regression for large files. Meanwhile rayon remains in Cargo.toml — if nothing else uses it, it's dead weight. Either restore parallelism or remove the dep.

5. 🟡 Medium — from_bytes() and Deserialize impl are independent code paths

data_map.rs has two separate binary deserialization implementations (~line 38 and ~line 136) that could diverge during maintenance. Consider having from_bytes() delegate to bincode::deserialize() (which uses the trait impl) to eliminate the duplication.

6. 🔵 Low — get_n_1_n_2 doesn't validate chunk_index < total_num_chunks

utils.rs:46-58: An out-of-range chunk_index silently returns neighbor indices that look valid. The error is caught downstream by .get() in get_pad_key_and_nonce, but an early guard with a clear error message would be better.

7. 🔵 Low — test_key_derivation_sizes doesn't test what it claims

The test name suggests it verifies key=32, nonce=12, pad=52 byte sizes, but it only checks size_of::<XorName>() == 32 and does an encrypt/decrypt roundtrip. The actual size verification only exists in the unit test test_kdf_output_sizes in utils.rs. Consider renaming or replacing with actual size assertions on get_pad_key_and_nonce() output.

grumbach and others added 3 commits March 12, 2026 19:17
…vel KDF, parallel decrypt, DataMap parsing, bounds check, test rename

- Add AEAD tamper detection integration test (Issue 1)
- Fix stream encrypt/decrypt cross-implementation test (Issue 2)
- Pass child_level through KDF for domain separation in shrink_data_map (Issue 3)
- Parallelize decrypt_sorted_set with rayon par_iter (Issue 4)
- Simplify DataMap::from_bytes to delegate to bincode (Issue 5)
- Add bounds check in get_n_1_n_2 for out-of-range chunk_index (Issue 6)
- Rename test_key_derivation_sizes to test_key_derivation_sizes_and_roundtrip (Issue 7)
…ream, fix stale docstring

- Fix invalid `xor_name::crate::hash::content_hash` syntax in src/tests.rs
  (write_and_read and seek_over_chunk_limit) — should be `crate::hash::`
- Thread child_level through DecryptionStream instead of hardcoding 0,
  so streaming decrypt uses the correct KDF domain separation
- Fix Python docstring that still said "SHA256" instead of "BLAKE3"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…sages, test duplication

- Remove unreachable `<error>` branches in `debug_bytes` (len > 6 is
  already proven by the early return, so direct indexing is safe)
- Propagate read errors in Node.js and Python file-read iterators
  instead of silently returning None on I/O failure
- Improve `into_datamap()` doc and error messages to explain that
  `chunks()` must be fully consumed before calling it
- Deduplicate 5 near-identical streaming roundtrip integration tests
  into 2 parameterized helpers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@mickvandijke mickvandijke left a comment


Great work!! 🚢

@grumbach grumbach dismissed dirvine’s stale review March 13, 2026 01:54

Issues fixed

@grumbach grumbach merged commit 30a7f9b into maidsafe:master Mar 13, 2026
16 checks passed