feat: upgrade to quantum-safe cryptography (ChaCha20-Poly1305 + BLAKE3)#440
feat: upgrade to quantum-safe cryptography (ChaCha20-Poly1305 + BLAKE3)#440grumbach merged 19 commits intomaidsafe:masterfrom
Conversation
|
@claude please review |
- Allow clippy::type_complexity on test helper function - Ignore RUSTSEC-2025-0141 (bincode unmaintained, no upgrade path)
Replace all internal usages with stream_encrypt-based implementations. Python and Node.js bindings retain their interface but use stream_encrypt internally. Integration tests use a local helper function instead.
Replace all internal usages with streaming_decrypt-based implementations. Python and Node.js bindings retain their interface but use streaming_decrypt internally. Integration tests use a local decrypt_to_file helper.
dirvine
left a comment
There was a problem hiding this comment.
Thanks for the PR. I'm requesting changes for three blocking issues.
-
The ChaCha20-Poly1305 swap introduces a nonce-reuse hazard. In the current derivation, the AEAD key comes from
n_1_src_hashand the nonce comes fromn_2_src_hash, so two different current chunks can be encrypted under the same key/nonce pair whenever their two predecessor chunks match. That's not safe for ChaCha20-Poly1305.AES-GCMwould not fix this; it has the same nonce-uniqueness requirement. Safer options would be:- a misuse-resistant AEAD such as
AES-GCM-SIVorAES-SIV, if the goal is to stay safe even when nonces repeat, or - keep ChaCha20-Poly1305, but derive
(key, nonce, pad)through a KDF / rehash step over richer context such assrc_hash || n_1_src_hash || n_2_src_hash || chunk_index || child_levelwith domain separation. That way, different current chunks do not reuse the same nonce under the same key.
- a misuse-resistant AEAD such as
-
The Python
streaming_decrypt_from_storagecallback contract appears to have changed fromlist[chunk_name] -> list[bytes]to indexed tuples, but the public-facing tests still exercise the older shape. That looks like a runtime API break in the bindings. -
The Node TypeScript declaration for
streamingDecryptFromStorageis out of sync with the Rust implementation: the declaration says the callback returns a singleUint8Array, while the Rust side iterates the result as an array of chunk buffers.
Happy to re-review once those are addressed.
|
Design note for the ChaCha20-Poly1305 path The core constraint here is not just “pick a stronger AEAD”, it is “pick a stronger AEAD without breaking the self-encryption model”. This code path still needs to preserve:
The current ChaCha20-Poly1305 design fails on the third point because the AEAD key is derived from Important: switching to Two viable directions:
A few concrete recommendations if you go with option 2:
In short: |
…mprove Python docs - Replace naive byte-slicing key derivation with domain-separated BLAKE3 KDF over full chunk context (src_hash || n1 || n2 || chunk_index || child_level) using context string 'self_encryption/chunk/v2' to eliminate nonce-reuse hazard - Fix Node.js TS declaration: streaming_decrypt_from_storage callback returns Uint8Array[] not Uint8Array - Improve Python binding docstrings for callback contract clarity - Add regression tests for KDF domain separation by src_hash, chunk_index, and child_level
Leave 4KiB headroom below 4MiB to accommodate occasional compression growth that can cause chunks to slightly exceed the 4MiB boundary.
GitHub no longer supports macos-13 runners.
Add three new unit tests for the BLAKE3 KDF in get_pki(): - test_kdf_deterministic: same inputs always produce identical output - test_kdf_avalanche_single_bit_flip: single bit change causes >30% output divergence - test_kdf_output_sizes: verify pad=52, key=32, nonce=12 bytes
The get_chunks callback now unpacks (index, name) tuples and returns (index, bytes) tuples, matching the expected contract.
Drop all backward-compatible v0/legacy DataMap format handling, keeping only the clean v1 versioned format. Removes old format fallback in from_bytes(), visit_seq v0 handler in Deserialize impl, and 6 legacy tests.
|
Thanks for the thorough work on this @grumbach — the crypto upgrade is well-designed. The BLAKE3 KDF with domain separation is a big improvement over the old byte-slicing approach, and the bounds-checked indexing throughout eliminates a whole class of panics. Solid stuff. I did a deep review and found a few things worth looking at before merge: 1. 🔴 Critical — No test verifies AEAD integrity through the library's own pipeline
Suggestion: Add a test that calls 2. 🟠 High —
|
…vel KDF, parallel decrypt, DataMap parsing, bounds check, test rename - Add AEAD tamper detection integration test (Issue 1) - Fix stream encrypt/decrypt cross-implementation test (Issue 2) - Pass child_level through KDF for domain separation in shrink_data_map (Issue 3) - Parallelize decrypt_sorted_set with rayon par_iter (Issue 4) - Simplify DataMap::from_bytes to delegate to bincode (Issue 5) - Add bounds check in get_n_1_n_2 for out-of-range chunk_index (Issue 6) - Rename test_key_derivation_sizes to test_key_derivation_sizes_and_roundtrip (Issue 7)
…ream, fix stale docstring - Fix invalid `xor_name::crate::hash::content_hash` syntax in src/tests.rs (write_and_read and seek_over_chunk_limit) — should be `crate::hash::` - Thread child_level through DecryptionStream instead of hardcoding 0, so streaming decrypt uses the correct KDF domain separation - Fix Python docstring that still said "SHA256" instead of "BLAKE3" Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…sages, test duplication - Remove unreachable `<error>` branches in `debug_bytes` (len > 6 is already proven by the early return, so direct indexing is safe) - Propagate read errors in Node.js and Python file-read iterators instead of silently returning None on I/O failure - Improve `into_datamap()` doc and error messages to explain that `chunks()` must be fully consumed before calling it - Deduplicate 5 near-identical streaming roundtrip integration tests into 2 parameterized helpers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Upgrades the cryptographic primitives to quantum-resistant alternatives, preparing self_encryption for post-quantum security:
What changed
New files
src/cipher.rs— ChaCha20-Poly1305 encrypt/decrypt replacing the oldsrc/aes.rs(CBC mode)src/hash.rs—content_hash()using BLAKE3, replacingXorName::from_content()(SHA3-256)Key derivation changes (
src/utils.rs)get_pad_key_and_iv()→get_pad_key_and_nonce()— now returnsResultwith bounds-checked indexingget_n_1_n_2()now validatestotal_num_chunks >= 3and returnsResultEncrypt/decrypt pipeline (
src/encrypt.rs,src/decrypt.rs)DataMap changes (
src/data_map.rs)infos()now returns&[ChunkInfo]instead ofVec<ChunkInfo>(zero-copy)DataMap::to_bytes()/DataMap::from_bytes()debug_bytes()uses safe.get()indexing instead of direct array accessRecursive DataMap handling (
src/lib.rs)get_root_data_map()andget_root_data_map_parallel()now useDataMap::from_bytes()instead oftest_helpers::deserialise()XorName::from_content()calls replaced withhash::content_hash()Streaming APIs (
src/stream_encrypt.rs,src/stream_decrypt.rs,src/stream_file.rs)streaming_encrypt_from_file,streaming_decrypt_from_storage,stream_encrypt,streaming_decryptSafety improvements
vec[0]) — replaced with.get(),.first(),.ok_or()get_pad_key_and_nonce()returnsResultinstead of panicking on bad indicesformat!("{var}")instead offormat!("{}", var))Dependencies
chacha20poly1305 = "0.10",blake3 = "1"aes,cbcdeny.tomllicense allow listNew tests (19 added in
tests/integration_tests.rs)test_stream_encrypt_decrypt_roundtriptest_file_stream_encrypt_decrypt_roundtriptest_stream_encrypt_file_decrypt_storage_crosstest_file_encrypt_stream_decrypt_crosstest_memory_encrypt_stream_decrypttest_stream_encrypt_memory_decrypttest_streaming_decrypt_range_accesstest_large_file_streaming_roundtriptest_minimum_size_datatest_single_chunk_boundarytest_multi_chunk_boundarytest_non_aligned_sizestest_encrypted_chunks_do_not_contain_plaintexttest_tampered_chunk_fails_decryptiontest_wrong_datamap_fails_decryptiontest_aead_tag_protects_integritytest_content_hash_is_blake3content_hash()matches rawblake3::hash()test_encrypted_chunk_dst_hash_is_blake3test_key_derivation_sizesBreaking changes
DataMap::infos()returns&[ChunkInfo]instead ofVec<ChunkInfo>get_pad_key_and_nonce()returnsResult(was infallibleget_pad_key_and_iv())Why quantum-safe
AES-128 has an effective security level of 64 bits against Grover's algorithm on a quantum computer. ChaCha20-Poly1305 with a 256-bit key retains 128-bit security post-quantum. BLAKE3 similarly maintains its security margin against quantum attacks. While the XOR obfuscation layer and convergent key derivation are symmetric constructions unaffected by Shor's algorithm, upgrading to 256-bit primitives across the board ensures the entire pipeline remains secure in a post-quantum world.