Description
Taking a step back from #354, I thought it'd be good to look at how and where ILP and SIMD parallelism are currently used across the project as a whole, and how that could be improved.
The only place we presently have any sort of parallelism abstraction at the trait level is BlockCipher::ParBlocks. Otherwise, various crates leverage e.g. SIMD internally. As for BlockCipher::ParBlocks specifically, the only crate that leverages it is the aes crate.
The following crates have SIMD backends:
Ciphers
- aes
- chacha20
UHFs/"MACs"
- polyval
- poly1305
AEADs
In AEADs, we'd like to glue the above crates together in fairly fixed combinations in order to leverage ILP, passing SIMD buffers from ciphers to UHFs for authentication:
- aes-gcm/aes-gcm-siv: aes + ghash/polyval
- chacha20poly1305: chacha20 + poly1305
(also aes-siv and pmac, but this is less of a priority)
In each of these cases there's a single specific buffer type I think it'd be nice for both the cipher implementation and the UHF to support in common:
- aes-gcm/aes-gcm-siv: "i128x8", i.e. [__m128i; 8] on x86/x86_64
- chacha20poly1305: "i256x4", i.e. [__m256i; 4] on x86/x86_64
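To make the shared-buffer idea concrete, here is a rough sketch of the data flow, assuming trait shapes like the ones suggested later in this issue. Everything in it is hypothetical: `I128x8` is a plain byte-array stand-in rather than a real SIMD type, `ToyCipher` and `ToyHash` are insecure placeholders, and `encrypt_then_hash` is an invented helper. A real AEAD would run the cipher in CTR mode; the cipher step is reduced to one in-place call here to keep the handoff visible.

```rust
// Hypothetical sketch: a cipher and a UHF sharing one buffer type.

/// Marker trait for buffer types (assumed; not defined in the issue).
pub trait SimdBuffer {}

/// Portable stand-in for "i128x8": eight 16-byte blocks.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct I128x8(pub [[u8; 16]; 8]);
impl SimdBuffer for I128x8 {}

pub trait BlockEncryptPar<B: SimdBuffer> {
    fn encrypt_par(&self, buffer: &mut B);
}

pub trait UniversalHashPar<B: SimdBuffer> {
    fn update_par(&mut self, blocks: &B);
}

/// Encrypt a buffer in place, then hand the same buffer to the hash
/// without converting it to another representation in between.
pub fn encrypt_then_hash<B, C, U>(cipher: &C, uhf: &mut U, buffer: &mut B)
where
    B: SimdBuffer,
    C: BlockEncryptPar<B>,
    U: UniversalHashPar<B>,
{
    cipher.encrypt_par(buffer);
    uhf.update_par(buffer);
}

/// Placeholder "cipher": XORs every byte with one key byte. Not secure.
pub struct ToyCipher(pub u8);
impl BlockEncryptPar<I128x8> for ToyCipher {
    fn encrypt_par(&self, buffer: &mut I128x8) {
        for byte in buffer.0.iter_mut().flatten() {
            *byte ^= self.0;
        }
    }
}

/// Placeholder "hash": XOR-folds all blocks into a 16-byte accumulator.
/// This is not a universal hash; it only shows where the data goes.
#[derive(Default)]
pub struct ToyHash(pub [u8; 16]);
impl UniversalHashPar<I128x8> for ToyHash {
    fn update_par(&mut self, blocks: &I128x8) {
        for block in blocks.0.iter() {
            for (acc, byte) in self.0.iter_mut().zip(block.iter()) {
                *acc ^= *byte;
            }
        }
    }
}
```

The point of the generic `encrypt_then_hash` is that an AEAD crate would only need to name one buffer type `B` that both of its building blocks accept.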
Concrete proposal
My suggestion is to get rid of BlockCipher::ParBlocks and replace it with more general SIMD types and traits designed to work with them, namely:
- Add a new utils crate, e.g. simd-buffers, which provides "i128x8" and "i256x4" SIMD buffer types which are backed by __m128i/__m256i on x86/x86_64 and otherwise provide a portable implementation. These types don't need to implement any sort of arithmetic, just provide wrappers for passing data between SIMD implementations.
- Add traits to cipher and universal-hash which operate on SIMD buffers.
- Use SIMD buffer types in the implementations of aes-gcm, aes-gcm-siv, and chacha20poly1305.
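As a rough illustration of the first bullet, an "i128x8" type could look something like the sketch below. The names `I128x8`, `from_blocks`, `to_blocks`, and the `backend` module are all made up for this example, and the `SimdBuffer` marker trait is an assumed shape. For simplicity the SIMD-backed variant is gated on x86_64 only, where SSE2 is part of the baseline; 32-bit x86 would need its own feature handling.

```rust
// Hypothetical sketch of a `simd-buffers`-style type; all names are illustrative.

/// Marker trait for buffer types (assumed shape; not defined in the issue).
pub trait SimdBuffer: Copy {}

#[cfg(target_arch = "x86_64")]
mod backend {
    use std::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_storeu_si128};

    /// Eight 128-bit lanes held as SSE2 vectors.
    #[derive(Clone, Copy)]
    pub struct I128x8(pub [__m128i; 8]);

    impl I128x8 {
        /// Load eight 16-byte blocks into vector registers.
        pub fn from_blocks(blocks: &[[u8; 16]; 8]) -> Self {
            Self(std::array::from_fn(|i| {
                // SAFETY: SSE2 is part of the x86_64 baseline, and the
                // unaligned load reads exactly the 16 bytes of `blocks[i]`.
                unsafe { _mm_loadu_si128(blocks[i].as_ptr() as *const __m128i) }
            }))
        }

        /// Store the eight lanes back into plain byte blocks.
        pub fn to_blocks(&self) -> [[u8; 16]; 8] {
            let mut out = [[0u8; 16]; 8];
            for (dst, src) in out.iter_mut().zip(self.0.iter()) {
                // SAFETY: the unaligned store writes exactly 16 bytes into `dst`.
                unsafe { _mm_storeu_si128(dst.as_mut_ptr() as *mut __m128i, *src) };
            }
            out
        }
    }
}

#[cfg(not(target_arch = "x86_64"))]
mod backend {
    /// Portable fallback with the same API: plain byte blocks.
    #[derive(Clone, Copy)]
    pub struct I128x8(pub [[u8; 16]; 8]);

    impl I128x8 {
        pub fn from_blocks(blocks: &[[u8; 16]; 8]) -> Self {
            Self(*blocks)
        }

        pub fn to_blocks(&self) -> [[u8; 16]; 8] {
            self.0
        }
    }
}

pub use backend::I128x8;

impl SimdBuffer for I128x8 {}
```

Note that the type deliberately has no arithmetic: it only moves data in and out, matching the "just wrappers" requirement above. Cipher and UHF backends would reach into the inner array on the platforms they support.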
cipher API suggestion
I'd suggest adding traits to cipher that use the SIMD buffer types and are useful for both block ciphers and stream ciphers.
I also think it might make sense to use a generic parameter rather than an associated type to permit support for multiple buffer types (e.g. on newer CPUs, "i128x4" might be a better option for AES, but we can support both):
/// Note that for practical purposes, we only need to support block cipher encryption,
/// but there could also be a `BlockDecryptPar` for completeness/consistency.
pub trait BlockEncryptPar<B: SimdBuffer> {
    fn encrypt_par(&self, buffer: &mut B);
}

pub trait StreamCipherPar<B: SimdBuffer> {
    fn try_apply_keystream_par(&mut self, buffer: &mut B) -> Result<(), LoopError>;
}

universal-hash API suggestion
pub trait UniversalHashPar<B: SimdBuffer> {
    fn update_par(&mut self, blocks: &B);
}

SIMD ctr support
Trying to move the end-user-facing aes-ctr types into aes has created a very annoying circular dependency between the block-ciphers and stream-ciphers repos. Furthermore, ctr is quite a bit more general now than what the CTR types in the aes crate provide, and aes doesn't actually provide the CTR "flavors" (Ctr32BE/Ctr32LE) needed by aes-gcm and aes-gcm-siv.
But really, it seems like the main benefit of the implementation in the aes crate is being able to use _mm_xor_si128 to XOR an "i128x8" type.
If we had BlockEncryptPar and StreamCipherPar traits, the ctr crate could glue the two together, accepting a SIMD buffer as input, computing the next buffer of keystream output, and XORing the latter into the former. This would allow ctr to be generally SIMD optimized, and also mean we only have one ctr implementation to worry about instead of a separate one in the AES crate.
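A minimal sketch of that glue, under several assumptions: the issue says the buffer types need no arithmetic, so putting `xor_in_place`, `from_counters`, and `BLOCKS` on `SimdBuffer` is my own guess at what ctr would need from somewhere. The counter block layout (8-byte nonce followed by a 64-bit big-endian counter), the `Ctr` struct, the `LoopError` stand-in, and the insecure `ToyCipher` are also hypothetical, and `I128x8` is a portable byte-array stand-in rather than a real SIMD type.

```rust
// Hypothetical sketch: a generic CTR wrapper built from the proposed traits.

/// Assumed buffer trait; the helper methods are not part of the proposal.
pub trait SimdBuffer: Copy {
    /// Number of 16-byte blocks per buffer.
    const BLOCKS: u64;
    /// Build consecutive counter blocks: nonce || big-endian counter.
    fn from_counters(nonce: [u8; 8], first: u64) -> Self;
    /// XOR `other` into `self` (where _mm_xor_si128 would go on x86).
    fn xor_in_place(&mut self, other: &Self);
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct I128x8(pub [[u8; 16]; 8]);

impl SimdBuffer for I128x8 {
    const BLOCKS: u64 = 8;

    fn from_counters(nonce: [u8; 8], first: u64) -> Self {
        let mut blocks = [[0u8; 16]; 8];
        for (i, block) in blocks.iter_mut().enumerate() {
            block[..8].copy_from_slice(&nonce);
            block[8..].copy_from_slice(&(first + i as u64).to_be_bytes());
        }
        Self(blocks)
    }

    fn xor_in_place(&mut self, other: &Self) {
        for (a, b) in self.0.iter_mut().flatten().zip(other.0.iter().flatten()) {
            *a ^= *b;
        }
    }
}

/// Stand-in for the cipher crate's error type.
#[derive(Debug, PartialEq)]
pub struct LoopError;

pub trait BlockEncryptPar<B: SimdBuffer> {
    fn encrypt_par(&self, buffer: &mut B);
}

pub trait StreamCipherPar<B: SimdBuffer> {
    fn try_apply_keystream_par(&mut self, buffer: &mut B) -> Result<(), LoopError>;
}

/// Placeholder "block cipher": XORs each block with the key. Not secure.
pub struct ToyCipher(pub [u8; 16]);

impl BlockEncryptPar<I128x8> for ToyCipher {
    fn encrypt_par(&self, buffer: &mut I128x8) {
        for block in buffer.0.iter_mut() {
            for (b, k) in block.iter_mut().zip(self.0.iter()) {
                *b ^= *k;
            }
        }
    }
}

/// Generic CTR wrapper: works for any cipher/buffer pair that fits the traits.
pub struct Ctr<C> {
    cipher: C,
    nonce: [u8; 8],
    counter: u64,
}

impl<C> Ctr<C> {
    pub fn new(cipher: C, nonce: [u8; 8]) -> Self {
        Self { cipher, nonce, counter: 0 }
    }
}

impl<B: SimdBuffer, C: BlockEncryptPar<B>> StreamCipherPar<B> for Ctr<C> {
    fn try_apply_keystream_par(&mut self, buffer: &mut B) -> Result<(), LoopError> {
        // Refuse to wrap the counter, which would repeat keystream.
        let next = self.counter.checked_add(B::BLOCKS).ok_or(LoopError)?;
        let mut keystream = B::from_counters(self.nonce, self.counter);
        self.cipher.encrypt_par(&mut keystream); // counters -> keystream
        buffer.xor_in_place(&keystream); // keystream XOR data
        self.counter = next;
        Ok(())
    }
}
```

Because the buffer is a generic parameter, the same `Ctr<C>` would pick up an "i128x4" path for free if the cipher also implemented `BlockEncryptPar` for that type.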