Improving cipher parallelism #444

@tarcieri

Taking a step back from #354, I thought it'd be good to look at how and where ILP and SIMD parallelism are currently used across the project as a whole, and how that could be improved.

The only place we presently have any sort of parallelism abstraction at the trait level is BlockCipher::ParBlocks, and the only crate that leverages it is the aes crate. Otherwise, various crates use SIMD internally.

The following crates have SIMD backends:

Ciphers

  • aes
  • chacha20

UHFs/"MACs"

  • polyval
  • poly1305

AEADs

In AEADs, we'd like to glue the above crates together in fairly fixed combinations in order to leverage ILP, passing SIMD buffers from ciphers to UHFs for authentication:

  • aes-gcm/aes-gcm-siv: aes + ghash/polyval
  • chacha20poly1305: chacha20 + poly1305

(also aes-siv and pmac, but this is less of a priority)

In each of these cases there's a single specific buffer type that I think it'd be nice for both the cipher implementation and the UHF to support in common:

  • aes-gcm/aes-gcm-siv: "i128x8" i.e. [__m128i; 8] on x86/x86_64
  • chacha20poly1305: "i256x4" i.e. [__m256i; 4] on x86/x86_64

Concrete proposal

My suggestion is to get rid of BlockCipher::ParBlocks and replace it with more general SIMD types and traits designed to work with them, namely:

  • Add a new utils crate e.g. simd-buffers which provides "i128x8" and "i256x4" SIMD buffer types which are backed by __m128i/__m256i on x86/x86_64 and otherwise provide a portable implementation. These types don't need to implement any sort of arithmetic, just provide wrappers for passing data between SIMD implementations.
  • Add traits to cipher and universal-hash which operate on SIMD buffers.
  • Use the SIMD buffer types in the implementations of aes-gcm, aes-gcm-siv, and chacha20poly1305.
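
To make the first bullet more concrete, here is a minimal sketch of what the portable side of such a crate might look like. Everything in it is an assumption for illustration: the `SimdBuffer` trait methods, the `I128x8`/`I256x4` type names, and the choice of a plain byte array as the fallback storage. On x86/x86_64 the same types would instead wrap `[__m128i; 8]` / `[__m256i; 4]`.

```rust
/// Hypothetical trait for SIMD buffer types. Only the name `SimdBuffer`
/// appears in this proposal; the associated items are assumptions.
pub trait SimdBuffer {
    /// Number of SIMD lanes in the buffer.
    const LANES: usize;
    /// Size of one lane in bytes.
    const LANE_BYTES: usize;

    fn as_bytes(&self) -> &[u8];
    fn as_bytes_mut(&mut self) -> &mut [u8];
}

/// Portable fallback for "i128x8": eight 128-bit lanes (128 bytes),
/// aligned like `__m128i`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(C, align(16))]
pub struct I128x8([u8; 128]);

/// Portable fallback for "i256x4": four 256-bit lanes (128 bytes),
/// aligned like `__m256i`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(C, align(32))]
pub struct I256x4([u8; 128]);

impl From<[u8; 128]> for I128x8 {
    fn from(bytes: [u8; 128]) -> Self {
        Self(bytes)
    }
}

impl From<[u8; 128]> for I256x4 {
    fn from(bytes: [u8; 128]) -> Self {
        Self(bytes)
    }
}

impl SimdBuffer for I128x8 {
    const LANES: usize = 8;
    const LANE_BYTES: usize = 16;

    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
    fn as_bytes_mut(&mut self) -> &mut [u8] {
        &mut self.0
    }
}

impl SimdBuffer for I256x4 {
    const LANES: usize = 4;
    const LANE_BYTES: usize = 32;

    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
    fn as_bytes_mut(&mut self) -> &mut [u8] {
        &mut self.0
    }
}
```

Note that the types carry no arithmetic at all, which matches the intent above: they only exist to hand data from one SIMD implementation to another.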

cipher API suggestion

I'd suggest adding traits to cipher that operate on the SIMD buffer types and are useful for both block ciphers and stream ciphers.

I also think it might make sense to use a generic parameter rather than an associated type to permit support for multiple buffer types (e.g. on newer CPUs, "i128x4" might be a better option for AES, but we can support both):

/// Note that for practical purposes, we only need to support block cipher encryption,
/// but there could also be a `BlockDecryptPar` for completeness/consistency.
pub trait BlockEncryptPar<B: SimdBuffer> {
    fn encrypt_par(&self, buffer: &mut B);
}

pub trait StreamCipherPar<B: SimdBuffer> {
    fn try_apply_keystream_par(&mut self, buffer: &mut B) -> Result<(), LoopError>;
}
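
As a rough illustration of why a generic parameter helps, the sketch below has a single cipher type implement `BlockEncryptPar` for two buffer widths, with the caller choosing the width through the buffer it passes. `XorToy`, the `SimdBuffer` method, the byte-array-backed `I128x4`/`I128x8` types, and `encrypt_in_place` are all hypothetical stand-ins, and `XorToy` is deliberately not a real cipher.

```rust
/// Hypothetical byte-level view over a SIMD buffer (the method is an assumption).
pub trait SimdBuffer {
    fn as_bytes_mut(&mut self) -> &mut [u8];
}

/// Portable stand-ins for "i128x4" and "i128x8".
pub struct I128x4(pub [u8; 64]);
pub struct I128x8(pub [u8; 128]);

impl SimdBuffer for I128x4 {
    fn as_bytes_mut(&mut self) -> &mut [u8] {
        &mut self.0
    }
}

impl SimdBuffer for I128x8 {
    fn as_bytes_mut(&mut self) -> &mut [u8] {
        &mut self.0
    }
}

/// Trait as proposed above.
pub trait BlockEncryptPar<B: SimdBuffer> {
    fn encrypt_par(&self, buffer: &mut B);
}

/// Toy stand-in that XORs every 16-byte block with the key. NOT a real cipher.
pub struct XorToy {
    key: [u8; 16],
}

impl XorToy {
    pub fn new(key: [u8; 16]) -> Self {
        Self { key }
    }

    fn xor_blocks(&self, bytes: &mut [u8]) {
        for block in bytes.chunks_exact_mut(16) {
            for (b, k) in block.iter_mut().zip(&self.key) {
                *b ^= k;
            }
        }
    }
}

// One impl per supported buffer width: an associated type would allow only one.
impl BlockEncryptPar<I128x4> for XorToy {
    fn encrypt_par(&self, buffer: &mut I128x4) {
        self.xor_blocks(buffer.as_bytes_mut());
    }
}

impl BlockEncryptPar<I128x8> for XorToy {
    fn encrypt_par(&self, buffer: &mut I128x8) {
        self.xor_blocks(buffer.as_bytes_mut());
    }
}

/// Generic caller: works with whichever buffer widths the cipher supports.
pub fn encrypt_in_place<C, B>(cipher: &C, buffer: &mut B)
where
    B: SimdBuffer,
    C: BlockEncryptPar<B>,
{
    cipher.encrypt_par(buffer);
}
```

With an associated type, `XorToy` would have to commit to exactly one width; here a caller that prefers "i128x4" on newer CPUs simply passes an `I128x4`.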

universal-hash API suggestion

pub trait UniversalHashPar<B: SimdBuffer> {
    fn update_par(&mut self, blocks: &B);
}
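
For a feel of how a UHF would consume a whole buffer in one call, here is a small sketch. `XorFold` is a hypothetical toy accumulator that just XORs all 16-byte blocks together; it stands in for polyval/poly1305 purely to show the call shape and is not a secure universal hash. The `SimdBuffer` method and the byte-array-backed `I128x8` are likewise assumptions.

```rust
/// Hypothetical byte-level view over a SIMD buffer (the method is an assumption).
pub trait SimdBuffer {
    fn as_bytes(&self) -> &[u8];
}

/// Portable stand-in for "i128x8".
pub struct I128x8(pub [u8; 128]);

impl SimdBuffer for I128x8 {
    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}

/// Trait as proposed above.
pub trait UniversalHashPar<B: SimdBuffer> {
    fn update_par(&mut self, blocks: &B);
}

/// Toy accumulator: XORs all 16-byte blocks into one. NOT a secure UHF.
#[derive(Default)]
pub struct XorFold {
    acc: [u8; 16],
}

impl XorFold {
    /// Returns the current accumulator value.
    pub fn tag(&self) -> [u8; 16] {
        self.acc
    }
}

impl UniversalHashPar<I128x8> for XorFold {
    fn update_par(&mut self, blocks: &I128x8) {
        // Absorb all eight blocks of the buffer in a single call.
        for block in blocks.as_bytes().chunks_exact(16) {
            for (a, b) in self.acc.iter_mut().zip(block) {
                *a ^= b;
            }
        }
    }
}
```

In an AEAD, the buffer passed to `update_par` would be the same one the cipher just wrote its ciphertext into, so the data never has to leave its SIMD-friendly layout between the two steps.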

SIMD ctr support

Trying to move the end-user-facing aes-ctr types into aes has created a very annoying circular dependency between the block-ciphers and stream-ciphers repos. Furthermore, ctr is now quite a bit more general than what the CTR types in the aes crate provide, and aes doesn't actually provide the CTR "flavors" (Ctr32BE/Ctr32LE) needed by aes-gcm and aes-gcm-siv.

But really, it seems like the main benefit of the implementation in the aes crate is being able to use _mm_xor_si128 to XOR an "i128x8" type.
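
For reference, XORing a 128-byte "i128x8" worth of data with `_mm_xor_si128` could look roughly like the sketch below. The function name `xor_i128x8` and the byte-array signature are assumptions made for illustration; the intrinsics themselves come from `core::arch::x86_64`, and a scalar fallback is used on other targets.

```rust
/// XORs `keystream` into `data`, eight 128-bit lanes at a time.
/// Hypothetical helper; not taken from the aes crate.
#[cfg(target_arch = "x86_64")]
pub fn xor_i128x8(data: &mut [u8; 128], keystream: &[u8; 128]) {
    use core::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_storeu_si128, _mm_xor_si128};

    for i in 0..8 {
        // SAFETY: SSE2 is part of the x86_64 baseline, every access covers
        // bytes i*16..i*16+16 which is within the 128-byte arrays, and the
        // unaligned load/store variants impose no alignment requirement.
        unsafe {
            let d = data.as_mut_ptr().add(i * 16) as *mut __m128i;
            let k = keystream.as_ptr().add(i * 16) as *const __m128i;
            let x = _mm_xor_si128(_mm_loadu_si128(d as *const __m128i), _mm_loadu_si128(k));
            _mm_storeu_si128(d, x);
        }
    }
}

/// Portable fallback for targets without the x86_64 intrinsics.
#[cfg(not(target_arch = "x86_64"))]
pub fn xor_i128x8(data: &mut [u8; 128], keystream: &[u8; 128]) {
    for (d, k) in data.iter_mut().zip(keystream) {
        *d ^= k;
    }
}
```

If the buffer type already stored `[__m128i; 8]`, the loads and stores would disappear and only the eight `_mm_xor_si128` calls would remain.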

If we had BlockEncryptPar and StreamCipherPar traits, the ctr crate could glue the two together, accepting a SIMD buffer as input, computing the next buffer of keystream output, and XORing the latter into the former. This would allow ctr to be generally SIMD optimized, and would also mean we only have one ctr implementation to worry about instead of a separate one in the aes crate.
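
A minimal sketch of that glue, under several assumptions: `Ctr32BeSketch` is a hypothetical name, the counter layout (12-byte nonce followed by a 32-bit big-endian counter) is chosen only for illustration, `LoopError` is redefined locally as a unit struct, `SimdBuffer` is given `Default` plus byte accessors, and `XorToy` is a toy stand-in rather than a real block cipher.

```rust
/// Hypothetical buffer trait; `Default` and the accessors are assumptions.
pub trait SimdBuffer: Default {
    fn as_bytes(&self) -> &[u8];
    fn as_bytes_mut(&mut self) -> &mut [u8];
}

/// Portable stand-in for "i128x8".
#[derive(Debug, PartialEq, Eq)]
pub struct I128x8(pub [u8; 128]);

impl Default for I128x8 {
    fn default() -> Self {
        Self([0u8; 128])
    }
}

impl SimdBuffer for I128x8 {
    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
    fn as_bytes_mut(&mut self) -> &mut [u8] {
        &mut self.0
    }
}

/// Local stand-in for the error type named in the trait above.
#[derive(Debug, PartialEq, Eq)]
pub struct LoopError;

pub trait BlockEncryptPar<B: SimdBuffer> {
    fn encrypt_par(&self, buffer: &mut B);
}

pub trait StreamCipherPar<B: SimdBuffer> {
    fn try_apply_keystream_par(&mut self, buffer: &mut B) -> Result<(), LoopError>;
}

/// Toy stand-in that XORs each 16-byte block with the key. NOT a real cipher.
pub struct XorToy(pub [u8; 16]);

impl BlockEncryptPar<I128x8> for XorToy {
    fn encrypt_par(&self, buffer: &mut I128x8) {
        for block in buffer.0.chunks_exact_mut(16) {
            for (b, k) in block.iter_mut().zip(&self.0) {
                *b ^= k;
            }
        }
    }
}

/// Hypothetical CTR glue: 12-byte nonce followed by a 32-bit big-endian counter.
pub struct Ctr32BeSketch<C> {
    cipher: C,
    nonce: [u8; 12],
    counter: u32,
}

impl<C> Ctr32BeSketch<C> {
    pub fn new(cipher: C, nonce: [u8; 12], counter: u32) -> Self {
        Self { cipher, nonce, counter }
    }
}

impl<B, C> StreamCipherPar<B> for Ctr32BeSketch<C>
where
    B: SimdBuffer,
    C: BlockEncryptPar<B>,
{
    fn try_apply_keystream_par(&mut self, buffer: &mut B) -> Result<(), LoopError> {
        let mut keystream = B::default();
        let blocks = (keystream.as_bytes().len() / 16) as u32;
        // Refuse to wrap the counter; the caller's buffer is left untouched.
        let next = self.counter.checked_add(blocks).ok_or(LoopError)?;

        // 1. Fill the keystream buffer with consecutive counter blocks.
        for (i, block) in keystream.as_bytes_mut().chunks_exact_mut(16).enumerate() {
            block[..12].copy_from_slice(&self.nonce);
            block[12..].copy_from_slice(&(self.counter + i as u32).to_be_bytes());
        }
        // 2. Encrypt all counter blocks in one parallel call.
        self.cipher.encrypt_par(&mut keystream);
        // 3. XOR the keystream into the caller's buffer.
        for (d, k) in buffer.as_bytes_mut().iter_mut().zip(keystream.as_bytes()) {
            *d ^= k;
        }
        self.counter = next;
        Ok(())
    }
}
```

The glue never names a concrete cipher or buffer type, so any cipher implementing `BlockEncryptPar<B>` would get a SIMD-width CTR mode for that `B` without a dedicated implementation.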
