Conversation
|
It's not clear to me why benches fails, since I can run it fine locally. Compiler version difference? (I use […]) I'm guessing a feature related to trait bounds got stabilized between 1.81 and 1.84, since […]
|
I don't think we need the
You could try bumping the compiler version used for benchmarks. |
|
I believe the failing mgm test isn't related to this PR? |
|
@Demindiro it does look unrelated, yes |
Rust 1.81.0 is unable to compile HS1-SIV, but 1.84.0 is.
LLVM isn't smart enough to change the array version to a cmovcc and instead spills to memory
Despite my best efforts I seem unable to get LLVM to emit vectorized code, even though it should be obviously beneficial. I suspect LLVM is thrown off by the 64-bit multiply, which is missing from the SSE2 instruction set. It took me a while to figure out that casting an array of `__m128i` to `[u64; 2]` ends up being the most performant. The SSE2 version is about 20% faster for me, so it is a substantial improvement. Also, `inline(always)` on pretty much everything is now beneficial, whereas before it led to significant regressions. It does create a fair bit of code bloat, though.
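To illustrate the `__m128i` to `[u64; 2]` cast described above, here is a minimal sketch, not the code from this PR. The helper name `add_then_mul` is hypothetical. It does the lane-wise addition with SSE2 intrinsics and then reinterprets the vector as two `u64` lanes so the 64-bit multiply can be done with scalar instructions.

```rust
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::{__m128i, _mm_add_epi64, _mm_set_epi64x};

/// Hypothetical helper: (a + b) * m per lane, with wrapping arithmetic.
/// The addition uses SSE2; the multiply falls back to scalar code because
/// SSE2 has no full 64x64-bit lane multiply.
#[cfg(target_arch = "x86_64")]
pub fn add_then_mul(a: [u64; 2], b: [u64; 2], m: [u64; 2]) -> [u64; 2] {
    // SSE2 is part of the x86_64 baseline, so these intrinsics are available.
    #[allow(unused_unsafe)]
    let sum: __m128i = unsafe {
        // _mm_set_epi64x takes the high lane first, then the low lane.
        let va = _mm_set_epi64x(a[1] as i64, a[0] as i64);
        let vb = _mm_set_epi64x(b[1] as i64, b[0] as i64);
        _mm_add_epi64(va, vb)
    };
    // SAFETY: __m128i and [u64; 2] are both 16 bytes and every bit pattern
    // is valid for both types.
    let lanes: [u64; 2] = unsafe { core::mem::transmute(sum) };
    [lanes[0].wrapping_mul(m[0]), lanes[1].wrapping_mul(m[1])]
}

/// Portable fallback so the sketch also builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_then_mul(a: [u64; 2], b: [u64; 2], m: [u64; 2]) -> [u64; 2] {
    [
        a[0].wrapping_add(b[0]).wrapping_mul(m[0]),
        a[1].wrapping_add(b[1]).wrapping_mul(m[1]),
    ]
}
```

Whether this shape is actually faster depends on the surrounding code and the compiler version, so it would need benchmarking like the change above.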
|
The |
HS1-SIV uses ChaCha and a new hash algorithm. This implementation is based on the paper and the reference implementation.
I generated custom test vectors since none seem to be provided. I've included the reference implementation to show how they've been generated.
`Hs1Params` is quite ugly, but I'm unsure if I can make it any cleaner. It could be hidden by using newtypes for the 3 parameter sets instead.
`trait ChaChaImpl` is necessary because `chacha20::variants::Variant` and in particular `chacha20::variants::Ietf` isn't exposed, so `chacha20::ChaChaCore` is unusable.
I've spent some time optimizing it. It certainly can be optimized more, though so far further attempts have failed.
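The newtype idea mentioned for `Hs1Params` could look roughly like the sketch below. Everything in it is hypothetical: the trait `Params`, the marker types `Lo`/`Mid`/`Hi`, the generic `Core`, and the wrapper `Hs1SivLo` are illustration names, not the types in this PR, and the constant values are illustrative and should be checked against the paper.

```rust
use core::marker::PhantomData;

/// Hypothetical parameter trait standing in for the PR's `Hs1Params`.
pub trait Params {
    const CHACHA_ROUNDS: usize;
    const HASH_LEVELS: usize;
    const TAG_BYTES: usize;
}

/// Marker types, one per parameter set (values are illustrative).
pub struct Lo;
pub struct Mid;
pub struct Hi;

impl Params for Lo {
    const CHACHA_ROUNDS: usize = 8;
    const HASH_LEVELS: usize = 2;
    const TAG_BYTES: usize = 8;
}
impl Params for Mid {
    const CHACHA_ROUNDS: usize = 12;
    const HASH_LEVELS: usize = 4;
    const TAG_BYTES: usize = 16;
}
impl Params for Hi {
    const CHACHA_ROUNDS: usize = 20;
    const HASH_LEVELS: usize = 6;
    const TAG_BYTES: usize = 32;
}

/// Generic core that carries the parameter set only at the type level.
pub struct Core<P: Params> {
    #[allow(dead_code)]
    key: [u8; 32],
    _params: PhantomData<P>,
}

impl<P: Params> Core<P> {
    pub fn new(key: [u8; 32]) -> Self {
        Self { key, _params: PhantomData }
    }

    pub fn tag_len(&self) -> usize {
        P::TAG_BYTES
    }
}

/// Newtype that hides the generic parameter from users.
pub struct Hs1SivLo(Core<Lo>);

impl Hs1SivLo {
    pub fn new(key: [u8; 32]) -> Self {
        Self(Core::new(key))
    }

    pub fn tag_len(&self) -> usize {
        self.0.tag_len()
    }
}
```

The trade-off is some forwarding boilerplate per newtype in exchange for keeping the parameter trait out of the public API.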
It should be free of any data-dependent branches, though I haven't looked at the generated assembly very closely.
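For reference, this is the kind of pattern meant by "free of data-dependent branches": a minimal sketch with hypothetical helper names (`ct_select`, `ct_eq`), not code from this PR. Note that the compiler is still free to reintroduce branches or spill to memory, so checking the generated assembly remains necessary; crates such as `subtle` exist to make this more robust.

```rust
/// Select `a` when `choice == 1` and `b` when `choice == 0`, using a mask
/// instead of a branch or an array index. `choice` must be 0 or 1.
pub fn ct_select(choice: u8, a: u64, b: u64) -> u64 {
    // 0 -> 0x0000..., 1 -> 0xFFFF...
    let mask = (choice as u64).wrapping_neg();
    (a & mask) | (b & !mask)
}

/// Compare two byte slices without exiting early on the first mismatch.
/// Only the lengths, which are assumed public, influence control flow.
pub fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        // Accumulate all differing bits; never break out of the loop.
        diff |= x ^ y;
    }
    diff == 0
}
```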