aes/src/armv8.rs
    /// AES key expansion
    #[inline]
    pub fn expand_key<const N: usize>(key: &[u8; 16]) -> [[u8; 16]; N] {
Note: the implementation uses const generics. So far it only relies on features available in Rust 1.51+, and it's an internal implementation detail, so I figured why not.
For comparison, the corresponding AES-NI implementation contains a lot of code duplication.
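To illustrate how const generics can remove that kind of duplication, here is a minimal portable sketch of the FIPS-197 key schedule written once for all key sizes. It is not the code from this PR (which uses ARMv8 intrinsics): the extra `L` key-length parameter, the helper names `gf_mul`, `gf_inv`, `sbox`, and `sub_word`, and the on-the-fly S-box computation are all assumptions made to keep the example short and self-contained.

```rust
/// Multiply two elements of GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1.
fn gf_mul(mut a: u8, mut b: u8) -> u8 {
    let mut p = 0u8;
    while b != 0 {
        if b & 1 != 0 {
            p ^= a;
        }
        let carry = a & 0x80;
        a <<= 1;
        if carry != 0 {
            a ^= 0x1b;
        }
        b >>= 1;
    }
    p
}

/// Multiplicative inverse in GF(2^8), computed as x^254 (maps 0 to 0).
fn gf_inv(x: u8) -> u8 {
    let (mut result, mut base, mut exp) = (1u8, x, 254u8);
    while exp != 0 {
        if exp & 1 != 0 {
            result = gf_mul(result, base);
        }
        base = gf_mul(base, base);
        exp >>= 1;
    }
    result
}

/// AES S-box: field inverse followed by the affine transform.
/// Computed rather than tabulated to keep the sketch short (slow, not constant-time).
fn sbox(x: u8) -> u8 {
    let b = gf_inv(x);
    b ^ b.rotate_left(1) ^ b.rotate_left(2) ^ b.rotate_left(3) ^ b.rotate_left(4) ^ 0x63
}

fn sub_word(w: [u8; 4]) -> [u8; 4] {
    [sbox(w[0]), sbox(w[1]), sbox(w[2]), sbox(w[3])]
}

/// Hypothetical generic key schedule: `L` is the key length in bytes and
/// `N` the number of round keys (11, 13, or 15 for AES-128/192/256).
pub fn expand_key<const L: usize, const N: usize>(key: &[u8; L]) -> [[u8; 16]; N] {
    let nk = L / 4; // key length in 32-bit words
    assert!(matches!(L, 16 | 24 | 32) && N == nk + 7, "unsupported key/round combination");

    // Expanded key as 4-byte words; the first `nk` words are the key itself.
    let mut words = vec![[0u8; 4]; 4 * N];
    for i in 0..nk {
        words[i].copy_from_slice(&key[4 * i..4 * i + 4]);
    }

    let mut rcon = 1u8;
    for i in nk..4 * N {
        let mut t = words[i - 1];
        if i % nk == 0 {
            t.rotate_left(1); // RotWord
            t = sub_word(t);
            t[0] ^= rcon;
            rcon = gf_mul(rcon, 2);
        } else if nk > 6 && i % nk == 4 {
            t = sub_word(t); // extra SubWord step used only by AES-256
        }
        let prev = words[i - nk];
        for j in 0..4 {
            words[i][j] = prev[j] ^ t[j];
        }
    }

    // Group every four words into one 16-byte round key.
    let mut round_keys = [[0u8; 16]; N];
    for (r, rk) in round_keys.iter_mut().enumerate() {
        for c in 0..4 {
            rk[4 * c..4 * c + 4].copy_from_slice(&words[4 * r + c]);
        }
    }
    round_keys
}
```

A caller picks the variant purely through the type parameters, for example `expand_key::<16, 11>(&key)` for AES-128 or `expand_key::<32, 15>(&key)` for AES-256, so no per-key-size copy of the function is needed.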
Tests are confirmed passing on an Apple M1.
Some preliminary benchmarks on an M1 Mac Mini:
Removing WIP. I'd call this complete except for pipelining. It implements the following:
I will look at pipelining, with an eye on what improves performance on the M1 (since that's the most powerful ARMv8 machine I have access to). In the meantime I would love it if anyone could benchmark it on other 64-bit ARMv8 platforms. I'll leave this PR open for a while to invite review.
Implemented pipelining, which operates on 8 blocks at a time. Saw some pretty nice performance gains on the Apple M1 (reaching nearly 10 GB/sec on AES-128!)
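The general shape of 8-blocks-at-a-time pipelining can be sketched as follows. This is only an illustration of the loop structure, not the PR's code: `toy_round` is a made-up stand-in for an AES round (the real backend would use the hardware AES instructions), and the function names `encrypt_one`, `encrypt_par8`, and `encrypt_blocks` are hypothetical. The point is that each round key is applied to all eight independent blocks before moving on to the next round, which gives the CPU independent work to overlap.

```rust
use std::convert::TryInto;

/// Toy stand-in for a single cipher round (NOT AES): XOR in the round key,
/// then rotate the block by one byte.
fn toy_round(block: &mut [u8; 16], rk: &[u8; 16]) {
    for (b, k) in block.iter_mut().zip(rk) {
        *b ^= *k;
    }
    block.rotate_left(1);
}

/// Serial path: run every round on a single block.
pub fn encrypt_one(block: &mut [u8; 16], round_keys: &[[u8; 16]]) {
    for rk in round_keys {
        toy_round(block, rk);
    }
}

/// Pipelined path: for each round, process all eight blocks before
/// advancing to the next round key. The blocks are independent, so the
/// inner loop has no data dependencies between iterations.
pub fn encrypt_par8(blocks: &mut [[u8; 16]; 8], round_keys: &[[u8; 16]]) {
    for rk in round_keys {
        for block in blocks.iter_mut() {
            toy_round(block, rk);
        }
    }
}

/// Split the input into groups of eight blocks for the pipelined path and
/// fall back to the serial path for any leftover blocks.
pub fn encrypt_blocks(blocks: &mut [[u8; 16]], round_keys: &[[u8; 16]]) {
    let mut chunks = blocks.chunks_exact_mut(8);
    for chunk in &mut chunks {
        let chunk: &mut [[u8; 16]; 8] = chunk.try_into().expect("chunk has 8 blocks");
        encrypt_par8(chunk, round_keys);
    }
    for block in chunks.into_remainder() {
        encrypt_one(block, round_keys);
    }
}
```

Because the blocks do not depend on each other, the pipelined and serial paths produce identical output; only the order in which the work is scheduled differs.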
Adds a new nightly-only backend which uses ARMv8 Cryptography Extensions, gated under the newly introduced `armv8` crate feature. Support is provided for AES-128, AES-192, and AES-256, with runtime CPU feature detection on Linux and macOS targets. These extensions are supported on both 32-bit and 64-bit ARM targets, but the current implementation is gated on `aarch64`, as that's the only architecture it has been tested on so far. It could easily be extended to 32-bit ARMv8 targets as well.
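As a rough sketch of what runtime CPU feature detection with a software fallback can look like, the following uses the standard library's `is_aarch64_feature_detected!` macro. The description above does not say which detection mechanism the backend actually uses, so the macro choice, the function names `has_armv8_aes` and `select_backend`, and the backend labels are all assumptions for illustration.

```rust
/// Returns true when the CPU reports the ARMv8 AES instructions at runtime.
#[cfg(target_arch = "aarch64")]
pub fn has_armv8_aes() -> bool {
    std::arch::is_aarch64_feature_detected!("aes")
}

/// On every other architecture the ARMv8 backend is never available.
#[cfg(not(target_arch = "aarch64"))]
pub fn has_armv8_aes() -> bool {
    false
}

/// Hypothetical dispatch: pick the hardware backend when it is available,
/// otherwise fall back to a portable software implementation.
pub fn select_backend() -> &'static str {
    if has_armv8_aes() {
        "armv8"
    } else {
        "soft"
    }
}
```

Checking once and caching the result would be a natural refinement, since the available CPU features do not change while the program runs.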
Going to go ahead and land this. At this point I'd say it's the best tested of all of the backends.
Adds a new backend which uses ARMv8 Cryptography Extensions. These are currently unstable so support is gated under a newly added `armv8` crate feature.

These extensions are supported on both 32-bit and 64-bit ARM targets, however the current implementation is gated on `aarch64` (as that's the only architecture it's been tested on so far).

Closes #10.