MartinNowak commented on Jan 7, 2017
- use slicing by 8 algorithm with bigger precomputed tables
- roughly 4x faster
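For readers unfamiliar with the technique, here is a rough sketch of slicing-by-8 for the IEEE CRC-32. It is not the PR's D code: it is written in Python for brevity, and `gen_tables`, `TABLES`, and `crc32_slice8` are hypothetical names. The idea is to precompute eight 256-entry tables so that eight input bytes are folded into the CRC per loop iteration instead of one.

```python
import zlib

POLY = 0xEDB88320  # reflected IEEE CRC-32 polynomial


def gen_tables():
    """Build 8 tables of 256 entries; tables[0] is the classic byte-wise table."""
    tables = [[0] * 256 for _ in range(8)]
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
        tables[0][i] = crc
    # Table k advances the CRC of byte i through k additional zero bytes.
    for i in range(256):
        for k in range(1, 8):
            prev = tables[k - 1][i]
            tables[k][i] = (prev >> 8) ^ tables[0][prev & 0xFF]
    return tables


TABLES = gen_tables()


def crc32_slice8(data: bytes) -> int:
    crc = 0xFFFFFFFF
    n = len(data) - len(data) % 8
    for off in range(0, n, 8):
        # Assemble two 32-bit words byte by byte, least significant byte first.
        lo = data[off] | data[off + 1] << 8 | data[off + 2] << 16 | data[off + 3] << 24
        hi = data[off + 4] | data[off + 5] << 8 | data[off + 6] << 16 | data[off + 7] << 24
        lo ^= crc
        crc = (TABLES[7][lo & 0xFF] ^ TABLES[6][(lo >> 8) & 0xFF]
               ^ TABLES[5][(lo >> 16) & 0xFF] ^ TABLES[4][lo >> 24]
               ^ TABLES[3][hi & 0xFF] ^ TABLES[2][(hi >> 8) & 0xFF]
               ^ TABLES[1][(hi >> 16) & 0xFF] ^ TABLES[0][hi >> 24])
    # Remaining 0..7 bytes use the plain byte-wise update.
    for b in data[n:]:
        crc = (crc >> 8) ^ TABLES[0][(crc ^ b) & 0xFF]
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    sample = b"The quick brown fox jumps over the lazy dog"
    print(hex(crc32_slice8(sample)), hex(zlib.crc32(sample)))
```

The sketch can be checked against `zlib.crc32`, which uses the same IEEE polynomial. The speedup reported above applies to the compiled D implementation; this Python version only illustrates the table layout and lookup order.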
2904621 to 40d7df8
This suggests that this implementation is endian-sensitive.
40d7df8 to 86d0185
Nope, the implementation also works on big-endian, because it assembles the uints from byte-wise reads instead of relying on unaligned hardware loads. I still reordered the operations, which indeed gave a noticeable speedup.
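To illustrate the byte-wise assembly point, here is a small sketch (Python, with the hypothetical helper name `load_le32`; it is not taken from the PR). Building the word with shifts always yields the same value, whereas reinterpreting the same four bytes in big-endian order, as a raw hardware load on a big-endian host would, gives a different one.

```python
def load_le32(buf: bytes, off: int = 0) -> int:
    """Assemble a 32-bit word from four bytes, least significant byte first."""
    return buf[off] | buf[off + 1] << 8 | buf[off + 2] << 16 | buf[off + 3] << 24


buf = bytes([0x78, 0x56, 0x34, 0x12])

# Shift-based assembly: host byte order never enters the computation.
portable = load_le32(buf)

# What a raw 4-byte load would produce under each byte order.
as_little = int.from_bytes(buf, "little")
as_big = int.from_bytes(buf, "big")

print(hex(portable), hex(as_little), hex(as_big))
```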
86d0185 to f4d6ecb
OK, I hadn't digested that. LGTM on the basis of matching other slicing-by-8 implementations.
After your comment, I was actually a bit unsure whether genTables is endian-correct, so I built a gdc cross-compiler and tested on my MIPS router. Renamed the enum to make clear that this can only be done on LE architectures.
f4d6ecb to 382f9d2
LGTM, also love the CTFE construction of the table.
Auto-merge toggled on
Dumb question here, but why can't the SSE4 crc32 mnemonics be used when available? I apologize if the answer is obvious.
@bgaff The SSE4.2 crc32 instruction uses the Castagnoli polynomial (CRC-32C), not the IEEE one.
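A quick way to see why the two are not interchangeable is to run the same bitwise CRC with each polynomial. This is a minimal Python sketch with hypothetical names (`crc32_bitwise`, `IEEE`, `CASTAGNOLI`), using the reflected forms of the polynomials; it does not touch the hardware instruction itself.

```python
IEEE = 0xEDB88320        # reflected polynomial of the IEEE CRC-32 (zlib, PNG, Ethernet)
CASTAGNOLI = 0x82F63B78  # reflected polynomial of CRC-32C


def crc32_bitwise(data: bytes, poly: int) -> int:
    """Bit-at-a-time reflected CRC-32 with initial value and final XOR of 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ (poly if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


msg = b"123456789"
print(hex(crc32_bitwise(msg, IEEE)))        # 0xcbf43926
print(hex(crc32_bitwise(msg, CASTAGNOLI)))  # 0xe3069283
```

The two checksums differ for the same input, so a digest that must produce IEEE CRC-32 values cannot simply be swapped onto a CRC-32C primitive.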