A slew of CoreCLR issues have been opened regarding the Seek() method.
Testing against Vector<byte>.Zero is currently slow because of low-quality code generation. Slightly better code is generated if you first convert to Vector<long>, but it is better to simply scan the four longs for a non-zero value.
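To make the "scan the longs" idea concrete, here is a minimal sketch. The class and method names (VectorScan, FirstNonZeroLong) are hypothetical and not taken from the code under discussion. It also assumes nothing about vector width: Vector<long>.Count is 4 only on hardware with 256-bit vectors, so the loop uses Count rather than a hard-coded 4.

```csharp
using System.Numerics;

static class VectorScan
{
    // Hypothetical helper: returns the index of the first non-zero 64-bit lane
    // of 'match', or -1 if every lane is zero.
    public static int FirstNonZeroLong(Vector<byte> match)
    {
        // Reinterpret the same bits as longs; no per-element conversion happens.
        Vector<long> longs = Vector.AsVectorInt64(match);

        // Count is 4 on 256-bit hardware and 2 on 128-bit hardware.
        for (int i = 0; i < Vector<long>.Count; i++)
        {
            if (longs[i] != 0)
            {
                return i;
            }
        }
        return -1;
    }
}
```

Once a non-zero long is found, the remaining work is locating the byte within it, which is where a table lookup or a bit-scan instruction would come in.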
Another thing to consider: inside FindFirstEqualByte, a De Bruijn table could be used to find the appropriate byte once you have a long.
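// Maps the top 6 bits of (lowest set bit * DEBRUIJN_SEQ64) to the index (0-7)
// of the byte that contains that bit.
static readonly byte[] Debruijn64 =
{
    0, 0, 0, 7, 0, 4, 7, 5, 3, 0, 2, 4, 0, 7, 1, 5,
    7, 3, 2, 0, 2, 2, 4, 2, 6, 1, 7, 4, 3, 1, 6, 4,
    7, 6, 3, 5, 3, 2, 0, 1, 7, 2, 1, 2, 6, 4, 3, 4,
    6, 5, 3, 1, 7, 1, 6, 4, 5, 3, 1, 6, 5, 6, 5, 5,
};

const long DEBRUIJN_SEQ64 = 0x26752B916FC7B0D;

// Returns the index of the lowest non-zero byte of v. Assumes v != 0
// (v == 0 also yields 0) and relies on unchecked (wrapping) arithmetic.
static int DebruijnFindByte(long v)
{
    // v & -v isolates the lowest set bit; the multiply then acts as a left shift.
    return Debruijn64[((ulong)((v & -v) * DEBRUIJN_SEQ64)) >> 58];
}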
This should perform better than the current quasi-tree search, and the helper will almost certainly be inlined. Obviously, I'd rather there be a bit-scan-forward intrinsic.
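As a sanity check on the table, the lookup can be compared against a plain byte-by-byte loop for every possible lowest-set-bit position. The sketch below repeats the helper from above so that it is self-contained. DebruijnCheck, NaiveFindByte, and CountMismatches are hypothetical names introduced only for this illustration, and the unchecked block is an added assumption so that the arithmetic wraps even if the project compiles with overflow checking enabled.

```csharp
using System;

static class DebruijnCheck
{
    static readonly byte[] Debruijn64 =
    {
        0, 0, 0, 7, 0, 4, 7, 5, 3, 0, 2, 4, 0, 7, 1, 5,
        7, 3, 2, 0, 2, 2, 4, 2, 6, 1, 7, 4, 3, 1, 6, 4,
        7, 6, 3, 5, 3, 2, 0, 1, 7, 2, 1, 2, 6, 4, 3, 4,
        6, 5, 3, 1, 7, 1, 6, 4, 5, 3, 1, 6, 5, 6, 5, 5,
    };

    const long DEBRUIJN_SEQ64 = 0x26752B916FC7B0D;

    // Same lookup as in the comment above, wrapped in unchecked so the
    // negation, multiply, and cast wrap instead of throwing.
    public static int DebruijnFindByte(long v)
    {
        unchecked
        {
            return Debruijn64[((ulong)((v & -v) * DEBRUIJN_SEQ64)) >> 58];
        }
    }

    // Reference implementation: test each byte from least to most significant.
    public static int NaiveFindByte(long v)
    {
        for (int i = 0; i < 8; i++)
        {
            if (((v >> (i * 8)) & 0xFF) != 0)
            {
                return i;
            }
        }
        return -1;
    }

    // Counts disagreements between the two implementations over all 64
    // lowest-set-bit positions, with and without higher bits set.
    public static int CountMismatches()
    {
        int mismatches = 0;
        for (int bit = 0; bit < 64; bit++)
        {
            long single = unchecked((long)(1UL << bit));
            long withHigherBits = unchecked((long)(ulong.MaxValue << bit));
            if (DebruijnFindByte(single) != NaiveFindByte(single)) mismatches++;
            if (DebruijnFindByte(withHigherBits) != NaiveFindByte(withHigherBits)) mismatches++;
        }
        return mismatches;
    }
}
```

If the table and constant are consistent, CountMismatches() returns 0. Note that the two implementations deliberately differ for v == 0 (the lookup returns 0, the loop returns -1), so callers of the lookup need to rule out zero first.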