Ascii and Latin1 APIs can return wrong results for misaligned buffers #84877

@MihaZupan

Description

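// Requires an unsafe context.
char* pBuffer = stackalloc char[17];
// Offset the start by one byte so the span is not 2-byte aligned.
Span<char> misalignedSpan = new((char*)((byte*)pBuffer + 1), 16);
misalignedSpan.Fill('a');

// Every element is 'a', so IsValid should return true.
if (!Ascii.IsValid(misalignedSpan)) Console.WriteLine("Oh no");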

Ascii.IsValid pins the input and works with pointers internally. GetIndexOfFirstNonAsciiChar_Intrinsified attempts to 16-byte align the input pointer so it can use aligned reads. If the input wasn't 2-byte aligned to begin with, the aligned pointer lands in the middle of a char, so every subsequent read works with "corrupted" data: each value it sees combines bytes from two adjacent chars.

pBuffer = (char*)(((nuint)pBuffer + SizeOfVector128InBytes) & ~(nuint)(SizeOfVector128InBytes - 1));
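To make the arithmetic concrete, here is a minimal sketch that applies the same rounding expression to a plain integer and simulates the resulting read on a byte array. The addresses (0x1000, 0x1001) and the `memory` array are hypothetical stand-ins for the pinned buffer, and it reads the bytes as little-endian. It is an illustration, not the runtime's actual code.

```csharp
using System;
using System.Buffers.Binary;

// Same vector width as in the rounding expression from the issue.
const uint SizeOfVector128InBytes = 16;

// Hypothetical addresses: memory[0] stands in for address 0x1000,
// and the misaligned span starts one byte later.
nuint baseAddress = 0x1000;
nuint start = baseAddress + 1;

// The rounding expression, applied to a plain integer instead of a pointer.
nuint aligned = (start + SizeOfVector128InBytes) & ~(nuint)(SizeOfVector128InBytes - 1);
nuint skippedBytes = aligned - start;

// Lay out 16 UTF-16 'a' chars (bytes 0x61, 0x00) starting at the odd address.
byte[] memory = new byte[64];
for (int i = 0; i < 16; i++)
{
    int offset = (int)(start - baseAddress) + 2 * i;
    memory[offset] = 0x61;     // low byte of 'a'
    memory[offset + 1] = 0x00; // high byte of 'a'
}

// Read one char at the aligned address, as an aligned char read would.
int alignedOffset = (int)(aligned - baseAddress);
ushort misread = BinaryPrimitives.ReadUInt16LittleEndian(memory.AsSpan(alignedOffset, 2));

Console.WriteLine($"skipped {skippedBytes} bytes, first aligned char reads as 0x{misread:X4}");
```

The skipped distance is 15 bytes, an odd number, so the aligned read starts on the high byte of one 'a' and ends on the low byte of the next. It sees 0x6100 rather than 0x0061.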

As a result, it may return the wrong result (both false positives and false negatives).
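Both directions of the error can be sketched with a simplified model. `AllAscii` below is a hypothetical helper, not a runtime API, that checks little-endian UTF-16 code units in a byte view. Shifting the view by one byte mimics reads that start mid-char. This only models the straddled reads and makes no claim about which inputs trip the real implementation.

```csharp
using System;
using System.Buffers.Binary;
using System.Text;

// Hypothetical helper: true when every little-endian UTF-16 code unit
// in the byte view is <= 0x7F.
static bool AllAscii(ReadOnlySpan<byte> utf16Bytes)
{
    for (int i = 0; i + 1 < utf16Bytes.Length; i += 2)
    {
        if (BinaryPrimitives.ReadUInt16LittleEndian(utf16Bytes.Slice(i, 2)) > 0x7F)
            return false;
    }
    return true;
}

// Encoding.Unicode produces UTF-16 little-endian bytes.
byte[] asciiBytes = Encoding.Unicode.GetBytes("aaaa");
byte[] nonAsciiBytes = Encoding.Unicode.GetBytes("\u0100\u0100\u0100\u0100");

// Correct view: reads start on a char boundary.
bool asciiCorrect = AllAscii(asciiBytes);
bool nonAsciiCorrect = AllAscii(nonAsciiBytes);

// Shifted view: reads start one byte into a char, as after the misaligned rounding.
bool asciiShifted = AllAscii(asciiBytes.AsSpan(1, asciiBytes.Length - 2));
bool nonAsciiShifted = AllAscii(nonAsciiBytes.AsSpan(1, nonAsciiBytes.Length - 2));

Console.WriteLine($"\"aaaa\": correct={asciiCorrect}, shifted={asciiShifted}");
Console.WriteLine($"U+0100 x4: correct={nonAsciiCorrect}, shifted={nonAsciiShifted}");
```

In this model the ASCII input is reported as non-ASCII once shifted, because 0x0061 is read as 0x6100. The U+0100 input is reported as ASCII, because 0x0100 is read as 0x0001.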

Latin1Utility also likely shares the same issue.

It's safe to assume that related operations on Encoding that make use of these helpers are likewise affected.
