Skip to content

Consider making AisAsciiTo6Bits less branchy #196

@idg10

Description

@idg10

A colleague asked an AI for performance improvement suggestions, and it came up with this:

Optimize NmeaPayloadParser.AisAsciiTo6Bits: Replace the nested ternary operators with a static ReadOnlySpan lookup table for O(1) conversion.

It's true that AisAsciiTo6Bits is very 'branchy', which is something modern CPUs don't really like:

internal static byte AisAsciiTo6Bits(byte c) => (byte)(c < 48
    ? throw new ArgumentOutOfRangeException("Payload characters must be in range 48-87 or 96-119")
    : (c < 88
        ? c - 48
        : (c < 96
            ? throw new ArgumentOutOfRangeException("Payload characters must be in range 48-87 or 96-119")
            : (c < 120
                ? c - 56
                : throw new ArgumentOutOfRangeException("Payload characters must be in range 48-87 or 96-119")))));

Branches tend to disrupt the CPU pipeline, so this is not ideal. A lookup table would enable this to be branch-free.

However, it comes at a price: that lookup table needs to end up in the CPU's cache before it will actually deliver the improved performance we're looking for. And in doing so, it will evict something else from the cache. This will tend to have negative consequences for other code, and those negative consequences might outweigh the benefits.

So this is exactly the sort of fix that can show a significant improvement in a microbenchmark but which could actually make the overall performance of a real application worse.

So I'm wondering if there's another way. This seems like exactly the sort of problem that has had a lot of attention in recent years when it comes to performance tuning. When people vectorise code, they often transform things that are logically comparisons and branches into bit twiddling.

The first thing to check is whether the JIT is already doing something like that. (And if not, whether it could be induced to do so by asking for aggressive optimization.)

If not, we could look at whether there is a more pipeline-friendly way to implement this logic that is less cache-hungry than a lookup table.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions