LLVM emits ARM code with scalable vectors that are not a multiple of the native size. #9026

@mcourteaux

On ARM, LLVM generates this error:

vector_extract index must be a constant multiple of the result type's known minimum vector length.

For IR that looks like this:

  %514 = call <16 x i8> @llvm.vector.extract.v16i8.nxv16i8(<vscale x 16 x i8> %512, i64 0)
  %515 = call <16 x i8> @llvm.aarch64.neon.sabd.v16i8(<16 x i8> %513, <16 x i8> %514)
  %516 = call <5 x i8> @llvm.vector.extract.v5i8.nxv21i8(<vscale x 21 x i8> %505, i64 16)
  %517 = call <vscale x 16 x i8> @llvm.vector.insert.nxv16i8.v5i8(<vscale x 16 x i8> poison, <5 x i8> %516, i64 0)
  %518 = call <5 x i8> @llvm.vector.extract.v5i8.nxv21i8(<vscale x 21 x i8> %508, i64 16)
  %519 = call <vscale x 16 x i8> @llvm.vector.insert.nxv16i8.v5i8(<vscale x 16 x i8> poison, <5 x i8> %518, i64 0)

Looking through Halide, we never generate this intrinsic with an index argument other than 0, so the 16 that appears in the IR is probably the result of LLVM's internal optimizations.
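
To make the failing condition concrete, here is a minimal sketch of the rule quoted in the error message, applied to the two kinds of `llvm.vector.extract` calls in the IR above. It is only a paraphrase of the message, not LLVM's actual verifier code, and the helper name `extract_index_is_valid` is made up for illustration.

```python
def extract_index_is_valid(index: int, result_min_lanes: int) -> bool:
    """Paraphrase of the rule in the error message (hypothetical helper).

    The index must be a multiple of the result type's known minimum
    vector length, i.e. its lane count when vscale is 1.
    """
    return index % result_min_lanes == 0


# <16 x i8> extracted from <vscale x 16 x i8> at index 0 (line %514)
print(extract_index_is_valid(0, 16))

# <5 x i8> extracted from <vscale x 21 x i8> at index 16 (lines %516, %518)
print(extract_index_is_valid(16, 5))
```

The first call satisfies the rule; the second does not, because 16 is not a multiple of 5. That matches the extracts at `%516` and `%518` being the ones the error refers to.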

To reproduce, run the correctness/fuzz_extract_lanes test on ARM (currently in #8629) with seed 11290674455725750672.
