Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion compiler/rustc_target/src/asm/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -623,7 +623,7 @@ impl InlineAsmRegClass {
Self::Hexagon(r) => r.supported_types(arch),
Self::LoongArch(r) => r.supported_types(arch),
Self::Mips(r) => r.supported_types(arch),
Self::S390x(r) => r.supported_types(arch, allow_experimental_reg),
Self::S390x(r) => r.supported_types(arch),
Self::Sparc(r) => r.supported_types(arch),
Self::SpirV(r) => r.supported_types(arch),
Self::Wasm(r) => r.supported_types(arch),
Expand Down
16 changes: 4 additions & 12 deletions compiler/rustc_target/src/asm/s390x.rs
Original file line number Diff line number Diff line change
Expand Up @@ -38,22 +38,14 @@ impl S390xInlineAsmRegClass {
pub fn supported_types(
self,
_arch: InlineAsmArch,
allow_experimental_reg: bool,
) -> &'static [(InlineAsmType, Option<Symbol>)] {
match self {
Self::reg | Self::reg_addr => types! { _: I8, I16, I32, I64; },
Self::freg => types! { _: F16, F32, F64; },
Self::vreg => {
if allow_experimental_reg {
// non-clobber-only vector register support is unstable.
types! {
vector: I32, F16, F32, I64, F64, I128, F128,
VecI8(16), VecI16(8), VecI32(4), VecI64(2), VecF16(8), VecF32(4), VecF64(2);
}
} else {
&[]
}
}
Self::vreg => types! {
vector: I32, F16, F32, I64, F64, I128, F128,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been looking at the z/Arch vector docs and there only seem to be instructions that load a full 128-bit into a vector register. I don't think it makes sense to expose smaller sizes here.

Specifically you only have the options of:

  • loading a full 128 bits from memory
  • loading a single element and replicating it over the 128 bits of the vector register
  • loading a single element and inserting it inside an existing vector

This is in contrast to AArch64 for example which does have instructions that load a single element and zero-extend it to the full size of the vector register.

As such I'm inclined to remove i32/i64/f16/f32/f64 from the set of allowed types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, that would deviate from what clang/llvm accept. I'll leave it up to the target maintainer(s) to argue one way or the other.

Strictly speaking the behavior as implemented is correct according to the reference:

https://doc.rust-lang.org/nightly/reference/inline-assembly.html?highlight=assemb#r-asm.operand-type.supported-operands.in

The allocated register will contain the value of <expr> at the start of the assembly code.

So that should be clarified if we remove support for the smaller scalar types.

Copy link
Member

@taiki-e taiki-e Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might not make much sense as inputs, but since s390x has shuffle instructions, I think it could be meaningful as outputs.

e.g., by combining shuffle with normal vector addition, we can place the sum of all lanes in the lower lane.
(Some of _mm512_reduce_* in x86_64 are actually implemented in this way: https://godbolt.org/z/b5G6qKzns)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We absolutely need floating-point types in vector registers. VRs overlap the FPRs, and there's a whole set of scalar floating-point instructions operating on vector registers, using it basically as a larger FPR set. There's no need to extend anything, the (shorter) floating-point value sits at a defined location inside the VR, with the remaining lanes being ignored by the relevant instructions. The existing load/store operations work just fine for those. (There actually are VECTOR LOAD LOGICAL ELEMENT AND ZERO (VLLEZ) instructions that clear the other lanes as well, but it's generally not necessary to use them.)

The shorter integer types are less important, but would still be good to have, in particular for the (scalar) integer-to-float / float-to-integer conversion instructions that operate on VRs. (Again, those integer values would sit at a defined lane - normally the same as a floating-point value of the same size.)

All this is the same as GCC and LLVM handle inline asm with those types in VRs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case I withdraw my objection. Would it make sense to add i8/i16 as well then for consistency? These are special cased on x86 because x86 doesn't have instructions that move an i8/i16 into a vector register.

VecI8(16), VecI16(8), VecI32(4), VecI64(2), VecF16(8), VecF32(4), VecF64(2);
},
Self::areg => &[],
}
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,16 +12,11 @@ This tracks support for additional registers in architectures where inline assem

| Architecture | Register class | Registers | LLVM constraint code |
| ------------ | -------------- | --------- | -------------------- |
| s390x | `vreg` | `v[0-31]` | `v` |

> **Notes**:
> - s390x `vreg` is clobber-only in stable.

## Register class supported types

| Architecture | Register class | Target feature | Allowed types |
| ------------ | -------------- | -------------- | ------------- |
| s390x | `vreg` | `vector` | `i32`, `f32`, `i64`, `f64`, `i128`, `f128`, `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4`, `f64x2` |
| x86 | `xmm_reg` | `sse` | `i128` |
| x86 | `ymm_reg` | `avx` | `i128` |
| x86 | `zmm_reg` | `avx512f` | `i128` |
Expand All @@ -40,4 +35,3 @@ This tracks support for additional registers in architectures where inline assem

| Architecture | Register class | Modifier | Example output | LLVM modifier |
| ------------ | -------------- | -------- | -------------- | ------------- |
| s390x | `vreg` | None | `%v0` | None |
33 changes: 11 additions & 22 deletions tests/assembly-llvm/asm/s390x-types.rs
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,6 @@
//@ compile-flags: -Zmerge-functions=disabled

#![feature(no_core, repr_simd, f16, f128)]
#![cfg_attr(s390x_vector, feature(asm_experimental_reg))]
#![crate_type = "rlib"]
#![no_core]
#![allow(asm_sub_register, non_camel_case_types)]
Expand All @@ -19,27 +18,17 @@ use minicore::*;
type ptr = *const i32;

#[repr(simd)]
pub struct i8x16([i8; 16]);
#[repr(simd)]
pub struct i16x8([i16; 8]);
#[repr(simd)]
pub struct i32x4([i32; 4]);
#[repr(simd)]
pub struct i64x2([i64; 2]);
#[repr(simd)]
pub struct f16x8([f16; 8]);
#[repr(simd)]
pub struct f32x4([f32; 4]);
#[repr(simd)]
pub struct f64x2([f64; 2]);

impl Copy for i8x16 {}
impl Copy for i16x8 {}
impl Copy for i32x4 {}
impl Copy for i64x2 {}
impl Copy for f16x8 {}
impl Copy for f32x4 {}
impl Copy for f64x2 {}
pub struct Simd<T, const N: usize>([T; N]);

impl<T: Copy, const N: usize> Copy for Simd<T, N> {}

type i8x16 = Simd<i8, 16>;
type i16x8 = Simd<i16, 8>;
type i32x4 = Simd<i32, 4>;
type i64x2 = Simd<i64, 2>;
type f16x8 = Simd<f16, 8>;
type f32x4 = Simd<f32, 4>;
type f64x2 = Simd<f64, 2>;

extern "C" {
fn extern_func();
Expand Down
24 changes: 1 addition & 23 deletions tests/ui/asm/s390x/bad-reg.rs
Original file line number Diff line number Diff line change
@@ -1,16 +1,13 @@
//@ add-minicore
//@ revisions: s390x s390x_vector s390x_vector_stable
//@ revisions: s390x s390x_vector
//@[s390x] compile-flags: --target s390x-unknown-linux-gnu -C target-feature=-vector
//@[s390x] needs-llvm-components: systemz
//@[s390x_vector] compile-flags: --target s390x-unknown-linux-gnu -C target-feature=+vector
//@[s390x_vector] needs-llvm-components: systemz
//@[s390x_vector_stable] compile-flags: --target s390x-unknown-linux-gnu -C target-feature=+vector
//@[s390x_vector_stable] needs-llvm-components: systemz
//@ ignore-backends: gcc

#![crate_type = "rlib"]
#![feature(no_core, repr_simd)]
#![cfg_attr(not(s390x_vector_stable), feature(asm_experimental_reg))]
#![no_core]
#![allow(non_camel_case_types)]

Expand Down Expand Up @@ -73,46 +70,27 @@ fn f() {
asm!("", out("v0") _); // always ok
asm!("", in("v0") v); // requires vector & asm_experimental_reg
//[s390x]~^ ERROR register class `vreg` requires the `vector` target feature
//[s390x_vector_stable]~^^ ERROR register class `vreg` can only be used as a clobber in stable [E0658]
//[s390x_vector_stable]~| ERROR type `i64x2` cannot be used with this register class in stable [E0658]
asm!("", out("v0") v); // requires vector & asm_experimental_reg
//[s390x]~^ ERROR register class `vreg` requires the `vector` target feature
//[s390x_vector_stable]~^^ ERROR register class `vreg` can only be used as a clobber in stable [E0658]
//[s390x_vector_stable]~| ERROR type `i64x2` cannot be used with this register class in stable [E0658]
asm!("", in("v0") x); // requires vector & asm_experimental_reg
//[s390x]~^ ERROR register class `vreg` requires the `vector` target feature
//[s390x_vector_stable]~^^ ERROR register class `vreg` can only be used as a clobber in stable [E0658]
//[s390x_vector_stable]~| ERROR type `i32` cannot be used with this register class in stable [E0658]
asm!("", out("v0") x); // requires vector & asm_experimental_reg
//[s390x]~^ ERROR register class `vreg` requires the `vector` target feature
//[s390x_vector_stable]~^^ ERROR register class `vreg` can only be used as a clobber in stable [E0658]
//[s390x_vector_stable]~| ERROR type `i32` cannot be used with this register class in stable [E0658]
asm!("", in("v0") b);
//[s390x]~^ ERROR register class `vreg` requires the `vector` target feature
//[s390x_vector]~^^ ERROR type `u8` cannot be used with this register class
//[s390x_vector_stable]~^^^ ERROR register class `vreg` can only be used as a clobber in stable [E0658]
//[s390x_vector_stable]~| ERROR type `u8` cannot be used with this register class
asm!("", out("v0") b);
//[s390x]~^ ERROR register class `vreg` requires the `vector` target feature
//[s390x_vector]~^^ ERROR type `u8` cannot be used with this register class
//[s390x_vector_stable]~^^^ ERROR register class `vreg` can only be used as a clobber in stable [E0658]
//[s390x_vector_stable]~| ERROR type `u8` cannot be used with this register class
asm!("/* {} */", in(vreg) v); // requires vector & asm_experimental_reg
//[s390x]~^ ERROR register class `vreg` requires the `vector` target feature
//[s390x_vector_stable]~^^ ERROR register class `vreg` can only be used as a clobber in stable [E0658]
//[s390x_vector_stable]~| ERROR type `i64x2` cannot be used with this register class in stable [E0658]
asm!("/* {} */", in(vreg) x); // requires vector & asm_experimental_reg
//[s390x]~^ ERROR register class `vreg` requires the `vector` target feature
//[s390x_vector_stable]~^^ ERROR register class `vreg` can only be used as a clobber in stable [E0658]
//[s390x_vector_stable]~| ERROR type `i32` cannot be used with this register class in stable [E0658]
asm!("/* {} */", in(vreg) b);
//[s390x]~^ ERROR register class `vreg` requires the `vector` target feature
//[s390x_vector]~^^ ERROR type `u8` cannot be used with this register class
//[s390x_vector_stable]~^^^ ERROR register class `vreg` can only be used as a clobber in stable [E0658]
//[s390x_vector_stable]~| ERROR type `u8` cannot be used with this register class
asm!("/* {} */", out(vreg) _); // requires vector & asm_experimental_reg
//[s390x]~^ ERROR register class `vreg` requires the `vector` target feature
//[s390x_vector_stable]~^^ ERROR register class `vreg` can only be used as a clobber in stable [E0658]

// Clobber-only registers
// areg
Expand Down
Loading
Loading