Use __kuser_cmpxchg64 for 64-bit atomics on pre-v6 ARM Linux/Android#82

Merged
taiki-e merged 1 commit into main from arm-linux
Mar 25, 2023
taiki-e commented Mar 5, 2023

Currently, we use the fallback implementation for 64-bit atomics on pre-v6 ARM Linux/Android targets such as armv5te-unknown-linux-gnueabi and arm-linux-androideabi.

However, Linux kernel 3.1+ provides kernel user helpers for 64-bit atomics. These can be more efficient than a lock-based fallback implementation, because the helper uses native atomic instructions when the actual CPU supports them.

This PR uses __kuser_cmpxchg64 on Linux kernel 3.1+ and otherwise uses the fallback implementation as before.
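As a rough illustration of that approach (a minimal sketch, not the code in this PR), the following shows how a 64-bit `fetch_add` could be built on top of the helper. The helper address `0xffff0f60`, the version word at `0xffff0ffc`, and the "version >= 5" check are taken from the Linux kernel's kernel user helpers documentation. The names `KuserCmpxchg64`, `kuser_cmpxchg64`, `fetch_add_u64`, and `emulated` are hypothetical and exist only for this example. On non-ARM hosts the sketch substitutes a non-atomic stand-in so that it can be compiled and run anywhere.

```rust
/// C signature documented by the kernel:
/// `int __kuser_cmpxchg64(const int64_t *oldval, const int64_t *newval, volatile int64_t *ptr)`.
/// It returns zero if `*ptr` was changed and non-zero otherwise.
type KuserCmpxchg64 = unsafe extern "C" fn(*const u64, *const u64, *mut u64) -> i32;

/// Returns the kernel helper if it is usable, or `None` if the caller
/// should use the lock-based fallback instead. (Hypothetical name.)
#[cfg(all(target_arch = "arm", any(target_os = "linux", target_os = "android")))]
fn kuser_cmpxchg64() -> Option<KuserCmpxchg64> {
    const KUSER_HELPER_VERSION: *const i32 = 0xffff_0ffc as *const i32;
    const KUSER_CMPXCHG64: *const () = 0xffff_0f60 as *const ();
    // SAFETY: assumes the kernel maps the helper page at its fixed address
    // (i.e. kuser helpers are not disabled in the kernel configuration).
    unsafe {
        // __kuser_cmpxchg64 exists when the helper version is 5 or later (kernel 3.1+).
        if KUSER_HELPER_VERSION.read() >= 5 {
            Some(core::mem::transmute::<*const (), KuserCmpxchg64>(KUSER_CMPXCHG64))
        } else {
            None
        }
    }
}

/// Stand-in for other hosts so the sketch can run anywhere.
/// It is NOT atomic and is only suitable for single-threaded demonstration.
#[cfg(not(all(target_arch = "arm", any(target_os = "linux", target_os = "android"))))]
fn kuser_cmpxchg64() -> Option<KuserCmpxchg64> {
    unsafe extern "C" fn emulated(old: *const u64, new: *const u64, ptr: *mut u64) -> i32 {
        // SAFETY: the caller guarantees that all three pointers are valid.
        unsafe {
            if ptr.read() == old.read() {
                ptr.write(new.read());
                0
            } else {
                1
            }
        }
    }
    Some(emulated as KuserCmpxchg64)
}

/// Adds `val` to `*ptr` with a compare-and-swap loop and returns the previous value.
///
/// # Safety
/// `ptr` must be valid for reads and writes and aligned for `u64`.
unsafe fn fetch_add_u64(cas: KuserCmpxchg64, ptr: *mut u64, val: u64) -> u64 {
    loop {
        // This read may be torn on a 32-bit CPU; the CAS below rejects a torn
        // value. A real implementation needs a race-tolerant read here;
        // `read_volatile` is only a stand-in for this sketch.
        let prev = unsafe { ptr.read_volatile() };
        let next = prev.wrapping_add(val);
        let changed = unsafe { cas(&prev, &next, ptr) } == 0;
        if changed {
            return prev;
        }
    }
}
```

A caller would check `kuser_cmpxchg64()` once and take the lock-based fallback path when it returns `None`. Other read-modify-write operations can follow the same loop with a different way of computing `next`.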

Since Rust 1.64, the Linux kernel requirement for Rust when using std [1] is 3.2+, so it should be possible to omit the dynamic kernel version check if the std feature is enabled on Rust 1.64+, but that has not yet been implemented.

Footnotes

  1. https://blog.rust-lang.org/2022/08/01/Increasing-glibc-kernel-requirements.html#affected-targets says "Targets which only use libcore and not libstd are unaffected."

@taiki-e taiki-e added the O-arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state label Mar 5, 2023
@taiki-e taiki-e force-pushed the arm-linux branch 10 times, most recently from a7a39ac to b91c04c Compare March 12, 2023 12:00
@taiki-e taiki-e marked this pull request as ready for review March 12, 2023 12:04
@taiki-e taiki-e force-pushed the arm-linux branch 8 times, most recently from aecb2af to 97993ae Compare March 25, 2023 16:05
taiki-e commented Mar 25, 2023

Ok, benchmarked in AArch32 mode on Graviton2 (Neoverse-N1): https://cirrus-ci.com/task/6137907514179584
In most cases it was about twice as fast as the fallback implementation.

@taiki-e taiki-e merged commit d97ddb7 into main Mar 25, 2023
@taiki-e taiki-e deleted the arm-linux branch March 25, 2023 17:27
tgross35 pushed a commit to rust-lang/compiler-builtins that referenced this pull request Jan 22, 2026
This is a PR for thumbv6-none-eabi (bare-metal Armv6k in Thumb mode),
which is proposed to be added by
rust-lang/rust#150138.

Armv6k supports atomic instructions, but they are unavailable in Thumb
mode unless Thumb-2 instructions are available (v6t2).

Using Thumb interworking (can be used via `#[instruction_set]`) allows
us to use these instructions even from Thumb mode without Thumb-2
instructions, but LLVM does not implement that processing (as of LLVM
21), so this PR implements it in compiler-builtins.

The code around `__sync` builtins is basically copied from
`arm_linux.rs` which uses kernel_user_helpers for atomic implementation.
The atomic implementation is a port of my [atomic-maybe-uninit inline
assembly code].

This PR has been tested on QEMU 10.2.0 using compiler-builtins and core
patched with the changes in this PR and
rust-lang/rust#150138, together with the [portable-atomic
no-std test suite] (which can be run with `./tools/no-std.sh
thumbv6-none-eabi` in that repo), which tests wrappers around
`core::sync::atomic`. (Note that the target spec used in the test sets
max-atomic-width to 32 and atomic_cas to true, unlike the current
rust-lang/rust#150138.) The original
atomic-maybe-uninit implementation has been tested on real Arm hardware.

(Note that Armv6k also supports 64-bit atomic instructions, but they are
skipped here. This is because there is no corresponding code in
`arm_linux.rs` (since the kernel requirements increased in 1.64, it may
be possible to implement 64-bit atomics there as well; see also
taiki-e/portable-atomic#82), and because the code becomes
more complex than for 32-bit and smaller atomics.)

[atomic-maybe-uninit inline assembly code]: https://github.com/taiki-e/atomic-maybe-uninit/blob/HEAD/src/arch/arm.rs
[portable-atomic no-std test suite]: https://github.com/taiki-e/portable-atomic/tree/HEAD/tests/no-std-qemu
tgross35 pushed a commit to tgross35/rust that referenced this pull request Feb 10, 2026