wasi: atomics-aware threading and synchronous sort pipeline#11712
DePasqualeOrg wants to merge 6 commits into uutils:main
Add #[cfg(target_os = "wasi")] blocks alongside existing unix and windows platform code. No changes to existing platform behavior. Enables compilation to wasm32-wasip1 and wasm32-wasip1-threads targets for running in WASI-compatible runtimes like WasmKit and Wasmer.
On wasm32-wasip1 (no atomics), sort crashes because ext_sort, merge, check, and rayon all spawn threads unconditionally. Add synchronous code paths gated on cfg(all(target_os = "wasi", not(target_feature = "atomics"))) so sort works on both wasip1 (sync) and wasip1-threads (threaded).

Key changes:
- Extract read_to_chunk() from chunks::read() for shared use
- Add synchronous ext_sort with chunked sort-write-merge flow
- Add SyncFileMerger for threadless merge operations
- Add synchronous check for order verification
- Gate rayon par_sort with sequential fallback
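The gating pattern described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `sort_lines` is a hypothetical helper, and the threaded variant uses `std::thread::scope` as a stand-in for rayon's `par_sort_by` so that the example has no external dependencies.

```rust
use std::cmp::Ordering;

/// Hypothetical helper (not from the PR): WASI without atomics has no
/// thread support, so this variant sorts sequentially.
#[cfg(all(target_os = "wasi", not(target_feature = "atomics")))]
pub fn sort_lines<F>(lines: &mut [String], compare: F)
where
    F: Fn(&String, &String) -> Ordering + Sync,
{
    lines.sort_by(|a, b| compare(a, b));
}

/// Hypothetical helper (not from the PR): every other target may spawn
/// threads. Scoped threads stand in for rayon's `par_sort_by` here.
#[cfg(not(all(target_os = "wasi", not(target_feature = "atomics"))))]
pub fn sort_lines<F>(lines: &mut [String], compare: F)
where
    F: Fn(&String, &String) -> Ordering + Sync,
{
    let mid = lines.len() / 2;
    let (left, right) = lines.split_at_mut(mid);
    // Sort the two halves concurrently; the scope joins both threads
    // before returning.
    std::thread::scope(|s| {
        s.spawn(|| left.sort_by(|a, b| compare(a, b)));
        s.spawn(|| right.sort_by(|a, b| compare(a, b)));
    });
    // A final stable sort combines the two presorted halves.
    lines.sort_by(|a, b| compare(a, b));
}
```

Callers see a single `sort_lines` signature on every target, and only one of the two bodies is compiled, which is the property the cfg predicate is meant to provide.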
Would you split the PR (at least for symlink support)?
Force-pushed from 46e35c5 to 9bced35.
Force-pushed from 9bced35 to aa75a81.
I separated the symlink changes out into #11713.
I added a new commit to address an issue that would cause CI failures. The WASI CI uses stable Rust, but
I resolved the linter error in CI.
GNU testsuite comparison:
Force-pushed from 6b396f5 to 562cefd.
Is it possible to split the PR per utility? The diff is still too big. How about
The changes here are interdependent, and splitting out
This PR includes comprehensive `wasm32-wasi` platform improvements, enabling many more tools to compile for WASI beyond the existing `feat_wasm` set. It includes a full synchronous fallback for `sort` on targets without thread support, and symlink support for `cp` via WASI's `symlink_path`.

These changes underwent many rounds of review and refinement with Claude Code and Codex. I've tested them extensively running on Wasmtime and Wasmer.
Motivation
Recent PRs (#11574, #11568, #11569, #11573, #11595, #11624) added initial WASI support – platform stubs, `FileInformation`, tail feature-gating, and a basic single-threaded sort path. This PR builds on that work with three improvements:
- Uses `target_feature = "atomics"` to distinguish `wasm32-wasip1` (no threads) from `wasm32-wasip1-threads`, so the threaded sort path is preserved on runtimes that support it.
- `cp`: uses `symlink_path` instead of returning errors. (`ln` symlink support is in a separate PR.)

This PR supports the following WASI targets:
- `wasm32-wasip1`: universal baseline, works on all WASI runtimes
- `wasm32-wasip1-threads`: supports threads via the atomics and bulk-memory proposals

Changes
uucore
| File | Change |
| --- | --- |
| `lib.rs` | `#![cfg_attr(all(target_os = "wasi", feature = "fs"), feature(wasi_ext))]`: only `std::os::wasi::fs` needs the unstable gate; `std::os::wasi::ffi` is stable |
| `features/fs.rs` | `FileInformation` using `std::os::wasi::fs::MetadataExt` for `nlink()`, `PartialEq` (dev+ino), `Hash` (dev+ino). Add `is_stdin_directory`, `path_ends_with_terminator` |
| `features/fsext.rs` | Remove `#[cfg(not(target_os = "wasi"))]` from `read_fs_list` so the inner WASI block (returning empty `Vec`) is reachable |
| `features/mode.rs` | `get_umask()` returning default 0o022 |
| `mods/io.rs` | `into_stdio()` converts through `File` first (no direct `Stdio::from(OwnedFd)` on WASI) |

sort: synchronous fallback for WASI without threads
On `wasm32-wasip1`, `sort` crashes because it unconditionally spawns threads via `std::thread::spawn` and rayon. This PR adds synchronous code paths gated on `#[cfg(all(target_os = "wasi", not(target_feature = "atomics")))]` so that sort works on both WASI targets.

The sort command uses threads in four places, each with a synchronous alternative:
| File | Synchronous alternative |
| --- | --- |
| `ext_sort/threaded.rs` | Synchronous chunked sort-write-merge flow |
| `merge.rs` | `SyncFileMerger` that reads on demand |
| `check.rs` | `check_sync` that reads and checks inline |
| `sort.rs` | `sort_by`/`sort_unstable_by` in place of `par_sort_by`/`par_sort_unstable_by` |

The `ext_sort` module unconditionally compiles the `threaded` module, which handles both cases via internal `cfg` guards. The existing separate `wasi.rs` (a read-all-into-memory fallback from #11624) is removed in favor of this more complete implementation, along with its unused `parse_into_chunk` helper.

Both the synchronous `ext_sort` and `merge` paths emit a warning and fall back to uncompressed temp files if `--compress-program` is passed, since process spawning is not available on WASI without threads.

The `chunks.rs` file extracts `read_to_chunk()` from `read()` so both threaded and synchronous code paths share the same chunk-reading logic.

Other tool crates
| File | Change |
| --- | --- |
| `platform/mod.rs` | `is_unsafe_overwrite` stub (returns false) |
| `cp.rs` | `std::os::wasi::fs::symlink_path`; return error for timestamp preservation on WASI (`filetime` panics in `from_last_access_time`/`from_last_modification_time`); `handle_preserve` suppresses for optional (`-a`) and reports for required (`--preserve=timestamps`) |
| `native_int_str.rs` | `#[cfg(target_os = "wasi")] use std::os::wasi::ffi::{OsStrExt, OsStringExt}` |
| `mktemp.rs` | `#[cfg(unix)]` instead of `#[cfg(not(windows))]` |
| `Cargo.toml` | `ctrlc` crate on WASI (no signal handling); make `rayon` conditional on atomics support |
| `tmp_dir.rs` | `ctrlc` usage |
| `sort.rs` | `TMPDIR` env var before calling `env::temp_dir()` (panics on WASI) |
| `platform/mod.rs` | `Pid`, `ProcessChecker` (`#[allow(dead_code)]`: follow mode is disabled on WASI), `supports_pid_checks` |
| `paths.rs` | `file_id_eq` (returns false: no stable inode API on WASI yet) |
| `text.rs`, `args.rs` | |
| `touch.rs` | `UnsupportedPlatformFeature` error for `touch -` on WASI (no `/dev/stdout` path) |

What's stubbed vs. fully functional
Most WASI stubs are for Unix concepts that don't exist in WASI's capability-based security model:
- `touch -` (returns error – WASI has no `/dev/stdout` path), timestamp preservation in `cp` (`filetime` crate panics on WASI)

Build requirements
- Nightly Rust (for `std::os::wasi::fs` extensions, gated with `#![cfg_attr(all(target_os = "wasi", feature = "fs"), feature(wasi_ext))]`)
- `wasm32-wasip1`: `cargo +nightly build --target wasm32-wasip1 --release`
- `wasm32-wasip1-threads`: `cargo +nightly build --target wasm32-wasip1-threads -Zbuild-std=std,panic_abort --release`

Testing
- `cargo build --release` compiles with no warnings
- `wasm32-wasip1`: compiles, uses synchronous paths for sort
- `wasm32-wasip1-threads`: compiles, uses threaded paths for sort
- Tested on Wasmer (`wasip1-threads`) and Wasmtime (`wasip1`)
- `sort` requires the `TMPDIR` environment variable on WASI (Rust's `std::env::temp_dir()` panics on WASI – this is a Rust std library issue, not specific to coreutils)

Note on cfg alias commit
The last commit adds a `build.rs` to the sort crate that defines a `wasi_no_threads` cfg alias, replacing ~49 instances of the verbose `#[cfg(all(target_os = "wasi", not(target_feature = "atomics")))]` with `#[cfg(wasi_no_threads)]`. This is a readability improvement only and can be reverted if maintainers prefer the explicit predicates.
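For reviewers unfamiliar with cfg aliases, a build script of this kind might look roughly like the sketch below. It is an illustration under assumptions, not the script from this PR: the helper names `is_wasi_no_threads`, `cargo_directives`, and `emit_cfg_alias` are hypothetical, and it assumes Cargo exposes the `atomics` feature through `CARGO_CFG_TARGET_FEATURE` on the toolchain in use.

```rust
use std::env;

/// Hypothetical helper: true when the target is WASI and the `atomics`
/// target feature is absent from the comma-separated feature list.
fn is_wasi_no_threads(target_os: &str, target_features: &str) -> bool {
    let has_atomics = target_features.split(',').any(|f| f.trim() == "atomics");
    target_os == "wasi" && !has_atomics
}

/// Hypothetical helper: builds the directive lines the script prints for Cargo.
fn cargo_directives(target_os: &str, target_features: &str) -> Vec<String> {
    // Declare the custom cfg name so the `unexpected_cfgs` lint accepts it.
    let mut lines = vec!["cargo:rustc-check-cfg=cfg(wasi_no_threads)".to_string()];
    if is_wasi_no_threads(target_os, target_features) {
        // Enable `#[cfg(wasi_no_threads)]` for this crate's compilation.
        lines.push("cargo:rustc-cfg=wasi_no_threads".to_string());
    }
    lines
}

/// Entry point that a `build.rs` `main` function would call.
/// Cargo sets both environment variables when it runs build scripts.
pub fn emit_cfg_alias() {
    let target_os = env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    let target_features = env::var("CARGO_CFG_TARGET_FEATURE").unwrap_or_default();
    for line in cargo_directives(&target_os, &target_features) {
        println!("{line}");
    }
}
```

Keeping the decision in a pure function makes the predicate easy to check in isolation; the explicit `#[cfg(all(...))]` form remains equivalent if the alias is reverted.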