wasi: atomics-aware threading and synchronous sort pipeline#11712

Open
DePasqualeOrg wants to merge 6 commits into uutils:main from DePasqualeOrg:wasi-support

@DePasqualeOrg DePasqualeOrg commented Apr 8, 2026

This PR includes comprehensive wasm32-wasi platform improvements, enabling many more tools to compile for WASI beyond the existing feat_wasm set. It includes a full synchronous fallback for sort on targets without thread support, and symlink support for cp via WASI's symlink_path.

These changes underwent many rounds of review and refinement with Claude Code and Codex. I've tested them extensively running on Wasmtime and Wasmer.

Motivation

Recent PRs (#11574, #11568, #11569, #11573, #11595, #11624) added initial WASI support – platform stubs, FileInformation, tail feature-gating, and a basic single-threaded sort path. This PR builds on that work with three improvements:

  1. Atomics-aware cfg guards: uses target_feature = "atomics" to distinguish wasm32-wasip1 (no threads) from wasm32-wasip1-threads, so the threaded sort path is preserved on runtimes that support it.
  2. Full synchronous sort pipeline: replaces the in-memory-only fallback with a proper chunked sort-write-merge strategy (ext_sort, merge, check) that handles large inputs via temp files.
  3. Real symlink support for cp: uses symlink_path instead of returning errors. (ln symlink support is in a separate PR.)

This PR supports the following WASI targets:

  • wasm32-wasip1: universal baseline, works on all WASI runtimes
  • wasm32-wasip1-threads: supports threads via the atomics and bulk-memory proposals

Changes

uucore

| File | Change |
| --- | --- |
| lib.rs | Add #![cfg_attr(all(target_os = "wasi", feature = "fs"), feature(wasi_ext))]. Only std::os::wasi::fs needs the unstable gate; std::os::wasi::ffi is stable |
| features/fs.rs | Add WASI variant to FileInformation using std::os::wasi::fs::MetadataExt for nlink(), PartialEq (dev+ino), Hash (dev+ino). Add is_stdin_directory, path_ends_with_terminator |
| features/fsext.rs | Remove outer #[cfg(not(target_os = "wasi"))] from read_fs_list so the inner WASI block (returning empty Vec) is reachable |
| features/mode.rs | Add WASI get_umask() returning default 0o022 |
| mods/io.rs | WASI into_stdio() converts through File first (no direct Stdio::from(OwnedFd) on WASI) |

sort: synchronous fallback for WASI without threads

On wasm32-wasip1, sort crashes because it unconditionally spawns threads via std::thread::spawn and rayon. This PR adds synchronous code paths gated on #[cfg(all(target_os = "wasi", not(target_feature = "atomics")))] so that sort works on both WASI targets.
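
As a rough illustration of that gating (not the PR's actual code), the sketch below defines the same function twice and lets the cfg predicate pick one at compile time. The name sort_items and the two-halves threading strategy are assumptions made for the example; the real threaded path uses rayon and dedicated reader/sorter threads.

```rust
use std::cmp::Ordering;

/// Threaded variant: compiled everywhere except WASI builds without atomics.
/// Sorts the two halves concurrently, then lets a stable sort combine them.
/// (Hypothetical helper; the PR itself relies on rayon for this path.)
#[cfg(not(all(target_os = "wasi", not(target_feature = "atomics"))))]
pub fn sort_items<T, F>(items: &mut [T], cmp: F)
where
    T: Send,
    F: Fn(&T, &T) -> Ordering + Sync,
{
    let mid = items.len() / 2;
    let (left, right) = items.split_at_mut(mid);
    std::thread::scope(|scope| {
        // One half is sorted on a scoped thread, the other on this thread.
        scope.spawn(|| left.sort_by(|a, b| cmp(a, b)));
        right.sort_by(|a, b| cmp(a, b));
    });
    // Both halves are sorted now; a final stable sort combines them.
    items.sort_by(|a, b| cmp(a, b));
}

/// Synchronous variant: the only one compiled for WASI without atomics,
/// so no thread is ever spawned on that target.
#[cfg(all(target_os = "wasi", not(target_feature = "atomics")))]
pub fn sort_items<T, F>(items: &mut [T], cmp: F)
where
    T: Send,
    F: Fn(&T, &T) -> Ordering + Sync,
{
    items.sort_by(cmp);
}
```

Callers use sort_items the same way on every target, for example sort_items(&mut lines, |a, b| a.cmp(b)); only the body behind the cfg differs.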

The sort command uses threads in four places, each with a synchronous alternative:

| File | Threaded path | Synchronous fallback |
| --- | --- | --- |
| ext_sort/threaded.rs | Sorter thread for parallel chunk sorting | Sequential read-sort-write loop with same chunked strategy |
| merge.rs | Reader thread for async file I/O during merge | SyncFileMerger that reads on demand |
| check.rs | Reader thread for async file I/O during order checking | check_sync that reads and checks inline |
| sort.rs | Rayon par_sort_by / par_sort_unstable_by | sort_by / sort_unstable_by |
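
To make the "reads on demand" and "checks inline" rows concrete, here is a minimal sketch of both ideas using only the standard library. It is an assumption-laden stand-in, not SyncFileMerger or check_sync themselves: merge_sorted, first_disorder, and next_line are hypothetical names, and lines are compared as plain strings rather than with sort's key and locale handling.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::io::{self, BufRead, Write};

/// Read one line without its trailing newline; `None` at end of input.
fn next_line<R: BufRead>(src: &mut R) -> io::Result<Option<String>> {
    let mut buf = String::new();
    if src.read_line(&mut buf)? == 0 {
        return Ok(None);
    }
    if buf.ends_with('\n') {
        buf.pop();
    }
    Ok(Some(buf))
}

/// Merge already-sorted sources into `out`. A source is read again only
/// after its previous line has been written, so no reader thread is needed.
pub fn merge_sorted<R: BufRead, W: Write>(mut sources: Vec<R>, out: &mut W) -> io::Result<()> {
    // Min-heap of (line, source index); `Reverse` flips the max-heap order.
    let mut heap = BinaryHeap::new();
    for (idx, src) in sources.iter_mut().enumerate() {
        if let Some(line) = next_line(src)? {
            heap.push(Reverse((line, idx)));
        }
    }
    while let Some(Reverse((line, idx))) = heap.pop() {
        out.write_all(line.as_bytes())?;
        out.write_all(b"\n")?;
        if let Some(next) = next_line(&mut sources[idx])? {
            heap.push(Reverse((next, idx)));
        }
    }
    Ok(())
}

/// Return the 1-based number of the first line that sorts before its
/// predecessor, or `None` if the input is already in order.
pub fn first_disorder<R: BufRead>(mut input: R) -> io::Result<Option<usize>> {
    let mut prev: Option<String> = None;
    let mut line_no = 0;
    while let Some(line) = next_line(&mut input)? {
        line_no += 1;
        if prev.as_deref().is_some_and(|p| line.as_str() < p) {
            return Ok(Some(line_no));
        }
        prev = Some(line);
    }
    Ok(None)
}
```

Both functions hold at most one pending line per source, which is the property that lets them replace a background reader thread.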

The ext_sort module unconditionally compiles the threaded module, which handles both cases via internal cfg guards. The existing separate wasi.rs (a read-all-into-memory fallback from #11624) is removed in favor of this more complete implementation, along with its unused parse_into_chunk helper.

Both the synchronous ext_sort and merge paths emit a warning and fall back to uncompressed temp files if --compress-program is passed, since process spawning is not available on WASI without threads.

The chunks.rs file extracts read_to_chunk() from read() so both threaded and synchronous code paths share the same chunk-reading logic.
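
The shape of that refactor can be sketched as follows, under simplifying assumptions: read_chunk stands in for read_to_chunk() (the real function works on byte buffers and parsed lines, not Vec<String>), and sort_chunks plus its write_run callback are hypothetical names for the sequential read-sort-write loop. The callback is where a real implementation would write each sorted run to a temp file.

```rust
use std::io::{self, BufRead};

/// Fill `chunk` with up to `max_lines` lines. Returns `false` once the input
/// is exhausted and nothing was read. A single reader like this can be
/// shared by a threaded and a synchronous caller.
pub fn read_chunk<R: BufRead>(
    input: &mut R,
    max_lines: usize,
    chunk: &mut Vec<String>,
) -> io::Result<bool> {
    chunk.clear();
    let mut buf = String::new();
    // Treat a limit of 0 as 1 so the loop always makes progress.
    while chunk.len() < max_lines.max(1) {
        buf.clear();
        if input.read_line(&mut buf)? == 0 {
            break;
        }
        chunk.push(buf.trim_end_matches('\n').to_owned());
    }
    Ok(!chunk.is_empty())
}

/// Sequential read-sort-write loop: read a chunk, sort it in place, hand the
/// sorted run to `write_run`, repeat. Returns the number of runs produced.
pub fn sort_chunks<R, F>(mut input: R, max_lines: usize, mut write_run: F) -> io::Result<usize>
where
    R: BufRead,
    F: FnMut(&[String]) -> io::Result<()>,
{
    let mut chunk = Vec::new();
    let mut runs = 0;
    while read_chunk(&mut input, max_lines, &mut chunk)? {
        chunk.sort_unstable();
        write_run(&chunk)?;
        runs += 1;
    }
    Ok(runs)
}
```

Memory use is bounded by one chunk at a time, which is what allows inputs larger than memory; the sorted runs still need a merge step afterwards.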

Other tool crates

| Tool | File | Change |
| --- | --- | --- |
| cat | platform/mod.rs | Add WASI is_unsafe_overwrite stub (returns false) |
| cp | cp.rs | Add WASI symlink using std::os::wasi::fs::symlink_path; return error for timestamp preservation on WASI (filetime panics in from_last_access_time/from_last_modification_time). handle_preserve suppresses the error when preservation is optional (-a) and reports it when required (--preserve=timestamps) |
| env | native_int_str.rs | Add #[cfg(target_os = "wasi")] use std::os::wasi::ffi::{OsStrExt, OsStringExt} |
| mktemp | mktemp.rs | Gate permissions with #[cfg(unix)] instead of #[cfg(not(windows))] |
| sort | Cargo.toml | Exclude ctrlc crate on WASI (no signal handling); make rayon conditional on atomics support |
| sort | tmp_dir.rs | Add WASI no-op signal handler; gate ctrlc usage |
| sort | sort.rs | Check TMPDIR env var before calling env::temp_dir() (panics on WASI) |
| tail | platform/mod.rs | Add WASI stubs for Pid, ProcessChecker (#[allow(dead_code)], since follow mode is disabled on WASI), supports_pid_checks |
| tail | paths.rs | Add WASI file_id_eq (returns false: no stable inode API on WASI yet) |
| tail | text.rs, args.rs | Add WASI backend name and help text |
| touch | touch.rs | Return UnsupportedPlatformFeature error for touch - on WASI (no /dev/stdout path) |

What's stubbed vs. fully functional

Most WASI stubs are for Unix concepts that don't exist in WASI's capability-based security model:

  • Stubbed (no-op): signal handling, PID monitoring, umask, file ownership checks, hostname, Unix permission display, touch - (returns error – WASI has no /dev/stdout path), timestamp preservation in cp (filetime crate panics on WASI)
  • Fully functional: file I/O, directory operations, sorting (threaded or synchronous), text processing, symlinks (relative targets only – absolute targets are rejected by WASI's capability-based sandbox), temp files, environment variables

Build requirements

  • Rust nightly (for std::os::wasi::fs extensions, gated with #![cfg_attr(all(target_os = "wasi", feature = "fs"), feature(wasi_ext))])
  • wasm32-wasip1: cargo +nightly build --target wasm32-wasip1 --release
  • wasm32-wasip1-threads: cargo +nightly build --target wasm32-wasip1-threads -Zbuild-std=std,panic_abort --release

Testing

  • Host (macOS): cargo build --release compiles with no warnings
  • wasm32-wasip1: compiles, uses synchronous paths for sort
  • wasm32-wasip1-threads: compiles, uses threaded paths for sort
  • All newly enabled tools verified working on Wasmer (wasip1-threads) and Wasmtime (wasip1)
  • sort requires TMPDIR environment variable on WASI (Rust's std::env::temp_dir() panics on WASI – this is a Rust std library issue, not specific to coreutils)
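
The TMPDIR requirement follows from checking the variable before ever touching std::env::temp_dir(). A minimal sketch of that ordering is shown below; resolve_tmp_dir and default_tmp_dir are hypothetical names, and the error returned on WASI is an assumption about reasonable behavior rather than the PR's exact message.

```rust
use std::env;
use std::ffi::OsString;
use std::io;
use std::path::PathBuf;

/// Fallback when TMPDIR is unset. On WASI this returns an error instead of
/// calling `env::temp_dir()`, which the description above reports as panicking.
#[cfg(target_os = "wasi")]
fn default_tmp_dir() -> io::Result<PathBuf> {
    Err(io::Error::new(
        io::ErrorKind::NotFound,
        "TMPDIR must be set on WASI",
    ))
}

/// Fallback on every other platform: the standard library default.
#[cfg(not(target_os = "wasi"))]
fn default_tmp_dir() -> io::Result<PathBuf> {
    Ok(env::temp_dir())
}

/// Prefer an explicit, non-empty TMPDIR value; consult the platform fallback
/// only when none was given.
pub fn resolve_tmp_dir(tmpdir: Option<OsString>) -> io::Result<PathBuf> {
    match tmpdir {
        Some(dir) if !dir.is_empty() => Ok(PathBuf::from(dir)),
        _ => default_tmp_dir(),
    }
}

/// Convenience wrapper that reads the real environment.
pub fn sort_tmp_dir() -> io::Result<PathBuf> {
    resolve_tmp_dir(env::var_os("TMPDIR"))
}
```

Taking the variable as a parameter keeps the decision testable without mutating the process environment.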

Note on cfg alias commit

The last commit adds a build.rs to the sort crate that defines a wasi_no_threads cfg alias, replacing ~49 instances of the verbose #[cfg(all(target_os = "wasi", not(target_feature = "atomics")))] with #[cfg(wasi_no_threads)]. This is a readability improvement only and can be reverted if maintainers prefer the explicit predicates.
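
For readers unfamiliar with cfg aliases, a build script along these lines would produce one. This is a sketch, not the commit's build.rs: the helper names are hypothetical, and it assumes a Cargo version that accepts the cargo:: directive syntax (1.77 or later). A real build.rs would call emit_cfg_alias() from its main function.

```rust
use std::env;

/// True when the target is WASI and its feature list lacks `atomics`.
/// `features` is the comma-separated list Cargo exposes to build scripts
/// in the CARGO_CFG_TARGET_FEATURE environment variable.
pub fn is_wasi_no_threads(target_os: &str, features: &str) -> bool {
    target_os == "wasi" && !features.split(',').any(|f| f == "atomics")
}

/// The directive lines a build script would print for the given target.
pub fn cfg_directives(target_os: &str, features: &str) -> Vec<String> {
    // Always declare the custom cfg so the `unexpected_cfgs` lint accepts it.
    let mut lines = vec!["cargo::rustc-check-cfg=cfg(wasi_no_threads)".to_string()];
    if is_wasi_no_threads(target_os, features) {
        // Enable `#[cfg(wasi_no_threads)]` for this compilation only.
        lines.push("cargo::rustc-cfg=wasi_no_threads".to_string());
    }
    lines
}

/// Intended body of the build script's `main`.
pub fn emit_cfg_alias() {
    let os = env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    let features = env::var("CARGO_CFG_TARGET_FEATURE").unwrap_or_default();
    for line in cfg_directives(&os, &features) {
        println!("{line}");
    }
}
```

With the alias in place, #[cfg(wasi_no_threads)] and #[cfg(not(wasi_no_threads))] select the synchronous and threaded code respectively.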

Add #[cfg(target_os = "wasi")] blocks alongside existing unix and windows
platform code. No changes to existing platform behavior.

Enables compilation to wasm32-wasip1 and wasm32-wasip1-threads targets
for running in WASI-compatible runtimes like WasmKit and Wasmer.

On wasm32-wasip1 (no atomics), sort crashes because ext_sort, merge,
check, and rayon all spawn threads unconditionally. Add synchronous
code paths gated on cfg(all(target_os = "wasi", not(target_feature
= "atomics"))) so sort works on both wasip1 (sync) and
wasip1-threads (threaded).

Key changes:
- Extract read_to_chunk() from chunks::read() for shared use
- Add synchronous ext_sort with chunked sort-write-merge flow
- Add SyncFileMerger for threadless merge operations
- Add synchronous check for order verification
- Gate rayon par_sort with sequential fallback
@oech3
Contributor

oech3 commented Apr 8, 2026

Would you split the PR (at least for symlink support)?

@DePasqualeOrg
Author

I separated the symlink changes out into #11713.

@DePasqualeOrg DePasqualeOrg changed the title wasi: atomics-aware threading, synchronous sort pipeline, and symlink support wasi: atomics-aware threading and synchronous sort pipeline Apr 8, 2026
@DePasqualeOrg
Author

I added a new commit to address an issue that would cause CI failures.

The WASI CI uses stable Rust, but std::os::wasi::fs requires the unstable feature(wasi_ext) gate. This commit replaces all unstable APIs with stable libc equivalents:

  • uucore/fs.rs: FileInformation now stores libc::stat instead of std::fs::Metadata on WASI, using libc::fstat/libc::stat/libc::lstat for dev/ino/nlink access
  • cp.rs: std::os::wasi::fs::symlink_path replaced with libc::symlink
  • uucore/lib.rs: #![cfg_attr(..., feature(wasi_ext))] removed entirely
  • is_enotsup_error() updated to use libc::EOPNOTSUPP on WASI instead of a hardcoded value

@DePasqualeOrg
Author

I resolved the linter error in CI.

@github-actions

github-actions bot commented Apr 8, 2026

GNU testsuite comparison:

GNU test failed: tests/tail/tail-n0f. tests/tail/tail-n0f is passing on 'main'. Maybe you have to rebase?
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/expand/bounded-memory is now passing!
Congrats! The gnu test tests/printf/printf-surprise is now passing!

@github-actions

github-actions bot commented Apr 8, 2026

GNU testsuite comparison:

Skip an intermittent issue tests/tail/symlink (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/cp/link-heap is now being skipped but was previously passing.
Note: The gnu test tests/pr/bounded-memory is now being skipped but was previously passing.
Congrats! The gnu test tests/expand/bounded-memory is now passing!
Congrats! The gnu test tests/printf/printf-surprise is now passing!

@github-actions

github-actions bot commented Apr 8, 2026

GNU testsuite comparison:

Skip an intermittent issue tests/cut/bounded-memory (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/expand/bounded-memory is now passing!
Congrats! The gnu test tests/printf/printf-surprise is now passing!

@oech3
Contributor

oech3 commented Apr 8, 2026

Is it possible to split the PR per utility? The diff is still too big. How about sort?

@DePasqualeOrg
Author

The changes here are interdependent, and splitting out sort would result in PRs of ~560 and ~110 lines each, which would need to be sequenced properly and rebased on upstream changes. They should also probably wait for #11717, in which I've enabled integration tests. If/when that PR is merged, I can enable more tests for these tools in this PR. Keeping them in one PR would reduce the cognitive load for me, since I'm now keeping track of three related PRs.
