Skip to content

Complete implementation of all commands, in-place editing, and optimizations#77

Merged
sylvestre merged 46 commits intouutils:mainfrom
dspinellis:aci-commands
Jun 4, 2025
Merged

Complete implementation of all commands, in-place editing, and optimizations#77
sylvestre merged 46 commits intouutils:mainfrom
dspinellis:aci-commands

Conversation

@dspinellis
Copy link
Copy Markdown
Collaborator

@dspinellis dspinellis commented May 19, 2025

At this state sed implements all POSIX commands, and can correctly run the two complex scripts: hanoi.sed (solves the Towers of Hanoi puzzle) and math.sed (implements an arbitrary precision integer math calculator).

The performance of this Rust implementation is now better than the GNU and FreeBSD implementations for most benchmark cases.

GNU sed under Linux

               no-op-short rust is  3.04 times faster than gnu
          access-log-no-op rust is  1.91 times faster than gnu
       access-log-no-subst rust is  1.59 times faster than gnu
          access-log-subst rust is  1.32 times faster than gnu
         access-log-no-del rust is  1.54 times faster than gnu
        access-log-all-del rust is  1.99 times faster than gnu
       access-log-translit rust is 12.67 times faster than gnu
    access-log-complex-sub gnu is  4.24 times faster than rust
                 remove-cr rust is  1.68 times faster than gnu
              genome-subst rust is  1.15 times faster than gnu
                number-fix rust is  1.06 times faster than gnu
               long-script gnu is  2.19 times faster than rust
                     hanoi rust is  1.96 times faster than gnu
                 factorial rust is  1.25 times faster than gnu

FreeBSD sed

               no-op-short rust is  3.50 times faster than fbsd
          access-log-no-op rust is  2.45 times faster than fbsd
       access-log-no-subst rust is  1.38 times faster than fbsd
          access-log-subst rust is  4.02 times faster than fbsd
         access-log-no-del rust is  1.35 times faster than fbsd
        access-log-all-del rust is  4.74 times faster than fbsd
       access-log-translit fbsd is  1.14 times faster than rust
    access-log-complex-sub rust is  2.00 times faster than fbsd
                 remove-cr rust is  2.48 times faster than fbsd
              genome-subst rust is  2.21 times faster than fbsd
                number-fix fbsd is  1.04 times faster than rust
               long-script fbsd is 15.46 times faster than rust
                     hanoi rust is  9.33 times faster than fbsd
                 factorial rust is 115.72 times faster than fbsd

Remaining tasks

  • Improve runtime error reporting by including script coordinates in each command.
  • Fix buffering on terminal output to match current implementations
  • Profile lagging benchmarks to see whether there's room for further optimizations.
  • Implement more GNU extensions.

dspinellis added 15 commits May 19, 2025 19:28
According to POSIX, labels consist of portable filename characters.
Previously empty RE were saved and reused based on their occurrence
in the compiled script.

Per POSIX if an RE is empty (that is, no pattern is specified) sed shall
behave as if the last RE used in the last command applied (either as an
address or as part of a substitute command) was specified.

Now REs are saved and reused at runtime.
The previous order resulted in a duplicate label error.
regex::Regex EREs don't support back-references, so switch to
fancy_regex.

While at it remove test_compile_re_reuse_saved test case, which is no
longer valid, as REs are reused at runtime.
@dspinellis dspinellis changed the title Implement a, c, i commands and fix faults to run example scripts Implement a, c, i commands and fix faults to run sophisticated scripts May 19, 2025
Comment thread src/uu/sed/src/processor.rs Outdated
fn re_or_saved_re(regex: &Option<Regex>, context: &mut ProcessingContext) -> UResult<Regex> {
match regex {
Some(re) => {
*context.saved_regex.borrow_mut() = Some(re.clone());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be possible to ignore the clone ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed; fixed.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change further reduced the 50! calculation from 4.34" to 3.46" (against 10.28" of GNU sed).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

impressive!

Comment thread src/uu/sed/src/processor.rs Outdated
Comment on lines +202 to +210
let caps = match caps {
Ok(c) => c,
Err(e) => {
return Err(USimpleError::new(
2,
format!("regular expression capture retrieval error: {}", e),
));
}
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as you wish but you could write:

Suggested change
let caps = match caps {
Ok(c) => c,
Err(e) => {
return Err(USimpleError::new(
2,
format!("regular expression capture retrieval error: {}", e),
));
}
};
let caps = caps
.map_err(|e| USimpleError::new(
2,
format!("regular expression capture retrieval error: {}", e),
))?;

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed!

Comment thread src/uu/sed/src/fast_io.rs

/// Copy the specified file to the output.
pub fn copy_file(&mut self, path: &PathBuf) -> io::Result<()> {
#[cfg(unix)]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you know what will happen here on !unix ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there are no memory mapped data, so there's nothing to flush. (The flush is needed when we switch from coalescing and writing out memory mapped data (with zero copies) to BufWriter.

This is code for the r command, which is still WIP.

This improves performance up to 1.5 times, with no downside.

           no-op-short     rust-sed     is similarly fast as   rust-sed-f0
      access-log-no-op     rust-sed is   1.5 times faster than rust-sed-f0
   access-log-no-subst     rust-sed     is similarly fast as   rust-sed-f0
      access-log-subst     rust-sed     is similarly fast as   rust-sed-f0
     access-log-no-del     rust-sed is   1.1 times faster than rust-sed-f0
    access-log-all-del     rust-sed     is similarly fast as   rust-sed-f0
   access-log-translit     rust-sed is   1.2 times faster than rust-sed-f0
access-log-complex-sub     rust-sed     is similarly fast as   rust-sed-f0
             remove-cr     rust-sed     is similarly fast as   rust-sed-f0
          genome-subst     rust-sed     is similarly fast as   rust-sed-f0
            number-fix     rust-sed is   1.1 times faster than rust-sed-f0
           long-script     rust-sed     is similarly fast as   rust-sed-f0
                 hanoi     rust-sed     is similarly fast as   rust-sed-f0
             factorial     rust-sed     is similarly fast as   rust-sed-f0
dspinellis added 19 commits May 24, 2025 20:26
This improves performance by 10% in the following benchmark case.
Performance in all others remains the same.

access-log-subst     rust-sed is   1.1 times faster than rust-sed-eb64104
In theory, find, find_iter, captures, and captures_iter according to the
values of sub.occurrence, sub.replacement.max_group_number could allow
for better performance.
In practice, specializing for the easiest case to use find did not show
any improvement.

           no-op-short     rust-sed     is similarly fast as   rust-sed-7868f4f
      access-log-no-op     rust-sed     is similarly fast as   rust-sed-7868f4f
   access-log-no-subst     rust-sed     is similarly fast as   rust-sed-7868f4f
      access-log-subst     rust-sed     is similarly fast as   rust-sed-7868f4f
     access-log-no-del     rust-sed     is similarly fast as   rust-sed-7868f4f
    access-log-all-del     rust-sed     is similarly fast as   rust-sed-7868f4f
   access-log-translit     rust-sed     is similarly fast as   rust-sed-7868f4f
access-log-complex-sub     rust-sed     is similarly fast as   rust-sed-7868f4f
             remove-cr     rust-sed     is similarly fast as   rust-sed-7868f4f
          genome-subst     rust-sed     is similarly fast as   rust-sed-7868f4f
            number-fix     rust-sed     is similarly fast as   rust-sed-7868f4f
           long-script     rust-sed     is similarly fast as   rust-sed-7868f4f
                 hanoi     rust-sed     is similarly fast as   rust-sed-7868f4f
             factorial     rust-sed     is similarly fast as   rust-sed-7868f4f

Consequently, this commit just documents the change and its results, and
will be reverted.
This reverts commit f18f772.
No performance improvement was seen.
This improves one benchmark case by 20%:
access-log-all-del rust-sed is 1.2 times faster than rust-sed-f18f772

Unfortunatelly, all other cases remain unaffected.
The hope is that the now available regex::bytes will run find() faster
than captures_iter().  Indeed, using find() increases the performance
of several benchmark cases by 2-12%.

           no-op-short rust-sed is  1.02 times faster than rust-sed-5fad204
      access-log-no-op rust-sed is  1.01 times faster than rust-sed-5fad204
   access-log-no-subst rust-sed is  1.08 times faster than rust-sed-5fad204
      access-log-subst rust-sed is  1.12 times faster than rust-sed-5fad204
   access-log-translit rust-sed is  1.02 times faster than rust-sed-5fad204
access-log-complex-sub rust-sed is  1.04 times faster than rust-sed-5fad204
             remove-cr rust-sed is  1.12 times faster than rust-sed-5fad204
          genome-subst rust-sed is  1.02 times faster than rust-sed-5fad204
                 hanoi rust-sed is  1.01 times faster than rust-sed-5fad204

All other cases remain the same.
Performance improves significantly in the following case:

number-fix rust-sed is 1.07 times faster than rust-sed-90d60b8

Other changes <5% are likely to be noise.
- Add RE to detect specifications that require an RE engine, rather than
  a literal string match.
- Change RE unit tests to test directly the RE rather than the constructor.
- Use arrays for similar tests to avoid repetition.
Regex uses an automaton even for matching literal strings. Moving
through its state transitions is suboptimal.  This commit introduces
literal string matching with the memchr::memmem matcher, which may use
SIMD and other specialized features to speed up the search as well as
other algorithms for special sizes.

This boosts substantially the performance of several benchmark cases.

access-log-no-subst rust-86 is 2.84 times faster than rust-sed-a6
   access-log-subst rust-86 is 1.87 times faster than rust-sed-a6
  access-log-no-del rust-86 is 3.00 times faster than rust-sed-a6
 access-log-all-del rust-86 is 2.80 times faster than rust-sed-a6
          remove-cr rust-86 is 4.35 times faster than rust-sed-a6
       genome-subst rust-86 is 3.65 times faster than rust-sed-a6

It makes makes the performance of Rust better than the GNU and FreeBSD
implementations for most benchmark cases.

           no-op-short rust is  3.04 times faster than gnu
      access-log-no-op rust is  1.91 times faster than gnu
   access-log-no-subst rust is  1.59 times faster than gnu
      access-log-subst rust is  1.32 times faster than gnu
     access-log-no-del rust is  1.54 times faster than gnu
    access-log-all-del rust is  1.99 times faster than gnu
   access-log-translit rust is 12.67 times faster than gnu
access-log-complex-sub gnu is  4.24 times faster than rust
             remove-cr rust is  1.68 times faster than gnu
          genome-subst rust is  1.15 times faster than gnu
            number-fix rust is  1.06 times faster than gnu
           long-script gnu is  2.19 times faster than rust
                 hanoi rust is  1.96 times faster than gnu
             factorial rust is  1.25 times faster than gnu

           no-op-short rust is  3.50 times faster than fbsd
      access-log-no-op rust is  2.45 times faster than fbsd
   access-log-no-subst rust is  1.38 times faster than fbsd
      access-log-subst rust is  4.02 times faster than fbsd
     access-log-no-del rust is  1.35 times faster than fbsd
    access-log-all-del rust is  4.74 times faster than fbsd
   access-log-translit fbsd is  1.14 times faster than rust
access-log-complex-sub rust is  2.00 times faster than fbsd
             remove-cr rust is  2.48 times faster than fbsd
          genome-subst rust is  2.21 times faster than fbsd
            number-fix fbsd is  1.04 times faster than rust
           long-script fbsd is 15.46 times faster than rust
                 hanoi rust is  9.33 times faster than fbsd
             factorial rust is 115.72 times faster than fbsd
By using uucore's .map_err_context.
@dspinellis dspinellis changed the title Implement a, c, i commands and fix faults to run sophisticated scripts Complete implementation of all commands, in-place editing, and optimizations May 30, 2025
@dspinellis
Copy link
Copy Markdown
Collaborator Author

@sylvestre I integrated the commits for the remaining commands and in-place editing. The implementation is mostly done now.

@sylvestre
Copy link
Copy Markdown
Contributor

Excellent, I will have a look :)

@sylvestre
Copy link
Copy Markdown
Contributor

I will reply to your email too :)

@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 4, 2025

Codecov Report

Attention: Patch coverage is 0% with 760 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (2eae370) to head (0b390c0).
Report is 47 commits behind head on main.

Files with missing lines Patch % Lines
src/uu/sed/src/fast_regex.rs 0.00% 250 Missing ⚠️
src/uu/sed/src/compiler.rs 0.00% 185 Missing ⚠️
src/uu/sed/src/processor.rs 0.00% 149 Missing ⚠️
src/uu/sed/src/in_place.rs 0.00% 88 Missing ⚠️
src/uu/sed/src/fast_io.rs 0.00% 42 Missing ⚠️
src/uu/sed/src/command.rs 0.00% 36 Missing ⚠️
src/uu/sed/src/sed.rs 0.00% 5 Missing ⚠️
src/uu/sed/src/named_writer.rs 0.00% 3 Missing ⚠️
src/uu/sed/src/script_line_provider.rs 0.00% 2 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff           @@
##            main     #77    +/-   ##
======================================
  Coverage   0.00%   0.00%            
======================================
  Files         11      12     +1     
  Lines       2061    2632   +571     
  Branches     182     225    +43     
======================================
- Misses      2061    2632   +571     
Flag Coverage Δ
macos_latest 0.00% <0.00%> (ø)
ubuntu_latest 0.00% <0.00%> (ø)
windows_latest 0.00% <0.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sylvestre sylvestre merged commit 3c0a69f into uutils:main Jun 4, 2025
15 checks passed
dspinellis added a commit that referenced this pull request Jun 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants