Complete implementation of all commands, in-place editing, and optimizations#77
Complete implementation of all commands, in-place editing, and optimizations#77sylvestre merged 46 commits intouutils:mainfrom
Conversation
According to POSIX, labels consist of portable filename characters.
Previously empty RE were saved and reused based on their occurrence in the compiled script. Per POSIX if an RE is empty (that is, no pattern is specified) sed shall behave as if the last RE used in the last command applied (either as an address or as part of a substitute command) was specified. Now REs are saved and reused at runtime.
The previous order resulted in a duplicate label error.
regex::Regex EREs don't support back-references, so switch to fancy_regex. While at it remove test_compile_re_reuse_saved test case, which is no longer valid, as REs are reused at runtime.
| fn re_or_saved_re(regex: &Option<Regex>, context: &mut ProcessingContext) -> UResult<Regex> { | ||
| match regex { | ||
| Some(re) => { | ||
| *context.saved_regex.borrow_mut() = Some(re.clone()); |
There was a problem hiding this comment.
would it be possible to ignore the clone ?
There was a problem hiding this comment.
The change further reduced the 50! calculation from 4.34" to 3.46" (against 10.28" of GNU sed).
| let caps = match caps { | ||
| Ok(c) => c, | ||
| Err(e) => { | ||
| return Err(USimpleError::new( | ||
| 2, | ||
| format!("regular expression capture retrieval error: {}", e), | ||
| )); | ||
| } | ||
| }; |
There was a problem hiding this comment.
as you wish but you could write:
| let caps = match caps { | |
| Ok(c) => c, | |
| Err(e) => { | |
| return Err(USimpleError::new( | |
| 2, | |
| format!("regular expression capture retrieval error: {}", e), | |
| )); | |
| } | |
| }; | |
| let caps = caps | |
| .map_err(|e| USimpleError::new( | |
| 2, | |
| format!("regular expression capture retrieval error: {}", e), | |
| ))?; | |
Suggested by: sylvestre
Suggested by: sylvestre
|
|
||
| /// Copy the specified file to the output. | ||
| pub fn copy_file(&mut self, path: &PathBuf) -> io::Result<()> { | ||
| #[cfg(unix)] |
There was a problem hiding this comment.
do you know what will happen here on !unix ?
There was a problem hiding this comment.
Yes, there are no memory mapped data, so there's nothing to flush. (The flush is needed when we switch from coalescing and writing out memory mapped data (with zero copies) to BufWriter.
This is code for the r command, which is still WIP.
This improves performance up to 1.5 times, with no downside.
no-op-short rust-sed is similarly fast as rust-sed-f0
access-log-no-op rust-sed is 1.5 times faster than rust-sed-f0
access-log-no-subst rust-sed is similarly fast as rust-sed-f0
access-log-subst rust-sed is similarly fast as rust-sed-f0
access-log-no-del rust-sed is 1.1 times faster than rust-sed-f0
access-log-all-del rust-sed is similarly fast as rust-sed-f0
access-log-translit rust-sed is 1.2 times faster than rust-sed-f0
access-log-complex-sub rust-sed is similarly fast as rust-sed-f0
remove-cr rust-sed is similarly fast as rust-sed-f0
genome-subst rust-sed is similarly fast as rust-sed-f0
number-fix rust-sed is 1.1 times faster than rust-sed-f0
long-script rust-sed is similarly fast as rust-sed-f0
hanoi rust-sed is similarly fast as rust-sed-f0
factorial rust-sed is similarly fast as rust-sed-f0
This improves performance by 10% in the following benchmark case. Performance in all others remains the same. access-log-subst rust-sed is 1.1 times faster than rust-sed-eb64104
In theory, find, find_iter, captures, and captures_iter according to the
values of sub.occurrence, sub.replacement.max_group_number could allow
for better performance.
In practice, specializing for the easiest case to use find did not show
any improvement.
no-op-short rust-sed is similarly fast as rust-sed-7868f4f
access-log-no-op rust-sed is similarly fast as rust-sed-7868f4f
access-log-no-subst rust-sed is similarly fast as rust-sed-7868f4f
access-log-subst rust-sed is similarly fast as rust-sed-7868f4f
access-log-no-del rust-sed is similarly fast as rust-sed-7868f4f
access-log-all-del rust-sed is similarly fast as rust-sed-7868f4f
access-log-translit rust-sed is similarly fast as rust-sed-7868f4f
access-log-complex-sub rust-sed is similarly fast as rust-sed-7868f4f
remove-cr rust-sed is similarly fast as rust-sed-7868f4f
genome-subst rust-sed is similarly fast as rust-sed-7868f4f
number-fix rust-sed is similarly fast as rust-sed-7868f4f
long-script rust-sed is similarly fast as rust-sed-7868f4f
hanoi rust-sed is similarly fast as rust-sed-7868f4f
factorial rust-sed is similarly fast as rust-sed-7868f4f
Consequently, this commit just documents the change and its results, and
will be reverted.
This reverts commit f18f772. No performance improvement was seen.
This improves one benchmark case by 20%: access-log-all-del rust-sed is 1.2 times faster than rust-sed-f18f772 Unfortunatelly, all other cases remain unaffected.
The hope is that the now available regex::bytes will run find() faster
than captures_iter(). Indeed, using find() increases the performance
of several benchmark cases by 2-12%.
no-op-short rust-sed is 1.02 times faster than rust-sed-5fad204
access-log-no-op rust-sed is 1.01 times faster than rust-sed-5fad204
access-log-no-subst rust-sed is 1.08 times faster than rust-sed-5fad204
access-log-subst rust-sed is 1.12 times faster than rust-sed-5fad204
access-log-translit rust-sed is 1.02 times faster than rust-sed-5fad204
access-log-complex-sub rust-sed is 1.04 times faster than rust-sed-5fad204
remove-cr rust-sed is 1.12 times faster than rust-sed-5fad204
genome-subst rust-sed is 1.02 times faster than rust-sed-5fad204
hanoi rust-sed is 1.01 times faster than rust-sed-5fad204
All other cases remain the same.
Performance improves significantly in the following case: number-fix rust-sed is 1.07 times faster than rust-sed-90d60b8 Other changes <5% are likely to be noise.
- Add RE to detect specifications that require an RE engine, rather than a literal string match. - Change RE unit tests to test directly the RE rather than the constructor. - Use arrays for similar tests to avoid repetition.
Regex uses an automaton even for matching literal strings. Moving
through its state transitions is suboptimal. This commit introduces
literal string matching with the memchr::memmem matcher, which may use
SIMD and other specialized features to speed up the search as well as
other algorithms for special sizes.
This boosts substantially the performance of several benchmark cases.
access-log-no-subst rust-86 is 2.84 times faster than rust-sed-a6
access-log-subst rust-86 is 1.87 times faster than rust-sed-a6
access-log-no-del rust-86 is 3.00 times faster than rust-sed-a6
access-log-all-del rust-86 is 2.80 times faster than rust-sed-a6
remove-cr rust-86 is 4.35 times faster than rust-sed-a6
genome-subst rust-86 is 3.65 times faster than rust-sed-a6
It makes makes the performance of Rust better than the GNU and FreeBSD
implementations for most benchmark cases.
no-op-short rust is 3.04 times faster than gnu
access-log-no-op rust is 1.91 times faster than gnu
access-log-no-subst rust is 1.59 times faster than gnu
access-log-subst rust is 1.32 times faster than gnu
access-log-no-del rust is 1.54 times faster than gnu
access-log-all-del rust is 1.99 times faster than gnu
access-log-translit rust is 12.67 times faster than gnu
access-log-complex-sub gnu is 4.24 times faster than rust
remove-cr rust is 1.68 times faster than gnu
genome-subst rust is 1.15 times faster than gnu
number-fix rust is 1.06 times faster than gnu
long-script gnu is 2.19 times faster than rust
hanoi rust is 1.96 times faster than gnu
factorial rust is 1.25 times faster than gnu
no-op-short rust is 3.50 times faster than fbsd
access-log-no-op rust is 2.45 times faster than fbsd
access-log-no-subst rust is 1.38 times faster than fbsd
access-log-subst rust is 4.02 times faster than fbsd
access-log-no-del rust is 1.35 times faster than fbsd
access-log-all-del rust is 4.74 times faster than fbsd
access-log-translit fbsd is 1.14 times faster than rust
access-log-complex-sub rust is 2.00 times faster than fbsd
remove-cr rust is 2.48 times faster than fbsd
genome-subst rust is 2.21 times faster than fbsd
number-fix fbsd is 1.04 times faster than rust
long-script fbsd is 15.46 times faster than rust
hanoi rust is 9.33 times faster than fbsd
factorial rust is 115.72 times faster than fbsd
By using uucore's .map_err_context.
|
@sylvestre I integrated the commits for the remaining commands and in-place editing. The implementation is mostly done now. |
|
Excellent, I will have a look :) |
|
I will reply to your email too :) |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #77 +/- ##
======================================
Coverage 0.00% 0.00%
======================================
Files 11 12 +1
Lines 2061 2632 +571
Branches 182 225 +43
======================================
- Misses 2061 2632 +571
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
At this state sed implements all POSIX commands, and can correctly run the two complex scripts: hanoi.sed (solves the Towers of Hanoi puzzle) and math.sed (implements an arbitrary precision integer math calculator).
The performance of this Rust implementation is now better than the GNU and FreeBSD implementations for most benchmark cases.
GNU sed under Linux
FreeBSD sed
Remaining tasks