Error reporting improvements#92

Merged
dspinellis merged 15 commits into uutils:main from
dspinellis:error-reporting
Jul 8, 2025

@dspinellis
Collaborator

@dspinellis dspinellis commented Jun 16, 2025

This PR improves the quality of runtime error reporting to include the associated sed command location and also the input file location when needed.

@codecov

codecov Bot commented Jun 16, 2025

Codecov Report

Attention: Patch coverage is 0% with 189 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (cb9f729) to head (ab07fd5).
Report is 16 commits behind head on main.

Files with missing lines Patch % Lines
src/uu/sed/src/error_handling.rs 0.00% 66 Missing ⚠️
src/uu/sed/src/processor.rs 0.00% 65 Missing ⚠️
src/uu/sed/src/compiler.rs 0.00% 24 Missing ⚠️
src/uu/sed/src/named_writer.rs 0.00% 20 Missing ⚠️
src/uu/sed/src/command.rs 0.00% 11 Missing ⚠️
src/uu/sed/src/fast_io.rs 0.00% 1 Missing ⚠️
src/uu/sed/src/script_line_provider.rs 0.00% 1 Missing ⚠️
src/uu/sed/src/sed.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff          @@
##            main     #92   +/-   ##
=====================================
  Coverage   0.00%   0.00%           
=====================================
  Files         12      13    +1     
  Lines       2632    2716   +84     
  Branches     225     224    -1     
=====================================
- Misses      2632    2716   +84     
Flag Coverage Δ
macos_latest 0.00% <0.00%> (ø)
ubuntu_latest 0.00% <0.00%> (ø)
windows_latest 0.00% <0.00%> (ø)

@dspinellis
Collaborator Author

@sylvestre This PR is ready for review.

dspinellis added 15 commits July 8, 2025 17:37
Specify the argument number rather than its initial string, as the
latter can be ambiguous.
Following row numbering and most common convention.
These can be used to improve runtime error reporting.

The implementation required moving ScriptValue to script_line_provider
to avoid the circular use chain.
This is needed for improved error reporting.
This will be used to also host the runtime error function.
In contrast to the whole mutable and large Command struct, this is immutable
and small, and can therefore be easily copied around for reporting
runtime errors.
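
A minimal sketch of the idea described above, not the code from this PR: a small, immutable location value that implements `Copy` can be handed to an error-reporting function without borrowing the larger command structure. The names `CommandLocation` and `runtime_error`, the field layout, and the message format are all hypothetical.

```rust
use std::fmt;

/// Hypothetical location of a command within the sed script.
/// It is small and `Copy`, so passing it around costs a few machine words.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct CommandLocation {
    /// Number of the script argument (assumed 1-based here).
    arg_number: usize,
    /// Line within that script argument.
    line: usize,
    /// Column within that line.
    column: usize,
}

impl fmt::Display for CommandLocation {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "script argument {}:{}:{}",
            self.arg_number, self.line, self.column
        )
    }
}

/// Hypothetical helper that formats a runtime error with its location.
fn runtime_error(loc: CommandLocation, msg: &str) -> String {
    format!("sed: {loc}: {msg}")
}

fn main() {
    let loc = CommandLocation { arg_number: 2, line: 1, column: 5 };
    // `loc` is copied into the call, so it stays usable afterwards.
    eprintln!("{}", runtime_error(loc, "example failure"));
    eprintln!("{}", runtime_error(loc, "another failure"));
}
```

Because the value is copied rather than borrowed, the reporting code never needs a reference to the mutable command it came from.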
Only provide location to the Fancy RE engine, which is the only one
requiring it. This should mitigate the 2-10% drop in benchmark
results that appeared with the improved error reporting.

This is the original fall in performance after adding error location
information.
           no-op-short previous is  1.05 times faster than error-loc
      access-log-no-op previous is  1.05 times faster than error-loc
   access-log-no-subst previous is  1.06 times faster than error-loc
      access-log-subst previous is  1.05 times faster than error-loc
     access-log-no-del previous is  1.07 times faster than error-loc
    access-log-all-del previous is  1.05 times faster than error-loc
   access-log-translit previous is  1.06 times faster than error-loc
access-log-complex-sub previous is  1.01 times faster than error-loc
             remove-cr previous is  1.07 times faster than error-loc
          genome-subst previous is  1.10 times faster than error-loc
            number-fix previous is  1.11 times faster than error-loc
           long-script previous is  1.03 times faster than error-loc
                 hanoi error-loc     is similarly fast as   previous
             factorial previous is  1.02 times faster than error-loc

The change of this commit improves performance in most cases as follows.
           no-op-short error-loc is  1.05 times faster than collaped-re
      access-log-no-op collaped-re is  1.04 times faster than error-loc
   access-log-no-subst collaped-re is  1.05 times faster than error-loc
      access-log-subst collaped-re is  1.04 times faster than error-loc
     access-log-no-del collaped-re is  1.05 times faster than error-loc
    access-log-all-del collaped-re is  1.05 times faster than error-loc
   access-log-translit error-loc     is similarly fast as   collaped-re
access-log-complex-sub collaped-re is  1.02 times faster than error-loc
             remove-cr collaped-re is  1.05 times faster than error-loc
          genome-subst collaped-re is  1.06 times faster than error-loc
            number-fix collaped-re is  1.05 times faster than error-loc
           long-script collaped-re is  1.01 times faster than error-loc
                 hanoi error-loc is  1.01 times faster than collaped-re
             factorial collaped-re is  1.03 times faster than error-loc

This results in a smaller pessimization relative to the original version
before the error location information was added.
           no-op-short previous is  1.10 times faster than collaped-re
      access-log-no-op previous is  1.02 times faster than collaped-re
   access-log-no-subst previous is  1.01 times faster than collaped-re
      access-log-subst collaped-re     is similarly fast as   previous
     access-log-no-del previous is  1.02 times faster than collaped-re
    access-log-all-del collaped-re     is similarly fast as   previous
   access-log-translit previous is  1.06 times faster than collaped-re
access-log-complex-sub collaped-re is  1.01 times faster than previous
             remove-cr previous is  1.02 times faster than collaped-re
          genome-subst previous is  1.04 times faster than collaped-re
            number-fix previous is  1.06 times faster than collaped-re
           long-script previous is  1.02 times faster than collaped-re
                 hanoi previous is  1.01 times faster than collaped-re
             factorial collaped-re     is similarly fast as   previous
Instead, detect and handle runtime errors at the point where regex
methods are called.
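
The approach in the sentence above can be sketched as follows, under stated assumptions: the matching method returns a plain `Result` that knows nothing about script locations, and the caller attaches the location only on the error path. `ToyMatcher`, `Location`, `MatchError`, and `match_at` are hypothetical stand-ins and not the types used in this PR; the toy matcher fails on long input merely to simulate a runtime regex failure.

```rust
use std::fmt;

/// Hypothetical script location attached to errors by the caller.
#[derive(Clone, Copy, Debug)]
struct Location {
    line: usize,
    column: usize,
}

/// Hypothetical error returned by the matcher; it carries no location.
#[derive(Debug, PartialEq)]
struct MatchError(String);

impl fmt::Display for MatchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Toy stand-in for a regex engine whose matching can fail at runtime.
struct ToyMatcher {
    needle: String,
    max_len: usize,
}

impl ToyMatcher {
    /// The hot path: no location argument is passed in.
    fn is_match(&self, haystack: &str) -> Result<bool, MatchError> {
        if haystack.len() > self.max_len {
            return Err(MatchError(format!(
                "input longer than {} bytes",
                self.max_len
            )));
        }
        Ok(haystack.contains(&self.needle))
    }
}

/// The call site: the location is touched only when an error occurs.
fn match_at(matcher: &ToyMatcher, loc: Location, input: &str) -> Result<bool, String> {
    matcher
        .is_match(input)
        .map_err(|e| format!("{}:{}: {}", loc.line, loc.column, e))
}

fn main() {
    let matcher = ToyMatcher { needle: "ab".to_string(), max_len: 4 };
    let loc = Location { line: 3, column: 1 };
    println!("{:?}", match_at(&matcher, loc, "xab"));
    println!("{:?}", match_at(&matcher, loc, "toolong"));
}
```

Keeping the location out of the matcher's signature means the successful path does no extra work, which is one plausible reason for the benchmark differences reported below.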

This is always at least as fast as the preceding method.

           no-op-short current is  1.03 times faster than preceding
      access-log-no-op current     is similarly fast as   preceding
   access-log-no-subst current is  1.05 times faster than preceding
      access-log-subst current     is similarly fast as   preceding
     access-log-no-del current is  1.05 times faster than preceding
    access-log-all-del current is  1.04 times faster than preceding
   access-log-translit current     is similarly fast as   preceding
access-log-complex-sub current     is similarly fast as   preceding
             remove-cr current is  1.03 times faster than preceding
          genome-subst current     is similarly fast as   preceding
            number-fix current     is similarly fast as   preceding
           long-script current is  1.01 times faster than preceding
                 hanoi current is  1.02 times faster than preceding
             factorial current     is similarly fast as   preceding

Also, this rectifies most of the performance pessimization introduced by
adding error locations in fast_regex.

           no-op-short previous  is  1.07 times faster than error-loc
      access-log-no-op previous     is similarly fast as    error-loc
   access-log-no-subst error-loc is  1.04 times faster than previous
      access-log-subst previous     is similarly fast as    error-loc
     access-log-no-del error-loc is  1.03 times faster than previous
    access-log-all-del error-loc is  1.04 times faster than previous
   access-log-translit previous  is  1.06 times faster than error-loc
access-log-complex-sub error-loc is  1.02 times faster than previous
             remove-cr previous     is similarly fast as    error-loc
          genome-subst previous  is  1.04 times faster than error-loc
            number-fix previous  is  1.06 times faster than error-loc
           long-script previous     is similarly fast as    error-loc
                 hanoi previous     is similarly fast as    error-loc
             factorial previous     is similarly fast as    error-loc

TODO: In fast_regex distinguish between UTF-8 conversion and regex
errors.  The former should be reported as I/O errors with input file
location info, while the latter should be reported as script errors
with script file location info.
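
One possible shape for the distinction named in the TODO, offered only as a sketch: an error enum with one variant per location kind, with UTF-8 decoding failures mapped to the input-side variant. The `RuntimeError` enum, its fields, the `decode` helper, and the message formats are hypothetical; only `std::str::from_utf8` is a real standard-library call.

```rust
use std::fmt;
use std::str;

/// Hypothetical error type separating the two cases from the TODO.
#[derive(Debug, PartialEq)]
enum RuntimeError {
    /// Problem with the data being processed: report the input file location.
    Input { file: String, line: usize, msg: String },
    /// Problem with the script (e.g. a regex failure): report the script location.
    Script { line: usize, column: usize, msg: String },
}

impl fmt::Display for RuntimeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RuntimeError::Input { file, line, msg } => {
                write!(f, "{file}:{line}: {msg}")
            }
            RuntimeError::Script { line, column, msg } => {
                write!(f, "script {line}:{column}: {msg}")
            }
        }
    }
}

/// Decode an input line, mapping invalid UTF-8 to an input-side error.
fn decode<'a>(bytes: &'a [u8], file: &str, line: usize) -> Result<&'a str, RuntimeError> {
    str::from_utf8(bytes).map_err(|e| RuntimeError::Input {
        file: file.to_string(),
        line,
        msg: e.to_string(),
    })
}

fn main() {
    // Invalid UTF-8 in the data is reported against the input file.
    if let Err(e) = decode(&[0xff, 0xfe], "in.txt", 7) {
        eprintln!("{e}");
    }
    // A regex failure would be reported against the script instead.
    let e = RuntimeError::Script { line: 2, column: 4, msg: "regex failure".to_string() };
    eprintln!("{e}");
}
```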
@dspinellis dspinellis merged commit 6f5f5c0 into uutils:main Jul 8, 2025
18 of 19 checks passed
@dspinellis
Collaborator Author

Rebased and merged to avoid further divergence from the main branch.
