Remove string content from panics#153677
Conversation
This comment has been minimized.
This comment has been minimized.
|
|
||
| /// Copied from core::str::raw::slice_error_fail | ||
| #[inline(never)] | ||
| fn slice_error_fail(s: &Wtf8, begin: usize, end: usize) -> ! { |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
There was a problem hiding this comment.
warning: unused variable: `s` --> library/core/src/wtf8.rs:488:21 | 488 | fn slice_error_fail(s: &Wtf8, begin: usize, end: usize) -> ! { | ^ help: if this is intentional, prefix it with an underscore: `_s`
| // SAFETY: is_code_point_boundary checks that the index is valid | ||
| unsafe { slice_unchecked(self, range.start, range.end) } | ||
| } else { | ||
| slice_error_fail(self, range.start, range.end) |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
| // SAFETY: is_code_point_boundary checks that the index is valid | ||
| unsafe { slice_unchecked(self, range.start, self.len()) } | ||
| } else { | ||
| slice_error_fail(self, range.start, self.len()) |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
| // SAFETY: is_code_point_boundary checks that the index is valid | ||
| unsafe { slice_unchecked(self, 0, range.end) } | ||
| } else { | ||
| slice_error_fail(self, 0, range.end) |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
String content can be useful for debugging panics, particularly when you are working on a small codebase where you can infer more about the path taken through your program based on the content of the string. In production deployments that upload crashes for centralized debugging, string content poses a risk to user privacy. The string can accidentally contain information that the user is not expecting to share with the development team. On balance it seems like the risks outweigh the benefits. It is easy to add a `dbg!()` statement to gather more information in development; but comparatively tricky to ensure panics are sanitized by every rust app that monitors their production deployments. Ref https://internals.rust-lang.org/t/stop-including-string-content-in-index-panics/24067
534370d to
05ac9d4
Compare
| #[inline(never)] | ||
| fn slice_error_fail(s: &Wtf8, begin: usize, end: usize) -> ! { | ||
| assert!(begin <= end); | ||
| panic!("index {begin} and/or {end} in `{s:?}` do not lie on character boundary"); | ||
| let len = s.len(); | ||
| if end > len { | ||
| panic!("end byte index {end} is out of bounds for string of length {len}"); | ||
| } | ||
| if begin > end { | ||
| panic!("byte range starts at {begin} but ends at {end}"); | ||
| } | ||
| if !s.is_code_point_boundary(begin) { | ||
| panic!("byte index {begin} is not a code point boundary"); | ||
| } | ||
| panic!("byte index {end} is not a code point boundary"); | ||
| } |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment has been minimized.
This comment has been minimized.
@ConradIrwin use |
String panics are unfortunately common at Zed, and doubly unfortunately they may incidentally include the contents of a user's buffer. Although this hasn't happened yet (to my knowledge), I don't want to be in the position of having received sensitive information this way. See also rust-lang/rust#153677
String panics are unfortunately common at Zed, and doubly unfortunately they may incidentally include the contents of a user's buffer. Although this hasn't happened yet (to my knowledge), I don't want to be in the position of having received sensitive information this way. See also rust-lang/rust#153677
String panics are unfortunately common at Zed, and doubly unfortunately they may incidentally include the contents of a user's buffer. Although this hasn't happened yet (to my knowledge), I don't want to be in the position of having received sensitive information this way. See also rust-lang/rust#153677
Co-authored-by: TKanX <124454788+TKanX@users.noreply.github.com>
String panics are a non-trivial percentage of the crashes we see at Zed, and doubly unfortunately they may incidentally include the contents of a user's buffer. Although this hasn't happened yet (to my knowledge), I don't want to be in the position of having received sensitive information this way. See also rust-lang/rust#153677 Release Notes: - N/A
|
Thanks @TKanX! |
|
Hello libs and libs-api folks! Not sure which team would be responsible for this, so nominating for both. I think that not reflecting more that one USV in the panic message makes sense, like how indexing a vec doesn't show the elements of the vec. But wanted to run it by teams as a user-visible change. How do you feel about it? |
| let ch = s[floor..ceil].chars().next().unwrap(); | ||
| panic!( | ||
| "start byte index {begin} is not a char boundary; it is inside {ch:?} (bytes {range:?}) of `{s_trunc}`{ellipsis}" | ||
| "start byte index {begin} is not a char boundary; it is inside {ch:?} (bytes {range:?} of string)" |
There was a problem hiding this comment.
| "start byte index {begin} is not a char boundary; it is inside {ch:?} (bytes {range:?} of string)" | |
| "start byte index {begin} is not a char boundary; it is inside {ch:?} (bytes {range:?})" |
"of string" sounds odd to me -- is it needed? (Same for similar messages)
There was a problem hiding this comment.
I like including the word string because the problem is due to mis-use of a str/String API, which may not otherwise be obvious
| let ch = s[floor..ceil].chars().next().unwrap(); | ||
| panic!( | ||
| "end byte index {end} is not a char boundary; it is inside {ch:?} (bytes {range:?}) of `{s_trunc}`{ellipsis}" | ||
| "end byte index {end} is not a char boundary; it is inside {ch:?} (bytes {range:?} of string)" |
There was a problem hiding this comment.
"of string" sounds odd to me -- is it needed? (Same for similar messages)
@RalfJung
| "end byte index {end} is not a char boundary; it is inside {ch:?} (bytes {range:?} of string)" | |
| "end byte index {end} is not a char boundary; it is inside {ch:?} (bytes {range:?})" |
There was a problem hiding this comment.
1, 2, 6 describe the type so not redundant right?
| assert!(begin <= end); | ||
| panic!("index {begin} and/or {end} in `{s:?}` do not lie on character boundary"); | ||
| let len = s.len(); | ||
| if begin > len { | ||
| panic!("start byte index {begin} is out of bounds for string of length {len}"); | ||
| } | ||
| if end > len { | ||
| panic!("end byte index {end} is out of bounds for string of length {len}"); | ||
| } | ||
| if begin > end { | ||
| panic!("byte range starts at {begin} but ends at {end}"); | ||
| } | ||
| if !s.is_code_point_boundary(begin) { | ||
| panic!("byte index {begin} is not a code point boundary"); | ||
| } | ||
| panic!("byte index {end} is not a code point boundary"); |
There was a problem hiding this comment.
🎉 This looks like a huge improvement to usability.
|
This was discussed in today's @rust-lang/libs-api meeting; our conclusion was that we'd like to implement this as a usability improvement. We do not believe that this is sufficient or even necessarily useful for sanitisation of panic messages, but it is a much more readable panic message. +1 from us. If someone would in the future want to expand this to also e.g. print the character indexed into, that would be welcome but this PR is already an improvement as-is. @rustbot label -I-libs-nominated -I-libs-api-nominated |
|
With the nod from libs-api on changing the messages, the change looks fine code-review wise, so (Note that we can always tweak things further in other PRs if people so wish.) |
|
⌛ Testing commit 7ec050a with merge 686fce3... Workflow: https://github.com/rust-lang/rust/actions/runs/23234002650 |
Remove string content from panics String content can be useful for debugging panics, particularly when you are working on a small codebase where you can infer more about the path taken through your program based on the content of the string. In production deployments that upload crashes for centralized debugging, string content poses a risk to user privacy. The string can accidentally contain information that the user is not expecting to share with the development team. On balance it seems like the risks outweigh the benefits. It is easy to add a `dbg!()` statement to gather more information in development; but comparatively tricky to ensure panics are sanitized by every rust app that monitors their production deployments. Ref https://internals.rust-lang.org/t/stop-including-string-content-in-index-panics/24067 r? @scottmcm
|
@bors yield (to encoding rollup) |
|
Auto build was cancelled. Cancelled workflows: The next pull request likely to be tested is #154035. |
String content can be useful for debugging panics, particularly when you are working on a small codebase where you can infer more about the path taken through your program based on the content of the string.
In production deployments that upload crashes for centralized debugging, string content poses a risk to user privacy. The string can accidentally contain information that the user is not expecting to share with the development team.
On balance it seems like the risks outweigh the benefits. It is easy to add a
dbg!()statement to gather more information in development; but comparatively tricky to ensure panics are sanitized by every rust app that monitors their production deployments.Ref https://internals.rust-lang.org/t/stop-including-string-content-in-index-panics/24067
r? @scottmcm