Fix Issue 18279 - rt.util.utf does not properly reserve buffer in toUTF16/toUTF16z#2053
Fix Issue 18279 - rt.util.utf does not properly reserve buffer in toUTF16/toUTF16z#2053dlang-bot merged 1 commit intodlang:masterfrom
Conversation
|
Thanks for your pull request, @n8sh! We are looking forward to reviewing it, and you should be hearing from a maintainer soon. Some tips to help speed things up:
Bear in mind that large or tricky changes may require multiple rounds of review and revision. Please see CONTRIBUTING.md for more information. Bugzilla references
|
|
Thanks for your PR. I guess a unittest is not really feasible, is it? |
| r.length = slen; | ||
| r.length = 0; | ||
| if (!__ctfe) | ||
| r.reserve(slen); |
There was a problem hiding this comment.
reserve still does a lot if slen is zero. Maybe better to return early for that case. Same below.
There was a problem hiding this comment.
Hm... reserve probably shouldn't be expensive if request is 0.
There was a problem hiding this comment.
Probably can just return "" in that case, or "".ptr for z versions.
There was a problem hiding this comment.
reserve probably shouldn't be expensive if request is 0
reserve returns the allocated block size, so unfortunately, hard to avoid looking up block info.
There was a problem hiding this comment.
Okay will add a check for slen == 0
|
@n8sh in the future, if you could post a link to the PR on bugzilla when you open it, so others don't duplicate your work |
schveiguy
left a comment
There was a problem hiding this comment.
Fix one issue and I think it’s good
src/rt/util/utf.d
Outdated
| // Reserve still does a lot if slen is zero. | ||
| // Return early for that case. | ||
| if (0 == slen) | ||
| return &""w[0];// Terminating null is auto-included at end of string literals. |
There was a problem hiding this comment.
This isn’t right. Return ””w.ptr
There was a problem hiding this comment.
If I do that I get a complaint about the code not being @safe. I'll assume it's okay to mark toUTF16z as @trusted since toUTF16 already is.
There was a problem hiding this comment.
Well, it's not @safe, as the intention is to return a null-terminated array, which is inherently unsafe. Not only that, but index 0 doesn't exist, so you potentially would have a RangeError thrown (it should be what happens, since that is the point of making you index the first element in @safe code).
Where is the code that is calling this, complaining that it's not @safe?
There was a problem hiding this comment.
Oh, I see, it was marked @safe. It shouldn't have been. ugh...
There was a problem hiding this comment.
I think for now, even though it's not really @safe, the best option may be to use:
return &"\0"w[0];There was a problem hiding this comment.
Well, it's not @safe, as the intention is to return a null-terminated array, which is inherently unsafe.
I don't think there is a problem: a pointer to something known to exist (in the DATA segment or in GC memory) is fine in @safe code, as this compiles:
{
static const char a = 'a';
const char* p = &a;
}
You are not supposed to do any arithmetics on p and accessing *p should be fine. So "a".ptr could be accepted in @safe code, even "".ptr as literals are guaranteed to be zero-terminated. Am I missing something?
There was a problem hiding this comment.
It's technically @safe, but disingenuous. The point is to return a null-terminated string, not a pointer to a single char.
There was a problem hiding this comment.
So "a".ptr could be accepted in @safe code
Just noticed that this was still accepted as deprecated in dmd 2.078, but no longer in master. I cannot find it in https://dlang.org/deprecate.html, though.
There was a problem hiding this comment.
https://dlang.org/changelog/pending.html#ptr-safe-end-of-deprecation
When I made dlang/dmd#7688, I didn't realize it wasn't on the deprecations page. We really need to have more structure in our deprecation process :/
There was a problem hiding this comment.
It's technically @safe, but disingenuous.
I agree. The syntax &a[0] is not nice, especially with literals, but it makes range checks kind of explicit. Let's hope the compiler elides them if possible.
|
This one is pretty obvious, regardless of the safety debate, so I'm just going to merge it. |
The offending pattern:
The apparent intention is to preallocate the buffer but this idiom doesn't work as pointed out by Dmitry Olshansky: "On append the slice is not the tail so it will reallocate."