Improve MdUtf8String #21720
        pItr++;
    }
}
const int MaxStringLength = 1024;
Previously the length was unlimited, but SpanHelpers.IndexOf isn't happy with that. I have chosen 1024; is that too small?
Is the actual length stored in the metadata, so the overload with a length could be called, or is it just a null-terminated UTF-8 string? /cc @GrabYourPitchforks
The risk is that IndexOf('\0') may access memory that it is not supposed to access when it is given an arbitrarily sized buffer. It should not happen today given what the implementation looks like, but it is not a good idea to start spreading subtle assumptions like these throughout the codebase.
For now, I would just call StubHelpers.strlen or some other proper strlen variant here (there are several). Consider refactoring wcslen/strlen to use a fully managed (vectorized, etc.) implementation in a separate change.
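To illustrate the distinction being drawn here, the following is a minimal sketch of a strlen-style scan that reads one byte at a time and stops at the first zero byte, so it never reads past the terminator. It is not the StubHelpers.strlen implementation (which is internal); the class and method names (`NullTerminatedUtf8Sketch`, `Strlen`, `Decode`, `Demo`) are hypothetical, and it deliberately uses `Marshal` calls instead of pointers so it compiles without unsafe code.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical illustration only; not the code from this PR.
static class NullTerminatedUtf8Sketch
{
    // strlen-style scan: reads a single byte per step and stops at the
    // first zero byte, so it never touches memory beyond the terminator.
    public static int Strlen(IntPtr p)
    {
        int length = 0;
        while (Marshal.ReadByte(p, length) != 0)
        {
            length++;
        }
        return length;
    }

    // Once the length is known, decoding is bounded by that length.
    public static string Decode(IntPtr p)
    {
        int length = Strlen(p);
        byte[] bytes = new byte[length];
        Marshal.Copy(p, bytes, 0, length);
        return Encoding.UTF8.GetString(bytes);
    }

    // Builds a null-terminated UTF-8 buffer in unmanaged memory and decodes it.
    public static string Demo()
    {
        byte[] source = Encoding.UTF8.GetBytes("MdUtf8String");
        IntPtr buffer = Marshal.AllocHGlobal(source.Length + 1);
        try
        {
            Marshal.Copy(source, 0, buffer, source.Length);
            Marshal.WriteByte(buffer, source.Length, 0); // explicit terminator
            return Decode(buffer);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

A vectorized search over a guessed maximum length, by contrast, may load whole blocks of bytes at once, which is why it relies on the buffer really being that large.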
To keep it isolated to one place and easier to refactor/identify? Makes sense.
The comment in the unmanaged code that I removed for the string compare is:
// Important: the string in pSsz isn't null terminated so the length must be used
// when performing operations on the string.
So it does feel like an uncomfortable coupling, though I suppose that applies to the other .ctor of MdUtf8String, which takes a length rather than looking for a null terminator?
Consider refactoring of wcslen/strlen to use fully managed (vectorized, etc.) implementation in separate change.
@dotnet-bot test Ubuntu x64 Checked Innerloop Build and Test
if ((s.m_StringHeapByteLength == m_StringHeapByteLength) && (m_StringHeapByteLength != 0))
{
    return EqualsCaseSensitive(s.m_pStringHeap, m_pStringHeap, m_StringHeapByteLength);
    isEqual = SpanHelpers.SequenceEqual<byte>(ref Unsafe.AsRef<byte>(s.m_pStringHeap), ref Unsafe.AsRef<byte>(m_pStringHeap), m_StringHeapByteLength) ? true : false;
This does not need to use Unsafe.AsRef. A regular cast to byte* is enough. Maybe change m_pStringHeap to byte*; it may result in fewer casts.
What's the syntax here? Do I need to ref the dereference, e.g. ref *m_pStringHeap? It doesn't seem happy using m_pStringHeap directly:
Error CS1620 Argument 2 must be passed with the 'ref' keyword
Adding just ref complains about its type (after complaining that a readonly value can't be passed by ref):
Error CS1503 Argument 2: cannot convert from 'ref byte*' to 'ref byte'
Using ref dereference seems OK, if a bit weird.
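For reference, a minimal sketch of the `ref *pointer` form discussed above. The `SequenceEqual` helper below is a hypothetical stand-in for the internal SpanHelpers method (it only mirrors the ref-taking shape of the call), and `RefFromPointerSketch`, `BytesEqual`, and `Demo` are made-up names. It needs to be compiled with unsafe code enabled (`AllowUnsafeBlocks`).

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical illustration only; not the code from this PR.
static class RefFromPointerSketch
{
    // Stand-in for a helper that takes `ref byte` arguments rather than pointers.
    static bool SequenceEqual(ref byte first, ref byte second, int length)
    {
        for (int i = 0; i < length; i++)
        {
            if (Unsafe.Add(ref first, i) != Unsafe.Add(ref second, i))
            {
                return false;
            }
        }
        return true;
    }

    public static unsafe bool BytesEqual(byte* a, byte* b, int length)
    {
        // `*a` is the byte the pointer refers to; `ref *a` passes that
        // location by reference, giving `ref byte` rather than `ref byte*`.
        return SequenceEqual(ref *a, ref *b, length);
    }

    public static unsafe bool Demo()
    {
        byte[] x = { 1, 2, 3 };
        byte[] y = { 1, 2, 3 };
        fixed (byte* px = x, py = y)
        {
            return BytesEqual(px, py, x.Length);
        }
    }
}
```

Passing `ref a` instead would hand over a reference to the pointer variable itself, which is what the CS1503 message above is reporting.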
63d118e to bab96db
Thank you!
* Move MdUtf8String::EqualsCaseSensitive to managed code
* Move MdUtf8String.ToString to safe code
* Use Encoding.UTF8.GetString

Commit migrated from dotnet/coreclr@e52aaee
They get called in a multiplicative call chain:
Contributes to dotnet/corefx#34283 (comment)