Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Improve MdUtf8String#21720

Merged
jkotas merged 7 commits into
dotnet:masterfrom
benaadams:MdUtf8String
Dec 31, 2018
Merged

Improve MdUtf8String#21720
jkotas merged 7 commits into
dotnet:masterfrom
benaadams:MdUtf8String

Conversation

@benaadams
Copy link
Copy Markdown
Member

@benaadams benaadams commented Dec 30, 2018

They get called in a multiplicative call chain:

image

Contributes to dotnet/corefx#34283 (comment)

pItr++;
}
}
const int MaxStringLength = 1024;
Copy link
Copy Markdown
Member Author

@benaadams benaadams Dec 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously it was unlimited length; but the SpanHelper.IndexOf isn't happy with that. Have chosen 1024; is that too small?

Is the actual length stored in the metadata so the overload with length could be called, or is it just a null terminated utf8string? /cc @GrabYourPitchforks

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The risk is that IndexOf('\0') may access memory that it is not supposed to access when it is given arbitrary sized buffer. It should not happen today given what the implementation looks like, but it is not a good idea to start spreading subtle assumptions like these throughout the codebase.

For now, I would just call StubHelpers.strlen or some other proper strlen variant here (there are several). Consider refactoring of wcslen/strlen to use fully managed (vectorized, etc.) implementation in separate change.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep it isolated to one place and easier to refactor/identify? Makes sense.

The comment in the unmanaged code that I removed for the string compare is:

// Important: the string in pSsz isn't null terminated so the length must be used
// when performing operations on the string.

So it does feel an uncomfortable coupling; though I suppose that's for the other .ctor of MdUtf8String which takes length rather than looking for a null termination?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider refactoring of wcslen/strlen to use fully managed (vectorized, etc.) implementation in separate change.

#21729

@benaadams
Copy link
Copy Markdown
Member Author

@dotnet-bot test Ubuntu x64 Checked Innerloop Build and Test

Comment thread src/System.Private.CoreLib/src/System/RtType.cs Outdated
if ((s.m_StringHeapByteLength == m_StringHeapByteLength) && (m_StringHeapByteLength != 0))
{
return EqualsCaseSensitive(s.m_pStringHeap, m_pStringHeap, m_StringHeapByteLength);
isEqual = SpanHelpers.SequenceEqual<byte>(ref Unsafe.AsRef<byte>(s.m_pStringHeap), ref Unsafe.AsRef<byte>(m_pStringHeap), m_StringHeapByteLength) ? true : false;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not need to use Unsafe.AsRef. Regular cast to byte* is enough. Maybe change m_pStringHeap to byte* - it may result into fewer casts.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the syntax here? Do I need to ref the dereference e.g. ref *m_pStringHeap it doesn't seem happy using m_pStringHeap directly

Error	CS1620	Argument 2 must be passed with the 'ref' keyword

Adding just ref complains about its type (after complaining a readonly value can't be passed by ref)

Error	CS1503	Argument 2: cannot convert from 'ref byte*' to 'ref byte'	

Using ref deference seems ok, if a bit weird

Comment thread src/System.Private.CoreLib/src/System/RtType.cs Outdated
pItr++;
}
}
const int MaxStringLength = 1024;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The risk is that IndexOf('\0') may access memory that it is not supposed to access when it is given arbitrary sized buffer. It should not happen today given what the implementation looks like, but it is not a good idea to start spreading subtle assumptions like these throughout the codebase.

For now, I would just call StubHelpers.strlen or some other proper strlen variant here (there are several). Consider refactoring of wcslen/strlen to use fully managed (vectorized, etc.) implementation in separate change.

Comment thread src/System.Private.CoreLib/src/System/RtType.cs Outdated
@jkotas jkotas merged commit e52aaee into dotnet:master Dec 31, 2018
@jkotas
Copy link
Copy Markdown
Member

jkotas commented Dec 31, 2018

Thank you!

@benaadams benaadams deleted the MdUtf8String branch December 31, 2018 15:42
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Move MdUtf8String::EqualsCaseSensitive to managed code

* Move MdUtf8String.ToString to safe code

* Use Encoding.UTF8.GetString


Commit migrated from dotnet/coreclr@e52aaee
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants