Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Fix impl of ReadOnlySpan StartsWith/EndsWith for Unix.#16418

Merged
ahsonkhan merged 1 commit into
dotnet:masterfrom
ahsonkhan:FixForUnix
Feb 16, 2018
Merged

Fix impl of ReadOnlySpan StartsWith/EndsWith for Unix.#16418
ahsonkhan merged 1 commit into
dotnet:masterfrom
ahsonkhan:FixForUnix

Conversation

@ahsonkhan
Copy link
Copy Markdown

@ahsonkhan ahsonkhan commented Feb 16, 2018

@stephentoub
Copy link
Copy Markdown
Member

What was the bug and what was the fix? It's not immediately obvious from the diff.

@ahsonkhan
Copy link
Copy Markdown
Author

ahsonkhan commented Feb 16, 2018

For StartsWith (aside from some code cleanup in the loop):

return Interop.Globalization.StartsWith(_sortHandle, b, prefix.Length - length, a, prefix.Length - length, options);

Changed to:

return Interop.Globalization.StartsWith(_sortHandle, b, length, a, length, options);

For EndsWith:
Instead of Slicing and calling the StartsWith helpers, which do not work well for cases such as "Hello".EndsWith("ello-"), created EndsWith helpers which process the characters backwards.

See test cases that were failing here: https://mc.dot.net/#/user/ahsonkhan/pr~2Fjenkins~2Fdotnet~2Fcorefx~2Fmaster~2F/test~2Ffunctional~2Fcli~2F/7d0fa7161a57203949b2b4f2a4da134dd10d3812/workItem/System.Memory.Tests

image

@stephentoub
Copy link
Copy Markdown
Member

Thanks for the explanation. I'm not 100% understanding the EndsWith case, though. These are all cases where non-ASCII was being used in the input, e.g. a soft hyphen? Why doesn't the slicing approach work for those? It hits the fallback path and then that fails?

@ahsonkhan
Copy link
Copy Markdown
Author

ahsonkhan commented Feb 16, 2018

Let's take this example, which is supposed to return true - "Hello".EndsWith("ello-")

return StartsWithOrdinalHelper(source.Slice(source.Length - suffix.Length), suffix, options);

source.Length = 5, suffix.Length = 5, source.Slice(0) => source.

Compare the first character of source with first character of suffix. H != e. Return false, which is incorrect.
If you process the characters backwards, you will hit the fallback path (since hyphen is a special character), call Globalization external method, and the result will be correct.

@ahsonkhan ahsonkhan deleted the FixForUnix branch February 16, 2018 19:56
@stephentoub
Copy link
Copy Markdown
Member

stephentoub commented Feb 16, 2018

In that case, don't we still have a bug? We do length checks before this helper call that returns false if source.Length is < prefix/suffix.Length, but that will break an example like "Hello".EndsWith("Hello\u00AD"), which should return true (right?), but "Hello".Length==5 and "Hello\u00AD".Length == 6.

Maybe that's how it's supposed to behave, but it seems inconsistent that "Hello".EndsWith("ello\u00AD") is true but "Hello".EndsWith("ello\u00AD") is false. @tarekgh?

@ahsonkhan
Copy link
Copy Markdown
Author

ahsonkhan commented Feb 16, 2018

In that case, don't we still have a bug?

You are right. Good point.

"Hello".EndsWith("Hello\u00AD") and "Hello".EndsWith("ello\u00AD") should both return true but the length check bails too early and returns false.

We need to remove these checks:

if (source.Length < prefix.Length)
{
return false;
}

if (source.Length < suffix.Length)
{
return false;
}

And remove the debug asserts:

Debug.Assert(source.Length >= suffix.Length);

@tarekgh
Copy link
Copy Markdown
Member

tarekgh commented Feb 16, 2018

I recall we used to have the IsFastSort check in addition to the ignore symbols which take care of that. and the conditions at that time would be correct.

@tarekgh
Copy link
Copy Markdown
Member

tarekgh commented Feb 16, 2018

if (_isAsciiEqualityOrdinal && CanUseAsciiOrdinalForOptions(options) && source.IsFastSort() && prefix.IsFastSort())

@tarekgh
Copy link
Copy Markdown
Member

tarekgh commented Feb 16, 2018

and we don't have to do IsFastSort check on Windows.

@tarekgh
Copy link
Copy Markdown
Member

tarekgh commented Feb 16, 2018

to mention, checks like

if (_isAsciiEqualityOrdinal && CanUseAsciiOrdinalForOptions(options))

wouldn't be enough for Linux to optimize for ordinal.

@tarekgh
Copy link
Copy Markdown
Member

tarekgh commented Feb 16, 2018

talked offline with @ahsonkhan and he pointed we check the sortable chars in the loop

https://github.com/dotnet/coreclr/blob/master/src/mscorlib/src/System/Globalization/CompareInfo.Unix.cs#L323

so I think removing the length check as suggested would be ok

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants