Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Add String.Join overloads accepting a char separator#7942

Merged
AlexGhiondea merged 12 commits into
dotnet:masterfrom
jamesqo:new-string-apis
Nov 23, 2016
Merged

Add String.Join overloads accepting a char separator#7942
AlexGhiondea merged 12 commits into
dotnet:masterfrom
jamesqo:new-string-apis

Conversation

@jamesqo
Copy link
Copy Markdown

@jamesqo jamesqo commented Nov 1, 2016

This PR adds string.Join overloads that take a char separator, which has been approved by dotnet/corefx#5552. It supplants #7621.

Please note that while I've verified the code I've wrote compiles, it's completely untested since I don't have a way to run tests on it. That will be done when the api is exposed in corefx.

Another note: I didn't add a Join(char, IEnumerable<string>) overload after seeing #7621 (comment) which referred to @stephentoub's comment at #2945 (comment).

cc @jkotas @stephentoub @AlexGhiondea @weshaggard @bbowyersmyth @ghost @hughbe

Copy link
Copy Markdown
Author

@jamesqo jamesqo Nov 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept the parameter names the same as the string-based overloads, so this is value as opposed to values.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I avoided duplicating the Join(string, object[]) bug where we return "" if the first item is null here.

Copy link
Copy Markdown
Author

@jamesqo jamesqo Nov 3, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justinvp Unfortunately, this means that it's not possible to share code with the string overload here for now. Ongoing discussion @ https://github.com/dotnet/coreclr/issues/4506#issuecomment-258033970

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can still share the code - just need a bool argument on what the behavior in this case should be.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas, that seems kind of ugly to me. I think it would be better to simply leave things as-is for now, and when fixing the bug is approved then share the code.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used checked inside the loop, instead of checking for overflow at the end. 3 2B sized strings for example could overflow enough that the total length goes back to a positive value, so the check seems fairly pointless.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will throw OverflowException. Other Join overloads throw OutOfMemory exception in this case...

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I can change it to do a totalLength < 0 check instead then.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I happened to see @stephentoub's PR a while ago at #4559, so I mimicked his logic here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed may/may not have better performance outside the loop, but I think putting it here is best for readability.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't a simple result[copiedLength] = separator be faster?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas result is a string, so using a set indexer won't work here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic copied from @stephentoub's PR.

@hughbe
Copy link
Copy Markdown

hughbe commented Nov 1, 2016

Nice! In the issue over in corefx I said I would add some tests and expose the implementation - I don't actually have time to do this anytime soon, so if you want you can take over

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: nameof (nameof was just deployed throughout mscorlib). Applies throughout.

@justinvp
Copy link
Copy Markdown

justinvp commented Nov 2, 2016

Any thoughts on sharing code with the string separator methods? #7621 (comment)

@jamesqo
Copy link
Copy Markdown
Author

jamesqo commented Nov 2, 2016

@hughbe Sure, I would be willing to add tests / expose the impl in corefx.

@jamesqo
Copy link
Copy Markdown
Author

jamesqo commented Nov 2, 2016

@justinvp Great idea! Esp. considering one of the Join overloads is generic it should help shave down on some jitted code bloat if an app uses both overloads with the same type.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if separator is null?

Copy link
Copy Markdown
Author

@jamesqo jamesqo Nov 3, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justinvp Then pSeparator will be initialized to null. This is technically implementation-defined behavior of Roslyn, but it looks like they're planning to update the spec sometime.

On hindsight, the real issue here looks like separator.Length will nullref if it is null (edit: I just realized, m_firstChar as well)... I'll just add a separator = separator ?? string.Empty beforehand. Thanks for pointing this out.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is no longer being used after the Join char/string impls were unified, so I deleted it.

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Nov 3, 2016

Explicit check for overflow may be a bit faster, like:

long totalSeparatorsLength = (long)(count - 1) * separatorLength;
if (totalSeparatorsLength > int.MaxValue)
   throw new OutOfMemoryException(); 
int totalLength = (int)separatorsLength;

@jamesqo
Copy link
Copy Markdown
Author

jamesqo commented Nov 3, 2016

@jkotas Nice idea. Updated the PR to match.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: validataion -> validation

Copy link
Copy Markdown

@justinvp justinvp Nov 3, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I added the new Split overloads, I made them all [ComVisible(false)]. I think we should do the same for these new Join overloads.

Copy link
Copy Markdown
Author

@jamesqo jamesqo Nov 3, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok edit: See @danmosemsft's comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you changed String -> string above, maybe go ahead and do the same here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why these overloads are diffing since I didn't change them... but since they are, I guess I might as well format them.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: space between using and (

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the public unsafe methods be annotated [SecuritySafeCritical] and private JoinCore annotated [SecurityCritical]?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand @jkotas comment https://github.com/dotnet/corefx/issues/12592#issuecomment-253971384 they are not needed in this assembly as we don't build mscorlib for desktop and the mirror is switched off. @jkotas can confirm.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will not add unless someone says otherwise.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand @jkotas comment dotnet/corefx#12592 (comment) they are not needed in this assembly as we don't build mscorlib for desktop and the mirror is switched off.

I realize that's the case for corefx. I asked because I'm not sure about coreclr given it appears the security annotations are still being maintained here (e.g. #7476). @jkotas should confirm.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not necessary to maintain these anymore.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can remove the // auto-generated part of the comment.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexGhiondea There appear to be lots of those comments throughout the repo; if such a change is made it should probably be done in bulk.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jamesqo that would work. I was just thinking that since you are adding this method you can clean-up the comment. I am fine with it as-is :).

@jamesqo
Copy link
Copy Markdown
Author

jamesqo commented Nov 4, 2016

@jkotas, I think this is ready to be merged.

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Nov 6, 2016

@AlexGhiondea Could you please take care of this?

@jamesqo
Copy link
Copy Markdown
Author

jamesqo commented Nov 16, 2016

Looks like the string.Join(string,object[]) bug is no more, since #8114 has been merged. In a few days (after the corefx tests for that PR have been merged), I will update this PR again and introduce code sharing between the (char,object[]) & (string,object[]) overloads.

@AlexGhiondea
Copy link
Copy Markdown

@jamesqo could you rebase and update the PR so that we can merge this?
Thanks for working on it!

@jamesqo jamesqo changed the title Add String.Join overloads accepting a char separator [no merge] Add String.Join overloads accepting a char separator Nov 21, 2016
@jamesqo jamesqo changed the title [no merge] Add String.Join overloads accepting a char separator Add String.Join overloads accepting a char separator Nov 21, 2016
@jamesqo
Copy link
Copy Markdown
Author

jamesqo commented Nov 21, 2016

@AlexGhiondea Sure. This is ready for merge now, I believe.

Copy link
Copy Markdown

@AlexGhiondea AlexGhiondea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking care of this!

@danmoseley
Copy link
Copy Markdown
Member

@stephentoub apparently has a couple comments b efore we merge this.

@jamesqo
Copy link
Copy Markdown
Author

jamesqo commented Nov 22, 2016

@danmosemsft @stephentoub What are they?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the original author went out of his or her way to keep this check out of this loop. I assume this doesn't make a measurable impact, but have you checked?

Copy link
Copy Markdown
Author

@jamesqo jamesqo Nov 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub The original version was incorrect in the first place. jointLength < 0 does not tell us whether anything overflowed b/c the int could have wrapped around to negative and passed 0 again, since we're dealing with a variable amount of strings. jointLength + 1 < 0 is also pointless.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could check for overflow by doing something like this:

'If (totalLength + currentValue.Length < totalLength)'

Copy link
Copy Markdown
Author

@jamesqo jamesqo Nov 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexGhiondea That would have to be in the loop as well; there is no difference from the version that is currently checked in.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jamesqo yes, it should still be in the loop but that check would catch overflows even when the value overflows enough to come back into the positive number range.

Copy link
Copy Markdown
Author

@jamesqo jamesqo Nov 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexGhiondea, a single int + int cannot overflow that much. Worst case scenario, totalLength and currentValue.Length are both int.MaxValue. Their sum is only -2. It's only when we multiply we run into problems.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jamesqo ah yes -- I was thrown off by your comment above about the previous version of the code 😄

@AlexGhiondea AlexGhiondea merged commit 3f59c07 into dotnet:master Nov 23, 2016
@jamesqo jamesqo deleted the new-string-apis branch November 23, 2016 23:41
sergign60 pushed a commit to sergign60/coreclr that referenced this pull request Dec 2, 2016
* Add String.Join overloads that accept char separators
@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…7942)

* Add String.Join overloads that accept char separators

Commit migrated from dotnet/coreclr@3f59c07
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants