Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Add another Zip IEnumerable<T> extension method#26582

Merged
stephentoub merged 4 commits intodotnet:masterfrom
agocke:add-linq-zip
Nov 2, 2018
Merged

Add another Zip IEnumerable<T> extension method#26582
stephentoub merged 4 commits intodotnet:masterfrom
agocke:add-linq-zip

Conversation

@agocke
Copy link
Copy Markdown
Member

@agocke agocke commented Jan 25, 2018

There's already a Zip extension method that takes two IEnumerables
and a result selector Func that's meant to combine the elements
of the enumerables in some way. Perhaps the most common use of this
method is to just create a tuple, so I've added an overload which
removes the result selector and instead produces a ValueTuple by
default.

@agocke
Copy link
Copy Markdown
Member Author

agocke commented Jan 25, 2018

Closing, didn't realize we needed API approval before PR.

@agocke agocke closed this Jan 25, 2018
@karelz karelz added this to the 2.1.0 milestone Feb 4, 2018
@RikkiGibson
Copy link
Copy Markdown
Member

Looks like #16011 was approved. Is it possible to reopen this?

@agocke agocke reopened this Oct 27, 2018

second = new ThrowsOnMatchEnumerable<int>(new int[] { 1, 2, 3 }, 2);

var zip = first.Zip(second);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

[Fact]
public void Zip2()
{
var count = (new int[] { 0, 1, 2 }).AsQueryable().Zip((new int[] { 10, 11, 12 }).AsQueryable()).Count();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: var => int

}

public static IEnumerable<(TFirst, TSecond)> Zip<TFirst, TSecond>(this IEnumerable<TFirst> first, IEnumerable<TSecond> second)
=> first.Zip(second, (x, y) => (x, y));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we avoid the delegate invocation on every iteration by having the implementation instead be expanded out like the original overload is? e.g.

public static IEnumerable<(TFirst, TSecond)> Zip<TFirst, TSecond>(this IEnumerable<TFirst> first, IEnumerable<TSecond> second)
{
    if (first == null || second == null)
    {
        throw Error.ArgumentNull(first == null ? nameof(first) : nameof(second));
    }

    return ZipIterator(first, second, resultSelector);
}
 
private static IEnumerable<(TFirst, TSecond)> ZipIterator<TFirst, TSecond>(IEnumerable<TFirst> first, IEnumerable<TSecond> second)
{
    using (IEnumerator<TFirst> e1 = first.GetEnumerator())
    using (IEnumerator<TSecond> e2 = second.GetEnumerator())
    {
        while (e1.MoveNext() && e2.MoveNext())
        {
            yield return (e1.Current, e2.Current);
        }
    }
}

There's already a Zip extension method that takes two IEnumerables
and a result selector Func that's meant to combine the elements
of the enumerables in some way. Perhaps the most common use of this
method is to just create a tuple, so I've added an overload which
removes the result selector and instead produces a ValueTuple by
default.
@agocke agocke modified the milestones: 2.1.0, 2.2 Oct 30, 2018
@agocke
Copy link
Copy Markdown
Member Author

agocke commented Oct 31, 2018

CI says I broke System.Net.NameResolution.PalTests.NameResolutionPalTests/TryGetAddrInfo_HostName. I won't say it's impossible, but is there any way this is a known issue?

@stephentoub
Copy link
Copy Markdown
Member

but is there any way this is a known issue

We've seen that happen on the arm legs. It's not you it's us.

throw Error.ArgumentNull(nameof(first));
}

if (second is null)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Is there any benefit to using is instead of == here? It just looks strange having it be different from the other overload above it.

Copy link
Copy Markdown
Member Author

@agocke agocke Nov 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is null never takes into account user-defined equality, so it always issues the optimal IL (brfalse) for a null check. It doesn't matter in this case because the argument is an interface so there can be no user-defined equality, but that's the beauty of is null -- if you always use it you never have to think about whether or not the type itself has optimized for a null check.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks; that's what I thought. FWIW, either we should stick with == here, or we should change the other similar occurrences to use is. Otherwise this is just an oddity of style off by itself.

public static System.Collections.Generic.IEnumerable<TSource> Union<TSource>(this System.Collections.Generic.IEnumerable<TSource> first, System.Collections.Generic.IEnumerable<TSource> second, System.Collections.Generic.IEqualityComparer<TSource> comparer) { throw null; }
public static System.Collections.Generic.IEnumerable<TSource> Where<TSource>(this System.Collections.Generic.IEnumerable<TSource> source, System.Func<TSource, bool> predicate) { throw null; }
public static System.Collections.Generic.IEnumerable<TSource> Where<TSource>(this System.Collections.Generic.IEnumerable<TSource> source, System.Func<TSource, int, bool> predicate) { throw null; }
public static System.Collections.Generic.IEnumerable<(TFirst First,TSecond Second)> Zip<TFirst, TSecond>(this System.Collections.Generic.IEnumerable<TFirst> first, System.Collections.Generic.IEnumerable<TSecond> second) { throw null; }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: missing space in "First,TSecond"

Copy link
Copy Markdown
Member

@stephentoub stephentoub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@stephentoub stephentoub merged commit 6b2b9e4 into dotnet:master Nov 2, 2018
@agocke agocke deleted the add-linq-zip branch November 2, 2018 23:07
EgorBo pushed a commit to EgorBo/corefx that referenced this pull request Nov 4, 2018
* Add another Zip IEnumerable<T> extension method

There's already a Zip extension method that takes two IEnumerables
and a result selector Func that's meant to combine the elements
of the enumerables in some way. Perhaps the most common use of this
method is to just create a tuple, so I've added an overload which
removes the result selector and instead produces a ValueTuple by
default.

* Respond to PR comments

* Also update IQueryable

* Respond to PR feedback
@stephentoub
Copy link
Copy Markdown
Member

stephentoub commented Nov 15, 2018

I strongly believe these need to be First and Second, not first and second. I don't care whether that's via naming of tuple elements or whether it's via creating a custom struct. If we keep the tuple, then either we need to not use any names and just have it be Item1 and Item2, or we need to use First and Second. I would fight to not ship this API at all if the only way to do so were using first and second; I don't often have this strong an opinion for this sort of thing, but here, I do.

@Neme12
Copy link
Copy Markdown

Neme12 commented Nov 15, 2018

The element names are stored in attributes, they're not the name of any API. The only purpose of these attributes is to control how it's exposed into a C# language feature. That's why I think their opinion is pretty relevant here.

If we were adding attributes to control how an API is exposed in F#, I think the opinion of the F# team would be relevant there too.

@stephentoub
Copy link
Copy Markdown
Member

The element names are stored in attributes, they're not the name of any API.

It doesn't matter whether the names are encoded in CLI metadata or in attributes; from the perspective of consuming the method, documenting the method, browsing the method, etc., it's API.

@bartonjs
Copy link
Copy Markdown
Member

FWIW, the tuple types concept page (https://docs.microsoft.com/en-us/dotnet/csharp/tuples) uses capital field alias names.

The guidance for .NET is that all public elements, other than parameter names, are PascalCased (Framework Design Guidelines 3.1.1 (big chart on page 40, second edition)). ValueTuple, which is what actually is returned, has the fields Item1 and Item2. The presence of TupleElementNamesAttribute makes the IDE hide the normal field names and suggest only those names from that attribute; but that's just an IDE behavior. (FWIW, ReSharper suggests decomposing into var (item1, item2) if they're unnamed,

Our instinct definitely was to say that neither Tuple nor ValueTuple have any place in public API. But if we relax "never" then our feeling is that the API shape should look like every other .NET API, and that means that (as something other than a parameter) it needs to follow PascalCase.

@Neme12
Copy link
Copy Markdown

Neme12 commented Nov 15, 2018

What do you mean by capital? That page shows camel case:
image
image

@bartonjs
Copy link
Copy Markdown
Member

@Neme12

public static double StandardDeviation(IEnumerable<double> sequence)
{
    (int Count, double Sum, double SumOfSquares) computation = ComputeSumsAnSumOfSquares(sequence);

    var variance = computation.SumOfSquares - computation.Sum * computation.Sum / computation.Count;
    return Math.Sqrt(variance / computation.Count);
}

private static (int Count, double Sum, double SumOfSquares) ComputeSumsAnSumOfSquares(IEnumerable<double> sequence)
{
    var computation = (count: 0, sum: 0.0, sumOfSquares: 0.0);

    foreach (var item in sequence)
    {
        computation.count++;
        computation.sum += item;
        computation.sumOfSquares += item * item;
    }

    return computation;
}

That's the only time it's used as a return value on that page (that I saw).

@RikkiGibson
Copy link
Copy Markdown
Member

And then the usage assigns to camelCased variables... The plot thickens

@Neme12
Copy link
Copy Markdown

Neme12 commented Nov 15, 2018

Ok, so it's inconsistent.

@CyrusNajmabadi
Copy link
Copy Markdown

Our instinct definitely was to say that neither Tuple nor ValueTuple have any place in public API. But if we relax "never" then our feeling is that the API shape should look like every other .NET API, and that means that (as something other than a parameter) it needs to follow PascalCase.

THe latter sentence doesn't logically follow the first. As part of the API shape it should follow whatever rules make sense for this new construct. Since it is something new, it should have an explicit decision outlined for it rather than simply saying that it falls under hte "everything else" bucket.

I'm just trying to give a strong signal around the intent here. In the majority of LDM meetings these were thought of as "lightweight parameter lists". As such, my intuition is simply that the parameter naming takes effect. IDE followed the guidance and has been nudging people in this direction.

That said, i'm 100% behind the: just side-step this entirely by not putting tuples in your public API.

Can we decide on that first? If there's no tuple, we don't even have to discss naming. If we then decide we should have tuples here, then we can decide what the right naming choice is based on how we think people will think about these symbols and how we think will be best to produce/consume them.

@Neme12
Copy link
Copy Markdown

Neme12 commented Nov 15, 2018

It doesn't matter whether the names are encoded in CLI metadata or in attributes; from the perspective of consuming the method, documenting the method, browsing the method, etc., it's API.

It does matter, because one is the universal CLR API language, whereas the other one is a C# specific attribute controlling how the API is exposed into a C# language feature. That's why I think that the intent of the C# LDM with respect to that language feature is relevant here.

@stephentoub
Copy link
Copy Markdown
Member

@Neme12, you and I will have to agree to disagree here.

@CyrusNajmabadi
Copy link
Copy Markdown

CyrusNajmabadi commented Nov 15, 2018

whereas the other one is a C# specific attribute controlling how the API is exposed into a C# language feature

I don't think that's the intent. It's not a C# specific attribute. It's a generalized attribute for any language to provide hints to every other language about the intended name they are giving for this tuple field. Basically, tehy are saying "instead of thinking of that field as Item5, think of it as 'customer'." The primary question is: is the intent of the attribute to say: think of it as if you have a field here. or it is: think of it as if you have a parameter here. (Or, maybe it's something even different from both of those and we shouldn't even think it maps to either of those).

that said, this still boils down to the following decision tree to me:

  1. should we even use a tuple here? Yes/no? If no. We're done. No concern about naming.
  2. If yes, what should tuple names be cased as in APIs? I personally think that needs a deeper conversation that especially relates to how we think people will think about these guys and especially how we expect the common coding patterns to be around it.

@Neme12
Copy link
Copy Markdown

Neme12 commented Nov 15, 2018

whereas the other one is a C# specific attribute

(Please don't bring up VB. 😄)

@Neme12
Copy link
Copy Markdown

Neme12 commented Nov 15, 2018

I do think a tuple is the proper thing to return here, especially if there is ever a corresponding Unzip method. That one would mostly be useful for tuples, as opposed to a custom type. If given a choice, I'd rather have the tuple-returning pascal cased version.

@CyrusNajmabadi
Copy link
Copy Markdown

@stephentoub @agocke What next steps would be appropriate here? It sounds like tehre was some feeling that perhaps these APIs should not be using tuples in the first place. Is that something that can be rectified asap? Who would be the right person to make that change?

Thanks!

@stephentoub
Copy link
Copy Markdown
Member

@CyrusNajmabadi, open a new issue. All discussion on this topic in the last two days has happened on closed issues/PRs.

@CyrusNajmabadi
Copy link
Copy Markdown

Sounds good. Thank @stephentoub .

stephentoub added a commit to stephentoub/corefx that referenced this pull request Nov 27, 2018
jlennox pushed a commit to jlennox/corefx that referenced this pull request Dec 16, 2018
@mariusGundersen
Copy link
Copy Markdown

Couldn't this have been implemented simply as first.Zip(second, ValueTuple.Create)? Or is there a performance advantage of the current code?

@stephentoub
Copy link
Copy Markdown
Member

Or is there a performance advantage of the current code?

Right. What you have is functionally correct, but:

@mariusGundersen
Copy link
Copy Markdown

I learned something new today, thanks! Hopefully that issue is adressed soon, I don't think many people know about it. It seems very counter intuitive that to make a lambda will have performance improvements.

stephentoub added a commit that referenced this pull request Feb 26, 2019
stephentoub added a commit that referenced this pull request Feb 27, 2019
…)"" (#35595)

* Revert "Revert "Add another Zip IEnumerable<T> extension method (#26582)" (#33709)"

This reverts commit 53ae9af.

* Adapt to ThrowHelper changes
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Add another Zip IEnumerable<T> extension method

There's already a Zip extension method that takes two IEnumerables
and a result selector Func that's meant to combine the elements
of the enumerables in some way. Perhaps the most common use of this
method is to just create a tuple, so I've added an overload which
removes the result selector and instead produces a ValueTuple by
default.

* Respond to PR comments

* Also update IQueryable

* Respond to PR feedback


Commit migrated from dotnet/corefx@6b2b9e4
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…et/corefx#26582)"" (dotnet/corefx#35595)

* Revert "Revert "Add another Zip IEnumerable<T> extension method (dotnet/corefx#26582)" (dotnet/corefx#33709)"

This reverts commit dotnet/corefx@53ae9af.

* Adapt to ThrowHelper changes


Commit migrated from dotnet/corefx@720591a
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants