Add another Zip IEnumerable<T> extension method #26582
stephentoub merged 4 commits into dotnet:master from
Closing, didn't realize we needed API approval before PR. |
|
Looks like #16011 was approved. Is it possible to reopen this? |
src/System.Linq/tests/ZipTests.cs
    second = new ThrowsOnMatchEnumerable<int>(new int[] { 1, 2, 3 }, 2);

    var zip = first.Zip(second);

    [Fact]
    public void Zip2()
    {
        var count = (new int[] { 0, 1, 2 }).AsQueryable().Zip((new int[] { 10, 11, 12 }).AsQueryable()).Count();
    }

    public static IEnumerable<(TFirst, TSecond)> Zip<TFirst, TSecond>(this IEnumerable<TFirst> first, IEnumerable<TSecond> second)
        => first.Zip(second, (x, y) => (x, y));
Should we avoid the delegate invocation on every iteration by having the implementation instead be expanded out like the original overload is? e.g.

    public static IEnumerable<(TFirst, TSecond)> Zip<TFirst, TSecond>(this IEnumerable<TFirst> first, IEnumerable<TSecond> second)
    {
        if (first == null || second == null)
        {
            throw Error.ArgumentNull(first == null ? nameof(first) : nameof(second));
        }

        // This overload has no result selector, so only the two sources are passed on.
        return ZipIterator(first, second);
    }

    private static IEnumerable<(TFirst, TSecond)> ZipIterator<TFirst, TSecond>(IEnumerable<TFirst> first, IEnumerable<TSecond> second)
    {
        using (IEnumerator<TFirst> e1 = first.GetEnumerator())
        using (IEnumerator<TSecond> e2 = second.GetEnumerator())
        {
            // Stop as soon as either source runs out of elements.
            while (e1.MoveNext() && e2.MoveNext())
            {
                yield return (e1.Current, e2.Current);
            }
        }
    }

There's already a Zip extension method that takes two IEnumerables and a result selector Func that's meant to combine the elements of the enumerables in some way. Perhaps the most common use of this method is to just create a tuple, so I've added an overload which removes the result selector and instead produces a ValueTuple by default.
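For illustration, a minimal usage sketch of the overload described here, assuming a runtime where the two-argument `Enumerable.Zip` is available. The `ZipUsageSketch` class, its method names, and the sample data are hypothetical and are not part of the PR.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, used only to illustrate the two call shapes.
public static class ZipUsageSketch
{
    // Pairs elements using the selector-free overload; the result is a sequence of tuples.
    public static List<(string, int)> PairUp(IEnumerable<string> names, IEnumerable<int> scores)
        => names.Zip(scores).ToList();

    // The same pairing through the original overload, with an explicit result selector.
    public static List<(string, int)> PairUpWithSelector(IEnumerable<string> names, IEnumerable<int> scores)
        => names.Zip(scores, (name, score) => (name, score)).ToList();
}
```

Both calls stop at the end of the shorter sequence, so three names zipped with two scores yield two pairs.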
|
CI says I broke |
We've seen that happen on the arm legs. It's not you it's us. |
        throw Error.ArgumentNull(nameof(first));
    }

    if (second is null)
Nit: Is there any benefit to using is instead of == here? It just looks strange having it be different from the other overload above it.
is null never takes into account user-defined equality, so it always issues the optimal IL (brfalse) for a null check. It doesn't matter in this case because the argument is an interface so there can be no user-defined equality, but that's the beauty of is null -- if you always use it you never have to think about whether or not the type itself has optimized for a null check.
Thanks; that's what I thought. FWIW, either we should stick with == here, or we should change the other similar occurrences to use is. Otherwise this is just an oddity of style off by itself.
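To make the point about user-defined equality concrete, here is a contrived sketch. `AlwaysEqual` and `NullCheckSketch` are hypothetical names invented for this example; nothing like them appears in the PR, and the operator is deliberately wrong to show the difference.

```csharp
// Hypothetical type with a deliberately misleading equality operator.
public sealed class AlwaysEqual
{
    // Claims that any two operands are equal, including a comparison against null.
    public static bool operator ==(AlwaysEqual left, AlwaysEqual right) => true;
    public static bool operator !=(AlwaysEqual left, AlwaysEqual right) => false;

    public override bool Equals(object obj) => true;
    public override int GetHashCode() => 0;
}

public static class NullCheckSketch
{
    // Binds to the user-defined operator above, so the answer depends on that operator.
    public static bool CheckWithOperator(AlwaysEqual value) => value == null;

    // The constant pattern ignores user-defined operators and tests the reference itself.
    public static bool CheckWithPattern(AlwaysEqual value) => value is null;
}
```

For a non-null instance, `CheckWithOperator` returns true because the overloaded operator says so, while `CheckWithPattern` returns false. For an interface-typed argument such as `IEnumerable<TSecond>` there is no user-defined operator to pick up, which is why the choice is purely stylistic in this PR.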
    public static System.Collections.Generic.IEnumerable<TSource> Union<TSource>(this System.Collections.Generic.IEnumerable<TSource> first, System.Collections.Generic.IEnumerable<TSource> second, System.Collections.Generic.IEqualityComparer<TSource> comparer) { throw null; }
    public static System.Collections.Generic.IEnumerable<TSource> Where<TSource>(this System.Collections.Generic.IEnumerable<TSource> source, System.Func<TSource, bool> predicate) { throw null; }
    public static System.Collections.Generic.IEnumerable<TSource> Where<TSource>(this System.Collections.Generic.IEnumerable<TSource> source, System.Func<TSource, int, bool> predicate) { throw null; }
    public static System.Collections.Generic.IEnumerable<(TFirst First,TSecond Second)> Zip<TFirst, TSecond>(this System.Collections.Generic.IEnumerable<TFirst> first, System.Collections.Generic.IEnumerable<TSecond> second) { throw null; }
Nit: missing space in "First,TSecond"
* Add another Zip IEnumerable<T> extension method There's already a Zip extension method that takes two IEnumerables and a result selector Func that's meant to combine the elements of the enumerables in some way. Perhaps the most common use of this method is to just create a tuple, so I've added an overload which removes the result selector and instead produces a ValueTuple by default. * Respond to PR comments * Also update IQueryable * Respond to PR feedback
|
I strongly believe these need to be |
|
The element names are stored in attributes, they're not the name of any API. The only purpose of these attributes is to control how it's exposed into a C# language feature. That's why I think their opinion is pretty relevant here. If we were adding attributes to control how an API is exposed in F#, I think the opinion of the F# team would be relevant there too. |
It doesn't matter whether the names are encoded in CLI metadata or in attributes; from the perspective of consuming the method, documenting the method, browsing the method, etc., it's API. |
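Both sides of this exchange rely on where tuple element names actually live. The sketch below, with a hypothetical `TupleNameSketch` class and `Summarize` method that are not part of the PR, reads the names back through reflection: the runtime type is a plain `ValueTuple<int, double>` with `Item1` and `Item2` fields, while the declared names are carried by a `TupleElementNamesAttribute` on the return parameter.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical class used only to inspect how tuple element names are emitted.
public static class TupleNameSketch
{
    // A method whose signature declares named tuple elements.
    public static (int Count, double Sum) Summarize(IEnumerable<double> values)
    {
        int count = 0;
        double sum = 0.0;
        foreach (double value in values)
        {
            count++;
            sum += value;
        }
        return (count, sum);
    }

    // Names as recorded by the compiler in the attribute on the return parameter.
    public static IList<string> ReadDeclaredNames()
    {
        MethodInfo method = typeof(TupleNameSketch).GetMethod(nameof(Summarize));
        TupleElementNamesAttribute attribute =
            method.ReturnParameter.GetCustomAttribute<TupleElementNamesAttribute>();
        return attribute.TransformNames;
    }

    // Field names of the underlying runtime type, which know nothing about the declared names.
    public static string[] ReadRuntimeFieldNames()
    {
        MethodInfo method = typeof(TupleNameSketch).GetMethod(nameof(Summarize));
        return method.ReturnType.GetFields().Select(field => field.Name).ToArray();
    }
}
```

Whether those attribute-carried names should be cased like parameters or like members is exactly the question under debate; the sketch only shows the mechanism, not an answer.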
|
FWIW, the tuple types concept page (https://docs.microsoft.com/en-us/dotnet/csharp/tuples) uses capital field alias names. The guidance for .NET is that all public elements, other than parameter names, are PascalCased (Framework Design Guidelines 3.1.1 (big chart on page 40, second edition)). Our instinct definitely was to say that neither Tuple nor ValueTuple have any place in public API. But if we relax "never" then our feeling is that the API shape should look like every other .NET API, and that means that (as something other than a parameter) it needs to follow PascalCase. |
    public static double StandardDeviation(IEnumerable<double> sequence)
    {
        (int Count, double Sum, double SumOfSquares) computation = ComputeSumsAnSumOfSquares(sequence);
        var variance = computation.SumOfSquares - computation.Sum * computation.Sum / computation.Count;
        return Math.Sqrt(variance / computation.Count);
    }

    private static (int Count, double Sum, double SumOfSquares) ComputeSumsAnSumOfSquares(IEnumerable<double> sequence)
    {
        var computation = (count: 0, sum: 0.0, sumOfSquares: 0.0);
        foreach (var item in sequence)
        {
            computation.count++;
            computation.sum += item;
            computation.sumOfSquares += item * item;
        }
        return computation;
    }

That's the only time it's used as a return value on that page (that I saw).
|
And then the usage assigns to camelCased variables... The plot thickens |
|
Ok, so it's inconsistent. |
The latter sentence doesn't logically follow from the first. As part of the API shape it should follow whatever rules make sense for this new construct. Since it is something new, it should have an explicit decision outlined for it rather than simply saying that it falls under the "everything else" bucket. I'm just trying to give a strong signal around the intent here. In the majority of LDM meetings these were thought of as "lightweight parameter lists". As such, my intuition is simply that the parameter naming takes effect. The IDE followed that guidance and has been nudging people in this direction. That said, I'm 100% behind the option of side-stepping this entirely by not putting tuples in your public API. Can we decide on that first? If there's no tuple, we don't even have to discuss naming. If we then decide we should have tuples here, we can decide what the right naming choice is based on how we think people will think about these symbols and how we think it will be best to produce/consume them.
It does matter, because one is the universal CLR API language, whereas the other one is a C# specific attribute controlling how the API is exposed into a C# language feature. That's why I think that the intent of the C# LDM with respect to that language feature is relevant here. |
|
@Neme12, you and I will have to agree to disagree here. |
I don't think that's the intent. It's not a C#-specific attribute. It's a generalized attribute for any language to provide hints to every other language about the intended name they are giving this tuple field. Basically, they are saying "instead of thinking of that field as Item5, think of it as 'customer'." The primary question is whether the intent of the attribute is to say "think of it as if you have a field here" or "think of it as if you have a parameter here". (Or maybe it's something different from both of those, and we shouldn't even think it maps to either.) That said, this still boils down to the following decision tree to me:
|
(Please don't bring up VB. 😄) |
|
I do think a tuple is the proper thing to return here, especially if there is ever a corresponding Unzip method. That one would mostly be useful for tuples, as opposed to a custom type. If given a choice, I'd rather have the tuple-returning pascal cased version. |
|
@stephentoub @agocke What next steps would be appropriate here? It sounds like there was some feeling that perhaps these APIs should not be using tuples in the first place. Is that something that can be rectified ASAP? Who would be the right person to make that change? Thanks!
|
@CyrusNajmabadi, open a new issue. All discussion on this topic in the last two days has happened on closed issues/PRs. |
|
Sounds good. Thanks, @stephentoub.
This reverts commit 6b2b9e4.
…" (dotnet#33709) This reverts commit 6b2b9e4.
|
Couldn't this have been implemented simply as |
Right. What you have is functionally correct, but:
|
|
I learned something new today, thanks! Hopefully that issue is addressed soon; I don't think many people know about it. It seems very counterintuitive that using a lambda would improve performance.
* Add another Zip IEnumerable<T> extension method There's already a Zip extension method that takes two IEnumerables and a result selector Func that's meant to combine the elements of the enumerables in some way. Perhaps the most common use of this method is to just create a tuple, so I've added an overload which removes the result selector and instead produces a ValueTuple by default. * Respond to PR comments * Also update IQueryable * Respond to PR feedback Commit migrated from dotnet/corefx@6b2b9e4
…x#26582)" (dotnet/corefx#33709) This reverts commit dotnet/corefx@6b2b9e4. Commit migrated from dotnet/corefx@53ae9af
…et/corefx#26582)"" (dotnet/corefx#35595) * Revert "Revert "Add another Zip IEnumerable<T> extension method (dotnet/corefx#26582)" (dotnet/corefx#33709)" This reverts commit dotnet/corefx@53ae9af. * Adapt to ThrowHelper changes Commit migrated from dotnet/corefx@720591a

