Description
Describe the enhancement requested
We've recently been profiling code that uses the C# Arrow library and frequently accesses columns by name in a tight loop. We found that the following LINQ expression makes a significant contribution to execution time.
arrow/csharp/src/Apache.Arrow/Schema.cs, lines 81 to 86 in 2df7b23:

    public int GetFieldIndex(string name, IEqualityComparer<string> comparer = default)
    {
        comparer ??= StringComparer.CurrentCulture;
        return _fieldsList.IndexOf(_fieldsList.First(x => comparer.Equals(x.Name, name)));
    }

Obviously there are workarounds for this, such as creating our own cached mapping of column names to ordinals, but it would be nice if the built-in method were a little more efficient. In addition to the LINQ overhead, we're also iterating through _fieldsList twice (once to find a match, and then again to find the index of that match).
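
To illustrate the double scan, here is a minimal sketch of a single-pass lookup. It is not a proposed patch: `Field` is a hypothetical stand-in for the Arrow type (only `Name` is used), `FieldLookup` is an illustrative helper, and returning -1 on a miss is an assumption that differs from the current code, where `First` throws when nothing matches.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Apache.Arrow's Field; only Name matters here.
public sealed class Field
{
    public string Name { get; }
    public Field(string name) => Name = name;
}

public static class FieldLookup
{
    // Walks the list once and returns the index of the first field whose
    // name matches, or -1 if no field matches (an assumption of this sketch).
    public static int GetFieldIndex(
        IReadOnlyList<Field> fields,
        string name,
        IEqualityComparer<string> comparer = default)
    {
        // Same default as the existing method, so matching semantics are unchanged.
        comparer ??= StringComparer.CurrentCulture;

        for (int i = 0; i < fields.Count; i++)
        {
            if (comparer.Equals(fields[i].Name, name))
            {
                return i;
            }
        }

        return -1;
    }
}
```

A plain `for` loop avoids both the delegate invocation per element and the second pass that `IndexOf` performs after `First` has already found the match.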
We've also noticed that using StringComparer.CurrentCulture causes an allocation every time this method is called without an explicit comparer. I'm assuming the choice to use a culture-aware comparison by default is intentional?
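
For reference, the caller-side workaround mentioned earlier might look roughly like the sketch below. `ColumnOrdinalCache` is a hypothetical helper, not part of the Arrow library, and it takes plain column names so that it stays self-contained. Defaulting to `StringComparer.Ordinal` is an assumption made for the sketch; it avoids the per-call comparer allocation, but it is not equivalent to the culture-aware default discussed above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical caller-side cache: built once per schema, then queried in the hot loop.
public sealed class ColumnOrdinalCache
{
    private readonly Dictionary<string, int> _ordinals;

    public ColumnOrdinalCache(
        IReadOnlyList<string> columnNames,
        IEqualityComparer<string> comparer = null)
    {
        // The comparer is resolved once here rather than on every lookup.
        _ordinals = new Dictionary<string, int>(
            columnNames.Count, comparer ?? StringComparer.Ordinal);

        for (int i = 0; i < columnNames.Count; i++)
        {
            // Keep the first occurrence so duplicate names resolve the same
            // way a front-to-back scan would.
            if (!_ordinals.ContainsKey(columnNames[i]))
            {
                _ordinals.Add(columnNames[i], i);
            }
        }
    }

    // Returns false, leaving ordinal at 0, when the name is not present.
    public bool TryGetOrdinal(string name, out int ordinal) =>
        _ordinals.TryGetValue(name, out ordinal);
}
```

The trade-off is that the cache has to be rebuilt whenever the schema changes, which is part of why a faster built-in method would be preferable.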
Component(s)
C#