Reuse HashHelpers for BinaryFormatter objectholder hashes #25509
| { | ||
| if (min < 0) | ||
| throw new ArgumentException(SR.Arg_HTCapacityOverflow); | ||
| Contract.EndContractBlock(); |
typo below
// Allow the hashtables to grow to maximum possible size (~2G elements) before encoutering capacity overflow.
|
There is another duplicate here: |
|
I see that https://github.com/dotnet/corefx/blob/master/src/System.Private.DataContractSerialization/src/System/Runtime/Serialization/ObjectToIdCache.cs could potentially use HashHelpers also. Not suggesting you do this now. Correction: no, it couldn't, as ObjectToIdCache uses a sequence of primes where each is at least 2x the previous one, whereas HashHelpers uses primes where each is at least 1.2x the previous one. I see GCC uses 2x. I wonder what the history is there. |
|
Interestingly, ObjectToIdCache.cs will go beyond the largest prime that fits in an array (0x7FEFFFFD) and just use a non-prime (0x7FEFFFFF). It says that's the largest possible array size, although I would have thought that was 0x7FFFFFFF. Maybe there's some overhead. This assumes the hashtables would still work (albeit inefficiently) with a non-prime and not, e.g., loop forever. |
| ** | ||
| ===========================================================*/ | ||
|
|
||
| using System; |
you can remove the comment above wrapped in ==========
The largest possible array size in both .NET Framework and .NET Core is 0X7FEFFFFF (https://docs.microsoft.com/en-us/dotnet/framework/configure-apps/file-schema/runtime/gcallowverylargeobjects-element). |
|
Largest array size is 0x7FFFFFC7 for 1-byte sized value-type elements and 0X7FEFFFFF otherwise. If you are looking to |
| </resheader> | ||
| <data name="Arg_HTCapacityOverflow" xml:space="preserve"> | ||
| <value>Hashtable's capacity overflowed and went negative. Check load factor, capacity and the current size of the table.</value> | ||
| </data> |
Why is an error in HashHelpers referring to Hashtable? If that were to actually be hit, it would have no relevance to BinaryFormatter.
8d4979e to ce441ee
| @@ -40,24 +30,41 @@ internal static class HashHelpers | |||
| 1103, 1327, 1597, 1931, 2333, 2801, 3371, 4049, 4861, 5839, 7013, 8419, 10103, 12143, 14591, | |||
I would add a comment explaining why we don't just grow the table... since apparently it wasn't obvious to us
| <value>System.Resources.ResXResourceWriter, System.Windows.Forms, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value> | ||
| </resheader> | ||
| <data name="Arg_HTCapacityOverflow" xml:space="preserve"> | ||
| <value>Capacity overflowed and went negative.</value> |
Something like "Cannot add more than {0} objects." might be more useful/user oriented? They don't care whether we use signed ints.
I didn't address this one as we now should never hit this message unless the array entirely overflows.
|
Tests (in particular if this is fixing https://github.com/dotnet/corefx/issues/24902)? |
|
I'm waiting for a decision in https://github.com/dotnet/corefx/issues/25533. Specifically, whether we should always compute prime numbers and get rid of the pre-computed ones completely. I might benchmark it both ways to get the discussion going again. |
I expect that computing the small prime numbers will be too slow compared to the table lookup. I think the table makes sense for small numbers. |
|
Overlap with dotnet/coreclr#15453 |
|
@ViktorHofer if we're blocked, would it make sense to close the PR until it is unblocked? |
|
I don't think we are blocked here. We haven't reached a consensus yet. But I agree, the discussion seems to have stalled. |
|
If we can't reach consensus in the next few days, I would recommend moving the discussion into an issue/email and closing the PR until we know what to do. |
|
Closing, please reopen when consensus is reached. |
ce441ee to eed669a
|
Reopening to finally fix the remaining issues that are tagged with 2.1. |
eed669a to 06daf0e
|
@ViktorHofer is this one ready to go now? |
|
Working on this right now. |
|
@stephentoub > Tests (in particular if this is fixing #24902)? Just added relevant ones. @stephentoub @jkotas PTAL |
fb7c2c9 to af85154
|
Gosh... We have the same implementation in CoreLib shared. |
| private Object _syncRoot; | ||
|
|
||
| private static ConditionalWeakTable<object, SerializationInfo> s_serializationInfoTable; | ||
| private static ConditionalWeakTable<object, SerializationInfo> SerializationInfoTable => LazyInitializer.EnsureInitialized(ref s_serializationInfoTable); |
In CoreLib we use Interlocked.CompareExchange instead. @jkotas any preference here?
internal static ConditionalWeakTable<object, SerializationInfo> SerializationInfoTable
{
    get
    {
        if (s_serializationInfoTable == null)
            Interlocked.CompareExchange(ref s_serializationInfoTable, new ConditionalWeakTable<object, SerializationInfo>(), null);
        return s_serializationInfoTable;
    }
}
I do not have preference.
LazyInitializer.EnsureInitialized is a convenience helper. It looks nicer, but it results in bigger, slower code.
Thanks :) I'm fine with using LazyInitializer here as the serialization code path isn't performance-critical.
|
This is failing on x86 Win machines with an OutOfMemoryException. |
…efx#25509)
* Reuse HashHelpers for BinaryFormatter objectholder hashes
* Revert "Merge pull request dotnet/corefx#6203 from SunnyWar/master"
  This reverts commit dotnet/corefx@ddf8ca0, reversing changes made to dotnet/corefx@0a0ea7f.
* Change resource string, make HashTable reuse existing HashHelper
* Add comment describing hash number growth
* Add hash number growth tests for BinaryFormatter & HashSet
* Disable tests on x86 because of OOMs
Commit migrated from dotnet/corefx@b6b5982
Fixes https://github.com/dotnet/corefx/issues/25533
Fixes https://github.com/dotnet/corefx/issues/24902
With #17949 we increased the MaxArraySize of ObjectHolders. Because of that change, our largest pre-computed prime number, 6584983, can now be smaller than the maximum size of an array. At @danmosemsft's suggestion I'm reusing the HashHelpers utility class to use its larger pre-computed prime numbers and to remove the logic for calculating the next optimal prime number.
This also changes the starting prime: before it was 5, and now it is 3. Once I'm off the plane I will add benchmarks with performance numbers. A quick test didn't show any noticeable performance regressions.
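For readers who don't have the helper in front of them, here is a rough sketch of the pattern discussed in this PR: look up a pre-computed prime first, fall back to computing one, and clamp growth at the largest prime below the maximum array length. The class name `PrimeSketch` and the shortened table are hypothetical and for illustration only. This is not the actual HashHelpers source, which has a longer table and extra checks.

```csharp
using System;

// Hypothetical stand-in for illustration; not the real HashHelpers source.
internal static class PrimeSketch
{
    // Largest prime below the maximum array length (0x7FEFFFFF) quoted in the review thread.
    public const int MaxPrimeArrayLength = 0x7FEFFFFD;

    // Abbreviated table starting at 3; the real table is much longer.
    private static readonly int[] s_primes =
        { 3, 7, 11, 17, 23, 29, 37, 47, 59, 71, 89, 107, 131, 163, 197 };

    public static bool IsPrime(int candidate)
    {
        // Even numbers are prime only if they are 2.
        if ((candidate & 1) == 0)
            return candidate == 2;

        // Trial division by odd numbers up to the square root.
        int limit = (int)Math.Sqrt(candidate);
        for (int divisor = 3; divisor <= limit; divisor += 2)
        {
            if (candidate % divisor == 0)
                return false;
        }
        return candidate > 1;
    }

    public static int GetPrime(int min)
    {
        if (min < 0)
            throw new ArgumentException("Capacity overflowed and went negative.");

        // Fast path: the smallest table entry that is at least min.
        foreach (int prime in s_primes)
        {
            if (prime >= min)
                return prime;
        }

        // Slow path: past the end of the table, search odd candidates.
        for (int candidate = min | 1; candidate < int.MaxValue; candidate += 2)
        {
            if (IsPrime(candidate))
                return candidate;
        }
        return min;
    }

    public static int ExpandPrime(int oldSize)
    {
        int newSize = 2 * oldSize;

        // If doubling exceeds (or overflows past) the limit, clamp to the
        // largest usable prime instead of failing, as long as that still grows the table.
        if ((uint)newSize > MaxPrimeArrayLength && MaxPrimeArrayLength > oldSize)
            return MaxPrimeArrayLength;

        return GetPrime(newSize);
    }
}
```

In this sketch, `PrimeSketch.GetPrime(4)` comes from the table, `PrimeSketch.GetPrime(200)` falls through to the computed path, and `PrimeSketch.ExpandPrime(0x40000000)` clamps to `MaxPrimeArrayLength` rather than overflowing.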
@Alois-xx I'm marking issue #24902 as fixed. As @stephentoub already mentioned, we brought back BinaryFormatter for a smoother transition from .NET Framework to .NET Core and don't intend to optimize it further. If you want to, you can work on changing the data structure yourself to remove the need for the pre-computed prime numbers, and with it the virtual limit. That would require extensive tests to be confident that we don't break or regress existing code.
cc @joperezr