Added value type serialization support#36506
Added value type serialization support#36506ahsonkhan merged 2 commits intodotnet:masterfrom YohDeadfall:struct-serialization
Conversation
The reason for ReflectionEmitMaterializer is performance (and without boxing), so base decisions on that... Also currently "data types" (Decimal, int, etc) are created explicitly in JsonValueConverter.TryRead. These are "single value" data types. So this commit will be useful for value types that are not used as "single-value data types". |
| generator.Emit(OpCodes.Ldloca_S, local); | ||
| generator.Emit(OpCodes.Initobj, type); | ||
| generator.Emit(OpCodes.Ldloc, local); | ||
| generator.Emit(OpCodes.Box, type); |
There was a problem hiding this comment.
Does this have to be boxed?
There was a problem hiding this comment.
Yes, it boxes because JsonClassInfo.ConstructorDelegate has object as a return type. It's definitely possible to avoid boxing via preallocated ByReference<T> for each struct and storing it in a TLS or in a shared pool, but I intentionally haven't implemented that yet. The first step is basic support with tests, and optimization as the second after a discussion.
| { | ||
| get | ||
| { | ||
| yield return new object[] { typeof(SimpleTestStruct), SimpleTestStruct.s_data }; |
There was a problem hiding this comment.
If the goal of this PR is to provide struct support in the serializer, we should consider adding a lot more test cases, including nested types (struct within reference, reference within struct, etc.).
There was a problem hiding this comment.
Haven't done that because structs have no difference from primitive value types and decimal which isn't primitive. The idea of that test was to check that setting a value on a struct instance sets the value on the original struct on the heap and not on its stack allocated copy. But if it's required, I can write more tests. Should I do that and should I duplicate tests of SimpleTestClass?
There was a problem hiding this comment.
We need to at least verify multiple properties are applied from json.
|
@steveharter I've rebased the PR and am waiting for another review. |
| realMethod.Name, | ||
| type, | ||
| ".ctor", | ||
| typeof(object), |
There was a problem hiding this comment.
Why use typeof(object) here for the non-struct case? However it probably doesn't matter since it doesn't imply boxing for ref types.
There was a problem hiding this comment.
There is no need to introduce condition here. JsonClassInfo.ConstructorDelegate returns an object.
|
|
||
| internal void Set(object obj, TDeclaredProperty value) | ||
| { | ||
| if (default(TClass) == null) |
There was a problem hiding this comment.
Using default on every getter\setter may be expensive. It seems like an easy thing to cache as a bool (class or struct).
There was a problem hiding this comment.
This is a well known optimization. The JIT eliminates the check and replaces it by a constant value (true for reference types and Nullable<T>, false for other value types).
| else | ||
| { | ||
| var get = (GetPropertyValueByRef)_get; | ||
| return get(ref Unsafe.As<ByReference>(obj).Value); |
There was a problem hiding this comment.
Doesn't this only work for your test class with a single value?
There was a problem hiding this comment.
ByReference represents a boxed object, this is how the CLR stores value types on the heap. The code does unsafe cast of a boxed value to ByReference and gets a reference to its actual value.
There was a problem hiding this comment.
@jkotas any concern using class like this to get access to the real value from a boxed object?:
private sealed class ByReference
{
public TClass Value;
}
...
ref Unsafe.As<ByReference>(obj).Value
There was a problem hiding this comment.
This is unsafe construct that depends on runtime implementation details. I do not think it is a good idea to use it in high-level libraries like this one.
Can you use Unsafe.Unbox instead?
There was a problem hiding this comment.
Somehow missed that method, thanks for pointing out. It will make the code a bit trickier, but it's definitely possible to use Unsafe.Unbox<T> here.
| } | ||
| } | ||
|
|
||
| internal void Set(object obj, TDeclaredProperty value) |
There was a problem hiding this comment.
Isn't the struct boxed on each setter? So the new property value won't be applied to obj?
Ideally this is ref TClass obj for structs so we don't box (and unbox) but that requires re-factoring of other calling code. If we change it to ref object obj and for structs below cast obj to TClass and then re-assign that back to obj that would work but slow (unbox + box)
There was a problem hiding this comment.
It's already boxed and stored on the heap. object means a reference to a class instance or a reference to a boxed value type located on the heap too. ref object is a pointer to a local which stores a pointer to the actual object.
There was a problem hiding this comment.
Yes the my sentence was because I wasn't sure the ByReference class was sufficient.
steveharter
left a comment
There was a problem hiding this comment.
Please verify multiple properties on a struct works.
|
Removed the dirty trick with |
|
This isn't quite what I had in mind with that proposal. Additionally, is there any way to do this without use of Unsafe APIs? Putting those into a serializer / deserializer makes me extraordinarily uncomfortable. |
|
Sure, it's possible, but there is a cost in complexity. A stack frame should store a wrapper around the struct in the Speaking about your proposal, how do you plan to implement property accessors ? |
|
I had planned on using a combination of a force-JIT flag and calli, which should be lighter-weight than bouncing through multiple delegate invocations. But importantly, the complexity of the unsafe code would be wholly encapsulated within coreclr (where we could keep it in sync with how the runtime works) rather than spread across non-coreclr packages which would have to be serviced independently if something in the runtime changes or if a bug were discovered. (This shouldn't affect your PR much. It just means that if that other issue does make its way in at some future point then this code can be refactored to ride atop it, which should get a measurable perf boost.) If the alternative for not using |
|
@GrabYourPitchforks Totally agree that such a kind of stuff should be moved off. I don't like the chained delegates I introduced, but there is no other way for now except code generation. So I'm looking forward to @steveharter Rebased and squashed commits. |
|
@steveharter Rebased again to include changes in the test project. |
| } | ||
| else | ||
| { | ||
| var propertyGetter = CreateDelegate<GetProperty<TClass, TProperty>>(getMethodInfo); |
There was a problem hiding this comment.
Concerned about performance here. Have you measured?
Some comments here would be good as well - like the double hop.
I believe you need both a strongly-typed class (TClass) and property (TProperty) in order to have a direct reference to the getter\setter (instead of methodinfo.invoke) which we have above however then this new delegate below is returned which doesn't have a strongly-typed TClass.
Also concerned about the extra hop (assuming there is one).
There was a problem hiding this comment.
It's slower (not measured, but it is) because of the second delegate one line below. Unfortunately, it's a price of Unsafe.Unbox<T> which has a constraint on T. I will think about merging two generic JsonPropertyInfo into a single class and about adding specialized classes for struct and class handling.
this new delegate below is returned which doesn't have a strongly-typed TClass.
Yep, because passing a value as TClass when it's a structure will cause passing it by value, not by reference. To avoid it I pass an object and get a reference to it in lines 80 and 89. As I said, there is a constraint on T which can be avoided by some magic with delegates. The JIT doesn't check constrains, but the compiler does,
There was a problem hiding this comment.
Can we can avoid the second delegate for the reference type (non-struct) case?
There was a problem hiding this comment.
Nope due to the signature check.
There was a problem hiding this comment.
Let me benchmark this and see what the hit is.
|
cc @rynowak @JamesNK any concerns with supporting struct at the highest level? (the struct must still have public properties - fields not supported with this PR). For areas that expose a top-level type, like in extensibilty scenarios, this means we will want to pass the type argument by reference ( |
|
Imo, it could be difficult for customers to reason about the functionality with conditional support like this, so we should avoid piecemeal support of the struct feature. Today, the serializer supports classes. If we want to support structs, we should make sure that nested/custom structs (structs within reference types, reference types within structs, etc.) are supported, potentially with a story for fields/default values/ctors. |
|
Are there major blockers to lighting up value types all over the stack (even if it's slightly ineffient? requires boxing, etc). I'd like to say yes, to having struct support, but it's hard to explain to people why it would only work as a top-level object. |
|
I've opened dotnet/coreclr#24189 to get some clarifications about an AOT friendly way to get the underlying value of an nullable, but for now it's possible to do the same trick which exists in Npgsql (see npgsql/npgsql#1773 for example). I think that it's okay because the current implementation doesn't support AOT out from the box. So I can invert the code and make two final property infos, one for reference types and another one for value types. Nullable handling logic will be implemented via a delegate and a method to take an underlying value only if |
|
cc @JeremyKuhne has a scenario where we have a top-level struct with public properties (actually just our JsonElement). |
|
@ahsonkhan @rynowak any final thoughts? I recommend supporting this. The other 3 serializers I tested (Json.Net, Jil, Utf8Json) support structs at top-level. The perf hit on end-to-end is within the margin of error (simple POCO type with 10 properties. Struct performance is the same. Before: After: |
|
I've changed the emit materializer behavior to support value types with parameterless constructors to align it with |
|
@steveharter @ahsonkhan Rebased the PR again and ready for the final review. |
Which run on every pull request. As you know some steps fail due to unrelated reasons. So I would like to push only them, not all pipelines. |
I don't think re-running a specific leg via a command is supported yet, with azure pipelines. I can re-run a specific leg from the UI (under details), but that's not fully available yet either.
cc @safern |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Assigned additional memory to the VM on which I'm building Core FX and enabled page files, but still getting |
|
It's strange that even 48 GB of RAM and 16 GB of page files isn't enough to build the solution. Task Manager showed no high memory usage, only 16 GB of memory were used before fails. |
@ViktorHofer, @karelz - have either of you seen this sort of issue before when building corefx locally? Do you have some guidance or solution here for @YohDeadfall? I have not hit OOM myself when building corefx. |
|
@rainersigwald any idea what's causing the OOM in the build (see the callstack above)? |
|
@YohDeadfall can you please try the build on another machine/VM just as a test? |
|
The callstack doesn't jump out at me. What version of MSBuild, running on what runtime (MSBuild.exe on desktop, or .NET core), is running? On what OS? Is it possible to capture a memory dump of the processes before they fail? |
|
Can it be due to high parallelism?
While build I saw that MSBuild shipped with VS2019 was used. It's version is 16.1.68-preview+g64a5b6be6d.
OS Name Microsoft Windows 10 Pro Insider Preview
Done. How can I send them to you? Via e-mail? |
|
/azp run corefx-ci |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
@rainersigwald It fails in random places, but with |
Tested on a real machine. Got out of memory three times... Here its configuration: |
|
And there are other problems on the physical machine: |
I think the easiest way would be to file a Visual Studio feedback item at https://developercommunity.visualstudio.com/spaces/8/index.html. That should allow you to upload the memory dump. If you mention the item here I can short-circuit the routing process. |
|
Looks like that building Core FX in parallel causes the error. I've decreased core count from 32 to 8 on the VM, and everything goes well now. @rainersigwald Here is a link to the issue. |
BenchmarkDotNet=v0.11.3.1003-nightly, OS=Windows 10.0.18895
Intel Xeon CPU E5-2687W 0 3.10GHz, 1 CPU, 16 logical and 16 physical cores
.NET Core SDK=3.0.100-preview5-011568
[Host] : .NET Core 3.0.0-preview5-27626-15 (CoreCLR 4.6.27622.75, CoreFX 4.700.19.22408), 64bit RyuJIT
Job-GBKDPQ : .NET Core ? (CoreCLR 3.0.19.26472, CoreFX 4.700.19.26901), 64bit RyuJIT
Job-AEKLGR : .NET Core ? (CoreCLR 3.0.19.26472, CoreFX 4.700.19.26901), 64bit RyuJIT
Runtime=Core IterationTime=250.0000 ms MaxIterationCount=20
MinIterationCount=15 WarmupCount=1 IndexViewModel
Location
LoginViewModel
MyEventsListerViewModel
|
|
Does it makes sense to add reflection emit strategy in this PR? It's to implement field support. |
I would add it separately, in a new PR on top of this. @YohDeadfall, thanks for getting the perf numbers. I am wondering what's causing the 20-40% regression on certain cases. I would have thought ToString and ToBytes to behave the same way in terms of perf, so the fact that they don't might point to perf results not being consistent. |
|
@ViktorHofer, @safern do you know if something's up with the ubuntu leg - Ubuntu.1604.Arm64.Open-arm64: A few unrelated tests keeping failing due to timeout (but only for that particular leg). For example - System.Reflection.Tests: |
I re-tried running the benchmarks and it looks like the performance is reasonable with no noticeable regression. I commented out some of the config that reduces accuracy in the interest of test execution time: IndexViewModel BenchmarkDotNet=v0.11.3.1003-nightly, OS=Windows 10.0.18362
Intel Core i7-6700 CPU 3.40GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.0.100-preview4-010907
[Host] : .NET Core 3.0.0-preview6-27629-07 (CoreCLR 4.6.27624.71, CoreFX 4.700.19.22902), 64bit RyuJIT
Job-BMHKSV : .NET Core ? (CoreCLR 3.0.19.26472, CoreFX 4.700.19.26901), 64bit RyuJIT
Runtime=Core Toolchain=CoreRun
Before:
After:
Before:
After:
|
Yes we are currently hitting frequent issues with docker images on arm64, @MattGal is rolling out a potential fix soon. |
|
One note; Arm64 is a different setup path (since they aren't VMs) but once we solve this on AMD64 we should have a good idea how to deal with it for ARM64 as well. |
* Added value type serialization support * Fixed coding style issue Commit migrated from dotnet/corefx@d70ba24
This pull request add value type serialization support, but it works only for properties. Therefore, it fixes #36240 partly. Opened #36505 to track field serialization.
Open questions:
ReflectionMaterializerdoes it for all types, butReflectionEmitMaterializerdidn't previously./cc @ahsonkhan @steveharter