Conversation
|
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label. |
There was a problem hiding this comment.
| As of this writing there is no specific support for operating with incoherent memory, device memory or similar. Passing non-ordinary memory to the runtime by the means of pointer operations or native interop will result in Undefined Behavior. | |
| As of this writing there is no specific support for operating with incoherent memory, device memory or similar. Passing non-ordinary memory to the runtime by the means of pointer operations or native interop results in Undefined Behavior. |
Nit: The rest of the doc is present tense
There was a problem hiding this comment.
Style nit: The doc still has a mix of present and future tenses.
There was a problem hiding this comment.
Do we also want to mention that reading and storing object references and unmanaged pointers is atomic?
There was a problem hiding this comment.
Object references are always aligned. We can mention that (if it is already not mentioned).
But unmanaged pointers are just numbers with dereference operation defined. They may not be aligned, then reading/storing is not necessarily atomic.
There was a problem hiding this comment.
Ah, you mean the pointers themselves. When managed by the runtime the pointers and references are "properly aligned" and thus read/writes are atomic.
There was a problem hiding this comment.
I meant the value of unmanaged pointer. Unmanaged pointer is not a primitive type, so the current wording does not cover it.
There was a problem hiding this comment.
Yes, it may be worth mentioning this. It follows from the "properly aligned" but pointers/references is a very common scenario.
|
https://github.com/dotnet/runtime/tree/main/docs/design/specs may be better location for this doc. It is not CoreCLR specific (botr is generally CoreCLR specific) and it augments ECMA spec (augments of ECMA spec live under docs/design/specs). |
There was a problem hiding this comment.
Might be worth calling out that this is explicitly different from Ecma 335, which only has this guarantee if all accesses are the same size.
There was a problem hiding this comment.
It is the same about dependent reads - ECMA does not guarantee it. I am not sure pointing to every small difference will make the document easier to read.
Not a lot of people would read ECMA spec prior reading this doc anyways.
There was a problem hiding this comment.
What about members of structs?
There was a problem hiding this comment.
Fields are variables.
Regardless how you get something that has, for example, type int, it will be aligned accordingly. The runtime (i.e. JIT, GC, TypeSystem) will ensure alignment for all variables under runtime's control.
There was a problem hiding this comment.
FieldOffsetAttribute could possibly introduce unaligned fields.
There was a problem hiding this comment.
Arguably, by using FieldOffsetAttribute you are not letting the runtime to manage locations of the fields.
I will add a note about FieldOffsetAttribute as something that may cause unaligned variables.
There was a problem hiding this comment.
Mention c# volatile keyword in here? Since most readers will commonly use that.
There was a problem hiding this comment.
Technically C# volatile belongs to C# spec, but it is common, so a few notes on that would be useful.
Adding "volatile" on a variable in C# is just a decoration that makes no difference to CLR, but it is a hint to C# compiler itself (and few other compilers) to emit reads and writes of that variable as volatile reads/writes - this is the part that CLR will honor.
There was a problem hiding this comment.
Yeah that would be great as an "aside" here, although it's not a CLR concept it is useful context
There was a problem hiding this comment.
Does any of this map to C# that can be described?
There was a problem hiding this comment.
I do not think C# emits these.
There was a problem hiding this comment.
Would be good to reference the actual attribute
There was a problem hiding this comment.
Same comment wherever CLR is used, is Mono relevant?
There was a problem hiding this comment.
I think Mono is the same here.
There was a problem hiding this comment.
I will be changing CLR reference to .Net, as discussed a few pages above. This doc tries to be applicable to all common implementations of .Net
(we will treat disagreements with the final form of this doc as bugs, if such disagreements are found)
There was a problem hiding this comment.
More precisely "will become observable not later than effects of accessing any member of the type..." or something like that.
Probably needs some re-wording.
Easy way to see it as if there is a lock that is taken when static constructor runs and releasing the lock is a release. The lock also guarantees that static constructor runs only once.
The implementation is not always a lock though, sometimes static constructors run at JIT time, NativeAOT may not run some static constructors at all and just use precomputed results embedded in the binary. The ordering result is the same.
There was a problem hiding this comment.
No, but people keep bringing this up as a concern. - "I heard that some CPUs do not honor data dependency" - Yes, some flavors of Alphas had that issue, possibly a mistake in the design, no, noone does that again.
There was a problem hiding this comment.
@danmoseley - here is somehting that could differ on Mono. Mono has a switch to turn this off. It is not the default behavior though, so I do not think it needs to be mentioned.
Co-authored-by: Jan Kotas <jkotas@microsoft.com> Co-authored-by: Dan Moseley <danmose@microsoft.com>
There was a problem hiding this comment.
private readonly object
Then it's consistent with
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/statements/lock#guidelines
There was a problem hiding this comment.
Does this forbid memory-mapped files mapped at multiple virtual addresses?
There was a problem hiding this comment.
Runtime will assume that the memory is ordinary memory. If there is some additional semantics to the writes (as write to multimapped location will update other views), that is an Undefined Behavior. There can be no guarantees from the runtime on what will happen.
It may work as expected in simple cases, but the runtime would not help or prevent that.
There was a problem hiding this comment.
Are "Speculative writes are not allowed" and "Reads cannot be introduced" really consequences of the above assumption, or additional guarantees made by the runtime?
There was a problem hiding this comment.
It is a design choice, but it follows from assuming that ordinary writes/reads can have observable sideeffects.
If you assume that a write may be visible from other threads (unless the location is known to be local to the current thread), then you can't replace
location = actualValue;with
location = someSpeculativeValue;
if (speculative value was wrong)
{
location = actualValue;
}Memory models that require only singlethreaded correctness allow this (ex: ordinary writes in C++), which is often unexpected and may result in subtle bugs in multithreaded scenarios.
There was a problem hiding this comment.
So we are basically saying that the jit cannot introduce a race, but is allowed to remove one?
There was a problem hiding this comment.
For ordinary memory accesses - yes. A program cannot condition its correctness on a nondeterministic race, that is not guaranteed to ever happen, so we can remove races. Adding a race would be observable.
There was a problem hiding this comment.
Note that there are rare scenarios when a program explicitly wants to observe races - like making multiple or repeated reads of the same location and checking if the value has changed.
In such case it needs to use Volatile.Read, or a proposed compiler fence (#75874)
There was a problem hiding this comment.
Maybe off-topic, but doesn't this also need compiler memory barriers to be effective?
e.g.
runtime/src/libraries/System.Threading/tests/InterlockedTests.cs
Lines 677 to 682 in 118a162
There was a problem hiding this comment.
Yes. Process-wide barrier is typically paired with a compiler barrier. A not-inlineable getter or a volatile read can work as such, but maybe it is time to have an official compiler barrier helper.
There was a problem hiding this comment.
localObj.ToString() might not be a good example, because .ToString() could possibly read static fields that were written by the constructor (without using volatile). Static field reads are not dependent reads, and do not have the ordering guarantee. Probably better to explicitly read some instance field here.
There was a problem hiding this comment.
The example is expected to work correctly even if .ToString accesses string's static fields.
it would be highly inconvenient if it did not work as caller does not often know these details.
Static field reads are not dependent reads
The rule about static constructors above tells that the cctor runs before any member is accessed and it would be useless to guarantee that static constructor "runs before" accessing without implying that the access will see the complete results.
The mechanism will vary and often the results of the static constructors will be computed well in advance before the actual access (at JIT time or Load time).
As a last resort this will be a dependent read from the type-attached storage, and the type would be derived from the instance, so there is a dependency chain from the instance to the static data.
There was a problem hiding this comment.
I didn't mean static constructor. Consider
sealed class C
{
//There may only be one instance, so why not make every field static?
static string name;
private C()
{
name = "My Name";
}
public override string ToString()
{
return name;//should always return "My Name"
}
static readonly object lockObj = new object();
static C c;
//Thread 1 and 2
public static void PrintSingleton()
{
if (c is null)
{
lock (lockObj)
{
c ??= new C();
}
}
Console.WriteLine(c.ToString());
}
}There is no dependency between reading C.c in C.PrintSingleton() and reading C.name in C.ToString(). Both threads may not print "My Name".
There was a problem hiding this comment.
That is just a write to some shared memory from the constructor. It could be a static variable in another class or unmanaged memory. Constructor can do many things. It can also launch a thread that will write to statics or it can do c = this or lockObj = null.
Having a constructor with sideeffects other than constructing and initializing the instance would make the type a poor choice for the singleton pattern.
The sample here is not about that, but about guarantees provided by the runtime.
I will add a definition of a simple MyClass so there is no confusion about what constructor does.
There was a problem hiding this comment.
It may be better to use volatile read here to ensure side effects of the constructor, e.g. static field writes, are visible. LazyInitializer currently also uses volatile reads.
There was a problem hiding this comment.
No need for volatile reads. Any data written directly or indirectly prior to the publishing of the instance will be visible via the instance. (see comments about the statics in the previous comment)
There was a problem hiding this comment.
It may be better to use volatile read here. See above for reasons.
There was a problem hiding this comment.
Reads cannot be introduced
Does this mean the following RyuJit behavior is a bug?
[MethodImpl(MethodImplOptions.NoInlining)]
private static int Problem(int* p)
{
var b = *p == 1;
var c = b;
if (b)
{
JitUse(c);
return 2;
}
return 1;
}
[MethodImpl(MethodImplOptions.NoInlining)]
public static void JitUse<T>(T arg) { }IN000a: 000000 sub rsp, 40
G_M51150_IG02: ; offs=000004H, size=000DH, bbWeight=1 PerfScore 8.25, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, BB01 [0000], byref, isz
IN0001: 000004 xor eax, eax
IN0002: 000006 cmp dword ptr [rcx], 1 ; The original load
IN0003: 000009 sete al
IN0004: 00000C cmp dword ptr [rcx], 1 ; The duplicated load
IN0005: 00000F jne SHORT G_M51150_IG05
G_M51150_IG03: ; offs=000011H, size=000DH, bbWeight=0.50 PerfScore 1.75, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, BB02 [0001], byref
IN0006: 000011 mov ecx, eax
IN0007: 000013 call [RyuJitReproduction.Program:JitUse(bool)]
IN0008: 000019 mov eax, 2
G_M51150_IG04: ; offs=00001EH, size=0005H, bbWeight=0.50 PerfScore 0.62, epilog, nogc, extend
IN000b: 00001E add rsp, 40
IN000c: 000022 ret
G_M51150_IG05: ; offs=000023H, size=0005H, bbWeight=0.50 PerfScore 0.12, gcVars=0000000000000000 {}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, BB03 [0002], gcvars, byref
IN0009: 000023 mov eax, 1
G_M51150_IG06: ; offs=000028H, size=0005H, bbWeight=0.50 PerfScore 0.62, epilog, nogc, extend
IN000d: 000028 add rsp, 40
IN000e: 00002C ret (This particular example needs DOTNET_JitNoCSE=1 to be reproduced)
There was a problem hiding this comment.
If DOTNET_JitNoCSE is a supported scenario, then this would seem like a bug.
"b == c" in the source and compiler introduced a race. @AndyAyersMS
There was a problem hiding this comment.
JitNoCSE was just to create an easy and small sample; without it the compiler would have folded the duplicated load back, of course, that's not guaranteed in the general case.
Opened #75916 to track this.
There was a problem hiding this comment.
Yes, we should avoid duplicating reads like this.
|
This pull request has been automatically marked |
|
@VSadov I think it would be a good idea to merge this doc even though if it is not 100% finished. Could you please mark the few places with open questions, and address any remaining feedback? |
|
@jkotas Yes, I was going to ask the same question. I think there are couple questions (like provide some details/examples in couple places), but otherwise there is nothing major and actionable. I will go through the feedback once more, address remaining questions and I think we can wrap this up. |
|
@jkotas - I think I addressed all the remaining actionable feedback. |
|
I believe that we have identified a few open questions where we are not sure about the current CoreCLR JIT being compliant with what's written in the doc and we are not sure about the cost to make it compliant. (@markples has a work item to investigate one of them.) Can we mention them in the doc? |
|
@jkotas There was a concern about compiler optimization honoring ordering of data-dependent reads (and writes to object fields to a lesser degree). We expect that from the hardware (for writes we insert fences if needed). The question was whether compiler itself may violate such ordering in some optimizations and whether it is costly to prevent that if it does. I think the concern was mostly about reads as writing has obvious sideeffects, which limits what optimizations can do. Preliminary conclusion was that compiler honors data dependency because the internal representations are tree-based. If that does not hold, we will fix the compiler, ... or the document. I hope it just holds. At very least it would be unintuitive. |
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
We should also make sure that this holds for Mono LLVM-based codegen. |
|
Could you please include a link in the doc? |
|
Thanks!!! |
|
|
||
| There was a lot of ambiguity around the guarantees provided by object assignments. Going forward the runtimes will only provide the guarantees described in this document. | ||
|
|
||
| _It is believed that compiler optimizations do not violate the ordering guarantees in sections about [data-dependent reads](~#data-dependent-reads-are-ordered) and [object assignments](~#object-assignment), but further investigations are needed to ensure compliance or to fix possible violations. That is tracked by the following issue:_ https://github.com/dotnet/runtime/issues/79764 |
There was a problem hiding this comment.
The given links lead to HTTP 404.
https://github.com/dotnet/runtime/blob/main/docs/design/specs/~
Is the resulting url.
Edit: created #79785 to fix this.
A document describing memory model of .NET runtimes.
Fixes #63474