47 changes: 26 additions & 21 deletions docs/design/coreclr/jit/jit-call-morphing.md
@@ -27,11 +27,13 @@ new LclVar to preserve the order of evaluation rule.
Each argument is an arbitrary expression tree. The JIT tracks a summary of observable side-effects
using a set of five bit flags in every GenTree node: `GTF_ASG`, `GTF_CALL`, `GTF_EXCEPT`, `GTF_GLOB_REF`,
and `GTF_ORDER_SIDEEFF`. These flags are propagated up the tree so that the top node has a particular
flag set if any of its child nodes has the flag set. Decisions about whether to evaluate arguments
into temp LclVars are made by examining these flags on each of the arguments.
flag set if any of its child nodes has the flag set. Decisions about whether we need to take special
care to evaluate arguments in order are made by examining these flags on each of the arguments.
Typically, evaluating an argument early means creating a temporary local and assigning it as part of
the early list in the GenTreeCall node.
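
The bottom-up flag summary described above can be sketched in a few lines. This is only an illustration, not the JIT's code: `Node`, `PropagateFlags`, and the `k*` constants are hypothetical stand-ins for `GenTree` and the `GTF_*` flags, and the bit values are made up.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-ins for the GTF_* side-effect flags. The bit values are
// illustrative only and do not match the JIT's real definitions.
constexpr uint32_t kAsg          = 1u << 0;
constexpr uint32_t kCall         = 1u << 1;
constexpr uint32_t kExcept       = 1u << 2;
constexpr uint32_t kGlobRef      = 1u << 3;
constexpr uint32_t kOrderSideEff = 1u << 4;

// Simplified tree node; the real GenTree is far richer.
struct Node
{
    uint32_t ownFlags; // effects introduced by this operator itself
    uint32_t flags;    // summary: own effects OR-ed with all descendants
    std::vector<std::unique_ptr<Node>> children;

    explicit Node(uint32_t own) : ownFlags(own), flags(0) {}

    // Appends a child with the given own effects and returns a reference to it.
    Node& AddChild(uint32_t own)
    {
        children.push_back(std::make_unique<Node>(own));
        return *children.back();
    }
};

// Recomputes the summary flags for the subtree rooted at 'node' and returns
// them, so a parent has a bit set whenever any descendant has it set.
uint32_t PropagateFlags(Node& node)
{
    uint32_t summary = node.ownFlags;
    for (const std::unique_ptr<Node>& child : node.children)
    {
        assert(child != nullptr);
        summary |= PropagateFlags(*child);
    }
    node.flags = summary;
    return summary;
}
```

With such a summary on the root of each argument tree, a single bit test on the root is enough to tell whether anything inside the argument calls, throws, or touches global state.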


*Our design goal for call sites is to create a few temp LclVars as possible, while preserving the
*Our design goal for call sites is to create as few temp LclVars as possible, while preserving the
order of evaluation rules of IL and C#.*


@@ -77,27 +79,27 @@ them there while pushing some new arguments for a nested call. Thus we allow ne
calls for x86 but do not allow them for the other architectures.


Rules for when Arguments must be evaluated into temp LclVars
Rules for when Arguments must be evaluated early
-----------------

During the first Morph phase known as global Morph we call `CallArgs::ArgsComplete()`
after we have completed determining ABI information for each arg. This method applies
the following rules:

1. When an argument is marked as containing an assignment using `GTF_ASG`, then we
force all previous non-constant arguments to be evaluated into temps. This is very
force all previous non-constant arguments to be evaluated early. This is very
conservative, but at this phase of the JIT it is rare to have an assignment subtree
as part of an argument.
2. When an argument is marked as containing a call using the `GTF_CALL` flag, then
we force that argument and any previous argument that is marked with any of the
`GTF_ALL_EFFECT` flags into temps.
`GTF_ALL_EFFECT` flags to be evaluated early.
* Additionally, for `FEATURE_FIXED_OUT_ARGS`, any previous stack based args that
we haven't marked as needing a temp but still need to store in the outgoing args
area is marked as needing a placeholder temp using `needPlace`.
3. We force any arguments that use `localloc` to be evaluated into temps.
we haven't marked as needing early evaluation but still need to store in the outgoing
args area are marked as needing a placeholder temp using `needPlace`.
3. We force any arguments that use `localloc` to be evaluated early.
4. We mark any address taken locals with the `GTF_GLOB_REF` flag. For two special
cases we call `SetNeedsTemp()` and set up the temp in `fgMorphArgs`. `SetNeedsTemp`
records the tmpNum used and sets `isTmp` so that we handle it like the other temps.
cases we call `CallArgs::SetTemp()` and set up the temp earlier in `fgMorphArgs`.
`CallArgs::SetTemp` records the tmpNum used and sets `isTmp` so that we handle it like the other temps.
The special cases are for `GT_MKREFANY` and for a `TYP_STRUCT` argument passed by
value when we can't optimize away the extra copy.
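
As a rough illustration of rules 1 and 2 only, the marking pass might look like the sketch below. It follows the wording of the rules rather than the actual `CallArgs::ArgsComplete()` implementation. `Arg`, `MarkEarlyArgs`, and the `k*` constants are hypothetical, and treating `kAllEffect` as the union of the five side-effect bits is an assumption about `GTF_ALL_EFFECT`. Rules 3 and 4 and the `FEATURE_FIXED_OUT_ARGS` placeholder handling are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the GTF_* flags; bit values are illustrative.
constexpr uint32_t kAsg          = 1u << 0;
constexpr uint32_t kCall         = 1u << 1;
constexpr uint32_t kExcept       = 1u << 2;
constexpr uint32_t kGlobRef      = 1u << 3;
constexpr uint32_t kOrderSideEff = 1u << 4;
// Assumption: "all effects" is the union of the five side-effect bits.
constexpr uint32_t kAllEffect = kAsg | kCall | kExcept | kGlobRef | kOrderSideEff;

// Simplified per-argument record (hypothetical, not the JIT's CallArg).
struct Arg
{
    uint32_t flags;         // summary side-effect flags of the argument tree
    bool     isConstant;    // true if the argument is a constant
    bool     evaluateEarly; // output: must be evaluated in the early list
};

void MarkEarlyArgs(std::vector<Arg>& args)
{
    for (size_t i = 0; i < args.size(); i++)
    {
        // Rule 1: an assignment forces every previous non-constant argument
        // to be evaluated early.
        if ((args[i].flags & kAsg) != 0)
        {
            for (size_t j = 0; j < i; j++)
            {
                if (!args[j].isConstant)
                {
                    args[j].evaluateEarly = true;
                }
            }
        }

        // Rule 2: a call forces this argument, and every previous argument
        // with any side effect, to be evaluated early.
        if ((args[i].flags & kCall) != 0)
        {
            args[i].evaluateEarly = true;
            for (size_t j = 0; j < i; j++)
            {
                if ((args[j].flags & kAllEffect) != 0)
                {
                    args[j].evaluateEarly = true;
                }
            }
        }
    }
}
```

Note the asymmetry: the assignment rule is keyed on "non-constant" while the call rule is keyed on "has side effects", which is why the text calls rule 1 very conservative.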

@@ -120,26 +122,29 @@ LclFlds and put them before the constant args.
to the least complex.


Evaluating Args into new LclVar temps and the creation of the LateArgs
Evaluating Args early and the creation of the LateArgs
-----------------

After calling `SortArgs()`, the `EvalArgsToTemps()` method is called to create
After calling `SortArgs()`, the `EvalArgsEarly()` method is called to create
the temp assignments and to populate the LateArgs list.

For arguments that are marked as needing a temp:
For arguments that are marked as requiring early evaluation:
-----------------

1. We create an assignment using `gtNewTempAssign`. This assignment replaces
the original argument in the early argument list. After we create the assignment
the argument is marked with `m_isTmp = true`.
2. Arguments that are already marked with `m_isTmp` are treated similarly as
above except we don't create an assignment for them.
3. A `TYP_STRUCT` argument passed by value will have `m_isTmp` set to true
and will use a `GT_COPYBLK` or a `GT_COPYOBJ` to perform the assignment of the temp.
4. The assignment node or the CopyBlock node is referred to as `arg1 SETUP` in the JitDump.
3. Some arguments may not need to have temps created, e.g. for a comma
with an invariant effective value but side effects in it. The side effects may
be extracted directly to the setup node and the invariant node put in the late arg list.
4. A `TYP_STRUCT` argument passed by value will have `m_isTmp` set to true
and will use a block copy to perform the assignment of the temp.
5. The store or the CopyBlock node is referred to as `arg1 SETUP` in the JitDump.


For arguments that are marked as not needing a temp:
For arguments that are marked as not requiring early evaluation:
-----------------

1. If this is an argument that is passed in a register, then the existing
@@ -152,6 +157,6 @@ evaluated directly into the outgoing arg area or pushed on the stack.
After the Call node is fully morphed the LateArgs list will contain the arguments
passed in registers as well as additional ones for `m_needPlace` marked
arguments whenever we have a nested call for a stack based argument.
When `m_needTmp` is true the LateArg will be a LclVar that was created
to evaluate the arg (single-def/single-use). When `m_needTmp` is false
the LateArg can be an arbitrary expression tree.
When `m_evaluateEarly` is true the LateArg will be a LclVar that was created
to evaluate the arg (single-def/single-use) or a simple invariant tree.
When `m_evaluateEarly` is false the LateArg can be an arbitrary expression tree.
2 changes: 2 additions & 0 deletions src/coreclr/jit/compiler.h
@@ -2498,6 +2498,8 @@ class Compiler

GenTree* gtNewConWithPattern(var_types type, uint8_t pattern);

GenTree* gtNewGenericCon(var_types type, uint8_t* cnsVal);

GenTreeLclVar* gtNewStoreLclVarNode(unsigned lclNum, GenTree* data);

GenTreeLclFld* gtNewStoreLclFldNode(
98 changes: 92 additions & 6 deletions src/coreclr/jit/gentree.cpp
@@ -1468,7 +1468,7 @@ CallArgs::CallArgs()
, m_hasRegArgs(false)
, m_hasStackArgs(false)
, m_argsComplete(false)
, m_needsTemps(false)
, m_needsEarlyEvaluation(false)
#ifdef UNIX_X86_ABI
, m_alignmentDone(false)
#endif
@@ -7654,9 +7654,9 @@ GenTree* Compiler::gtNewOneConNode(var_types type, var_types simdBaseType /* = T
}

//------------------------------------------------------------------------
// CreateInitValue:
// Create an IR node representing a constant value with the specified 8
// byte character broadcast into all of its bytes.
// gtNewConWithPattern:
// Create an IR node representing a constant value with the specified byte
// broadcast into all of its bytes.
//
// Parameters:
// type - The primitive type. For small types the constant will be
@@ -7718,6 +7718,92 @@ GenTree* Compiler::gtNewConWithPattern(var_types type, uint8_t pattern)
}
}

//------------------------------------------------------------------------
// gtNewGenericCon:
// Create an IR node representing a constant value of any type.
//
// Parameters:
// type - The primitive type. For small types the constant will be
// zero/sign-extended and a TYP_INT node will be returned.
// cnsVal - Pointer to data
//
// Returns:
// An IR node representing the constant.
//
GenTree* Compiler::gtNewGenericCon(var_types type, uint8_t* cnsVal)
{
switch (type)
{
#define READ_VALUE(typ) \
typ val; \
memcpy(&val, cnsVal, sizeof(typ));

case TYP_BYTE:
{
READ_VALUE(int8_t);
return gtNewIconNode(val);
}
case TYP_BOOL:
case TYP_UBYTE:
{
READ_VALUE(uint8_t);
return gtNewIconNode(val);
}
case TYP_SHORT:
{
READ_VALUE(int16_t);
return gtNewIconNode(val);
}
case TYP_USHORT:
{
READ_VALUE(uint16_t);
return gtNewIconNode(val);
}
case TYP_INT:
{
READ_VALUE(int32_t);
return gtNewIconNode(val);
}
case TYP_LONG:
{
READ_VALUE(int64_t);
return gtNewLconNode(val);
}
case TYP_FLOAT:
{
READ_VALUE(float);
return gtNewDconNode(val, TYP_FLOAT);
}
case TYP_DOUBLE:
{
READ_VALUE(double);
return gtNewDconNode(val);
}
case TYP_REF:
case TYP_BYREF:
{
READ_VALUE(target_ssize_t);
return gtNewIconNode(val, type);
}
#ifdef FEATURE_SIMD
case TYP_SIMD8:
case TYP_SIMD12:
case TYP_SIMD16:
#if defined(TARGET_XARCH)
case TYP_SIMD32:
case TYP_SIMD64:
#endif // TARGET_XARCH
{
return gtNewVconNode(type, cnsVal);
}
#endif // FEATURE_SIMD
default:
unreached();

#undef READ_VALUE
}
}
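
The `READ_VALUE` pattern above (copy `sizeof(typ)` bytes out of an untyped buffer with `memcpy`, then let the node factory widen the value) can be tried in isolation. The sketch below is standalone and does not use any JIT types. `ReadValue` and `ReadWidened` are hypothetical helpers, and widening to `int32_t` stands in for small types ending up in a `TYP_INT` constant node.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads a T from a possibly unaligned byte buffer. memcpy avoids the
// alignment and aliasing problems of casting the pointer directly.
template <typename T>
T ReadValue(const uint8_t* buffer)
{
    assert(buffer != nullptr);
    T value;
    std::memcpy(&value, buffer, sizeof(T));
    return value;
}

// Reads a small integer of type T and widens it to 32 bits. Signed T is
// sign-extended and unsigned T is zero-extended by the usual conversions.
template <typename T>
int32_t ReadWidened(const uint8_t* buffer)
{
    return static_cast<int32_t>(ReadValue<T>(buffer));
}
```

Reading the same byte `0xFF` as `int8_t` and as `uint8_t` gives -1 and 255 respectively after widening, which is why the switch needs separate cases for the signed and unsigned small types.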

GenTreeLclVar* Compiler::gtNewStoreLclVarNode(unsigned lclNum, GenTree* data)
{
LclVarDsc* varDsc = lvaGetDesc(lclNum);
@@ -9296,7 +9382,7 @@ void CallArgs::InternalCopyFrom(Compiler* comp, CallArgs* other, CopyNodeFunc co
m_hasRegArgs = other->m_hasRegArgs;
m_hasStackArgs = other->m_hasStackArgs;
m_argsComplete = other->m_argsComplete;
m_needsTemps = other->m_needsTemps;
m_needsEarlyEvaluation = other->m_needsEarlyEvaluation;

// Unix x86 flags related to stack alignment intentionally not copied as
// they depend on where the call will be inserted.
@@ -9311,7 +9397,7 @@ void CallArgs::InternalCopyFrom(Compiler* comp, CallArgs* other, CopyNodeFunc co
carg->m_tmpNum = arg.m_tmpNum;
carg->m_signatureType = arg.m_signatureType;
carg->m_wellKnownArg = arg.m_wellKnownArg;
carg->m_needTmp = arg.m_needTmp;
carg->m_evaluateEarly = arg.m_evaluateEarly;
carg->m_needPlace = arg.m_needPlace;
carg->m_isTmp = arg.m_isTmp;
carg->m_processed = arg.m_processed;
17 changes: 9 additions & 8 deletions src/coreclr/jit/gentree.h
@@ -4571,8 +4571,9 @@ class CallArg
var_types m_signatureType : 5;
// The type of well-known argument this is.
WellKnownArg m_wellKnownArg : 5;
// True when we force this argument's evaluation into a temp LclVar.
bool m_needTmp : 1;
// True if this argument needs to be evaluated in the early list (usually
// by introducing a copy into a temp LclVar).
bool m_evaluateEarly : 1;
// True when we must replace this argument with a placeholder node.
bool m_needPlace : 1;
// True when we setup a temp LclVar for this argument.
@@ -4590,7 +4591,7 @@ class CallArg
, m_tmpNum(BAD_VAR_NUM)
, m_signatureType(TYP_UNDEF)
, m_wellKnownArg(WellKnownArg::None)
, m_needTmp(false)
, m_evaluateEarly(false)
, m_needPlace(false)
, m_isTmp(false)
, m_processed(false)
@@ -4679,8 +4680,8 @@ class CallArgs
// True if we have one or more stack arguments.
bool m_hasStackArgs : 1;
bool m_argsComplete : 1;
// One or more arguments must be copied to a temp by EvalArgsToTemps.
bool m_needsTemps : 1;
// One or more arguments must be evaluated early by EvalArgsEarly.
bool m_needsEarlyEvaluation : 1;
#ifdef UNIX_X86_ABI
// Updateable flag, set to 'true' after we've done any required alignment.
bool m_alignmentDone : 1;
@@ -4736,8 +4737,8 @@ class CallArgs
void AddFinalArgsAndDetermineABIInfo(Compiler* comp, GenTreeCall* call);

void ArgsComplete(Compiler* comp, GenTreeCall* call);
void EvalArgsToTemps(Compiler* comp, GenTreeCall* call);
void SetNeedsTemp(CallArg* arg);
void EvalArgsEarly(Compiler* comp, GenTreeCall* call);
void SetEvaluateEarly(CallArg* arg);
bool IsNonStandard(Compiler* comp, GenTreeCall* call, CallArg* arg);

GenTree* MakeTmpArgNode(Compiler* comp, CallArg* arg);
@@ -4753,7 +4754,7 @@ class CallArgs
bool AreArgsComplete() const { return m_argsComplete; }
bool HasRegArgs() const { return m_hasRegArgs; }
bool HasStackArgs() const { return m_hasStackArgs; }
bool NeedsTemps() const { return m_needsTemps; }
bool NeedsEarlyEvaluation() const { return m_needsEarlyEvaluation; }

#ifdef UNIX_X86_ABI
void ComputeStackAlignment(unsigned curStackLevelInBytes)
88 changes: 15 additions & 73 deletions src/coreclr/jit/importer.cpp
@@ -3637,86 +3637,28 @@ GenTree* Compiler::impImportStaticReadOnlyField(CORINFO_FIELD_HANDLE field, CORI
//
GenTree* Compiler::impImportCnsTreeFromBuffer(uint8_t* buffer, var_types valueType)
{
GenTree* tree = nullptr;
switch (valueType)
GenTree* tree;
if (valueType == TYP_REF)
{
// Use memcpy to read from the buffer and create an Icon/Dcon tree
#define CreateTreeFromBuffer(type, treeFactory) \
type v##type; \
memcpy(&v##type, buffer, sizeof(type)); \
tree = treeFactory(v##type);

case TYP_BOOL:
{
CreateTreeFromBuffer(bool, gtNewIconNode);
break;
}
case TYP_BYTE:
{
CreateTreeFromBuffer(int8_t, gtNewIconNode);
break;
}
case TYP_UBYTE:
{
CreateTreeFromBuffer(uint8_t, gtNewIconNode);
break;
}
case TYP_SHORT:
target_ssize_t ptr;
memcpy(&ptr, buffer, sizeof(ptr));
if (ptr == 0)
{
CreateTreeFromBuffer(int16_t, gtNewIconNode);
break;
}
case TYP_USHORT:
{
CreateTreeFromBuffer(uint16_t, gtNewIconNode);
break;
}
case TYP_UINT:
case TYP_INT:
{
CreateTreeFromBuffer(int32_t, gtNewIconNode);
break;
}
case TYP_LONG:
case TYP_ULONG:
{
CreateTreeFromBuffer(int64_t, gtNewLconNode);
break;
tree = gtNewNull();
}
case TYP_FLOAT:
{
CreateTreeFromBuffer(float, gtNewDconNode);
break;
}
case TYP_DOUBLE:
else
{
CreateTreeFromBuffer(double, gtNewDconNode);
break;
setMethodHasFrozenObjects();
tree = gtNewIconEmbHndNode((void*)(ssize_t)ptr, nullptr, GTF_ICON_OBJ_HDL, nullptr);
tree->gtType = TYP_REF;
INDEBUG(tree->AsIntCon()->gtTargetHandle = ptr);
}
case TYP_REF:
{
size_t ptr;
memcpy(&ptr, buffer, sizeof(ssize_t));

if (ptr == 0)
{
tree = gtNewNull();
}
else
{
setMethodHasFrozenObjects();
tree = gtNewIconEmbHndNode((void*)ptr, nullptr, GTF_ICON_OBJ_HDL, nullptr);
tree->gtType = TYP_REF;
INDEBUG(tree->AsIntCon()->gtTargetHandle = ptr);
}
break;
}
default:
return nullptr;
}
else
{
tree = gtNewGenericCon(valueType, buffer);
}

assert(tree != nullptr);
tree->gtType = genActualType(valueType);
return tree;
}
