The current implementation of ABI (Application Binary Interface, aka calling convention) support in RyuJIT is problematic in a number of areas, especially when it comes to the handling of structs (aka value types).

- RyuJIT currently supports 4 target architectures: x86, x64 (aka x86-64), ARM and ARM64, with two different ABIs for x64 (Windows and Linux). These each have unique requirements, yet these requirements are expressed in the code programmatically, with `#ifdef`s, and even where the requirements are shared, they are often handled in different code paths.
- When passing or returning structs, the code generator sometimes requires that the struct be copied to or from memory. The morpher (`fgMorphArgs()`) attempts to discern these cases and create copies when necessary, but sometimes it makes copies when they aren't needed.
- Even in cases where the code generator currently requires the struct to be in memory, it could be enhanced to handle the in-register case:
  - Currently, when we have a register-passed struct that fits in a register, but that doesn't have a single field of a matching type, `fgMorphArgs()` generates a `GT_LCL_FLD` of the appropriate scalar type to reference the value. This forces the struct to be marked `lvDoNotEnregister`. However, the backend has support for performing the necessary move in some cases (e.g. when a struct with a single field of `TYP_DOUBLE` is passed in an integer register as `TYP_LONG`), by generating a `GT_BITCAST` to move the value to the appropriate register.
  - In other cases (e.g. a struct with two `TYP_INT` fields in registers), the backend should be able to generate the necessary code to place the fields in the necessary register(s).
- Even when the requirements are similar, the IR representation, as well as the transformations performed by `fgMorphArgs()`, are not the same.
- Much of the information about each argument is contained in the `fgArgInfo` on the `GT_CALL` node. It in turn contains an `argTable` with an entry for each argument. However, this information is not complete, especially on x64/Linux where repeated calls are made to the VM to obtain the struct descriptor.
- The functionality of `fgMorphArgs()` combines the determination of the ABI requirements, which sets up the `fgArgInfo` and `argTable`, with the IR transformations required to ensure that the arguments of the `GT_CALL` are in the appropriate form.
- When `fgCanFastTailCall()` is called, it doesn't yet have the `fgArgInfo`, so it must duplicate some of the analysis that is done in `fgMorphArgs()`.

This is a preliminary design, and is likely to change as the implementation proceeds:

First, the `fgArgInfo` is extended to contain all the information needed to determine how an argument is passed. Ideally, most of the `#ifdef`s relating to ABI differences can be eliminated by querying the `fgArgInfo`. Most of the information will be queried via properties, such that when a target doesn't support a particular struct passing mechanism (e.g. passing structs by reference), the property will unconditionally return false, and the associated code paths will be eliminated.
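
The property-based query described above could look roughly like the following. This is only a sketch under stated assumptions: `ArgEntrySketch`, `isPassedByRef()`, `countByRefArgs()` and the `SKETCH_TARGET_PASSES_STRUCTS_BY_REF` macro are hypothetical names invented for illustration and are not the actual RyuJIT declarations.

```cpp
#include <cassert>

// Hypothetical stand-in for a build-time target macro (assumption, not a JIT macro).
#ifndef SKETCH_TARGET_PASSES_STRUCTS_BY_REF
#define SKETCH_TARGET_PASSES_STRUCTS_BY_REF 0
#endif

// Hypothetical, much-reduced per-argument record (not the real fgArgInfo entry).
struct ArgEntrySketch
{
    bool isStruct       = false;
    bool passedByRefBit = false; // only meaningful on targets that support it

    // Property-style query: on targets without by-reference struct passing,
    // this is a constant false, so callers need no #ifdef of their own.
    bool isPassedByRef() const
    {
#if SKETCH_TARGET_PASSES_STRUCTS_BY_REF
        return isStruct && passedByRefBit;
#else
        return false;
#endif
    }
};

// Target-independent caller: when the property is constant false, the
// by-reference branch is dead code that the compiler can remove.
inline int countByRefArgs(const ArgEntrySketch* args, int count)
{
    assert(count >= 0);
    int byRefCount = 0;
    for (int i = 0; i < count; i++)
    {
        if (args[i].isPassedByRef())
        {
            byRefCount++;
        }
    }
    return byRefCount;
}
```

The single `#if` lives inside the property, so every consumer of the entry is written once for all targets.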

The initial determination of the number of arguments and how they are passed is extracted from `fgMorphArgs()` into a separate method: `gtInitArgInfo()`. It is idempotent - that is, it can be re-invoked and will simply return if it has already been called. It can be called by `fgCanFastTailCall()` so that it can query the `argTable` to get the information it requires.
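
A minimal sketch of the idempotent setup follows. All names here (`CallSketch`, `initArgInfoSketch()`, `canFastTailCallSketch()`, `initRuns`) are hypothetical, and the argument-count test used for the tail call check is an assumption made purely so the example has something to query.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-argument entry; the real argTable entries carry far more.
struct ArgEntrySketch
{
    int argNum;
};

// Hypothetical stand-in for a call node plus its argument info.
struct CallSketch
{
    int                         argCount           = 0;
    bool                        argInfoInitialized = false;
    std::vector<ArgEntrySketch> argTable;
    int                         initRuns = 0; // illustration only: counts real initializations
};

// Idempotent: the first call builds the table, later calls return immediately.
inline void initArgInfoSketch(CallSketch& call)
{
    if (call.argInfoInitialized)
    {
        return;
    }
    assert(call.argCount >= 0);
    call.argTable.clear();
    for (int i = 0; i < call.argCount; i++)
    {
        call.argTable.push_back(ArgEntrySketch{i});
    }
    call.argInfoInitialized = true;
    call.initRuns++;
}

// An early consumer can trigger the setup itself instead of duplicating the analysis.
// The "fits in maxArgs" criterion is an assumption for illustration.
inline bool canFastTailCallSketch(CallSketch& call, int maxArgs)
{
    initArgInfoSketch(call);
    return static_cast<int>(call.argTable.size()) <= maxArgs;
}
```

Whichever caller arrives first pays for the analysis; the later caller sees the same table without redoing the work.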

This method is responsible for the first part of what is currently `fgMorphArgs()`, plus setting up the `argTable`:

- Count the number of args.
- Create any non-standard args (e.g. indirection cells or cookie parameters) that are needed, but don't yet create copies.
- Create the `argTable` for the given number of args.
- Initialize the `fgArgInfo` for each arg, with all the information about how the arg is passed, and whether it requires a temp, but don't yet create any temps.
  - On x64/Linux, this is the only method that should need to consult the struct descriptor for outgoing arguments.
  - The `isProcessed` flag remains false until `fgMorphArgs()` has handled the arg.
  - The `fgArgInfo` contains an array of register numbers (sized according to the maximum number of registers used for a single argument). If the first register is `REG_STK`, the argument is passed entirely on the stack. For most targets, if the first entry is an actual register, the argument is passed entirely in registers. When arguments can be split (`_TARGET_ARM_`), this will be indicated with an `isSplit` property of `true`.
    - Note that the `isSplit` property would evaluate to false on targets where it is not supported, reducing the need for `ifdef`s (we can rely on the compiler to eliminate those dead paths).
- Validate that each struct argument is either a `GT_LCL_VAR`, a `GT_OBJ`, or a `GT_MKREFANY`.
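
The register-array encoding in the list above can be sketched as follows. The enum values, `MAX_ARG_REGS_SKETCH`, the `SKETCH_TARGET_SPLITS_ARGS` macro and the `numStackSlots` field are assumptions for illustration; only the idea of a `REG_STK` sentinel in the first slot and an `isSplit` property comes from the text.

```cpp
#include <cassert>

// Hypothetical register numbering; REG_STK_SKETCH plays the role of REG_STK.
enum RegNumSketch
{
    REG_STK_SKETCH = -1,
    REG_R0         = 0,
    REG_R1,
    REG_R2,
    REG_R3
};

// Assumed maximum number of registers used for a single argument.
constexpr int MAX_ARG_REGS_SKETCH = 4;

// Hypothetical stand-in for a target that can split args (e.g. ARM32).
#ifndef SKETCH_TARGET_SPLITS_ARGS
#define SKETCH_TARGET_SPLITS_ARGS 0
#endif

struct ArgEntrySketch
{
    RegNumSketch regNums[MAX_ARG_REGS_SKETCH] = {REG_STK_SKETCH, REG_STK_SKETCH, REG_STK_SKETCH,
                                                 REG_STK_SKETCH};
    int          numRegs       = 0;
    int          numStackSlots = 0;

    RegNumSketch getReg(int i) const
    {
        assert(i >= 0 && i < MAX_ARG_REGS_SKETCH);
        return regNums[i];
    }

    // First slot is the stack sentinel: the whole argument lives on the stack.
    bool isPassedOnStack() const
    {
        return getReg(0) == REG_STK_SKETCH;
    }

    // Constant false where splitting is unsupported, so callers need no #ifdef.
    bool isSplit() const
    {
#if SKETCH_TARGET_SPLITS_ARGS
        return (numRegs > 0) && (numStackSlots > 0);
#else
        return false;
#endif
    }

    bool isPassedInRegisters() const
    {
        return !isPassedOnStack() && !isSplit();
    }
};
```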

During the initial `fgMorph` phase, `fgMorphArgs()` does the following:

- Calls `gtInitArgInfo()` to ensure that the `argTable` is set up properly.
- Creates a copy of each argument as necessary.
  - This should only be done if one or more of the following conditions hold:
    - A copy is required to preserve possible ordering dependencies, in which case the `needsTmp` field of the `fgArgInfo` was set to true by `gtInitArgInfo()`.
    - A struct arg has been promoted, it is passed in register(s) (or split), and has not yet been marked `lvDoNotEnregister`.
- Sets up the actual argument for any non-standard args.
- Transforms struct arg nodes from `GT_LCL_VAR`, `GT_OBJ` or `GT_MKREFANY` into:
  - `GT_FIELD_LIST` (i.e. a list of fields) if the lclVar is promoted and either 1) passed on the stack, or 2) each register used to pass the struct corresponds to exactly one field of the struct. The type of the register in which a field is passed need not match the type of the field.
    - The case of a single `GT_FIELD_LIST` node subsumes the current `GT_LCL_FLD` representation for a matching single-field struct, and does not require a lclVar to be marked `lvDoNotEnregister`. Any register type mismatch (e.g. a float field passed in an integer register) will be handled by `Lowering` (see below).
    - In future, this should include any case of a promoted struct, and the backend (`Lowering` and/or `CodeGen`) should be enhanced to correctly perform the needed re-assembling of fields into registers.
  - `GT_LCL_VAR` if the argument is a non-promoted struct that is either marked `lvDoNotEnregister` or fully enregistered, such as a SIMD type lclVar or (in future) a struct that fits entirely into a register.
  - `GT_OBJ` otherwise. In this case, if it is a partial reference to a lclVar, it must be marked `lvDoNotEnregister`. (If it is a full reference to a lclVar, it falls into the `GT_LCL_VAR` case above.) This representation will be used even for structs that are passed as a primitive type (i.e. that currently use the `GT_LCL_FLD` representation).
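
The three-way choice of struct argument form listed above can be condensed into a small decision function. This is a sketch only: `StructArgSketch`, its flags and `chooseStructArgForm()` are hypothetical, and it assumes any copies required by the earlier step have already been made, so it describes the argument as it stands just before the transformation.

```cpp
#include <cassert>

// The three target forms named in the text.
enum class ArgNodeKind
{
    FieldList, // GT_FIELD_LIST
    LclVar,    // GT_LCL_VAR
    Obj        // GT_OBJ
};

// Hypothetical summary of the facts the decision depends on.
struct StructArgSketch
{
    bool isPromoted;        // the lclVar has been promoted into fields
    bool isWholeLclVar;     // the arg references an entire lclVar, not part of one
    bool doNotEnregister;   // lvDoNotEnregister is set
    bool fullyEnregistered; // e.g. a SIMD-typed lclVar
    bool passedOnStack;     // the arg is passed entirely on the stack
    bool oneFieldPerReg;    // each arg register corresponds to exactly one field
};

inline ArgNodeKind chooseStructArgForm(const StructArgSketch& arg)
{
    // Sanity check: a lclVar cannot be both kept in memory and fully enregistered.
    assert(!(arg.doNotEnregister && arg.fullyEnregistered));

    if (arg.isPromoted && (arg.passedOnStack || arg.oneFieldPerReg))
    {
        return ArgNodeKind::FieldList;
    }
    if (!arg.isPromoted && arg.isWholeLclVar && (arg.doNotEnregister || arg.fullyEnregistered))
    {
        return ArgNodeKind::LclVar;
    }
    return ArgNodeKind::Obj;
}
```

For example, a promoted struct passed on the stack maps to `FieldList`, a whole SIMD lclVar maps to `LclVar`, and a partial reference to a memory-resident lclVar maps to `Obj`.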

During `Lowering`, any mismatches between the type of an actual register argument (i.e. the `GT_OBJ` or the `GT_FIELD_LIST` element) and the type of the register will cause a `GT_BITCAST` node to be inserted. The purpose of this node is simply to instruct the register allocator to move the value between the register files, without requiring the value to necessarily be spilled to memory.
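
To make the semantics of such a bitcast concrete, the sketch below reinterprets the bits of a `double` as a 64-bit integer, which is the value-level effect of passing a single-`double` struct in an integer register. It only illustrates the meaning of the operation in portable C++ (via `std::memcpy`, assuming IEEE 754 doubles); it is not how the JIT emits the move, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical struct with a single floating-point field.
struct SingleDoubleSketch
{
    double value;
};

// Reinterpret the bits unchanged; no numeric conversion takes place.
inline std::uint64_t bitcastDoubleToLong(double d)
{
    static_assert(sizeof(std::uint64_t) == sizeof(double), "sketch assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof(bits));
    return bits;
}

inline double bitcastLongToDouble(std::uint64_t bits)
{
    double d;
    std::memcpy(&d, &bits, sizeof(d));
    return d;
}

// Models handing the struct to a callee through an integer register.
inline std::uint64_t passInIntegerRegister(const SingleDoubleSketch& s)
{
    std::uint64_t reg = bitcastDoubleToLong(s.value);
    // The bit pattern survives the move (NaN compares unequal to itself, so skip it).
    assert((s.value != s.value) || (bitcastLongToDouble(reg) == s.value));
    return reg;
}
```

Contrast this with `static_cast<std::uint64_t>(1.0)`, which performs a numeric conversion and yields `1` rather than the bit pattern of `1.0`.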

There are additional improvements for struct parameters for future consideration:

- Support passing promoted structs in registers (as suggested above), where `Lowering` would insert the necessary IR to assemble the fields into registers.
- Instead of generating `GT_FIELD_LIST`, we should consider modeling the passing of a promoted struct as separate arguments. This would probably be best implemented by modifying the `argTable` during `fgMorphArgs()` such that it reflects the "as-if" signature with the exploded struct fields.
  - How this would impact the handling of fields that must be packed into a single register remains to be determined (i.e. does `fgMorphArgs()` generate the IR to assemble the fields into a single register-sized value, or is that somehow deferred?)
- Support vector calling conventions. This should be somewhat simplified by the extraction of the ABI code.