Target-specific lowering in ICE
===============================

This document discusses several issues around generating target-specific ICE
instructions from high-level ICE instructions.

Meeting register address mode constraints
-----------------------------------------

Target-specific instructions often require specific operands to be in physical
registers.  Sometimes one specific register is required, but usually any
register in a particular register class will suffice, and that register class is
defined by the instruction/operand type.

The challenge is that ``Variable`` represents an operand that is either a stack
location in the current frame, or a physical register.  Register allocation
happens after target-specific lowering, so during lowering we generally don't
know whether a ``Variable`` operand will meet a target instruction's physical
register requirement.

To this end, ICE allows certain directives:

    * ``Variable::setWeightInfinite()`` forces a ``Variable`` to get some
      physical register (without specifying which particular one) from a
      register class.

    * ``Variable::setRegNum()`` forces a ``Variable`` to be assigned a specific
      physical register.

These directives are described below in more detail.  In most cases, though,
they don't need to be used explicitly, as the routines that create lowered
instructions have reasonable defaults and simple options that control these
directives.

The recommended ICE lowering strategy is to generate extra assignment
instructions involving extra ``Variable`` temporaries, using the directives to
force suitable register assignments for the temporaries, and then let the
register allocator clean things up.

Note: There is a spectrum of *implementation complexity* versus *translation
speed* versus *code quality*.  This recommended strategy picks a point on the
spectrum representing very low complexity ("splat-isel"), pretty good code
quality in terms of frame size and register shuffling/spilling, but perhaps not
the fastest translation speed since extra instructions and operands are created
up front and cleaned up at the end.

Ensuring a non-specific physical register
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The x86 instruction::

    mov dst, src

needs at least one of its operands in a physical register (ignoring the case
where ``src`` is a constant).  This can be done as follows::

    mov reg, src
    mov dst, reg

so long as ``reg`` is guaranteed to have a physical register assignment.  The
low-level lowering code that accomplishes this looks something like::

    Variable *Reg;
    Reg = Func->makeVariable(Dst->getType());
    Reg->setWeightInfinite();
    NewInst = InstX8632Mov::create(Func, Reg, Src);  // mov reg, src
    NewInst = InstX8632Mov::create(Func, Dst, Reg);  // mov dst, reg

``Cfg::makeVariable()`` generates a new temporary, and
``Variable::setWeightInfinite()`` gives it infinite weight for the purpose of
register allocation, thus guaranteeing it a physical register (though leaving
the particular physical register to be determined by the register allocator).

The ``_mov(Dest, Src)`` method in the ``TargetX8632`` class is sufficiently
powerful to handle these details in most situations.  Its ``Dest`` argument is
an in/out parameter.  If its input value is ``nullptr``, then a new temporary
variable is created, its type is set to the same type as the ``Src`` operand, it
is given infinite register weight, and the new ``Variable`` is returned through
the in/out parameter.  (This is in addition to the new temporary being the dest
operand of the ``mov`` instruction.)  The simpler version of the above example
is::

    Variable *Reg = nullptr;
    _mov(Reg, Src);  // creates Reg as an infinite-weight temporary
    _mov(Dst, Reg);

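The in/out ``Dest`` convention can be illustrated with a small standalone
model.  The sketch below is not ICE code: ``ToyVariable``, ``ToyMov``,
``ToyLowering``, and ``lowerMov`` are hypothetical stand-ins that only mimic the
behavior described above, under the assumption that a temporary is fully
characterized by its type and an infinite-weight flag:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the real ICE classes.
struct ToyVariable {
  std::string Type;
  bool InfiniteWeight = false;
};

struct ToyMov {
  ToyVariable *Dest;
  const ToyVariable *Src;
};

class ToyLowering {
public:
  // Mimics the described _mov() contract: when Dest comes in as nullptr, a
  // new infinite-weight temporary of Src's type is created and handed back
  // through the in/out parameter. Either way a mov is recorded.
  void mov(ToyVariable *&Dest, const ToyVariable *Src) {
    assert(Src != nullptr && "mov needs a source operand");
    if (Dest == nullptr) {
      Temps.push_back(std::unique_ptr<ToyVariable>(new ToyVariable()));
      Dest = Temps.back().get();
      Dest->Type = Src->Type;
      Dest->InfiniteWeight = true;
    }
    Insts.push_back(ToyMov{Dest, Src});
  }

  std::vector<ToyMov> Insts; // recorded instructions, in emission order

private:
  std::vector<std::unique_ptr<ToyVariable>> Temps; // owns the temporaries
};

// Lowers "mov Dst, Src" as "mov Reg, Src; mov Dst, Reg" and returns Reg.
ToyVariable *lowerMov(ToyLowering &Lowering, ToyVariable *Dst,
                      const ToyVariable *Src) {
  ToyVariable *Reg = nullptr;
  Lowering.mov(Reg, Src);
  Lowering.mov(Dst, Reg);
  return Reg;
}
```

Calling ``lowerMov`` in this model records two moves and leaves ``Dst``
untouched apart from being the second move's destination; only the temporary
carries the infinite-weight flag.
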
Preferring another ``Variable``'s physical register
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

(An older version of ICE allowed the lowering code to provide a register
allocation hint: if a physical register is to be assigned to one ``Variable``,
then prefer a particular ``Variable``'s physical register if available.  This
hint would be used to try to reduce the amount of register shuffling.
Currently, the register allocator does this automatically through the
``FindPreference`` logic.)

Ensuring a specific physical register
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some instructions require operands in specific physical registers, or produce
results in specific physical registers.  For example, the 32-bit ``ret``
instruction needs its operand in ``eax``.  This can be done with
``Variable::setRegNum()``::

    Variable *Reg;
    Reg = Func->makeVariable(Src->getType());
    Reg->setWeightInfinite();
    Reg->setRegNum(Reg_eax);  // precolor the temporary to eax
    NewInst = InstX8632Mov::create(Func, Reg, Src);
    NewInst = InstX8632Ret::create(Func, Reg);

Precoloring a ``Variable`` with ``Variable::setRegNum()`` effectively gives that
``Variable`` infinite weight for register allocation, so the call to
``Variable::setWeightInfinite()`` is technically unnecessary, but perhaps
documents the intention a bit more strongly.

The ``_mov(Dest, Src, RegNum)`` method in the ``TargetX8632`` class has an
optional ``RegNum`` argument to force a specific register assignment when the
input ``Dest`` is ``nullptr``.  As described above, passing in ``Dest=nullptr``
causes a new temporary variable to be created with infinite register weight, and
in addition the specific register is chosen.  The simpler version of the above
example is::

    Variable *Reg = nullptr;
    _mov(Reg, Src, Reg_eax);
    _ret(Reg);

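To make the relationship between the two directives concrete, here is a
deliberately simplified model of how an allocator might honor them.  It is a
sketch under strong assumptions, not ICE's register allocator: all variables are
assumed to be live at the same time, only three registers exist, and
``ToyVariable`` and ``assignRegisters`` are hypothetical names:

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical toy model for illustration; not ICE's register allocator.
const int NoRegister = -1;
enum ToyReg { Reg_eax = 0, Reg_ecx = 1, Reg_edx = 2 };

struct ToyVariable {
  bool InfiniteWeight = false; // must end up in some register
  int RegNum = NoRegister;     // precolored register, if any
  int Assigned = NoRegister;   // result; NoRegister means a stack slot
};

// Assigns registers to variables that are assumed to all be live at once.
void assignRegisters(const std::vector<ToyVariable *> &Vars) {
  std::set<int> Free = {Reg_eax, Reg_ecx, Reg_edx};
  // Precolored variables take their requested register unconditionally, which
  // is why precoloring behaves like infinite weight.
  for (ToyVariable *Var : Vars) {
    if (Var->RegNum == NoRegister)
      continue;
    assert(Free.count(Var->RegNum) == 1 && "conflicting precolored variables");
    Var->Assigned = Var->RegNum;
    Free.erase(Var->RegNum);
  }
  // Infinite-weight variables are served next, from whatever is still free.
  for (ToyVariable *Var : Vars) {
    if (Var->Assigned != NoRegister || !Var->InfiniteWeight)
      continue;
    assert(!Free.empty() && "too many infinite-weight variables");
    Var->Assigned = *Free.begin();
    Free.erase(Free.begin());
  }
  // Everything else gets a leftover register or stays on the stack.
  for (ToyVariable *Var : Vars) {
    if (Var->Assigned != NoRegister || Free.empty())
      continue;
    Var->Assigned = *Free.begin();
    Free.erase(Free.begin());
  }
}
```

In this model a variable with only ``RegNum`` set still receives its register
before any infinite-weight variable is considered, which mirrors the remark
that the extra ``setWeightInfinite()`` call is technically redundant.
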
Disabling live-range interference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

(An older version of ICE allowed an overly strong preference for another
``Variable``'s physical register even if their live ranges interfered.  This was
risky, and currently the register allocator derives this automatically through
the ``AllowOverlap`` logic.)

Call instructions kill scratch registers
----------------------------------------

A ``call`` instruction kills the values in all scratch registers, so it's
important that the register allocator doesn't allocate a scratch register to a
``Variable`` whose live range spans the ``call`` instruction.  ICE provides the
``InstFakeKill`` pseudo-instruction to compactly mark such register kills.  For
each scratch register, a fake trivial live range is created that begins and ends
in that instruction.  The ``InstFakeKill`` instruction is inserted after the
``call`` instruction.  For example::

    CallInst = InstX8632Call::create(Func, ... );
    NewInst = InstFakeKill::create(Func, CallInst);

The last argument to ``InstFakeKill::create()`` links the ``InstFakeKill`` to
the preceding ``call`` instruction, such that if the linked instruction is
dead-code eliminated, the ``InstFakeKill`` instruction is eliminated as well.
The linked ``call`` instruction could be to a target known to be free of side
effects, and therefore safe to remove if its result is unused.

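The effect of those trivial live ranges can be sketched with a standalone
interference check.  This is an illustrative model only: ``LiveRange``,
``RegisterRange``, ``makeFakeKill``, and ``mayAssign`` are hypothetical names,
live ranges are assumed to be single closed intervals of instruction numbers,
and the ``call``'s own result variable is not modeled:

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy model; ICE's real live-range representation differs.
struct LiveRange {
  int Start; // first instruction number, inclusive
  int End;   // last instruction number, inclusive
};

struct RegisterRange {
  int RegNum;
  LiveRange Range;
};

// Two closed intervals interfere when they share an instruction number.
bool overlaps(const LiveRange &A, const LiveRange &B) {
  assert(A.Start <= A.End && B.Start <= B.End);
  return A.Start <= B.End && B.Start <= A.End;
}

// Models a fake kill at instruction InstNum: every scratch register gets a
// trivial live range that begins and ends at that instruction.
std::vector<RegisterRange> makeFakeKill(int InstNum,
                                        const std::vector<int> &ScratchRegs) {
  std::vector<RegisterRange> Kills;
  for (int Reg : ScratchRegs)
    Kills.push_back(RegisterRange{Reg, LiveRange{InstNum, InstNum}});
  return Kills;
}

// A variable may use RegNum only if its live range avoids every range that
// is already pinned to that register.
bool mayAssign(const LiveRange &Var, int RegNum,
               const std::vector<RegisterRange> &Pinned) {
  for (const RegisterRange &P : Pinned)
    if (P.RegNum == RegNum && overlaps(Var, P.Range))
      return false;
  return true;
}
```

With a fake kill at instruction 11 (just after a call at instruction 10), a
variable live over instructions 5 through 20 is rejected for every scratch
register but remains eligible for a register that is not in the scratch list,
while variables that end at or before the call, or begin after the kill, are
unaffected.
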
Instructions producing multiple values
--------------------------------------

ICE instructions allow at most one destination ``Variable``.  Some machine
instructions produce more than one usable result.  For example, the x86-32
``call`` ABI returns a 64-bit integer result in the ``edx:eax`` register pair.
Also, x86-32 has a version of the ``imul`` instruction that produces a 64-bit
result in the ``edx:eax`` register pair.  The x86-32 ``idiv`` instruction
produces the quotient in ``eax`` and the remainder in ``edx``, though generally
only one or the other is needed in the lowering.

To support multi-dest instructions, ICE provides the ``InstFakeDef``
pseudo-instruction, whose destination can be precolored to the appropriate
physical register.  For example, a ``call`` returning a 64-bit result in
``edx:eax``::

    CallInst = InstX8632Call::create(Func, RegLow, ... );
    NewInst = InstFakeKill::create(Func, CallInst);
    Variable *RegHigh = Func->makeVariable(IceType_i32);
    RegHigh->setRegNum(Reg_edx);
    NewInst = InstFakeDef::create(Func, RegHigh);

``RegHigh`` is then assigned into the desired ``Variable``.  If that assignment
ends up being dead-code eliminated, the ``InstFakeDef`` instruction may be
eliminated as well.

Managing dead-code elimination
------------------------------

ICE instructions with a non-nullptr ``Dest`` are subject to dead-code
elimination.  However, some instructions must not be eliminated in order to
preserve side effects.  This applies to most function calls, volatile loads, and
loads and integer divisions where the underlying language and runtime are
relying on hardware exception handling.

ICE facilitates this with the ``InstFakeUse`` pseudo-instruction.  This forces a
use of its source ``Variable`` to keep that variable's definition alive.  Since
the ``InstFakeUse`` instruction has no ``Dest``, it will not be eliminated.

Here is the full example of the x86-32 ``call`` returning a 32-bit integer
result::

    Variable *Reg = Func->makeVariable(IceType_i32);
    Reg->setRegNum(Reg_eax);
    CallInst = InstX8632Call::create(Func, Reg, ... );
    NewInst = InstFakeKill::create(Func, CallInst);
    NewInst = InstFakeUse::create(Func, Reg);
    NewInst = InstX8632Mov::create(Func, Result, Reg);

Without the ``InstFakeUse``, the entire call sequence could be dead-code
eliminated if its result were unused.

One more note on this topic.  These tools can be used to allow a multi-dest
instruction to be dead-code eliminated only when none of its results is live.
The key is to use the optional source parameter of the ``InstFakeDef``
instruction.  Using pseudocode::

    t1:eax = call foo(arg1, ...)
    InstFakeKill  // eax, ecx, edx
    t2:edx = InstFakeDef(t1)
    v_result_low = t1
    v_result_high = t2

If ``v_result_high`` is live but ``v_result_low`` is dead, adding ``t1`` as an
argument to ``InstFakeDef`` suffices to keep the ``call`` instruction live.

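Both effects can be reproduced with a minimal dead-code elimination pass over a
toy instruction list.  The sketch assumes a single straight-line block and a
rule that mirrors the text (an instruction without a ``Dest`` is never removed);
``ToyInst`` and ``eliminateDeadCode`` are hypothetical names, and the
``InstFakeKill`` linkage is left out:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy IR for illustration; not ICE's instruction classes.
struct ToyInst {
  std::string Name;
  std::string Dest;              // empty string means "no Dest"
  std::vector<std::string> Srcs; // source variables
};

// Backward dead-code elimination over one straight-line block. An instruction
// survives if it has no Dest or if its Dest is used by a later survivor.
// Returns the names of the surviving instructions in program order.
std::vector<std::string> eliminateDeadCode(const std::vector<ToyInst> &Insts) {
  std::set<std::string> Live;
  std::vector<std::string> Kept;
  for (auto It = Insts.rbegin(); It != Insts.rend(); ++It) {
    if (!It->Dest.empty() && Live.count(It->Dest) == 0)
      continue; // Dest is never used afterwards: drop the instruction
    if (!It->Dest.empty())
      Live.erase(It->Dest); // the definition ends the live range going backward
    for (const std::string &Src : It->Srcs) {
      assert(!Src.empty() && "sources must be named");
      Live.insert(Src); // a surviving use keeps its source alive
    }
    Kept.insert(Kept.begin(), It->Name);
  }
  return Kept;
}
```

Fed the sequence ``call``, fake use, ``mov`` with an unused result, the pass in
this model drops only the ``mov``; without the fake use it drops the ``call``
too.  Likewise, a fake def that names ``t1`` as a source keeps the ``call``
whenever the high half is used, whereas a fake def without that source lets the
``call`` disappear.
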
Instructions modifying source operands
--------------------------------------

Some native instructions may modify one or more source operands.  For example,
the x86 ``xadd`` and ``xchg`` instructions modify both source operands.  Some
analysis needs to identify every place a ``Variable`` is modified, and it uses
the presence of a ``Dest`` variable for this analysis.  Since ICE instructions
have at most one ``Dest``, the ``xadd`` and ``xchg`` instructions need special
treatment.

A ``Variable`` that is not the ``Dest`` can be marked as modified by adding an
``InstFakeDef``.  However, this is not sufficient, as the ``Variable`` may have
no more live uses, which could result in the ``InstFakeDef`` being dead-code
eliminated.  The solution is to add an ``InstFakeUse`` as well.

To summarize, for every source ``Variable`` that the instruction modifies and
that is not equal to the instruction's ``Dest``, append an ``InstFakeDef`` and
an ``InstFakeUse`` instruction to provide the necessary analysis information.

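The summary rule can be written down as a small helper.  The following is a
sketch over a toy instruction record rather than ICE's classes;
``ToyInst`` and ``lowerModifyingInst`` are hypothetical names, and the caller is
assumed to pass exactly the source variables that the instruction modifies:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy IR for illustration; not ICE's instruction classes.
struct ToyInst {
  std::string Opcode;
  std::string Dest;              // empty string means "no Dest"
  std::vector<std::string> Srcs; // source variables
};

// Emits Opcode followed by a fake def/use pair for every source variable that
// the instruction modifies but that is not its Dest.
std::vector<ToyInst>
lowerModifyingInst(const std::string &Opcode, const std::string &Dest,
                   const std::vector<std::string> &ModifiedSrcs) {
  assert(!Dest.empty() && "this sketch assumes the instruction has a Dest");
  std::vector<ToyInst> Insts;
  Insts.push_back(ToyInst{Opcode, Dest, ModifiedSrcs});
  for (const std::string &Src : ModifiedSrcs) {
    if (Src == Dest)
      continue; // already recorded as modified through Dest
    Insts.push_back(ToyInst{"fakedef", Src, {}});   // marks Src as modified
    Insts.push_back(ToyInst{"fakeuse", "", {Src}}); // keeps the fake def alive
  }
  return Insts;
}
```

For an ``xchg`` whose ``Dest`` is ``a`` and whose modified sources are ``a`` and
``b``, this model emits the ``xchg`` followed by one fake def and one fake use
of ``b``.
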