Home | History | Annotate | Download | only in docs
      1 =====================================
      2 Garbage Collection Safepoints in LLVM
      3 =====================================
      4 
      5 .. contents::
      6    :local:
      7    :depth: 2
      8 
      9 Status
     10 =======
     11 
     12 This document describes a set of experimental extensions to LLVM. Use
     13 with caution.  Because the intrinsics have experimental status,
     14 compatibility across LLVM releases is not guaranteed.
     15 
     16 LLVM currently supports an alternate mechanism for conservative
     17 garbage collection support using the ``gcroot`` intrinsic.  The mechanism
     18 described here shares little in common with the alternate ``gcroot``
     19 implementation and it is hoped that this mechanism will eventually
     20 replace the gc_root mechanism.
     21 
     22 Overview
     23 ========
     24 
     25 To collect dead objects, garbage collectors must be able to identify
     26 any references to objects contained within executing code, and,
     27 depending on the collector, potentially update them.  The collector
     28 does not need this information at all points in code - that would make
     29 the problem much harder - but only at well-defined points in the
     30 execution known as 'safepoints' For most collectors, it is sufficient
     31 to track at least one copy of each unique pointer value.  However, for
     32 a collector which wishes to relocate objects directly reachable from
     33 running code, a higher standard is required.
     34 
     35 One additional challenge is that the compiler may compute intermediate
     36 results ("derived pointers") which point outside of the allocation or
     37 even into the middle of another allocation.  The eventual use of this
     38 intermediate value must yield an address within the bounds of the
     39 allocation, but such "exterior derived pointers" may be visible to the
     40 collector.  Given this, a garbage collector can not safely rely on the
     41 runtime value of an address to indicate the object it is associated
     42 with.  If the garbage collector wishes to move any object, the
     43 compiler must provide a mapping, for each pointer, to an indication of
     44 its allocation.
     45 
     46 To simplify the interaction between a collector and the compiled code,
     47 most garbage collectors are organized in terms of three abstractions:
     48 load barriers, store barriers, and safepoints.
     49 
     50 #. A load barrier is a bit of code executed immediately after the
     51    machine load instruction, but before any use of the value loaded.
     52    Depending on the collector, such a barrier may be needed for all
     53    loads, merely loads of a particular type (in the original source
     54    language), or none at all.
     55 
     56 #. Analogously, a store barrier is a code fragment that runs
     57    immediately before the machine store instruction, but after the
     58    computation of the value stored.  The most common use of a store
     59    barrier is to update a 'card table' in a generational garbage
     60    collector.
     61 
     62 #. A safepoint is a location at which pointers visible to the compiled
     63    code (i.e. currently in registers or on the stack) are allowed to
     64    change.  After the safepoint completes, the actual pointer value
     65    may differ, but the 'object' (as seen by the source language)
     66    pointed to will not.
     67 
     68   Note that the term 'safepoint' is somewhat overloaded.  It refers to
     69   both the location at which the machine state is parsable and the
     70   coordination protocol involved in bring application threads to a
     71   point at which the collector can safely use that information.  The
     72   term "statepoint" as used in this document refers exclusively to the
     73   former.
     74 
     75 This document focuses on the last item - compiler support for
     76 safepoints in generated code.  We will assume that an outside
     77 mechanism has decided where to place safepoints.  From our
     78 perspective, all safepoints will be function calls.  To support
     79 relocation of objects directly reachable from values in compiled code,
     80 the collector must be able to:
     81 
     82 #. identify every copy of a pointer (including copies introduced by
     83    the compiler itself) at the safepoint,
     84 #. identify which object each pointer relates to, and
     85 #. potentially update each of those copies.
     86 
     87 This document describes the mechanism by which an LLVM based compiler
     88 can provide this information to a language runtime/collector, and
     89 ensure that all pointers can be read and updated if desired.  The
     90 heart of the approach is to construct (or rewrite) the IR in a manner
     91 where the possible updates performed by the garbage collector are
     92 explicitly visible in the IR.  Doing so requires that we:
     93 
     94 #. create a new SSA value for each potentially relocated pointer, and
     95    ensure that no uses of the original (non relocated) value is
     96    reachable after the safepoint,
     97 #. specify the relocation in a way which is opaque to the compiler to
     98    ensure that the optimizer can not introduce new uses of an
     99    unrelocated value after a statepoint. This prevents the optimizer
    100    from performing unsound optimizations.
    101 #. recording a mapping of live pointers (and the allocation they're
    102    associated with) for each statepoint.
    103 
    104 At the most abstract level, inserting a safepoint can be thought of as
    105 replacing a call instruction with a call to a multiple return value
    106 function which both calls the original target of the call, returns
    107 it's result, and returns updated values for any live pointers to
    108 garbage collected objects.
    109 
    110   Note that the task of identifying all live pointers to garbage
    111   collected values, transforming the IR to expose a pointer giving the
    112   base object for every such live pointer, and inserting all the
    113   intrinsics correctly is explicitly out of scope for this document.
    114   The recommended approach is to use the :ref:`utility passes 
    115   <statepoint-utilities>` described below. 
    116 
    117 This abstract function call is concretely represented by a sequence of
    118 intrinsic calls known collectively as a "statepoint relocation sequence".
    119 
    120 Let's consider a simple call in LLVM IR:
    121 
    122 .. code-block:: llvm
    123 
    124   define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) 
    125          gc "statepoint-example" {
    126     call void ()* @foo()
    127     ret i8 addrspace(1)* %obj
    128   }
    129 
    130 Depending on our language we may need to allow a safepoint during the execution 
    131 of ``foo``. If so, we need to let the collector update local values in the 
    132 current frame.  If we don't, we'll be accessing a potential invalid reference 
    133 once we eventually return from the call.
    134 
    135 In this example, we need to relocate the SSA value ``%obj``.  Since we can't 
    136 actually change the value in the SSA value ``%obj``, we need to introduce a new 
    137 SSA value ``%obj.relocated`` which represents the potentially changed value of
    138 ``%obj`` after the safepoint and update any following uses appropriately.  The 
    139 resulting relocation sequence is:
    140 
    141 .. code-block:: llvm
    142 
    143   define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) 
    144          gc "statepoint-example" {
    145     %0 = call i32 (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj)
    146     %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(i32 %0, i32 7, i32 7)
    147     ret i8 addrspace(1)* %obj.relocated
    148   }
    149 
    150 Ideally, this sequence would have been represented as a M argument, N
    151 return value function (where M is the number of values being
    152 relocated + the original call arguments and N is the original return
    153 value + each relocated value), but LLVM does not easily support such a
    154 representation.
    155 
    156 Instead, the statepoint intrinsic marks the actual site of the
    157 safepoint or statepoint.  The statepoint returns a token value (which
    158 exists only at compile time).  To get back the original return value
    159 of the call, we use the ``gc.result`` intrinsic.  To get the relocation
    160 of each pointer in turn, we use the ``gc.relocate`` intrinsic with the
    161 appropriate index.  Note that both the ``gc.relocate`` and ``gc.result`` are
    162 tied to the statepoint.  The combination forms a "statepoint relocation 
    163 sequence" and represents the entirety of a parseable call or 'statepoint'.
    164 
    165 When lowered, this example would generate the following x86 assembly:
    166 
    167 .. code-block:: gas
    168   
    169 	  .globl	test1
    170 	  .align	16, 0x90
    171 	  pushq	%rax
    172 	  callq	foo
    173   .Ltmp1:
    174 	  movq	(%rsp), %rax  # This load is redundant (oops!)
    175 	  popq	%rdx
    176 	  retq
    177 
    178 Each of the potentially relocated values has been spilled to the
    179 stack, and a record of that location has been recorded to the
    180 :ref:`Stack Map section <stackmap-section>`.  If the garbage collector
    181 needs to update any of these pointers during the call, it knows
    182 exactly what to change.
    183 
    184 The relevant parts of the StackMap section for our example are:
    185 
    186 .. code-block:: gas
    187   
    188   # This describes the call site
    189   # Stack Maps: callsite 2882400000
    190 	  .quad	2882400000
    191 	  .long	.Ltmp1-test1
    192 	  .short	0
    193   # .. 8 entries skipped ..
    194   # This entry describes the spill slot which is directly addressable
    195   # off RSP with offset 0.  Given the value was spilled with a pushq, 
    196   # that makes sense.
    197   # Stack Maps:   Loc 8: Direct RSP     [encoding: .byte 2, .byte 8, .short 7, .int 0]
    198 	  .byte	2
    199 	  .byte	8
    200 	  .short	7
    201 	  .long	0
    202 
    203 This example was taken from the tests for the :ref:`RewriteStatepointsForGC` utility pass.  As such, it's full StackMap can be easily examined with the following command.
    204 
    205 .. code-block:: bash
    206 
    207   opt -rewrite-statepoints-for-gc test/Transforms/RewriteStatepointsForGC/basics.ll -S | llc -debug-only=stackmaps
    208 
    209 Base & Derived Pointers
    210 ^^^^^^^^^^^^^^^^^^^^^^^
    211 
    212 A "base pointer" is one which points to the starting address of an allocation
    213 (object).  A "derived pointer" is one which is offset from a base pointer by
    214 some amount.  When relocating objects, a garbage collector needs to be able 
    215 to relocate each derived pointer associated with an allocation to the same 
    216 offset from the new address.
    217 
    218 "Interior derived pointers" remain within the bounds of the allocation 
    219 they're associated with.  As a result, the base object can be found at 
    220 runtime provided the bounds of allocations are known to the runtime system.
    221 
    222 "Exterior derived pointers" are outside the bounds of the associated object;
    223 they may even fall within *another* allocations address range.  As a result,
    224 there is no way for a garbage collector to determine which allocation they 
    225 are associated with at runtime and compiler support is needed.
    226 
    227 The ``gc.relocate`` intrinsic supports an explicit operand for describing the
    228 allocation associated with a derived pointer.  This operand is frequently 
    229 referred to as the base operand, but does not strictly speaking have to be
    230 a base pointer, but it does need to lie within the bounds of the associated
    231 allocation.  Some collectors may require that the operand be an actual base
    232 pointer rather than merely an internal derived pointer. Note that during 
    233 lowering both the base and derived pointer operands are required to be live 
    234 over the associated call safepoint even if the base is otherwise unused 
    235 afterwards.
    236 
    237 If we extend our previous example to include a pointless derived pointer, 
    238 we get:
    239 
    240 .. code-block:: llvm
    241 
    242   define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) 
    243          gc "statepoint-example" {
    244     %gep = getelementptr i8, i8 addrspace(1)* %obj, i64 20000
    245     %token = call i32 (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj, i8 addrspace(1)* %gep)
    246     %obj.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(i32 %token, i32 7, i32 7)
    247     %gep.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(i32 %token, i32 7, i32 8)
    248     %p = getelementptr i8, i8 addrspace(1)* %gep, i64 -20000
    249     ret i8 addrspace(1)* %p
    250   }
    251 
    252 Note that in this example %p and %obj.relocate are the same address and we
    253 could replace one with the other, potentially removing the derived pointer
    254 from the live set at the safepoint entirely.  
    255 
    256 GC Transitions
    257 ^^^^^^^^^^^^^^^^^^
    258 
    259 As a practical consideration, many garbage-collected systems allow code that is
    260 collector-aware ("managed code") to call code that is not collector-aware
    261 ("unmanaged code"). It is common that such calls must also be safepoints, since
    262 it is desirable to allow the collector to run during the execution of
    263 unmanaged code. Futhermore, it is common that coordinating the transition from
    264 managed to unmanaged code requires extra code generation at the call site to
    265 inform the collector of the transition. In order to support these needs, a
    266 statepoint may be marked as a GC transition, and data that is necessary to
    267 perform the transition (if any) may be provided as additional arguments to the
    268 statepoint.
    269 
    270   Note that although in many cases statepoints may be inferred to be GC
    271   transitions based on the function symbols involved (e.g. a call from a
    272   function with GC strategy "foo" to a function with GC strategy "bar"),
    273   indirect calls that are also GC transitions must also be supported. This
    274   requirement is the driving force behind the decision to require that GC
    275   transitions are explicitly marked.
    276 
    277 Let's revisit the sample given above, this time treating the call to ``@foo``
    278 as a GC transition. Depending on our target, the transition code may need to
    279 access some extra state in order to inform the collector of the transition.
    280 Let's assume a hypothetical GC--somewhat unimaginatively named "hypothetical-gc"
    281 --that requires that a TLS variable must be written to before and after a call
    282 to unmanaged code. The resulting relocation sequence is:
    283 
    284 .. code-block:: llvm
    285 
    286   @flag = thread_local global i32 0, align 4
    287 
    288   define i8 addrspace(1)* @test1(i8 addrspace(1) *%obj)
    289          gc "hypothetical-gc" {
    290 
    291     %0 = call i32 (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 1, i32* @Flag, i32 0, i8 addrspace(1)* %obj)
    292     %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(i32 %0, i32 7, i32 7)
    293     ret i8 addrspace(1)* %obj.relocated
    294   }
    295 
    296 During lowering, this will result in a instruction selection DAG that looks
    297 something like:
    298 
    299 ::
    300 
    301   CALLSEQ_START
    302   ...
    303   GC_TRANSITION_START (lowered i32 *@Flag), SRCVALUE i32* Flag
    304   STATEPOINT
    305   GC_TRANSITION_END (lowered i32 *@Flag), SRCVALUE i32 *Flag
    306   ...
    307   CALLSEQ_END
    308 
    309 In order to generate the necessary transition code, the backend for each target
    310 supported by "hypothetical-gc" must be modified to lower ``GC_TRANSITION_START``
    311 and ``GC_TRANSITION_END`` nodes appropriately when the "hypothetical-gc"
    312 strategy is in use for a particular function. Assuming that such lowering has
    313 been added for X86, the generated assembly would be:
    314 
    315 .. code-block:: gas
    316 
    317 	  .globl	test1
    318 	  .align	16, 0x90
    319 	  pushq	%rax
    320 	  movl $1, %fs:Flag@TPOFF
    321 	  callq	foo
    322 	  movl $0, %fs:Flag@TPOFF
    323   .Ltmp1:
    324 	  movq	(%rsp), %rax  # This load is redundant (oops!)
    325 	  popq	%rdx
    326 	  retq
    327 
    328 Note that the design as presented above is not fully implemented: in particular,
    329 strategy-specific lowering is not present, and all GC transitions are emitted as
    330 as single no-op before and after the call instruction. These no-ops are often
    331 removed by the backend during dead machine instruction elimination.
    332 
    333 
    334 Intrinsics
    335 ===========
    336 
    337 'llvm.experimental.gc.statepoint' Intrinsic
    338 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    339 
    340 Syntax:
    341 """""""
    342 
    343 ::
    344 
    345       declare i32
    346         @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
    347                        func_type <target>, 
    348                        i64 <#call args>, i64 <flags>,
    349                        ... (call parameters),
    350                        i64 <# transition args>, ... (transition parameters),
    351                        i64 <# deopt args>, ... (deopt parameters),
    352                        ... (gc parameters))
    353 
    354 Overview:
    355 """""""""
    356 
    357 The statepoint intrinsic represents a call which is parse-able by the
    358 runtime.
    359 
    360 Operands:
    361 """""""""
    362 
    363 The 'id' operand is a constant integer that is reported as the ID
    364 field in the generated stackmap.  LLVM does not interpret this
    365 parameter in any way and its meaning is up to the statepoint user to
    366 decide.  Note that LLVM is free to duplicate code containing
    367 statepoint calls, and this may transform IR that had a unique 'id' per
    368 lexical call to statepoint to IR that does not.
    369 
    370 If 'num patch bytes' is non-zero then the call instruction
    371 corresponding to the statepoint is not emitted and LLVM emits 'num
    372 patch bytes' bytes of nops in its place.  LLVM will emit code to
    373 prepare the function arguments and retrieve the function return value
    374 in accordance to the calling convention; the former before the nop
    375 sequence and the latter after the nop sequence.  It is expected that
    376 the user will patch over the 'num patch bytes' bytes of nops with a
    377 calling sequence specific to their runtime before executing the
    378 generated machine code.  There are no guarantees with respect to the
    379 alignment of the nop sequence.  Unlike :doc:`StackMaps` statepoints do
    380 not have a concept of shadow bytes.  Note that semantically the
    381 statepoint still represents a call or invoke to 'target', and the nop
    382 sequence after patching is expected to represent an operation
    383 equivalent to a call or invoke to 'target'.
    384 
    385 The 'target' operand is the function actually being called.  The
    386 target can be specified as either a symbolic LLVM function, or as an
    387 arbitrary Value of appropriate function type.  Note that the function
    388 type must match the signature of the callee and the types of the 'call
    389 parameters' arguments.
    390 
    391 The '#call args' operand is the number of arguments to the actual
    392 call.  It must exactly match the number of arguments passed in the
    393 'call parameters' variable length section.
    394 
    395 The 'flags' operand is used to specify extra information about the
    396 statepoint. This is currently only used to mark certain statepoints
    397 as GC transitions. This operand is a 64-bit integer with the following
    398 layout, where bit 0 is the least significant bit:
    399 
    400   +-------+---------------------------------------------------+
    401   | Bit # | Usage                                             |
    402   +=======+===================================================+
    403   |     0 | Set if the statepoint is a GC transition, cleared |
    404   |       | otherwise.                                        |
    405   +-------+---------------------------------------------------+
    406   |  1-63 | Reserved for future use; must be cleared.         |
    407   +-------+---------------------------------------------------+
    408 
    409 The 'call parameters' arguments are simply the arguments which need to
    410 be passed to the call target.  They will be lowered according to the
    411 specified calling convention and otherwise handled like a normal call
    412 instruction.  The number of arguments must exactly match what is
    413 specified in '# call args'.  The types must match the signature of
    414 'target'.
    415 
    416 The 'transition parameters' arguments contain an arbitrary list of
    417 Values which need to be passed to GC transition code. They will be
    418 lowered and passed as operands to the appropriate GC_TRANSITION nodes
    419 in the selection DAG. It is assumed that these arguments must be
    420 available before and after (but not necessarily during) the execution
    421 of the callee. The '# transition args' field indicates how many operands
    422 are to be interpreted as 'transition parameters'.
    423 
    424 The 'deopt parameters' arguments contain an arbitrary list of Values
    425 which is meaningful to the runtime.  The runtime may read any of these
    426 values, but is assumed not to modify them.  If the garbage collector
    427 might need to modify one of these values, it must also be listed in
    428 the 'gc pointer' argument list.  The '# deopt args' field indicates
    429 how many operands are to be interpreted as 'deopt parameters'.
    430 
    431 The 'gc parameters' arguments contain every pointer to a garbage
    432 collector object which potentially needs to be updated by the garbage
    433 collector.  Note that the argument list must explicitly contain a base
    434 pointer for every derived pointer listed.  The order of arguments is
    435 unimportant.  Unlike the other variable length parameter sets, this
    436 list is not length prefixed.
    437 
    438 Semantics:
    439 """"""""""
    440 
    441 A statepoint is assumed to read and write all memory.  As a result,
    442 memory operations can not be reordered past a statepoint.  It is
    443 illegal to mark a statepoint as being either 'readonly' or 'readnone'.
    444 
    445 Note that legal IR can not perform any memory operation on a 'gc
    446 pointer' argument of the statepoint in a location statically reachable
    447 from the statepoint.  Instead, the explicitly relocated value (from a
    448 ``gc.relocate``) must be used.
    449 
    450 'llvm.experimental.gc.result' Intrinsic
    451 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    452 
    453 Syntax:
    454 """""""
    455 
    456 ::
    457 
    458       declare type*
    459         @llvm.experimental.gc.result(i32 %statepoint_token)
    460 
    461 Overview:
    462 """""""""
    463 
    464 ``gc.result`` extracts the result of the original call instruction
    465 which was replaced by the ``gc.statepoint``.  The ``gc.result``
    466 intrinsic is actually a family of three intrinsics due to an
    467 implementation limitation.  Other than the type of the return value,
    468 the semantics are the same.
    469 
    470 Operands:
    471 """""""""
    472 
    473 The first and only argument is the ``gc.statepoint`` which starts
    474 the safepoint sequence of which this ``gc.result`` is a part.
    475 Despite the typing of this as a generic i32, *only* the value defined
    476 by a ``gc.statepoint`` is legal here.
    477 
    478 Semantics:
    479 """"""""""
    480 
    481 The ``gc.result`` represents the return value of the call target of
    482 the ``statepoint``.  The type of the ``gc.result`` must exactly match
    483 the type of the target.  If the call target returns void, there will
    484 be no ``gc.result``.
    485 
    486 A ``gc.result`` is modeled as a 'readnone' pure function.  It has no
    487 side effects since it is just a projection of the return value of the
    488 previous call represented by the ``gc.statepoint``.
    489 
    490 'llvm.experimental.gc.relocate' Intrinsic
    491 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    492 
    493 Syntax:
    494 """""""
    495 
    496 ::
    497 
    498       declare <pointer type>
    499         @llvm.experimental.gc.relocate(i32 %statepoint_token, 
    500                                        i32 %base_offset, 
    501                                        i32 %pointer_offset)
    502 
    503 Overview:
    504 """""""""
    505 
    506 A ``gc.relocate`` returns the potentially relocated value of a pointer
    507 at the safepoint.
    508 
    509 Operands:
    510 """""""""
    511 
    512 The first argument is the ``gc.statepoint`` which starts the
    513 safepoint sequence of which this ``gc.relocation`` is a part.
    514 Despite the typing of this as a generic i32, *only* the value defined
    515 by a ``gc.statepoint`` is legal here.
    516 
    517 The second argument is an index into the statepoints list of arguments
    518 which specifies the allocation for the pointer being relocated.
    519 This index must land within the 'gc parameter' section of the
    520 statepoint's argument list.  The associated value must be within the
    521 object with which the pointer being relocated is associated. The optimizer
    522 is free to change *which* interior derived pointer is reported, provided that
    523 it does not replace an actual base pointer with another interior derived 
    524 pointer.  Collectors are allowed to rely on the base pointer operand 
    525 remaining an actual base pointer if so constructed.
    526 
    527 The third argument is an index into the statepoint's list of arguments
    528 which specify the (potentially) derived pointer being relocated.  It
    529 is legal for this index to be the same as the second argument
    530 if-and-only-if a base pointer is being relocated. This index must land
    531 within the 'gc parameter' section of the statepoint's argument list.
    532 
    533 Semantics:
    534 """"""""""
    535 
    536 The return value of ``gc.relocate`` is the potentially relocated value
    537 of the pointer specified by it's arguments.  It is unspecified how the
    538 value of the returned pointer relates to the argument to the
    539 ``gc.statepoint`` other than that a) it points to the same source
    540 language object with the same offset, and b) the 'based-on'
    541 relationship of the newly relocated pointers is a projection of the
    542 unrelocated pointers.  In particular, the integer value of the pointer
    543 returned is unspecified.
    544 
    545 A ``gc.relocate`` is modeled as a ``readnone`` pure function.  It has no
    546 side effects since it is just a way to extract information about work
    547 done during the actual call modeled by the ``gc.statepoint``.
    548 
    549 .. _statepoint-stackmap-format:
    550 
    551 Stack Map Format
    552 ================
    553 
    554 Locations for each pointer value which may need read and/or updated by
    555 the runtime or collector are provided via the :ref:`Stack Map format
    556 <stackmap-format>` specified in the PatchPoint documentation.
    557 
    558 Each statepoint generates the following Locations:
    559 
    560 * Constant which describes the calling convention of the call target. This
    561   constant is a valid :ref:`calling convention identifier <callingconv>` for
    562   the version of LLVM used to generate the stackmap. No additional compatibility
    563   guarantees are made for this constant over what LLVM provides elsewhere w.r.t.
    564   these identifiers.
    565 * Constant which describes the flags passed to the statepoint intrinsic
    566 * Constant which describes number of following deopt *Locations* (not
    567   operands)
    568 * Variable number of Locations, one for each deopt parameter listed in
    569   the IR statepoint (same number as described by previous Constant)
    570 * Variable number of Locations pairs, one pair for each unique pointer
    571   which needs relocated.  The first Location in each pair describes
    572   the base pointer for the object.  The second is the derived pointer
    573   actually being relocated.  It is guaranteed that the base pointer
    574   must also appear explicitly as a relocation pair if used after the
    575   statepoint. There may be fewer pairs then gc parameters in the IR
    576   statepoint. Each *unique* pair will occur at least once; duplicates
    577   are possible.
    578 
    579 Note that the Locations used in each section may describe the same
    580 physical location.  e.g. A stack slot may appear as a deopt location,
    581 a gc base pointer, and a gc derived pointer.
    582 
    583 The LiveOut section of the StkMapRecord will be empty for a statepoint
    584 record.
    585 
    586 Safepoint Semantics & Verification
    587 ==================================
    588 
    589 The fundamental correctness property for the compiled code's
    590 correctness w.r.t. the garbage collector is a dynamic one.  It must be
    591 the case that there is no dynamic trace such that a operation
    592 involving a potentially relocated pointer is observably-after a
    593 safepoint which could relocate it.  'observably-after' is this usage
    594 means that an outside observer could observe this sequence of events
    595 in a way which precludes the operation being performed before the
    596 safepoint.
    597 
    598 To understand why this 'observable-after' property is required,
    599 consider a null comparison performed on the original copy of a
    600 relocated pointer.  Assuming that control flow follows the safepoint,
    601 there is no way to observe externally whether the null comparison is
    602 performed before or after the safepoint.  (Remember, the original
    603 Value is unmodified by the safepoint.)  The compiler is free to make
    604 either scheduling choice.
    605 
    606 The actual correctness property implemented is slightly stronger than
    607 this.  We require that there be no *static path* on which a
    608 potentially relocated pointer is 'observably-after' it may have been
    609 relocated.  This is slightly stronger than is strictly necessary (and
    610 thus may disallow some otherwise valid programs), but greatly
    611 simplifies reasoning about correctness of the compiled code.
    612 
    613 By construction, this property will be upheld by the optimizer if
    614 correctly established in the source IR.  This is a key invariant of
    615 the design.
    616 
    617 The existing IR Verifier pass has been extended to check most of the
    618 local restrictions on the intrinsics mentioned in their respective
    619 documentation.  The current implementation in LLVM does not check the
    620 key relocation invariant, but this is ongoing work on developing such
    621 a verifier.  Please ask on llvm-dev if you're interested in
    622 experimenting with the current version.
    623 
    624 .. _statepoint-utilities:
    625 
    626 Utility Passes for Safepoint Insertion
    627 ======================================
    628 
    629 .. _RewriteStatepointsForGC:
    630 
    631 RewriteStatepointsForGC
    632 ^^^^^^^^^^^^^^^^^^^^^^^^
    633 
    634 The pass RewriteStatepointsForGC transforms a functions IR by replacing a 
    635 ``gc.statepoint`` (with an optional ``gc.result``) with a full relocation 
    636 sequence, including all required ``gc.relocates``.  To function, the pass 
    637 requires that the GC strategy specified for the function be able to reliably 
    638 distinguish between GC references and non-GC references in IR it is given.
    639 
    640 As an example, given this code:
    641 
    642 .. code-block:: llvm
    643 
    644   define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) 
    645          gc "statepoint-example" {
    646     call i32 (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 5, i32 0, i32 -1, i32 0, i32 0, i32 0)
    647     ret i8 addrspace(1)* %obj
    648   }
    649 
    650 The pass would produce this IR:
    651 
    652 .. code-block:: llvm
    653 
    654   define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) 
    655          gc "statepoint-example" {
    656     %0 = call i32 (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 5, i32 0, i32 -1, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj)
    657     %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(i32 %0, i32 12, i32 12)
    658     ret i8 addrspace(1)* %obj.relocated
    659   }
    660 
    661 In the above examples, the addrspace(1) marker on the pointers is the mechanism
    662 that the ``statepoint-example`` GC strategy uses to distinguish references from
    663 non references.  Address space 1 is not globally reserved for this purpose.
    664 
    665 This pass can be used an utility function by a language frontend that doesn't 
    666 want to manually reason about liveness, base pointers, or relocation when 
    667 constructing IR.  As currently implemented, RewriteStatepointsForGC must be 
    668 run after SSA construction (i.e. mem2ref).
    669 
    670 RewriteStatepointsForGC will ensure that appropriate base pointers are listed
    671 for every relocation created.  It will do so by duplicating code as needed to
    672 propagate the base pointer associated with each pointer being relocated to
    673 the appropriate safepoints.  The implementation assumes that the following 
    674 IR constructs produce base pointers: loads from the heap, addresses of global 
    675 variables, function arguments, function return values. Constant pointers (such
    676 as null) are also assumed to be base pointers.  In practice, this constraint
    677 can be relaxed to producing interior derived pointers provided the target 
    678 collector can find the associated allocation from an arbitrary interior 
    679 derived pointer.
    680 
    681 In practice, RewriteStatepointsForGC can be run much later in the pass 
    682 pipeline, after most optimization is already done.  This helps to improve 
    683 the quality of the generated code when compiled with garbage collection support.
    684 In the long run, this is the intended usage model.  At this time, a few details
    685 have yet to be worked out about the semantic model required to guarantee this 
    686 is always correct.  As such, please use with caution and report bugs.
    687 
    688 .. _PlaceSafepoints:
    689 
    690 PlaceSafepoints
    691 ^^^^^^^^^^^^^^^^
    692 
    693 The pass PlaceSafepoints transforms a function's IR by replacing any call or 
    694 invoke instructions with appropriate ``gc.statepoint`` and ``gc.result`` pairs,
    695 and inserting safepoint polls sufficient to ensure running code checks for a 
    696 safepoint request on a timely manner.  This pass is expected to be run before 
    697 RewriteStatepointsForGC and thus does not produce full relocation sequences.  
    698 
    699 As an example, given input IR of the following:
    700 
    701 .. code-block:: llvm
    702 
    703   define void @test() gc "statepoint-example" {
    704     call void @foo()
    705     ret void
    706   }
    707 
    708   declare void @do_safepoint()
    709   define void @gc.safepoint_poll() {
    710     call void @do_safepoint()
    711     ret void
    712   }
    713 
    714 
    715 This pass would produce the following IR:
    716 
    717 .. code-block:: llvm
    718 
    719   define void @test() gc "statepoint-example" {
    720     %safepoint_token = call i32 (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @do_safepoint, i32 0, i32 0, i32 0, i32 0)
    721     %safepoint_token1 = call i32 (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0)
    722     ret void
    723   }
    724 
    725 In this case, we've added an (unconditional) entry safepoint poll and converted the call into a ``gc.statepoint``.  Note that despite appearances, the entry poll is not necessarily redundant.  We'd have to know that ``foo`` and ``test`` were not mutually recursive for the poll to be redundant.  In practice, you'd probably want to your poll definition to contain a conditional branch of some form.
    726 
    727 
    728 At the moment, PlaceSafepoints can insert safepoint polls at method entry and 
    729 loop backedges locations.  Extending this to work with return polls would be 
    730 straight forward if desired.
    731 
    732 PlaceSafepoints includes a number of optimizations to avoid placing safepoint 
    733 polls at particular sites unless needed to ensure timely execution of a poll 
    734 under normal conditions.  PlaceSafepoints does not attempt to ensure timely 
    735 execution of a poll under worst case conditions such as heavy system paging.
    736 
    737 The implementation of a safepoint poll action is specified by looking up a 
    738 function of the name ``gc.safepoint_poll`` in the containing Module.  The body
    739 of this function is inserted at each poll site desired.  While calls or invokes
    740 inside this method are transformed to a ``gc.statepoints``, recursive poll 
    741 insertion is not performed.
    742 
    743 By default PlaceSafepoints passes in ``0xABCDEF00`` as the statepoint
    744 ID and ``0`` as the number of patchable bytes to the newly constructed
    745 ``gc.statepoint``.  These values can be configured on a per-callsite
    746 basis using the attributes ``"statepoint-id"`` and
    747 ``"statepoint-num-patch-bytes"``.  If a call site is marked with a
    748 ``"statepoint-id"`` function attribute and its value is a positive
    749 integer (represented as a string), then that value is used as the ID
    750 of the newly constructed ``gc.statepoint``.  If a call site is marked
    751 with a ``"statepoint-num-patch-bytes"`` function attribute and its
    752 value is a positive integer, then that value is used as the 'num patch
    753 bytes' parameter of the newly constructed ``gc.statepoint``.  The
    754 ``"statepoint-id"`` and ``"statepoint-num-patch-bytes"`` attributes
    755 are not propagated to the ``gc.statepoint`` call or invoke if they
    756 could be successfully parsed.
    757 
    758 If you are scheduling the RewriteStatepointsForGC pass late in the pass order,
    759 you should probably schedule this pass immediately before it.  The exception 
    760 would be if you need to preserve abstract frame information (e.g. for
    761 deoptimization or introspection) at safepoints.  In that case, ask on the 
    762 llvm-dev mailing list for suggestions.
    763 
    764 
    765 Supported Architectures
    766 =======================
    767 
    768 Support for statepoint generation requires some code for each backend.
    769 Today, only X86_64 is supported.  
    770 
    771 Bugs and Enhancements
    772 =====================
    773 
    774 Currently known bugs and enhancements under consideration can be
    775 tracked by performing a `bugzilla search
    776 <http://llvm.org/bugs/buglist.cgi?cmdtype=runnamed&namedcmd=Statepoint%20Bugs&list_id=64342>`_
    777 for [Statepoint] in the summary field. When filing new bugs, please
    778 use this tag so that interested parties see the newly filed bug.  As
    779 with most LLVM features, design discussions take place on `llvm-dev
    780 <http://lists.llvm.org/mailman/listinfo/llvm-dev>`_, and patches
    781 should be sent to `llvm-commits
    782 <http://lists.llvm.org/mailman/listinfo/llvm-commits>`_ for review.
    783 
    784